Names are not refs vs some names are based on refs

ezekielf · March 2, 2024, 5:46pm

As I previously stated, I’m not talking about noname=yes situations. It is your opinion that the roads I’m talking about are nameless. I don’t agree with this. I don’t doubt that in areas you know well there are roads that locals really do consider nameless and therefore noname=yes is appropriate. However, that is not the case in my area. The main road through the village of Jericho is Vermont Route 15. The street signs through town all say “VT Route 15”.

https://maps.app.goo.gl/VQYKbCwcF8JVJhdU6

If I ask anyone who lives there the name of the road, the answer will be “Route 15”. To capture the full breadth of names this road is referred to by, it could be tagged name=Vermont Route 15, loc_name=Route 15, short_name=VT-15. I’m not going to pretend that the road doesn’t have a name just because that makes things easier for routers to give voice direction. At the same time, I’m not going to challenge your local knowledge when you say these types of roads are nameless in your area. Please afford the same respect to the mappers in my area.

Baloo_Uriza · March 2, 2024, 5:49pm

What’s so special about Vermont that we’re ignoring the global consensus?

ezekielf · March 2, 2024, 5:53pm

I have not seen evidence of a global consensus on this topic.

Minh_Nguyen · March 2, 2024, 6:23pm

These days, many navigation applications are capable of translating the dynamic pieces of information that go into a guidance instruction. For example, “Turn right onto Monterey Highway, State Route Eighty-Two” becomes “Doble a la derecha a Monterey Highway, Ruta Estatal Ochenta y Dos”. Spanish-speaking users appreciate it when the route number and network get translated like that. But note how the street name remains untranslated: the same users still expect the Spanish text-to-speech voice to attempt its best Spanglish for a street name, even a boring, predictable one like “North 24th Street”.

The user would comprehend “Calle Veinticuatro del Norte” and “Carretera de Monterrey”, but it would be incredibly pedantic, whereas “Nort Tuentifort Estrit” and “Monterrey Xai-guey”^[1] would be unsurprising. Most off-the-shelf TTS engines are capable of code-switching when embedding an English name verbatim, especially if the SSML markup identifies it as English. Meanwhile, other languages have different norms. If the German Wikipedia is to be believed, it would be simply “State Route 82” in that language.

So as a thought experiment, would the mappers in Vermont object to also tagging name:en, name:es, etc. on each street named after a numbered route? Would it be OK to set name:de equal to name:en?

[1] Forgive me for badly mangling the Spanish language here. This is just to illustrate the effect for those unfamiliar with the language.

ezekielf · March 2, 2024, 7:20pm

But would they appreciate hearing/reading “Doble a la derecha a Vermont Route Fifteen, Ruta Vermont Quince”? I’d expect that may seem just as redundant in Spanish as “Turn right on Vermont Route Fifteen, Vermont Route Fifteen” sounds in English. If that is the case, wouldn’t the system ideally want to detect that name=Vermont Route 15 is a name based on ref=VT 15 (or network=US:VT + ref=15) and thus redundant to announce? ^[1]

I don’t see a huge problem with this, but I don’t speak for all Vermont mappers. I don’t see how it would help achieve non-redundant voice guidance, though. Maybe that’s not the goal you had in mind?

[1] As I mentioned earlier, if this is too hard to detect, another tag could provide this hint.

Minh_Nguyen · March 2, 2024, 8:14pm

Yes, a router could try to deduplicate any street name/route number combination that includes the same number. It could even merge the two, for the quadrant directional case I mentioned earlier. There’s some precedent for this sort of heuristic: in 2017, OSRM began merging a route relation’s direction into the way ref. For example, if the way is tagged ref=VT 15^[1] and is a member of a unidirectional route relation tagged network=US:VT ref=15 direction=east, an English speaker would hear “vee-tee fifteen east”, a Spanish speaker “ve-te quince este”, a Vietnamese speaker “vê tê mười lăm đông”, etc. Valhalla does the same thing, and in principle a renderer or geocoder could do something similar.
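To make the direction-merging idea concrete, here is a minimal Python sketch. It is not OSRM's or Valhalla's actual code: the plain-dict tag model, the `DIRECTION_WORDS` table, and the function name `ref_with_direction` are all assumptions for illustration.

```python
# Hypothetical, simplified model: a way and a route relation as plain tag dicts.
# DIRECTION_WORDS is an assumed, hard-coded translation table.
DIRECTION_WORDS = {
    "en": {"north": "north", "south": "south", "east": "east", "west": "west"},
    "es": {"north": "norte", "south": "sur", "east": "este", "west": "oeste"},
}


def ref_with_direction(way_tags, relation_tags, lang="en"):
    """Append the route relation's direction to the way ref, if one is tagged."""
    ref = way_tags.get("ref")
    if not ref:
        return None
    direction = relation_tags.get("direction")
    word = DIRECTION_WORDS.get(lang, {}).get(direction)
    # Fall back to the bare ref when no direction (or no translation) is known.
    return f"{ref} {word}" if word else ref


way = {"highway": "primary", "ref": "VT 15"}
relation = {"type": "route", "route": "road", "network": "US:VT",
            "ref": "15", "direction": "east"}

print(ref_with_direction(way, relation, "en"))  # VT 15 east
print(ref_with_direction(way, relation, "es"))  # VT 15 este
```

Spelling out “VT 15” as “vee-tee fifteen” or “ve-te quince” would be left to the text-to-speech layer in this sketch.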

Someone writing a custom OSRM profile could technically have it avoid mentioning the route if the entire value of the route relation’s ref=* tag appears inside name=*. However, this makes some fragile assumptions. I can’t say for sure that a Route 6 will never traverse a 6th Street or Avenue 6 East or Six Flags Parkway. To do this well, the data consumer would need to first synthesize “Vermont Route 6” by looking up network=US:VT in Wikidata or a hard-coded lookup table, then match it against the name=*. All this to detect a non-idiosyncratic name, when maybe it’s the idiosyncratic names that should be tagged specially.
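A rough Python sketch of the difference between the two checks, under stated assumptions: `NETWORK_NAME_FORMATS` is a hypothetical hard-coded table standing in for a Wikidata lookup, and both function names are made up for this example.

```python
import re

# Hypothetical lookup table; a real data consumer might derive this from Wikidata.
NETWORK_NAME_FORMATS = {
    "US:VT": "Vermont Route {ref}",
}


def _normalize(text):
    """Collapse whitespace and ignore case before comparing names."""
    return re.sub(r"\s+", " ", text).strip().casefold()


def naive_is_redundant(name, route_ref):
    """Fragile check: does the route ref appear anywhere inside the name?"""
    return route_ref in name


def synthesized_is_redundant(name, network, route_ref):
    """Sturdier check: synthesize the expected name, then compare whole strings."""
    fmt = NETWORK_NAME_FORMATS.get(network)
    if fmt is None:
        return False  # unknown network: assume the name carries information
    return _normalize(name) == _normalize(fmt.format(ref=route_ref))


for name in ["Vermont Route 6", "6th Street", "Avenue 6 East", "Vermont Route 6 East"]:
    print(name, naive_is_redundant(name, "6"),
          synthesized_is_redundant(name, "US:VT", "6"))
```

The naive check flags all four names as redundant, while the synthesized check flags only “Vermont Route 6”. In this sketch a directional variant like “Vermont Route 6 East” is deliberately treated as a distinct street name.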

In some ways, this issue parallels the debate over whether data consumers should be responsible for interpreting an ad hoc list of names in name=* in order to replace it with a more presentable list. It’s not that tagging name=Vermont Route 15 would be wrong in the sense of claiming that the current weather in Burlington is high in protein. Rather, we’re debating how to optimize the database schema around certain use cases. Ultimately, I think the name-less approach would become obvious and unremarkable once editors and data consumers implement the ability to synthesize human-readable text representations of network=* and ref=*. It’s like how mappers in a lot of countries didn’t take route relations seriously until they got to experience a shield-capable renderer.

My point is that, if a router can only get the Spanish “Ruta Vermont Quince” by synthesizing it from structured data, but only after analyzing the English name to determine that it’s ignorable, then why not synthesize the English name too?
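As a sketch of what that synthesis could look like, assuming per-network, per-language templates: the tables below are hypothetical and hard-coded, not taken from any real router or from Wikidata. The street name passes through verbatim; only the route name is built from `network` and `ref`.

```python
# Hypothetical templates keyed by (network, language).
ROUTE_TEMPLATES = {
    ("US:CA", "en"): "State Route {ref}",
    ("US:CA", "es"): "Ruta Estatal {ref}",
    ("US:VT", "en"): "Vermont Route {ref}",
    ("US:VT", "es"): "Ruta Vermont {ref}",
}

# Hypothetical instruction templates, one maneuver only, for illustration.
TURN_TEMPLATES = {
    "en": "Turn right onto {target}",
    "es": "Doble a la derecha a {target}",
}


def route_name(network, ref, lang):
    """Build a human-readable route name, falling back to the bare ref."""
    template = ROUTE_TEMPLATES.get((network, lang))
    return template.format(ref=ref) if template else ref


def turn_instruction(street_name, network, ref, lang):
    """Combine an untranslated street name (if any) with a synthesized route name."""
    parts = [p for p in (street_name, route_name(network, ref, lang)) if p]
    return TURN_TEMPLATES[lang].format(target=", ".join(parts))


print(turn_instruction("Monterey Highway", "US:CA", "82", "es"))
print(turn_instruction(None, "US:VT", "15", "en"))
print(turn_instruction(None, "US:VT", "15", "es"))
```

With no street name on the way, the same code path yields “Vermont Route 15” in English and “Ruta Vermont 15” in Spanish, with no string parsing involved. Reading the number aloud as “Fifteen” or “Quince” is left to the speech engine here.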

[1] Unfortunately, most routers still treat the way ref as the source of truth about route numbering. But this is beside the point.

ezekielf · March 2, 2024, 8:50pm

This all makes perfect sense and is why I suggested:

A router looking for <tag indicating that name is based on ref> could then decide to use either name or a constructed name based on ref but would know that using both would be redundant.

It’s pretty clear from the above discussion that a separate tag hinting that the name is ignorable if you’re going to be constructing a name from route information would be a much more robust solution. Sure, the name can be synthesized but not every data consumer is going to be already synthesizing route names. For those that aren’t, just having a name tag on the way is more straightforward.
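Roughly, a consumer could use such a hint like this. The key `name:from_ref` is only a placeholder I made up for the sketch, not an established or proposed OSM tag, and `announce_parts` is likewise hypothetical.

```python
# Placeholder key for "<tag indicating that name is based on ref>".
# This is NOT an established OSM tag; it exists only for this sketch.
HINT_KEY = "name:from_ref"


def announce_parts(way_tags, synthesized_route_name=None):
    """Pick what to announce without repeating a ref-based name.

    synthesized_route_name is whatever the consumer built from route
    information, or None if it does not synthesize route names at all.
    """
    name = way_tags.get("name")
    name_is_ref_based = way_tags.get(HINT_KEY) == "yes"
    parts = []
    # Drop the name only when it is flagged AND a synthesized replacement exists.
    if name and not (name_is_ref_based and synthesized_route_name):
        parts.append(name)
    if synthesized_route_name:
        parts.append(synthesized_route_name)
    return parts


flagged = {"name": "Vermont Route 15", HINT_KEY: "yes", "ref": "VT 15"}
named = {"name": "Mill Street", "ref": "VT 15"}

print(announce_parts(flagged))                      # simple consumer uses name
print(announce_parts(flagged, "Ruta Vermont 15"))   # route-aware consumer
print(announce_parts(named, "Vermont Route 15"))    # both are announced
```

A consumer that ignores the hint entirely still gets a usable name tag, which is the point of keeping name on the way.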

Minh_Nguyen · March 2, 2024, 9:09pm

So, name:separate=yes?

This is the current rationale for continuing to tag ref redundantly on each way that belongs to a route relation. For better or worse, that has been a winning argument for a while. But at least in that situation, one could argue that a non-relation-savvy data consumer is left with too little information to serve user needs. By comparison, “Vermont Route 15” adds relatively little information that ref=VT 15 doesn’t already provide. In particular, it communicates systematic, stylistic information about the route network as a whole rather than an individual street.

We can also contrast chain hotel names: a given chain may or may not have a consistent, well-documented standard for combining brand and branch into a name. Besides, the domain of POIs is considerably more open-ended than the domain of route networks. Similarly, if a town does prepend or append a direction, as in the cases I brought up, then I favor maintaining the fully qualified name in name. After all, we have no established alternative for storing these directionals, and the difference between a prefix and suffix can matter a great deal in some places. And tracking these nuances comprehensively would be quite a distraction.

ezekielf · March 2, 2024, 9:33pm

While it’s true that name=Vermont Route 15 is semi-redundant with ref=VT 15, it does provide additional information. It specifies that the local street has a name rather than being nameless as noname=yes would indicate. It clarifies that the preferred name format locally is “Vermont Route 15”, not “State Route 15”, “State Highway 15”, “Vermont Highway 15”, or something else. You’ve shown that such a format can be stored in Wikidata to help data consumers construct names in the preferred local format, but this seems to me like an elaborate workaround that some data consumers have to deal with so that other data consumers don’t have to deal with some redundancy between name and ref.

ZeLonewolf · March 2, 2024, 9:44pm

I’ve separated these sentences as two specific concerns that should be separately addressed.

Could you clarify for us what you see as the distinction between a named stretch of Vermont Route 15 versus an unnamed stretch of Vermont Route 15?
Are there actual cases where one of the other formats is used in that route network? If no (which I think you indicated further back in the thread), would the issue be solvable purely within OSM if the route relation for Vermont Route 15 contained the name formatting information?

ezekielf · March 2, 2024, 10:13pm

There are no unnamed stretches. The distinction would be between a different road that is unnamed and this one. Certain stretches are named things like Mill Street or Center Road (here the overall name Vermont Route 15 is an alternate name), other stretches use directional suffixes like Vermont Route 15 East, Vermont Route 15 West. Other stretches just use Vermont Route 15 as the local street name.

I would say mostly no. But there are cases of using directional suffixes as noted above. Format info on the route relation could surely be workable. But again, it feels more complicated than it needs to be for answering the simple question “what is the name of this street/road?” (specifically not considering routes).

Here’s a similar case with a different class of objects. Fast food establishments like McDonald’s, Burger King, etc typically have the exact same name and brand values. We could say that tagging name=McDonald's is redundant because brand=McDonald's already has all the information needed. We might even convince ourselves that these locations are nameless! Data consumers could be expected to use the brand tag in lieu of a missing name tag when processing amenity=fast_food POIs. This could be another one of those things that data consumers are expected to “just know” when processing OSM data. Of course this is not a perfect comparison, but my point is that redundant data exists on other object types so I don’t see why it shouldn’t be allowed on roads too.

ezekielf · March 2, 2024, 10:55pm

Since our discussion over road way names has now eclipsed the original descriptive names in road route relations discussion, I’ve split this out to its own topic. Apologies for thread hijacking.

Minh_Nguyen · March 3, 2024, 12:12am

The reasoning here is that a use case that doesn’t refer to the store location by the brand verbatim is a very obscure use case. What’s more, users of most languages would expect a data consumer to refrain from translating the brand’s name. That makes it closer to “North 24th Street” than “State Route 82” in my earlier example.

Most languages do routinely translate the names of some kinds of POIs, such as schools and offices named after government agencies. Sometimes these POIs are part of chains, like U.S. Post Office locations. But the usual solution is to directly store the translation in each language, for example, “Oficina Postal de Burlington”, rather than to fashion a name formatter per chain. (Actually, the USPS prefers not to translate its branch names either.)

Hence my question about whether each individual roadway should be tagged with a translation of the spelled-out and abbreviated route number in every language.

Baloo_Uriza · March 3, 2024, 12:49am

This is widely considered a noname=yes situation.

Minh_Nguyen · March 3, 2024, 1:07am

To me, the presence of a directional prefix and/or suffix demonstrates that the route number has been adapted into a street name. So do some idiosyncratic formulations involving route concurrencies, like “United States 22 and 3” (where 3 is a state route).

I don’t think navigation applications translate these derived names like they translate the originals. After all, Key West is Cayo Hueso in Spanish, but “Key West Street” wouldn’t become “Calle Cayo Hueso”. Maybe a “West State Route 123” would show up on screen for reference but the voice instruction would omit it in favor of a translated route number.

ezekielf · March 3, 2024, 1:32am

I agree. But I take it you don’t consider a street name to be sufficiently adapted if a town doesn’t use directional prefixes or suffixes? If so, why not? To me it seems no different. The town has simply chosen to use the same name that also refers to the route as a whole.

I’ve tried to find discussions where this wide consensus has emerged and come up empty. Perhaps I just haven’t searched hard enough. I did find that you added statements supporting this position to the Name wiki page in 2018. Prior to this, it looks like the wiki page didn’t mention refs or any recommendation that names shouldn’t contain them. I didn’t see any links to supporting discussions about this change. If you can remember where these discussions happened I’d be interested to read them.

Baloo_Uriza · March 3, 2024, 1:40am

Try the tagging mailing list archives.

Minh_Nguyen · March 3, 2024, 4:05am

As I’ve said, a bare route number can function as a name. Even “McDonald’s Drive-Through” or “Unnamed Alley” can stand in for a road name in routing instructions and perhaps geocoding results too. Of course, “Vermont Route 15” is more than that – it’s apparently a bona fide name for local addressing purposes. I just don’t think it should be indistinguishable from a more developed name when looking at OSM data in isolation without local context. Anything that requires parsing a string looks a lot less effective from my perspective than anything that requires concatenation.

If we rely on a tag to indicate that the name is fungible compared to the available route metadata, then something akin to street:name would be consistent with how ref or route_ref can duplicate the information on a route relation, adding a marginal amount of information on top. It would basically be the longer version of a way ref. Also note how street:name would replace name, not complement it. At that point, your suggestion of a clarifying key would be almost indistinguishable from @Baloo_Uriza’s suggestion of addr:street, though I don’t know if the latter would have any side effects in existing software.

ezekielf · March 5, 2024, 2:47am

Interestingly, I recently found myself on the other side of a similar naming disagreement. I had noticed some buildings on the University of Vermont Campus tagged with name values like 70 South Williams Street and 481 Main Street. Since these buildings also already had addr:housenumber and addr:street tags with the same information, I figured this was just redundant information that could be removed. However another local mapper commented on my changeset to express disagreement with the removal, feeling that the addresses functioned as names within the campus community. I ended up reverting my changes in deference to his knowledge of the campus. Clearly it can be difficult to tell what exactly is and is not a name.

Minh_Nguyen · March 5, 2024, 3:21am

This is sometimes a point of disagreement regarding the names of urban apartment buildings and high-rise office buildings. At least in those cases, it’s essentially the name of a business, which wouldn’t necessarily be subject to all our rules about names. The owner could have chosen to call it “70 South Williams Street”, “70 South Williams”, “70 Williams”, “70 South”, “The Seventy”, or “The Building at 70 South Williams Street”.

Even though all these names communicate the address in whole or in part, none is so obvious that we could presume it as the name based on any general criterion. Moreover, in the less likely event that the name needs translation, it wouldn’t necessarily be a rote translation. A data consumer that would normally abbreviate words in street names might not abbreviate one of these names so aggressively. This goes back to the point about idiosyncrasy: