Restrict wikimedia_commons URLs as image=* tag values?

I can confirm your numbers,
osmium cat 220902-planet.osm.pbf -f opl|grep "image=File:"|wc -l
28913

Your argument is basically “tagging for the renderer” it seems.
Having the link in a separate tag makes it a lot easier for all data consumers.
You can create a Greasemonkey script for sites like Waymarked Trails if you are not happy with their interface ignoring the tag you are using.

Not all of them: it makes things harder for the ones already using image and only image.

(I would not treat it as a blocker, but pretending that it does not exist is not helpful)

Here is a suggestion for a possibly more long-term and well-structured solution for guideposts: Wikidata items for all guideposts?

I’m not sure duplication of information is a bad thing. Redundancy can help in many cases (if Commons is temporarily down, an alternate link could back up the same information so the user can still see the image). But I agree that having both image and wikimedia_commons link to the same file is quite useless. I’m not sure what my favourite choice is in this case, though. I think that using wikimedia_commons only for categories or gallery pages could be good (although, if there’s a linked Wikidata item, that could suffice and provide images from the linked category), but I’d use it for files hosted on Commons too.

For a start, many OSM items have neither.

Also, a few times I have used a specific image to match an exact state (for example, after a rebuild) or perspective.

I know, but that wasn’t the point of my message.

I’m thinking the same.

In fact, the wikimedia_commons links can represent various types of content, including categories, images, videos, and audio. I see it as something more versatile and generic. On the other hand, the image link is more specific; it explicitly indicates that you’ll receive a single image and nothing else.

Please read my whole message and don’t extract only what agrees with your position. I also stated that I’d use the wikimedia_commons tag for files too (to be honest, it doesn’t really matter what type of file a user uploads. A video or a picture could both be useful for the end user; only a madman would link an audio or a PDF file). Moreover, who can say what a link pointing to a file hosted on another platform could contain? It could link to malware or spyware, whilst a Wikimedia Commons file is safer.

I suppose you misunderstood my intention. I quoted the part I agree with in order to point that out.

The image tag effectively conveys that the content is an image, which is crucial information for visualizers.

Let’s consider the common scenario where a visualizer wishes to display an image that accurately represents an object. In this context, the wikimedia_commons tag may be too generic, as it can encompass a wide range of content. For instance, if it’s a collection, it doesn’t specify which image serves as the most representative. In contrast, the image tag provides precisely what a visualizer needs.

Regarding security, the choice of security checks to be performed is at the discretion of the visualizer. They can still filter by URL if it aligns with their specific use case. It’s essential to keep in mind that just because a file is hosted on Wikimedia Commons doesn’t necessarily mean it’s suitable or safe for display within the OpenStreetMap context. Many types of images, such as medical images, could raise concerns.
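As a rough illustration of what "filter by URL" could look like on the consumer side, here is a minimal Python sketch. The allowlist contents and the function name is_trusted_image_url are assumptions made up for this example, not something any particular visualizer is known to use.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; each data consumer would choose its own hosts.
TRUSTED_HOSTS = {"commons.wikimedia.org", "upload.wikimedia.org"}


def is_trusted_image_url(value: str) -> bool:
    """Return True if an image=* value is an http(s) URL on an allowlisted host."""
    try:
        parts = urlsplit(value.strip())
    except ValueError:
        # Malformed values (e.g. a broken IPv6 literal) are simply rejected.
        return False
    if parts.scheme not in ("http", "https"):
        return False
    return (parts.hostname or "") in TRUSTED_HOSTS
```

A check like this only says where the file is hosted; it says nothing about whether the content is appropriate to display.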

That’s why displaying the whole category (OsmAnd does this, for example) could be more useful than choosing only one picture.

I don’t think this applies to OSM, so why use it as an argument? Moreover, most Commons images have a name that makes it clear what they represent, which could make vandalism easier to spot. What can you tell about the image behind a URL such as imgur.io/ag5ja1 (random string)?

It depends on the specific context, and one way doesn’t exclude the other. You can have wikimedia_commons with a collection of images, and at the same time image with the image most representative of the object. Then the visualizer can choose what it needs.

To emphasize that an image being in wikimedia_commons doesn’t guarantee safety, and that using safety as an argument in favor of wikimedia_commons over image is weak. Neither of them provides a safety guarantee. It’s the tool or visualizer making use of such links that should take care of this aspect.

While it’s true that Wikimedia Commons contains some NSFW content, it is a much more predictable and trustworthy source of images than the Internet writ large. At least there is some degree of moderation to prevent vandalism from persisting too long. Some mappers even trust Commons enough that they expect renderers to display any arbitrary Commons image that appears in a wiki:symbol tag – directly on the map:

When fetching arbitrary content from the Internet, a data consumer doesn’t only have to worry about NSFW content and vandalism. The content can go away at any time, because the host has no way of knowing that it’s being used on OSM and has made no commitment to keeping it online. Commons has a system to track which files are used on OSM so administrators are aware of this usage; they can avoid arbitrarily deleting the image or they can update OSM if they have to delete the image for copyright reasons.

Worse, an arbitrary domain could get squatted and start serving up malware or otherwise violate your privacy. It’s a really bad idea to hotlink this content and load it automatically without the user’s consent.

This is not a theoretical concern. OpenHistoricalMap has customized the OSM frontend to embed the contents of image verbatim in the sidebar when you visit an element’s page. Unfortunately, many of the images tagged in OHM are broken links. The vast majority of the images that still work are from Commons:

I diverted from the main argument (you wouldn’t have a string of random characters for a Commons image name, so this is a non-problem for the discussed point), so I’ll bring it back on rails.

Here is why I think that using a Commons tag is better than using image: string length. OSM tag values have a maximum length of 255 characters. The obligatory string https://commons.wikimedia.org/wiki/ takes 35 characters, which is not a lot, but they could be saved by using the wikimedia_commons tag.
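To make the arithmetic concrete, here is a small Python sketch. The 255-character limit is the figure quoted above; the helper name room_for_title is just an assumption for the example.

```python
MAX_TAG_VALUE_LENGTH = 255  # limit quoted in the post above
COMMONS_PREFIX = "https://commons.wikimedia.org/wiki/"


def room_for_title(prefix: str = "") -> int:
    """Characters left for the page title once the prefix is accounted for."""
    return MAX_TAG_VALUE_LENGTH - len(prefix)


print(len(COMMONS_PREFIX))             # 35
print(room_for_title(COMMONS_PREFIX))  # 220, when stored as a full URL in image
print(room_for_title())                # 255, when stored as a bare title in wikimedia_commons
```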

Moreover, the correct URL to put in image isn’t obvious for Wikimedia Commons–hosted files. For example, this traffic sign node links to the following page on Commons via wikimedia_commons=*:

If I were to convert this tag to image=*, which URL format would a data consumer expect me to use?

  • https://commons.wikimedia.org/wiki/File:Lane_use_diagram_sign_at_Interstate_280_and_Almaden_Plaza_Way,_San_Jose,_California.jpg
  • https://upload.wikimedia.org/wikipedia/commons/d/d2/Lane_use_diagram_sign_at_Interstate_280_and_Almaden_Plaza_Way%2C_San_Jose%2C_California.jpg
  • https://commons.wikimedia.org/wiki/Special:Redirect/file/Lane_use_diagram_sign_at_Interstate_280_and_Almaden_Plaza_Way,_San_Jose,_California.jpg

Format | Problem | Prevalence
------ | ------- | ----------
https://commons.wikimedia.org/wiki/File:… | Points to an HTML page, not an image per se. | 74,748
https://upload.wikimedia.org/wikipedia/commons/… | Not a permalink: if someone uploads a new version of the image, for example to touch it up, then they’ll break this URL. (Old image revisions are moved to an archive/ directory.) Hotlinking this file violates its license. | 23,428
https://commons.wikimedia.org/wiki/Special:Redirect/file/… | No one knows about Special:Redirect. Hotlinking this file violates its license. | 0

Sure, a data consumer could sniff out one of these URL formats and convert it to the desired format – either a link to the image description page, which contains the legally required attribution and license, or an API call that fetches the attribution along with the raw image URL. But parsing URLs is error-prone, and our general tendency is to prefer structured tags over freeform ones.
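To give an idea of what that sniffing involves, here is a minimal Python sketch that tries to recover the Commons page title from the three URL formats listed above. The function name commons_title_from_url is hypothetical, and the handling of upload.wikimedia.org paths assumes the /wikipedia/commons/<x>/<xy>/<name> layout seen in the example; thumbnails and archived revisions are deliberately left unhandled.

```python
from typing import Optional
from urllib.parse import unquote, urlsplit

REDIRECT_PREFIX = "/wiki/Special:Redirect/file/"


def commons_title_from_url(url: str) -> Optional[str]:
    """Best effort: return 'File:Name.ext' for a recognised Commons URL, else None."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host == "commons.wikimedia.org":
        path = unquote(parts.path)
        if path.startswith("/wiki/File:"):
            return path[len("/wiki/"):]
        if path.startswith(REDIRECT_PREFIX):
            return "File:" + path[len(REDIRECT_PREFIX):]
    elif host == "upload.wikimedia.org":
        # Split before decoding so an encoded character cannot add a segment.
        segments = parts.path.split("/")
        if (
            len(segments) == 6
            and segments[1:3] == ["wikipedia", "commons"]
            and len(segments[3]) == 1
        ):
            return "File:" + unquote(segments[5])
    return None
```

Even this short version needs special cases per format, which is the kind of fragility that a structured key avoids.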

What you say is right, but such advantages are the consequence of using Wikimedia Commons as the host, and are not related to using the wikimedia_commons tag.

If you put a Wikimedia Commons link in the image tag, you get the same advantages.

Wikimedia Commons discourages hotlinking, so the widely accepted best practice is to link to the description page (the one you call an “HTML page”).

Data consumers such as OsmAnd can typically handle this page. This process is no more complex than handling the wikimedia_commons tag, which doesn’t provide a direct link to the image either. In fact, the information conveyed is exactly the same.

If so, this key has been misused tens of thousands of times. That should be cleaned up before consolidating another key into it.

Are you referring to OsmAnd’s “Images nearby” panel? That isn’t coming from image tags on features in OSM. For example, this spaceship hangar is tagged with image=http://commons.wikimedia.org/wiki/File:Last_Look_at_Hangar_One.jpg to indicate what it looks like today (note insecure URL):

But OsmAnd shows this historical image instead, based on the linked Wikidata item:

This robotics company’s office has no wikidata tag, but it does have image = https://www.inorbit.ai/hubfs/Jackal%20on%20the%20street%2001.jpg:


Yet no image appears on OsmAnd:

In order for an application to present an image in the UI using OSM tags, it would need to isolate the file name or manipulate the URL to use Special:Redirect to get the raw image and quite possibly the attribution string. By contrast, the wikimedia_commons key contains a page name that can be used verbatim with the Wikimedia Commons API. If we don’t see this as an advantage, then I don’t really see the advantage of image over description or url.
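As a sketch of what "used verbatim" could mean in practice, the Python below builds a MediaWiki API query URL and a description page URL directly from a wikimedia_commons value. It assumes the value is a file title such as File:Last_Look_at_Hangar_One.jpg (a category value would need a different query), and the function names are made up for this example. No request is sent; the code only constructs the URLs.

```python
from urllib.parse import quote, urlencode

API_ENDPOINT = "https://commons.wikimedia.org/w/api.php"


def imageinfo_query_url(page_title: str) -> str:
    """Build an API URL asking for the raw file URL and its license metadata."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "imageinfo",
        "iiprop": "url|extmetadata",  # raw URL plus attribution/license fields
        "titles": page_title,         # the tag value, passed through unchanged
    }
    return API_ENDPOINT + "?" + urlencode(params)


def description_page_url(page_title: str) -> str:
    """Build the link to the page that carries attribution and license."""
    title = page_title.replace(" ", "_")
    return "https://commons.wikimedia.org/wiki/" + quote(title, safe=":/,()")


print(imageinfo_query_url("File:Last_Look_at_Hangar_One.jpg"))
print(description_page_url("File:Last_Look_at_Hangar_One.jpg"))
```

Starting from an image URL instead, the consumer would first have to work out which of the URL formats it is looking at before it could get to the same title.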

Additionally, editors such as iD and Go Map!! have fancy preset fields that let you pick a file to add to wikimedia_commons. This makes the process less error-prone for mappers too. image has no such convenience because its format is more freeform.

My bad, I had an outdated download. The image does appear now:

But here’s another example, a building last edited a decade ago to add image=http://commons.wikimedia.org/wiki/File:Bank_of_Italy_%28Livermore,_CA%29.JPG (note insecure URL):

So OsmAnd will fetch direct image URLs but will not resolve URLs to image description pages.

I have to correct you on this. OsmAnd likely does it only in some cases, such as for certain objects, but it is definitely able to get images from the description page.

See for example this guidepost: https://www.openstreetmap.org/node/10969256248

It has this image tag: image=https://commons.wikimedia.org/wiki/File:20230608-serina_cornalba_giro_redo-163.jpg

This is what I see in OsmAnd when I click on it (note that you have to click exactly on the guidepost to select it, not nearby).