Restrict wikimedia_commons URLs as image=* tag values?

Hi everyone,

This topic has been discussed many times, but nothing has been done, so let’s try again in the new forum :upside_down_face:. At the moment, there are at least 82650 image=* tags with a Commons URL as their value (I only searched for “commons” in the values on taginfo, but some are tagged using the File:/Category: syntax, so the real count is much higher).

Reading the wiki talk page, it seems that most people agree that duplicating the information in both image=* and wikimedia_commons=* isn’t the best choice. Yet the wiki describes one of the three tagging styles as:

a Wikimedia Commons filename (formatted as File:image.jpg ) - although this is commonly tagged as wikimedia_commons=*

Even with the “although” note, using a Commons file as an image=* value is still suggested as one of the three tagging styles by the wiki itself.

My proposal would be to discuss it again and, if the majority agrees, change the wiki to state that the image tag has to be used only when a Wikimedia Commons file is not available for that feature OR when the image URL is different from the Commons one.

In simple terms: no duplication.

A follow up could be an Osmose check or something similar.

What do you think about it?

1 Like

I would state “is recommended to be used” and I would support this.

Also, as I understand things like image=https://commons.wikimedia.org/wiki/File:006781_-_Ayllón_(8059045733).jpg would not be affected?

It seems currently the image=File:* form is slightly more popular than wikimedia_commons=*, 100k vs 90k.

Do you propose automatic retagging?

Also, as I understand things like image=https://commons.wikimedia.org/wiki/File:006781_-_Ayllón_(8059045733).jpg would not be affected?

It would be affected as well in my proposal, because even if it has different syntax, it’s the same “information”, so it’s still a duplicate.
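To make “same information, different syntax” concrete, here is a minimal Python sketch of how the two forms could be compared. The function names are hypothetical, and it assumes the only Commons forms to handle are File:/Category: titles and commons.wikimedia.org/wiki/ page URLs; it is not taken from any existing tool.

```python
from urllib.parse import unquote, urlparse


def commons_title(value):
    """Return a normalised Commons title ("File:..." or "Category:..."), or None.

    Hypothetical helper: only handles plain titles and
    https://commons.wikimedia.org/wiki/<title> page URLs.
    """
    value = value.strip()
    if value.startswith(("http://", "https://")):
        parsed = urlparse(value)
        if parsed.netloc != "commons.wikimedia.org":
            return None
        if not parsed.path.startswith("/wiki/"):
            return None
        value = unquote(parsed.path[len("/wiki/"):])
    if not value.startswith(("File:", "Category:")):
        return None
    # MediaWiki treats underscores and spaces in titles as equivalent.
    return value.replace("_", " ")


def is_duplicate(tags):
    """True if image=* and wikimedia_commons=* name the same Commons page."""
    image = commons_title(tags.get("image", ""))
    commons = commons_title(tags.get("wikimedia_commons", ""))
    return image is not None and image == commons
```

With this, image=https://commons.wikimedia.org/wiki/File:006781_-_Ayllón_(8059045733).jpg and wikimedia_commons=File:006781 - Ayllón (8059045733).jpg on the same object would be reported as a duplicate, while an image=* pointing to another host would not.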

Do you propose automatic retagging?

No, I propose to make it clearer that duplicating information isn’t that useful, and that one tag is enough. I don’t propose to bulk edit everything that has been done so far, but to start having clearer guidelines from now on.

1 Like

Yes, good idea.

I would have no problem if image keys pointing to wikimedia commons are semi-automatically updated to wikimedia_commons=*, that is for a limited area where the mapper is active.

An Osmose check would be good, but even better would be a JOSM MapCSSTagChecker rule: you get a warning in JOSM, and these rules are also automatically checked by Osmose.
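Just to illustrate what such a check would need to decide, here is a rough sketch in plain Python. It is neither a MapCSS rule nor written against Osmose’s plugin interface; the function name, the regular expression and the warning texts are all made up for the example.

```python
import re

# Assumption: an image=* value "points to Commons" if it is a File:/Category:
# title, optionally prefixed by the Commons wiki page URL.
COMMONS_IMAGE = re.compile(
    r"^(?:https?://commons\.wikimedia\.org/wiki/)?(?:File|Category):"
)


def check_image_tag(tags):
    """Return a list of warning strings for one object's tags (a dict)."""
    value = tags.get("image")
    if value is None or not COMMONS_IMAGE.match(value):
        return []
    if "wikimedia_commons" in tags:
        return ["image=* points to Wikimedia Commons and wikimedia_commons=* is already set"]
    return ["image=* points to Wikimedia Commons; consider wikimedia_commons=* instead"]
```

A real rule would also have to decide what to do with direct upload.wikimedia.org links, which this sketch ignores.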

It may be a good idea to reopen https://github.com/waymarkedtrails/waymarked-trails-site/issues/359 first, with a thorough explanation of why both image=File:filename and image=https://commons.wikimedia.org/wiki/File:006781_-_Ayllón_(8059045733) are problematic: this requires extra handling anyway, and linking an image directly also requires extra handling to get licensing info.

See Changeset: 125323901 | OpenStreetMap

1 Like

The result of such a change would be that some images are tagged using image=* and other images are tagged using wikimedia_commons=*, depending on where they’re hosted. Subjectively, that still doesn’t seem like a particularly elegant outcome.

If your goal is to avoid duplication, wouldn’t it be cleaner to move all image links, including those to images hosted on Commons, to image=*?

Hi,

Removing the image= tag is BAD because it’s the one with the widest support by websites/visualizer.

As a mapper, I add images to have them used. If they are not going to be used anymore, I won’t add them.

In the past, I even tried to convince sites to support the wikimedia_commons tag. But without success. The most prominent example is waymarkedtrails.org. See https://github.com/waymarkedtrails/waymarked-trails-site/issues/359 where I raised the problem.

So, I settled to add both image and wikimedia_commons tags. But if avoiding duplicate is a requirement, I’m going to remove wikimedia_commons and keep the image one.

Ciao,
Andrea

1 Like

I would agree for dedicated image hosting websites, but Wikimedia Commons is not one of them. You could also link a video (Commons allows audio, text and data files as well). Also, the Category: syntax allows linking categories (aka galleries of files). This lets us avoid separators like “;” or tagging like this, with 10 different image tags.

We make heavy use of Wikidata (operator:wikidata, brand:wikidata, subject:wikidata and so on). Wikidata is connected to Wikimedia Commons (via the P18 property and categories), and Commons is connected to Wikidata (via P180, to mention one). OpenStreetMap is connected to both with wikidata=* and wikimedia_commons=*. Wikidata is connected to OpenStreetMap via at least 5 properties (such as P402 and P1282). Wikimedia Commons also keeps track of Commons files used on OpenStreetMap with “Commons:Files used on OpenStreetMap”. The projects also share part of their philosophy. I live in Italy, and here the OpenStreetMap local chapter is Wikimedia Italia. In short, I don’t think it is correct to treat Commons as just another image hosting website like Flickr, to mention one.

But if avoiding duplicate is a requirement, I’m going to remove wikimedia_commons and keep the image one.

I wouldn’t, for all the reasons I just wrote.

1 Like

In the special cases you mentioned, having the wikimedia_commons tag is fine. But in the general case, when you need to link a single image, the image tag is always the best choice because it’s the most supported one.

When a mapper adds an image, they want it to be as accessible as possible. So, the image tag is going to be always preferred by mappers. Any proposal that doesn’t take this fact into account won’t really work.

At least until someone can convince waymarkedtrails.org and other sites to support the wikimedia_commons tag. But I’m skeptical that this is going to happen soon.

Ciao,
Andrea

1 Like

I do not agree that the image tag is the best choice, for the reasons mentioned under Unresolved issues.

I am with @ivanbranco that Wikimedia Commons is not like a plain link to an image and the good thing is that it addresses the concerns mentioned in unresolved issues.

I am not sure mappers will always prefer actual links. What matters is whether things can be “clicked”, and that is possible when viewing the object (example), when the properties of an object are viewed in JOSM (View image on Wikimedia Commons), and in iD, where you can open the link directly.

See taginfo: wikimedia_commons= is used more often than image= in combination with information=guidepost. In that sense, waymarkedtrails would do better to support wikimedia_commons= and, if it only wants to support one, drop image= support.

1 Like

So, the image tag is going to be always preferred by mappers.

Uhm, I looked at the data and it doesn’t seem like image=* is always preferred by mappers. In Italy we have 2842 guideposts with the image=* tag; 80% of them (2269) were added by you. There are 2681 information=guidepost with the wikimedia_commons=* tag alone and 2502 (90% [2269] of them added by you) with the same image linked in both tags. So it seems like wikimedia_commons is preferred for guideposts (at least in Italy).

Also in Italy the most important alpine club (CAI) suggests (source) in the wiki the wikimedia_commons tag for guideposts, image=* is not even mentioned.

But I get your point, and I think it would be cool for waymarkedtrails to implement Commons support. I tried commenting on the issue, but I found out I can’t re-open issues closed by another user. Could you open it for me please?

Also you mentioned:

At least until someone can convince waymarkedtrails.org and other sites

What are these other sites? Maybe we can try asking for this feature again since usage of the tag increased since then.

1 Like

Avoiding duplication is not the only important part here.

wikimedia_commons has the benefit of being sane to process, whereas with image mappers sometimes link the file page on Commons, sometimes the full-size image, and sometimes a specific image.

And if you actually want to use this data then you need site-specific parsing anyway (to retrieve licensing info or at least link back to credit page).
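As a rough idea of what that site-specific parsing looks like for Commons, here is a minimal Python sketch that maps the different link styles back to the file page, where the credit and licensing information lives. The function name is hypothetical, and the upload.wikimedia.org path layouts in the comment are my assumption about how direct and thumbnail links look; other variants are simply not handled.

```python
import re
from urllib.parse import urlparse

# Assumed path layouts on upload.wikimedia.org:
#   /wikipedia/commons/<h>/<hh>/<name>
#   /wikipedia/commons/thumb/<h>/<hh>/<name>/<width>px-<name>
UPLOAD_PATH = re.compile(
    r"^/wikipedia/commons/(?:thumb/)?[0-9a-f]/[0-9a-f]{2}/([^/]+)"
)


def commons_file_page(url):
    """Return the Commons file page URL for an image=* URL, or None."""
    parsed = urlparse(url)
    if parsed.netloc == "commons.wikimedia.org" and parsed.path.startswith("/wiki/File:"):
        # Already a file page link.
        return "https://commons.wikimedia.org" + parsed.path
    if parsed.netloc == "upload.wikimedia.org":
        match = UPLOAD_PATH.match(parsed.path)
        if match:
            # The file name is kept exactly as it appears in the URL.
            return "https://commons.wikimedia.org/wiki/File:" + match.group(1)
    return None
```

Anything hosted elsewhere returns None, which is exactly the case where a data consumer has no generic way to find the licence.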

wikimedia_commons also has the benefit of (almost always) pointing to files that are safe to use, while image is often used to link files that are not available under an open license.

2 Likes

that is happening already anyway

So, you now know why I’m interested to see them shown in waymarkedtrails. It was a lot of work, and having them now hidden, would be really a loss.

Just to be clear, I also wanted to use wikimedia_commons, because I recognize its benefits. But being unable to convince waymarkedtrails was the blocker. I saw other mappers adding both tags, and I just did the same.

I’ve now reopened the issue on the waymarkedtrails github. Let’s see if we can change it.

Ciao,
Andrea

2 Likes

ivanbranco:

But if avoiding duplicate is a requirement, I’m going to remove wikimedia_commons and keep the image one.
I wouldn’t, for all the reasons I just wrote.

In the special cases you mentioned, having the wikimedia_commons tag is fine. But in the general case, when you need to link a single image, the image tag is always the best choice because it’s the most supported one.

I agree, have wikimedia_commons for categories and image for images

Good that the waymarkedtrails issue has been reopened and updated.

I had a look at the code, and it looks to me like it would be relatively easy to add, see this line. This is where the image key is checked; if it is present, its value is set as an image attribute. Just extend that code to check first for the presence of the wikimedia_commons key and, if present, set this image attribute to:

'https://commons.wikimedia.org/wiki/' + loctags.get_url(keys=['wikimedia_commons'])

It is probably better to link to the Commons page this way than to an image directly, as that way licensing is also properly addressed.
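Spelled out as a standalone sketch, the fallback could look like the following. This uses a plain dict of tags instead of the real loctags object (I have not checked how get_url behaves), and image_link is a made-up name, so treat it as an illustration of the idea rather than a patch.

```python
from urllib.parse import quote

COMMONS_PAGE = "https://commons.wikimedia.org/wiki/"


def image_link(tags):
    """Prefer wikimedia_commons=*, fall back to image=*, else None."""
    commons = tags.get("wikimedia_commons")
    if commons:
        # Link to the Commons page (not the raw file), so that the
        # licensing information is one click away.
        title = commons.replace(" ", "_")
        return COMMONS_PAGE + quote(title, safe=":/()")
    return tags.get("image")
```

For example, an object with wikimedia_commons=File:Guidepost 1.jpg would get https://commons.wikimedia.org/wiki/File:Guidepost_1.jpg, and an object with only image=* keeps its current behaviour.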

1 Like

Back to the original discussion:

I think “recommended” is a bit too soft. What about “should” or “strongly recommended”?

My planet file from a few days ago shows 276.332 objects with image= (closely matching taginfo). In that I can find only 28.830 entries with ‘image=File:’ and … 19.423 objects with ‘/commons/’ in the URL.

With taginfo “image”, searching for “File” I get 109k, and with “File:” it is 88k. I am not sure, but I believe it searches for the substring anywhere inside the value, not just at the beginning.

In the wikimedia_commons tag there are 83k total, of which 28k have the string category: inside, so there are about 55k images.

I can reproduce the numbers with taginfo “image” searching for “File”/“File:” but I am pretty sure it is bogus just looking at the first entry:

The way I got the number is by using osmium tags-filter to extract an .opl file with all image= objects.

Then use grep on that .opl file searching for ‘image=File:’ and count the lines using wc.
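For anyone who wants to see the difference between “value starts with File:” and “value contains File: somewhere”, here is a small Python sketch of the same count. It assumes the usual OPL layout (one object per line, space-separated fields, the tags field starting with “T” and holding comma-separated key=value pairs); the function name is made up, and the escaping of special characters in OPL values is not decoded.

```python
def count_image_values(opl_lines, needle="File:"):
    """Count image=* values that start with / merely contain `needle`.

    Returns a (prefix_count, substring_count) tuple.
    """
    prefix = substring = 0
    for line in opl_lines:
        for field in line.rstrip("\n").split(" "):
            # Assumption: only the tags field starts with an uppercase "T".
            if not field.startswith("T"):
                continue
            for tag in field[1:].split(","):
                key, _, value = tag.partition("=")
                if key != "image":
                    continue
                if value.startswith(needle):
                    prefix += 1
                if needle in value:
                    substring += 1
    return prefix, substring
```

Called with an open file, for example count_image_values(open("images.opl", encoding="utf-8")) where images.opl is a placeholder file name, the first number corresponds to the grep for ‘image=File:’ and the second to a substring search like the one taginfo seems to do.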