Invalid wiki tags to Polish Wikipedia

Hi, I found many incorrect wikipedia/wikidata tags linking to Polish Wikipedia. They point to “disambiguation pages”. Also, let’s discuss the guidelines for wiki tags (see below). Thanks!

My original post to

TLDR: researching ways to validate wikipedia and wikidata tags, wrote a
script to cross-check OSM and Wikidata, found many incorrect disambig
references, would love to start community discussion on best guidelines
going forward.

I have been analyzing the quality of OSM’s wikipedia and wikidata tags by
cross-checking data using both OSM tags and Wikidata. My first goal is to
fix “disambiguation” references - when an OSM object links to a Wikipedia
disambiguation page, instead of the real location page. I have already
fixed about 200 objects, but there are about 800+ relations left, and I
could really use some help. I don’t think it’s possible to add them to
MapRoulette just yet.

While fixing wd/wp tagging issues, I have been putting together a list of
open questions on how we want to improve wikipedia and wikidata tags in
general, and create some guidelines. Let’s discuss them on the talk page?

Lastly, if you have any suggestions on different ways to validate data
using the mixture of Wikidata and OSM, let me know. At the moment I have a
list of all the types of the wikidata items that OSM objects link to, and I
mark the bad types with a value. If the “instance of” of an OSM object’s
wikidata item is one of the bad types, my script puts that OSM object into
a separate list that I can analyze.
The list of types is here - sort by the second column:
Feel free to modify the second value of any row to indicate that those
objects should be fixed.
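
For anyone who wants to try the same kind of check, here is a minimal sketch of the filtering step described above. It is not the actual script: the function name, the input shapes, and the example IDs in the usage note are assumptions. The two type IDs are the Wikidata items for “Wikimedia disambiguation page” (Q4167410) and “Wikimedia list article” (Q13406463); the real list of bad types is the one in the table.

```python
# Assumed set of "bad" types; the real list lives in the shared table.
# Q4167410 = Wikimedia disambiguation page, Q13406463 = Wikimedia list article.
BAD_TYPES = {"Q4167410", "Q13406463"}


def find_suspect_objects(osm_objects, instance_of, bad_types=BAD_TYPES):
    """Return OSM objects whose wikidata item is an instance of a bad type.

    osm_objects: iterable of (osm_id, wikidata_id) pairs (hypothetical shape)
    instance_of: dict mapping wikidata_id -> set of "instance of" (P31) type IDs
    """
    suspects = []
    for osm_id, wikidata_id in osm_objects:
        # Items with no known "instance of" value are skipped, not flagged.
        types = instance_of.get(wikidata_id, set())
        bad = types & set(bad_types)
        if bad:
            suspects.append((osm_id, wikidata_id, sorted(bad)))
    return suspects
```

Called with pairs such as `("relation/1", "Q900000001")` (a made-up ID) and a matching `instance_of` mapping, it returns only the objects that need a manual look, together with the bad types that triggered the match.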

The one called “Józefów” on your list is none of the many “Józefów’s” in Wikipedia. It has no Wikipedia page of its own, AFAIK.

I think the wikipedia and wikidata tags should be deleted then (unless there is a wikidata entry?). It is better to have no tag than to have an incorrect one. Thanks for doing it!

@nyuriks, will you be refreshing this list?

Updated -

I will be updating the list occasionally.

It would be a good idea to set up a bot to update the list, so that it can function as the current “to do” list.

…aaaaand whooops. Guess who added a wikidata tag pointing to a disambiguation page :smiley:

Of course - thanks to the wikidata tags, it is now possible to find all these incorrect wikipedia tags, and fix them. I can only do complex detections based on wikidata.

Well, I suspected some bot of yours to do it :wink:

If we made a MapRoulette challenge out of it, I might be able to set some people to play it…

@rmikke, I tried, but couldn’t figure out how to make MR deal with relations. I think it cannot. In any case, there are now only 177 Polish links left, and all others are done. Volunteers needed :slight_smile:

Can you hack around this limitation by using out center; ?
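
To make the `out center;` idea concrete, here is a rough sketch. It builds an Overpass QL query for a list of relation IDs and turns the JSON response into point features, one per relation. The function names and the GeoJSON output shape are my assumptions, and I have not checked what MapRoulette actually expects as input.

```python
def build_center_query(relation_ids):
    """Build an Overpass QL query returning each relation with a center point."""
    ids = ",".join(str(int(i)) for i in relation_ids)
    return f"[out:json];relation(id:{ids});out center;"


def elements_to_points(overpass_json):
    """Convert an Overpass JSON response into a GeoJSON FeatureCollection of points."""
    features = []
    for el in overpass_json.get("elements", []):
        center = el.get("center")
        if center is None:
            # Element came back without a center; skip it.
            continue
        properties = dict(el.get("tags", {}))
        properties["osm_id"] = f'{el["type"]}/{el["id"]}'
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON order is longitude first, then latitude.
                "coordinates": [center["lon"], center["lat"]],
            },
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}
```

This only gives each relation a point to show on a map; the actual fix still has to be made in an editor.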

Possibly, but the editing still has to be done in iD or in JOSM, not via MR. Plus I haven’t figured out how to add any items to MR yet, even when copy/pasting them :frowning:

Yeah. I find MR quite overcomplicated, and it has a few annoying bugs / UX issues. For now, sadly, mvexel is busy with another project, it seems.

Is there a quick way to get the wikidata entry for a wikipedia article?
For example, how do I get the wikidata entry for Chrusty?

For now, I had to manually browse through search results, as the entry has no label.

In Wikipedia, on the left-hand side, click “Element Wikidanych” (Wikidata item). Also, if you use JOSM, there is a Wikipedia plugin that will fetch the wikidata IDs for the current elements if the wikipedia tag is set. Also, in the iD editor, if you add the wikipedia field (not the tag!), it will auto-add the wikidata field.
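
If you need this for many articles at once, the MediaWiki API can return the linked Wikidata item through `prop=pageprops` with `ppprop=wikibase_item`. Below is a minimal sketch; the helper names are mine, and the code only builds the request URL and parses a response, so fetching is left to whatever HTTP client you prefer.

```python
from urllib.parse import urlencode


def wikidata_lookup_url(title, lang="pl"):
    """Build a MediaWiki API URL asking for the Wikidata item of an article."""
    params = {
        "action": "query",
        "prop": "pageprops",
        "ppprop": "wikibase_item",
        "redirects": 1,
        "titles": title,
        "format": "json",
    }
    return f"https://{lang}.wikipedia.org/w/api.php?" + urlencode(params)


def extract_wikidata_ids(api_response):
    """Map each returned page title to its Wikidata ID (None if it has none)."""
    result = {}
    pages = api_response.get("query", {}).get("pages", {})
    for page in pages.values():
        item = page.get("pageprops", {}).get("wikibase_item")
        result[page.get("title")] = item
    return result
```

For example, `wikidata_lookup_url("Chrusty")` gives a URL you can open in a browser, and `extract_wikidata_ids` pulls the Q-number out of the decoded JSON.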

Now that’s much faster :smiley:

Just thinking… Couldn’t it be automated the other way round?

When you have a place and a disambiguation page, a bot could browse the wikidata entries the disambiguation page links to. They contain coordinates that can be compared to the coordinates of the place, and if only one wikidata entry lies within a reasonable radius of the place in OSM, you’ve got your wikidata link for the place.

There would still be places where no corresponding wikidata entry is found, or where more than one is, but less than 5% of the manual work would be left.
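
A rough sketch of that matching step, just to illustrate the idea: the 5 km default radius, the function names, and the input shapes are all assumptions, and collecting the candidate entries from the disambiguation page is not shown.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def pick_unique_candidate(place, candidates, radius_km=5.0):
    """Return the only candidate within radius_km of the place, else None.

    place: (lat, lon) of the OSM object
    candidates: dict mapping wikidata_id -> (lat, lon), or None if the
                wikidata entry has no coordinates (such entries are ignored)
    """
    nearby = [
        qid for qid, coords in candidates.items()
        if coords is not None
        and haversine_km(place[0], place[1], coords[0], coords[1]) <= radius_km
    ]
    # Zero matches or several matches both need a human to decide.
    return nearby[0] if len(nearby) == 1 else None
```

Anything that comes back as `None` would go to the manual list.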

Not really worth it for the disambig case - there are now fewer than 100 disambig links left (out of the initial 1500+ that accumulated ever since the wikipedia tag was added), and with the wikidata tag present, they won’t happen very often. On the other hand, there are thousands of links to “lists”, and those will most likely need to be fixed, probably by hand, by replacing them with something like a “wikipedia:partof:…” tag, or possibly by finding better wikipedia articles. Also, the coordinates are frequently missing on wikidata, which adds to the confusion. But yes, I agree that this process should be automated more.

Bots don’t get confused, bots just ignore entries with no coordinates :smiley: