I believe there are quite a number of challenges with such an approach. Obviously Mapbox is running such a merging process in-house for their multilingual maps, though they no longer provide the raw merged data to the outside world. Even this scenario might cause problems for map users, as they have no idea where and how to fix potential issues (and asking Mapbox to fix issues seems a bit difficult these days, see link).

Let’s take a look at the general process. Other data consumers don’t rely only on Planet dumps, but also apply minutely diffs to keep their distributed databases up to date (you’ve been through this already…). Mappers also use such distributed mirror databases to download parts of the OSM data, hopefully make some improvements, and then upload the result to OSM again.

If we were to inject some data from Wikidata (say the population, or an elevation tag, or maybe some tag that is no longer kept in OSM) at some point in this process, it very quickly becomes quite messy: once the replicated data hits such a database mirror, you no longer have any idea where the data originated. On the other hand, people expect those tags (e.g. population / elevation) to be in OSM, so you won’t have much fun running a “pure OSM” version.

Bottom line: keeping a clear-cut distinction between “OSM only” and an “augmented version” is something I don’t consider feasible in real-life, day-to-day usage of the data. And those are only the technical issues…

Population is an interesting topic. First of all, I noticed that your query seems to return an arbitrary result when multiple population values for different points in time are available. This should be restricted to the value that is valid at the current date. Apart from that, it’s probably worth looking into some more details.
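To illustrate the restriction to the current value, here is a minimal sketch of what I mean. It assumes the population statements have already been fetched into a list of dictionaries; the field names `value` and `point_in_time` and the function `latest_population` are made up for this example and are not part of your query or of any Wikidata API. Treating undated statements as unusable is also just one possible choice.

```python
from datetime import date

def latest_population(statements, today=None):
    """Return the population value with the most recent date that is not in the future.

    `statements` is a hypothetical list of dicts with the keys
    "value" (int) and "point_in_time" (datetime.date or None).
    """
    today = today or date.today()
    # Ignore statements without a date and statements dated in the future.
    dated = [
        s for s in statements
        if s.get("point_in_time") is not None and s["point_in_time"] <= today
    ]
    if not dated:
        return None
    # Pick the statement closest to (but not after) the reference date.
    return max(dated, key=lambda s: s["point_in_time"])["value"]

# Illustrative data only, not real figures.
statements = [
    {"value": 19500, "point_in_time": date(2011, 8, 9)},
    {"value": 20100, "point_in_time": date(2016, 8, 9)},
    {"value": 18000, "point_in_time": None},
]
current = latest_population(statements, today=date(2020, 1, 1))  # 20100
```

Doing the same thing inside the query itself would be preferable, but the selection logic would be the same: order by date and keep a single value per place.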

I noticed several cities in Australia where population figures were copied to Wikidata from a more recent government source, while OSM still has data that is about 5 years older. However, on closer inspection, the OSM value is still somewhere around 20000 inhabitants, while Wikidata had only 19. It is unlikely that a population would see such a sharp decline in Western Australia in such a short timeframe.

This certainly requires further manual checking, and it’s even possible that the government figures are currently completely off due to some error in their process. Cross-checking OSM against Wikidata could reveal such strange discrepancies.
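Such a cross-check could be sketched roughly as follows. This assumes the OSM and Wikidata population values have already been paired up per place; the function name `flag_discrepancies`, the tuple layout and the threshold of a factor of 2 are arbitrary choices for illustration. The output is only a list of candidates for manual review, since the numbers alone cannot tell you which side is wrong.

```python
def flag_discrepancies(records, max_ratio=2.0):
    """Return the records whose two population values differ by more than `max_ratio`.

    `records` is a hypothetical list of (name, osm_population, wikidata_population)
    tuples; entries with a missing value on either side are skipped.
    """
    flagged = []
    for name, osm_pop, wd_pop in records:
        if osm_pop is None or wd_pop is None:
            continue
        low, high = sorted((osm_pop, wd_pop))
        # A zero or negative value is suspicious in itself; otherwise compare the ratio.
        if low <= 0 or high / low > max_ratio:
            flagged.append((name, osm_pop, wd_pop))
    return flagged

# Illustrative data only: the first entry mimics the 20000 vs. 19 case.
records = [
    ("Town A", 20000, 19),
    ("Town B", 1000, 1100),
    ("Town C", None, 500),
]
suspicious = flag_discrepancies(records)  # [("Town A", 20000, 19)]
```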

Other issues I noticed were caused by different definitions and interpretations of the respective city boundaries (I know this sounds pretty weird). Again, a query to identify such cases adds a lot of value, but fixing the issues requires further manual work. After all, both OSM and Wikidata might be wrong, and you cannot tell just by looking at the number itself.