Is there a procedure to prevent link rot?

Interesting. Looking through the logs I see that the overpass queries for the u0, u1, u2, u3 geohashes have been timing out and need to be broken into smaller chunks. I’m not sure how I’ve missed that before as I have refined the list of chunks at least 7 times since I’ve started. I’ll refine and catch them in the next run or two. Thanks for pointing it out.
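For anyone curious what "breaking into smaller chunks" could look like, here is a minimal sketch. It assumes the queries are bounded by the geohash cell's bounding box; the function names and the Overpass query text are my own illustration, not the actual scripts. Appending one more character to a geohash splits its cell into 32 smaller cells.

```python
GEOHASH_ALPHABET = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_bbox(geohash):
    """Return (south, west, north, east) for a geohash cell."""
    lat = [-90.0, 90.0]
    lon = [-180.0, 180.0]
    is_lon = True  # geohash bits alternate, starting with longitude
    for char in geohash:
        value = GEOHASH_ALPHABET.index(char)
        for shift in range(4, -1, -1):  # 5 bits per character, high bit first
            bit = (value >> shift) & 1
            rng = lon if is_lon else lat
            mid = (rng[0] + rng[1]) / 2
            if bit:
                rng[0] = mid  # keep the upper half
            else:
                rng[1] = mid  # keep the lower half
            is_lon = not is_lon
    return (lat[0], lon[0], lat[1], lon[1])


def split_geohash(geohash):
    """Return the 32 child cells that together cover the given cell."""
    return [geohash + char for char in GEOHASH_ALPHABET]


def overpass_query(geohash, timeout=180):
    """Build a hypothetical Overpass QL query for http:// website tags in a cell."""
    south, west, north, east = geohash_bbox(geohash)
    return (
        f"[out:json][timeout:{timeout}];"
        f'nwr["website"~"^http://"]({south},{west},{north},{east});'
        "out tags;"
    )


# A cell that times out is replaced by its 32 children.
chunks = split_geohash("u0")
```

Each entry in `chunks` can then be queried separately, and any child that still times out can be split again the same way.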

5 Likes

OK, I think I have things wrapped up with running the “https nanny” over several large areas that had been missing from previous runs. After doing a full run, I’ve come up with over 110,000 OSM objects that have http:// URLs where the hostname doesn’t resolve (for me) through DNS. I have now learned that MapRoulette limits challenge sizes to 50,000 tasks, so I have two challenges in my MR project: one for OSM IDs that are odd, and another for ones that are even. This leaves ~10,000 objects out of the challenges for now, but they will hopefully be included as the original tasks get fixed and purged.
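The odd/even split can be sketched roughly like this. It is only an illustration of the arithmetic: `split_by_parity` is a made-up name, and the IDs below are placeholders rather than real OSM IDs.

```python
MAX_TASKS = 50_000  # MapRoulette challenge size limit mentioned above


def split_by_parity(osm_ids, limit=MAX_TASKS):
    """Split IDs into odd and even challenges, each capped at `limit`.

    Returns (odd, even, leftover); leftover holds IDs that did not fit.
    """
    ids = list(osm_ids)
    odd = [i for i in ids if i % 2 == 1]
    even = [i for i in ids if i % 2 == 0]
    leftover = odd[limit:] + even[limit:]
    return odd[:limit], even[:limit], leftover


# Placeholder IDs: 110,000 consecutive integers stand in for the real objects.
odd, even, leftover = split_by_parity(range(110_000))
```

With an even mix of odd and even IDs, 110,000 objects give two full challenges and about 10,000 objects left waiting.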

I had some preliminary challenges set up with a partial dataset and those challenges were getting a healthy amount of attention with no other advertising than MR surfacing the challenge. So I’m rather happy with that and will continue to produce updates, maybe on a quarterly basis.

Because my source data is all of the websites in the database that are http://, I’m only seeing part of the broken website links. My next task is to modify my scripts to look at https:// links that don’t resolve in DNS and generate MR challenges for those as well. It will be interesting to see how many of those are unresolvable compared to http:// links. As it stands, over 3% of the website tags are unresolvable. This doesn’t even take into account the ones that 404.
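As a rough sketch of the DNS check (not the actual scripts; the function names and the `scheme` and `resolver` parameters are assumptions for illustration), Python's standard `socket.getaddrinfo` can be used to test whether a hostname resolves:

```python
import socket
from urllib.parse import urlsplit


def hostname_resolves(hostname, resolver=socket.getaddrinfo):
    """Return True if the resolver can look up the hostname."""
    try:
        resolver(hostname, None)
        return True
    except (socket.gaierror, UnicodeError):
        # gaierror: lookup failed; UnicodeError: hostname is not valid IDNA
        return False


def unresolvable_urls(urls, scheme="https", resolver=socket.getaddrinfo):
    """Return the URLs with the given scheme whose hostname does not resolve."""
    broken = []
    for url in urls:
        try:
            parts = urlsplit(url)
        except ValueError:
            continue  # malformed URL, skipped in this sketch
        if parts.scheme != scheme or not parts.hostname:
            continue
        if not hostname_resolves(parts.hostname, resolver):
            broken.append(url)
    return broken
```

A result from one machine only says the name did not resolve there at that moment, which is why the "(for me)" caveat matters. Passing `scheme="http"` would cover the existing http:// case with the same code.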

The even numbered and odd numbered challenges are now available for anyone that wants to check them out and help.

7 Likes

Nice. No more tasks in Munich. :blush: Only two were found, which I just fixed.

1 Like

This is awesome! I tried to solve some random tasks all over the world and found it very hard. If a place no longer operates I think it’s almost impossible to find the successor without local knowledge, in a foreign language.

For broken links in my home region this will be very helpful. It is a bit like a stale POI check.

2 Likes

While a foreign language does indeed complicate things, I find that searching the web for the amenity name and other available data (phone number, address, etc.) often turns up a new website or Facebook/Instagram/etc. pages that belong to the same place (most often I see that the website domain was left to expire, presumably because the upkeep was too much).

But yes, this challenge does require more research effort than some other challenges (like missing ways or buildings, which are easily checked on aerial photos).

Anyway, there are too many URLs to fix even in my own country (Croatia), so I would recommend people concentrate on their own country first (although I wouldn’t mind if they concentrated on Croatia either :smile:)…

6 Likes