The new rate limiting prevents participants in Missing Maps mapathons from saving buildings

Making a significant API change like that would be a much bigger project and involve many more people, across multiple software projects. If you think that is worthwhile, you need to start to propose the API that you want to see, and then figure out the UI for it. Also, while a change to the rate limit could be done fairly quickly, a new API takes longer.

7 Likes

Missing Maps publishes cumulative statistics for the period 2015 to October 2023 (see Statistics). It reports 50 million buildings created since 2015 in the context of these projects, 15 million modified and 700,000 deleted. It also shows very low retention, with only 7.4% of contributors still active after 5 days.

Various studies, including my diary entry (PierZen's Diary | OSM Contributors Outlook - The Pulse of OpenStreetMap Contributors | OpenStreetMap), show this pattern of participation in OSM. I compiled monthly contribution statistics from the OSM Planet changeset files, classifying contributors by days of participation in a way similar to Pascal Neis. For the period 2010-01 to 2017-08, I estimate that contributors who lasted 1-2 days contributed on average 122 objects, while contributors with 3-14 days of contribution edited on average 1,013 objects (i.e. nodes, ways, relations).

In this context, I would like to read arguments that justify increasing the rate limit above 1,000 objects per hour. It would also be interesting to document the contributions of those who hit the limit in their first days of contributing.

There are some actions by new contributors that should raise a flag saying "let's look more closely". Deletions by new contributors, removing names, or revising/deleting relations are surely part of this. With editors such as RapiD, there is also the risk that new contributors simply import thousands of buildings without checking them against appropriate background imagery.

Detailed compilations are available at OsmContributorStats-Changesets/Pulse_of_OpenStreetMap_Contributors_2017-10-11 at master · pierzen/OsmContributorStats-Changesets · GitHub.

1 Like

I had a DWG case a while ago where new mappers, contributing to a Govt program, were apparently having a competition amongst their group, to see who could add 10000 random, untagged, nodes the quickest! :face_with_raised_eyebrow:

One of them, on their first day of mapping, had ~150k “edits” to their name :angry:

No, this wasn’t HOT or MSF, but it still indicates a problem.

I see two different factors in making a decision whether to accept changes:

  • “whether the user is a newbie or regular/advanced” - I understand that getting some global reputation score for a user is difficult and requires extra caching fields and periodic updates (but would bring the best results in the long run). On the other hand, simpler detection (e.g. was the account created in the past 5 days) should be relatively easy.

  • (only) if the user is determined to be a newbie, what counts as highly likely problematic activity? (not all changes are the same, e.g. adding tagless nodes should have much less impact than, say, deleting or changing name tags)


What I am wondering is how hard it would be to “simply” count some of the changes by the user differently, as there seems to be some relatively low-hanging fruit there.

As I understand it (please correct me if wrong), limiting currently works by comparing the raw count of objects added/deleted/changed by that user in the recent past, e.g. something like this pseudocode (exact numbers are just examples):

reject if count_changes("uid=%uid% and time > now-30min") > 1000

If we are considering increasing the limit of changes (from 1000 in that example to, say, 3000) to avoid problems with mapathons, how hard would it be to extend the limit so that certain things (like deletion/modification of name tags) are counted with some multiplier, in order to re-balance the equation in case of vandalism? e.g. something like:

reject if count_changes('uid=%uid% and time > now-30min') +
          10 * count_changes('uid=%uid% and time > now-30min and
              (tag_changed="name" or tag_deleted="name")')
       > 3000
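
To make the multiplier idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `Change` record, the `touches_name` flag and the function names are hypothetical, and the 30-minute window, the limit of 3000 and the weight of 10 are just the example numbers from above, not values from the deployed rate limiter.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical example values, taken from the pseudocode above.
WINDOW = timedelta(minutes=30)
LIMIT = 3000
NAME_TAG_WEIGHT = 10  # extra cost for an edit that changes or deletes a name tag


@dataclass
class Change:
    """One edited object (hypothetical record, not an OSM API type)."""
    uid: int
    time: datetime
    touches_name: bool = False  # True if a name tag was changed or deleted


def weighted_count(changes, uid, now):
    """Count the user's changes inside the window, weighting name-tag edits."""
    total = 0
    for change in changes:
        if change.uid == uid and change.time > now - WINDOW:
            total += 1  # every change counts once
            if change.touches_name:
                total += NAME_TAG_WEIGHT  # plus the multiplier term
    return total


def should_reject(changes, uid, now):
    """Reject the upload once the weighted count exceeds the limit."""
    return weighted_count(changes, uid, now) > LIMIT
```

With these numbers, 2000 plain edits in the window stay under the limit, while 300 name-tag edits already count as 3300 and would be rejected.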
1 Like

Absolutely, this is a discussion that started in June this year, nobody is taking it lightly.

The rate-limiting is, just like you pointed out, the low hanging fruit and has been rolled out not very long ago. I’m no programmer but I expect it was not trivial either.

So the reason I commented here is to align the viewpoints across a bigger part of our ecosystem and to indicate that the pains we see in this thread are expected to be temporary. We can all join in the design and communication to make the transition as smooth as possible. What we cannot do is ignore the rationale and the need for actual limits.

1 Like

I was curious to see how exactly it is implemented now, so for anyone else who also wants to have a look:

The main code (in the form of a database function): https://github.com/openstreetmap/openstreetmap-website/blob/master/lib/database_functions.rb#L2-L57
Settings (specific values, used by the function): https://github.com/openstreetmap/openstreetmap-website/blob/master/config/settings.yml#L67-L73

2 Likes

Note that the links go to the dev repo of a specific person, not the deployed version (so if something changes, it may not be visible here).

1 Like

Oops, I was a bit too quick there, fixed!

In today’s mapathon #mmmza11 we reached the limit just once, mostly because there were just a few new mappers. It was reached with this changeset: Changeset: 144803560 | OpenStreetMap. You can see the first upload attempt when the changeset was created, and after 10 minutes of waiting it was possible to upload the changes.

5 Likes

For info, the account that reached the limit had these edits:

    id     | num_changes |     created_at
-----------+-------------+---------------------
 144768383 |           2 | 2023-12-04 20:06:58
 144768391 |           1 | 2023-12-04 20:07:09
 144768401 |           3 | 2023-12-04 20:07:28
 144768384 |           4 | 2023-12-04 20:06:59
 144783569 |           2 | 2023-12-05 09:13:36
 144783587 |           1 | 2023-12-05 09:14:18
 144783598 |           3 | 2023-12-05 09:14:34
 144800157 |         130 | 2023-12-05 17:11:52
 144800595 |          65 | 2023-12-05 17:23:53
 144801277 |         386 | 2023-12-05 17:44:31
 144801622 |         298 | 2023-12-05 17:55:29
 144802449 |         565 | 2023-12-05 18:19:38
 144802600 |          33 | 2023-12-05 18:24:23
 144803022 |         411 | 2023-12-05 18:35:53
 144803502 |         518 | 2023-12-05 18:50:48
 144803560 |          25 | 2023-12-05 18:52:40
 144805432 |           1 | 2023-12-05 20:04:31
 144805427 |           3 | 2023-12-05 20:04:21
(18 rows)

I hope they don’t mind me posting that here - to be clear, there’s no suggestion that they’re doing anything wrong; I’m just including it here to help understand the sorts of changes involved. Most of the buildings here are rectangular, not round.

5 Likes

And at first glance, those changesets look fine to me, e.g. changeset #144803502 added 84 new houses and modified a dozen of them in about 15 minutes. So they could easily go over a 1000-object limit in about half an hour.

So, about 10 seconds per building on average, which looks to me like it shouldn’t be blocked for a new user. It does not seem too fast at all, especially for JOSM, in which that changeset was done. If they were shown building_tools (which they absolutely should be, if they are being instructed to draw buildings in JOSM!), it is quite reasonable to expect that even an OSM newbie can quickly learn to draw a quite decent square building in 2 seconds or so (much less 10 seconds!), especially if they are proficient with a mouse.
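
For anyone who wants to check the arithmetic, here is a rough Python sketch that sums `num_changes` over a sliding window, using the changeset list posted above. The one-hour and 30-minute windows are assumptions chosen for illustration, and `created_at` is when a changeset was opened, not when each object was uploaded, so this only approximates what a rate limiter would count.

```python
from datetime import datetime, timedelta

# (changeset id, num_changes, created_at), copied from the table posted above.
CHANGESETS = [
    (144768383, 2, "2023-12-04 20:06:58"),
    (144768391, 1, "2023-12-04 20:07:09"),
    (144768401, 3, "2023-12-04 20:07:28"),
    (144768384, 4, "2023-12-04 20:06:59"),
    (144783569, 2, "2023-12-05 09:13:36"),
    (144783587, 1, "2023-12-05 09:14:18"),
    (144783598, 3, "2023-12-05 09:14:34"),
    (144800157, 130, "2023-12-05 17:11:52"),
    (144800595, 65, "2023-12-05 17:23:53"),
    (144801277, 386, "2023-12-05 17:44:31"),
    (144801622, 298, "2023-12-05 17:55:29"),
    (144802449, 565, "2023-12-05 18:19:38"),
    (144802600, 33, "2023-12-05 18:24:23"),
    (144803022, 411, "2023-12-05 18:35:53"),
    (144803502, 518, "2023-12-05 18:50:48"),
    (144803560, 25, "2023-12-05 18:52:40"),
    (144805432, 1, "2023-12-05 20:04:31"),
    (144805427, 3, "2023-12-05 20:04:21"),
]


def busiest_window(rows, window=timedelta(hours=1)):
    """Largest sum of num_changes over any window ending at a changeset time."""
    parsed = sorted(
        (datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), n) for _, n, ts in rows
    )
    best = 0
    for end, _ in parsed:
        # Sum everything opened in the half-open interval (end - window, end].
        total = sum(n for t, n in parsed if end - window < t <= end)
        best = max(best, total)
    return best


print(busiest_window(CHANGESETS))                         # busiest hour
print(busiest_window(CHANGESETS, timedelta(minutes=30)))  # busiest half hour
```

With the data above this gives 1850 changes in the busiest hour and 1009 in the busiest 30 minutes, which fits the estimate that a mapper drawing buildings at this pace passes 1000 objects in roughly half an hour.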

11 Likes

Here is a list of users who participated yesterday. Somebody can calculate averages per hour. (These were not new mappers, but at least we get some data about how fast the limit can be reached with mostly good data.)
openstreetmap.org/user/L Sochorova/history
openstreetmap.org/user/Pavol Žigo/history
openstreetmap.org/user/MatusD/history
openstreetmap.org/user/MaKr0/history
openstreetmap.org/user/Branisla_0/history
openstreetmap.org/user/matty351/history
openstreetmap.org/user/Pinzo/history
openstreetmap.org/user/kristiand22/history
openstreetmap.org/user/ErikHadida/history
openstreetmap.org/user/Filip Nadžady/history
openstreetmap.org/user/mkropuch/history
openstreetmap.org/user/Kristián Bačik/history
openstreetmap.org/user/abil3311/history
openstreetmap.org/user/infiniteatomic/history
openstreetmap.org/user/alfonzalfonz/history
openstreetmap.org/user/michalkova/history
openstreetmap.org/user/TheElevatedOne/history
openstreetmap.org/user/psondreicka/history
openstreetmap.org/user/VlNecesany/history
openstreetmap.org/user/Krysa9/history

4 Likes

More and more I see the need to introduce account moderation functionality. I also run mapathons myself in Poland, mostly for people who are professionally involved in mapmaking. It should be possible to increase the limit for accounts, either by a moderator or for people who belong to a group such as MM or a Local Chapter. Extending the functionality of OSM.org accounts with such features should have happened a long time ago.

2 Likes

In principle, I agree with Cristoffs, although in our case we can speak of mapathon participants belonging to a local chapter of Missing Maps only in Czechia, Slovakia and the UK, and to OSM sometimes when our GIS officers organize a mapathon locally, in Zimbabwe, for example. In some other countries we organize them regularly, like every 2 or 3 months, yet there is no self-identified community. So having to belong to a defined group to get a more reasonable limit would be limiting for MSF mapathon participants in the roughly 17 remaining countries, which account for 65% of our mapathons.

1 Like

Deviating from the topic (sorry) :smile:
We could think about organizing mapathons under the aegis of MM in Poland too. Send me a PM and we will talk.

1 Like

That is an interesting suggestion to count more ‘grave’ changes, like deleting buildings or changing tags, with a multiplier! I would support that.

In order to help people think about the sort of edits we’re seeing on the other side of the coin, here’s a set of changesets that I’ve just reverted:

    id     | num_changes |     created_at
-----------+-------------+---------------------
 143552401 |           7 | 2023-11-02 23:28:08
 144221297 |           2 | 2023-11-19 19:52:56
 144879446 |         523 | 2023-12-07 20:14:48
 144879477 |       10000 | 2023-12-07 20:15:37
 144879565 |       10000 | 2023-12-07 20:19:34
 144879570 |         609 | 2023-12-07 20:19:47
 144879580 |        3438 | 2023-12-07 20:20:05
 144879619 |       10000 | 2023-12-07 20:21:24
 144879621 |        4823 | 2023-12-07 20:21:33
 144879632 |        1900 | 2023-12-07 20:22:17
 144879677 |        7289 | 2023-12-07 20:24:12
 144879700 |        2593 | 2023-12-07 20:25:11
 144879719 |        7289 | 2023-12-07 20:25:52
 144879746 |         437 | 2023-12-07 20:26:41
 144879759 |         779 | 2023-12-07 20:27:07
 144879783 |       10000 | 2023-12-07 20:27:55
 144879824 |       10000 | 2023-12-07 20:29:28
 144879831 |       10000 | 2023-12-07 20:29:35
 144879834 |        8356 | 2023-12-07 20:29:43
 144879905 |        1249 | 2023-12-07 20:32:23
 144880015 |         224 | 2023-12-07 20:37:07
 144880085 |         224 | 2023-12-07 20:40:28
 144880311 |          16 | 2023-12-07 20:49:46
 144880419 |         199 | 2023-12-07 20:54:32
 144880657 |           0 | 2023-12-07 21:03:15
(25 rows)

Example: https://www.openstreetmap.org/changeset/144879446

2 Likes

I wonder if the entire contribution of the local mapathon could be stored in a kind of ‘staged repository’ before ultimately being committed to the main OSM database.

It could function similarly to a Git repository:

  1. Pull data from the main OSM database (for a specific area) and create a local repository.
  2. Push all contributions from the mapathon to that local repository.
  3. After undergoing reviews by the mapathon manager, this local repository will be pushed to the main OSM database.

Optional: Create a local OSM Carto renderer for the repository containing edits from the local mapathon.

Corollary: This will create numerous ‘OSM local forks’ in a specific local area. I don’t know whether this is a good thing or not.


The technical challenge here is how to make the core OSM infrastructure more mobile, allowing anyone to initiate their local “OSM server” and begin serving the read-write API for local mapathon contributors.

1 Like

Proxy accounts merging contributions from multiple users are in general a really bad idea.

6 Likes

I’m also doubtful of that idea, both because it has a high risk of “fragmenting” the community and because it would be a gigantic technical undertaking. Since I still haven’t gotten a response on whether there would be developers from the humanitarian communities who’d be able to help implement any changes, I doubt anyone is willing to take it on.

However, something similar can be done with our tools today; it’s possible to download change files from both iD and JOSM. Those could then be uploaded later by the mapper (though with the risk that this gets forgotten) or by someone with higher rate limits (though that would give the wrong attribution, a similar issue to the one Mateusz mentioned).

It would also be possible to build a “queue” service, where anyone can log in and upload a change file, and the service would then periodically retry uploading it. This would be similar to your intermediary/staging area idea. It shouldn’t be too hard technically, but would need some safeguards against usage by bad-faith actors (otherwise they’d just end up queueing a lot of vandalism changes).
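
As a rough sketch of what the retry part of such a queue could look like in Python: the `upload` callable, the `RateLimited` exception and the item layout are all hypothetical placeholders, not part of any existing OSM tool or API, and the attempt count and waiting time are arbitrary example values. The safeguards against bad-faith use are deliberately left out.

```python
import time
from collections import deque


class RateLimited(Exception):
    """Raised by the (hypothetical) upload function when the API refuses an upload."""


def drain_queue(queue, upload, max_attempts=5, wait_seconds=600, sleep=time.sleep):
    """Upload queued change files, retrying the ones that hit the rate limit.

    queue  -- deque of dicts with keys "user", "osc" (change file) and "attempts"
    upload -- callable(user, osc) that uploads on behalf of that user, so the
              attribution stays with the original mapper (an assumption here)
    Returns the items that still failed after max_attempts.
    """
    failed = []
    while queue:
        item = queue.popleft()
        item["attempts"] += 1
        try:
            upload(item["user"], item["osc"])
        except RateLimited:
            if item["attempts"] >= max_attempts:
                failed.append(item)   # give up and report back to the mapper
            else:
                queue.append(item)    # put it at the back and try again later
                sleep(wait_seconds)
    return failed
```

Passing `sleep` as a parameter is only there so the waiting can be skipped when trying the function out; a real service would more likely run this from a scheduler than block in a loop.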