The new rate limiting prevents participants in Missing Maps mapathons from saving buildings

Yes, they kept trying to upload every minute or so, until they were successful.

Yes, they could continue editing, but the more they add, the later they can upload. So it’s counterproductive to map more.

It would be possible for them to keep mapping and upload the changes an hour later, once they get home.

The organizer could upload their changes, but this would not help with morale, because the organizer would “steal” the whole work. So they would not see their results, or how much they have mapped.


But are we really interested in new users barging in and uploading 2500
edits? Is that a healthy approach for good and lasting contributions?
Even in a Mapathon situation, do we want to gamify the whole thing into
a “who maps most” competition? There’s enough overlapping, duplicated,
jarry-angled buildings in OSM as it is. What’s wrong with “you can only
add 100 buildings on your first day, so take your time and do them well”?


They can map more at the next mapathon!

We all own OSM. Every map edit benefits every one of us!

Frederik, the new users are not “barging in”, they are invited by us, representatives of humanitarian organisations (Médecins Sans Frontières in my case, that is why I started this thread) to help us improve the base maps in areas where we either have (mostly medical, in our case) projects and activities ongoing, or are assessing the needs.
And what is wrong with that approach is that you demotivate and likely lose the most capable mappers, the quick learners, who could eventually become inspiring role models in their communities or for other students (like @Filip009 above) and/or validators. As I already wrote above, at some of our mapathons for Czech & Slovak mappers, the organisers teach JOSM directly, and we do a thorough training and answer all the questions regardless of the mapping editor, so the buildings the new users create are usually orthogonalized and, in general, validators report high-quality mapping. So the limit in such cases tends to be misplaced.


Totally agree to some stringent limits on new accounts without track record lest this gets pre-agreed on incidentals with QA. Came across a village recently which from above looks like 150+ distinct buildings, residentials mostly, sheds, gardens… mapped as 6 following the edges of the streets. MapWithAI wonders or something like ‘map filler’. There’s the case @ivanbranco analysed in Algeria, 156k buildings or so imported with, yes MWA, with way crossings and 131K building overlaps. No responses so guess it’s still an open case. Mopping with the tap full open.


Another recent example is this one in Ecuador. New mappers seem to have been given some particularly poor advice, resulting in duplicated and triplicated buildings, buildings called “edits”, etc.

Can anyone in this thread help find who organised that activity?


I’d like to set straight some facts that might have got lost.

The rate limit of 1000 is per first hour, not per first 24 hours. The clock starts with the upload of the first object, not with account creation.
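
As a minimal sketch of the rule as stated here (not the server's actual code, which may differ in detail), the check could look like this. The function names and the inclusive comparison against the limit are assumptions made for illustration only.

```python
from datetime import datetime, timedelta

FIRST_HOUR_LIMIT = 1000        # figure quoted in this post
WINDOW = timedelta(hours=1)    # "per first hour"


def first_hour_changes(uploads):
    """Sum the changes uploaded within one hour of the first upload.

    `uploads` is a list of (timestamp, number_of_changes) pairs.
    The clock starts at the first upload, not at account creation.
    """
    if not uploads:
        return 0
    start = min(ts for ts, _ in uploads)
    return sum(n for ts, n in uploads if ts - start < WINDOW)


def fits_first_hour_limit(uploads, ts, n):
    """Return True if a new upload of `n` changes at `ts` stays within the limit.

    Hypothetical helper: uploads after the first hour are always accepted
    here, because the post only describes the first-hour limit.
    """
    if not uploads:
        return n <= FIRST_HOUR_LIMIT
    start = min(t for t, _ in uploads)
    if ts - start >= WINDOW:
        return True
    return first_hour_changes(uploads) + n <= FIRST_HOUR_LIMIT
```

For example, a mapper who uploaded 600 and then 300 changes in the first 20 minutes could still upload 50 more in this sketch, but not 200.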

Any such mechanism must be extremely simple: the algorithm must be such that it scales faster than any potential attack. And more importantly, judgement criteria must be so simple that the total time spent on reviewing complaints does not overwhelm the real people tasked with that. This makes any ideas with bounding box or edit type adaptive rules or whatever too complex to administer.

The rate limits would have been hit in the past 11 years relatively rarely (in the talk after 11 minutes) by benign users. Note that I have hand checked for a substantially higher limit. The lower limit has been hit more often: in absolute terms too often for manual review, but in relative terms rarely.

Not every hyperactive editing activity is vandalism. There are unusual editing patterns (pattern, not content) that run into the limit accidentally. However, the limit is set such that mappers with an average degree of diligence and current tooling will not run into the limit.

In total we have seen in the past 11 years an estimated 200-2000 benign recurring users (out of a total of almost 2 million mappers) that have uploaded in the first hour more than 1000 edits. The 20k number in the talk includes one-off mappers, imports, bots, and spam, and the 200-2000 are extrapolated from the findings after manually analyzing users as explained in the talk.

The limit is designed to give long term users and ultimately the DWG enough time to stop vandalism early. This has and must have priority, because these caring users are scarce. The number of now long term users we might have put off if we had already had the limit in the past is most likely a single-digit figure per year or zero.

However, it would be a beneficial research task: How many (and which) users that now fulfill the conditions for active users have uploaded over 1000 object versions in the first hour of their mapping activity, structured by number of years of ongoing activity?
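
A rough sketch of how such a count could be structured, assuming per-user changeset records of the form (creation time, number of changes) are already available. The input format, the function names, and the `is_active` predicate are all hypothetical; the real definition of an active user would have to be plugged in, and changeset creation times only approximate upload times.

```python
from collections import Counter
from datetime import datetime, timedelta


def first_hour_changes(changesets):
    """Sum changes in changesets opened within one hour of the user's first one."""
    start = min(ts for ts, _ in changesets)
    return sum(n for ts, n in changesets if ts - start < timedelta(hours=1))


def heavy_starters_by_years(users, threshold=1000, is_active=lambda cs: True):
    """Count users above `threshold` in their first hour, keyed by years of activity.

    `users` maps a user name to a list of (datetime, num_changes) pairs.
    `is_active` is a placeholder for the real "active user" condition.
    """
    buckets = Counter()
    for changesets in users.values():
        if not is_active(changesets):
            continue
        if first_hour_changes(changesets) > threshold:
            first = min(ts for ts, _ in changesets)
            last = max(ts for ts, _ in changesets)
            buckets[(last - first).days // 365] += 1
    return buckets
```

The result maps whole years between first and last changeset to the number of users who exceeded the threshold in their first hour.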


Currently the limit is at 1000 nodes.
Would it be a better solution to set the limit to 1000 modified (or created) objects (ways/nodes/relations), or would this make vandalism too easy again (based on the known vandalism patterns)?

Why do you think so? I am pretty sure that ways and relations are also counted already.

The number of changes/edits is what is considered; see openstreetmap-website/app/controllers/api_controller.rb at master · openstreetmap/openstreetmap-website · GitHub

It should be noted that the relationship between changes made by a user in an editor and what gets uploaded and recorded as a change isn’t directly one to one or as simple as you think it may be (but that is likely not an issue in the context of Missing Maps).


Did you read my earlier comment? The question then becomes “what is wrong in taking time going to the mapathon, listening to theory and instructions for an hour, and then going back home after doing only 15 minutes of actual mapping, because the edits you do after would be lost anyway”.

That it’s not very motivating is the problem (if the intention is to draw in new contributors).

Now, if the technical implementation was different (e.g. if it allowed users to press save and have the objects upload successfully, but only be applied later somehow), I agree it wouldn’t be a problem.
The primary problem is allowing the user to invest time doing things and then refusing to accept those changes, which would be highly off-putting to many users IMHO. (A secondary concern is setting the limit on how much a new user may do too low. But the primary one is that they only find out about it after they’ve already invested time.)


If they’re teaching them to draw buildings in JOSM with the building_tools plugin (as they absolutely should be if they’re teaching them to draw buildings in JOSM at all), even a total OSM newbie (provided that they have used a computer with a mouse before in their life) can easily create more than 200 completely square and perfectly positioned buildings in an hour (which exceeds the 1000-element limit), with no effort.
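
The arithmetic behind that figure can be sketched as follows, assuming each building is one closed way with its own nodes (no nodes shared with neighbours) and each created node or way counts as one change. The function names are made up for illustration.

```python
def elements_per_building(nodes_per_building):
    """One way plus its distinct nodes, each counted as one change."""
    return nodes_per_building + 1


def buildings_until_limit(nodes_per_building, limit=1000):
    """How many whole buildings fit into `limit` changes."""
    return limit // elements_per_building(nodes_per_building)


# A rectangular building has 4 corner nodes and 1 way, so 5 changes.
print(buildings_until_limit(4))   # 200
# A round building drawn with about 20 nodes costs 21 changes.
print(buildings_until_limit(20))  # 47
```

So 200 rectangular buildings use up exactly 1000 changes under these assumptions, while for round buildings with about 20 nodes the 48th building would already exceed the limit.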

And JOSM will warn about overlapped/duplicated buildings, although if someone didn’t care about data quality that wouldn’t be fixed by warnings (nor by limiting a number of edits). So that would depend on having taught organizers to validate data before upload and/or as a separate step (e.g. Tasking manager has separated mapping sessions, and separate validation sessions; and yes I know that isn’t perfect either).

They could, provided they weren’t hugely demotivated by the work they invested being lost because of arbitrary limits.

Still, many people take pride in something being accomplished by themselves (“this exists because I personally did it!”), and value that experience much more than if the data were just handed to them.

It is IMHO the combination of both of those (having most of the map handed to you freely, but also contributing your part freely to others) that makes the OSM database (and many FOSS projects with multiple contributors) successful.


I do find the methodology there somewhat problematic. It seems that they’ve cut out as unimportant outliers anybody with “less than 100 changesets” because they’re “not sustainably active, thus not a big loss for a community”.

One thing is that this is IMHO a horrible justification (do we really expect to attract new contributors with an attitude like that?! What happened with “If everybody mapped only their own neighbourhood we’d have perfect map of the world in minutes”? That ideal scenario requires one changeset per mapper)

But another thing is that the premise is problematic for those activities designed to attract new users (basically most users at a mapathon that succeeds in attracting new users will by definition have “less than 100 changesets”), and thus will not account for the issues we’re seeing here at all. Yes, I understand that this cutoff simplified the analysis a lot, but it also lost a lot of crucial information.

That would be interesting, yes. But it would still suffer from survivorship bias (i.e. “how many didn’t become active solely or mostly because of such arbitrary limits driving them away”, which would be much more interesting, though)

note that it would use data before such limits existed, so would not be subject to this effect

Yes, that was the idea for gathering that info:

  • to see how “new users retention” worked before the introduction of limits, and
  • see how “new users retention” works now after the introduction of limits, and
  • then compare them (of course, as the “with limits” period is much shorter than the “without limits” period, the latter should probably be limited to the same amount of time immediately preceding the introduction of limits, or taken into account in some other way).

But such information would be interesting in addition to the original @drolbr idea (to see what negative effect the introduction of limits has already had), not as a replacement for it.

What happened with
/“If everybody mapped only their own neighbourhood we’d have perfect map
of the world in minutes”/? That ideal scenario requires one changeset
per mapper)

The map gets stale. Depending on location, it will become a second class
choice within three to ten years.

The problem of maintenance gets often overlooked or underestimated. It
is somewhat hidden in the absolutely true statement: “We build a
community, not (only) a map”, where it is implied that community members
strive to keep their stamping ground (and probably unloved surroundings)
up to date, over years and years to come.


None of the blocked mapathon participants has edited close to 1000 objects. It seems they hit the limit well before that.
hrinik5: 958 created nodes in the first hour.
Victoria Gáliková: 948 created nodes in the first hour.
They definitely did not create 1000 buildings (ways).

Some time ago, there was a discussion about users that got blocked while creating round buildings with ~20 nodes each. The high count of nodes per building was identified as the critical part, not the number of buildings (ways).


For the avoidance of doubt - from looking at the changeset feed:

changesets=> select id,num_changes,created_at from osm_changeset where user_name = 'hrinik5';
    id     | num_changes |     created_at
-----------+-------------+---------------------
 148062309 |         365 | 2024-02-29 17:01:33
 148062470 |          95 | 2024-02-29 17:04:51
 148062560 |         100 | 2024-02-29 17:07:24
 148062719 |         169 | 2024-02-29 17:11:23
 148062813 |         115 | 2024-02-29 17:13:42
 148063047 |         110 | 2024-02-29 17:19:04
 148064192 |          93 | 2024-02-29 17:50:14
 148064903 |          89 | 2024-02-29 18:09:48
 148065216 |          10 | 2024-02-29 18:20:24
 148065518 |         311 | 2024-02-29 18:29:13
 148066112 |         590 | 2024-02-29 18:45:18
(11 rows)

Object in OSM means a single node/way/relation, as the server does not attempt to attach any wider meaning to things or decide which components constitute a single real world object.


I absolutely agree with you that maintenance is often even more important than initial mapping, especially regarding relatively fast-changing objects (like POIs). But was that disputed anywhere?

Point is that even if “everyone re-mapped their neighbourhood (and thus whole planet) every year”, they’d still likely all be dead before they reached those “required 100 changesets” to be included by that methodology, which is what I find somewhat :slight_smile: problematic

Absolutely. Which is why I find it crucial not to be off-putting to potential new members and instead to be as inviting as possible to them, as all existing mappers today will get tired of it (or dead) one day relatively soon, and it will be up to those new members we managed to get involved to continue the work.

And yes, I understand that vandalism is also an even bigger (or at least more pressing) issue. But I feel we should strive to find a better balance / solution of false positives and false negatives than is currently the case. (otherwise, we might as well just disable new users signups, and thus solve the vandalism issue rather quickly – but that extreme would obviously be as bad as the other extreme of simply ignoring all vandalism).


Don’t you think that mapathon organisers could easily re-arrange their sessions, starting with a short period of editing followed by a group discussion where we evaluate the mapping done, the problems encountered, difficulties interpreting imagery, etc.? The best would be to have a projector to show the various contributions with OSMCha or a similar tool. Plus, the prolific mappers could surely help others. This might encourage more mappers to come back later and map again.

I briefly present below some statistics that I compiled from the Changesets Planet file. For 2023, I observed that 2,370 new mappers mapped more than 1,000 objects in their first hour. These mappers represent 1.6% of the 152,000 new mappers last year.
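
The share quoted above follows directly from the two counts; a quick check, with variable names chosen only for illustration:

```python
new_mappers_2023 = 152_000      # new mappers in 2023, as quoted above
over_1000_first_hour = 2_370    # of those, more than 1,000 objects in the first hour

share_percent = 100 * over_1000_first_hour / new_mappers_2023
print(f"{share_percent:.1f}%")  # 1.6%
```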

And note that the mappers’ contributions drop off rapidly in the following hours. The first hour alone represents 57% of the objects edited by new contributors in their first 3 days (66% for the first 4 hours).

The graphic below shows the objects edited by new mappers in their first hour of contribution, grouped by the number of objects edited. From Oct. 17 to 28, we observe a high volume of edits in the first hour. Many of these accounts, it seems, do not exist anymore. While these were high volumes of edits, the OSM-Ops need some flexibility to operate when such events arise.