I don’t think that anyone is questioning that these are good-faith edits (I’m certainly not). We’re just trying to understand how we can best distinguish between problematic edits (that the rate limiting is designed to prevent) and the sort of editing that any new user might make (in various circumstances).
As these are mappers introduced to OSM via organised editing sessions, I was just trying to understand what edits they were actually making before they hit problems (see also the github issue for more on this, including suggestions about how to change what is measured).
If the OSM server (possibly through an API) receives information about organized editing activities, including:
- the BBOX (bounding box) of the area,
- the duration of the campaign in hours,
- the OpenStreetMap keys affected by the campaign,
- the link to the campaign,
- the suggested hashtag,
- the requested rate-limit increase,
- the details of at least two organizers involved (user_id),
Then, I believe it might be possible to develop a smarter vandalism filter algorithm for the specified area. This could include a higher rate limit, and be active only within the specified BBOX and for the duration of the campaign, under the responsibility of the organizers.
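To make the idea concrete, here is a minimal sketch of what such a scoped exemption could look like. All of the field names (`rate_limit_multiplier`, `organizers`, and so on) are assumptions for illustration only; nothing like this exists in the OSM API today.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Hypothetical record of a registered organised-editing campaign.
# The fields simply mirror the list of data points suggested above.
@dataclass
class Campaign:
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float
    start: datetime
    duration_hours: int
    keys: list                 # OSM keys the campaign expects to touch
    hashtag: str
    rate_limit_multiplier: int
    organizers: list = field(default_factory=list)  # at least two user_ids

    def covers(self, lon: float, lat: float, when: datetime) -> bool:
        """True if an edit at (lon, lat) at time `when` falls inside the
        campaign's BBOX and time window."""
        in_bbox = (self.min_lon <= lon <= self.max_lon
                   and self.min_lat <= lat <= self.max_lat)
        in_time = self.start <= when <= self.start + timedelta(hours=self.duration_hours)
        return in_bbox and in_time

def effective_rate_limit(base_limit: int, campaigns: list,
                         lon: float, lat: float, when: datetime) -> int:
    """Raise the base rate limit only for edits that match a registered
    campaign; everything else keeps the normal limit."""
    for c in campaigns:
        if c.covers(lon, lat, when):
            return base_limit * c.rate_limit_multiplier
    return base_limit

# Example: an eight-hour campaign near Homa Bay with a 5x allowance.
campaign = Campaign(34.0, -1.0, 35.0, 0.0, datetime(2023, 11, 16), 8,
                    ["building"], "#mapathon", 5, [101, 202])
print(effective_rate_limit(100, [campaign], 34.5, -0.5, datetime(2023, 11, 16, 3)))
```

The point of the sketch is that the higher limit is automatically confined in space and time, so the blast radius of any abuse stays bounded and attributable to the named organizers.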
ImreSamu has a good suggestion, which could also encourage better practices in terms of paperwork.
Now, the rate limit introduced as a reaction to recent vandalism was undoubtedly needed. However, this can of worms is now open and, unfortunately for volunteers’ time, it will need constant attention, as this thread shows.
This comes with responsibilities: some things that were possible before no longer are, and that matters, whether we think such edits are acceptable or not. It would be bad if this sadly needed measure generated bad sentiment within the community, and care must be taken not to express one’s own feelings about mapathons in general here.
Looking at this in more detail now that I’m at home, how can I identify the edits that were made as a result of this (or any other) editing session? I’ve got access to a changeset database and so can do:
```sql
SELECT DISTINCT id, user_id, tags -> 'comment'
FROM osm_changeset
WHERE created_at > '2023-11-16 00:00:00'
  AND created_at < '2023-11-16 23:59:59'
  AND tags -> 'comment' LIKE '%something%';
```
or similarly tags -> 'someothertag' like '%something%'. The challenge is knowing what something or someothertag should be when searching.
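Since changeset comments are freeform, one way to find out what to grep for is to mine the tags themselves for hashtag-like tokens (many editors also write a semicolon-separated `hashtags` changeset tag alongside `comment`). A toy sketch with invented rows, not real changeset data:

```python
import re
from collections import Counter

# Invented changeset rows: (id, user_id, tags). In practice these would
# come from the changeset database query above.
changesets = [
    (1, 10, {"comment": "Buildings in Homa Bay #missingmaps #hotosm-project-15737"}),
    (2, 11, {"comment": "fix road names"}),
    (3, 12, {"hashtags": "#missingmaps;#msf", "comment": "more buildings"}),
]

def candidate_hashtags(rows):
    """Collect hashtags from both the `comment` and the `hashtags`
    changeset tags, so you can see which terms are worth searching for."""
    counts = Counter()
    for _id, _uid, tags in rows:
        text = " ".join(tags.get(k, "") for k in ("comment", "hashtags"))
        counts.update(h.lower() for h in re.findall(r"#[\w-]+", text))
    return counts

print(candidate_hashtags(changesets).most_common(3))
```

Running something like this over a day of changesets would surface the campaign hashtags without knowing them in advance.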
I have long experience with humanitarian mapping collaboration and coordinated major OSM disaster responses before the arrival of Missing Maps. I have also often written about data quality problems in Tasking Manager coordinated mapping projects.
You all want to keep the discussion strictly to rate limits. Looking at the history of HOT/Missing Maps mapping escalation via the Tasking Manager and mapathons versus quality problems, plus the difficulty the community has had in having a good discussion with these humanitarian partners, I personally think it is important to again discuss better monitoring of such projects and the quality of edits, even more so with the arrival of AI-related data imports.
For those not familiar with previous discussions, below are a few links:
See for example changeset 144106801, where 940 buildings were imported. Do you want to remove the rate limit so that this type of data can be imported more rapidly? There is no reference in the changeset to the imagery used to compare the import and fix quality problems, and no reference to the Tasking Manager or any coordinated project.
Comparing the building data added to OSM with Esri or Bing makes me think that no effort was made to correct or align the buildings against standard imagery. If we speed up such imports of AI-detected buildings, we will surely not assure quality.
It is not enough for our humanitarian partners to say that these projects are important for their work while often admitting that they worry more about the number of buildings in an area than about mapping quality.
Or at least a quick solution to the immediate problem. That might involve tweaking the limits, but it might also involve, say, marking the accounts of mapathon participants as exempt from the limit in an unbureaucratic fashion. (I think we already have such a solution for import accounts.)
I’m fully in favor of better organized editing practices as well, but that should be a separate conversation. Our humanitarian partners deserve a timely solution to their pressing issue as a first step.
We have had these discussions since the start of the Missing Maps project back in 2014–2015. We accomplished a lot for the Nepal 2015 earthquake response, but a lot of frustration was expressed by experienced mappers about mapping quality and validation.
If you look at the link above for the Nord Kivu Ebola outbreak in 2018, mapathons organized by partners suffered from poor-quality tracing. In the context of a major emergency, we had to erase all the buildings for Butembo and restart the mapping.
Looking more closely at changeset 144106801, it seems that this was a contribution from a local contributor using AI-generated data.
Missing Maps TM-Project-15737 from Nov. 14 was in the eastern part of Homa Bay, and this Overpass query covers the area from Nov. 14 (22,772 buildings).
I have to say that data quality has improved compared to a few years ago, but some mappers still trace non-orthogonal buildings or misalign them relative to the base imagery. The JOSM validation function returns the following for these buildings:
- section of a way repeated twice: 5 buildings
- area not closed: 5 ways
- duplicate: 1 way
- similar name: 40 (name=edificio)
- no zone style for a multipolygon: 1 building traced with 2 ways grouped in a relation
Within this extraction there are 66 buildings that overlap, and there are also many non-orthogonal buildings. Note, however, that the JOSM validation function did not report these non-orthogonal buildings. Changeset 144564707, mapped three days ago, shows such quality mapping problems. Have the JOSM rules been modified so that they no longer catch overlaps and non-orthogonal buildings? Or do I need to set specific parameters in JOSM? In JOSM Preferences, Data validator panel, all options are already checked.
I tested this by creating a new OSM account and mapping buildings in Somalia until I hit the limit. I mapped 52 buildings in iD before the limit was hit. This is one of the changesets that went through before the block came in. You can see here all of the buildings that comprise this limit-exceeding activity (shown in iD against black for clarity):
I chose an area with circular buildings to demonstrate how the limit can be reached quickly when features require more nodes. But please note that this is not exceptional in humanitarian/development contexts: many of the areas that MSF focuses on include plenty of circular buildings.
8 nodes are too coarse an approximation for a circle, as 45° angles are not small enough for a heuristic to safely assume that they are supposed to represent a curve.
I don’t think the response to overly strict API limits should be to decrease mapping quality. In any case, it would only allow about twice as many buildings to be mapped before you’d run into the limit anyway.
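To put rough numbers on both points: for a regular n-gon inscribed in a circle of radius r, the maximum radial deviation from the true circle (the sagitta of each chord) is r·(1 − cos(π/n)), and under a fixed node budget the number of buildings you can map scales inversely with nodes per building. The node budget below is a hypothetical figure used only to show the ratio, not the actual API limit:

```python
import math

def max_deviation(n: int, r: float = 1.0) -> float:
    """Maximum radial deviation of a regular n-gon inscribed in a circle
    of radius r: the sagitta r * (1 - cos(pi / n))."""
    return r * (1.0 - math.cos(math.pi / n))

# An 8-node "circle" deviates by about 7.6% of the radius from the true
# circle; a 16-node one by only about 1.9%.
print(round(max_deviation(8), 4))   # 0.0761
print(round(max_deviation(16), 4))  # 0.0192

# Assuming a hypothetical fixed node budget, halving the nodes per
# building only doubles the buildings mapped before hitting the limit,
# which matches the "about twice as many" estimate above.
budget = 800                        # hypothetical node budget
print(budget // 16, budget // 8)    # 50 100
```

So coarsening circles from 16 nodes to 8 roughly quadruples the geometric error while only doubling throughput, which is a poor trade.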