How about limiting new accounts?

Well it sort-of was - someone scripted a sign-up procedure that was designed to be gone through interactively.

The technical mitigation (to slow them down and require more interaction at their end) does seem to be working against this actor so far, but it doesn’t prevent some “clever” person from automating the “sign up to OSM” step in their app so that some other user-contributed data is added to OSM without the contributor ever having read our contributor terms etc.

In the commercial arena there are many of these (just based on my own experience). The range of information that a new user needs to provide to prove who they are varies hugely - compare signing up for a new bank account and for some new random website.

The bit we need to do as a community is to figure out how much “finding out about the person enrolling” we’re happy to have. Requiring e.g. government-issued ID would put many people off from contributing (and to be clear, there have been no major calls for anything like that). People in these forums and elsewhere have said “we must have some sort of effective CAPTCHA” and others have said “we absolutely must not”. There have been calls from within these forums to “turn off all new user registrations”, “lock down geographical area” or “limit what new users can do”. Any of these is in some sense technically possible, but most OSMers would not find some of these suggestions at all desirable.

Most of the “attacks” on OSM that I’ve seen look like the work of one person, or perhaps a very small group.

8 Likes

The bottleneck is not at the stage of deciding “what should be hardened/changed” but at the stage of implementing it (which is something that OSMF should solve, and is trying to).

To take an obvious example: for now, not everything is even properly rate-limited, which is absolutely basic low-hanging fruit and an obvious task.
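For readers unfamiliar with the term, a token bucket is one common way to rate-limit an endpoint. The sketch below is only an illustration of the general technique, not a description of what the OSM API or website actually does; the class name, parameters, and numbers are all assumptions made up for the example.

```python
class TokenBucket:
    """Hypothetical per-user limiter: bursts up to `capacity`, refilled over time."""

    def __init__(self, capacity: float, refill_per_second: float, now: float = 0.0):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = capacity  # a fresh bucket starts full
        self.last = now         # timestamp of the previous call, in seconds

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Add the tokens earned since the last call, never exceeding capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last = now
        # Spend tokens if enough are available; otherwise refuse the request.
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Timestamps are passed in explicitly so the behaviour is easy to reason about: a bucket with capacity 3 lets three requests through at once, refuses a fourth, and accepts one more after a second has passed at a refill rate of one token per second.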

To find things like “Prevent blocked users from signing up with the "same email address" again and again” (Issue #4206 in openstreetmap/openstreetmap-website on GitHub), we do not need experts in cybersecurity (or, if you count experienced DWG members as cybersecurity experts, we have them already).

(all else being equal, such a cybersecurity person or people would be nice to have, but there is an opportunity cost here)

(again, that is my personal opinion, not consulted with anyone else on the OSMF board)

2 Likes

Er, I don’t think we really have a community consensus on “what should be hardened/changed”, do we? A bunch of people (including me) have made suggestions but they need discussion - see for example Tom’s comments on issue 4018 that I raised.

Edit: I previously linked to the wrong issue here - sorry about that!

3 Likes

This comment sequence looks to me like the initial closure was based on misreading what was proposed? And reopening happened once it was noticed? (not entirely sure, but I did similar things a few times)

Sorry - I linked to the wrong issue above.

I think you may be confusing content policy and user moderation with countervandalism. The social hierarchy arose to resolve legitimate editorial disputes and address the sort of behavioral issues that any online community encounters. We have these social structures too, only more implicit and ad hoc.

On the other hand, Wikipedia primarily employs technical measures against vandalism. You may be familiar with page protection or semi-protection (against new and dormant users). It’s easy to see why this mechanism would be incompatible with OSM at the data model level. But there are also a number of other layers of protection, including:

  • All manner of API rate limits
  • CAPTCHAs, though they are a mess
  • Personal watchlists for closely tracking changes to specific content
  • Built-in tools to visualize edits and revert them, available to any user
  • Banning IP addresses and ranges
  • Abuse filters (akin to the DWG configuring the OSM API to automatically reject any changeset that matches an Overpass query, and optionally automatically ban the user too)
  • ML-based revision scoring (akin to OSMCha labels)
  • A legion of supervised bots run by the community, some of which are empowered to revert on sight
  • A concerted effort to detect and block open proxies (trading off the ability to edit via Tor for protection against botnets behind VPN)
  • CheckUser (akin to the DWG hunting for patterns in users’ IP addresses etc. upon suspicion of sockpuppetry)
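Of the measures listed above, banning IP addresses and ranges is perhaps the simplest to illustrate. The following is a minimal sketch using Python's standard `ipaddress` module; the `BANNED_RANGES` list and the `is_banned` helper are hypothetical names invented here, and the ranges are reserved documentation prefixes rather than real bans.

```python
import ipaddress

# Hypothetical ban list; these prefixes are reserved for documentation use.
BANNED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_banned(address: str) -> bool:
    """Return True if the address falls inside any banned IPv4 or IPv6 range."""
    ip = ipaddress.ip_address(address)
    # Only compare against networks of the same IP version.
    return any(ip.version == net.version and ip in net for net in BANNED_RANGES)
```

A range ban is blunt: everyone sharing the range is affected, which is the trade-off the open-proxy item in the list alludes to.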

I think it would be hard to argue that any of these technical measures has hampered Wikipedia’s editor recruitment and retention, except maybe the overuse of reverting or the ban on open proxies. For perspective, Wikipedia still welcomes anonymous IP edits, a practice that OSM banned in 2009.

I don’t harbor any illusion that implementing any of these measures would be easy or even feasible in this particular moment. But I’m pretty sure we’ll eventually end up with something resembling them. Wikipedia went many years with a lightweight trust model like OSM’s, but eventually there came a time when the burden on bystanders began to justify something slightly more draconian.

15 Likes

My perception when Richard said “this isn’t a technical vulnerability” was that this is not a purely technical issue, the kind that you can easily specify a solution to and point a technical team at. It is more a mixture of technical and policy issues, that need to be sorted out along the way.

The distinction between content policy and countervandalism is certainly a useful basis. But when I think about the number of vandalism accusations we see on this forum, I feel that the borderline may be difficult to draw sometimes.

4 Likes

I think it’s fundamental for a project like OSM to be in constant contact with other similar open source ones, like Wikipedia, to make sure learnings are shared. OSM is not the first project to encounter these problems and others have already solved most of the issues around “automated vandalism to our open source dataset”.

I don’t know if there is already a formal relationship between the OSMF and the Wikimedia Foundation that would enable these conversations to happen.

1 Like

You’re right that policy is important. Most of Wikipedia’s technical countervandalism measures arose in order to implement or automate a preexisting policy. Thanks to this context, the measures are always designed to keep a human in the loop. We could do a much better job of documenting clear expectations and procedures, but that entails hard work to drive a consensus on many matters.

As an administrator on the OSM Wiki, I benefit from many of MediaWiki’s technical measures that I described earlier, yet every decision about blocking a user or writing a new abuse filter is fraught with uncertainty: as I take what seems to be the best option, am I on the same page as the community? Will someone come out of the woodwork to complain that I didn’t follow their expectations, and will I be trapped in the absence of a clear written policy?

Psst, OSM operates through backchannels more than formal relationships. :wink: The parallels between OSMCha and Wikimedia’s revscoring service aren’t entirely a coincidence. Some early ad hoc conversations between the WMF’s data science team and Mapbox’s OSM tooling team influenced both services, but that was a long time ago for both organizations.

This could be a good opportunity for mappers to work their connections within the Wikimedia movement in order to better understand the impact of the ongoing vandalism attempts and get our challenges on the radar of Wikimedia’s affiliate groups, if not the WMF itself. Wikimedia Maps is embedded in countless Wikipedia articles; their tile refresh rate leaves them vulnerable to vandalism that sticks around for much longer than on osm.org.

4 Likes

We definitely don’t have agreement on all issues, but there are at least some where there is general agreement that it should be done, and the missing step is someone writing the code (because that is the bottleneck right now).

I mostly agree, and recognize you understand both data models in far more depth than I do. I do, however, think that plenty of methods of content protection could be added to OSM if there was will to do so, though it’d be hard.

I think this gets at what I was suggesting about a working group for this. I don’t see community consensus on this without a group driving at it. The DWG/OWG are doing great work dealing with the immediate tasks at hand, but I don’t want to suggest that anyone there signed up for driving consensus on such a wide and interconnected array of thorny issues. I think this thread demonstrates that we need to advance quickly as a group. I think something formal that makes recommendations and keeps this issue in the community spotlight could help, with implementation then going back to the relevant parts of the existing community like OWG, DWG, editor developers, etc.

2 Likes

I’m glad to hear that the OSMF has a secret plan to fight cyber attacks and is just waiting for someone to implement it…

However, I think the actual reality is that a handful of folks have tossed out some ideas to make it harder for someone to execute this specific attack in the future and we are patting ourselves on the back that we have it all figured out. My point remains that there are no cyber security experts on the team, only people that have IT/programming/networking/database/cartography skills who have only a casual/adjacent understanding of cyber defense that’s increasingly inadequate in a modern cyber environment.

You are 1/5 of the OSMF board, and if you think our cyber protection is adequate, I am deeply concerned about our project’s ability to weather future cyber attacks. I think you are dismissing legitimate concerns about cyber vulnerability and overconfident in your own ability to set cyber policy without subject-matter experts. Please don’t confuse a degree in computer science with one in cyber security. I am additionally concerned by your comment that our cyber security posture has not been discussed by the OSMF board and I hope this incident at least serves to prompt such discussions.

Consensus may be appropriate for subjective decisions like what color to render golf courses. It is wholly inappropriate for determining what level of cyber defense is appropriate for the project. That is primarily a policy decision that should ultimately be set by the board, who ought to be responsible for making decisions about the tradeoffs between security and the convenience and accessibility of the user base. Those trade-offs should be informed by a threat and vulnerability analysis, an understanding of current industry best practices in data security, and an understanding of our ability to detect and counter attacks when they occur.

If you are waiting for a consensus from the broader community on key decisions about securing our data and infrastructure, then in my opinion you aren’t doing your most important job as a board member.

1 Like

Sure, though the overuse of reverting is a big issue IMX and one of the reasons I don’t really contribute to Wikipedia - your mileage may of course vary.

Going back to the comment I was actually replying to, I don’t believe there is a “critical gap in expertise” in OSM which makes us unable to implement technical measures like this. There have been organisational issues which have made the response in this case slower than it could have been, and I guess that’s where I’d like to see some brainpower applied. In terms of lessons learned, I have high confidence that EWG/sysadmins will consider what changes need to be made to the site code, but perhaps a little less that OSMF will develop a response protocol in time for the next incident.

3 Likes

Given that there are 7 people on the OSMF Board, I wonder which ²⁄₃₅th of another board member you think Mateusz has eaten. :rofl: I know some board members are more vocal than others, but that doesn’t mean they don’t exist!

1 Like

Well, I stand fractionally corrected then :crazy_face:

I would like to disagree, either in whole or in the interpretation of the phrases “cyber” and “security”.

Moderation (anti-vandalism) has little to do with “cyber security”. Anti-vandalism tools are not really security products, and they do not need anyone on board. In theory I could write a simple bot in a few days to process changes, run a filtering engine/analysis on them and alert someone, or even trigger a kind of automagic revert. But first, I do not have the time handy at the moment, and second, it would really be much better to organise it instead of people individually solving it in possibly conflicting ways.
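To make the shape of such a bot concrete, here is a rough sketch of the “process changes, filter, alert someone” loop. Everything in it is an assumption for illustration: the `Change` record, the two rules, and their thresholds are invented, and a real bot would read a live feed of changes rather than an in-memory list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Change:
    """Hypothetical per-changeset summary the bot would work from."""
    changeset_id: int
    user: str
    created: int
    modified: int
    deleted: int


# Hypothetical filter rules: a name plus a predicate over one change.
RULES: Dict[str, Callable[[Change], bool]] = {
    "mostly deletions": lambda c: c.deleted > 100
    and c.deleted > 10 * (c.created + c.modified),
    "very large changeset": lambda c: c.created + c.modified + c.deleted > 5000,
}


def review_queue(
    changes: Iterable[Change],
    rules: Dict[str, Callable[[Change], bool]] = RULES,
) -> Iterator[Tuple[Change, List[str]]]:
    """Yield each change that trips at least one rule, with the rule names."""
    for change in changes:
        reasons = [name for name, rule in rules.items() if rule(change)]
        if reasons:
            # Alert only; any revert decision is left to a human reviewer.
            yield change, reasons
```

The sketch deliberately stops at alerting. Whether anything should be reverted automatically is exactly the kind of decision that would need coordination rather than individual bots.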

I would even guess that the current tools provide adequate output to handle that, even more so if someone explained to volunteer editors in detail how to coordinate their efforts. I am not very active here because I do not really come across these kinds of abuse.

In general, yes, but in a project as visible as OSM, abuse can run along a spectrum from casual graffiti by bulbasaur gardeners all the way up to massive, sustained attacks by well-resourced organizations. Even if that isn’t quite what we’re dealing with right now, we should consider it a wake-up call: OSM’s well-intentioned vulnerabilities make us an easy target during real-world conflicts. Not everyone needs to concern themselves with such weighty issues, but someone should have a security hat on.

4 Likes

Does Wikipedia ever publicly name and shame or press legal action?

Wikipedia’s editor community has previously name-and-shamed organizations for trying to use the site as a public relations channel – most notably, various members of the U.S. Congress. But that only had an impact because those legislators had a reputation to defend (and embarrassingly didn’t always know what their staffers were up to). Vandalism is a different beast with different incentives. Even if the OSMF could identify a specific malicious actor, it would only have a legal recourse in some jurisdictions, and exercising that recourse wouldn’t be without cost. So at best, this would be a complementary approach alongside technical defenses and mitigations.

1 Like

We are getting a lot of traffic and interest in this topic because it has clearly surpassed the level of moderation, and the number of people actively working against the one bot shows this is an attack.

The term cyber-security is apt and appropriate to mention here: securing the login system, and indeed the actual data all our volunteers put into OSM, is something that could very much be improved upon.

I’m very happy to see small steps being taken, the rate-limiting features added by Tom are great and make total sense. I doubt you’ll find anyone going against that. Well, other than those that are doing the vandalism anyway :slight_smile:

This post (and I’m the post-starter) was indeed aimed at the foundation and, I guess, the board as the decision-making people there. The goal is to get the ball rolling on essentially the same thing we’ve seen now: rate-limiting, at a somewhat more advanced level than we’ve seen so far.

A new account would have editing rights that match the account age. A zero-day mapper will do a LOT less than one who has logged a lot of mapping days. This is a natural form of rate-limiting that nobody would actually complain about, as the limits would likely never be set so low that zero-day mappers hit them.
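As a rough illustration of what “editing rights that match the account age” could mean in code, the sketch below grows a daily edit allowance with the number of days an account has mapped. The function names and every number in it (base allowance, growth per day, ceiling) are placeholders invented for the example, not proposed values.

```python
def daily_edit_limit(
    mapping_days: int,
    base: int = 1000,
    per_day: int = 500,
    ceiling: int = 100_000,
) -> int:
    """Hypothetical allowance: starts at `base`, grows per mapping day, capped."""
    return min(ceiling, base + per_day * max(0, mapping_days))


def may_upload(mapping_days: int, edits_today: int, new_edits: int) -> bool:
    """True if the upload keeps the account within today's allowance."""
    return edits_today + new_edits <= daily_edit_limit(mapping_days)
```

With these placeholder numbers a brand-new account could make 1,000 edits on its first day, while an account with ten mapping days could make 6,000. The point of the design is that the starting allowance sits well above what a genuine beginner would plausibly do by hand.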

But I’m not entirely sure if the board and the foundation have even considered this approach. The actual replies on this topic here are positive about the idea. But are we ever going to see this? Or is this idea dismissed as not having majority consensus?

So, apart from this needing a coder, what are the chances of “limiting new accounts” actually being implemented?