How about limiting new accounts?

Many social platforms share this problem; the Discourse platform is a prime example. I tried to shed some light on this in my topic Earning trust as Newbie. Unfortunately, OpenStreetMap is missing even the basics of something like that, never mind the permissions (ACLs?) that would follow.

There’s a valuable concept that every negative event must be turned into a lesson, so that this or a similar event would be less likely to happen in the future. And here’s something we need to realize and remember well:

Before this happened, for years, various people were saying exactly this about smaller-scale issues (e.g., the infamous Pokemon Go cheaters adding fake features). And there were other people willing to flip the situation upside down and say that such issues aren’t “burning through the good will”, that everyone should just be super-polite and patiently remove the rubbish objects, hoping that whoever added them would, all of a sudden, become a valuable mapper. And they were calling those who opposed this view all kinds of names.

It also reminds me (and this analogy is not accidental) of how some European governments finally realized that not all governments (and people) want a peaceful life, so they are now reopening rearmament programs and spending millions on defense. While just “yesterday” they blamed their somewhat more skeptical and less idealistic neighbors for imaginary “xenophobia”, unwillingness to dissolve borders, etc.

Malicious intent exists. And the least wise thing that can be done about it is to pretend it doesn’t and effectively put the burden of dealing with it on regular people. That applies to open-source projects, international politics, local government efforts against crime - anything of that sort.

6 Likes

Idea: block account creation for 2 weeks. That may make the vandal get bored and go away.

The biggest issue is that it’s a multi-dimensional supervised pattern-detection problem. And even then, if undesirable changes have a good mix of commonality and entropy, detection can completely miss them. But if the objective isn’t a proactive detection method, building a discovery method for finding more instances of the same action is possible. Say, in this particular case, harmful edits had 0% geometry changes and 100% name=* changes; 100% of the resulting name values were in capital letters, with no substring overlap with the previous value; and user names had an atypically high entropy of mixed lowercase and capital characters.
Would a vandal be able to adapt if this pattern were discovered? Likely, yes. But combined with an effective revert tool that can erase a whole list of changesets completely, the incentive for vandalism would be pretty low.
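
Purely to illustrate the discovery idea (this sketch is my own; the field names, the 3-character overlap test, and the entropy threshold are invented, not anything the OSM API provides):

```python
import math
from collections import Counter

def entropy_bits_per_char(s: str) -> float:
    """Shannon entropy of a string, in bits per character."""
    if not s:
        return 0.0
    counts = Counter(s)
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def matches_known_pattern(old_name: str, new_name: str, username: str) -> bool:
    """Fingerprint of the attack described above: the name tag was
    replaced by an all-caps value sharing no substring with the old
    one, by a user whose name mixes characters with high entropy."""
    if not new_name.isupper():
        return False
    # Crude "no substring overlap": no common run of 3+ characters.
    shares_substring = any(
        old_name[i:i + 3].lower() in new_name.lower()
        for i in range(max(len(old_name) - 2, 0))
    )
    return not shares_substring and entropy_bits_per_char(username) > 3.5
```

Run over recent edits, a rule like this can only surface more instances of an already-known pattern - which is exactly the discovery/detection distinction above - it says nothing about the next attack.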

1 Like

Add rate limiting for changeset comments by tomhughes · Pull Request #4202 · openstreetmap/openstreetmap-website · GitHub was also merged

new accounts cannot make 10 000 changeset comments within an hour anymore

this makes it harder to spam people from new accounts and should reduce other bad consequences
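
(For the curious: the merged change is Rails code which I won’t reproduce here, but the general sliding-window idea looks roughly like this - the limits below are invented for illustration:)

```python
import time
from collections import defaultdict, deque

MAX_COMMENTS = 60        # invented numbers; the real limits live in
WINDOW_SECONDS = 3600    # the openstreetmap-website codebase

recent: dict[int, deque] = defaultdict(deque)  # user id -> comment timestamps

def allow_changeset_comment(user_id: int) -> bool:
    """Sliding-window rate limit: allow the comment only if the user
    has made fewer than MAX_COMMENTS in the last WINDOW_SECONDS."""
    now = time.time()
    window = recent[user_id]
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()  # drop timestamps that fell out of the window
    if len(window) >= MAX_COMMENTS:
        return False
    window.append(now)
    return True
```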

2 Likes

Sorry, but that’s not good enough.

The board needs to recognize three things:

  1. This cyber attack has exposed vulnerabilities in our cyber security posture
  2. We will continue to see cyber attacks over time because our data is valuable and many competing groups care very deeply about what goes in the map for political and commercial reasons
  3. We don’t have the expertise we need to fix the problem

As long as we are reacting to specific attacks, we will continue to see a pattern of new attackers and delayed responses while we scramble to make a fix. In the meantime, people like @SomeoneElse and his colleagues on the DWG will continue to be stuck fighting wildfires with garden hoses.

As long as we have a lot of smart people with IT, programming, and digital cartography expertise, and no smart people with expertise in cyber security, this problem will persist.

Please consider putting out a call to action to get the right people on the team, or hiring someone to fill this critical gap in expertise.

4 Likes

I’m not convinced this is a classic “cyber security” problem or one where a traditional “cyber security professional” would be the right approach. This isn’t a technical vulnerability, though there may be technical solutions/ameliorations; it’s largely a social contract issue. There are comparatively few best practice exemplars for a massive public contribution project like OSM, and for the one that does exist (Wikipedia), its response (a strong user hierarchy) has been documented over many years to have problematic effects on attracting and retaining new contributors.

That’s not to say we can’t learn from elsewhere, but I suspect we’re largely on our own figuring this one out.

11 Likes

Every cyber attack will have a person behind it. In this case though, a lot of these edits and accounts seem like bots; it’s possible that just one person is behind all of this. So I would count this as a pretty regular cyber attack that needs a pretty regular approach.

1 Like

Well it sort-of was - someone scripted a sign-up procedure that was designed to be gone through interactively.

The technical mitigation (to slow them down and require more interaction at their end) does seem to be working against this actor so far, but it doesn’t prevent some “clever” person from automating the “sign up to OSM” step in their app so that some other user-contributed data is added to OSM without the contributor ever having read our contributor terms etc.

In the commercial arena there are many examples of this (just based on my own experience). The range of information that a new user needs to provide to prove who they are varies hugely - compare signing up for a new bank account with signing up for some random new website.

The bit we need to do as a community is figure out how much “finding out about the person enrolling” we’re happy to have. Requiring e.g. government-issued ID would put many people off contributing (and to be clear, there have been no major calls for anything like that). People in these forums and elsewhere have said “we must have some sort of effective CAPTCHA” and others have said “we absolutely must not”. There have been calls from within these forums to “turn off all new user registrations”, “lock down geographical area” or “limit what new users can do”. Any of these is in some sense technically possible, but some of these suggestions are not at all desirable to most OSMers.

Most of the “attacks” on OSM that I’ve seen look like the work of one person, or perhaps a very small group.

8 Likes

The bottleneck is not at the “what should be hardened/changed” stage but at the stage of implementing it (which is something that the OSMF should solve, and is trying to).

For now, not everything is even properly rate-limited - which is absolutely basic low-hanging fruit and an obvious task to take on, to give an obvious example.

To find things like Prevent blocked users from signing up with the "same email address" again and again · Issue #4206 · openstreetmap/openstreetmap-website · GitHub, we do not need experts in cybersecurity (or, if you count experienced DWG members as cybersecurity experts, we have them already).
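
(To illustrate why that particular issue needs no special expertise - this is just one possible approach sketched in Python, not how the linked issue was or will actually be implemented:)

```python
def normalize_email(address: str) -> str:
    """Collapse trivial aliasing tricks so "the same email address"
    compares equal: lowercase everything, strip any +suffix from the
    local part, and drop dots where the provider ignores them."""
    local, _, domain = address.strip().lower().partition("@")
    local = local.split("+", 1)[0]
    if domain in {"gmail.com", "googlemail.com"}:  # Gmail ignores dots
        local = local.replace(".", "")
    return f"{local}@{domain}"

def signup_allowed(address: str, blocked: set[str]) -> bool:
    """Reject sign-ups whose normalized address matches a blocked one
    (the blocked set is assumed to hold normalized addresses)."""
    return normalize_email(address) not in blocked

# e.g. normalize_email("Some.One+osm2@gmail.com") == "someone@gmail.com"
```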

(all else being equal, such a cybersecurity person or people would be nice to have, but there are opportunity costs here)

(again, this is a personal opinion, not consulted with anyone else on the OSMF board)

2 Likes

Er, I don’t think we really have a community consensus on “what should be hardened/changed”, do we? A bunch of people (including me) have made suggestions but they need discussion - see for example Tom’s comments on issue 4018 that I raised.

Edit: I previously linked to the wrong issue here - sorry about that!

3 Likes

This comment sequence looks to me like the initial closure was based on a misreading of what was proposed, and reopening happened once that was noticed? (Not entirely sure, but I have done similar things a few times.)

Sorry - I linked to the wrong issue above.

I think you may be confusing content policy and user moderation with countervandalism. The social hierarchy arose to resolve legitimate editorial disputes and address the sort of behavioral issues that any online community encounters. We have these social structures too, only more implicit and ad hoc.

On the other hand, Wikipedia primarily employs technical measures against vandalism. You may be familiar with page protection or semi-protection (against new and dormant users). It’s easy to see why this mechanism would be incompatible with OSM at the data model level. But there are also a number of other layers of protection, including:

  • All manner of API rate limits
  • CAPTCHAs, though they are a mess
  • Personal watchlists for closely tracking changes to specific content
  • Built-in tools to visualize edits and revert them, available to any user
  • Banning IP addresses and ranges
  • Abuse filters (akin to the DWG configuring the OSM API to automatically reject any changeset that matches an Overpass query, and optionally automatically ban the user too; see the sketch after this list)
  • ML-based revision scoring (akin to OSMCha labels)
  • A legion of supervised bots run by the community, some of which are empowered to revert on sight
  • A concerted effort to detect and block open proxies (trading off the ability to edit via Tor for protection against botnets behind VPN)
  • CheckUser (akin to the DWG hunting for patterns in users’ IP addresses etc. upon suspicion of sockpuppetry)
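
Of these, the abuse filter is the one that maps most directly onto OSM’s data flow, so here is a toy sketch of what such a rule engine could look like (entirely hypothetical: no such hook exists in the OSM API today, and the example rules and changeset fields are invented):

```python
from typing import Callable

# A filter is a name, a predicate over a simplified changeset dict,
# and the action to take on a match - loosely modeled on MediaWiki's
# AbuseFilter extension.
Filter = tuple[str, Callable[[dict], bool], str]

FILTERS: list[Filter] = [
    ("new user renames feature to ALL CAPS",
     lambda cs: cs["user_age_days"] < 7
                and cs["tags"].get("name", "").isupper(),
     "reject"),
    ("mass deletion by new user",
     lambda cs: cs["user_age_days"] < 7 and cs["deleted_elements"] > 500,
     "warn"),
]

def evaluate(changeset: dict) -> str:
    """Return the most severe action any filter requests; 'warn'
    results still reach a human reviewer rather than being silently
    dropped, keeping a person in the loop."""
    severity = {"allow": 0, "warn": 1, "reject": 2}
    action = "allow"
    for _name, predicate, filter_action in FILTERS:
        if predicate(changeset) and severity[filter_action] > severity[action]:
            action = filter_action
    return action
```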

I think it would be hard to argue that any of these technical measures has hampered Wikipedia’s editor recruitment and retention, except maybe the overuse of reverting or the ban on open proxies. For perspective, Wikipedia still welcomes anonymous IP edits, a practice that OSM banned in 2009.

I don’t harbor any illusion that implementing any of these measures would be easy or even feasible in this particular moment. But I’m pretty sure we’ll eventually end up with something resembling them. Wikipedia went many years with a lightweight trust model like OSM’s, but eventually there came a time when the burden on bystanders began to justify something slightly more draconian.

15 Likes

My perception when Richard said “this isn’t a technical vulnerability” was that this is not a purely technical issue - the kind you can easily specify a solution to and point a technical team at. It is more a mixture of technical and policy issues that need to be sorted out along the way.

The distinction between content policy and countervandalism is certainly a useful basis. But when I think about the number of vandalism accusations we see on this forum, I feel that the borderline may be difficult to draw sometimes.

4 Likes

I think it’s fundamental for a project like OSM to be in constant contact with other similar open source ones, like Wikipedia, to make sure learnings are shared. OSM is not the first project to encounter these problems and others have already solved most of the issues around “automated vandalism to our open source dataset”.

I don’t know if there is already a formal OSMF-Wikimedia Foundation relationship that would enable these conversations to happen.

1 Like

You’re right that policy is important. Most of Wikipedia’s technical countervandalism measures arose in order to implement or automate a preexisting policy. Thanks to this context, the measures are always designed to keep a human in the loop. We could do a much better job of documenting clear expectations and procedures, but that entails hard work to drive a consensus on many matters.

As an administrator on the OSM Wiki, I benefit from many of MediaWiki’s technical measures that I described earlier, yet every decision about blocking a user or writing a new abuse filter is fraught with uncertainty: as I take what seems to be the best option, am I on the same page as the community? Will someone come out of the woodwork to complain that I didn’t follow their expectations, and will I be trapped in the absence of a clear written policy?

Psst, OSM operates through backchannels more than formal relationships. :wink: The parallels between OSMCha and Wikimedia’s revscoring service aren’t entirely a coincidence. Some early ad hoc conversations between the WMF’s data science team and Mapbox’s OSM tooling team influenced both services, but that was a long time ago for both organizations.

This could be a good opportunity for mappers to work their connections within the Wikimedia movement in order to better understand the impact of the ongoing vandalism attempts and get our challenges on the radar of Wikimedia’s affiliate groups, if not the WMF itself. Wikimedia Maps is embedded in countless Wikipedia articles; their tile refresh rate leaves them vulnerable to vandalism that sticks around for much longer than on osm.org.

4 Likes

We definitely don’t have agreement on all issues, but there are at least some where there is general agreement that it should be done - and the missing step is someone writing the code (because that is the bottleneck right now)

I mostly agree, and recognize you understand both data models in far more depth than I do. I do, however, think that plenty of content-protection methods could be added to OSM if there were the will to do so, though it’d be hard.

I think this gets at what I was suggesting about a working group for this. I don’t see community consensus on this without a group driving at it. The DWG/OWG are doing great work dealing with the immediate tasks at hand, but I don’t want to suggest that anyone there signed up for driving consensus on such a wide and interconnected array of thorny issues. I think this thread demonstrates that we need to advance quickly as a group. I think something formal that makes recommendations and keeps this issue in the community spotlight could help, with implementation then going back to the relevant parts of the existing community like the OWG, DWG, editor developers, etc.

2 Likes

I’m glad to hear that the OSMF has a secret plan to fight cyber attacks and is just waiting for someone to implement it…

However, I think the actual reality is that a handful of folks have tossed out some ideas to make it harder for someone to execute this specific attack in the future, and we are patting ourselves on the back that we have it all figured out. My point remains that there are no cyber security experts on the team - only people with IT/programming/networking/database/cartography skills and a casual, adjacent understanding of cyber defense that is increasingly inadequate in a modern cyber environment.

You are 1/5 of the OSMF board, and if you think our cyber protection is adequate, I am deeply concerned about our project’s ability to weather future cyber attacks. I think you are dismissing legitimate concerns about cyber vulnerability and overconfident in your own ability to set cyber policy without subject-matter experts. Please don’t confuse a degree in computer science with one in cyber security. I am additionally concerned by your comment that our cyber security posture has not been discussed by the OSMF board and I hope this incident at least serves to prompt such discussions.

Consensus may be appropriate for subjective decisions like what color to render golf courses. It is wholly inappropriate for determining what level of cyber defense is appropriate for the project. That is primarily a policy decision that should ultimately be set by the board, which ought to be responsible for making decisions about the trade-offs between security and the convenience and accessibility of the user base. Those trade-offs should be informed by a threat and vulnerability analysis, an understanding of current industry best practices in data security, and an understanding of our ability to detect and counter attacks when they occur.

If you are waiting for a consensus from the broader community on key decisions about securing our data and infrastructure, then in my opinion you aren’t doing your most important job as a board member.

1 Like