Proposal: Add reactions/votes to changesets

I don’t see a way to make this work well, since we wouldn’t be able to automatically tell what the comments say for purposes such as rate limiting (sure, we could do some automated sentiment analysis, but that sounds like it’d just invite all sorts of problems). The first comment might have been “Awesome, this really improves the map!”, in which case the original author not replying would in no way imply that something is wrong; or the original author could respond to someone raising valid issues with “Hahaha, I’m not gonna do that”.

So if I understand it correctly you’re afraid people will write meaningless discussion comments just to be able to mark something as bad? I think that’s gonna be hard to solve, but I also don’t really think that this would be a notable issue.

While a feature such as the one proposed does bring some elements from social networks into OSM, we are still far from being (and neither will nor should become) a social network proper. On e.g. Facebook it’s common for people to dislike something and then just write some non-constructive comment (such as just another emoji), but that’s in a setting where people quickly scroll on, and few take more than a few seconds to consider the original post. I believe that the dynamic in OSM is/will be quite different; the vast majority of people who look at a changeset and who’d make use of a feature such as this actually want the map to improve, and as such should be much more inclined to leave a constructive comment together with their review/reaction, compared to someone just scrolling quickly past something on Facebook.

Of course, there will be some number of people who’d have more of a knee-jerk reaction (for example the kind of people who are opposed to anyone else mapping in what they perceive as “their” area), but I think all in all that number should be insignificant enough not to matter (and if it should become a big problem, a feature such as allowing the DWG to restrict an account from reviewing/reacting could solve the problem).

The original problem statement that spurred this discussion was that OSM lacks a social aspect. There are a couple of different ways to interpret that need for social interaction. One is that we need to formalize or even gamify the process of judging a changeset (and, by extension, its author). Another is that we need to facilitate authentic, constructive interactions between mappers. I disagree that a rating system producing user metrics could possibly meet the need for meaningful human interaction on osm.org.

The intended audience for ratings would be reviewers and data consumers rather than the mapper’s peers per se. Mappers are savvy – they know when they’re being judged. If the rating system isn’t well-designed to handle a wide variety of circumstances, it will create a chilling effect for some mappers, while others will get defensive over their ratings.

We see this today with the validator-based changeset tags that iD generates. These tags were only ever intended as low-stakes, noisy signals to data consumers, but then How Did You Contribute gamified these warning counts by including them in each mapper’s permanent record. On the one hand, many mappers do try to score points by solving validator warnings. On the other hand, the validator wasn’t designed to be infallible, so oftentimes people map for the validator’s quirks rather than based on sound judgment.

OSMCha’s ratings and labels are far enough out of the way that most well-meaning mappers don’t fuss over their personal reputation there. Still, mappers occasionally see false positive labels and fear that their contributions are being ignored by data consumers. Sometimes people take it out on each other with these ratings. Abuse isn’t hypothetical when it comes to a system that can express negative judgments.

Defensiveness, cheating, and fear. These are the (certainly manageable) downsides of a rating system, the cost of automating accountability. It’s a far cry from the simple act of expressing gratitude. I think it’s great that you’re thinking through the various aspects of preventing rating abuse, but I don’t think overloading the idea with user engagement goals will result in a better-designed feature.

MediaWiki’s Thanks extension coexists with a completely separate revision scoring system that feeds into a machine learning prediction model. The system’s creators have previously expressed an interest in extending their approach to support OSM. I think they’d be the first to acknowledge that countervandalism tools are not a direct solution for improving contributor morale.

3 Likes

The labels don’t have to be literally hashtags, or in the text. I was thinking of wiki voting (not necessarily “approve”) and other templates, or GitHub reviewing (“left a comment”, something for questions and doubts), etc.
Upload comment hashtags should have been “temporary”, and should be retired. Or do you think the hashtags= metadata is still bad?
To reiterate what I said in the opening, I don’t support directly applying OSMCha’s existing style. Its practice of commenting with uppercase hashtags in the discussion is certainly one of its worst designs.

To “rate” a changeset one needs to be able to review it. I mean, how else could an Aunt Tilly mapper judge a changeset except by voting up their buddies?

I don’t buy into gamification needing to be built into OSM - we have tools like Maproulette and HDYC to fulfill this.

Yes - we need better tools for collaboration - emojis on changesets aren’t one of them.

We need better tools for reviewing changesets, we need to incentivize people to create smaller and more self-contained changesets, and we need better tools for communication, e.g. a subscribe button on notes and automatic follow-up after some time on changesets/notes.

Flo

Sure, I’m just saying that judging/rating/reviewing/scoring is fundamentally different from saying welcome/thank you/hello, so it’s two features, not one.

1 Like

Ah, I understand better what you mean now.

To begin with, I don’t think that we should have completely separate functionality (one for “reviews” and one for “thanks”), as that would just be confusing (most people wouldn’t understand the difference between “mark good” and “say thanks”). So unless the two can be combined, we’d have to choose one focus: human interaction (mostly positive) or accountability/quality assurance.

But I think we could still find a way to meaningfully combine these aspects. For example, we could choose to hide negative reviews, either completely (just using them for e.g. rate limiting) or just hide the reviewer’s identity (less risk of retaliation for a bad review).

I don’t think we need to care too much about data consumers in this discussion; unless we get to a point where we systematically review (almost) every change (which would be good for quality, but not realistic for a volunteer-driven project), the information about ratings would likely not be of much use to data consumers. Sure, they could exclude changesets with bad ratings, but that would quickly become a very hard technical problem, and the number of changesets that get reviewed is unlikely to be a large enough proportion to make a big difference anyway.


So let’s think about what a system could look like that both (at least somewhat) satisfies the human interaction aspect and is useful for things such as rate limiting and badges.

The main issue you mention is mappers becoming defensive when receiving a negative review. For that, I think there are two things we can do: do our best to ensure that the negative review is accompanied by constructive feedback, and ensure that the language used (by the UI) is not “inflammatory”. How about we drop “bad” as a possible review, so that we just have “good” and “could improve”, but also add a “damaging to the data” checkbox (when choosing “could improve”)? “Could improve”, together with proper help texts in the UI, should invite the reviewer to specify how it can improve, and the reviewed mapper would only ever see either “good” or “could improve”, while “damaging to the data” would be hidden and just used for e.g. rate limiting (and maybe as a marker for the DWG when investigating the changes done by a user).
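To make the asymmetry concrete, here is a minimal sketch of how such a review could be stored, with the “damaging” flag kept separate from what the mapper sees (names and structure are purely illustrative, not an actual osm.org schema):

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    GOOD = "good"
    COULD_IMPROVE = "could_improve"  # deliberately no "bad" verdict


@dataclass
class ChangesetReview:
    changeset_id: int
    reviewer_id: int
    verdict: Verdict
    comment: str  # constructive feedback, encouraged by the UI
    damaging: bool = False  # only meaningful with COULD_IMPROVE; hidden from
                            # the mapper, used for rate limiting and the DWG

    def visible_verdict(self) -> str:
        """What the reviewed mapper sees: never the hidden 'damaging' flag."""
        return self.verdict.value
```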

This would still leave the issue that a single bad changeset (possibly early in one’s “mapping career”) could damage one’s “permanent record”. Algorithms using this information should probably take the age of the changeset into account (weighing recent events more heavily), but we could also add some mechanism to “get rid of” a negative review. For example, we could notify the original reviewers when a user later creates a changeset with the tag fixes=<original changeset id>, encouraging them to reconsider (and should they not do so, maybe because they have stopped working on OSM, others could review it as well, and as before the reviews would be tallied up). This would require some thought UX-wise (if I review a changeset that has fixes=, and it does fix the issues in the original changeset but also introduces new issues, I need to ensure that I mark the old one as “good” and the new one as “could improve”), but I don’t think it should be impossible to find something that works.
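As an illustration of the weighting idea, one possible (entirely made-up) scheme is exponential decay, so that a bad review early in someone’s mapping career eventually stops mattering:

```python
from datetime import datetime, timezone

HALF_LIFE_DAYS = 365  # assumption: a review loses half its weight per year


def review_weight(reviewed_at: datetime) -> float:
    """Exponential decay: recent reviews count more than old ones."""
    age_days = (datetime.now(timezone.utc) - reviewed_at).total_seconds() / 86400
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def reputation_score(reviews: list[tuple[datetime, bool]]) -> float:
    """reviews is a list of (timestamp, is_positive) pairs. A fixes= follow-up
    that makes the reviewer reconsider would simply flip is_positive."""
    return sum(review_weight(ts) * (1 if positive else -1) for ts, positive in reviews)
```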

You also mentioned the risk of cheating the system. This would mostly be a problem for the algorithm, and the best way to handle it would be to come up with different ways to cheat and ensure that the algorithm cannot be gamed in those ways. For example, one could cheat by only doing very small changesets, which could be countered by counting the number of actual changes in the changesets (as the current rate-limiting code does).
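For instance, a tally feeding into rate limits could weight each positively reviewed changeset by its number of changed elements, so padding one’s record with tiny changesets gains nothing (a hypothetical sketch, not the actual rate-limiting code):

```python
def weighted_positive_changes(changesets: list[dict]) -> int:
    """Sum the changed elements in positively reviewed changesets, rather
    than counting changesets, so many tiny edits don't inflate the score."""
    return sum(
        cs["num_changes"]  # elements created/modified/deleted
        for cs in changesets
        if cs.get("review") == "good"
    )
```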

Do these ideas sound closer to a workable solution? Or do you have some other suggestions on how we could move forward?

Just to be clear, I don’t propose to build gamification into OSM here, just the infrastructure that could be used by others (such as HDYC) to implement this gamification (or at least as a stimulus/input to it).

(though I do like the ideas presented here and would love to see that implemented, but that is off-topic from this thread)

MediaWiki puts these two things right next to each other, and I think everyone knows the difference:

Undo, thank

A revert is the ultimate :-1: reaction. A revert is logged in the system and notifies the original author automatically. Over time, Wikipedia has built up a rich dataset of the kinds of edits that get reverted, based on the metadata that accompanies each revert. This data informed machine-learning models, powering bots that today can revert some bad edits automatically.

With the Undo and Thanks buttons, MediaWiki has an environment where you can casually, effortlessly encourage others, but you really have to mean it when you criticize someone publicly. There’s no room for grandstanding or peer pressure. On the other hand, putting a revert tool in the hands of ordinary users can fuel trigger-happy edit wars. To mitigate abuse, the Undo button and an even more powerful Rollback button are limited to users with a certain amount of experience as contributors. Other than that, these tools have been remarkably stable over the past decade or so, because the social dynamics are so straightforward.

Of course, OSM has tools for reverting, and there are ways to guess which changesets reverted which changesets, but it’s all fuzzy without direct integration into the site.

3 Likes

Hmm, even though it has worked well for MediaWiki, it does seem drastic to me not to have a middle ground between reverting and thanking. However, I think it could work to replace the “damaging to the data” option I came up with above with reverting (as reverting is likely what one would usually do in that case).

This would mean that there’d be three options when reviewing (other than just a plain comment): “good”/“thank”, “could improve” + comment, or revert.

How to integrate reverting would then be an interesting question (especially since there are multiple tools, as well as the manual route), but a first step could be adding “reverted_by”/“reverts” as proper data fields on changesets, instead of just a fuzzy tag. The existing tools could then populate those fields, and the UI could use them to cross-link (and rate limits etc. could be calculated based on them).
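As a sketch of what such first-class fields could enable, a data consumer could then check revert status directly instead of guessing from tags. The reverted_by field below is hypothetical (it does not exist in the current API); the call itself follows the existing changeset read endpoint:

```python
import requests

API = "https://api.openstreetmap.org/api/0.6"


def is_reverted(changeset_id: int) -> bool:
    """Check the proposed (hypothetical) 'reverted_by' field on a changeset."""
    r = requests.get(f"{API}/changeset/{changeset_id}.json", timeout=10)
    r.raise_for_status()
    changeset = r.json().get("changeset", {})  # single-changeset JSON payload
    # Today revert information only exists as fuzzy, tool-specific tags;
    # the proposal would make it a first-class field like this:
    return bool(changeset.get("reverted_by"))
```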

1 Like

Please have a look at the recent-changesets link shared earlier and check how many comments are really positive. They are mainly questions, doubts, and welcomes, i.e. review comments. I really do not see how adding votes can help here, while I do see how it can create a chilling effect.

On that response from the person who committed the changeset: I am pretty sure an overwhelming share of users chose to ignore the comment, that is, not to reply rather than take the effort to write a message like “Hahaha I’m not gonna do that”. In my view it is in practice always a positive sign if the committer does respond.

I think it will be notable, but I agree with you that the big problem is likely not completely meaningless discussion comments; rather, I expect the quality of the average review comment to go down if there is also vote functionality.

Although the wiki-like private thanks might be nice, I don’t think publicly voting on changesets is a good idea.

If a changeset discussion cannot be resolved, then the discussion must first be escalated to the wider community in another venue (e.g. this forum or a mailing list), and then any necessary voting can subsequently take place in that venue.

1 Like

Most (?) of the present revert tools have the option to add an “automatic comment”, e.g. “Reverted for xxxx”, which then appears on the original CS as e.g.

"DWG revert

Changeset: 143554258 | OpenStreetMap"

Very helpful to then be able to look at a CS & see that it’s already been reverted!

Instead of public reactions on changesets, which I think are liable to misuse and misinterpretation, I would prefer a button to privately thank the author for a changeset, like there is on the Wiki.

Just writing “thanks” in a changeset comment does that without much fuss.

2 Likes

You mean, with words? :scream:

:grinning:

1 Like

I hope that such comments do not require a minimum character count as in this forum.

Sure, the majority of cases aren’t just a single positive comment, but as pointed out in another response, this is actually the only way to express positivity right now.

Also, even if 99% of all comments were of the negative sort, the remaining potentially positive ones would still be far too many to do any simple calculations on comments alone (for example to calculate rate limits). Otherwise, we’d end up with people refraining from positive (including welcoming) comments, as those would potentially degrade the “status” of the mapper, since they’d all be counted as negative unless the mapper responds.


So, based on the responses so far, let’s try another approach:

We add two ways to “review” a changeset: thanks and actual reviews

Thanks would be either private (only visible to the thanker and the changeset author) or public (visible to all). Personally, I’d prefer them to be public (at least the number of thanks on a changeset); I think it would overall give a more positive vibe to the community and be more in line with our general “openness”.

Reviews would be modeled similarly to PR reviews on GitHub or similar platforms. There can be multiple reviews per changeset; each review would consist of the following (a rough data-model sketch follows the list):

  • A textual description (usually describing the issue)
  • Who created the review
  • Optionally a reference to a changeset that fixes the issues
  • A state (open, resolved, rejected, closed, stale)
  • Further comments
  • (maybe in the future: Optionally the IDs of specific elements in the changeset that the review refers to)
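As a rough sketch, the two objects could look like this as data types (field names are illustrative assumptions, not an agreed schema):

```python
from dataclasses import dataclass, field
from enum import Enum


class ReviewState(Enum):
    OPEN = "open"
    RESOLVED = "resolved"
    REJECTED = "rejected"
    CLOSED = "closed"
    STALE = "stale"


@dataclass
class Thanks:
    changeset_id: int
    sender_id: int
    public: bool = True  # or visible only to the thanker and the author


@dataclass
class Review:
    changeset_id: int
    reviewer_id: int
    description: str  # usually describes the issue
    state: ReviewState = ReviewState.OPEN
    fixed_by: int | None = None  # reference to a changeset that fixes the issues
    comments: list[str] = field(default_factory=list)
    element_ids: list[int] = field(default_factory=list)  # possible future extension
```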

A review process could then look like this (comments can be added by anyone during the entire process; the allowed state transitions are sketched in code after the list):

  1. The reviewer creates the review with a description and a default state of open
  2. The mapper gets a notification of the review
  3. The mapper can either fix the issue in a new changeset and add a reference to this changeset (changing the state to resolved) or reject the suggestions with a comment (changing the state to rejected)
  4. The reviewer can choose to accept the resolution or rejection, changing the state to closed, or change it back to open
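A sketch of who could be allowed to perform which state transition, assuming the states above (the staleness timeout and the DWG role come from the dispute cases below; all of this is illustrative, not a finished spec):

```python
# (from-state, to-state) -> actor allowed to make that transition.
# "system" applies the one-year staleness timeout; "dwg" covers disputes.
TRANSITIONS = {
    ("open", "resolved"): "mapper",    # fixed in a referenced new changeset
    ("open", "rejected"): "mapper",    # rejected with a comment
    ("resolved", "closed"): "reviewer",
    ("rejected", "closed"): "reviewer",
    ("resolved", "open"): "reviewer",  # resolution not accepted
    ("rejected", "open"): "reviewer",
    ("open", "stale"): "system",       # no activity for one year
    ("open", "closed"): "dwg",         # stalemate or inactive reviewer
}


def may_transition(actor: str, old: str, new: str) -> bool:
    """True if this actor may move a review from state `old` to `new`."""
    return TRANSITIONS.get((old, new)) == actor
```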

It gets more interesting in case of disputes, or when either the mapper or reviewer stops interacting.

  • If the mapper stops interacting, nothing really bad happens; the open review might reduce their rate limit or prevent them from receiving some badge, but that’s all on them, and they can fix it by simply responding. Reviews automatically get marked as stale after one year of no activity.
  • If the reviewer stops interacting, the mapper can request the DWG to step in and close the review (provided they have done what’s been requested of them, potentially gaining approval from the wider community first)
  • If locked in a stalemate (for example, reviewer keeps reopening, mapper keeps rejecting, the wider community has reached a consensus but mapper and/or reviewer aren’t following it), either can request the DWG to step in and close the review

In my original proposal, I tried not to involve the DWG (or other working groups) in the process so as not to add to their workload, but I think that in the cases described above the DWG would already have to step in and handle the situation today, even without explicit review features; they might just have to handle an edit war rather than a review war.


This would result in much more explicit reviews, and bring the following benefits compared to today’s flat comment discussion:

  • It is explicit that changes are requested, and how to resolve the review
  • It is explicit when the discussion is resolved
  • It provides input to e.g. rate limits and badges

Compared to my original proposal, it decouples positive from negative without introducing overlap.

Does something like this sound better?

Your proposal sounds reasonable, even good, to me. However, you did not indicate how to handle “welcomes”.

The thing is that there is already a system, and I am not sure this proposed system is better; the only benefit I see is that things are more explicit, which makes it easy to extract stats. It looks to me like you assign more value to that than I do.

Being more explicit also has drawbacks: the current system is lightweight, and adding overhead will mean people use it less. That can be offset if the people working with it see benefits compared to the current system.

No specific new way; this would not replace normal comments. Though if the welcome is also a review, then the welcome text could be part of the review.

Indeed, I think having a way to get statistics is an important reason, as a big part of the idea came out of the rate-limiting discussion. But statistics are, after all, the best way to lie, so this discussion is important to make sure we end up with a system that is hard to cheat.

This is the main reason why I initially proposed just a “reaction” system (and to a large extent that is still what I’d prefer). The more “full-blown” review system I have now proposed has some benefits, not least that it should address most of the doubts expressed here, but it is a lot more complex, which is a tradeoff to take into account.