Hashing to track contributors of notes

InsertUser · January 20, 2024, 3:51pm

IP addresses have to be acquired somewhere, whereas the headers used by fingerprints can be changed on the fly. If we ended up blocking whole IP ranges associated with popular VPN services then I’m OK with that, especially if legitimate contributors only have to spend five minutes setting up an account in order to post.

It’s a bit of a weird double negative, but no, I don’t think that an “IP address is not more persistent”. If you irreversibly hash the IP along with more transient information, then your persistence is limited by the most transient property hashed (at least as I understand it).
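To illustrate the point about persistence, here is a minimal Python sketch. The function name `contributor_id`, the choice of SHA-256, the 16-character truncation and the example values are all assumptions for illustration, not anything OSM actually does. It only shows that once a transient property such as the User-Agent header is mixed into the hash, changing that property changes the ID even though the IP stays the same.

```python
import hashlib


def contributor_id(ip: str, user_agent: str) -> str:
    """Hypothetical helper: hash the IP together with a more transient property."""
    material = f"{ip}\n{user_agent}".encode("utf-8")
    # Keep the first 16 hex characters purely to make the output readable.
    return hashlib.sha256(material).hexdigest()[:16]


same_ip = "203.0.113.7"  # address from a documentation-only range
id_a = contributor_id(same_ip, "Mozilla/5.0 (X11; Linux x86_64)")
id_b = contributor_id(same_ip, "Mozilla/5.0 (Windows NT 10.0; Win64; x64)")

# Same IP, different header: the two IDs no longer match.
print(id_a, id_b, id_a == id_b)
```

So an ID built this way only lasts as long as the least stable input does, which is the opposite of what you want for tracking a persistent abuser.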

An IP-based sequential token, or a hash with a server-side salt (that doesn’t live in GitHub), would make it either very difficult or impossible for another website to use its own information to de-anonymise OSM’s identifier. To my mind, they shouldn’t be able to reconstruct the ID just by knowing the IP that went into it, as they don’t have the salt. I don’t know enough about cryptography to be sure of this, though, so someone more technically literate would need to weigh in. FWIW, I’m also not sure how far OSM needs to go legally, so I don’t really know if it is workable.
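A rough sketch of the server-side salt idea, in Python. Everything here is an assumption for illustration: the names `SERVER_SECRET`, `keyed_id` and `unsalted_id` are made up, and I have used HMAC-SHA256 as the keyed hash, which is my reading of "hash with a server-side salt" rather than anything proposed for the OSM codebase. The comparison with a plain hash is there because an unkeyed hash of an IPv4 address can be recomputed by anyone who knows or guesses the IP.

```python
import hashlib
import hmac
import secrets

# Assumption: generated once and kept only on the server, never in the public repo.
SERVER_SECRET = secrets.token_bytes(32)


def keyed_id(ip: str, secret: bytes = SERVER_SECRET) -> str:
    """Hypothetical helper: HMAC-SHA256 of the IP under a server-side secret."""
    return hmac.new(secret, ip.encode("utf-8"), hashlib.sha256).hexdigest()


def unsalted_id(ip: str) -> str:
    """Plain SHA-256 of the IP, shown only for comparison."""
    return hashlib.sha256(ip.encode("utf-8")).hexdigest()


ip = "203.0.113.7"  # address from a documentation-only range

# Another website that logged the same IP can reproduce the unsalted hash...
outsider_plain = hashlib.sha256(ip.encode("utf-8")).hexdigest()
print(outsider_plain == unsalted_id(ip))

# ...but guessing a secret does not reproduce the keyed ID.
outsider_keyed = keyed_id(ip, secret=b"outsider-guess")
print(outsider_keyed == keyed_id(ip))
```

The server still gets the same ID every time it sees the same IP, so blocking works; what changes is that the mapping from IP to ID depends on a value outsiders do not hold. This only helps for as long as the secret stays secret.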

If we went the hash route, we could also consider deliberately truncating the result so that the final ID is short enough that we would expect some collisions. This would make it very difficult for anyone without the admin rights needed to see the full IP to be sure that any two comments definitely came from the same person, as two or three IPs might have been merged in the process and they wouldn’t know whether this had happened. This would of course mean there is a risk of sweeping up a few extra people when a “bucket” is blocked, but the odds are low, and probably dwarfed by the chance that they are behind a shared institutional IP address. I’ve seen this approach suggested in a different context (p 12 onward, although take this with a grain of salt as the author isn’t a cryptographic or legal expert). I don’t know how well it stacks up against UK and European (e.g. IE) guidance.
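The truncation idea can be sketched like this in Python. The secret, the `bucket_id` name and the 6-bit width are placeholders I picked so that collisions are easy to see in a tiny example; a real deployment would choose the width from the expected number of distinct IPs, and I am not claiming any particular width is appropriate.

```python
import hashlib
import hmac
from collections import defaultdict

SECRET = b"example-only-secret"  # placeholder; a real secret would live server-side


def bucket_id(ip: str, bits: int = 6) -> int:
    """Hypothetical helper: keyed hash of the IP, truncated to `bits` bits (1 to 32)."""
    digest = hmac.new(SECRET, ip.encode("utf-8"), hashlib.sha256).digest()
    # Take the first 32 bits of the digest and keep only the top `bits` of them.
    return int.from_bytes(digest[:4], "big") >> (32 - bits)


# Group 254 addresses from a documentation-only range by their truncated ID.
buckets = defaultdict(list)
for host in range(1, 255):
    ip = f"198.51.100.{host}"
    buckets[bucket_id(ip)].append(ip)

shared = {b: ips for b, ips in buckets.items() if len(ips) > 1}
print(len(buckets), "buckets in use,", len(shared), "shared by more than one IP")
```

With only 64 possible IDs and 254 addresses, several IPs are forced into the same bucket, so someone who sees only the bucket number cannot tell which of them posted. The same property is what causes the collateral blocking mentioned above: blocking a bucket blocks every IP that maps to it.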