User Tracking by Discourse

I just noticed in the profile settings that Discourse is engaging in tight user tracking. It tries to identify the machines used for logging in and seems to track sessions tightly.

What exactly does discourse track about a user’s activities?

Is there a way to remove/limit that?

From many discussions in the past I think it is against the spirit of an open community like OSM to do such Google-Style client tracking. This should provoke complaints from the more idealistic and privacy-aware members of the community or keep them away from using discourse at all.

And while we’re on the topic:
Has this been examined from a legal perspective?
Is all the data collected required and justified for operation of discourse?
Is the data deleted automatically after it is no longer needed? AFAIK reasonable retention times for web sites do not exceed 7 days.
Is there a privacy concept formulated for it and documented?
To my knowledge the GDPR requires all of that, and OSMF is a large enough organization that it had better make sure there is no obvious violation.

5 Likes

Can you explain what exactly you are referring to? Fairly obviously a site like this can’t function without basic tracking of login sessions so presumably there is something else that you are referring to when you talk about tracking sessions tightly?

1 Like

I think @Nop is referring to the list of recently used devices in preferences/security.
That’s afaik a security feature, but the device recognition might seem spooky at first to the privacy-aware user. As I understand it, Discourse does this as securely as possible, see:

The other part is imho a very important question: What about compliance with GDPR and similar laws?

We are self hosting Discourse. If there are any settings you believe we should fix, I am happy to hear them.

With regards to GDPR and similar laws, we obviously want to stay in compliance. We are all volunteers with good intentions.

4 Likes

Yes, exactly.

I don’t have more to say than that but stupid Discourse won’t let me post less than 20 characters.

New topic: Can we reduce that stupid minimum number in some setting?

2 Likes

As I know from doing the GDPR compliance for several private websites, this can be quite some work and give you some surprises, e.g. that an Apache server with default settings is violating GDPR.

The post sounds like there have been no concrete steps/results yet to ensure GDPR compliance?

I am not OSMF’s legal team; speak to them if you want a more complete answer. I am part of the sysadmin Ops team, and we do take steps to stay in compliance with GDPR.

If you have anything specific you believe we should be doing, please let us know.

1 Like

Link please.

(extra text added to satisfy 20-character minimum limit)

1 Like

That’s discourse’s safe default to incentivize people to post meaningful replies, rather than just “Yes, no, I like it, sure thing…” that adds little value to conversations.

Having said that, we can evaluate after some time if 20 is the right setting or not.

I can’t give you quick links; I did that research when GDPR came out. And of course IANAL. But these were my findings: by default, Apache logged every IP address for every access and kept the logs for a year. IPs are considered personal data, they are collected without consent, and there is no justification for keeping them for a year => GDPR violation. A safe period appeared to be a week, and for compliance you can delete the logs or filter the IPs out of them (that’s what I am doing).
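For anyone wondering what "filtering the IPs out" could look like in practice, here is a minimal Python sketch. It is not necessarily what is done on any particular server: it assumes the client address is the first space-separated field of each line (as in Apache's common and combined log formats), and it truncates the address to its network part rather than removing it entirely. The function name `anonymize_line` and the prefix lengths (/24 for IPv4, /48 for IPv6) are illustrative choices, and whether truncation is sufficient for compliance is a legal question this sketch does not answer.

```python
import ipaddress


def anonymize_line(line: str) -> str:
    """Truncate the leading client IP of an access-log line.

    Assumption: the client address is the first space-separated field.
    Lines whose first field is not an IP address are returned unchanged.
    """
    host, sep, rest = line.partition(" ")
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return line  # e.g. a hostname or an empty line; leave untouched
    # Illustrative prefix lengths: keep the first 24 bits (IPv4) or 48 bits (IPv6).
    prefix = 24 if addr.version == 4 else 48
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return f"{network.network_address}{sep}{rest}"


if __name__ == "__main__":
    sample = '203.0.113.57 - - [10/Oct/2022:13:55:36 +0000] "GET / HTTP/1.1" 200 512'
    print(anonymize_line(sample))
```

Run over a log file line by line, this keeps the rest of each entry (timestamp, request, status) intact while dropping the host-specific part of the address.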

I’m surprised by this, since some sites are obligated to keep logs for legal reasons. Having a link to something that backs this up would be important (also whether this relates to a ruling at EU level or country level).

In France, IP addresses are considered personal data, and we also have legal requirements to keep server logs for one year.

When you go into the details… we legally need to keep logs ONLY for content published or modified, not for accessing/reading content, but almost nobody knows that, and logs usually keep everything by default.

As long as you do nothing with that data and do not share it with a third-party nobody will complain.

1 Like

It logs every IP address, but Apache does not have any log expiry at all. Most distributions do of course include other tools like logrotate to manage that. I can’t remember what the default on Debian/Ubuntu is, but on Fedora it’s 4 weeks.

As far as OSM goes we default to 14 days normally for Apache. I believe that is all documented in the privacy policy.
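To illustrate what a retention window like that means operationally, here is a rough Python sketch of a time-based purge. It is not how logrotate works internally (logrotate rotates files and keeps a configured number of rotations) and it is not OSM's actual setup; the function names, the `*.log*` file pattern, and the use of file modification time are all assumptions made for the example.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def expired_logs(log_dir, max_age_days, now=None):
    """Return log files in log_dir last modified more than max_age_days ago.

    Assumption: log files match the pattern '*.log*' (e.g. access.log,
    access.log.1) and their modification time reflects their age.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    return sorted(
        p for p in Path(log_dir).glob("*.log*")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


def purge_expired_logs(log_dir, max_age_days, now=None):
    """Delete the expired log files and return the paths that were removed."""
    removed = expired_logs(log_dir, max_age_days, now=now)
    for path in removed:
        path.unlink()
    return removed
```

Calling `expired_logs("/var/log/apache2", 14)` first (the path is only an example) gives a dry-run list before anything is deleted with `purge_expired_logs`.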

Discourse is a bit of a special case because it’s using the upstream docker containers and I’m not sure how they handle logging.

1 Like

I’ve checked on the instances I’ve deployed and the nginx log is kept only for a week.

3 Likes