I'm wondering where the line is between acceptable and unacceptable logs. Obviously no one appreciates analytics used by marketing teams, but virtually every internet service has logs used by engineers (which seem to be what this post is about). A few factors that seem relevant:
- Is the service running locally?
- Do we trust/expect that the data is not used for marketing (e.g. would the user have complained if the domain had been "error-reporting.dropbox.com")?
- Is the data anonymous? (Think twice, everyone who has IPs or user IDs in request logs.)
- Did we agree to relevant ToS or privacy policies?
If we think carefully about this, I'd bet that most people here have used or even implemented some form of logging that has privacy problems.
It’s not that difficult to collect anonymous (in practice, usually pseudonymous) usage stats. Of course you don’t store IPs, computer names, user names, emails, detailed geographical data, etc. I think this used to be a lot messier, but these days GDPR makes it fairly easy to draw the line. Basically, store nothing that identifies an individual, and don’t attach so much data (entropy) to one pseudonymous user that they could be re-identified.
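To make that concrete, here is a rough sketch of what scrubbing an event before it reaches the log store could look like. Everything in it is an assumption for illustration: the field names, the `scrub_event` and `pseudonymize` helpers, and the choice of a truncated HMAC-SHA256 as the pseudonym are mine, not something any particular service does. Note that a keyed hash only gives you pseudonymous data, not anonymous data, so the entropy concern above still applies to whatever fields you keep.

```python
import hashlib
import hmac
import secrets

# Hypothetical field names; adjust to whatever your events actually contain.
IDENTIFYING_FIELDS = {"ip", "hostname", "username", "email", "lat", "lon"}

# Assumption: the key lives outside the log store and is rotated now and then,
# so pseudonyms cannot be linked across rotation periods.
PSEUDONYM_KEY = secrets.token_bytes(32)


def pseudonymize(user_id: str, key: bytes = PSEUDONYM_KEY) -> str:
    """Return a truncated keyed hash of user_id (stable for a given key)."""
    digest = hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def scrub_event(event: dict, key: bytes = PSEUDONYM_KEY) -> dict:
    """Drop directly identifying fields and swap user_id for a pseudonym."""
    cleaned = {
        name: value
        for name, value in event.items()
        if name not in IDENTIFYING_FIELDS and name != "user_id"
    }
    if "user_id" in event:
        cleaned["pseudonym"] = pseudonymize(str(event["user_id"]), key)
    return cleaned


# Example event with made-up values (203.0.113.0/24 is a documentation range).
event = {
    "user_id": "42",
    "ip": "203.0.113.7",
    "email": "a@example.com",
    "action": "sync_failed",
    "app_version": "1.2.3",
}
print(scrub_event(event))
```

An allowlist of fields to keep would be stricter than the blocklist used here, since new identifying fields would then be dropped by default.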