Tencent provides a list of regexps, and anything matching those regexps is passed to them. As far as I can tell, we don't get to know what those regexps are (and presumably they can be changed at Tencent's whim). Can you not see the issue?
We don't know whether Tencent can use arbitrary regexes to find, I don't know, anti-Chinese sentiment, or only preapproved ones like "tencentToken=([a-zA-Z\d]{15})". Also, it's just for public repositories!
In any case, this announcement changes nothing. If you trusted GitHub with something before that you wouldn't trust them with now, your mental model is wrong. GitHub might allow any kind of partner (customer?) to scan their private or public repos in any way they want without making it public. In other words, if you are someone this announcement is problematic to, you shouldn't have anything on GitHub in the first place.
I cannot see the issue because the regex are pre-approved by GitHub. And even then, the service will only return the matched string, not who wrote it. Unless GitHub approves /John Doe said:.*/ there is no issue whatsoever.
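To make that concrete, here is a minimal sketch of what "only return the string" could look like. It assumes the scanner runs a fixed table of approved patterns and reports just the captured token. The pattern is the hypothetical one quoted upthread, not a real Tencent pattern, and `APPROVED_PATTERNS` and `scan` are names made up for illustration; nothing here is GitHub's actual implementation.

```python
import re

# Hypothetical approved pattern, taken from the example quoted in this thread.
# It is NOT a real Tencent token format.
APPROVED_PATTERNS = {
    "tencent": re.compile(r"tencentToken=([a-zA-Z\d]{15})"),
}


def scan(text):
    """Return (partner, token) pairs for every approved pattern that matches.

    Only the captured token is reported; no author, repo, or surrounding
    text is included in the result.
    """
    hits = []
    for partner, pattern in APPROVED_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((partner, match.group(1)))
    return hits


sample = 'config = "tencentToken=abc123DEF456ghi"\n# John Doe said: hello'
print(scan(sample))
```

Under these assumptions the "John Doe said" line produces no hit at all, because no approved pattern matches it. The whole argument then rests on what ends up in that table of patterns.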
> I cannot see the issue because the regex are pre-approved by GitHub.
GitHub is a private company with one dual obligation, to prolong its existence and keep increasing its profit margin.
It is not any sort of arbiter for morality - morality being an externality to its central obligation - so it cannot be relied upon to “do the right thing”.
So it is not in any position of authority that would enable it to “approve”, in the moral sense of the word. They can only “allow” for the regex to be run and the results sent off.
For example, the “right thing” for GH would be to increase profit, while for another entity might instead be to uphold its users’ privacy.
(You may think that it’s only for public repos, so they’re already made public, but isn’t GH here facilitating an aggressive collection and aggregation of information that would otherwise be much more difficult and error-prone?)
The power of approval would rather come from an elected entity that would also determine who may request that such searches are executed, and which reasons would be valid.
Otherwise, we get a William Gibson-esque megacorp cyberspace future with clear but corporate Orwellian overtones.
Isn’t this obvious?
(I’m not being snarky at all - I’m genuinely asking: isn’t this glaringly and terrifyingly obvious?)
Generally when you come up with something from first principles that appears to be "terrifyingly obvious", that thing is false.
Microsoft's mission in life is to do whatever its directors want, insofar as its shareholders don't get /too/ annoyed about this, and to not break the law. They have no actual "obligations" or "fiduciary duties" to keep increasing profits or anything like that.
So they have no duty towards the general public either, until the public gets too irritated with them and calls for oversight. How’s that better, or any different at all?
They’re not out to profit from anything they can get though. They’re out to do Microsoft things.
This is pretty obvious; most companies stay in their industry even if it becomes irrelevant. Furniture companies rarely switch to software and software companies rarely just start selling heroin. (Except for gacha games.)
I guess I just have a lot less faith in the ability of companies to design perfect processes, and the ability of humans to perfectly carry them out, than you do.
And? What are you going to do with a single token when you don’t know which company it belongs to? But obviously those devs at GitHub don’t know what they’re talking about so they’ll gladly notify two companies at once.
It’s super easy too: take a look at GitHub’s tokens, they all start with gh.
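A rough sketch of that kind of prefix routing, assuming each issuer has a distinctive prefix. The GitHub entry follows GitHub's documented token prefixes (ghp_, gho_, ghu_, ghs_, ghr_). The `tc_` prefix and the names `PREFIX_PATTERNS` and `owner_of` are invented for illustration.

```python
import re

# Map each issuer to a pattern anchored on its token prefix.
# "github" uses GitHub's documented prefixes; "example-partner" and its
# "tc_" prefix are hypothetical.
PREFIX_PATTERNS = {
    "github": re.compile(r"^gh[pousr]_[A-Za-z0-9_]+$"),
    "example-partner": re.compile(r"^tc_[A-Za-z0-9]+$"),
}


def owner_of(token):
    """Return the issuer whose prefix pattern matches the token, or None."""
    for owner, pattern in PREFIX_PATTERNS.items():
        if pattern.match(token):
            return owner
    return None


print(owner_of("ghp_" + "a" * 36))
print(owner_of("no-known-prefix"))
```

With non-overlapping prefixes, a token can match at most one issuer, so there is exactly one company to notify.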