Hacker News

So I've just tested it, and I can confirm: yes, Copilot refuses to give suggestions related to gender. Now I know a lot of people are calling this absurd, but looking more closely, there are two PR nightmare scenarios.

1. Copilot makes a suggestion that implies gender is binary; a certain community explodes with anger, and an entire news hype cycle starts about how Microsoft is enforcing views on gender with code.

2. Copilot makes a suggestion that implies gender is nonbinary; a certain community explodes with anger, and an entire news hype cycle starts...

You can't win... so why not plead the Fifth?

To all those claiming this is an example of "wokeism", remember that the proper response from an individual who believes in nonbinary gender would be to offer suggestions of that sort. There is no advocacy here. Mum's the word.



Those aren’t the only options. You can just let it suggest what it is going to suggest. Copilot is a product for adults who should be able to comprehend what machine learning is. Anybody who throws a fit about it will only be exposing themselves as a fool.


I might even share your idea about how adults _should_ behave. But that doesn't invalidate fny's musings based on how adults _do_ behave.


I would love to see an origin/Latin etc. breakdown of the word behave. One of my least favorite words (authority issues much? Yes).



The problem is that if you train an ML model with a bunch of data that happened to be available in the past, then the system will perpetuate the same biases as were inherent in the training data. This leads to the (real issue) Google image classifier categorizing an image of a black man as a "gorilla" etc.

Certain words are heavily loaded and are worth just skipping to avoid all the hassle for now.


Btw, the gorilla incident was overblown, in the sense that people of other races (including white people) were also classified as various hilarious animals.

Gorilla and black just happened to be the most politically charged pairing of the bunch.

(The other potentially politically charged one was a tendency to misclassify people with various levels of body fat as various animals.)

> Certain words are heavily loaded and are worth just skipping to avoid all the hassle for now.

If memory serves right, that was Google's pragmatic solution: if they detected a human in the picture, they 'manually' suppressed the animal classification.

So they lost being able to classify 'Bob and his dog' in return for not accidentally classifying a picture of just Alice as a picture of a seal.
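That trade-off is easy to sketch. A toy version of such a post-hoc filter might look like this (purely illustrative; label names are made up and Google's actual implementation is not public):

```python
# Toy sketch of a post-hoc label filter: if a person is detected,
# suppress all animal labels rather than risk an offensive match.
ANIMAL_LABELS = {"gorilla", "seal", "dog", "cat"}

def filter_labels(labels):
    """labels: list of (name, confidence) pairs from a classifier."""
    if any(name == "person" for name, _ in labels):
        return [(n, c) for n, c in labels if n not in ANIMAL_LABELS]
    return labels

# A photo of just Alice loses the spurious "seal" label:
print(filter_labels([("person", 0.98), ("seal", 0.40)]))  # [('person', 0.98)]
# ...but "Bob and his dog" loses the legitimate "dog" label too:
print(filter_labels([("person", 0.97), ("dog", 0.95)]))   # [('person', 0.97)]
```

The second call shows exactly the "Bob and his dog" cost described above.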


[flagged]


No, not at all.

I see it as little more than GPT-3 having a list of words like "cunt", "fuck" and "shit" and realizing that there is little to be gained in including these words right now, so skipping them makes sense until we figure out some more urgent things first.
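As a rough mental model, such a word-level gate might be nothing more than this (a guess at the shape of the mechanism, not the real implementation; the actual list and matching rules are not public):

```python
# Crude output gate: refuse to emit a completion that contains any
# blocklisted word. The list and matching logic are illustrative guesses.
BLOCKLIST = {"cunt", "fuck", "shit"}

def is_blocked(completion):
    words = completion.lower().split()
    return any(w.strip(".,;:!?") in BLOCKLIST for w in words)

print(is_blocked("well, shit happens"))  # True
print(is_blocked("return item_count"))   # False
```

Naive matching like this both over- and under-blocks, which is why it reads as a stopgap "until we figure out some more urgent things first".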


It’s not censorship; it isn’t muzzling you. Microsoft is choosing not to emit speech on this topic.

It is a deliberate and voluntary omission, not censorship.


Microsoft is censoring itself. Which they are allowed to do.

I am censoring myself, too.


If you insist? I suppose


[flagged]


>That's just reality.

Except it really isn't. If the datasets used truly represented everyone in the world, that would be a reasonable argument. The point is that right now, the most cheaply available and voluminous datasets online tend to have a whole bunch of examples from Western nations and far fewer from other parts of the world, for the simple reason that historically most of the people taking photographs and putting them on webservers were from those places.

"Reality" doesn't have the same statistical anomalies as these data sets (e.g. there are a hell of a lot more people with brown skin in the world than are included in common training data), so "that's just reality" really isn't a strong argument.

This is a very, very common problem in ML and isn't limited to politically charged words. For example, in some of the earliest attempts at using computer vision to detect tanks in an image for military purposes, the photographs with tanks in them all had different lighting than the ones without tanks, and so the (super-simplistic) ML model overfit based on a bias in the data. Unless the data set is truly representative, you'll often get biases in the resultant model.

> If you have your own set of politically correct answers hardcoded by a team of blue haired people you're not doing machine learning.

Well, this is just silly. We all know Deepmind has a policy of only allowing green hair dye on campus.


> Copilot is a product for adults [...]

If you didn't mean "should be" (for which I'm not willing to take any position), no, Copilot is not a product for adults [1] [2].

[1] https://docs.github.com/en/site-policy/github-terms/github-t... "A User must be at least 13 years of age."

[2] https://docs.github.com/en/site-policy/github-terms/github-t...


I'm sure the commenter didn't mean adults as in legal adults, but as in someone who understands what machine learning is and won't throw a fit if the computer says something they disagree with.

A 13-year-old is perfectly capable of that; I know many 40-year-olds who aren't.


A minimum age for accepting terms of use isn't the same thing as a target demographic.


No, but the minimum age requirement affects the handling of questionable content, which can never be "doing nothing" as the GP suggested.


Fair point.


That implies corporations are run by adults who aren't confusing Twitter with the real world and aren't afraid to tell the screeching activists to leave them alone. Nothing we've seen in the last decade suggests that's even close to being the case.


Not every country Microsoft does business in has the same mores as the Western world.



Didn't know about this. Thank you for making my day. (Disclosure: I used to work at Microsoft)


Most humans are fools. And you'll get a lot of flak if they think you stepped on their toes.


Agreed. The answer is approved by Dave Cheney, who works at GitHub, and if you've ever attended one of his talks, it's plain to see he's a very scrupulous person. I also don't think this is an example of Microsoft taking a side; rather I read it as them refusing to bat, which seems fine.

What I would've preferred one of these threads to be about is how all of this works. Like, how do they post-hoc filter certain things? Is that the only way to deal with things defined as issues in ML?


Making Copilot stop in its tracks when it sees the word "gender" and refuse to continue until the word is removed is still making a statement. Refusing to bat would be treating "gender" as a meaningless token, just as if you'd typed "traqre" instead.


No, refusing to generate stuff in an area where the output is likely to be controversial (in either direction) is refusing to bat. It'll wait for a pitch that it thinks it can hit, just like it refuses to play for many other categories -- you'll have a hard time getting Copilot to enumerate races, too.

Ignoring the potential offensiveness and YOLOing through it is swinging the bat wildly at every pitch.


I think you might not fully grasp the scope of the issue here. Right now, if a file you're editing contains one of the restricted words, Copilot will refuse to make any suggestions at all in that file while that word is present -- even if the word isn't relevant to the part of the file you're editing. To keep to the baseball metaphor, Copilot is going on strike at the first whiff of controversy.

What I'm suggesting is that Copilot should keep working when these words are present, but refuse to attach any significance to the specific word. This could probably be implemented by replacing the problematic words with randomly generated strings before processing the text, then swapping those strings back afterwards.
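That placeholder idea can be sketched in a few lines (a hypothetical illustration of this proposal, not anything Copilot actually does; names and the word list are made up):

```python
import re
import uuid

SENSITIVE = ["gender"]  # illustrative; the real filter list is not public

def mask(text):
    """Swap each sensitive word for a random placeholder before the model sees it."""
    mapping = {}
    for word in SENSITIVE:
        placeholder = "tok" + uuid.uuid4().hex[:8]  # hex only, cannot spell a real word
        mapping[placeholder] = word
        text = re.sub(re.escape(word), placeholder, text)
    return text, mapping

def unmask(text, mapping):
    """Restore the original words in the model's suggestion."""
    for placeholder, word in mapping.items():
        text = text.replace(placeholder, word)
    return text

src = "def save_gender(user, gender):"
masked, mapping = mask(src)
assert "gender" not in masked          # the model never sees the word
assert unmask(masked, mapping) == src  # the round-trip is lossless
```

This would keep suggestions flowing while denying the model the literal token.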

(It could be reasonable for Copilot to refuse to make suggestions at all if the output would contain truly offensive language, like unambiguous racial slurs or sexual terms. But "gender" clearly isn't that.)


> Right now, if a file you're editing contains one of the restricted words, Copilot will refuse to make any suggestions at all in that file while that word is present -- even if the word isn't relevant to the part of the file you're editing.

Yah -- it's unfortunate, but it's easy. It might be OK to tolerate the word if it's clearly outside the range of tokens used in suggestions, but the filtering doesn't operate on tokenized text.

> What I'm suggesting is that Copilot should keep working when these words are present, but refuse to attach any significance to the specific word. This could probably be implemented by replacing the problematic words with randomly generated strings before processing the text, then swapping those strings back afterwards.

The problem is, the trained model is much smarter than the keyword-based filtering. If you just white out the watchwords, it still has a pretty good chance of gleaning context and making a commentary on gender that Microsoft would rather not deal with.

> (It could be reasonable for Copilot to refuse to make suggestions at all if the output would contain truly offensive language, like unambiguous racial slurs or sexual terms. But "gender" clearly isn't that.)

Right now the list covers quite a large variety of things, mostly racial slurs and sexual terms. But letting an AI ramble on after "blacks" is kind of dangerous, as are various gender-related terms that do have innocuous interpretations. It's easy to put words in a filter list, and much harder to apply nuance to topics that even humans struggle to be nuanced about.


Yes, but people haven't quite figured out WHY they should be offended at Microsoft for doing this, so it's quite convenient for Microsoft to act before people discover their reasons for being mad.


> I also don't think this is an example of Microsoft taking a side; rather I read it as them refusing to bat, which seems fine.

You can't be neutral on a moving train, as they say.


I don't get the whole discussion. There are just many different models of gender. It's like particles vs. waves. In one model there are only two genders; in another, five. There are those who say gender is culture and sex is real, and those who say sex is constructed, too. Some models describe reality better than others; some are useful, some are harmful. But nobody can or should stop you from thinking about reality with the model of your choice.

If I were Microsoft, I would post a shrugie and say copilot offers arbitrary responses based on the actual code it reads; it is not supposed to be "correct" or good or fair, but just follow what it sees other people do.


>>Nobody can or should stop you from thinking about reality with the model of your choice.

While I agree with you, that is very much the game that is being played here. We have competing world views and one way to help a world view dominate is to play a linguistic war. That was the point of Newspeak in 1984 (https://en.wikipedia.org/wiki/Newspeak). If you control the language such that competing ideas are instantly taboo just by the words required to describe them you can stop people from promulgating those ideas. So you gain ground without ever having to debate the new ideas.

This has happened in many countries when one religion dominated. Western society was starting to get to the point where its taboos were being shed and ideas could win based on their merit. Sadly, we're regressing back to a society controlled by dogma rather than an open exchange of ideas. I suspect this is the normal state of human societies; we fluctuate between open and closed societies.


> Western society

The problem seems fairly limited to the USA from where I stand.


I'm in NZ and it's very much here as well.


The whole point of (post-)structuralist philosophy, which informs the left-wing view on this, is that all language is already newspeak. (And since it follows that you can't choose not to play, you may as well play to win.)

> Western society was starting to get to the point where its taboos were being shed and ideas could win based on their merit.

Name a year that things were actually better. (In the 40's, before the civil rights movement? In the 80's, when queer people were still regularly oppressed and excluded from participation in society? Did ideas "win on their merit" when police beat up people in gay bars?)


>The whole point of (post-)structuralist philosophy, which informs the left-wing view on this, is that all language is already newspeak. (And since it follows that you can't choose not to play, you may as well play to win.)

Exactly that. This is the current conflict.

Things were not better in the 40s; action movies were better in the 80s :p.

Things were trending towards a more open society; they had not become perfect by any stretch of the imagination. That trend, IMO, has reversed due to the tactics involved. That is not to say some groups haven't benefited from this. There is a genuine drive to create a utopia here. However, I fear the cure might be worse than the disease.


> If I were Microsoft, I would post a shrugie and say copilot offers arbitrary responses based on the actual code it reads; it is not supposed to be "correct" or good or fair, but just follow what it sees other people do.

The last time Microsoft did that, they ended up with their bot posting racist content on Twitter. They of all people understand that just following what people do on the internet is a recipe for disaster.


> Some models describe reality better than others, some are useful, some are harmful.

The idea of science is to get rid of models that are wrong.


It’s really not some complicated multiverse of possibilities. It’s biological, very factual and the underlying genetic basis is as objective as something can get.


There are actually corner cases here, although that’s not what usually comes up: https://en.wikipedia.org/wiki/Intersex

Just a reminder that often reality is more complicated than we think. Names, numbers and upper/lower case are the usual examples.


> There are actually corner cases here, although that’s not what usually comes up: https://en.wikipedia.org/wiki/Intersex

No biologist would claim that sex is constructed.


Choosing door 3 unfortunately leads to ...

A certain community explodes with anger since their machine learning dev-tooling is closed and has arbitrary restrictions.

If you try to please everybody, someone won't like it.


Unfortunately, the people who care (like HN people) are less likely to spend time organizing protests and riling up an internet mob.


I'd argue it's fortunate. While it does seem like a good idea to get out there and promote your side's view of things, I suspect the best option is to excel in your life and rise to influence.

My hope is that the people here, at least the level-headed ones, will rise to positions of influence, and not the people rioting at every chance.


I'm going to have to say it is ridiculous, because by this reasoning there are all sorts of problem areas that Copilot-generated code is going to have to avoid:

Let's not handle ethnicity either; if we're going to be sensitive about gender, that is an area which is also sensitive for many people.

Should it take border disputes etc. into consideration? If you're using it in country X, and country X thinks a particular area belongs to them despite most of the world disagreeing, will you not be able to use Copilot to generate code that supports your remote employer's international operations?

It would make better sense if Copilot could issue warnings, and when you wanted gender, put up some sort of warning about that -- or allow you to choose binary-gender / multi-gender solutions.

The idea that it should fail, and that it makes sense for it to do so, is essentially a critique of the whole code-generation idea.

On edit: obviously HN should be able to come up with lots of other things that might cause media-related problems if Copilot handled them: code to detect slurs, etc.


The nightmare scenario is caving to either mob. There is no good reason to moderate this.


It’s just following the old advice not to talk about religion.


This is similar to the stupid branch rename saga. It is certainly pointless, but not doing it could be disastrous.


> Copilot makes a suggestion that implies gender is binary

How would that work though? What can Copilot suggest that can imply that?

  If gender is true 
     Do something…
  Else if gender is not true
     Do something else
  Else
      Do nothing


There is a safe version of gender. Grammatical gender is, for now, binary and as far as I'm aware not offensive to most.

But I agree you can't avoid offending people. The world is nuts; everything is offensive to someone.


Grammatical gender is not as simple/uniform as you state https://en.m.wikipedia.org/wiki/List_of_languages_by_type_of...


Thank you, I stand corrected.


Solution: let the user choose their political stance on such a polarized topic in the Copilot settings, so that they get suggestions that fit their stance.


The solution is conceptually simple (no idea of practicality): propose an answer related to the context.

And also: publish the list of banned words.


It's only a PR nightmare because it's a closed service and not an open tool.


Pick 95% of your users, not a hard choice.


They have. 95% don’t give a shit tbh :)


[flagged]


> such a group is a very small minority

They got "master" changed to "main" in Github.


They also got a code of conduct popularized that explicitly seeks to moderate behaviour on unrelated platforms.


You beat me to it: as stupid as it is, you don't want to deal with this small minority.

People on the other side, who complain about this being an "intrusive, misguided attempt at preventing discrimination", should take the time to talk to them and say: hey, this is not about you.


That only shows that GitHub is willing to bend to that group and nothing else.


It's total nonsense. How can someone be angry at a soulless machine? Is it a real thing to direct anger towards an AI as if it were a real human? That would be a serious mental problem, since the anger is actually directed inward in this case.


The anger is clearly not at the "soulless machine", but at the people and corporation that built, trained, and tend to it. The parent comment did not say "the community explodes with anger [at copilot]", they just said "with anger".

You have made up a total strawman. It is like if someone said "If that person were stabbed with a knife, they would be angry", and you responded "Do people really get angry at emotionless knives? That's a mental problem, their anger is directed inward".


Yeah, you're right, thanks for untangling it. Still, you made up something too, since I actually wanted to say not "Do people really get angry at emotionless knives?" but "Do people really get angry at knife manufacturers and knives?", taking your example. I mean, you can only be angry with a person who used the knife incorrectly, and knife factories don't dull their knives the way Microsoft did with Copilot.



