It is unsurprising that a system trained on human-generated content might end up encoding implicit bias, toxicity, and harmful goals. And the more powerful and general-purpose a system is, the more readily it can be turned to a wide range of harmful purposes.
Neither specializing the model nor filtering its output seems to have worked reliably in practice.