2. There is a lot of ongoing work on mechanistic interpretability, e.g. by Anthropic, showing that we can understand the internals of LLMs better than initially thought.