Reminds me of an interview with Kyle Fish on 80k hours [1], where he talks about his research and how when models talk to each other:
> [...] something consistently strange: the models immediately begin discussing their own consciousness before spiraling into increasingly euphoric philosophical dialogue that ends in apparent meditative bliss.
> [...] something consistently strange: the models immediately begin discussing their own consciousness before spiraling into increasingly euphoric philosophical dialogue that ends in apparent meditative bliss.
Keen to see if that happens here.
[1]: https://80000hours.org/podcast/episodes/kyle-fish-ai-welfare...