Short answer: for all practical purposes, yes, it can and it does.
For each specific example, there is no way to tell for sure (AFAIK) whether the example was in the training set. But you can easily run some experiments yourself, inventing your own words that are unlikely to be in the training set, especially when taken together.
I have done this, and GPT-4 will frequently make inferences on par with the "optimystic" one. For example, I just tried "surfrandma" and it said "It appears to be a combination of the words "surf" and "grandma", but without additional context, it's challenging to provide a precise meaning."
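If anyone wants to try this themselves, here is a minimal sketch of such an experiment. It assumes the openai Python package (version 1.0 or later) and an API key in the environment; the model name and prompt wording are just placeholders, adjust to whatever you have access to:

    # Probe GPT-4 with an invented compound word and see what it infers.
    # Assumes `pip install openai` and OPENAI_API_KEY set in the environment.
    from openai import OpenAI

    client = OpenAI()

    made_up_word = "surfrandma"  # invent your own; the less plausible, the better
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": f'What might the word "{made_up_word}" mean? Explain your reasoning.',
        }],
    )
    print(response.choices[0].message.content)

Trying several freshly invented words in a row gives a rough sense of how consistently it decomposes them, without having to trust any single example.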
Like just about anything. And the measure is something like "does someone who has spent some time with GPT-4 find it at all surprising that it can do X?". A posteriori, it would be much more surprising if GPT-4 failed to resolve "optimystic" to "mystic" and "optimistic", even though it is handicapped by its token encoding when it comes to wordplay.
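To see the encoding handicap concretely, you can inspect how the BPE tokenizer splits these words. A rough sketch with the tiktoken package (the exact token boundaries are whatever the current encoding produces, I am not asserting them from memory):

    # Show how GPT-4's BPE tokenizer splits made-up words into subword pieces.
    # Assumes `pip install tiktoken`.
    import tiktoken

    enc = tiktoken.encoding_for_model("gpt-4")
    for word in ["optimystic", "surfrandma"]:
        tokens = enc.encode(word)
        pieces = [enc.decode([t]) for t in tokens]
        print(word, "->", pieces)

The model never sees individual letters, only these subword chunks, which is why character-level wordplay is harder for it than word-blend recognition.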
It's the problem with fully proprietary AI like this: you cannot prove that a given question and answer weren't in the training set, so you cannot argue for its ability to infer or reason.
You're making my point for me: a fully closed-source language model cannot be properly evaluated, because there is no way to know why it replies the way it does.
I wonder if "optimystic" shows up at all in the training data, or if this came purely from some ability to detect the two source words.