
It would if the language model did reasoning according to the rules of logic. But it doesn't. It uses Markov chains.

To me it makes no sense to say that an LLM could explain its own reasoning if it does no (logical) reasoning at all. It might be able to explain how the neural network calculates its results. But there are no logical reasoning steps in there that could be explained, are there?



Honest question: are we sure that it doesn’t do logical reasoning?

IANAE but although an LLM meets the definition of a Markov Chain as I understand it (current state in, probabilities of next states out), the big black box that spits out the probabilities could be doing anything.

Is it fundamentally impossible for reasoning to be an emergent property of an LLM, in a similar way to a brain? They can certainly do a good impression of logical reasoning, better than some humans in some cases?

Just because an LLM can be described as a Markov Chain doesn’t mean it _uses_ Markov Chains? An LLM is very different to the normal examples of Markov Chains I’m familiar with.

Or am I missing something?
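
To make the "described as vs. uses" distinction concrete: the Markov property only constrains the interface, not the mechanism behind it. A toy Python sketch, where next_token_distribution is a hypothetical stand-in for whatever the black box does, lookup table or neural network:

    # The Markov-chain "interface": current state in, next-token probabilities out.
    # How the probabilities are produced is unconstrained; a bigram lookup table and
    # a deep network both satisfy the definition. Names here are hypothetical.
    from typing import Dict, Sequence

    def next_token_distribution(context: Sequence[str]) -> Dict[str, float]:
        """Return P(next token | context). Could be a table lookup, could be an LLM."""
        # Toy stand-in: a fixed distribution, just to keep the sketch runnable.
        return {"yes": 0.6, "no": 0.4}

    state = ["Is", "this", "reasoning", "?"]
    print(next_token_distribution(state))   # "current state in, probabilities out"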

In any case, cognitive emulation (CoEm) is an interesting related idea for constraining AIs to think in ways we can understand better:

https://futureoflife.org/podcast/connor-leahy-on-agi-and-cog...

https://www.alignmentforum.org/posts/ngEvKav9w57XrGQnb/cogni...


All programs that you can fit on a computer can be described by a sufficiently large Markov chain (if you imagine all the possible states of the memory as nodes). Whatever the human brain is doing is also describable as a massive Markov chain.

But since the number of states grows exponentially with the size of the memory, this is a very nitpicky and meaningless point.

Clearly, to say something is a Markov chain and have that mean something, you need to say that what it is doing could be more or less compressed into a simple Markov chain over bigrams or something like that. But that is just not true empirically, not even for GPT-2. Even this is already hard to reduce to a reasonably sized Markov chain: https://arxiv.org/abs/2211.00593

Just saying that it outputs probabilities from each state is not enough; the states are English strings, and there are (number of tokens)^context_length possible states for a given context length. That is not a Markov chain you could actually implement or run.
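
A back-of-the-envelope calculation makes the point concrete. The vocabulary size (50257) and context length (2048) below are illustrative GPT-2-ish numbers, not a claim about any particular model:

    # How large would an explicit Markov chain over full contexts be?
    # vocab_size and context_length are illustrative GPT-2-style figures.
    vocab_size = 50257
    context_length = 2048

    num_states = vocab_size ** context_length          # exact, thanks to Python big ints
    print(f"roughly 10^{len(str(num_states)) - 1} possible states")
    # An explicit transition table over that state space could never be stored,
    # which is why "it's technically a Markov chain" says little about what the
    # network actually computes.

Running it prints roughly 10^9628 possible states, vastly more than the number of atoms in the observable universe.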


> Honest question: are we sure that it doesn’t do logical reasoning?

It's not the Creature from the Black Lagoon, it's an engineering artifact created by engineers. I haven't heard them say it does logical deduction according to any set of logic rules. What I've read is that it uses Markov chains. That makes sense, because basically an LLM, given an input string, should reply with the string that is the most likely follow-up to it, based on all the text it crawled from the internet.
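
Stripped down, that "most likely follow-up" behaviour is just a greedy decoding loop. Here is a minimal sketch, assuming the Hugging Face transformers library and GPT-2 purely for illustration (deployed chat models add sampling, instruction tuning, RLHF and more on top of this):

    # Greedy next-token decoding: repeatedly append the single most likely token.
    # Assumes: pip install torch transformers; "gpt2" is only an illustrative choice.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    ids = tok("Socrates is a man, therefore", return_tensors="pt").input_ids
    with torch.no_grad():
        for _ in range(20):
            logits = model(ids).logits[:, -1, :]             # scores for the next token only
            next_id = logits.argmax(dim=-1, keepdim=True)    # the single most likely token
            ids = torch.cat([ids, next_id], dim=-1)

    print(tok.decode(ids[0]))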

If the internet had lots and lots of logical reasoning statements, then an LLM might be good at producing what looks like logical reasoning, but that would still just be a response with the most likely follow-up string.

The reason the results of LLMs are so impressive is that at some point the quantity of the data makes a seemingly qualitative difference. It's like if you show me 3 images one after the other, I will say I saw 3 images. But if you show me thousands of images at 24 per second, and each image is a small variation of the previous one, then I say I see a MOVING PICTURE. At some point quantity becomes quality.


My understanding is that at least one form of training in the RLHF pipeline involves supplying antecedent and consequent training pairs for entailment queries.

The LLM seems to be only one of many building blocks; it supplies priors / transition probabilities that are consumed by downstream parts of the model.
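
Purely as an illustration of what such antecedent/consequent pairs could look like (the field names and examples are made up, not taken from any published training set):

    # Hypothetical entailment-style training pairs; structure and content are
    # illustrative only, not from any real RLHF dataset.
    entailment_pairs = [
        {"antecedent": "All men are mortal. Socrates is a man.",
         "consequent": "Socrates is mortal.",
         "label": "entailed"},
        {"antecedent": "Some birds can fly.",
         "consequent": "All birds can fly.",
         "label": "not entailed"},
    ]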



