Hacker News

It just hasn't been prompted or fine-tuned to have the neutral, self-effacing personality of ChatGPT.

It's doing the pure "try to guess the most likely next token" task on which both models were trained (https://heartbeat.comet.ml/causal-language-modeling-with-gpt...).
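That next-token objective can be sketched in a few lines. This is purely illustrative: a real GPT uses a transformer over subword tokens, but a bigram count table stands in here just to show the "guess the most likely continuation" task; all names are made up for the example.

```python
# Toy sketch of causal language modeling: predict the most likely
# next token given the one before it. A bigram count table stands
# in for the transformer; the objective is the same in spirit.
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def most_likely_next(counts, token):
    """The pure next-token task: pick the highest-count continuation."""
    return counts[token].most_common(1)[0][0]

corpus = ["the cat sat on the mat", "the cat ate the fish"]
model = train_bigram(corpus)
print(most_likely_next(model, "the"))  # "cat" follows "the" most often here
```

The base model just repeats this prediction step over and over; everything ChatGPT-like is layered on top of it.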

ChatGPT is further trained with reinforcement learning from human feedback to make it more tool-like (https://arxiv.org/abs/2204.05862 & https://openai.com/blog/chatgpt & https://arxiv.org/abs/2203.02155),

with a bit of randomness added for variety's sake (https://huggingface.co/blog/how-to-generate).
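That "bit of randomness" is usually temperature sampling: instead of always taking the single most likely token, draw from the model's distribution, with a temperature knob controlling how much variety you get. A minimal sketch in plain Python, with made-up logits (libraries like Hugging Face transformers expose the same idea via `generate(do_sample=True, temperature=...)`):

```python
# Temperature sampling: softmax over temperature-scaled logits,
# then draw one token. Low temperature -> nearly greedy;
# high temperature -> closer to uniform, so more variety.
import math
import random

def sample_next(logits, temperature=1.0, rng=random):
    scaled = [l / temperature for l in logits.values()]
    m = max(scaled)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return rng.choices(list(logits), weights=probs, k=1)[0]

logits = {"cat": 3.0, "mat": 1.0, "fish": 0.5}  # illustrative values
rng = random.Random(0)
cold = [sample_next(logits, temperature=0.1, rng=rng) for _ in range(5)]
hot = [sample_next(logits, temperature=5.0, rng=rng) for _ in range(5)]
print(cold, hot)
```

At temperature 0.1 the gap between logits is amplified, so the samples are effectively greedy; at 5.0 the distribution flattens and the less likely tokens start showing up.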



The bit about "fine-tuned on the Alpaca dataset" is precisely that. But yeah, no RLHF so far, although some people are already working on it.
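Concretely, fine-tuning on Alpaca means training on instruction/response pairs with the same next-token objective, using a fixed prompt template. A rough sketch, with the template paraphrased from the Alpaca project (the exact wording and any helper names here are illustrative):

```python
# Sketch of Alpaca-style instruction fine-tuning data preparation:
# each record pairs an instruction with a desired response, and the
# model is trained on the concatenated prompt + response text.
def format_example(instruction, response):
    # Template paraphrased from the Alpaca repo; wording approximate.
    prompt = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n### Response:\n"
    )
    return prompt, prompt + response

prompt, training_text = format_example(
    "Name the capital of France.",
    "The capital of France is Paris.",
)
print(training_text)
```

At inference time you feed the model only the prompt part and let it continue from "### Response:"; no reward model or RLHF is involved, which is the distinction the parent comment is drawing.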


The randomness link doesn’t work. All the ChatGPT-style products need randomness, or they can’t be creative.




