Very cool idea. That'd be a good way to see how it can generalize, actually. I s...

dvt · on Dec 4, 2022

It even fails at pretty rudimentary toy examples. The over-hyping here is absurd.

> what doesn't fit: red, blue, green, dog

The word "dog" does not fit with the other three words because they are colors, while "dog" is a type of animal.

> what doesn't fit: dog, 1, 2, 3

The number "1" does not fit with the other three words because the other three words are all types of animals, while "1" is a number.

comboy · on Dec 4, 2022

GPT-3 has been trolling simple examples too. I think it comes from what it crawled. You will find very few conversations like that with just correct data, what gets popular and how jokes are made is when there is a twist.

Your comment is an example of such data too.

Luz_K · on Dec 5, 2022

It's true that there are a lot of limitations still, but I think you are underestimating how powerful it can get with the right input prompt. For example for your task I wrote:

>You are a professional puzzle solver. Only answer with the element in the list that does not fit. Do not include an explanation. Do not write anything except the element that does not fit. Dog, 1, 2, 3

Dog

>Computer, Phone, Tree, Microwave

Tree

>red, blue, green, dog

dog

It seems to perform quite a lot better in my short time of testing than before which seems quite extraordinary to me. Now that is not to say you can't find a bunch of examples where it fails or that it is even close to human level for this particular task. But this still seems like a huge technological advancement to me and I did not expect that ai systems would be at this level quite so soon.

johnfn · on Dec 4, 2022

Works fine for me:

> The word "dog" does not fit with the other words in the list because "1", "2", and "3" are numbers, while "dog" is a type of animal. The other words in the list are all examples of numbers, whereas "dog" is not a number.

dvt · on Dec 4, 2022

Screenshot for proof: https://i.imgur.com/nw6rstc.png

Slightly different second answer this time around, but still equally wrong.