I've been playing with it since yesterday. I was able to ask it for output that literally had me crying with laughter (e.g. "Write a country song about Sansa Stark and Littlefinger" or "Write a sad song about McNuggets"). That scared me for a minute, because it's giving me what I want, mentally anyway, beyond anything else I've seen recently. I'd be worried it's addictive. But it also seems to have an ability to enhance my own mind, because I can ask it about things I'm thinking about, and it generates a certain amount of seemingly generic ideas that I can expand on or get more specific with. I can take the ideas I want from it into my actual life. I've come up with several insights, realized certain ways of thinking I've been stuck in, and even, based on its examples, realized things about generating creative ideas for myself. Maybe I'm overreacting, but it's really something new. I haven't cared that much about AI, but now that I have access to it, it's another matter. In comparison, I also played around with DALL-E just now, but that isn't achieving anything special for me in the same way.
I'm wholeheartedly confused why so many people are only just now learning of OpenAI/GPT-3 and its chat mode; I guess presentation truly is everything. Nothing here is particularly new; it's just a better model than before.
Statements like "the people haven't realized it yet" confuse me because "the people" is really two groups: people in the know, and people not in the know. Everyone in the know realizes where this is headed and what the potential is.
Those not in the know simply lack the technical background to have followed the incremental developments that have led to this moment; for them it's a parlor trick, because even today they cannot grasp the potential of existing technology. I could similarly lament about how people treat the Internet.
It's like with DALL-E 2 and Stable Diffusion: so many people just couldn't understand how it was even possible, with some going as far as calling it a hoax of some kind.
But for anyone paying attention, it's been easy to see the progression. I'm not even an ML person but I could give you a map from every paper to every other paper for how this has all been happening faster and faster, basically starting with AlexNet in 2012.
That said, ChatGPT is different from GPT-3's first demos earlier last year or the Codex interface in that it maintains a consistent memory and seems to have a much, much longer token-length capability than before. This has a huge effect on what you can coax out of a chat with it. You can tell it to act a certain way and then continuously interact with that entity. With GPT, you got the one prompt, but once you tried again with a new prompt that memory was gone. You could attempt to feed the entire output back in as input, but at least initially the token length would cut things off eventually. Meanwhile, with ChatGPT, I just had a 20-minute conversation with a "girl from Reseda, CA" who's a barista and, like, totally is going to go on a keto diet like her sister, because I told it that is who it should act like and that under all circumstances it should respond to my chat in that way.
BTW she says that "bangs are totally in style right now" and she really likes "exploring new hairstyles like ones from the 90's"
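To make the old workaround concrete, here's a rough sketch of what "feed the entire output back in as input" looks like. The `complete(prompt)` call is a hypothetical stand-in for whatever text-completion API you're using, and the persona text and character budget are made up for illustration:

```python
# Rough sketch of the "feed the transcript back in" workaround described above.
# complete(prompt) is a hypothetical stand-in for a text-completion call; the
# persona text and character budget are made up for illustration.

MAX_CHARS = 4000  # crude stand-in for a token limit

PERSONA = ("You are a barista from Reseda, CA. Under all circumstances, "
           "stay in character when you respond.")

history = []  # list of (speaker, text) turns


def build_prompt():
    transcript = "\n".join(f"{who}: {what}" for who, what in history)
    return f"{PERSONA}\n{transcript}\nAssistant:"


def chat_turn(user_text, complete):
    history.append(("User", user_text))
    prompt = build_prompt()
    # Drop the oldest turns once the prompt blows past the budget -- this is
    # exactly where the old one-shot interface would lose the thread.
    while len(prompt) > MAX_CHARS and len(history) > 1:
        history.pop(0)
        prompt = build_prompt()
    reply = complete(prompt)
    history.append(("Assistant", reply))
    return reply
```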
I feel very much at odds with this. It's not going beyond a couple of commands; this is exactly what I'd expect a language model to be able to do today, based on the past three years of progression. It's not actually executing anything, of course; the output is quite literally a well-formed amalgamation of all the learned examples online, of which there are tons.
It's something like novelty * length * complexity * accuracy that would impress me, and on that measure it's not far beyond simple tutorials or snippets you'd find online.
But isn't it just predicting text patterns? It doesn't really know about Docker, just that after running commands X,Y you usually get output Z (of course with the stateful AI magic to make things more stable/consistent).
I mean, not to veer too far into the philosophical side of this, but what does it actually mean to know or understand something?
Did you see the demo that was posted here the other day of using stylometric analysis to identify alt accounts? Most of the comments were some form of "holy shit, this is unbelievable", and the OP explained that he had used a very simple type of analysis to generate the matches.
My takeaway from that was that we aren't quite as unique as we think. My takeaway from this, as well as the SD and DALL-E stuff, is that we're all basically taking what we heard in the past, modifying it a teeny bit, and spitting it back out.
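I don't know exactly what the OP did, but a "very simple" analysis in that spirit might look like this (my guess, not the OP's actual method): character-trigram frequency profiles compared with cosine similarity.

```python
# A guess at the flavor of that "very simple" analysis, not the OP's actual
# method: character-trigram frequency profiles compared with cosine similarity.

from collections import Counter
from math import sqrt


def trigram_profile(text):
    text = text.lower()
    return Counter(text[i:i + 3] for i in range(len(text) - 2))


def cosine(a, b):
    dot = sum(a[g] * b[g] for g in set(a) & set(b))
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Two accounts "sound" alike if their comment histories have similar profiles.
account_a = "honestly i think the real issue here is latency, not throughput"
account_b = "honestly the real issue is latency imo, throughput is fine"
print(cosine(trigram_profile(account_a), trigram_profile(account_b)))
```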
…but people are getting the mistaken impression that this is an actual system, running actual commands.
I can also emulate a docker container. I’ll just write down the commands you send me and respond with some believable crap.
…but no one is going to run their web server on me, because that's stupid. I can't respond hundreds of times a second or maintain the internal state required for that.
Neither can this model.
It's good, and interesting, but it's not running code; it's predicting sentences, and when you're running software it has to be accurate, fast, consistent, and have a large internal data state.
Trying to run docker in gpt is fun. Trying to use docker in gpt to do work is stupid.
It’s never going to work as well as actually running docker.
It’s just for fun.
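To spell out the analogy, here's a deliberately dumb "docker emulator" in that spirit: look at the command, hand back something believable. Every output below is canned, which is exactly the point.

```python
# Deliberately dumb "docker emulator" in the spirit of the analogy above:
# look at the command, hand back something believable. All outputs are canned.

import shlex

STATE = {"cwd": "/root"}


def fake_shell(command):
    argv = shlex.split(command)
    if not argv:
        return ""
    if argv[0] == "pwd":
        return STATE["cwd"]
    if argv[0] == "cd" and len(argv) > 1:
        STATE["cwd"] = argv[1]
        return ""
    if argv[0] == "ls":
        return "Dockerfile  app.py  requirements.txt"  # believable crap
    if argv[:2] == ["docker", "ps"]:
        return ("CONTAINER ID   IMAGE   STATUS\n"
                "a1b2c3d4e5f6   nginx   Up 2 hours")
    return f"{argv[0]}: command not found"


print(fake_shell("pwd"))
print(fake_shell("docker ps"))
```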
Models that write code and then execute that code will be in every way superior to models that try to memorise the CLI API of applications.
It’s an almost pointless use of the technology.
GPT may have “learnt” Python; that’s actually interesting!
Docker is not interesting.
If I want to use the docker api, I can type `docker` on my computer and use it.
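The "write code, then execute it" pattern is roughly this: have the model emit a real shell command and actually run it, instead of asking the model to hallucinate the command's output. A minimal sketch, where `generate_command(task)` is a hypothetical model call, stubbed out so the example is self-contained:

```python
# Sketch of the "write code, then execute it" pattern: the model proposes a
# real shell command, and we run the real tool rather than predicting its
# output. generate_command(task) is a hypothetical model call.

import subprocess


def run_task(task, generate_command):
    command = generate_command(task)
    # Run the real tool; in practice you'd review/sandbox model output first.
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout


# Stub "model" so this runs without any API:
print(run_task("list files in the current directory", lambda task: "ls -la"))
```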
It's pretty sad that the thing that excites people the most about an amazing new language model is that it can do trivial command line actions, that you can do without the model.
Spending millions of dollars to produce a model that only does what you can already trivially do is very seriously not what OpenAI just did.
> I can also emulate a docker container. I’ll just write down the commands you send me and respond with some believable crap.
Right. The thing that is impressive is that ChatGPT can do this effectively. This means that it has some "understanding" of how `pwd`, `ls`, `apt`, `docker`, etc. all work. In some sense, this is an AI that knows how to read code like a human instead of like a machine.
> In some sense, this is an AI that knows how to read code like a human instead of like a machine.
It's literally spitting out responses like a machine. Isn't that the opposite of what you wanted?
> The thing that is impressive is that ChapGPT can do this effectively.
? What is impressive about it?
Forget that this is an AI model for a moment. Let's say I give you a black box, and you can type in shell commands and get results. Sometimes the results don't make sense.
Are you impressed?
I am not impressed.
I could implement the black box with an actual computer running an actual shell, and the results would be better. Why would I ever use an LLM for this?
It's like discovering that the large hadron collider can detect the sun. Yes, it can. Wow, that's interesting, I didn't realize it could do that. I can also look up at the sun, and see the sun. mmm... well, that was fun, but pointless.
There are so many other things GPT can do; it's just quite ridiculous that this is what people are so amazed by.
It is not indicative of any of the other breakthrough functionality that's in this model.
It's impressive because if it can learn enough about how shell scripting works, how filesystems work, and can translate from human language, then we can feasibly stop learning to code (or at least outsource a lot of it). It's mostly not there yet, and I'm not sure how long it will take to actually be useful, but it's not insignificant that a language model can write code that works and manipulates filesystems.
I was prompting it along this line of thought earlier. What I found was that it doesn't seem like it can do anything novel, which is to be expected, but I can see myself working with it to discover novel things.
Sure, I agree there - but the point is it cannot understand code. It can try to describe it, but it isn't able to reason about the code. You won't be able to coax it to the correct answer.
"It’s never going to work as well as actually running X. It’s just for fun." You must realize that X was also built by some kind of neural networks, i.e. humans, and the only reason we can't run an entire Linux kernel "in our heads" is mostly due to hardware, i.e. brains, limitations. Although, I do remember Brian Kernighan saying in an interview how he was able to run entire C programs "in his head" faster than the 1980s CPUs.
The point is that the programming language of the future will probably be human language, used as an extremely high-level specification language, with the model hallucinating/inventing/developing entire technological stacks (from protocols to operating systems to applications) on the fly.
> what does it actually mean to know or understand something?
I think it means that you're able to apply the information to make predictions about the world. For example, you'll encounter something novel and be able to make accurate guesses about its behavior. Or, conversely, you will have high likelihood of inventing something novel yourself, based on the information you acquired (rather than through brute force).
I think there is an element of it producing reasonable results because it was trained largely on canned example output. In tutorials, the command that includes ‘hello world’ always outputs ‘hello world’, right? So it doesn’t take a genius to guess that <long blob of golfed code that includes the string ‘hello world’> should produce some output that includes ‘hello world’.
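For instance, a blob like this is easy to "predict" without ever executing it, because the answer is sitting right there in the source:

```python
# Golfed-looking code that literally contains the string 'hello world'.
# Guessing that the output includes 'hello world' is a good pattern-matching
# bet even if you can't actually run it.
s = "hello world"
print("".join(chr(ord(c) ^ 0) for c in s))  # XOR with 0 is a no-op: hello world
```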
Similarly in my explorations of this ‘pretend Linux’, it often produces whatever would be the most helpful output, rather than the correct output.
It is, and people haven't realized it yet.