
Having studied Sutton for a long time now, what I take away from the bitter lesson is that the only pathway to generally capable agents is to match, in an embodied system, the scale of computational capacity that humans or other intelligent systems have.

Sutton’s point is that this is effectively a product of physics, and we keep trying to outsmart physics - but you just can’t outsmart physics.

So, while the method probably matters for efficiency or functionality given the current state of technology, the method is less important than the scale - and we’re not even close to the necessary scale yet.



I don't think it's obvious that we don't have sufficient computational scale already...

The human brain has ~86 billion neurons, but they only fire at something like 2Hz on average, so you get roughly 172 billion firings per second. GPT-3 has 175 billion parameters, and can apply those parameters much faster than the brain can fire neurons.
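The back-of-envelope comparison above can be sketched out explicitly. The firing rate, tokens-per-second throughput, and FLOPs-per-parameter figures below are rough assumptions for illustration, not measurements:

```python
# All figures are rough public estimates / assumptions, not measurements.
NEURONS = 86e9           # ~86 billion neurons in the human brain
AVG_FIRING_HZ = 2        # assumed average firing rate
brain_firings_per_sec = NEURONS * AVG_FIRING_HZ   # ~1.72e11 firings/s

GPT3_PARAMS = 175e9      # GPT-3 parameter count
FLOPS_PER_PARAM = 2      # a forward pass costs roughly 2 FLOPs per parameter per token
TOKENS_PER_SEC = 50      # assumed modest generation throughput
gpt3_param_ops_per_sec = FLOPS_PER_PARAM * GPT3_PARAMS * TOKENS_PER_SEC  # ~1.75e13/s

print(f"brain: {brain_firings_per_sec:.2e} firings/s")
print(f"GPT-3: {gpt3_param_ops_per_sec:.2e} parameter-ops/s")
```

Even under these crude assumptions, the model applies parameters orders of magnitude more often than the brain fires neurons - though whether a "firing" and a "parameter op" are comparable units is exactly the open question.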

Lots of folks like to point out that there's more complexity in neurons than model weights, which is fine, but it's not clear what the impact of that additional complexity actually is. Extra slow channels (eg, hormonal signals) also aren't an impossibility.

So /maybe/ the models need to scale more, or maybe we need to figure out some better combination of training tasks to get the next big leap. There's massive progress being made with multi-modal inputs (which help the model build a more coherent world-model by relating text to images, audio, or video). Data selection - picking good data instead of just throwing in everything - is also showing lots of promise.

I tend to think there's a need for a more 'interactive' or 'social' component to training, e.g. active/online learning with robotics - and figuring out how to get models to 'play'. Unstructured play, associated with hormonal signals and rewards, is an essential mechanism in smart animals - it's important, and we don't really know how to harness it yet.

But overall, I don't think we're yet at a local maximum. There's too much cool stuff going on and the iron is still quite hot.


“Embodied” being one of the key things that you’re ignoring

The brain != a human agent

You need sensors and effectors and a highly variable motor system

You can’t be “generally intelligent” if you do not have boundaries on your computing system which are mobile and have independent actions.

In order to perform as well as, if not better than, a human, you need to do so in all possible environments - and those include the top of the world, the bottom of the ocean, every factory line, flying airplanes, etc…


How could you do anything intelligent without a strong beak?

https://falseknees.tumblr.com/post/654380023602708480

The interactivity I mentioned is the bit that I think is actually important from embodiment - the ability to take an action in the world, see the result, and adjust expectations. What you've called 'independent actions.'

But there's certainly no proof that a general intelligence needs to be bounded and mobile - a pedantic thought-experiment-counterexample would be an 'uploaded' human mind: the people of San Junipero don't stop being generally intelligent once they are in a distributed simulation...

More generally, we don't actually know the boundaries on how general intelligence could arise and what shape it could take, because we don't really understand intelligence at all.


The only thing we do know about “intelligence” is that more compute = better performance on the tasks we use to evaluate generalization.

So map out the computing power and hardware requirements of an average adult human with IQ 100 and no disabilities and that should tell you what you need.

It’s probably 3x harder than just a brain.


I don't think that's even true. Where are you getting this? I can create an extremely compute-heavy model that performs very poorly. Just adding more compute does not mean better performance on its own.

Given the same model, too much compute can even mean overfitting and worse generalization. It doesn't all come down to compute.



