I think the point he makes is that RL doesn't work the way the world believes it does. RL is seen as the holy grail for AGI; he is saying that isn't the case, at least not yet, since RL algorithms don't generalize. And for most other tasks where RL could be used, existing alternatives fare much better.
That's a little unfair to RL. Reinforcement Learning is great even outside of autonomous control. [It's become quite important in NLP, for example.] It should be seen as a valuable set of approaches in a toolbox, not a silver bullet.
> [It's become quite important in NLP, for example.]
[citation needed]
Maybe it's the specific NLP tasks I've been paying attention to (goal-oriented dialog), but most of the RL-for-NLP work I've seen has not been super impressive.
There's plenty of work applying RL techniques; it's just not actually useful for building real systems.
It's basically in the same state as the rest of this blog post: it's so horribly sample-inefficient that if you're paying annotators you're usually better off with supervised learning. Or you're doing REINFORCE against some proxy metric or simulator, which you're probably over-fitting to rather than actually improving your system.
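To make the "REINFORCE for a proxy metric" pattern concrete, here's a minimal sketch (my own illustration, not from the post or the thread): a toy PyTorch policy samples a token sequence, a cheap unigram-overlap score stands in for BLEU as the reward, and the loss is the policy-gradient objective -(reward - baseline) * log p(sequence). The model, vocabulary size, reference tokens, and metric are all made-up assumptions.

```python
# Illustrative sketch of REINFORCE against a proxy metric in NLP.
# Everything here (TinyDecoder, proxy_reward, the constants) is hypothetical.
import torch
import torch.nn as nn

VOCAB_SIZE, HIDDEN, MAX_LEN = 100, 32, 10

class TinyDecoder(nn.Module):
    """Unconditional GRU language model acting as the 'policy'."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, HIDDEN)
        self.gru = nn.GRUCell(HIDDEN, HIDDEN)
        self.out = nn.Linear(HIDDEN, VOCAB_SIZE)

    def forward(self, max_len=MAX_LEN):
        h = torch.zeros(1, HIDDEN)
        tok = torch.zeros(1, dtype=torch.long)  # token id 0 as <bos>
        log_probs, tokens = [], []
        for _ in range(max_len):
            h = self.gru(self.embed(tok), h)
            dist = torch.distributions.Categorical(logits=self.out(h))
            tok = dist.sample()                 # sample next token
            log_probs.append(dist.log_prob(tok))
            tokens.append(tok.item())
        return tokens, torch.stack(log_probs).sum()

def proxy_reward(tokens, reference):
    """Stand-in for BLEU: fraction of reference unigrams the sample covers."""
    return len(set(tokens) & set(reference)) / len(set(reference))

policy = TinyDecoder()
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
reference = [5, 7, 9, 11, 13]                   # made-up "gold" tokens

baseline = 0.0
for step in range(200):
    tokens, log_prob = policy()
    reward = proxy_reward(tokens, reference)
    baseline = 0.9 * baseline + 0.1 * reward    # running-mean baseline
    loss = -(reward - baseline) * log_prob      # REINFORCE objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The point of the sketch is how little the reward signal says per sample: one scalar per whole generated sequence, which is exactly why this needs so many samples and why over-fitting to the proxy metric is easy.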
The Alexa Prize is basically the only situation where you're getting anywhere near enough reward signal to be meaningful; everywhere else the feedback is too sparse to help, given the sample inefficiency.
Which is why I disagree with the characterization that it's important. There's a lot of it, and it can squeeze out a little extra performance on whatever dataset you're looking at, but it's had nowhere near the impact of, say, word vectors or (Bi)LSTMs.