
I definitely agree with your points, although I think this reframing of the problem from "we need to explicitly state what we want" to "we should teach robots to want to learn what we want" is at least conceptually very useful and interesting.

I think the part about robots learning what Russell calls our meta-preferences ("preferences about what kinds of preference-change processes might be acceptable or unacceptable") is what would resolve preference-conflict issues. People tend to be biased in consistent, similar ways, so it doesn't seem implausible that a machine that can infer preferences from actions could take the extra step of inferring the circumstances shaping those preferences.
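To make "inferring preference from action" concrete, here is a minimal sketch of one standard approach: Bayesian inference over candidate reward functions under a Boltzmann-rational choice model, where the human picks an action with probability proportional to exp(beta * reward(action)). The action set, reward hypotheses, and observations below are all hypothetical illustrations, not anything from the thread.

```python
import math

def action_likelihood(action, reward_fn, actions, beta=1.0):
    """P(action | reward_fn) under a Boltzmann-rational choice model."""
    weights = {a: math.exp(beta * reward_fn(a)) for a in actions}
    total = sum(weights.values())
    return weights[action] / total

def infer_preference(observed_actions, hypotheses, actions, beta=1.0):
    """Bayesian update over candidate reward functions given observed actions."""
    # Start from a uniform prior over the hypotheses.
    posterior = {name: 1.0 / len(hypotheses) for name in hypotheses}
    for a in observed_actions:
        for name, reward_fn in hypotheses.items():
            posterior[name] *= action_likelihood(a, reward_fn, actions, beta)
    total = sum(posterior.values())
    return {name: p / total for name, p in posterior.items()}

# Hypothetical toy setup: two actions, two candidate preferences.
actions = ["coffee", "tea"]
hypotheses = {
    "prefers_coffee": lambda a: 1.0 if a == "coffee" else 0.0,
    "prefers_tea": lambda a: 1.0 if a == "tea" else 0.0,
}

# After observing mostly coffee choices, the posterior shifts toward
# the "prefers_coffee" hypothesis.
posterior = infer_preference(["coffee", "coffee", "tea"], hypotheses, actions)
```

The "extra step" the comment describes would amount to conditioning this inference on context, e.g. letting the reward hypotheses depend on the circumstances in which an action was taken rather than on the action alone.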


