Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

This is absolutely not cheating; every hand designed algorithm can access and compare everything in the array!

The real point of what they’re saying in your italicized quote is actually that giving the net full access hinders efficiency, so they actually restrict it. Almost like the opposite of cheating.



It is not a pointer to an array I'm concerned about but the "neighbour diff vector" or what you should call it that provided by the "environment". See A.1.2.

Doing so many comparisons and storing them has a cost. Also the model can't decide if it is done so at each step the array has to be iterated to see if it is sorted by the "environment". Are they only counting function calls? I guess so. The paper is really hard to follow and the pseudocode syntax is quite madding.

If I understand the paper correctly of course I could be wrong.

If so, "our approach can learn to outperform custom-written solutions for a variety of problems", is bogus.


>> Are they only counting function calls? I guess so.

Oh that, yes, it's true. They're listing "average episode lengths" in tables 1-3 and those are their main support for their claim of efficiency. By "episode length" they mean instruction or function calls made during training by the student agent which they compare to the instructions/function calls by the teacher agent. So, no asymptotic analysis, just a count of concrete operations performed to solve e.g. a sorting task.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: