Hacker News

Quite dull research wrapped in fancy words.

Essentially they generate functions that map a program state to another program state, count those function calls, and compare with e.g. the number of function calls in quicksort. They "cheat" by feeding in how all elements compare to their neighbours at each step. Like, if you give the algorithm that input for free, what's the point?

Notably, none of the popular sorting algorithms decide which elements to swap by looking at the whole input at each execution step. On the contrary, the decisions are typically made based on local evidence only. ... We implement this principle for the sorting task by employing a small set of k (independent of n) index variables, and include in the input the information about how A[i] compares with A[i+1] and A[i−1] for each index variable i.
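To make the quoted idea concrete, here's a rough sketch of what such a "local evidence" observation could look like: for a small, fixed set of k index variables (independent of n), the observation encodes only how A[i] compares with its immediate neighbours. The function name and the sign encoding are my own illustration, not the paper's actual interface.

```python
def local_observation(A, indices):
    """For each tracked index i, return the signs of A[i]-A[i-1] and A[i]-A[i+1].

    This is a hypothetical encoding of the "local evidence" idea: the
    learner never sees the whole array, only these per-index comparisons.
    """
    def sign(x):
        return (x > 0) - (x < 0)

    obs = []
    for i in indices:
        left = sign(A[i] - A[i - 1]) if i > 0 else 0            # vs left neighbour
        right = sign(A[i] - A[i + 1]) if i < len(A) - 1 else 0  # vs right neighbour
        obs.append((left, right))
    return obs

# Two index variables over a partially sorted array:
print(local_observation([3, 1, 4, 1, 5], [1, 2]))  # [(-1, -1), (1, 1)]
```

The key property is that the observation size depends on k, not on n, which is what lets the same policy run on arrays of any length.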



>> Like, if you give the algorithm that input for free what's the point.

The point, as also explained in the paper, is that even with such information, and with a handful of basic operations from which to compose a program, the space of all possible programs is huge, and finding a program that does what it should in any reasonable amount of time is extremely difficult.

If you are familiar with the literature of program induction (which includes a lot more than neural program induction and goes back a few decades) you will notice that the hypothesis spaces for program learning are truly vast and it is impossible to make any kind of progress without some kind of "dirty trick" to reduce their size. The common approach is to impose a strong inductive bias on the representation language, so as to guide the search to find a correct program. The approach in the article is no different. And of course "inductive bias" here always means domain knowledge.

But, amicably, if you think there's any better way to do this, many people, myself included, would be very curious to hear what you have to say. In fact, if you could really innovate in this field you'd be guaranteed an illustrious research career. Learning programs from examples is hard.


This is absolutely not cheating; every hand designed algorithm can access and compare everything in the array!

The real point of the passage you quoted is that giving the net full access to the array hinders efficiency, so they restrict it. Almost the opposite of cheating.


It's not the pointer to the array I'm concerned about, but the "neighbour diff vector", or whatever you'd call it, that is provided by the "environment". See A.1.2.

Doing so many comparisons and storing them has a cost. Also, the model can't decide when it is done, so at each step the "environment" has to iterate over the array to check whether it is sorted. Are they only counting function calls? I guess so. The paper is really hard to follow and the pseudocode syntax is quite maddening.

That is, if I understand the paper correctly; of course I could be wrong.

If so, the claim that "our approach can learn to outperform custom-written solutions for a variety of problems" is bogus.


>> Are they only counting function calls? I guess so.

Oh that, yes, it's true. They're listing "average episode lengths" in tables 1-3, and those are the main support for their claim of efficiency. By "episode length" they mean the instructions or function calls made during training by the student agent, which they compare to the instructions/function calls made by the teacher agent. So, no asymptotic analysis, just a count of concrete operations performed to solve e.g. a sorting task.
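For anyone unfamiliar with this style of evaluation, here is a rough analogue of what counting concrete operations (rather than doing asymptotic analysis) looks like: instrument a reference sorter and report the total number of comparisons and writes it performs on one specific input. The instrumented insertion sort below is purely illustrative; it is not the paper's teacher agent.

```python
def insertion_sort_with_count(A):
    """Sort a copy of A by insertion sort, counting comparisons and writes.

    The returned count plays the role of a concrete "episode length" for
    this one input; nothing asymptotic is measured.
    """
    A = list(A)
    ops = 0
    for j in range(1, len(A)):
        key = A[j]
        i = j - 1
        while i >= 0:
            ops += 1                 # one comparison
            if A[i] <= key:
                break
            A[i + 1] = A[i]          # one element move
            ops += 1
            i -= 1
        A[i + 1] = key               # final write of the key
        ops += 1
    return A, ops

sorted_A, count = insertion_sort_with_count([5, 2, 4, 6, 1, 3])
print(sorted_A, count)  # [1, 2, 3, 4, 5, 6] 26
```

Comparing such counts between two programs on the same inputs is meaningful, but it says nothing about how either program scales with n.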


> They "cheat" by feeding in how all elements compare to their neighbours at each step. Like, if you give the algorithm that input for free, what's the point?

How is that "cheating"? The algorithm is still learning how to do a comparison-sort, which has direct applications (assuming it performs well).


So do they feed such information only when training the system?


You can generate that input only when needed: whenever the sorting algorithm asks whether `A[i] > A[i+1]`, you can perform the comparison there and then. So the answer to your question is yes.


The input is always needed: the neural net is an iterative algorithm that views the "local information" gathered from the array at every time step. It's continuously asking whether e.g. `A[i] > A[i+1]`.


No, the neural net needs to receive a feature vector before each instruction in order to sort the array. So whenever you want to sort things, you need to create that vector over and over.
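The shape of that loop can be sketched as follows: the controller never sees the array directly, only a freshly computed local feature before every instruction. The trivial hand-written "policy" below (hypothetical, nothing like the paper's learned net) runs bubble-sort-style passes driven purely by one neighbour comparison per step, recomputing that comparison each time.

```python
def sort_with_local_features(A):
    """Sort a copy of A using only a per-step neighbour comparison.

    Before every instruction, the "environment" recomputes the single
    local feature A[i] > A[i+1]; the controller acts on that alone.
    """
    A = list(A)
    i, steps, swapped = 0, 0, False
    while True:
        feature = A[i] > A[i + 1]    # local observation, rebuilt each step
        steps += 1
        if feature:
            A[i], A[i + 1] = A[i + 1], A[i]
            swapped = True
        i += 1
        if i == len(A) - 1:          # end of one pass over the array
            if not swapped:
                return A, steps      # a clean pass means we're done
            i, swapped = 0, False

print(sort_with_local_features([3, 1, 2]))  # ([1, 2, 3], 4)
```

This illustrates the point being argued: the feature vector is not a training-only convenience, it has to be produced at every step of every sort, and producing it is itself work.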


> none of the popular sorting algorithms decide which elements to swap by looking at the whole input at each execution

Yes. In particular, the O(n log n) lower bound holds for comparison sorts, that is, algorithms that rely on pairwise comparisons only.
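The standard argument behind that bound: a comparison sort corresponds to a binary decision tree that must distinguish all n! input orderings, so the tree needs at least n! leaves, and its height (the worst-case number of comparisons) is therefore at least log2(n!), which is Θ(n log n) by Stirling's approximation. A quick numeric check:

```python
import math

# Minimum worst-case comparisons for a comparison sort on n distinct keys
# is log2(n!) (decision-tree lower bound); compare it with n*log2(n).
for n in [8, 64, 1024]:
    lower = math.log2(math.factorial(n))
    print(n, round(lower, 1), round(n * math.log2(n), 1))
```

The two columns track each other closely, which is why algorithms like mergesort (at most about n log2 n comparisons) are asymptotically optimal among comparison sorts.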


"all" of a constant number k of such inputs, not "all" of the entire input. It's not possible to feed a neural network an arbitrarily-large number of inputs in one batch.


My error; it seemed to be a reference implementation with "all". Is it correct that they feed a vector of 68 such comparisons in the window interface?



