Hacker News

1. Differentiable Programming is horrible branding. It's hard to say, not catchy, and not as easily decipherable.

2. Isn't the evolution of Deep Networks more advanced setups such as GANs, RNNs, and so on?



> Differentiable Programming is horrible branding. It's hard to say, not catchy, and not as easily decipherable

Tell that to the people who deliberately popularized the term Dynamic Programming for something that was neither dynamic nor programming.

____

(From Wiki)

Bellman explains the reasoning behind the term dynamic programming in his autobiography, Eye of the Hurricane: An Autobiography (1984, page 159). He explains:

"I spent the Fall quarter (of 1950) at RAND. My first task was to find a name for multistage decision processes. An interesting question is, Where did the name, dynamic programming, come from? The 1950s were not good years for mathematical research. We had a very interesting gentleman in Washington named Wilson. He was Secretary of Defense, and he actually had a pathological fear and hatred of the word research. I’m not using the term lightly; I’m using it precisely. His face would suffuse, he would turn red, and he would get violent if people used the term research in his presence. You can imagine how he felt, then, about the term mathematical. The RAND Corporation was employed by the Air Force, and the Air Force had Wilson as its boss, essentially. Hence, I felt I had to do something to shield Wilson and the Air Force from the fact that I was really doing mathematics inside the RAND Corporation. What title, what name, could I choose? In the first place I was interested in planning, in decision making, in thinking. But planning, is not a good word for various reasons. I decided therefore to use the word “programming”. I wanted to get across the idea that this was dynamic, this was multistage, this was time-varying. I thought, let's kill two birds with one stone. Let's take a word that has an absolutely precise meaning, namely dynamic, in the classical physical sense. It also has a very interesting property as an adjective, and that it's impossible to use the word dynamic in a pejorative sense. Try thinking of some combination that will possibly give it a pejorative meaning. It's impossible. Thus, I thought dynamic programming was a good name. It was something not even a Congressman could object to. So I used it as an umbrella for my activities."


>> Let's take a word that has an absolutely precise meaning, namely dynamic, in the classical physical sense. It also has a very interesting property as an adjective, and that it's impossible to use the word dynamic in a pejorative sense. Try thinking of some combination that will possibly give it a pejorative meaning. It's impossible.

Now I have to try:

  "Dynamic, multimodal failure" (fail).

  "Dynamic instigation of pain for information retrieval" (torture).

  "Dynamic evisceration of underage humans" (slaughtering of children).

  "Dynamic destruction of useful resources" (environment destruction).

  "An algorithm for calculating dynamic stool-rotor collision physics" (shit hits the fan).
Not terribly good I guess but I think not that bad either.


Even in your examples I don't think the 'dynamic' part is negative.


Deep Learning -> "Dynamic Learning"? The post even describes it as "dynamic".


I like it, at least from the little that I have read about it. The name describes the core of what it is: differentiating programs in order to figure out how changes to the program affect the output, and using that for optimization purposes. Do we really need to invent obscure new names for everything just so that it sounds catchy?
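To give a flavor of what "differentiating a program" means, here is a minimal, hypothetical sketch in plain Python: forward-mode automatic differentiation via dual numbers, applied to an ordinary imperative program with a loop. The `Dual` class and `program` function are illustrative names, not from any real framework; real systems (reverse-mode autodiff, etc.) are far more sophisticated.

```python
class Dual:
    """Carries a value and its derivative through arithmetic."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__

def program(x):
    # An ordinary program with a loop -- not a stack of layers.
    acc = Dual(0.0)
    for i in range(1, 4):
        acc = acc + i * x * x
    return acc  # computes (1 + 2 + 3) * x^2 = 6x^2

x = Dual(2.0, 1.0)   # seed derivative dx/dx = 1
y = program(x)
print(y.val, y.der)  # 24.0 24.0 (d/dx 6x^2 = 12x, at x=2 -> 24)
```

The point is that the function is just code: change the loop bounds or the arithmetic and the derivative tracks the change automatically, which is what you then feed into an optimizer.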


Couldn't agree more. A technique should be judged by its usefulness, not by its catchiness.


to be fair though, the subject is a judgement of a technique's name, not a judgement of the technique itself.


not to forget, naming is one of the harder problems considered in programming.


> 2. Isn't the evolution of Deep Networks more advanced setups such as GANs, RNNs, and so on?

That's not what Yann LeCun is getting at, I think. Most neural network models are sort of like a non-programmable computer: they are built around the assumption that the data is a fixed-size, fixed-dimensional array. But in computer programming there is an ocean of data structures. You have sorted sets, linked lists, dictionaries, and everything else. Imagine that we knew a data set was arranged as a sort of "fuzzy" dictionary and we wanted the computer to do the rest. All we would need to do is load up the right deep neural network (I mean differentiable programming something or other) and voilà.

Something like where the values in a piece of data dictate the layers that get stacked together, and how those layers connect to the layers for the next value in that piece of data.
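As a toy illustration of that idea (all names here are hypothetical, and a finite-difference estimate stands in for the gradients a real autodiff system would derive): the shape of a nested list dictates the computation, with the "network" recursing over the structure of the data itself rather than over a fixed array.

```python
def tree_net(data, w):
    # Leaves are scaled by a learnable weight; branches are summed,
    # so the computation graph mirrors the structure of the input.
    if isinstance(data, list):
        return sum(tree_net(child, w) for child in data)
    return w * data

tree = [1.0, [2.0, 3.0], [[4.0]]]  # arbitrary nesting, not a fixed shape
w, eps = 0.5, 1e-6

# Central finite difference as a stand-in for an automatic gradient.
grad = (tree_net(tree, w + eps) - tree_net(tree, w - eps)) / (2 * eps)
print(tree_net(tree, w), grad)  # 5.0, gradient ~ 10.0 (sum of leaves)
```

Hand the same weight a differently-shaped tree and the computation (and its gradient) reshapes itself, which is exactly what fixed-size-array models can't do.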


it seems like it's made by analogy with "probabilistic programming", i.e. defining complex probability distributions by writing familiar-looking imperative code (with loops and whatnot).

I think the idea is that thinking in terms of passing data through layers in a graph is cumbersome sometimes, and that expressing it as a "regular" program that just happens to come with gradients could be more comfortable.
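To sketch what "a regular program that just happens to come with gradients" might look like (hypothetical names, and a numerical gradient standing in for automatic differentiation): an ordinary loop-containing loss function, minimized by plain gradient descent.

```python
def loss(theta):
    # An ordinary imperative function -- a loop, not a layer stack.
    out = 0.0
    for target in [1.0, 2.0, 3.0]:
        out += (theta - target) ** 2
    return out

theta, lr, eps = 0.0, 0.1, 1e-6
for _ in range(100):
    # Central finite difference in place of an autodiff gradient.
    grad = (loss(theta + eps) - loss(theta - eps)) / (2 * eps)
    theta -= lr * grad

print(round(theta, 3))  # converges to the mean of the targets, 2.0
```

In a differentiable-programming system the `grad` line would come for free from the compiler or framework; the author only writes the forward program.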

I'd argue that GANs in particular are a natural fit for this style. The training procedure doesn't fit neatly into the standard "minimize a loss function over many layers using backprop" mold.



