
My takeaway from the article - the success of their language model illustrates what a huge fraction of our code is boilerplate.

Yes, it's helpful that the system shows bugs, but it does this, not through careful analysis of the control flow or subtle type analysis, but by "probability of each token appearing given the previous tokens".

If such a large proportion of our code follows common patterns, are we not wasting huge amounts of time writing and testing the same functions across thousands or millions of pieces of code? If we (almost) always follow a certain pattern, should not that pattern be embedded in a library or language, so vastly reducing the opportunity for errors or bugs?



It really is the exact opposite approach to static analysis, which tries to see what the code really does and how that leads to bugs. I have had (quite expensive) static analysis tools detect genuine bugs, e.g. a somewhat subtle overflow. What it can never detect, though, is correct code that misses the intention of the programmer, e.g. whether some mathematical function is accurate.

The language models try, by statistical means, to derive what should be there. Given enough data they will start to have some (statistical) grasp of the intention.

I am not entirely sure about the boilerplate though. Often you need some minor variation of an already existing pattern. Trying to unify those slightly divergent patterns into one schema can very easily lead to very hard-to-understand code. Another thing is that boilerplate is fairly easy to write and to test, because it is familiar, reducing the actual effort that goes into it. Sometimes it is just better not to reuse code.


When you go too far reducing boilerplate, you get to a point where the configuration becomes the code and the actual code becomes a black box that people barely understand. And then they replicate what's already in the black box because they aren't sure it's there, and you do the same things over and over in every layer. And in some layers you do it one way and in others another way. And then requirements change and you change the code, and it still works the old way SOME of the time, but you only discover that in production, because the test case you used during development happened to be handled correctly by the layer where you made the change.

And then if it's buried deep enough somebody will add another layer and fix the cases that were found - there.

And that's how the disgusting legacy code happens.

KISS, please. Unnecessary abstraction is the root of almost all problems in programming.


That was beautifully written.

Although, I feel like very often, the idea of reducing boilerplate should __not__ be to hide the boilerplate one layer below; it should be to __try not to write the boilerplate__ …

Take this very specific example at hand.. What is the meaning of this “JoelEvent” class? Why is it there?

It appears that it wraps a list of functions, with methods to push and pop from it. Why is it necessary to write them?

“Dispatch” reimplements function application, apparently? Btw, it is beyond me why one would loop over this.listeners and then check whether each element is in this.listeners.

In a reasonable language or framework, I cannot in any way see a reason why this code needs to be written.. This is the idea of removing boilerplate, to me!
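For concreteness, a sketch of the pattern being described (the actual JoelEvent code isn't quoted in this thread, so the class and method names below are guesses at its shape):

```python
# Hypothetical reconstruction of the criticized pattern: a class that
# wraps a list of callbacks and re-implements basic list operations.
class Event:
    def __init__(self):
        self.listeners = []

    def add(self, fn):
        self.listeners.append(fn)

    def remove(self, fn):
        self.listeners.remove(fn)

    def dispatch(self, *args):
        # Copy so listeners can safely unsubscribe during dispatch.
        for fn in list(self.listeners):
            fn(*args)


# Versus: in a language with first-class functions, a plain list of
# callables often does the same job with no class at all.
listeners = []
listeners.append(print)
for fn in listeners:
    fn("hello")
```

The wrapper only earns its keep once it adds real behaviour (ordering guarantees, error isolation, once-only listeners); until then it is exactly the kind of boilerplate the parent is objecting to.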


I think the biggest problem with templating/reusing instead of boilerplate is just how hard it is for a dev to answer the question(s): has somebody already done this, is their solution flexible enough to fit mine, etc.

Hell, just helpers/utilities functions within an organisation aren't always used, devs end up reimplementing stuff all the time simply because there's no easy way to know about it (documentation is only one part of this).


+1 on the last paragraph: Predictability means the code follows a pattern, not that it is boilerplate. Some amount of predictable code is necessary just to spell out what the code does, so that even someone unfamiliar with the pattern can simply read and understand it.


Honestly, this is what I have loved about Kotlin. There is now a certain amount of boilerplate in every Java file, and Kotlin chose to bake all of that into the language. The other place we see this happening is the Lombok library in Java, although personally I hate annotations.


We don't talk in maximally compressed strings; a bit of redundancy in grammar helps make sure people can understand each other. The fact that spellcheckers are possible doesn't mean our language is too sparse and wasteful.

A lot of programming languages have a lot of room for improvement though.


A programming language with no redundancy in it means the compiler cannot detect any errors - because every sequence of characters forms a valid program.

The skill is in selecting the optimal amount and form of redundancy.

For example, the typical ; statement terminator is redundant. People often ask, since it is redundant, why not remove it? People have tried that, and found out that the redundancy makes for far better error detection.


Semicolon is a strange example when so many successful languages don’t have them.

I think a better example is CoffeeScript, where almost any string is a valid program but probably not the one you wanted.


> Semicolon is a strange example when so many successful languages don’t have them.

They usually have another statement terminator instead, like a newline.


On that specific example, not my experience at all. (The general point is of course valid, though I think it follows very directly from what I said)


No, with redundancy your compiler can catch a certain class of errors (call them "avoidable") that doesn't exist in case of no redundancy. Of course, in both cases you still have the unavoidable errors.


I think the problem is not that "not all tokens are valid". Rather, it is that we often repeat the same token sequences, and we should seek to abstract those predictable sequences into more unique tokens, e.g. by turning them into a library function.

Of course, the downside is that now you have way more tokens you need to know to understand some code, similarly to how Haskell code tends to have tons of mega-abstract function combinators or whatever whereas Go code is very simple. Which one is more readable depends on the reader, because a language like Go requires the reader to sift through more details, whereas Haskell is much terser but also requires much more pre-existing knowledge to understand.

I'm wondering if we can use these code-generation models to find "low entropy code" that is a prime target for turning into libraries.


I think the Dont-Repeat-Yourself mantra is a bit overused. For example, assume you have a bunch of functions that all have the following code sequence

    A()
    B()
    C()
So someone thinks, hey, lets extract that bit, and create a function that does all three things

    def ABC():
      A()
      B()
      C()
Now the calls site is much simpler and there is no more repetition:

    ABC()
Except you realise that in some call sites the B() call was missing, so to fix that you add an argument:

    def ABC(doB):
      A()
      if(doB): 
        B()
      C()
The call site now looks a bit less clean, but still no repetition!

    ABC(True)
or

    ABC(False)
Then you realise in some cases the C() call is not necessary, so just add another argument:

    def ABC(doB, doC):
       A()
       if (doB):
          B()
       if (doC):
          C()
And your call sites look like this:

    ABC(True, False)
etc.

At that point the repetition in the original is definitely preferable...


If you put names to those functions it instantly feels a lot less like 'the original is definitely preferrable', for example something like:

    def handleNewEmail(isInboxDisplayed, areNotificationsEnabled):
      AddToInbox();
      if (isInboxDisplayed):
        RefreshInboxView();
      if (areNotificationsEnabled):
        SendSystemNotification();
Not defending needless refactoring in all cases, it's definitely a judgement call.

(Edit: formatting)


In your example the if statements would be needed in any case, since they depend on external state. In that case the refactoring would make sense.

I'm talking about the case where the conditionals are only needed because of the refactoring.


Ah I presumed from your example that conditionals were external to the doABC() logic since they were subsequently passed in as params?

To be honest, I find it hard to discuss readability without a real-world example. The algebraic placeholders feel too terse.


I'm talking about the case where you have these two functions:

    def func1():
      ...
      A()
      B()
      C()
      ...  

    def func2():
      ...
      A()
      B()
      C()
      ...  
Which is then refactored to

    def func1():
      ...
      ABC()
      ...  

    def func2():
      ...
      ABC()
      ...  

    def ABC():
      A()
      B()
      C()
Which is sensible, but then other functions show up:

    def func3()
      ...
      A()
      C()
      ...
And you refactor to:

    def func1():
      ...
      ABC(True)
      ...  

    def func2():
      ...
      ABC(True)
      ...  

    def func3()
      ...
      ABC(False)
      ...

    def ABC(doB):
      A()
      if (doB):
        B()
      C()
I'll leave the final example to your imagination, since I've already used so much screen space.

I'm currently in the process of undoing a lot of such refactorings in my codebase because it has become unmanageable.

I really think repetitive code is much easier to work with than overabstracted code.


I think the main problem is lack of expressiveness (what does True mean?). Since your examples seem to be in Python, I would solve it there with named parameters, maybe even giving them default values. That way the code may be abstract, but it is also informative.
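A minimal sketch of that suggestion, using the thread's placeholder A/B/C steps with hypothetical descriptive names in place of bare booleans:

```python
# Keyword-only arguments with defaults make the flags self-documenting.
# The step names here are invented; A/B/C come from the thread's example.
def process(*, refresh_view=True, notify=True):
    steps = ["A"]          # A() always runs
    if refresh_view:
        steps.append("B")  # B() is the optional middle step
    if notify:
        steps.append("C")  # C() is the optional final step
    return steps


# Call sites now say what they skip, instead of ABC(True, False):
process(notify=False)
process(refresh_view=False)
```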


I used to share the same example against obsession with DRY. But it's a bit more nuanced.

Both options can be valid. DRY is a tangential topic, the main goal should be to keep your code as close to your mental model as possible (also keeping mental models in sync between team members, which is quite hard).

ABC being a sequence could be a pattern, or it could be a coincidence. You can't know which one it is just by looking at the code. The knowledge whether it's a sequence or not is in the business domain and in your mental model of that domain.

You might think you're disagreeing on DRY with another team member, but in reality you two have different mental models, and one of you is using DRY to justify his.


Very well put. I run into this all the time in the name of "reducing complexity", when it is really hiding it under the rug.

My personal approach to combat this is better data structures that model the problem (and are checked by the compiler). Once this is in place I try to "flatten" the calls, so that things are mostly at the top level, or few levels deep, which usually comes up naturally once the data structure is consciously defined.

I try to have as much code as possible be "structure in - structure out" (pure functions) and to concentrate stateful code to work on the structure's fields/values. This is surprisingly easy once the data structures match the problem, instead of only growing organically.
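A toy sketch of that "structure in, structure out" style; the order/line-item domain here is hypothetical, not from the thread:

```python
from dataclasses import dataclass


# The data structure consciously models the problem (prices in cents,
# frozen so values can't mutate underneath the caller).
@dataclass(frozen=True)
class LineItem:
    name: str
    unit_price: int  # cents
    quantity: int


# Pure function: structure in, value out. No hidden state, so it is
# trivially testable and can sit at the top level of the call graph.
def order_total(items: list[LineItem]) -> int:
    return sum(i.unit_price * i.quantity for i in items)
```

Once the structure matches the problem, the "flattening" the parent describes tends to happen by itself: most functions become plain transformations of these values.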


A codebase is a living thing. Inlining a function or splitting it into multiple cases should always be an option, and boolean flags are generally a code smell. I don't see this as an argument against DRY; when the facts change, your code structure needs to change too, but that doesn't mean your original structure was wrong.


Can't wait for the next code review.

"You seem to be using ifs and fors very much. These should be abstracted into a function."


You missed the point; an if or for is a single token. The problem is with predictable token sequences, and if you write the same for loop (or extremely similar ones) in multiple places then yes, it should be turned into a function.


Effort in a codebase is unevenly distributed. The 90% of the code that looks like boilerplate probably represents 20% of the effort.


You're probably right about the effort - but I expect that such "boilerplate" code contains (or leads to) much more than 20% of the bugs.

This 90% of code is not genuine word-for-word boilerplate (copy/pasted from a known good source). This code is typically constructed fresh each time; or worse, copied from somewhere similar and quickly tweaked for names/types! (I do it, and I see it done all the time.)

I expect that the remaining 10% non-boilerplate code, taking 80% of the effort, is much more carefully considered, and less likely to contain those clumsy/forgetful off-by-one or buffer-overflow bugs.


There's a reason many languages have large overflowing repositories of modules (or there are well known libraries) that can be downloaded and used that provide boilerplate solutions for many things.

Most people don't like writing that boilerplate once they know how to do it and have done it a few times, and would rather just call a function do_that_thing_need_done_on(input1, input2).

If it can't be factored out like that and is actual language boilerplate beyond a few lines, that's a failure of the language.

If these AI models are suggesting the code that could be called in a library/module instead of the code to actually include and call a well-known and trusted library or module, I'm not sure that's progress. At least when someone notices a bug or a better way to do it and updates that module or library, consumers of that module can update and benefit from it, or at a minimum see that there were bugs in the version they're running that they might want to address at some point.


> If these AI models are suggesting the code that could be called in a library/module instead of the code to actually include and call a well-known and trusted library or module, I'm not sure that's progress.

I think the judgement call of when to use a library & what library to use is quite subjective, even for humans to get right.

If I'm doing JSON deserialisation it might suggest I use Gson library which would be much better than rolling your own. But the original authors are saying that you should prefer Moshi over Gson — I think it'd be hard for an AI to reach that conclusion though (though maybe not if it's doing something like tracking migrations in OS projects from Gson->Moshi).

With something a little more trivial — I don't want it to add in a dependency on left-pad, even though it has 2.5M weekly downloads so is arguably both well-known and trusted :)

You could probably set a threshold for how complex code is before it's suggested to be swapped out for a lib, but then is my code simple because I'm ignoring edge cases I should support, or because I've trimmed the fat on what I'm choosing to support (e.g. i18n, date handling, email validation etc.)


I agree it's not always cut and dried which module to use, or whether to use a module for something extremely simple (which is why I mentioned it being more than a few lines, which should weed out stuff like left-pad, I would hope), but I think knowing there is a module, and having it suggested, might be a good first step.

The only thing worse than using a module that has a bug/security problem for a function that's just a few lines and not used again in the codebase is when the content of that function is copied in place instead of being included and nobody has an easy way of knowing whether that's the code that was suggested and included in their project. Worst of both worlds.


Yeah, one of the interesting results in empirical studies of defect rates is that defect rate is influenced by lines of code more than other factors like “static types”. Similarly, analyses of defects have discovered that they tend to occur at the end of repetitive sequences of code, because the developer has sort of switched into autopilot mode. I think the obvious conclusion here (and my experience bears this out to some extent) is that languages and libraries that force boilerplate on you produce buggier code than languages and libraries that abstract the boilerplate away.


Yes and I raised the same concern when GitHub Copilot was released. If our code contains so little entropy that an AI can reliably predict the next sequence of tokens then that is evidence we are operating at too low a level of abstraction. Such tools can certainly be helpful in working with today's popular languages, but what would a future language that allows for abstracting away all that boilerplate look like?

Since this is HN I'm sure someone will say that the answer is Lisp. But can we do better?


By that reasoning, gzipped Lisp would be an improvement. I'm sure you agree that's not a very desirable way to write code.


The number of bytes isn’t really significant. It’s the conceptual distance between what we want to achieve and the concrete steps in the code.

Lisp may be an improvement.


But now you're already heading into much vaguer territory. Readability is also important. Very important, I would say. That requires easily identifiable markers for loops, conditions, functions, etc., something Lisp lacks. This might be a place where keyword coloring could be useful, but then we're relying on external help.

Another issue is consistency. Take C, Javascript, or Go. Many loops are of the form

    for (var i = 0; i < n; i++) { ... }
You could argue that "for i < n" provides the same information, but then you'd have to find a way to start the loop at a different offset, use a different end condition, or different "increment".


But that's exactly the issue. In most cases with that for loop we just want to apply the same operation to every item in a collection and shouldn't need to explicitly code a loop for that. So it should be possible to take advantage of higher level language constructs to express that, or define our own constructs through some form of meta programming. Is there a way to accomplish that while still retaining readable code?
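A small illustration of the difference, assuming Python-style comprehensions as the "higher level construct":

```python
data = [3, 1, 4, 1, 5]

# C-style: explicit index bookkeeping the reader has to verify.
squared = []
for i in range(len(data)):
    squared.append(data[i] ** 2)

# Higher-level: states the intent ("square every item"), no index to get wrong.
squared2 = [x ** 2 for x in data]

# Different offsets and strides are still expressible, via slicing,
# without falling back to manual loop-counter arithmetic:
every_other = [x ** 2 for x in data[1::2]]
```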


Metaprogramming can make things worse. It can be useful for constructing/representing rule-like objects or functions, but when you start overloading basic syntactic elements, people will lose track. It was the staple trick of the obfuscated C contest, so much so that it's been forbidden now, IIRC. It's really difficult to come up with something that is terse, readable, and unambiguous all at once.

But the situation is not that bad, is it? A few characters too many, so be it. I find reusability a much larger problem.


> But now you're already heading into much vaguer territory.

Yes programming language design is a social science (imo)


I don't think so.

Sure, most code is boilerplate, except for that one thing, and that one thing can be anywhere. For example, let's say you want to write a function that returns the checksum of a bunch of data. That's a very common thing to do, there are plenty of libraries that do that, and I have seen the CRC32 lookup table in many places, sometimes I am the one who put it there.

Now, why rewrite such a function?

- Ignore some part of the message

- Use different constants

- Fetch data in a special way (i.e. not a file or memory buffer)

- Have some kind of a progress meter

- The library you may want to use is not available (can be for technical, legal or policy reasons)

- Some in-loop operation is needed (ex: byte swapping)

- Have a specific termination condition (ex: end-of-message marker)

- And many others, including combinations of the above

If you ignore all these points and only see the generic checksum function, then yes, it is boilerplate and can be factored out. But these special cases are the reason why that may not be possible, and the reason why there are so many coding jobs.

It is also the reason why we don't have real (Lv5) self driving cars yet, why there are pilots in the cockpit, why MS Office and the like have so many "useless" features, why so many attempts to make software cleaner and simpler fail, etc...


That's hardly a convincing example. All of these points can be solved elegantly with a stream abstraction, which can be cheap or free given a sufficiently advanced language and compiler.

As for legal or policy reasons, those still aren't reasons to write boilerplate code. Your reimplementation can be tight and reuse other abstractions or include their own.
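A sketch of the stream-abstraction point, using a toy additive checksum (deliberately not a real CRC) over an iterable of byte chunks; `skip_header` is a hypothetical variation from the parent's list, expressed as a wrapper around the source rather than a rewrite of the checksum:

```python
from typing import Iterable, Iterator


# The checksum never cares where the bytes come from: file, socket,
# memory buffer, or generator all look the same through the iterable.
def checksum(chunks: Iterable[bytes]) -> int:
    total = 0
    for chunk in chunks:
        for b in chunk:
            total = (total + b) & 0xFFFFFFFF
    return total


# "Ignore some part of the message" becomes a generator wrapped around
# the source. (Assumes the header fits inside the first chunk.)
def skip_header(chunks: Iterable[bytes], n: int) -> Iterator[bytes]:
    it = iter(chunks)
    first = next(it)[n:]
    if first:
        yield first
    yield from it
```

Progress meters, byte swapping, or end-of-message markers can be layered the same way. Whether this composition is clearer than just rewriting the loop is exactly the trade-off the replies below debate.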


A stream abstraction is a solution to some (not all) of these problems, and indeed some libraries use them, but a stream abstraction that is powerful enough to solve most of these problems may result in more complex code than just rewriting the checksum algorithm from scratch. And there is a limit to how much compilers can optimize, especially considering that checksum calculation may be performance-critical.

In reality, few people need to write their own checksumming function, but sometimes it is the best thing to do. And it is just an example; there are many other instances where an off-the-shelf solution is not appropriate because of some detail: string manipulation, parsing, data structures (especially the "intrusive" kind), etc... And since you are probably going to have several of these in your project, it will result in a lot of boilerplate. If it were so generic as not to require boilerplate, it would probably have been developed already and you would be working on something else.

Abstractions are almost invariably more complex, slower, more error-prone and generally worse than the direct equivalent. They are, however, reusable, that's the entire point. So one person goes through the pain of writing a nice library, and it makes life a little easier for the thousands of people who use it, generally, that's a win. But if you write an abstraction for a single use case, it is generally worse than boilerplate.


This is exactly my view, especially with the web apps. If you take a distributed system, the majority of components/microservices will have more than 50% commonality in behaviour. Therefore you do mostly the same things when you start a new one. Even if code itself might be harder to generate, as even a CRUD app might have specific behaviour, testing it is definitely the same, especially when doing negative scenarios, boundary testing, CRUD operations, etc. I wrote a tool specifically for this purpose, targeted at REST APIs, aiming to automate this repetitive work and let you focus on the tests which are specific to the context.


We are not compression algorithms! If we were, we could replace the most common block of boilerplate code with token 'A', the second-most-common block with token 'B', and so on, writing programs in very few bytes. God have mercy on anyone trying to debug such a program, though.

Any language with no boilerplate at all is a black box of incomprehensibility. Java has, I think, more boilerplate than average, while some other languages have less boilerplate than average.

IDEs can help with some of this, which is why I finally stopped writing all code in vim.


Allowing code to be compressed this much is the goal of golfing languages (such as 05AB1E (or osabie) or Pyth (not Python)). The Code Golf Stack Exchange site contains a lot of programming challenges where the goal is to write the shortest program (in bytes) that does what the challenge asks, and some answers are truly impressive, with somewhat non-trivial algorithms being implemented in as few as 4 bytes (in extreme cases). Granted, these are programming challenges and not production code to be deployed, and some golfing languages are designed for a specific kind of task or algorithm, which may suggest the algorithm was actually pre-implemented in the language (and sometimes that is kinda true), but still, it's worth taking a look.


Yes, this is true as long as the method or function doesn't contain ifs that change the behaviour depending on the data. In other words, if the problem is so well defined that you can create a method that solves it without having to account for x variations of the problem, then it is fine. This is the copy-paste versus creating-a-function discussion.

The problem is those x variations: you need the code to do different things depending on the variation, and we usually break the modularity of the function instead of separating the generic and non-generic parts. Hence the ifs in the function. From what I have seen, people are unable to do this properly in their own codebase, so I don't think it will happen globally.

But on the other hand, libraries are kind of the answer to the problem, and as problems get well defined, one starts using libraries. Raising the level of abstraction is a continuous process.


Written language has this too. A basic lookup table of frequencies can tell you that jkkj is a typo in an English word. "Nobody else is really writing code containing the fragment you just wrote" can find syntax errors. Better language models can find more subtle relationships.

At some point the variations mean that more abstractions don't really help.
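A toy version of that frequency-table idea; the bigram list here is a tiny illustrative sample, not real corpus statistics:

```python
# A small sample of common English letter pairs (a real spellchecker
# would derive frequencies from a large corpus).
COMMON_BIGRAMS = {
    "th", "he", "in", "er", "an", "re", "on", "at", "en", "nd",
    "ti", "es", "or", "te", "of", "ed", "is", "it", "al", "ar",
    "st", "to", "nt", "ng", "se", "ha", "as", "ou", "io", "le",
}


def looks_like_typo(word: str) -> bool:
    pairs = [word[i:i + 2] for i in range(len(word) - 1)]
    # If not a single bigram in the word is a common one, flag it.
    return not any(p in COMMON_BIGRAMS for p in pairs)


looks_like_typo("jkkj")   # True: none of jk/kk/kj are common English pairs
looks_like_typo("there")  # False: th/he/er/re are all common
```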


I think you misunderstand machine learning. "Probability of each token appearing given the previous tokens" is how humans write code too: we write code based on what we want to do and what we have written before. The "what I want to do" part was captured in the comments that were added.


>probability of each token appearing given the previous tokens

Sounds like tokenizer -> Markov chain? Surely something trained on a TPU is more sophisticated than something we could have done in the middle of the 20th century?
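A first-order Markov chain is indeed the mid-20th-century baseline for "probability of each token given the previous tokens"; a minimal sketch (modern transformer-based models differ mainly in conditioning on far longer context through learned representations, not in the basic objective):

```python
from collections import Counter, defaultdict


# Count next-token frequencies conditioned on the single previous token.
def train(tokens):
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def most_likely_next(counts, prev):
    return counts[prev].most_common(1)[0][0]


# A tiny "training corpus" of whitespace-split code tokens.
code = "for i in range ( n ) : total += i".split()
model = train(code)
most_likely_next(model, "in")  # "range"
```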


I agree. And yet most developers claim JS, Python, Java etc are totally sufficient.


Perl is a wonderful, innovative language which failed because it tried to remove intratextual redundancy in the way you are suggesting.

A string of length N is vastly more likely to be a valid Perl program than a valid Python program. Ultimately this meant that Perl programs, while easier to type, were much harder to read, and extremely easy to misinterpret.


There is also something to be said about nudging developer on the "right" way to do stuff.

Perl is not only hard to read because there are many shortcuts that might look like line noise to the inexperienced (hell, Rust has a bunch of those too); it's also because there are a bunch of ways to do anything.

Take the humble array:

    Perl> @a
    $VAR1 = 1;
    $VAR2 = 2;
    $VAR3 = 3;
    $VAR4 = 4;
    $VAR5 = 5;

    Perl> @a + 2
    $VAR1 = 7;
Why does adding a number to an array result in a number? Because an array in scalar context returns its length. That leads to some very compact code:

    if (@a > 3) {print "big array"}
but, uh, what do you do if you want the length of a string? Well, length($string).

Will that also work for arrays? Nope; if you want to force scalar context you're supposed to do scalar(@array). So how do you concatenate two arrays?

    @c = (@a, @b)
obviously. But wait: what we really typed, after variable expansion, is ((1,2),(2,3)), while in other languages

    irb(main):001:0> a = [1,2]
    => [1, 2]
    irb(main):002:0> b=[2,3]
    => [2, 3]
    irb(main):003:0> c = [a,b]
    => [[1, 2], [2, 3]]

    >>> a = [1,2]
    >>> b = [2,3]
    >>> c = [a,b]
    >>> c
    [[1, 2], [2, 3]]
that's exactly what we get. Confusing? Sure. But it saves a few characters!



