Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

One other advantage of signed overflow not listed here -- loop termination. Consider the following loop:

    for(int i = A; i <= B; i+=4) ...
This loop can be infinite if overflow is defined, and B is less than 4 from the largest value. Checking for this is a pain (and extra code), and in practice (almost) no-one would ever mean to write this and get an infinite loop when B is close to the top of the integer range.

Special case checking for this case is a pain for various kinds of loop unrolling / peeling, and also removing empty loops if they don't contain any code (which often comes up in practice with C++ templates).



Speaking as a professional compiler engineer, this is the biggest reason why signed overflow is treated as UB.

It is incredibly useful to talk about a loop's trip count or the values an induction variable can have.

It turns out that taking advantage of signed overflow lets us do that in more places, a boon to all the programs which don't dynamically execute signed overflow.

The tradeoff, however, is that the programing model gets way more complicated. I have talked to the engineer who introduced this optimizing power to our compiler and he sorta regrets it.

IMO, we need better, higher level ways of talking about a loop's iteration space. Today, in C or C++, we are forced to represent the iteration space of a loop using the language's algebra. This is severely limiting when each operation on a signed type might introduce UB. With a higher level mechanism, we can separate the behavior that we want for loops from the behavior we want for everything else in a nice, clean way.


> With a higher level mechanism, we can separate the behavior that we want for loops from the behavior we want for everything else in a nice, clean way.

Yes! One decisive advantage that high-level iteration constructs have over for loops is that they're easier to optimize.

One obvious example here is bounds checks: there's no need to have infrastructure to optimize out bounds checks over arrays during loops if you have a construct that eliminates the indexing entirely.


Genuine question: what do you do when you want to implement your own loop mechanism, iterator, array type, or whatever?


It is the job of the language to ensure that language primitives are reusable and composable.

I think this would apply to whatever mechanism is designed to describe a loop's iteration space.


Most languages use some sort of interface whereby if you implement certain methods for a type then that type can be iterated over using a for loop.


It's just one line. Put this before the loop if the code logically shouldn't trigger this.

  assert( B < B_TYPE_MAX - 4 );
Or use an if statement if it could trigger at runtime.

Also the code is clearer on intent.


You misunderstood the problem. The problem isn't that programmers can't write loops that are easy to optimize. It's rather that, in practice, they don't write loops that are easy to optimize. That's ultimately a problem with the C language.


If the loops can be written and the programmers don't write them, the problem is with the language!?

No, it's pretty clear where the problem is with C, programmers.


The easiest way to write the loop is the way that is hard to optimize. This is a fact that is beyond dispute: empirically, programmers use int to index over arrays in for loops without writing asserts. That is in fact a problem with the language.


programmers use int to index over arrays in for loops without writing asserts. That is in fact a problem with the language.

Ok let me get this clear, you're saying that the language is at fault because the programmers don't write correct code. I have no words.


Emphatically yes.

When designing tools for use by humans, you need to accept the fact that humans, with all our failings, will be the ones using the tools. So tools should be made resilient against typical ways in which humans fail.

We've been paying for shortcomings in C's design for decades with bugs and security failures that simply don't happen in other languages. That you refuse to see this is baffling to me.


> Ok let me get this clear, you're saying that the language is at fault because the programmers don't write correct code.

Yes.

A good language makes the correct thing the easy thing.


When a common idiom becomes the wrong way to do something, someone messed up pretty badly (and preexisting idiom at that).

Arguably the designers of amd64 should have caught this before releasing the ABI or the language design could have been specified so this wasn't an issue in the first place.


Can assert() actually function this way? Can you use assert to tell the optimizer it can assume something is true?

I've noticed that memcpy(x, y, 4) on x86 can generate very efficient code (register move), but on ARM it expands to something much more verbose because the addresses might not be aligned.

Could this effectively function as a way of promising to the compiler that the addresses are aligned?

    void move4_aligned(void *dst, const void *src) {
      assert(((uintptr_t)dst & 0x3) == 0);
      assert(((uintptr_t)src & 0x3) == 0);
      memcpy(dst, src, 4);
    }


> Can assert() actually function this way? Can you use assert to tell the optimizer it can assume something is true?

Trivially.

  #ifdef NDEBUG
  # define assert_and_assume(cond) if(!(cond))     
    __builtin_unreachable((cond))
  #else
  # define assert_and_assume assert
  #endif


Interesting! This assert_and_assume() seems strictly better than vanilla assert() for any predicates that don't have side effects. But I guess you have to be sure that the compiler is able to deduce that there aren't side effects and feels comfortable optimizing away the predicate in release mode.


Can assert() actually function this way? Can you use assert to tell the optimizer it can assume something is true?

Yes, but it would have the potentially surprising behavior that compiling your release version with -DNDEBUG might slow it down. We ran into this a couple years ago: https://plus.google.com/+DanielLemirePhD/posts/BTSams19Ero

Recent compilers support some extensions that can do it better. GCC and CLang use __builtin_assume_aligned(), and ICC uses __assume_aligned() (haven't tested the synonym).




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: