> is the return value the number of bytes written, like sprintf, or the number of bytes that could have been written, like snprintf?
Both report the same number: the length of the complete formatted output, not counting the terminating NUL. snprintf just doesn't write more than n bytes (including the NUL).
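For illustration, here is a minimal sketch of how that return value is typically used to detect truncation. format_pair is a made-up helper, not something from the code under discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats "name=value" into dst (capacity cap).
 * Returns 0 on success, -1 on an encoding error or if the output
 * did not fit. */
static int format_pair(char *dst, size_t cap, const char *name, int value)
{
    /* snprintf returns the length the full output would have had,
     * not counting the terminating NUL, even when it truncates. */
    int needed = snprintf(dst, cap, "%s=%d", name, value);
    if (needed < 0)
        return -1;              /* encoding error */
    if ((size_t)needed >= cap)
        return -1;              /* truncated: needed + 1 bytes required */
    return 0;
}
```

The `>=` matters: a return value equal to cap means the last character was dropped to make room for the NUL.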
> when I'm evaluating code for security issues, it's useful to know exactly how a function will behave in edge cases
Simple data: no edge cases. Pointer + length, that's simple. The most complexity you might face is whether to allow a NULL pointer when the length is 0 (which is common). Hardly any function should have to care about that.
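A rough sketch of what that looks like in practice. The struct strview type and strview_equal are invented names for illustration. The one wrinkle is that the C standard has historically not allowed passing NULL to memcmp or memcpy even with a length of 0, so the zero-length case is handled before the call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pointer + length view; ptr may be NULL when len == 0. */
struct strview {
    const char *ptr;
    size_t len;
};

/* Compares two views byte for byte. */
static bool strview_equal(struct strview a, struct strview b)
{
    if (a.len != b.len)
        return false;
    if (a.len == 0)
        return true;    /* never hand a possibly-NULL pointer to memcmp */
    return memcmp(a.ptr, b.ptr, a.len) == 0;
}
```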
Aside, C strings are still a valid alternate representation in some cases. For example, for many small strings (where an explicit length field would double the cost) or purely as a convenience in conjunction with string literals. Or for some serialized representations (it's nice not having to deal with compatibility problems concerning the physical representation of the length value).
> This is the reason why there are so many security vulnerabilities, because there are too many nuances and inconsistencies.
I think the reason is that the standard library is overused. Functions like strcat are not only inefficient, they also have a needlessly complex API. If people created the functionality they actually needed instead of working around the ill-fitting API in each line of code, there would be fewer vulnerabilities.
> strcpy() [..] strncpy() [..] strlcpy [..] no clear replacement to reach for, and we expect every programmer to know this.
No. Just don't use this stuff. It's too complex. It's the wrong interface. If you deal with strings and don't do automatic reallocation, you need to get their lengths anyway. So what you should do is something like:
    void do_silly_stuff(const char *a, const char *b, const char *c,
                        int alen, int blen, int clen)
    {
        char buf[FIXEDSIZE];
        ASSERT(alen + blen + clen <= sizeof buf);
        int len = 0;
        memcpy(buf + len, a, alen); len += alen;
        memcpy(buf + len, b, blen); len += blen;
        memcpy(buf + len, c, clen); len += clen;
        do_more_silly_stuff(buf, len);
    }
Nice and explicit. You cannot get less dangerous in C.
Or if you do automatic reallocation ("dynamic string"):
    String sillyconcat(String a, String b, String c) {
        String result = new_string();
        // optionally:
        // string_reserve(result, string_length(a) + string_length(b) + string_length(c));
        string_append(result, a);
        string_append(result, b);
        string_append(result, c);
        return result;
    }
... but I'd only recommend this approach if you want to get super-comfortable and are willing to pay the cost of being maybe a bit opaque and of buying into a specific String type.
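To make the "dynamic string" idea concrete, here is one possible sketch of such a String type. The names new_string, string_length, string_reserve and string_append come from the snippet above; their signatures, the handle-style layout, string_append_bytes and string_free are assumptions made for this example, and unlike the snippet the functions here report failure through a return value.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout: String is a handle so string_append can grow it. */
typedef struct string_impl {
    char *data;
    size_t len;
    size_t cap;
} *String;

static String new_string(void)
{
    String s = malloc(sizeof *s);
    if (s != NULL) {
        s->data = NULL;
        s->len = 0;
        s->cap = 0;
    }
    return s;   /* NULL if out of memory */
}

static void string_free(String s)
{
    if (s != NULL) {
        free(s->data);
        free(s);
    }
}

static size_t string_length(String s)
{
    return s->len;
}

/* Ensures capacity for at least `want` bytes; 0 on success, -1 on failure. */
static int string_reserve(String s, size_t want)
{
    if (want <= s->cap)
        return 0;
    size_t cap = s->cap ? s->cap : 16;
    while (cap < want) {
        if (cap > SIZE_MAX / 2) {   /* doubling would wrap around */
            cap = want;
            break;
        }
        cap *= 2;
    }
    char *p = realloc(s->data, cap);
    if (p == NULL)
        return -1;
    s->data = p;
    s->cap = cap;
    return 0;
}

/* Appends n raw bytes; src must not point into dst's own buffer. */
static int string_append_bytes(String dst, const char *src, size_t n)
{
    if (n > SIZE_MAX - dst->len)
        return -1;                  /* total length would overflow */
    if (n == 0)
        return 0;
    if (string_reserve(dst, dst->len + n) != 0)
        return -1;
    memcpy(dst->data + dst->len, src, n);
    dst->len += n;
    return 0;
}

/* Appends src to dst; reserving first keeps src->data valid if dst == src. */
static int string_append(String dst, String src)
{
    size_t n = src->len;
    if (n > SIZE_MAX - dst->len)
        return -1;
    if (string_reserve(dst, dst->len + n) != 0)
        return -1;
    return string_append_bytes(dst, src->data, n);
}
```

The buffer is not NUL-terminated here; whether to keep a trailing NUL for interop with C string APIs is a separate design choice.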
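Ah, but even this can be dangerous. Suppose FIXEDSIZE = 4096. If you pass 715,827,883 as all three length parameters, the int sum overflows (3 × 715,827,883 = 2^31 + 1) and typically wraps to a negative number. In a signed comparison your assertion (assuming it's not compiled out) passes, but a buffer overflow occurs. (Against the unsigned sizeof the negative sum converts to a huge value and the assertion happens to fire, but lengths of 1,431,655,766 each wrap to 2 and slip through.)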
Basically, using the C string routines with untrusted input is terrifying.
Yep. You could argue to death almost any line of C code that contains an arithmetic operation in this way. But the ASSERT is not supposed to catch everything. It's a basic protection against programming errors. (Realistically I rarely have strings that are 700 MB large).
Unfortunately, arithmetic overflow doesn't result in an exception. I rarely want wraparound.
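One way to sidestep the wraparound is to never add the untrusted lengths together, and instead compare each length against the space that is left. The sketch below assumes size_t lengths; append_checked is a hypothetical helper, not part of the code above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends n bytes from src to buf at offset *len,
 * where buf has capacity cap. Returns false (and copies nothing) if the
 * bytes do not fit. Assumes *len <= cap on entry. */
static bool append_checked(char *buf, size_t cap, size_t *len,
                           const char *src, size_t n)
{
    /* cap - *len cannot wrap because *len <= cap, so this comparison
     * is safe for any n, however large. */
    if (n > cap - *len)
        return false;
    if (n != 0)
        memcpy(buf + *len, src, n);
    *len += n;
    return true;
}
```

A caller would invoke it once each for a, b and c and bail out on the first false, which takes the place of the single up-front ASSERT on the sum.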
> Realistically I rarely have strings that are 700 MB large
Maybe not that long, but it's not that hard to lose a '\0' when serializing complicated data structures to disk. When you read the file back in, suddenly one of your structs contains an arbitrarily long string. I've seen several 32+ MB strings get created this way.
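A defensive read path can catch the lost terminator by only searching within the bytes that were actually read. This is a sketch under the assumption that the record size is known; bounded_strlen is a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: looks for a NUL-terminated string at the start of
 * a record of `size` bytes read from disk. Returns the string length, or
 * (size_t)-1 if no terminator exists inside the record. */
static size_t bounded_strlen(const char *record, size_t size)
{
    if (size == 0)
        return (size_t)-1;
    const char *nul = memchr(record, '\0', size);
    if (nul == NULL)
        return (size_t)-1;      /* terminator lost: reject the record */
    return (size_t)(nul - record);
}
```

Unlike strlen, this never reads past the end of the record, so a missing '\0' becomes a rejected record rather than an arbitrarily long string.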