Optimisation gone wrong

This post is just a quick rant about compilers trying to be helpful.

While I usually develop on NetBSD, I've recently been taking care to keep things running smoothly under Linux too. I'll set up automated builds for both shortly. Meanwhile, the differences between the two platforms often bring up strange little points.

Most recently, there are two cases covered by [73] which I'll discuss here as I think they have something interesting to show about optimisation. The first:

#define _POSIX_SOURCE

#include <string.h>

int main(void) {
        char d[] = { 'a', '\0' };

        strtok_r(NULL, d, NULL);

        return 0;
}

gives:

void% cc -O0 -Wconversion a.c
void% cc -O1 -Wconversion a.c
a.c: In function `main':
a.c:8: warning: passing arg 2 of `__strtok_r_1c' with different width due to prototype
void%

because of:

/usr/include/bits/string2.h:__strtok_r_1c (char *__s, char __sep, char **__nextp)

Hence I've removed -Wconversion, which is really only intended for use when converting pre-ANSI code.

The interesting point here is that a perfectly valid call to strtok_r() has somehow been contorted behind the scenes, in the name of optimisation, into a call that draws a warning.
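For context, older GCC releases used -Wconversion to warn whenever a prototype causes an argument to be converted differently from how it would be without one; GCC 4.3 and later call that check -Wtraditional-conversion. A minimal sketch of the kind of call that trips it, assuming a hypothetical helper first_is_sep() standing in for glibc's __strtok_r_1c:

```c
#include <stddef.h>

/* Hypothetical stand-in for a helper that takes its separator as a single
 * char, much like the __strtok_r_1c prototype quoted above. */
static int first_is_sep(const char *s, char sep)
{
    return s != NULL && s[0] == sep;
}

/* Perfectly valid C: the separator is passed as a char because a prototype
 * is in scope. Without a prototype it would have been promoted to int, and
 * that difference in width is all the old -Wconversion check reports. */
static int starts_with_comma(const char *s)
{
    return first_is_sep(s, ',');
}
```

Nothing is wrong with either function; the warning only matters when the same source must also build with a pre-ANSI compiler.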


The second: libgreat defines its own strdup(), which puts it in violation of C99 7.26.11 P1: "Function names that begin with str, mem or wcs and a lowercase letter may be added to the declarations in the <string.h> header."

From /usr/include/bits/string2.h (preprocessor conditionals omitted):

extern char *__strdup (__const char *__string) __THROW __attribute_malloc__;
#  define __strdup(s) \
  (__extension__ (__builtin_constant_p (s) && __string2_1bptr_p (s)           \
                  ? (((__const char *) (s))[0] == '\0'                        \
                     ? (char *) calloc (1, 1)                                 \
                     : ({ size_t __len = strlen (s) + 1;                      \
                          char *__retval = (char *) malloc (__len);           \
                          if (__retval != NULL)                               \
                            __retval = (char *) memcpy (__retval, s, __len);  \
                          __retval; }))                                       \
                  : __strdup (s)))
#   define strdup(s) __strdup (s)

This causes all hell to break loose when the macro is expanded over our own definition of strdup():

string.c:50: error: syntax error before "__extension__"
string.c:50: error: `__len' undeclared here (not in a function)
string.c:50: error: initializer element is not constant
string.c:50: error: syntax error before "if"
string.c:50: warning: type defaults to `int' in declaration of `__retval'
string.c:50: error: conflicting types for `__retval'
string.c:50: error: previous declaration of `__retval'
string.c:50: warning: redundant redeclaration of `__retval' in same scope
string.c:50: warning: previous declaration of `__retval'
string.c:50: error: ISO C forbids data definition with no type or storage class
string.c:50: error: syntax error before '}' token
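The errors come from the preprocessor expanding the function-like macro inside the function definition itself. A minimal sketch of the same effect and one way around it, using the hypothetical names dup_str and dup_str_impl rather than the glibc ones: putting the function name in parentheses stops the macro from being expanded.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the header's function-like macro. */
#define dup_str(s) dup_str_impl(s)

static char *dup_str_impl(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Writing "char *dup_str(const char *s)" here would expand the macro and
 * turn this into a second definition of dup_str_impl. With the name in
 * parentheses, dup_str is not followed by '(' and so is not treated as a
 * macro invocation. */
char *(dup_str)(const char *s)
{
    return dup_str_impl(s);
}
```

Callers can use the same trick: (dup_str)("abc") calls the function, while dup_str("abc") goes through the macro.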

C99 7.1.4 F163 recommends using #undef on such macros in situations where "real" functions are required. In this case, however, libgreat now simply doesn't include <string.h> at all, and provides its own prototype instead.
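For comparison, the #undef route might look like the sketch below. It uses strlen rather than strdup, since strdup is not part of C99; plain_length and length_via_pointer are hypothetical names for illustration only.

```c
#include <string.h>

/* An implementation may provide strlen as a macro in addition to the
 * function. #undef removes any such macro, and is harmless if there is
 * none. */
#undef strlen

static size_t plain_length(const char *s)
{
    /* With the macro (if any) removed, this names the real function. */
    return strlen(s);
}

static size_t length_via_pointer(const char *s)
{
    /* A real function can also have its address taken. */
    size_t (*fn)(const char *) = strlen;

    return fn(s);
}
```

This keeps the header's declarations in scope, whereas libgreat's approach avoids the header altogether.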