If this holds true, I'll concede this specific point.
As we know, however, benchmarking can often come down to tuning. If this most basic of compiler options has not been set to the obvious choice for speed, how can we have any confidence that the C code itself is written efficiently?
Are we comparing language against language here, or somebody's implementation in one language against somebody's implementation in another?
I note that there appear to be hand optimisations in the C code. Were these done well, or would the compiler have done a better job?
Of course we are comparing implementations; languages do not have a speed. My (purely hypothetical, unfortunately) language that builds on 'Principia Mathematica' may need a 10,000-page program to compute 1+1, but its compiler could, in theory, produce the same executable as C (or Fortran, or whatever) would from their one-liners that do the same thing.