Compiler optimizations are weird

Christian Weisgerber
Modern compiler optimizations are a sight to behold.

When I extract the bitrev32() function from sys/dev/fdt/if_dwge.c
and compile it on its own on aarch64, clang with optimization
recognizes the purpose and reduces the arithmetic to a single "rbit"
instruction.  Amazing.
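
For readers without the source tree at hand, here is a minimal sketch of
the usual mask-and-shift bit reversal.  It is written from memory as an
illustration of the idiom; the exact text in if_dwge.c may differ.

```c
#include <stdint.h>

/*
 * Reverse the bits of a 32-bit word by swapping ever larger groups:
 * adjacent bits, then pairs, nibbles, bytes, and finally the two
 * 16-bit halves.  (Illustrative sketch, not copied from if_dwge.c.)
 */
static inline uint32_t
bitrev32(uint32_t x)
{
	x = ((x & 0xaaaaaaaa) >> 1) | ((x & 0x55555555) << 1);
	x = ((x & 0xcccccccc) >> 2) | ((x & 0x33333333) << 2);
	x = ((x & 0xf0f0f0f0) >> 4) | ((x & 0x0f0f0f0f) << 4);
	x = ((x & 0xff00ff00) >> 8) | ((x & 0x00ff00ff) << 8);

	return (x >> 16) | (x << 16);
}
```

Compiling a file like this with something like "cc -O2 -S" on aarch64
and reading the assembly should show whether the body collapses to
"rbit"; the result will depend on the compiler version.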

Somewhat less (or more?) amazingly, this appears to be a late-stage
optimization that is sabotaged by earlier steps.

When compiling the full if_dwge.c, clang seems to inline the function,
and then strips it down to the fragments needed to reverse the few
bits actually used.
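
To illustrate the kind of caller where this can happen, here is a
hypothetical example, not taken from if_dwge.c: top6_reversed() is an
invented name, and the shift by 26 is an assumption chosen only to show
a caller that consumes a few bits of the result.  Once bitrev32() is
inlined, only the low six bits of the input can affect the output, so
the compiler is free to discard most of the mask-and-shift steps.

```c
#include <stdint.h>

/* Same illustrative mask-and-shift reversal as a standalone function. */
static inline uint32_t
bitrev32(uint32_t x)
{
	x = ((x & 0xaaaaaaaa) >> 1) | ((x & 0x55555555) << 1);
	x = ((x & 0xcccccccc) >> 2) | ((x & 0x33333333) << 2);
	x = ((x & 0xf0f0f0f0) >> 4) | ((x & 0x0f0f0f0f) << 4);
	x = ((x & 0xff00ff00) >> 8) | ((x & 0x00ff00ff) << 8);

	return (x >> 16) | (x << 16);
}

/*
 * Hypothetical caller: keep only the top six bits of the reversed
 * word.  Those are bits 0..5 of the input in reverse order, so the
 * upper 26 input bits never matter.
 */
static uint32_t
top6_reversed(uint32_t x)
{
	return bitrev32(x) >> 26;
}
```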

Over in the sys/dev/rasops code, the same bit reversal is performed
by the MBE() macro.  Here clang recognizes that the code is called
inside some loops and extracts the loop-invariant parts: it loads the
eight constants into registers up front, and only leaves the
and/or/shift operations inside the loop.
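
A rough sketch of that loop shape, assuming the reversal is expanded
inline in the loop body.  This is not the MBE() macro itself, and
bitrev32_buf() is an invented name for illustration.  The eight mask
constants are loop-invariant, which makes them candidates for being
loaded into registers once, before the loop.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Reverse the bits of every word in a buffer.  The four steps use
 * eight mask constants in total; the final 16-bit swap needs none.
 * (Hypothetical helper, not taken from the rasops code.)
 */
static void
bitrev32_buf(uint32_t *buf, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		uint32_t x = buf[i];

		x = ((x & 0xaaaaaaaa) >> 1) | ((x & 0x55555555) << 1);
		x = ((x & 0xcccccccc) >> 2) | ((x & 0x33333333) << 2);
		x = ((x & 0xf0f0f0f0) >> 4) | ((x & 0x0f0f0f0f) << 4);
		x = ((x & 0xff00ff00) >> 8) | ((x & 0x00ff00ff) << 8);
		buf[i] = (x >> 16) | (x << 16);
	}
}
```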

So those clever optimizations of the arithmetic prevent the even
more clever substitution with "rbit".

It's just an observation I thought I'd share.  Let's file it under
"the compiler moves in a mysterious way".

Christian "naddy" Weisgerber