One famous example: Csmith exposed that GCC’s vectorizer could miscompile a simple loop, leading to wrong integer results – a bug that had existed for years.

It doesn’t need a seed corpus. It can produce billions of unique, valid C programs from scratch.

Developed by John Regehr, Xuejun Yang, Yang Chen, and Eric Eide at the University of Utah, Csmith was designed to solve a critical problem: compilers are too complex to be tested by hand. Human-written test suites (like GCC’s testsuite/) are valuable but limited. Humans tend to write code that "makes sense" and avoid weird edge cases. Compiler bugs, however, often lurk precisely in those weird edge cases.

./csmith > test.c

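The secret sauce is that Csmith generates deterministic programs that steer clear of C’s undefined and unspecified behavior. For any given random seed, Csmith produces the same program, and that program’s output should not depend on which compiler or optimization level built it. More importantly, each generated program computes a checksum over its global variables and prints it at the end of execution. If two compilers produce different checksums for the same source code, that is a smoking gun: at least one of them has almost certainly miscompiled the program.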
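The checksum comparison can be sketched as a small differential-testing driver. This is an illustrative sketch, not Csmith's own tooling: the helper names (`group_by_output`, `is_mismatch`, `build_and_run`, `main`), the choice of compilers and flags, and the file names are all assumptions, and it expects `test.c` to already exist and the directory containing `csmith.h` to be passed on the command line.

```python
import subprocess
import sys
from collections import defaultdict


def group_by_output(outputs):
    """Group configuration labels by the (stripped) text their binary printed."""
    groups = defaultdict(list)
    for label, text in outputs.items():
        groups[text.strip()].append(label)
    return dict(groups)


def is_mismatch(outputs):
    """Return True when at least two binaries printed different output."""
    return len(group_by_output(outputs)) > 1


def build_and_run(compiler_cmd, source, exe, timeout=10):
    """Compile `source` with `compiler_cmd`, run the binary, return its stdout.

    Random programs may loop for a long time, so the run is bounded by
    `timeout` seconds (subprocess.TimeoutExpired is raised on expiry).
    """
    subprocess.run(compiler_cmd + [source, "-o", exe], check=True)
    done = subprocess.run([exe], capture_output=True, text=True, timeout=timeout)
    return done.stdout


def main(include_dir, source="test.c"):
    # Hypothetical set of configurations to compare; adjust to taste.
    configs = {
        "gcc -O0": ["gcc", "-O0", "-w", "-I", include_dir],
        "gcc -O2": ["gcc", "-O2", "-w", "-I", include_dir],
        "clang -O2": ["clang", "-O2", "-w", "-I", include_dir],
    }
    outputs = {}
    for i, (label, cmd) in enumerate(configs.items()):
        try:
            outputs[label] = build_and_run(cmd, source, f"./test_bin_{i}")
        except subprocess.TimeoutExpired:
            print(f"{label}: timed out, discarding this program")
            return
    if is_mismatch(outputs):
        print("MISMATCH, possible miscompilation:")
        for text, labels in group_by_output(outputs).items():
            print(f"  {labels}: {text}")
    else:
        print("all configurations agree")


# Only runs when invoked as: python driver.py <dir containing csmith.h>
if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

A mismatch reported by a driver like this is a lead rather than a verdict: the usual next steps are to re-run the program to rule out flakiness and then shrink it to a small reproducer before filing a bug.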