korn.c is the "Best One Liner" winner of the 1987 International Obfuscated C Code Contest, by David Korn (yes, the author of the Korn Shell). korn.hint, as the name implies, offers some hints.
A commenter on Stack Overflow asked for some clarification. I didn't want to post spoilers on the site, so I'm posting them here instead. If you haven't already (and if you're familiar with the rules of C) I encourage you to study the program for a while first.
=====
Here's the code:
main() { printf(&unix["\021%six\012\0"],(unix)["have"]+"fun"-0x60);}
This was written 2 years before the ANSI standard was published. Modern compilers are likely to accept it with warnings, but a few changes are needed to bring it into conformance:
#include <stdio.h>
int main(void) { printf(&unix["\021%six\012\0"],(unix)["have"]+"fun"-0x60);}
But that's not quite as much fun -- and it still depends on unix being a predefined macro that expands to 1. (Depending on the compiler, you can probably address that by compiling with -Dunix.)
Commenter Sebastian wrote:
Hmm... Whenever I try to evaluate the first arg to printf in my head, I get "21%six", but not "%six" as I would expect. Can anyone enlighten me where it went wrong?
You missed a couple of things. The format string starts with \021, an octal escape that expands (well, contracts) to a single character with the value 21 octal or 17 decimal. (The \0 at the start of that escape is not a null character by itself, because the octal digits that follow belong to the same escape; it would be one if it were followed by something other than another octal digit.) The \012 expands to character 10, which on most systems is the same as \n; \021 was probably chosen for symmetry with \012. The value of \021 doesn't matter, because it's skipped.
Remember that the array indexing operator is commutative (a[i] is defined as *(a+i), so i[a] means the same thing), as discussed here, and that unix (for some compilers in some modes) expands to 1. So the first argument to printf is:
&unix["\021%six\012\0"]
which is equivalent to:
&"\021%six\012\0"[1]
That's a string literal indexed by 1, which refers to the second character of the string, the %. Taking the address of that character gives us a string pointer (note: a pointer to a string is by definition a pointer to the string's first character) pointing to a string with the value "%six\012\0", or, equivalently, "%six\n".
So the format string is "%six\n".
The second argument is:
(unix)["have"]+"fun"-0x60
which, once you realize unix expands to 1 and indexing is commutative and that the ASCII value of 'a' is 0x61, is equivalent to the string "un". (I might go into more detail on this later.)
Taking all this into account, the printf call is equivalent to this:
printf("%six\n", "un");
and therefore to:
printf("unix\n");
@jhudsoncedaron "Undefined behavior" is defined by the C standard as "behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements". It means the standard doesn't define the behavior. If something else (say, a secondary standard like POSIX) happens to define the behavior, it's still undefined behavior in the context of the C standard.
It's perfectly legitimate for code that implements the C standard library to have undefined behavior in this sense -- as long as it works correctly as part of the implementation. Code that implements malloc, for example, doesn't even have to be written in C. (If you ported that code to a platform where the pointer comparisons fail, the result would be a non-conforming implementation.)