korn.c is the "Best One Liner" winner of the 1987 International Obfuscated C Code Contest, by David Korn (yes, the author of the Korn Shell). korn.hint, as the name implies, offers some hints.
A commenter on Stack Overflow asked for some clarification. I didn't want to post spoilers on the site, so I'm posting them here instead. If you haven't already (and if you're familiar with the rules of C) I encourage you to study the program for a while first.
=====
Here's the code:
main() { printf(&unix["\021%six\012\0"],(unix)["have"]+"fun"-0x60);}
This was written 2 years before the ANSI standard was published. Modern compilers are likely to accept it with warnings, but a few changes are needed to bring it into conformance:
#include <stdio.h>
int main(void) { printf(&unix["\021%six\012\0"],(unix)["have"]+"fun"-0x60);}
But that's not quite as much fun -- and it still depends on unix being a predefined macro that expands to 1. (Depending on the compiler, you can probably address that by compiling with -Dunix.)
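If you'd rather not pass -Dunix on the command line, the same effect can be sketched in the source itself. This is only an illustration: the names UNIX_PREDEFINED, unix_was_predefined and unix_value are made up for this example, and the fallback simply assumes that 1 is the value the program needs.

```c
/* Record whether the compiler predefined `unix`; if it did not,
   define it as 1, which is what -Dunix would do. */
#ifdef unix
#define UNIX_PREDEFINED 1
#else
#define UNIX_PREDEFINED 0
#define unix 1
#endif

/* Hypothetical helpers, for inspection only. */
int unix_was_predefined(void) { return UNIX_PREDEFINED; }
int unix_value(void) { return unix; }
```

Either way, unix_value() yields 1, so the rest of the program can rely on the macro regardless of compiler mode.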
Commenter Sebastian wrote:
Hmm... Whenever I try to evaluate the first arg to printf in my head, I get "21%six", but not "%six" as I would expect. Can anyone enlighten me where it went wrong?
You missed a couple of things. The format string starts with \021, an octal escape that expands (well, contracts) to a single character with the value 21 octal or 17 decimal. (The leading \0 of \021 doesn't expand to a null character by itself, though it would if it were followed by something other than another octal digit.) The \012 expands to character 10, which on most systems is the same as \n; \021 was probably chosen for symmetry with \012. The value of \021 doesn't matter, because it's skipped.
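To make the escape sequences concrete, here is a small sketch that stores the literal in an array and checks each claim. The names korn_fmt and check_korn_escapes are invented for this example, and the comparison with '\n' assumes an ASCII execution character set.

```c
#include <assert.h>
#include <string.h>

/* The literal from korn.c, stored in an array so its size can be inspected. */
static const char korn_fmt[] = "\021%six\012\0";

/* Hypothetical self-check of how the escapes are interpreted. */
void check_korn_escapes(void)
{
    assert(sizeof korn_fmt == 8);   /* 7 written characters + implicit '\0' */
    assert(strlen(korn_fmt) == 6);  /* stops at the explicit \0 */
    assert(korn_fmt[0] == 17);      /* \021 is octal 21, one character */
    assert(korn_fmt[5] == '\n');    /* \012 is octal 12, i.e. 10 (ASCII) */
    assert(korn_fmt[6] == '\0');    /* the trailing \0 is a real null character */
}
```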
Remember that the array indexing operator is commutative (a[i] is defined as *(a + i), so i[a] means the same thing), and that unix (for some compilers in some modes) expands to 1. So the first argument to printf is:
&unix["\021%six\012\0"]
which is equivalent to:
&"\021%six\012\0"[1]
That's a string literal indexed by 1, which refers to the second character of the string, the %. Taking the address of that character gives us a string pointer (note: a pointer to a string is by definition a pointer to the string's first character) pointing to a string with the value "%six\012\0", or, equivalently, "%six\n".
So the format string is "%six\n".
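The equivalence can be checked with a short sketch. Here the literal 1 stands in for the unix macro, and the names fmt and korn_format_string are made up for illustration. The string is kept in a single array because two identical string literals are not guaranteed to share an address.

```c
#include <assert.h>

static const char fmt[] = "\021%six\012\0";

/* Returns the pointer that korn.c passes as printf's first argument,
   with the literal 1 standing in for the unix macro. */
const char *korn_format_string(void)
{
    /* i[a] and a[i] both mean *(a + i), so these are the same address. */
    assert(&1[fmt] == &fmt[1]);
    assert(&fmt[1] == fmt + 1);
    return &1[fmt];
}
```

On an ASCII system the returned pointer compares equal, via strcmp, to "%six\n".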
The second argument is:
(unix)["have"]+"fun"-0x60
which, once you realize that unix expands to 1, that indexing is commutative, and that the ASCII value of 'a' is 0x61, is equivalent to the string "un". (I might go into more detail on this later.)
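Broken into steps, the arithmetic looks roughly like this. Note that this sketch is not the original expression: it computes the integer offset first and only then adds it to the pointer, and it uses the literal 1 in place of unix. The function name korn_second_arg_steps is invented, and the values assume ASCII.

```c
/* Step-by-step version of the second argument (assumes ASCII). */
const char *korn_second_arg_steps(void)
{
    const char *fun = "fun";
    char c = 1["have"];      /* same as "have"[1], i.e. 'a' (0x61) */
    int offset = c - 0x60;   /* 0x61 - 0x60 == 1 */
    return fun + offset;     /* points at the 'u', so the string is "un" */
}
```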
Taking all this into account, the printf call is equivalent to this:
printf("%six\n", "un");
and therefore to:
printf("unix\n");
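As a quick sanity check of that last step, the simplified call can be redirected into a buffer with snprintf so the result is inspectable. The helper name render_unix is hypothetical.

```c
#include <stdio.h>

/* Formats the simplified call into buf instead of printing it.
   Returns the number of characters written (excluding the '\0'). */
int render_unix(char *buf, size_t size)
{
    return snprintf(buf, size, "%six\n", "un");
}
```

With a buffer of, say, 16 bytes, the result is the five characters of "unix\n".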
There's actually a bit of undefined behavior in this program. Consider the process of evaluating the second argument. After evaluating (unix)["have"] to 'a', you're left with 'a'+"fun"-0x60. By order of operations, this is evaluated as ('a'+"fun")-0x60. "fun" is a char array of size 4 ({'f', 'u', 'n', '\0'}), and 'a' is equal to 97. The result of the addition is a pointer that points neither into nor just beyond said array. Under the C standard's rules for pointer addition, producing such a pointer is undefined behavior.
So it doesn't matter that the subsequent subtraction of 0x60 (96) "should" result in a pointer to the 'u' in "fun" (even though it does on practically every platform), as the initial addition has already rendered the entire program undefined. Clang has a warning about this. Were the second argument instead -0x60+(unix)["have"]+"fun", it would be well-defined and behave as required.
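Here is a sketch of the reordered version, written to a buffer so the result can be checked. The function name korn_well_defined is invented for this example, and the #ifndef fallback is an assumption added so the sketch also compiles where unix is not predefined. Because -0x60 + 'a' is evaluated first and yields the integer 1, the pointer addition stays inside the array.

```c
#include <stdio.h>

/* Assumption: fall back to 1 when the compiler does not predefine unix. */
#ifndef unix
#define unix 1
#endif

/* Same call as korn.c, but with the integer arithmetic done before the
   pointer addition: (-0x60 + 'a') + "fun" is 1 + "fun" (assuming ASCII). */
int korn_well_defined(char *buf, size_t size)
{
    return snprintf(buf, size, &unix["\021%six\012\0"],
                    -0x60 + (unix)["have"] + "fun");
}
```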