Discussion of korn.c, 1987 IOCCC entry, mentioned in

korn.c is the "Best One Liner" winner of the 1987 International Obfuscated C Code Contest, by David Korn (yes, the author of the Korn Shell).

korn.hint, as the name implies, offers some hints.

A commenter on Stack Overflow asked for some clarification. I didn't want to post spoilers on the site, so I'm posting them here instead. If you haven't already (and if you're familiar with the rules of C) I encourage you to study the program for a while first.


Here's the code:

main() { printf(&unix["\021%six\012\0"],(unix)["have"]+"fun"-0x60);}

This was written 2 years before the ANSI C standard was published. Modern compilers are likely to accept it with warnings, but a few changes are needed to bring it into conformance:

#include <stdio.h>
int main(void) { printf(&unix["\021%six\012\0"],(unix)["have"]+"fun"-0x60);}

But that's not quite as much fun -- and it still depends on unix being a predefined macro that expands to 1. (Depending on the compiler, you can probably address that by compiling with -Dunix.)

Commenter Sebastian wrote:

Hmm... Whenever I try to evaluate the first arg to printf in my head, I get "21%six", but not "%six" as I would expect. Can anyone enlighten me where it went wrong?

You missed a couple of things. The format string starts with \021, an octal escape that expands (well, contracts) to a single character with the value 21 octal, or 17 decimal. (The \0 at the start of that escape is not a null character by itself, because the octal digits that follow belong to the same escape; it would be one if it were followed by something other than an octal digit.) The \012 expands to character 10, which on most systems is the same as \n; \021 was probably chosen for symmetry with \012. The value of \021 doesn't matter, because that character is skipped.

Remember that the array indexing operator is commutative, as discussed here, and that unix (for some compilers in some modes) expands to 1. So the first argument to printf is:

&unix["\021%six\012\0"]

which is equivalent to:

&"\021%six\012\0"[1]

That's a string literal indexed by 1, which refers to the second character of the string, the %. Taking the address of that character gives us a string pointer (note: a pointer to a string is by definition a pointer to the string's first character) pointing to a string with the value "%six\012\0", or, equivalently, "%six\n".

So the format string is "%six\n".

The second argument is:

(unix)["have"]+"fun"-0x60

which, once you realize unix expands to 1 and indexing is commutative and that the ASCII value of 'a' is 0x61, is equivalent to the string "un". (I might go into more detail on this later.)

Taking all this into account, the printf call is equivalent to this:

printf("%six\n", "un");

and therefore to:

printf("unix\n");


@mcornella mcornella commented Nov 15, 2013

If you want to compile the original source you can include and define from the command line:
gcc -include "stdio.h" -Dunix main.c -o main.exe

Also this works too and it has unix cats :)

main() { printf(&unix["\021%six\012\0"],(unix)["cats"]+"run"-0x60);}


@josephcsible josephcsible commented Oct 13, 2018

There's actually a bit of undefined behavior in this program. Consider the process of evaluating the second argument. After evaluating (unix)["have"] to 'a', you're left with 'a'+"fun"-0x60. By order of operations, this is evaluated as ('a'+"fun")-0x60. "fun" is a char array of size 4 ({'f', 'u', 'n', '\0'}), and 'a' is equal to 97. The result of the addition is a pointer that points neither into nor just beyond said array.

From the C standard:

The behavior is undefined in the following circumstances: [...] Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that does not point into, or just beyond, the same array object (6.5.6).

So it doesn't matter that the subsequent subtraction of 0x60 (96) "should" result in a pointer to the 'u' in "fun" (even though it does on practically every platform), as the initial addition has already rendered the entire program undefined. Clang has a warning about this:

korn.c:1:57: warning: the pointer incremented by 97 refers past the end of the
      array (that contains 4 elements) [-Warray-bounds-pointer-arithmetic]
        main() { printf(&unix["\021%six\012\0"],(unix)["have"]+"fun"-0x60);}
                                                ~~~~~~~~~~~~~~ ^

Were the second argument to instead be -0x60+(unix)["have"]+"fun", it would then be well-defined to behave as required.



@jhudsoncedaron jhudsoncedaron commented Aug 14, 2020

@josephcsible: It's not undefined behavior; the old platform defined general pointer comparison and arithmetic to work.



@josephcsible josephcsible commented Aug 14, 2020

@jhudsoncedaron: What's "the old platform"? Where does it say those things are defined to work?



@jhudsoncedaron jhudsoncedaron commented Aug 14, 2020

@josephcsible: The reference copy of the C standard library included a copy of malloc that depended on arbitrary pointer arithmetic and comparison working. I'm not exactly sure which versions I looked at anymore, but it wasn't until C was being ported off unix that weird pointer arithmetic became itself undefined.


Owner Author

@Keith-S-Thompson Keith-S-Thompson commented Aug 14, 2020

@jhudsoncedaron "Undefined behavior" is defined by the C standard as "behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements". It means the standard doesn't define the behavior. If something else (say, a secondary standard like POSIX) happens to define the behavior, it's still undefined behavior in the context of the C standard.

It's perfectly legitimate for code that implements the C standard library to have undefined behavior in this sense -- as long as it works correctly as part of the implementation. Code that implements malloc, for example, doesn't even have to be written in C. (If you ported that code to a platform where the pointer comparisons fail, the result would be a non-conforming implementation.)
