Skip to content

Instantly share code, notes, and snippets.

@Muffo Muffo/codingstyle.md
Last active Aug 29, 2015

Embed
What would you like to do?
OpenSSL coding style - standard

This document describes the coding style for the OpenSSL project. It is is derived from the Linux kernel coding style, which can be found at: https://www.kernel.org/doc/Documentation/CodingStyle This guide is not distributed as part of OpenSSL itself. Since it is derived from the Linux Kernel Coding Style, it is distributed under the terms of the kernel license, available here: https://www.kernel.org/pub/linux/kernel/COPYING

Coding style is all about readability and maintainability using commonly available tools. OpenSSL coding style is simple. Avoid tricky expressions.

Chapter 1: Indentation

Indentation is four space characters. Do not use the tab character.

Pre-processor directives use one space for indents:

#if
# define
#else
# define
#endif

Chapter 2: Breaking long lines and strings

Don't put multiple statements, or assignments, on a single line.

if (condition) do_this();
    do_something_everytime();

The limit on the length of lines is 80 columns. Statements longer than 80 columns must be broken into sensible chunks, unless exceeding 80 columns significantly increases readability and does not hide information. Descendants are always substantially shorter than the parent and are placed substantially to the right. The same applies to function headers with a long argument list. Never break user-visible strings, however, because that breaks the ability to grep for them.

Chapter 3: Placing Braces and Spaces

The other issue that always comes up in C styling is the placement of braces. Unlike the indent size, there are few technical reasons to choose one placement strategy over the other, but the preferred way, following Kernighan and Ritchie, is to put the opening brace last on the line, and the closing brace first:

if (x is true) {
    we do y
}

This applies to all non-function statement blocks (if, switch, for, while, do): switch (suffix) { case 'G': case 'g': mem <<= 30; break; case 'M': case 'm': mem <<= 20; break; case 'K': case 'k': mem <<= 10; /* fall through */ default: break; }

Note, from the above example, that the way to indent a switch statement is to align the switch and its subordinate case labels in the same column instead of "double-indenting" the case bodies. There is one special case, however. Functions have the opening brace at the beginning of the next line:

int function(int x)
{
    body of function
}

Note that the closing brace is empty on a line of its own, EXCEPT in the cases where it is followed by a continuation of the same statement, such as a "while" in a do-statement or an "else" in an if-statement, like this:

do {
    ...
} while (condition);

and

if (x == y) {
    ...
} else if (x > y) {
    ...
} else {
    ...
}

In addition to being consistent with K&R, note that that this brace-placement also minimizes the number of empty (or almost empty) lines. Since the supply of new-lines on your screen is not a renewable resource (think 25-line terminal screens here), you have more empty lines to put comments on. Do not unnecessarily use braces around a single statement:

if (condition)
    action();

and

if (condition)
    do_this();
else
    do_that();

If one of the branches is a compound statement, then use braces on both parts:

if (condition) {
    do_this();
    do_that();
} else {
    otherwise();
}

Nested compound statements should often have braces for clarity, particularly to avoid the dangling-else problem:

if (condition) {
    do_this();
    if (anothertest)
        do_that();
} else {
    otherwise();
}

Chapter 3.1: Spaces

OpenSSL style for use of spaces depends (mostly) on whether the name is a function or keyword. Use a space after most keywords: if, switch, case, for, do, while, return

Do not use a space after sizeof, typeof, alignof, or __attribute__.

They look somewhat like functions and are usually used with parentheses in OpenSSL, although they are not required in the language:

s = sizeof(struct file);

Do not add spaces around the inside of parenthesized expressions. This example is wrong:

s = sizeof( struct file );

When declaring pointer data or a function that returns a pointer type, the the asterisk goes next to the data or function name, and not the type: char *openssl_banner;

unsigned long long memparse(char *ptr, char **retptr);
char *match_strdup(substring_t *s);

Use one space on either side of binary and ternary operators, such as this partial list:

=  +  -  <  >  *  /  %  |  &  ^  <=  >=  ==  !=  ?  : +=

Do not put a space after unary operators:

&  *  +  -  ~  !  defined

Do not put a space before the postfix increment and decrement unary operators or after the prefix increment and decrement unary operators:

foo++
--bar

Do not put a space around the '.' and "->" structure member operators:

foo.bar
foo->bar

Do not leave trailing whitespace at the ends of lines. Some editors with "smart" indentation will insert whitespace at the beginning of new lines as appropriate, so you can start typing the next line of code right away. But they may not remove that whitespace if you leave a blank line, however, and you end up with lines containing trailing, or nothing but, whitespace. Git will warn you about patches that introduce trailing whitespace, and can optionally strip the trailing whitespace; however, if applying a series of patches, this may make later patches in the series fail by changing their context lines.

Chapter 4: Naming

C is a Spartan language, and so should your naming be. Do not use long names like ThisVariableIsATemporaryCounter. Use a name like tmp, which is much easier to write, and not more difficult to understand. Except when otherwise required, avoid mixed-case names. Do not encode the type into a name (so-called Hungarian notation). Global variables (to be used only if you REALLY need them) need to have descriptive names, as do global functions. If you have a function that counts the number of active users, you should call that count_active_users() or similar, you should NOT call it cntusr(). Local variable names should be short, and to the point. If you have some random integer loop counter, it should probably be called i. Calling it loop_counter is non-productive, if there is no chance of it being mis-understood. Similarly, tmp can be just about any type of variable that is used to hold a temporary value. If you are afraid that someone might mix up your local variable names, perhaps the function is too long; see Chapter 6.

Chapter 5: Typedefs

OpenSSL uses typedef's extensively. For structures, they are all uppercase and are usually declared like this:

typedef struct name_st NAME;

For examples, look in ossl_type.h, but note that there are many exceptions such as BN_CTX. Typedef'd enum is used much less often and there is no convention, so consider not using a typedef. When doing that, the enum name is should be lowercase and the values (mostly) uppercase. The ASN.1 structures are an exception to this. The rationale is that if a structure (and its fields) is already defined in a standard it's more convenient to use a similar name. For example, in the CMS code, a CMS_ prefix is used so ContentInfo becomes CMS_ContentInfo, RecipientInfo becomes CMS_RecipientInfo etc. Some older code uses an all uppercase name instead. For example, RecipientInfo for the PKCS#7 code uses PKCS7_RECIP_INFO. Be careful about common names which might cause conflicts. For example, Windows headers use X509 and X590_NAME. Consider using a prefix, as with CMS_ContentInfo, if the name is common or generic. Of course, you often don't find out until the code is ported to other platforms. A final word on struct's. OpenSSL has has historically made all struct definitions public; this has caused problems with maintaining binary compatibility and adding features. Our stated direction is to have struct's be opaque and only expose pointers in the API. The actual struct definition should be defined in a local header file that is not exported.

Chapter 6: Functions

Ideally, functions should be short and sweet, and do just one thing. A rule of thumb is that they should fit on one or two screenfuls of text (25 lines as we all know), and do one thing and do that well. The maximum length of a function is often inversely proportional to the complexity and indentation level of that function. So, if you have a conceptually simple function that is just one long (but simple) switch statement, where you have to do lots of small things for a lot of different cases, it's OK to have a longer function. If you have a complex function, however, consider using helper functions with descriptive names. You can ask the compiler to in-line them if you think it's performance-critical, and it will probably do a better job of it than you would have done. Another measure of complexity is the number of local variables. If there are more than five to 10, consider splitting it into smaller pieces. A human brain can generally easily keep track of about seven different things, anything more and it gets confused. Often things which are simple and clear now are much less obvious two weeks from now, or to someone else. An exception to this is the command-line applications which support many options. In source files, separate functions with one blank line. In function prototypes, include parameter names with their data types. Although this is not required by the C language, it is preferred in OpenSSL because it is a simple way to add valuable information for the reader. The name in the prototype declaration should match the name in the function definition.

Chapter 7: Centralized exiting of functions

The goto statement comes in handy when a function exits from multiple locations and some common work such as cleanup has to be done. If there is no cleanup needed then just return directly. The rationale for this is as follows:

  • Unconditional statements are easier to understand and follow
  • It can reduce excessive control structures and nesting
  • It avoids errors caused by failing to updated multiple exit points when the code is modified
  • It saves the compiler work to optimize redundant code away ;)

For example:

int fun(int a)
{
    int result = 0;
    char *buffer = OPENSSL_malloc(SIZE);
    if (buffer == NULL)
        return -1;
    if (condition1) {
        while (loop1) {
            ...
        }
        result = 1;
        goto out;
    }
    ...
out:
    OPENSSL_free(buffer);
    return result;
}

Chapter 8: Commenting

Use the classic /* ... */ comment markers. Don't use // ... markers. Comments are good, but there is also a danger of over-commenting. NEVER try to explain HOW your code works in a comment. It is much better to write the code so that it is obvious, and it's a waste of time to explain badly written code. You want your comments to tell WHAT your code does, not HOW. The preferred style for long (multi-line) comments is:

/*-
 * This is the preferred style for multi-line
 * comments in the OpenSSL source code.
 * Please use it consistently.
 *
 * Description:  A column of asterisks on the left side,
 * with beginning and ending almost-blank lines.
 */

Note the initial hypen to prevent indent from modifying the comment. Use this if the comment has particular formatting that must be preserved. It's also important to comment data, whether they are basic types or derived types. To this end, use just one data declaration per line (no commas for multiple data declarations). This leaves you room for a small comment on each item, explaining its use.

Chapter 9: Deleted

Chapter 10: Deleted

Chapter 11: Deleted

Chapter 12: Macros and Enums

Names of macros defining constants and labels in enums are in uppercase:

#define CONSTANT 0x12345

Enums are preferred when defining several related constants. Macro names should be in uppercase, but macros resembling functions may be written in lower case. Generally, inline functions are preferable to macros resembling functions. Macros with multiple statements should be enclosed in a do - while block:

#define macrofun(a, b, c)   \
    do {                    \
        if (a == 5)         \
            do_this(b, c);  \
    } while (0)

Do not write macros that affect control flow:

#define FOO(x)                 \
    do {                       \
        if (blah(x) < 0)       \
            return -EBUGGERED; \
    } while(0)

Do not write macros that depend on having a local variable with a magic name:

#define FOO(val) bar(index, val)

It is confusing to the reader and is prone to breakage from seemingly innocent changes. Do not write macros that are l-values:

FOO(x) = y

This will cause problems if, e.g., FOO becomes an inline function. Be careful of precedence. Macros defining constants using expressions must enclose the expression in parentheses:

#define CONSTANT 0x4000
#define CONSTEXP (CONSTANT | 3)

Beware of similar issues with macros using parameters. The GNU cpp manual deals with macros exhaustively.

Chapter 13: Deleted

Chapter 14: Allocating memory

OpenSSL provides the following general purpose memory allocators: OPENSSL_malloc(), OPENSSL_realloc(), OPENSSL_strdup() and OPENSSL_free(). Please refer to the API documentation for further information about them.

Chapter 15: Deleted

Chapter 16: Function return values and names

Functions can return values of many different kinds, and one of the most common is a value indicating whether the function succeeded or failed. Usually this is:

  • 1: success
  • 0: failure

Sometimes an additional value is used:

  • -1: something bad (e.g., internal error or memory allocation failure)

Other API's use the following pattern:

  • >= 1: success, with value returning additional information
  • <= 0: failure with return value indicating why things failed

Sometimes a return value of -1 can mean "should retry" (e.g., BIO, SSL, et al). Functions whose return value is the actual result of a computation, rather than an indication of whether the computation succeeded, are not subject to these rules. Generally they indicate failure by returning some out-of-range result. The simplest example is functions that return pointers; they use NULL to report failure.

Chapter 17: Deleted

Chapter 18: Editor modelines

Some editors can interpret configuration information embedded in source files, indicated with special markers. For example, emacs interprets lines marked like this:

-*- mode: c -*-

Or like this:

/*
Local Variables:
compile-command: "gcc -DMAGIC_DEBUG_FLAG foo.c"
End:
*/

Vim interprets markers that look like this:

/* vim:set sw=8 noet */

Do not include any of these in source files. People have their own personal editor configurations, and your source files should not override them. This includes markers for indentation and mode configuration. People may use their own custom mode, or may have some other magic method for making indentation work correctly.

Chapter 19: Processor-specific code

In OpenSSL case the only reason to resort for processor-specific code is for performance. As it still exists in general platform-independent algorithm context, it has to be always backed up by neutral pure C one. This implies certain limitations. The most common way to resolve this conflict is to opt for short inline assembly function-like snippets, customarily implemented as macros, so that they can be easily interchanged with other platform-specific or neutral code. As with any macro, try to implement it as single expression. You may need to mark your asm statement as volatile, to prevent GCC from removing it if GCC doesn't notice any side effects. You don't always need to do so, though, and doing so unnecessarily can limit optimization. When writing a single inline assembly statement containing multiple instructions, put each instruction on a separate line in a separate quoted string, and end each string except the last with \n\t to properly indent the next instruction in the assembly output:

asm ("magic %reg1, #42\n\t"
         "more_magic %reg2, %reg3"
         : /* outputs */ : /* inputs */ : /* clobbers */);

Large, non-trivial assembly functions go in pure assembly modules, with corresponding C prototypes defined in C. The preferred way to implement this is so-called "perlasm": instead of writing real .s file, you write a perl script that generates one. This allows use symbolic names for variables (register as well as locals allocated on stack) that are independent on specific assembler. It simplifies implementation of recurring instruction sequences with regular permutation of inputs. By adhering to specific coding rules, perlasm is also used to support multiple ABIs and assemblers, see crypto/perlasm/x86_64-xlate.pl for an example. Another option for processor-specific primarily SIMD capabilities is called compiler intrinsics. We avoid this, because it's not very much less complicated than coding pure assembly, and it doesn't provide same performance guarantee across different micro-architecture. Nor is it portable enough to meet our multi-platform support goals.

Chapter 20: Portability

To maximise portability the version of C defined in ISO/IEC 9899:1990 should be used. This is more commonly referred to as C90. ISO/IEC 9899:1999 (also known as C99) is not supported on some platforms that OpenSSL is used on and therefore should be avoided.

Appendix A: References

The C Programming Language, Second Edition by Brian W. Kernighan and Dennis M. Ritchie. Prentice Hall, Inc., 1988. ISBN 0-13-110362-8 (paperback), 0-13-110370-9 (hardback). URL: http://cm.bell-labs.com/cm/cs/cbook/

The Practice of Programming by Brian W. Kernighan and Rob Pike. Addison-Wesley, Inc., 1999. ISBN 0-201-61586-X. URL: http://cm.bell-labs.com/cm/cs/tpop/

GNU manuals - where in compliance with K&R and this text - for cpp, gcc, gcc internals and indent, all available from http://www.gnu.org/manual/

WG14 is the international standardization working group for the programming language C, URL: http://www.open-std.org/JTC1/SC22/WG14/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.