- Our case study is the open source fractal generator Iterated Dynamics, a fork of FRACTINT.
- This is a 30 year old code base that started as a 16-bit MS-DOS program written in C and 16-bit x86 assembly.
- The code base has contributions from many, many authors over the years.
- Source formatting and identifier conventions are varied due to the large number of authors.
- Source code organization has been tortured by the constraints of early MS-DOS programs operating in a limited amount of physical memory and the use of overlays.
- Conditional compilation was sprinkled throughout to enable various debugging facilities or experiments in alternative implementations.
- As C does not have inline functions, there are some places that heavily use preprocessor macros.
- Some of the functions are exceedingly long*.
- The program uses an inordinately large number of global variables*.
- The code exhibits almost every code smell that you could have, except for those that require object-oriented programming.
- The code uses global headers for data* and function* declarations, exacerbating compile times.
- Links to the case study repository are shown with an asterisk (*).
- Prioritize your debt; not every loan is at the same interest rate
- Get rid of daily annoyances first; they make it painful to do any work in the code
- Don't wake the dragon; if you don't need to change something now, defer it to later
- Work carefully to ensure always forward progress
- Use version control
- Proceed in small steps; commits are free
- One kind of change per commit; do not change too many things at once so mistakes can be easily reverted
- Use branches to isolate and prototype changes
- Stay focused on the task at hand; if you encounter other changes that should be made, add them to your To Do List and come back to them later
- Leverage any existing automated tests
- Consider creating automated regression tests before making risky changes
- FitNesse is an automated acceptance test framework with scaffolding for C++
- Use property-based testing
- Use fuzz testing using libFuzzer or OSS-Fuzz for open source projects
- Modernize your build with CMake
- Use CMake 3.0 at least
- Adhere to modern CMake best practices
- Legacy code bases often have multiple build scripts to support different target environments (
Makefile
for unix, VC6.dsp
files for Windows,.xcodeproj
for MacOS) - Replace these with a
CMakeLists.txt
* that generates project files for IDEs in a version-independent manner
- Upgrade your compiler to the latest version, consider C++14 a minimum as of 2018.
- Use continuous integration to ensure timely notification of problems
- Format source files consistently; use an automated tool
- Visual Studio can reformat code
- ReSharper for C++ add-on provides more formatting options to Visual Studio
- AStyle
- clang-format
- Commit style parameters (
.astyle
,.clang-format
) to the source tree - Proceed in small steps* by working one file at a time
- Some of these tools have bugs even though they are supposed to adjust only whitespace; compile frequently
- Use a periodic job to keep source code consistently formatted (cron job, jenkins, etc.) if inconsistencies are likely to creep back into your code based on past team behavior
- Name identifiers consistently
- Use a tool to automatically rename them, or...
- Rename the definition and Lean on the Compiler to find all references and rename them manually
- Proceed in small steps*; commits are free
- Watch out for inactive preprocessor blocks of code; some tools can't rename inside inactive alternatives
- Use "Find in Files" to ensure all mentions of the old names are gone
- Refactoring: Convert C to C++
- Leverage C++'s improved static type checking and facilities for data abstraction
- Keep exported functions and data as
extern "C"
if ABI compatibility is needed
- C++ has better alternatives for most uses of the preprocessor
- Look for dead code and eliminate it
- Look for conditional compilation that is no longer relevant and eliminate it
- Refactoring: Guard Macro Body
- Refactoring: Replace Macro With Inline Function*
- Refactoring: Replace Macro With Enum*
- Use clang-tidy to automatically modernize code constructs
- Running manually requires a
compile_commands.json
database:- Generate one with CMake
CMAKE_EXPORT_COMPILE_COMMANDS
on unix-like platforms - Generate one with the sourcetrail add-on for Visual Studio on Windows
- Generate one with CMake
- ReSharper for C++ 2017.3 integrates clang-tidy support
- Clang Power Tools is a free add-on for Visual Studio provides clang-tidy support
- Proceed with caution:
- The implementation of clang-tidy checks is of variable quality
- Work one file at a time until you have confidence the check can be applied globally
- Always review generated diffs for more complicated checks
- Lean on the compiler and on your automated regression suite to verify the result of transformations
- Some suggested checks:
- modernize-deprecated-headers
- modernize-loop-convert to switch to range for loops
- modernize-redundant-void-arg to remove old C-style declarations/definitions
- modernize-use-bool-literals
- modernize-use-equals-default
- modernize-use-equals-delete
- modernize-use-nullptr
- modernize-use-override
- readability-container-size-empty
- readability-delete-null-pointer
- readability-else-after-return
- readability-implicit-bool-cast is a good way to find
bool
disguised as an integral type - readability-redundant-control-flow
- readability-redundant-string-cstr
- readability-redundant-string-init
- readability-simplify-boolean-expr
- Static analysis tools to find old bugs
- OCLint
- cppcheck
- SonarQube w/open source C++ support
- clang static analyzer (clang-tidy can run static analyzer checks, too)
- Clang Power Tools is a free add-on for Visual Studio provides clang static analyzer support
- ReSharper for C++ 2017.3 integrates support for clang static analyzer checks provided by clang-tidy
- Visual Studio provides static analysis and C++ Core Guidelines checking
- Coverity is free for open source projects
- Dynamic analysis tools to find old bugs
- Use them with an automated regression test suite to exercise code at runtime
- Clang sanitizers:
- Address sanitizer detects memory errors
- Memory sanitizer detects reads of uninitialized memory
- Thread sanitizer detects data races
- Undefined behavior sanitizer
- Refactor: v. To change the structure of existing code without changing the behavior
- Modularize the "global header"*, a single header file that declares all functions and data and is included in every source file.
- Create a header file for a single source file.
- Move declarations exported from the single source file from the global header to the new header file.
- Lean on the compiler to identify all dependent source files that need to include the new header file.
- Repeat until all declarations have been removed from the global header.
- Remove the global header and all the
#include
directives of it.
- Refactoring: Split Module
- Refactoring: Guard Macro Body
- Refactoring: Replace Integer With Boolean (example*)