Skip to content

Instantly share code, notes, and snippets.

@daniel-j-h
Created May 6, 2014 12:49
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save daniel-j-h/06f16015c89bec4fbb42 to your computer and use it in GitHub Desktop.
Save daniel-j-h/06f16015c89bec4fbb42 to your computer and use it in GitHub Desktop.
ceph/ceph @24c5ea8: -Weverything

A plea for more tooling usage

Motivation

I tried to build Ceph with the Clang (3.4) compiler. The only issue preventing the build to finish was a "VLA of non-POD type" usage which I fixed here:

ceph/ceph#1768

While the build was resuming I was wondering why no one stumpled upon this issue in the first place. Has nobody ever tried to compile Ceph with Clang?

So I did what I'm always doing to quick-check a project's code quality: I set the compiler to Clang. I set flags to -Weverything.

The idea is to gradually disable warnings one by one that are not interesting.

Instead of doing this by hand I used the following flags that I accumulated over the last weeks or month to disable a handful of not so important warnings:

-Weverything -Wno-padded -Wno-c++11-long-long -Wno-variadic-macros -Wno-c++11-extensions -Wno-global-constructors -Wno-exit-time-destructors -Wno-sign-conversion -Wno-weak-vtables -Wno-disabled-macro-expansion -Wno-c99-extensions -Wno-shadow -Wno-documentation-unknown-command -Wno-undef -Wno-documentation -Wno-duplicate-enum -Wno-switch-enum -Wno-unused-parameter -Wno-packed

Let's build Ceph and have a look at what Clang's -Weverything catches:

./autogen.sh
./configure --without-libxfs
make -j 1 2> diagnostics.txt

Many warnings are duplicates because of header includes and the like. Let's remove duplicates and only keep unique diagnostics:

$ egrep "\[-.*\]" diagnostics.txt | sort | uniq > uniq.txt
$ wc -l uniq.txt 
12286 uniq.txt

Wow. 12286 unique diagnostics. Interested in what showed up?

How bad is it

I ended up going through the diagnostic categories, because I had a few hours to spare. The methodology I used was the following:

Check for diagnostic categories, prioritize them, and after that remove all the occurences in the original diagnostics file by:

$ sed -i '/Wflag/d' uniq.txt

In the following I'll give you a quick summary of the most important diagnostics. After this there are still some diagnostics left.

$ wc -l uniq.txt 
3362 uniq.txt

This can be explained by diagnostics that are not important at all, such as unused exception parameter in a catch clause.

But there are also conversions between types that have to be studied one by one to check if a loss in precision or comparison of types with different signedness could lead to dangerous behavior.

Note: Clang's diagnostics can be customized, too:

http://clang.llvm.org/docs/UsersManual.html#formatting-of-diagnostics

So let's look through the categories.

Undefined Behavior

warning: 'va_start' has undefined behavior with reference types [-Wvarargs]

$ grep Wvarargs uniq.txt | wc -l
1

Here's the paragraph from the C++11 Standard, 18.10 §3:

If the parameter parmN is declared with a function, array, or reference type, or with a type that is not compatible with the type that results when passing an argument for which there is no parameter, the behavior is undefined.

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3337.pdf

Conditional Uninitialized

warning: variable '%' may be uninitialized when used here [-Wconditional-uninitialized]

$ grep Wconditional-uninitialized uniq.txt | wc -l
4

Unreachable Code

warning: will never be executed [-Wunreachable-code]

$ grep Wunreachable-code uniq.txt | wc -l
155

Extensions to the C++ language

There's this vast amount of code that is definitely not conforming to standard C++. Some extensions are used, such as Variable Length Arrays (VLAs, which were the reason for my issue in the first place), some GCC specificas.

You probably have to decide if you want to go this route or if you want to do it the portable and standard C++ way.

VLAs are a good example where the standard C++ way of doing it is in no way harder to accomplish. See the following explanation:

http://clang.llvm.org/compatibility.html#vla

warning: anonymous types declared in an anonymous union are an extension [-Wnested-anon-types]

$ grep Wnested-anon-types uniq.txt  | wc -l
28

warning: zero size arrays are an extension [-Wzero-length-array]

$ grep Wzero-length-array uniq.txt | wc -l
8134

warning: anonymous structs are a GNU extension [-Wgnu-anonymous-struct]

$ grep Wgnu-anonymous-struct uniq.txt  | wc -l
2

warning: '%' may not be nested in a struct due to flexible array member [-Wflexible-array-extensions]

$ grep Wflexible-array-extensions uniq.txt | wc -l
2

warning: variable length array used [-Wvla] warning: variable length arrays are a C99 feature [-Wvla-extension]

$ grep Wvla uniq.txt | wc -l
178

warning: cast between pointer-to-function and pointer-to-object is an extension [-Wpedantic] warning: use of non-standard escape character '%' [-Wpedantic]

$ grep Wpedantic uniq.txt | wc -l
15

warning: use of GNU old-style field designator extension [-Wgnu-designator]

$ grep Wgnu-designator uniq.txt | wc -l
43

warning: anonymous unions are a C11 extension [-Wc11-extensions]

$ grep Wc11-extensions uniq.txt | wc -l
3

Miscellaneous:

warning: '%' hides overloaded virtual function [-Woverloaded-virtual]

$ grep Woverloaded-virtual uniq.txt | wc -l
3

warning: using namespace directive in global context in header [-Wheader-hygiene]

$ grep Wheader-hygiene uniq.txt | wc -l
87

warning: extra ';' after member function definition [-Wextra-semi]

$ grep Wextra-semi uniq.txt | wc -l
135

warning: struct '%' was previously declared as a class [-Wmismatched-tags]

$ grep Wmismatched-tags uniq.txt | wc -l
93

warning: missing field '%' initializer [-Wmissing-field-initializers]

$ grep Wmissing-field-initializer uniq.txt | wc -l
3

warning: private field '%' is not used [-Wunused-private-field]

$ grep Wunused-private-field uniq.txt | wc -l
11

warning: function '%' could be declared with attribute 'noreturn' [-Wmissing-noreturn]

$ grep Wmissing-noreturn uniq.txt | wc -l
23

warning: token pasting of ',' and VA_ARGS is a GNU extension [-Wgnu-zero-variadic-macro-arguments]

$ grep Wgnu-zero-variadic-macro-arguments uniq.txt | wc -l
4

Style

While fixing my initial VLA issue I skimmed through the relevant code in src/rgw/rgw_main.cc.

What I noticed is that already in this file there's no consistent style. There's a "using namespace std;" at the top, then in the file there's an alternate usage between vector and std::vectorstd::string.

It also seems to me that the memory management is often done by hand, e.g. in line 377 "string *objs = new string[num_objs];" and later the "delete []" instead of just using a std::vectorstd::string.

Summary

Using Clang-3.4's -Weverything reveals critical issues.

What I hope for Ceph's general code quality is to improve by using the tools at hand. This means especially:

Do the same check with an up to date Clang (that's one of the reasons I'm not pointing out all the issues or the concrete source code positions), look through them in detail, such as conversions and left-out diagnostics, and fix as many as critical issues as possible.

You probably could even use Clang's -fsanitize flag to check for issues at runtime. http://clang.llvm.org/docs/UsersManual.html#controlling-code-generation

Also keep in mind that this check was only done on the userspace code in the ceph/ceph repository.

@dalgaaf
Copy link

dalgaaf commented May 8, 2014

You need at least to filter out all warnings that are caused by code that is not under control of the ceph project like the most git submodules and some code that was copied e.g. from the kernel or other generic projects (like the gtest code). This will reduce the number of warnings.

@daniel-j-h
Copy link
Author

Although it will reduce the number of warnings, the argumentation is flawed.

If there's e.g. undefined behavior in a library that ceph uses there exists a transitive relation such that now ceph also suffers from the undefined behavior. Therefore I wanted to do the check for the complete build process, not just ceph's core.

If you really want to do this it's trivial to filter the diagnostics by their path prefix.

@dalgaaf
Copy link

dalgaaf commented May 10, 2014

I don't say: ignore the warnings in external code completely but don't count them for ceph. Since the code state in ceph for these external projects are may not up-to-date it's hard to spend time on them since they may no longer exist in the the latest code base. It's maybe better to ask these projects to use coverity, cppchecker, clang or other tools for better code quality.

And on the other hand we need to think about update the external code on a more regular base to profit from fixes there.

Btw. I fix issues found by coverity and other tools in these projects on a regular base already.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment