Undefined behavior | |
================== | |
http://blog.regehr.org/archives/213 | |
...If any step in a program's execution has undefined behavior, | |
then the entire execution is without meaning. This is important: | |
it's not that evaluating (1<<32) has an unpredictable result, | |
but rather that the entire execution of a program that evaluates | |
this expression is meaningless. Also, it's not that the execution | |
is meaningful up to the point where undefined behavior happens: | |
the bad effects can actually precede the undefined operation... | |
Integer overflow | |
'''''''''''''''' | |
Checking for it: | |
Since signed integer overflow is undefined behavior, the check has to
be performed before the arithmetic operation. If it were performed
after the operation, the compiler might optimize the check out, since
it may assume that the undefined program path (= overflow) never
occurs.
Some quotes from Stackoverflow: | |
The undefined behavior of signed arithmetic overflow is used to | |
enable optimizations; for example, the compiler can assume that | |
if a > b then a + 1 > b also; this doesn't hold in unsigned | |
arithmetic where the second check would need to be carried out | |
because of the possibility that a + 1 might wrap around to 0. | |
... | |
Detection of the integer overflow should be done BEFORE the | |
actual addition/subtraction because of possible undefined behavior. | |
Another example: | |
| int stupid (int a) | |
| { | |
| return (a+1) > a; | |
| } | |
The precondition for avoiding undefined behavior is: | |
(a != INT_MAX) | |
Here the case analysis done by an optimizing C or C++ compiler is: | |
Case 1: a != INT_MAX | |
Behavior of + is defined -> Computer is obligated to return 1 | |
Case 2: a == INT_MAX | |
Behavior of + is undefined -> Compiler has no particular | |
obligations | |
Again, Case 2 is degenerate and disappears from the compiler’s | |
reasoning. Case 1 is all that matters. Thus, a good x86-64 | |
compiler will emit: | |
| stupid: | |
| movl $1, %eax | |
| ret | |
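The "check before the operation" advice above can be sketched as follows.
This is only an illustration; the helper name checked_add is hypothetical
and not part of any standard library. It compares against INT_MAX and
INT_MIN first, so the addition is only executed when it cannot overflow.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a + b in *sum and returns true only when
 * the addition cannot overflow. The test runs BEFORE the addition, so
 * no undefined behavior is ever executed. */
bool checked_add(int a, int b, int *sum)
{
    if (b > 0 && a > INT_MAX - b)
        return false;   /* a + b would exceed INT_MAX */
    if (b < 0 && a < INT_MIN - b)
        return false;   /* a + b would fall below INT_MIN */
    *sum = a + b;
    return true;
}
```

A caller would test the return value and use *sum only when it is true.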
General | |
''''''' | |
http://stackoverflow.com/questions/367633/what-are-all-the-common-undefined-behaviours-that-a-c-programmer-should-know-a | |
* Dereferencing a NULL pointer (this is not about memory location 0;
  NULL is not always defined as ((void *)0), it is a 'special pointer'
  that must not be dereferenced)
* Converting pointers to objects of incompatible types (convert the pointer either
  to a pointer to another data type or to 'uintptr_t'; conversion to int and long
  might (?) be possible, but try to avoid it since it may truncate the pointer
  value)
* Signed integer overflow (i.e. there is no requirement to use two's complement;
  unsigned overflow is defined -- btw. what about an
  architecture that supports only saturated arithmetic?)
* Shifting values by a negative amount (undefined in C for both left and
  right shifts; right-shifting a negative value is implementation-defined)
* Shifting values by an amount greater than or equal to the number of bits in | |
the number (e.g. int64_t i = 1; i << 72 is undefined) | |
* Attempting to modify a string literal or any other const object during its lifetime | |
* Not returning a value from a value-returning function | |
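The two shift rules in the list above can be guarded with a range check
on the shift count. A minimal sketch, assuming a 64-bit unsigned operand;
the helper name checked_shl_u64 is made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: shifts value left by count bits, but refuses
 * counts that are negative or not smaller than the operand width (64),
 * because both cases are undefined behavior. */
bool checked_shl_u64(uint64_t value, int count, uint64_t *out)
{
    if (count < 0 || count >= 64)
        return false;
    *out = value << count;   /* well defined: unsigned operand, 0 <= count < 64 */
    return true;
}
```

An unsigned operand is used on purpose: left-shifting a negative signed
value is undefined as well, so the check alone would not be enough for
int64_t.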
Side effecting operations | |
''''''''''''''''''''''''' | |
http://blog.regehr.org/archives/232 | |
...The optimizer performs transformations like this | |
(i.e. reordering) when they are thought to increase | |
performance and when they do not change the program’s | |
observable behavior... | |
...reordering is legal since stores to global variables | |
are not defined as side-effecting... | |
...Actually, just to make things confusing, stores to | |
globals are side effecting according to the standard, | |
but no real compiler treats them as such... | |
Sequence point | |
'''''''''''''' | |
http://en.wikipedia.org/wiki/Sequence_point | |
A sequence point is a point in the program's | |
execution sequence where all previous side- | |
effects shall have taken place and where all | |
subsequent side-effects shall not have taken place. | |
Between the previous and next sequence point an | |
object shall have its stored value modified at most | |
once by the evaluation of an expression. | |
a = a++; | |
is undefined, since the sequencing rules say that a variable may be
updated only once between sequence points.
Not only is it undefined; in practice, depending on the order of
expression evaluation, the increment may occur before, after, or
interleaved with the assignment.
Unspecified behavior | |
===================== | |
* Order of evaluation
http://en.cppreference.com/w/c/language/eval_order
-- The order in which function arguments are evaluated
f(foo(), bar());
-- The order in which the operands of most C operators are evaluated
f = f1() + f2() + f3();
-- Note that
a[i] = i++;
goes further: 'i' is both read and modified without an intervening
sequence point, so this is undefined, not merely unspecified.
Exceptions: the evaluation order of the operands of '&&', '||', '?:'
and ',' is defined (i.e. those operators add a sequence point)
http://stackoverflow.com/questions/2456086/if-with-multiple-conditions-order-of-execution
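When the order of the calls matters, one way to pin it down is to
evaluate the arguments in separate statements. A small sketch; foo, bar,
add and the call_log buffer are stand-ins invented for this example.

```c
/* Hypothetical functions with a visible side effect: each one appends
 * a tag to call_log so the order of the calls can be observed. */
char call_log[8];
int log_len;

int foo(void) { call_log[log_len++] = 'f'; return 1; }
int bar(void) { call_log[log_len++] = 'b'; return 2; }
int add(int x, int y) { return x + y; }

int ordered_sum(void)
{
    /* add(foo(), bar()) may call foo and bar in either order.
     * Each full statement ends with a sequence point, so here foo()
     * always runs before bar(). */
    int first = foo();
    int second = bar();
    return add(first, second);
}
```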
Implementation-defined behavior
===============================
Implementation-defined behavior is an action by a program the
result of which is not defined by the standard, but which the
implementation is required to document. An example is
"Multibyte character literals" --
http://stackoverflow.com/questions/328215/is-there-a-c-compiler-that-fails-to-compile-this | |
Integer constant | |
================ | |
http://hardtoc.com/2009/07/16/int-min.html | |
/* Some of this is triggering undefined behavior */ | |
printf("INT_MAX: %d\n", INT_MAX); | |
printf("INT_MIN - 1: %d\n", INT_MIN - 1); | |
printf("-(INT_MIN + 1): %d\n\n", -(INT_MIN + 1)); | |
printf("INT_MIN + 1: %d\n", INT_MIN + 1); | |
printf("-(INT_MIN - 1): %d\n\n", -(INT_MIN - 1)); | |
printf("INT_MIN: %d\n", INT_MIN); | |
printf("-INT_MIN: %d\n", -INT_MIN); | |
printf("INT_MAX + 1: %d\n", INT_MAX + 1); | |
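Result (one observed run with a 32-bit int; the overflowing expressions
are undefined behavior, so other output is possible):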
INT_MAX: 2147483647 | |
INT_MIN - 1: 2147483647 | |
-(INT_MIN + 1): 2147483647 | |
INT_MIN + 1: -2147483647 | |
-(INT_MIN - 1): -2147483647 | |
INT_MIN: -2147483648 | |
-INT_MIN: -2147483648 | |
INT_MAX + 1: -2147483648 | |
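The -INT_MIN line above is the classic trap: on two's complement
machines INT_MIN has no positive counterpart. A minimal sketch of a
negation that checks first; the name checked_neg is hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores -a in *out and returns true only when
 * the result is representable. Comparing against -INT_MAX avoids
 * evaluating -INT_MIN, which overflows on two's complement systems. */
bool checked_neg(int a, int *out)
{
    if (a < -INT_MAX)
        return false;   /* only INT_MIN on two's complement */
    *out = -a;
    return true;
}
```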
Logic or arithmetic bit shift? | |
'''''''''''''''''''''''''''''' | |
#define FIELD1_shift 4 | |
#define FIELD1_OPT1_val 0x5 | |
#define FIELD1_OPT2_val 0x7 | |
#define FIELD2_shift 30 | |
#define FIELD2_OPT1_val 0x1 | |
#define FIELD2_OPT2_val 0x2 | |
uint64_t reg1; | |
/* Initialize the register with some default values */ | |
reg1 = (FIELD1_OPT2_val << FIELD1_shift) | | |
(FIELD2_OPT2_val << FIELD2_shift); | |
printf("reg1: 0x%" PRIx64 "\n", reg1); | |
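Result:
reg1: 0xffffffff80000070
Why: the constants have type int, so (0x2 << 30) is evaluated in signed
32-bit arithmetic. The result does not fit in an int (formally undefined
behavior); in practice it comes out negative and is sign-extended when
converted to uint64_t.
Fix (make the constants unsigned):
#define FIELD1_OPT2_val 0x7U
#define FIELD2_OPT2_val 0x2U
Result:
reg1: 0x80000070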
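An alternative to suffixing every constant is to widen the value to the
register type before shifting, which also keeps working for fields above
bit 31. A sketch under that assumption; the REG_FIELD macro and the
reg1_default function are made up for this example, the field names are
the ones used above.

```c
#include <stdint.h>

#define FIELD1_shift     4
#define FIELD1_OPT2_val  0x7
#define FIELD2_shift     30
#define FIELD2_OPT2_val  0x2

/* Hypothetical helper macro: convert the value to uint64_t first, so
 * the shift is done in unsigned 64-bit arithmetic and nothing is
 * sign-extended afterwards. */
#define REG_FIELD(val, shift)  ((uint64_t)(val) << (shift))

uint64_t reg1_default(void)
{
    return REG_FIELD(FIELD1_OPT2_val, FIELD1_shift) |
           REG_FIELD(FIELD2_OPT2_val, FIELD2_shift);
}
```

With these values reg1_default() yields 0x80000070, the same as the
U-suffix fix.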
String literal | |
============== | |
Enum | |
==== | |
http://codingrelic.geekhold.com/2008/10/ode-to-enum.html | |
Bit-fields | |
========== | |
https://d3s.mff.cuni.cz/pipermail/osy/2012-November/002059.html | |
LP64 vs. LLP64 | |
============== | |
http://www.unix.org/version2/whatsnew/lp64_wp.html | |
Structures | |
========== | |
Definition, field order | |
''''''''''''''''''''''' | |
From 6.2.5/20: | |
A structure type describes a sequentially allocated nonempty set of | |
member objects (and, in certain circumstances, an incomplete array), | |
each of which has an optionally specified name and possibly distinct | |
type. | |
6.7.2.1/15: | |
15 Within a structure object, the non-bit-field members and the units | |
in which bit-fields reside have addresses that increase in the order | |
in which they are declared. A pointer to a structure object, suitably | |
converted, points to its initial member (or if that member is a | |
bit-field, then to the unit in which it resides), and vice versa. | |
There may be unnamed padding within a structure object, but not at | |
its beginning. | |
The C standard requires that the elements of a structure are laid | |
out in the order that they are defined; the first element is at the | |
lowest address, and the next at a higher address, and so on for each | |
element. The compiler is not allowed to change the order. There can | |
be no padding before the first element of the structure. There can | |
be padding after any element of the structure as the compiler sees | |
fit to ensure what it considers appropriate alignment. | |
The general rules about field layout in C are:
* The address of the first member is the same as the address of the
  struct itself. That is, the offsetof of the first member is 0.
* The addresses of the members always increase in declaration order.
  That is, the offsetof of the n-th member is lower than that of the
  (n+1)-th member.
* There may, however, be padding bytes at the end of the structure as
  well as between members.
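The layout rules above can be checked at compile time with offsetof.
A small sketch, assuming a C11 compiler (for _Static_assert); struct
sample and sample_padding are invented for this example.

```c
#include <stddef.h>

/* Hypothetical structure with members of different alignment. */
struct sample {
    char   tag;
    int    count;
    char   flag;
    double value;
};

/* The first member sits at offset 0. */
_Static_assert(offsetof(struct sample, tag) == 0, "no leading padding");

/* Offsets grow in declaration order. */
_Static_assert(offsetof(struct sample, tag) < offsetof(struct sample, count), "order");
_Static_assert(offsetof(struct sample, count) < offsetof(struct sample, flag), "order");
_Static_assert(offsetof(struct sample, flag) < offsetof(struct sample, value), "order");

/* Number of padding bytes the compiler added; the exact value is
 * implementation-specific. */
size_t sample_padding(void)
{
    return sizeof(struct sample)
         - (sizeof(char) + sizeof(int) + sizeof(char) + sizeof(double));
}
```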
Assigning one struct to another | |
''''''''''''''''''''''''''''''' | |
https://stackoverflow.com/questions/2302351/assign-one-struct-to-another-in-c | |
Implicit function prototype | |
=========================== | |
http://stackoverflow.com/questions/2199076/printf-and-scanf-work-without-stdio-h-why | |
http://stackoverflow.com/questions/9182763/implicit-function-declarations-in-c | |
http://stackoverflow.com/questions/11150883/using-printf-function-without-actually-importing-stdio-h-and-it-worked-why-is | |
...When C doesn't find a declaration, it assumes this implicit | |
declaration: int f();, which means the function can receive whatever | |
you give it, and returns an integer... | |
Pointer aliasing | |
================ | |
http://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule | |
Trigraphs | |
========= | |
/* GCC needs -trigraphs (or a strict -std=c99 style mode) for this */
??=include <stdio.h>
int main(int argc, char* argv??(??))
??<
    printf("Olol??/n");
    return 0;
??>
Const qualifier | |
=============== | |
int tmp; | |
int a = 111; | |
const int b = 222; | |
int *c; | |
const int *d; /* pointer to a "const int" */ | |
a = 1; | |
b = 2; /* error: assignment of read-only variable ‘b’ */ | |
c = &tmp; | |
d = &tmp; /* OK */ | |
*c = 3; | |
*d = 4; /* error: assignment of read-only location ‘*d’ */ | |
d = (void*)0; /* OK */ | |
c = &b; /* warning: assignment discards ‘const’ qualifier */
*c = 333; /* compiles, but modifying a const object is undefined behavior */
Operator precedence | |
=================== | |
http://en.cppreference.com/w/c/language/operator_precedence
"2 + 1 << 2" == 12 ('+' binds tighter than '<<', i.e. "(2 + 1) << 2")
"2 + (1 << 2)" == 6
Misconceptions | |
============== | |
* sizeof() is a function
It is an operator. It can be used like "a = sizeof foo;"
Parentheses are only needed when the operand is a type name.
* char is always 8 bits in size
http://stackoverflow.com/a/1864999
(sizeof reports sizes in bytes and malloc takes a size in bytes. In C a
byte is by definition the size of a char, so sizeof(char) == 1, but a
byte has CHAR_BIT bits, which may be more than 8.)
* pointer set to '\0' is NULL pointer | |
* Object pointers and function pointers are the same | |
http://stackoverflow.com/q/3860593 | |
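The first two points can be shown in a few lines. A minimal sketch; the
function name bits_in_int is hypothetical. It uses sizeof without
parentheses on an object and converts the byte count to bits with
CHAR_BIT from <limits.h>.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: number of bits in an int on this platform. */
size_t bits_in_int(void)
{
    int foo = 0;

    /* sizeof is an operator: no parentheses needed for an object.
     * This parses as (sizeof foo) * CHAR_BIT. */
    size_t bytes = sizeof foo;

    /* sizeof counts in bytes (chars); CHAR_BIT is the number of bits
     * per byte and is at least 8. */
    return bytes * CHAR_BIT;
}
```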
################################################################################ | |
# Libc # | |
################################################################################ | |
Malloc | |
====== | |
Doug Lea implementation: | |
http://g.oswego.edu/dl/html/malloc.html | |
Casting result of malloc()? | |
http://stackoverflow.com/questions/605845/do-i-cast-the-result-of-malloc | |
Malloc tutorial: | |
http://www.inf.udec.cl/~leo/Malloc_tutorial.pdf | |
################################################################################ | |
# Linkers & Loaders # | |
################################################################################ | |
Dynamic library symbol versioning: | |
http://www.trevorpounds.com/blog/?p=33 | |
################################################################################ | |
# Useful information sources # | |
################################################################################ | |
[The Descent to C] | |
http://www.chiark.greenend.org.uk/~sgtatham/cdescent/ | |
[C FAQ] | |
http://c-faq.com/ | |
[Deep C] | |
http://www.pvv.org/~oma/DeepC_slides_oct2011.pdf | |
[Embedded Programming with the GNU Toolchain] | |
http://www.bravegnu.org/gnu-eprog/ | |
[Feature Test Macro] | |
http://lwn.net/Articles/590381/ | |
[Musl libc] | |
http://wiki.musl-libc.org/wiki/Functional_differences_from_glibc | |
http://wiki.musl-libc.org/wiki/Design_Concepts | |
[Bionic libc] | |
http://codingrelic.geekhold.com/2008/11/six-million-dollar-libc.html | |
https://gitorious.org/0xdroid/bionic/raw/9f65adf2ba3bb15feb8b7a7b3eef788df3fd270e:libc/docs/OVERVIEW.TXT | |
http://drj11.wordpress.com/2013/09/01/on-compiling-34-year-old-c-code/ | |
http://stackoverflow.com/questions/tagged/c?sort=frequent&pageSize=50 |