Skip to content

Instantly share code, notes, and snippets.

@mcejp
Last active July 29, 2024 08:39
Show Gist options
  • Save mcejp/719d3485b04cfcf82e8a8734957da06a to your computer and use it in GitHub Desktop.
Save mcejp/719d3485b04cfcf82e8a8734957da06a to your computer and use it in GitHub Desktop.
Ring buffer with Array + two indices mod 2N
// Background: https://www.snellman.net/blog/archive/2016-12-13-ring-buffers/
// Variant with indices modulo 2*capacity as suggested by dizzy57 and Aristotle Pagaltzis
size_t read;
size_t write;
mask(val) { return val % array.capacity; }
inc(index) { return wrap(index + 1); }
push(val) { assert(!full()); array[mask(write)] = val; write = inc(write); }
shift() { assert(!empty()); ret = array[mask(read)]; read = inc(read); return ret; }
empty() { return read == write; }
full() { return size() == array.capacity; }
size() { return wrap(write - read); }
free() { return array.capacity - size(); }
// If you can assure power-of-2 capacity
wrap(val) { return val % (array.capacity * 2); }
// If you can NOT assure power-of-2 capacity
size_t wrap(intptr_t val) {
val = val % (intptr_t)(capacity * 2);
if (val < 0) {
val += capacity * 2;
}
return val;
}
// Additional operations
read(data_out, count) {
count = min(count, size());
run = min(count, array.capacity - mask(read));
rem = count - run;
memcpy(data_out, array + mask(read), run);
memcpy(data_out + run, array, rem);
read = wrap(read + count);
return count;
}
write(data, count) {
count = min(count, array.capacity - size());
run = min(count, array.capacity - mask(write));
rem = count - run;
memcpy(array + mask(write), data, run);
memcpy(array, data + run, rem);
write = wrap(write + count);
return count;
}
@HiFiPhile
Copy link

I think size() should be { return mask(write - read); }

@mcejp
Copy link
Author

mcejp commented Mar 30, 2024

I think size() should be { return mask(write - read); }

@HiFiPhile nope, that's why you do mod 2N in the first place, this way you could never arrive to size() == capacity()

@HiFiPhile
Copy link

HiFiPhile commented Mar 30, 2024

Sorry I worked on an overwritable FIFO when the situation is more complicated, I must have mingled things....
https://github.com/hathach/tinyusb/blob/master/src/common/tusb_fifo.h

@etiennemlb
Copy link

etiennemlb commented Jul 15, 2024

Assume capacity is 3 and the size_t is something small and workable for our example, say uint8. So we work with integer mod 256 and we wrap() at 2x3 = 6.

Now if the head is at 1 and the tail at 5, our size() should be 1. But 0 - 5 is congruent to 252 mod 256 and Wrap(252) is 0.

I thus fear size() is buggy.

@mcejp
Copy link
Author

mcejp commented Jul 16, 2024

@etiennemlb You are making the tacit assumption that wrap takes a size_t argument. Which is not unreasonable, but it will indeed break if the buffer size is not 2**N. Ideally wrap would just take a signed argument, and this indeed gives the correct result in Python, where the % operator always returns a number with the same sign as the denominator, so (-4) % 6 == 2

In C/C++/Java, they made the unfortunate choice of having % return a value with the sign of the numerator. Therefore, (-4) % 6 will still be -4, which is not helpful here. I might not have been aware of this at the time of writing.

Thanks for pointing this out. I'll be back with a fix. Let's have the most robust implementation of a Ring buffer with Array + two indices mod 2N on the internet :)

@etiennemlb
Copy link

etiennemlb commented Jul 16, 2024

I think this is the debate of what should the remainder be defined as. With knuth advocating for the definition you gave and the C guys (and thus by transitivity, a whole bunch of language) choosing the over zealous (?) solution. Over zealous because while elegant that (a / b) * b + a % b = a it lacks a bunch of great properties.

The F-definition or E-definition remainder (as called so by Boute) on signed ints is what we would like for our modulo operator in the purpose above. Instead C gives us the T-definition.

To be fair, the C % operator is what the hardware gives us on x86. I'm not aware of an other remainder being natively supported in the x86 ISA for the signed ints.

@mcejp
Copy link
Author

mcejp commented Jul 21, 2024

@etiennemlb I updated the code a few days ago and forgot to ping you. I would be grateful for your feedback!

@etiennemlb
Copy link

etiennemlb commented Jul 28, 2024

While in the number theory discipline, the modulo operator is always defined as the remainder of the euclidean division, in CS this is quite a mess.

The C language (and thus many derivatives) is basing its signed integer modulo operator (%) on the remainder of a truncated (integral part toward zero) division (simply a/b in C/C++).
Reusing wikipedia's graphs, we get:
image

What we want for our ring's modulo operator is either the euclidean or the floored division based remainder (which one does not matter as we do not have negative denominator).
This would give, for the floored variant:
image

Lucky for us, the floor variant is easy to implement on top of the truncated division based remainder that the C modulo gives us. Notice that we just need to shift the modulo by the denominator value (more on that in Leijen's Division and modulus for computer scientists).

This shift of + denominator if the remainder is negative is just what you implemented, that said, I think we must be careful and judiciously place casting to (in our case), intptr_t during the write - read computation.

size() { return wrap((intptr_t)write - (intptr_t)read); }

In the abstract of Boute's The Euclidean Definition of the Functions div and mod we read:
The definitions of the functions div and mod in the computer science literature and in programming languages are either similar to the Algol or Pascal definition (which is shown to be an unfortunate choice) or based on division by truncation (T-definition) or division by flooring as defined by Knuth (F-definition). The differences between various definitions that are in common usage are discussed, and an additional one is proposed, which is based on Euclid’s theorem and therefore is called the Euclidean definition (E-definition). Its distinguishing feature is that 0 <= D mod d < |D| irrespective of the signs of D and d. It is argued that the E- and F-definitions are superior to all other ones in regularity and useful mathematical properties and hence deserve serious consideration as the standard convention at the applications and language level. It is also shown that these definitions are the most suitable ones for describing number representation systems and the realization of arithmetic operations at the architecture and hardware level.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment