Once Go has shown us the power of slices, there is no doubt that
std::span
was a very serious omittance in the original C++.
Frankly, good old C codebases used the pointer+length metafor so
often, it is difficult to explain why C++ got the construct this
late. Probably, its designers considered non-owning pointers too
rough or maybe too dangerous? An attempt to introduce a similar
concept of std::string_view was clearly a failure and it was
deprecated. At last, std::span is very widely recommended.
Here, my point is rather blunt: std::span
may also be a miss.
A span as a pair of pointers is a much more powerful idiom than
a pointer+length span or an abstract span. Why, may you ask.
These two have the same size because ptrdiff is same size as
ptr, inevitably. Also, pointer+length seems much safer to use,
because less pointers is less trouble, right? Well, the answer
might be a bit unexpected: a pair of pointers has certain
symmetry and because of that you can recombine them like LEGO
pieces. Pointer+length is not flexible in that sense; it is a
single-use construct.
When implementing network or storage accesses, parsing or serialization, or other kinds of buffer shoveling, we often use circular buffers. The reason is that we stream data. We hold a buffer, append some new data in the end, process, discard some old data in the beginning. Let's call the former operation feed and the latter one drain.
How does a circular buffer look like? It may look like this:
+---consumed-space--+----filled-space---+--idle-space---+
^ ^ ^ ^
p0 p1 p2 p3
It takes four pointers to describe a circular buffer. We may
also see it as three spans. Now, how do we feed the data into
the buffer? We take the idle span, put the data in the
beginning, move the p2
pointer forward. How do we drain
processed data? We take the pending span, consume the data, move
the p1
pointer forward. See the pattern? It boils down to
pointer-pointer spans and we may see a circular buffer as a
combination of three such spans!
Now consider strings, vectors, queues... same story. A string or a vector has the allocated range and the filled range. Can we see it as two or three concatenated spans? Sure, no problem!
So, suppose some IO routine gets some data from some IO device. If we provide it with pointer+length, it will fill the range with data and return us the length value so we can handle the rest. That is pretty standard.
ssize_t read(int fd, void *buf, size_t count);
// 3 params, 1 return value, also touches errno
What if it supports spans?
Status Read(int fd, Span8& into);
// 2 params, returns Status
Now we may provide this method with an idle span of a circular buffer, or string, or vector, or queue or give it an entire on-stack array. It will handle everything nicely. It will move the pointer forward, contracting the idle area and expanding the pending area... without knowing much about the actual data structure it writes the data to!
Read(myfd, mystr.Idle()); // append to a string (feed)
Write(otherfd, mystr.Filled()); // drain
Importantly, we halved the number of variables to handle (4>2). Science says, we can comfortably handle about 7 items at once. Having less exposed variables means spending less of that 7!
My only remaining question is: why was not this a part of C++98? Everything that can be reduced to memory ranges, can be reduced to pointer-pointer Spans!
template<typename R>
struct Span<R> {
R *p[2];
size_t Length() const { return p[1] - p[0]; }
Status Feed(Span& b) {
auto l = std::min(Length(), b.Length());
memmove(p[0], b.p[0], sizeof(R)*l);
p[0] += l;
b.p[0] += l;
return OK;
}
};
using Span8 = Span<uint8_t>;
using Span8c = Span<const uint8_t>;
class Buffer {
uint8_t *p[4];
Span8c& Filled() {
return *(Span8c*)(p+1);
}
Span8& Idle() {
return *(Span8*)(p+2);
}
const Span8c& Consumed() const {
return *(Span8c*)(p+0);
}
};
In higher-level languages, string concatenation is often done
by a built-in operator +
. Idiomatic C++ uses string streams
and operator <<
. Idiomatic Java had verbose StringBuilder
.
Good old C has printf
which is loved by so many and ported
to so many languages.
Let's see how we can implement that in the p-p span paradigm.
Status FeedText(Span8& idle) {return OK;}
template<typename... Args>
Status FeedText(Span8& idle, const char* cstr, Args... args) {
auto l = strlen(cstr);
if (idle.Size()<l) return NOSPACEYET;
memcpy((void*)idle.p[0], (void*)cstr, l);
idle.p[0] += l;
return FeedText(idle, args...);
}
template<typename... Args>
Status FeedText(Span8& idle, int i, Args... args) {
// ... write the int
return FeedText(idle, args...);
}
// ...
String con{};
con.Reserve(64);
FeedText(con.Idle(), "concatenate ", 3, " arguments"); // bingo
// ...a printf equivalent would be...
char str[64];
snprintf(str, 64, "concatenate %i arguments", 3);
Note that FeedText
may write to span-based buffers, strings, vectors and so on.
It is as flexible as snprintf
and it does not leave the wiring exposed.
Which is pretty good.