I audited the master
branch of bytes
, currently at commit b5d4f87
.
There were a total of sixty-eight unsafe blocks, functions, and impls in this
version of bytes
.
There are several instances of undefined behaviour in this crate; several have
already been publicly discussed, here
and here. Several functions are "newly discovered" to be
questionably legal or likely illegal, including Bytes::to_mut
,
Inner::inline_ptr
, and [Inner::inline_len
]1.
Under a strict definition of undefined behaviour requiring uninitialized memory
to be accessible only through untagged unions or raw pointers, undefined behaivour
is pervasive throughout src/bytes.rs
. Unfortunately, it is currently unclear
if this is actually undefined behaviour, so I'm not comfortable making any
recommendations. As the bytes
authors are already breaking backwards
compatibility from the most recently published version in the audited version,
it may wish to modify its API to ensure that future clarifications of the
scope of UB leave bytes
entirely defined.
The files src/debug.rs
, src/lib.rs
, src/serde.rs
,
src/buf/from_buf.rs
, src/buf/into_buf.rs
, src/buf/iter.rs
,
src/buf/mod.rs
, src/buf/reader.rs
, src/buf/take.rs
,
src/buf/vec_deque.rs
, and src/buf/writer.rs
had no unsafe blocks,
functions, or impls and were out of this audit's scope and therefore were not
evaluated. Tests (tests/
) and benchmarks (benches/
) were also out of this
audit's scope and therefore were not evaluated. Examples using unsafe were
checked when the function that they documented was unsafe or contained unsafe
blocks.
I am not a professional security expert, and I do not have a particular knowledge of the nuances of proposed Rust memory models or esoterically undefined behaviours. This is the first formal audit of a library I have conducted.
Use these findings at your own risk.
src/bytes.rs
contains thirty unsafe blocks, twelve unsafe functions, and two
unsafe impls.
Bytes::slice
is a function on Bytes
which returns a slice from a
provided range. It contains one unsafe block calling two unsafe functions.
The unsafe block makes two function calls, the first to [Bytes::set_end
], and
the second to [Bytes::set_start
]. In this context, these functions requires
that all "revealed" space in the buffer is initialized. As slice can only
return an identical or smaller view of the buffer, this is trivially true,
assuming that the existing buffer is initialized.
Bytes::to_mut
is a function on Bytes
which returns a mutable reference
to the Bytes
. It contains one unsafe block dereferencing one raw pointer.
After checking if the buffer is exclusively owned, and cloning it if not, an
unsafe block takes the self
reference, casts it to a raw mutable pointer to a
Bytes
, and casts that to a raw mutable pointer to a [BytesMut
]. While both
structs have only one member, which is repr(C)
, the structs themselves are
not. While I can't imagine a scenario in which this breaks in practice, this is
still likely illegal2, and both Bytes
and [BytesMut
] ought to be
repr(C)
.
<Bytes as Buf>::advance
is an unsafe function implementing
Buf::advance
. It contains one unsafe block calling one unsafe function.
The function checks that there is sufficient initialized space remaining in
the buffer to advance by the requested number of bytes, and then uses
[Bytes::set_start
] to do so.
<Bytes as Clone>::clone
is a function implementing Clone::clone
.
It contains one unsafe block calling one unsafe function.
The entire function simply constructs a new Bytes using an
Inner::shallow_clone
of the passed Bytes
. As the Bytes
is immutable,
this is safe.
BytesMut::set_len
is an unsafe function on [BytesMut
] that sets the
length of the initialized portion of the [BytesMut
]s buffer. It is documented
as requiring the caller to ensure that any accessible data is initialized.
Ignoring that the backing store is a slice, the storing of uninitialized data
within which is questionably legal, it is sound if used properly.
The implementation simply redirects to Inner::set_len
.
<BytesMut as Buf>::advance
is a function implementing Buf::advance
.
It contains one unsafe block calling one unsafe function.
This function is exactly identical to <Bytes as Buf>::advance
, and shares
its soundness.
<BytesMut as BufMut>::advance_mut
is an unsafe function implementing
BufMut::advance_mut
. It calls one unsafe function.
The function calls Inner::set_len
, and is sound if used properly, given
standard concerns about accessible uninitialized memory.
<BytesMut as BufMut>::bytes_mut
is an unsafe function implementing
BufMut::bytes_mut
. It calls one unsafe function.
The function calls Inner::as_raw
, slices the returned slice to exclusively
after the cursor, and returns it. It is sound if used properly, given standard
concerns about accessible uninitialized memory.
<BytesMut as BufMut>::bytes_mut
is a function implementing
BufMut::put_slice
. It contains one unsafe block calling two unsafe
functions.
It calls two unsafe functions: <BytesMut as BufMut>::bytes_mut
and
<BytesMut as BufMut>::advance_mut
. These are used safely, given standard
concerns about accessible uninitialized memory.
[<BytesMut as From<&'_ u8>>::from
] is a function implementing
From<_>::from
. It contains one unsafe block calling two unsafe
functions.
UNAUDITED
[<BytesMut as Extend<u8>>::from
] is a function implementing
Extend<_>::extend
. It contains one unsafe block calling two unsafe
functions.
UNAUDITED
Inner::with_capacity
is a function on [Inner
] which constructs an
[Inner
] with a specified capacity. It contains one unsafe block calling
one unsafe function.
UNAUDITED
Inner::as_ref
is a function on [Inner
] which returns a slice
presenting a view of the shared buffer. It contains one unsafe block calling
two unsafe functions.
UNAUDITED
Inner::as_mut
is a function on [Inner
] which returns a mutable slice
presenting a view of the shared buffer. It contains one unsafe block calling
two unsafe functions.
The function itself should probably be unsafe, as it assumes that the buffer is not static. While this is guarded by a debug assertion, it would be possible to cause UB in safe release code. However, as this is not a public function, this is only a stylistic concern, and is not properly unsound.
Inner::as_raw
is an unsafe function on [Inner
] which returns a mutable
slice presenting a view of the shared buffer, including potentially
uninitialized data. It contains one unsafe block calling two unsafe functions.
UNAUDITED
Inner::put_u8
is a function on [Inner
] which sets the next byte in the
buffer to the provided byte, and advances the cursor by one. It contains two
unsafe blocks, each calling one unsafe function and dereferencing one raw
pointer.
UNAUDITED
Inner::inline_ptr
is an unsafe function on [Inner
] which produces a
mutable raw pointer to the Inner
's inline storage. It calls one call to an
unsafe function. It also contains a cast from a shared reference to a mutable
raw pointer.
UNAUDITED
note possible ub cast shared reference to mutable pointer
Inner::inline_len
is a function on [Inner
] which returns the initialized
length of the inline buffer. It contains one unsafe block calling one unsafe
function.
This exhibits the same synchronization issues as Inner::kind
.
Inner::set_len
is an unsafe function on [Inner
] which sets the initialized
length of the internal buffer. It contains no unsafe operations, but may be used
to expose uninitialized data to safe code.
UNAUDITED
Inner::split_off
is a function on [Inner
] which splits the internal buffer
at the provided position. It contains two unsafe blocks, the first calling one
unsafe function and the second calling two.
UNAUDITED
Inner::split_off
is a function on [Inner
] which splits the internal buffer
at the provided position. It contains two unsafe blocks, the first calling one
unsafe function and the second calling two.
UNAUDITED
Inner::truncate
is a function on [Inner
] which may shorten the length of
the initialized buffer. It contains one unsafe block, calling one unsafe
function.
UNAUDITED
Inner::try_unsplit
is a function on [Inner
] which coalesces two previously
split adjacent buffers with a reference counted backing. It contains one unsafe
block, calling one unsafe function.
UNAUDITED
Inner::resize
is a function on [Inner
] which resizes the initialized
area of the buffer. It contains one unsafe block, which calls two unsafe
functions.
UNAUDITED
Inner::set_start
is an unsafe function on [Inner
] which shifts data off
the front of the buffer. It calls six unsafe functions.
UNAUDITED
Inner::set_end
is an unsafe function on [Inner
] which shifts data off
the back of the buffer. It calls one unsafe function.
UNAUDITED
Inner::set_end
is a function on [Inner
] which checks if it is safe to
mutate the backing buffer. It contains one unsafe block, which dereferences
one raw pointer.
UNAUDITED
Inner::shallow_clone
is an unsafe function on [Inner
] which either
increases the reference count on the shared buffer, if an allocated buffer is
in use, or copies the internal buffer. It calls three unsafe functions.
UNAUDITED
Inner::shallow_clone_sync
is an unsafe function on [Inner
] which
increases the reference count on the shared buffer, converting from an
exclusively allocated buffer if necessary. It calls two unsafe functions.
UNAUDITED
Inner::shallow_clone_arc
is an unsafe function on [Inner
] which increases
the reference count on the provided shared buffer. It dereferences one raw
pointer.
UNAUDITED
Inner::shallow_clone_vec
is an unsafe function on [Inner
] which converts
a exclusively owned buffer to a shared buffer with two references. It calls one
unsafe function.
UNAUDITED
Inner::reserve
is a function on [Inner
] which reserves the requested
amount of uninitialized space at the end of the buffer. It contains two unsafe
blocks, the first calling three unsafe functions, and the second calling one and
dereferencing four raw pointers.
UNAUDITED
Inner::kind
is a function on [Inner
] which returns the backing storage
type. It contains two (conditionally compiled) unsafe blocks, each of which
calls one unsafe function.
This function exhibits undefined behaviour under all executions. As it explains in a comment at the top of the function:
This function is going to probably raise some eyebrows. The function
returns true if the buffer is stored inline. This is done by checking
the least significant bit in the `arc` field.
Now, you may notice that `arc` is an `AtomicPtr` and this is
accessing it as a normal field without performing an atomic load...
Again, the function only cares about the least significant bit, and
this bit is set when `Inner` is created and never changed after that.
All platforms have atomic "word" operations and won't randomly flip
bits, so even without any explicit atomic operations, reading the
flag will be correct.
This is undefined behavior due to a data race, but experimental
evidence shows that it works in practice (discussion:
<https://internals.rust-lang.org/t/bit-wise-reasoning-for-atomic-accesses/8853>).
This function is very critical performance wise as it is called for
every operation. Performing an atomic load would mess with the
compiler's ability to optimize. Simple benchmarks show up to a 10%
slowdown using a `Relaxed` atomic load on x86.
Inner::uncoordinated_get_vec_pos
is a function on [Inner
]. It contains one
unsafe block calling one unsafe function.
This exhibits the same synchronization issues as Inner::kind
.
Inner::uncoordinated_set_vec_pos
is a function on [Inner
]. It contains one
unsafe block calling one unsafe function.
This exhibits the same synchronization issues as Inner::kind
.
rebuild_vec
is a free function. It contains one unsafe block that calls two
unsafe functions.
UNAUDITED
release_shared
is a free function. It contains one unsafe block that
dereferences one raw pointer and calls one unsafe function.
UNAUDITED
unsafe impl Send for Inner
is an unsafe trait impl. It asserts that
Inner.ptr
can safely be sent across thread boundaries.
UNAUDITED
unsafe impl Sync for Inner
is an unsafe trait impl. It asserts that all
accesses to Inner.ptr
are synchronized.
UNAUDITED
src/buf/chain.rs
contains three unsafe functions. All are implementations of
member functions of the BufMut
trait.
<Either<_, _> as BufMut>::bytes_mut
is an unsafe function that returns a
mutable slice to the remainder of the object's buffer, though it may return
short. The implementation on Either<_, _>
simply calls the corresponding
function on which contained type is active. It appears free of undefined
behaviour provided its invariants are upheld and the called functions are
sound.
<Either<_, _> as BufMut>::bytes_vectored_mut
is an unsafe function that
fills a mutable destination slice with potentially multiple
std::io::IoSliceMut
s. The implementation on Either<_, _>
simply calls the
corresponding function on which contained type is active. It appears free of
undefined behaviour provided its invariants are upheld and the called functions
are sound.
<Either<_, _> as BufMut>::advance_mut
is an unsafe function that advances
the internal cursor of the object implementing BufMut
. The implementation on
Either<_, _>
simply calls the corresponding function on which contained type
is active. It appears free of undefined behaviour provided its invariants are
upheld and the called functions are sound.
src/buf/buf_mut.rs
contains four unsafe blocks and thirteen unsafe
functions.
BufMut::advance_mut
is an unsafe required method for the BufMut
trait.
It advances the implementing object's internal cursor by a requested number of
bytes. The documentation is somewhat unclear on what its invariants are, and
there are two reasonable interpretations: the first is that it requires that the
number of bytes being advanced past within the buffer immediately after the
cursor be defined; this is correct in all cases. The second interpretation is
that while those bytes may be undefined, you must not read them. This
interpretation is problematic as it may be undefined behaviour to have
uninitialized memory accessible other than through an untagged union or a raw
pointer.
The example seems to indicate that the second interpretation is correct, and under that strict definition of undefined behaviour, the example does exhibit UB.
BufMut::bytes_mut
is an unsafe required method which returns the
implementing object's byte buffer. Under a loose definition of undefined
behaviour (see the section above on BufMut::advance_mut
), this is a perfectly
acceptable API. Under a stricter definition of UB, this function may only be
called if the caller is sure that the entire buffer is initialized.
The documentation does not reflect this potential restriction.
Again, under the strict definition of UB, the example is incorrect.
BufMut::bytes_vectored_mut
is an unsafe provided method which is readv
to
BufMut::bytes_mut
's read
. It is potentially unsound for the same reasons.
BufMut::put
is a function containing two unsafe blocks that copies bytes
from a provided Buf
into the implementing object.
The first unsafe block involves two unsafe function calls.
The first call uses BufMut::bytes_mut
to get the slice to be copied into
by the second. Under the strict interpretation of UB, this is unsound as the
returned slice may contain uninitialized bytes, which cannot be accessible
except through an untagged union or raw pointer. Under a looser interpretation
of UB, this call is fine, as the slice is never read from.
The second call uses std::ptr::copy_nonoverlapping
to copy bytes from an
input buffer to the implementing object's buffer, via an immutable and mutable
reference, respectively. Rust's ownership rules ensure this block is safe, as
the destination cannot be aliased, and the source cannot be mutably aliased. A
buffer overrun is prevented by bounding count
to the minimum of the length of
the source and destination slices. As additional safety, an assertion requires
that the amount of spare buffer space in the implementing object is greater than
or equal to the remaining amount of bytes in the provided Buf
.
The second unsafe block uses BufMut::advance_mut
to advance the implementing
object's cursor by the number of bytes copied into the destination slice. This
is safe, as those bytes have been initialized by the earlier unsafe block.
Stylistic concerns suggest that either the first unsafe block should be broken up into two blocks, or that both unsafe blocks should be combined.
BufMut::put_slice
is a function containing two unsafe blocks that copies
bytes from a provided slice into the implementing object. As byte slices
implement Buf
, this function simply is a specialized implementation of a
subset of BufMut::put
s functionality.
The first unsafe block involves two unsafe function calls.
The first call uses BufMut::bytes_mut
to get the slice to be copied into.
See the discussion of the similar call within BufMut::put
, above.
The second call uses std::ptr::copy_nonoverlapping
to copy the provided
slice into the implementing object's buffer. See the discussion of the similar
call within BufMut::put
, above.
The second unsafe block uses BufMut::advance_mut
to advance the implementing
object's cursor by the number of bytes copied into the destination slice. See
the discussion of the similar call within BufMut::put
, above.
Stylistic concerns suggest that either the first unsafe block should be broken up into two blocks, or that both unsafe blocks should be combined.
<&'_ mut _ as BufMut>::bytes_mut
is an unsafe function implementing
BufMut::bytes_mut
. As it simply defers to the concrete implementation
through the reference, it has the same soundness properties as the referenced
object.
<&'_ mut _ as BufMut>::bytes_vectored_mut
is an unsafe function implementing
BufMut::bytes_vectored_mut
. As it simply defers to the concrete
implementation through the reference, it has the same soundness properties as
the referenced object.
<&'_ mut _ as BufMut>::advance_mut
is an unsafe function implementing
BufMut::advance_mut
. As it simply defers to the concrete implementation
through the reference, it has the same soundness properties as the referenced
object.
<Box<_> as BufMut>::bytes_mut
is an unsafe function implementing
BufMut::bytes_mut
. As it simply defers to the concrete implementation
through the reference, it has the same soundness properties as the referenced
object.
<Box<_> as BufMut>::bytes_vectored_mut
is an unsafe function implementing
BufMut::bytes_vectored_mut
. As it simply defers to the concrete
implementation through the reference, it has the same soundness properties as
the referenced object.
<Box<_> as BufMut>::advance_mut
is an unsafe function implementing
BufMut::advance_mut
. As it simply defers to the concrete implementation
through the reference, it has the same soundness properties as the referenced
object.
[<&'_ mut &$x91;u8“ as BufMut>::bytes_mut
] is an unsafe function implementing
BufMut::bytes_mut
. It simply returns a mutable reference to the slice.
Assuming that the slice is completely initialized when the function is called,
it is clearly sound. If the slice is not fully initialized, as the documentation
implies is a valid use, it may be unsound.
<&'_ mut ‘u8“ as BufMut>::advance_mut
is an unsafe function implementing
BufMut::advance_mut
. It calls no unsafe functions, and, assuming the slice
is completely initialized when the function is called, it is clearly sound. If
the slice is not fully initialized, as the documentation implies is a valid
use, it may be unsound.
<Vec<u8> as BufMut>::advance_mut
is an unsafe function implementing
BufMut::advance_mut
. If necessary, it reserves sufficient additional
capacity to set the length of the vector using Vec::set_len
to the vector's
current length plus the provided usize
. This may be unsound if access
to uninitialized memory except through a raw pointer or untagged union is
undefined behaviour.
<Vec<u8> as BufMut>::bytes_mut
is an unsafe function implementing
BufMut::bytes_mut
. It uses [std::slice::from_raw_parts_mut
] to convert the
vector into a mutable slice and then slices the slice to everything between the
vector's length and the vector's maximum capacity. If there is no capacity
beyond the length, 64 bytes of capacity are first reserved. This may be unsound
if access to uninitialized memory except through a raw pointer or untagged
union is undefined behaviour.
src/buf/buf.rs
contains one unsafe block.
Buf::copy_to_slice
is a default method for the Buf
trait. It uses
std::ptr::copy_nonoverlapping
to copy the elements of a &[u8]
returned
by another Buf
method on self
to a &mut [u8]
passed as an argument.
Rust's ownership rules ensure that this function is safe, as the destination
cannot be aliased, and the source cannot be mutably aliased. A buffer overrun
is prevented by bounding count
to the minimum of the length of the source and
destination slices.
src/buf/chain.rs
contains three unsafe functions. All are implementations of
member functions of the BufMut
trait.
<Chain<_, _> as BufMut>::bytes_mut
is an unsafe function that returns a
mutable slice to the remainder of the object's buffer, though it may return
short. The implementation on Chain<_, _>
simply calls the corresponding
function on its first object if it is not yet exhausted; otherwise, it calls it
on its second. It appears free of undefined behaviour provided its invariants
are upheld and the called functions are sound.
<Chain<_, _> as BufMut>::bytes_vectored_mut
is an unsafe function that
fills a mutable destination slice with potentially multiple
std::io::IoSliceMut
s. The implementation on Chain<_, _>
first calls the
corresponding function on its first object, and then calls the function on its
second object, but offsets the slice by the number of std::io::IoSliceMut
s
filled in the destination slice by the first function. It appears free of
undefined behaviour provided its invariants are upheld and the called functions
are sound.
<Chain<_, _> as BufMut>::advance_mut
is an unsafe function that advances
the internal cursor of the object implementing BufMut
. The implementation
on Chain<_, _>
first advances its first object by the requested number of
bytes, or to saturation. If it is saturated, it advances its second object by
the requested number of bytes minus the number of bytes required to saturate
the first object. Both advancements are done using the corresponding function
for each object. It appears free of undefined behaviour provided its invariants
are upheld and the called functions are sound.