@ramosbugs
Last active July 8, 2023 19:42
use std::fmt::Formatter;
use std::marker::PhantomData;

use serde::de::{Deserialize, Deserializer, SeqAccess, Visitor};

/// Helper to be used with `#[serde(deserialize_with = "deserialize_large_vec")]` to efficiently
/// deserialize a large `Vec<_>` from a source that provides a trusted `size_hint`.
///
/// Ordinarily, serde uses a `cautious` size hint that is capped at 4096 entries:
/// https://github.com/serde-rs/serde/blob/10e4839f8325dab5472b0ebf5551f4d607f14a33/serde/src/de/impls.rs#L1035.
/// For large `Vec`s, this causes many reallocations as the `Vec` grows during deserialization.
/// This function bypasses the cautious size hint and immediately allocates a `Vec` of the hinted
/// size. Use it only with trusted input: a bogus size hint can trigger an arbitrarily large
/// allocation.
pub fn deserialize_large_vec<'de, D, T>(deserializer: D) -> Result<Vec<T>, D::Error>
where
    D: Deserializer<'de>,
    T: Deserialize<'de>,
{
    struct VecVisitor<T> {
        marker: PhantomData<T>,
    }

    impl<'de, T> Visitor<'de> for VecVisitor<T>
    where
        T: Deserialize<'de>,
    {
        type Value = Vec<T>;

        fn expecting(&self, formatter: &mut Formatter) -> std::fmt::Result {
            formatter.write_str("a sequence")
        }

        fn visit_seq<A>(self, mut seq: A) -> Result<Self::Value, A::Error>
        where
            A: SeqAccess<'de>,
        {
            // NB: This is the main line that differs from the serde default impl for Vec<_>.
            let mut values = Vec::with_capacity(seq.size_hint().unwrap_or(0));
            while let Some(value) = seq.next_element()? {
                values.push(value);
            }
            Ok(values)
        }
    }

    let visitor = VecVisitor {
        marker: PhantomData,
    };
    deserializer.deserialize_seq(visitor)
}
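
The reallocation cost described in the doc comment can be illustrated without serde. The following is a minimal std-only sketch, not part of the original helper: `count_reallocations` and `CAUTIOUS_CAP` are hypothetical names introduced here, and the 4096 cap is assumed from the comment above. It counts how often a `Vec` changes capacity when it starts from a capped hint versus the full hinted size.

```rust
/// Assumed cap, mirroring the 4096-entry cautious size hint described above.
const CAUTIOUS_CAP: usize = 4096;

/// Pushes `len` elements into a Vec created with `initial_capacity` and
/// returns how many times its capacity changed, i.e. how often it reallocated.
fn count_reallocations(len: usize, initial_capacity: usize) -> usize {
    let mut values: Vec<u64> = Vec::with_capacity(initial_capacity);
    let mut last_capacity = values.capacity();
    let mut reallocations = 0;
    for i in 0..len as u64 {
        values.push(i);
        if values.capacity() != last_capacity {
            reallocations += 1;
            last_capacity = values.capacity();
        }
    }
    reallocations
}

fn main() {
    let hinted_len = 1_000_000;
    // Capped hint: the Vec starts small and has to grow repeatedly.
    let cautious = count_reallocations(hinted_len, hinted_len.min(CAUTIOUS_CAP));
    // Trusted hint: the Vec is allocated once at the full hinted size.
    let trusted = count_reallocations(hinted_len, hinted_len);
    println!("cautious hint: {cautious} reallocations");
    println!("trusted hint:  {trusted} reallocations");
}
```

With the trusted hint the count is zero, because `Vec::with_capacity` guarantees room for at least the requested number of elements. The exact count for the capped case depends on the standard library's growth strategy, so the sketch does not assume a specific value.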