The following process should be fairly universal for any sort of deserializable data format. Since the data was JSON, the examples will use that, but the same general process works for YAML/TOML/etc.
cargo new --bin sample-project
cd sample-project
- serde (with the derive feature enabled) for generic (de)serialization.
- serde_json for (de)serializing with JSON specifically.
- anyhow for simple error handling.
cargo add serde serde_json anyhow
cargo feature serde +derive
In src/main.rs:
#[derive(Debug, Clone, serde::Deserialize)]
#[serde(deny_unknown_fields)]
struct Sample {}

fn main() -> anyhow::Result<()> {
    // Read the whole sample file into memory.
    let json = std::fs::read_to_string("./path/to/sample.json")?;
    // Fails on the first field that Sample does not define yet.
    let sample: Sample = serde_json::from_str(&json)?;
    Ok(())
}

This is a very simple program that attempts to load a single JSON file from a
fixed path and deserialize it into a Sample struct. Sample has serde(deny_unknown_fields)
set and starts with no fields, so attempting to deserialize into it will fail for any valid non-empty JSON object.
cargo run
- This will fail with an error.
  - That error will indicate what (currently undefined) field was encountered.
- Add it to your Sample struct.
  - Make a guess as to the type, it's not important yet.
- Run it again.
  - If the field cannot be deserialized into the type you guessed, it will fail with an error.
    - This (new) error will give you some insight into the value. You can also just look at the JSON.
    - Update your Sample struct accordingly.
  - Otherwise, return to the beginning of this list until there are no more errors.
Now do it again with another sample file.
- If you encounter a (new) error that one of the fields you have defined is missing, then that field must be optional.
  - Change the type from T to Option<T> and continue.
- If you encounter an error relating to mapping types, define another struct, and use it as the type for that field, following the same process to discover its fields.
Continue this process until you encounter no errors for all available samples.
For a sufficiently large and diverse set of valid samples, this process should produce an equally comprehensive and correct set of data structures.
If you have enough sample data to be confident that it is representative of all
reasonable variations, consider looking at any fields with a String type.
Do they all have the same limited set of values? That field might be an enum.