The following process should be fairly universal for any sort of deserializable data format. Since the data was JSON, the examples will use that, but the same general process works for YAML, TOML, etc.
cargo new --bin sample-project
cd sample-project
- serde (with the derive feature enabled) for generic (de)serialization.
- serde_json for (de)serializing with JSON specifically.
- anyhow for simple error handling.
cargo add serde serde_json anyhow
cargo feature serde +derive
In src/main.rs:
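// Start with no fields; they are added one at a time as errors reveal them.
#[derive(Debug, Clone, serde::Deserialize)]
#[serde(deny_unknown_fields)]
struct Sample {}

fn main() -> anyhow::Result<()> {
    // Read the whole sample file into memory.
    let json = std::fs::read_to_string("./path/to/sample.json")?;
    // Fails on the first key that Sample does not define.
    let sample: Sample = serde_json::from_str(&json)?;
    // Print the parsed value so a successful run is visible.
    println!("{sample:#?}");
    Ok(())
}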
This is a very simple program that attempts to load a single JSON file from a fixed path and deserialize it into a Sample struct. Sample has serde(deny_unknown_fields) set and no fields of its own, so attempting to deserialize into it will fail for any valid non-empty JSON object.
cargo run
- This will fail with an error.
- That error will indicate what (currently undefined) field was encountered.
- Add it to your Sample struct.
  - Make a guess as to the type; it's not important yet.
- Run it again.
  - If the field cannot be deserialized into the type you guessed, it will fail with an error.
    - This (new) error will give you some insight into the value. You can also just look at the JSON.
    - Update your Sample struct accordingly.
  - Otherwise, return to the beginning of this list until there are no more errors.
Now do it again with another sample file.
- If you encounter a (new) error that one of the fields you have defined is missing, then that field must be optional.
  - Change the type from T to Option<T> and continue.
- If you encounter an error relating to mapping types, define another struct, and use it as the type for that field, following the same process to discover its fields.

Continue this process until you encounter no errors for all available samples.
For a sufficiently large and diverse set of valid samples, this process should produce an equally comprehensive and correct set of data structures.
If you have enough sample data to be confident that it is representative of all reasonable variations, consider looking at any fields with a String type.
Do they all have the same limited set of values? That field might be an enum.