Rand design questions
| Questions and potential changes | |
| ============= | |
| Organisation and policies | |
| -------------- | |
| Should all RNGs be moved to a sub-module? E.g. `rng::isaac::IsaacRng` inside `rand` crate. | |
| ### RNG implementations | |
| Which RNGs should be included: what's the policy for accepting new RNGs? | |
| Should the crate allow any published RNG with a properly tested implementation? | |
| If not, there needs to be somewhere for the rest: individual crates | |
| (e.g. `rng-NAME`) or a catch-all-the-rest crate. | |
| What about: | |
| - well-known RNGs like Mersenne Twister? | |
| - new RNGs promising better performance/quality? | |
| - are all current RNGs worth keeping? | |
| - allow known-poor RNGs if there is demand (reproduction)? | |
| https://en.wikipedia.org/wiki/Pseudorandom_number_generator | |
| ### Distributions | |
| Standard generators | |
| ----------------- | |
| ### `StdRng` | |
| `StdRng` is supposed to be an efficient generator for the current platform. | |
| This implies the default could be changed in the future. But which properties | |
| is `StdRng` expected to have? | |
| * To be approved for cryptographic use or not? | |
| * To not have major statistical flaws like correlation or poor distribution? | |
| Should be a given. | |
| * For generating mostly bools, bytes, 32-bit values, or 64-bit values? This is not | |
| such a pointless question as it might first appear: e.g. both `Isaac64` and | |
| `MT19937_64` RNGs implement `next_u32()` with `self.next_u64() as u32`, | |
| throwing away half the generated bits; some 64-bit generators may be able | |
| to extract sub-sets of the generated values with little overhead, thus | |
| performing better under this usage. | |
| * Does this necessarily support `SeedableRng<T>`? Currently yes, for | |
| `T = &[usize]` (which implies that seeding may need to differ between | |
| platforms with different pointer widths). | |
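The point about discarding half of each 64-bit output can be illustrated with a small sketch. Everything here is an assumption made for illustration: SplitMix64 stands in for a 64-bit generator such as `Isaac64`, and `HalfBuffered` is a hypothetical wrapper, not part of `rand`. Whether both halves of a word are equally good depends on the generator.

```rust
/// SplitMix64, used here only as a small stand-in for a 64-bit generator.
pub struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    pub fn new(seed: u64) -> Self {
        SplitMix64 { state: seed }
    }

    pub fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// The approach described above: one full 64-bit step per 32-bit
    /// output, with the upper half thrown away.
    pub fn next_u32_truncating(&mut self) -> u32 {
        self.next_u64() as u32
    }
}

/// Hypothetical wrapper that hands out both halves of each 64-bit word.
pub struct HalfBuffered {
    inner: SplitMix64,
    spare: Option<u32>,
}

impl HalfBuffered {
    pub fn new(inner: SplitMix64) -> Self {
        HalfBuffered { inner, spare: None }
    }

    /// Returns the low half of a fresh word, then the saved high half,
    /// so the inner generator advances once per two 32-bit outputs.
    pub fn next_u32(&mut self) -> u32 {
        if let Some(hi) = self.spare.take() {
            return hi;
        }
        let word = self.inner.next_u64();
        self.spare = Some((word >> 32) as u32);
        word as u32
    }
}
```

The cost is an extra field and a branch per call, which is part of why the intended workload of `StdRng` matters.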
| At a minimum, the documentation should state which properties `StdRng` is | |
| required to have. It *might* be useful to have both standard crypto-approved | |
| and non-crypto algorithms. | |
| Required performance & cryptographic strength vary by application, so pretending | |
| there is "one good default" does not seem useful; some argue that cryptographic | |
| generators shouldn't live in user space at all, in which case there may be no point | |
| having a default cryptographic generator. A standard generator for sims and | |
| games could still be useful (though not really necessary). | |
| ### `ThreadRng` | |
| Again, which properties is this generator expected to have? | |
| There is an issue asking whether this should simply use the OS generator directly. | |
| Should there be help (instructions?) for implementing a generator that is just as | |
| easy to use but deterministic/repeatable? This would be useful for testing; | |
| it's likely not something that the library can/should provide (since the algorithm | |
| must be fixed in this case). | |
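As a rough idea of what such instructions might lead to, here is a minimal sketch of a thread-local generator with a fixed algorithm and seed. The names `TestRng`, `TEST_SEED` and `with_test_rng` are hypothetical, and xorshift64 is chosen only because it is short; none of this is taken from `rand`.

```rust
use std::cell::RefCell;

/// Tiny xorshift64 generator; the algorithm is fixed so output is repeatable.
pub struct TestRng {
    state: u64,
}

impl TestRng {
    pub fn new(seed: u64) -> Self {
        // xorshift must not start from an all-zero state.
        TestRng { state: if seed == 0 { 1 } else { seed } }
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.state = x;
        x
    }
}

/// Hypothetical fixed seed; every thread starts from the same state.
const TEST_SEED: u64 = 0x1234_5678_9ABC_DEF0;

thread_local! {
    static TEST_RNG: RefCell<TestRng> = RefCell::new(TestRng::new(TEST_SEED));
}

/// Run `f` with this thread's deterministic generator.
pub fn with_test_rng<T>(f: impl FnOnce(&mut TestRng) -> T) -> T {
    TEST_RNG.with(|cell| f(&mut *cell.borrow_mut()))
}
```

A call such as `with_test_rng(|rng| rng.next_u64())` is then about as convenient as a thread-local generator, while each new thread replays the same sequence.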
| Generation support | |
| -------------------- | |
| ### `Rng` | |
| The `next_u32`, `next_u64` and `fill_bytes` methods all deserve | |
| to be there: missing any of these could easily result in unnecessary conversions | |
| between some generators (sources) and some users (sinks); the default | |
| implementations also make implementing `Rng` simple. | |
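To illustrate how default implementations keep implementors simple, here is a minimal sketch. `CoreRng` is a hypothetical trait written for this note and is not the actual `Rng` definition; `Counter` is deliberately not random and exists only to make the defaults visible.

```rust
/// Hypothetical trait mirroring the three core methods discussed above.
pub trait CoreRng {
    /// The only method an implementor must provide in this sketch.
    fn next_u32(&mut self) -> u32;

    /// Default: combine two 32-bit outputs into one 64-bit value.
    fn next_u64(&mut self) -> u64 {
        let hi = self.next_u32() as u64;
        let lo = self.next_u32() as u64;
        (hi << 32) | lo
    }

    /// Default: fill the buffer four bytes at a time (little-endian),
    /// truncating the last word if the length is not a multiple of four.
    fn fill_bytes(&mut self, dest: &mut [u8]) {
        for chunk in dest.chunks_mut(4) {
            let bytes = self.next_u32().to_le_bytes();
            let n = chunk.len();
            chunk.copy_from_slice(&bytes[..n]);
        }
    }
}

/// Deliberately non-random source, for demonstrating the defaults only.
pub struct Counter(pub u32);

impl CoreRng for Counter {
    fn next_u32(&mut self) -> u32 {
        self.0 = self.0.wrapping_add(1);
        self.0
    }
}
```

A native 64-bit generator would instead override `next_u64` and derive `next_u32` from it, which is where the conversion costs mentioned above come from.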
| The `next_f32` and `next_f64` methods *might* fall into the same category | |
| (if there are any native floating-point generators worth using), but likely | |
| don't. | |
| `gen_iter` is an iterator-adapter, but could still be an external method. | |
| All other methods are there to support generating various output types from a | |
| random source and are not specifically related to *generating randomness*, so they | |
| arguably belong elsewhere. However, it may still make sense to keep some/all | |
| for convenience or to avoid unnecessary breakage. | |
| (In fact, `gen_weighted_bool` is a distribution, not a simple converter.) | |
| ### `Rand` | |
| Should `Rand` implementations for many combinations of arrays, tuples, etc. be kept? | |
| There's not *much* rationale for removing these features. However, the `Rand` | |
| trait appears to be designed the way it is "because traits allow some cool tricks", | |
| rather than because the functionality is important and the design well planned out. | |
| Further, the default distribution range is type dependent, which may result in | |
| a few surprises: | |
| - all values for integers | |
| - range [0, 1) for floats | |
| - valid codepoints for char | |
| - `Option<T>` has 50% probability of being `None`, 50% of being some generated `T` | |
| ### `sample` | |
| This selects a subset of a specified length from a given sequence, and may cause | |
| some reordering. Should possibly be in a sub-module, and have a name like | |
| `sample_from_seq`. | |
| ### Ranges | |
| Currently there is both `Rng::gen_range` (for convenience) and | |
| `distributions::range::Range` (probably faster for repeated uses). This | |
| mostly makes sense: "range" is part-way between a simple value and a full | |
| distribution in complexity. | |
| Should `Range` be renamed `UniformRange` or similar? | |
| Can `Range` be modified to support some user-defined types? | |
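The split between a one-off convenience call and a reusable range object can be sketched as follows. This is a simplified illustration for `u32` only, not the crate's implementation: `Source`, `UniformRange` and `Counter` are hypothetical names, and the free function `gen_range` here only mimics the spirit of `Rng::gen_range`.

```rust
/// Hypothetical minimal source of uniform `u32` values.
pub trait Source {
    fn next_u32(&mut self) -> u32;
}

/// Uniform integers in `[low, high)`, with the rejection zone precomputed.
pub struct UniformRange {
    low: u32,
    span: u32,
    zone: u32,
}

impl UniformRange {
    pub fn new(low: u32, high: u32) -> Self {
        assert!(low < high, "UniformRange::new requires low < high");
        let span = high - low;
        // Largest multiple of `span` that does not exceed u32::MAX; values
        // at or above it are rejected to avoid modulo bias.
        let zone = u32::MAX - u32::MAX % span;
        UniformRange { low, span, zone }
    }

    pub fn sample<S: Source>(&self, rng: &mut S) -> u32 {
        loop {
            let v = rng.next_u32();
            if v < self.zone {
                return self.low + v % self.span;
            }
        }
    }
}

/// One-off convenience: pays the setup cost on every call.
pub fn gen_range<S: Source>(rng: &mut S, low: u32, high: u32) -> u32 {
    UniformRange::new(low, high).sample(rng)
}

/// Deliberately non-random source, so the behaviour is easy to follow.
pub struct Counter(pub u32);

impl Source for Counter {
    fn next_u32(&mut self) -> u32 {
        let v = self.0;
        self.0 = self.0.wrapping_add(1);
        v
    }
}
```

Reusing one `UniformRange` amortises the modulo in `new` over many samples, which is presumably where the speed difference for repeated use comes from.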
| ### Alternatives | |
| Explicitly-named generators would seem a good starting point, but this isn't a | |
| full design. Ideas: | |
| * `gen::uniform::<i32>(&mut rng)` | |
| * `gen::uniform_i32(&mut rng)` | |
| * `gen::uniform01::<f32>(&mut rng)` | |
| * `gen::range01::<f32>(&mut rng)` | |
| * `gen01::<f32>(&mut rng)` | |
| * `gen::open01::<f32>(&mut rng)` | |
| * `gen::char(&mut rng)` | |
| * `gen::codepoint(&mut rng)` | |
| Possibly all generators could be considered distributions, but with some API simplifications: | |
| * `distributions::Uniform` struct (full range) | |
| * `distributions::UniformRange` struct (specified range) | |
| * `distributions::Uniform01` struct (specifically for floating point) | |
| * `distributions::uniform` function to get a single `Uniform` value for convenience | |
| * etc. | |
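A few of the explicitly-named generators listed above might look roughly like this. It is only a sketch: the functions are written as free functions rather than inside a `gen` module to keep it short, and `Source` and `Replay` are hypothetical helpers, not existing `rand` items.

```rust
/// Hypothetical minimal source of uniform `u32` values.
pub trait Source {
    fn next_u32(&mut self) -> u32;
}

/// Uniform over all `u32` values.
pub fn uniform_u32<S: Source>(rng: &mut S) -> u32 {
    rng.next_u32()
}

/// Uniform in `[0, 1)`, using the top 24 bits, since `f32` has a 24-bit
/// significand and every such multiple of 2^-24 is exactly representable.
pub fn uniform01_f32<S: Source>(rng: &mut S) -> f32 {
    (rng.next_u32() >> 8) as f32 / (1u32 << 24) as f32
}

/// Uniform over valid Unicode scalar values, by rejection sampling.
pub fn codepoint<S: Source>(rng: &mut S) -> char {
    loop {
        // 21 bits cover 0..0x20_0000; surrogates and values above
        // 0x10_FFFF make `char::from_u32` return `None` and are retried.
        if let Some(c) = char::from_u32(rng.next_u32() >> 11) {
            return c;
        }
    }
}

/// Test source that replays a fixed list of values in a cycle.
pub struct Replay {
    values: Vec<u32>,
    pos: usize,
}

impl Replay {
    pub fn new(values: Vec<u32>) -> Self {
        assert!(!values.is_empty(), "Replay needs at least one value");
        Replay { values, pos: 0 }
    }
}

impl Source for Replay {
    fn next_u32(&mut self) -> u32 {
        let v = self.values[self.pos];
        self.pos = (self.pos + 1) % self.values.len();
        v
    }
}
```

With names like these the output range is visible at the call site, which avoids the type-dependent surprises noted under `Rand`.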
| Distributions | |
| -------------- | |
| `WeightedChoice` appears to have ownership issues; should an owning version be added? | |
| Should it be removed entirely? | |
| Does `Sample` have a reason to exist at all? E.g. random walks are not distributions | |
| from which someone samples, but random processes, where it can be useful to | |
| separately (1) get the current state and (2) update it. | |
| See [this comment](https://github.com/rust-lang-nursery/rand/pull/27#issuecomment-317393407) for more on the matter. | |
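The distinction between sampling a distribution and stepping a random process can be made concrete with a small sketch. `RandomWalk`, `Source` and `Replay` are hypothetical names used only for illustration.

```rust
/// Hypothetical minimal source of uniform `u32` values.
pub trait Source {
    fn next_u32(&mut self) -> u32;
}

/// A one-dimensional random walk: a process with state, not a distribution.
pub struct RandomWalk {
    position: i64,
}

impl RandomWalk {
    pub fn new(start: i64) -> Self {
        RandomWalk { position: start }
    }

    /// (1) Read the current state without consuming any randomness.
    pub fn state(&self) -> i64 {
        self.position
    }

    /// (2) Advance one step: +1 if the top bit of the next value is set,
    /// otherwise -1.
    pub fn step<S: Source>(&mut self, rng: &mut S) {
        if rng.next_u32() >> 31 == 1 {
            self.position += 1;
        } else {
            self.position -= 1;
        }
    }
}

/// Test source that replays a fixed list of values in a cycle.
pub struct Replay {
    values: Vec<u32>,
    pos: usize,
}

impl Replay {
    pub fn new(values: Vec<u32>) -> Self {
        assert!(!values.is_empty(), "Replay needs at least one value");
        Replay { values, pos: 0 }
    }
}

impl Source for Replay {
    fn next_u32(&mut self) -> u32 {
        let v = self.values[self.pos];
        self.pos = (self.pos + 1) % self.values.len();
        v
    }
}
```

A single mutating `sample` method would merge the two operations, so reading the state twice without advancing it would not be possible.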
| Can `IndependentSample` be renamed `Sample`? | |
| Various generators use type parameters named `Support` and `Sup`; this may | |
| be confusing since type parameters are typically named `T` (`S`, `A`, etc.); | |
| at least this confused me. |