Pseudorandom number generator (PRNG) is one of fundamental tools of applied mathematics, used in various applications such as cryptography, non-deterministic algorithms or modelling. However, "there is no silver bullet" and one PRNG doesn't fit all. More specifically, because PRNGs are
- Deterministic
- Periodic
usage of PRNGs in an environment unfit for them can lead to predictability or slow execution times and both are disastrous for industrial applications. Therefore, it is imperative to choose an appropriate PRNG for the right application, based on its statistical properties. One of such properties is splittability - the property of a PRNG which allows to split it into two PRNGs, each of which retains (at least to some extent) its qualities.
One of the core means of organising parallelism in any language is separation by data, which separates the workers so that they don't modify shared data. One of ways to achieve such separation is by splitting resources, so that each worker accesses its own copy. PRNG can be viewed as one of such resources. Thus, we come to the fact that splittable PRNG is one of a key factors for fast and parallel applications involving nondeterminism.
- Current standard generator of System.Random has no statistical foundation for its stdSplit function, which is also noted in the source. On the other hand, SplitMix generator from [1] is backed by a published paper, analysis and is specifically aimed at providing a splittable implementation.
- Current standard generator is known for its slow performance, splitting into n generators worsens the problem n-fold.
In order to be efficient while writing an implementation of a PRNG, one needs to first establish a benchmark for assessing its performance and the quality of generated random numbers. Luckily, [1] provides us with DieHarder and TestU01 test suites for the latter, while for the former one would need to
- Write test programs including most common usage cases a. Monte-Carlo simulations b. Quick-Check testing
- Inspect low-level representations of implemented algorithms
Also, it would be good to run cryptographic test suites such as [3] in order to evaluate SplitMix implementation's fitness for cryptographic purposes.
Therefore, as a result of this proposal one should produce:
- A well-documented SplitMix implementation in Haskell
- Results from running benchmarks for the written implementation
Before April 13: Read the papers relevant to the project such as [4] (paper on which current standard PRNG is based) and [5] (both referenced from the source), read [1] more thoroughly and look for more relevant papers (especially for ones about industrial applications using parallel random algorithms)
April 13 - May 25: Familiarize myself with Core representation, look at already implemented PRNGs' code, remember optimisations used, get development environment ready
May 26 - June 9: Implement SplitMix PRNG in Haskell, document it
June 10 - June 17: Find and fix bugs by using PRNG test suites mentioned above as well as self-written tests
June 18 - July 2: Write applications using splittable PRNG and estimate its performance, apply optimisations on Haskell level, document and explain changes made
July 3 - July 24: Look into Core representation of the implemented PRNG, write applications which exploit its weaknesses, optimize it on Core level, document and explain changes made
July 25 - August 10: Use [3] to find if SplitMix implementation from July 17 is fit for cryptographic purposes, if so try to apply the subsequent optimisations, produce a separate implementation if some are not possible. If implementation is unfit, find a way to fix it if the problems are minor, otherwise explore additional suggestions.
August 10 - August 21: Wrap up the work on both implementations, compile the documentation so far into a cohesive report
http://dl.acm.org/citation.cfm?id=2660195 http://www.inf.utfsm.cl/~hallende/download/Simul-2-2002/pads98.pdf http://csrc.nist.gov/publications/nistpubs/800-22-rev1a/SP800-22rev1a.pdf http://www.iro.umontreal.ca/~lecuyer/myftp/papers/cacm88.pdf http://www.cs.ou.edu/~rlpage/burtonpagerngjfp.pdf
I'm a second year graduate student at St Petersburg Academic University. After being introduced to functional programming I felt that its core ideas are very close to what I thought are the guidelines for writing reliable applications. From that time I wanted to be involved in an open-source Haskell project and considering that I am into theoretic cryptography, I think the fulfillment of this project will both benefit the community and be a great experience for me.
Unfortunately, I will probably need to take a week off around late May (step 2 of the timeline) to finish the year's business at the university, but I am pretty sure that it will not hinder the fulfillment of this project. I will notify the mentor of the exact dates, when I know them (around late April).
Github: https://github.com/nkartashov
Current resume: https://drive.google.com/file/d/0B4H5UegQoqcVWVRxdHpKZ0l2Xzg
Email: snailandmail@gmail.com
I'm surprised you don't mention the following work:
Are you aware of it? Have you considered it? Would it be an alternative to the approach you plan to take? That work is already targeted at Haskell. There might already be a prototype implementation around. Its interaction with QuickCheck specifically has been studied by the authors. (One of the authors is one of the original inventors of QuickCheck.) What motivates your use of a different PRNG?
(I have no opinion about the qualities of the different PRNGs. I'm just asking those questions because without considering them, your project might not have the best outcome.)