Skip to content

Instantly share code, notes, and snippets.

@gwynne
Created April 29, 2020 02:00
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save gwynne/d596cb9712849cdc033084b2eebaf680 to your computer and use it in GitHub Desktop.
Save gwynne/d596cb9712849cdc033084b2eebaf680 to your computer and use it in GitHub Desktop.

Notes and corrections for draft-irtf-crfg-argon2-10

There are a number of issues of varying severity enumerated this document, but there is one highly critical issue of actual correctness that concerns me more than all the rest combined; that issue is found in section 3.2, step 6, regarding the calculation of blocks during iterations after the first. See the appropriate entry in this document for details.

How to read these notes

I have attempted to tag all of my remarks and suggestions according to their particular seriousness and effect. The definitions of the tags I've used, for the sake of avoiding any confusion, are (listed here very roughly in descending priority order):

  • [CORRECTNESS] - An issue of actual technical correctness. These are definitively the most critical concerns found in these notes, as they represent unambiguous errors of actual fact, to the best of my ability to so identify. Reserved for blatantly incorrect definitions and critical content omissions.

  • [CLARITY] - Any issue where the specification is correct in purely technical terms or the correct behavior can be inferred, but relies upon the reader to already have perused the reference implementation, done additional research, or otherwise be aware of behaviors and definitions provided by this specification. These include issues such as confusing phrasing, omitted definitions, conflicting statements, and incomplete specifications.

  • [SPELLING] - An error of English spelling. These are typically simple typos.

  • [GRAMMAR] - An error or concern regarding use of English syntax and grammar. There are a few cases where these represent actual misuse of language, but most of my uses of this tag are particular to my personal understanding and perceiption of formal prose as typically found in published RFCs (and even then I've held back in a number of cases). As a result, some of the issues marked with this tag could instead be called [STYLE] issues, depending on personal opinion.

  • [STYLE] - A concern of either prose or specification detail which concerned me sufficiently to take specific note of it but does not rise to the level of an error or a major impediment to comprehension. I consider notes having this tag to be of the lowest priority relative to the resat.

  • [OPINION] - Something I think might be worthwhile to provide or change, but can provide no strong justification for in terms of formal specification.

Wherever paragraph numbers are referenced in this document, the number given is relative to the start of the containing section, not the start of the page.

§ 1 Introduction

  • Pp. 1, para. 2: "Argon2 reflects the state of the art..."

[STYLE] This statement, while broadly accurate at the time of writing, is highly unlikely to remain so indefinitely. Suggest rephrase.

  • Pp. 2, para. 2: "Argon2 can be viewed as a mode of operation..."

[STYLE] Use of passive voice is often a telltale of poor prose. Suggest active role, e.g. "We view Argon2 as a mode of operation..."

  • Pp. 2, para. 2: "Even though Argon2 can potentially be used with arbitrary function H, as long as it provides..."

[GRAMMAR] The sentence is missing the indefinite article "an" before "arbitrary". Alternate phrasing is also possible.

  • Pp. 2, para. 2: "... up to 64 bytes, in this document it MUST be BLAKE2b [BLAKE2]."

[STYLE] The explicit normative definition of H as BLAKE2b is not repeated with the use of the "MUST" key word elsewhere. Such a declaration is most appropriately placed in sections 2 or 3, closer to the point of usage.

§ 2 Notation and Conventions

  • Pp. 2, item 3: "substraction of integer c with integer d"

[SPELLING] Misspelling of "subtraction".

[GRAMMAR] Formal English phrasing in mathematics prefers the preposition "from" and reverse ordering when discussion subtraction, e.g "subtraction of integer d from integer c"

  • Pp. 2, item 5: "The result is rational number"

[GRAMMAR] The indefinite article "a" preceeding "rational" is missing.

  • Pp. 2, item 6: "function I evaluated on integer parameter j"

[GRAMMAR] Formal English grammar prefers "from" to "on" here.

  • Pp. 3, items 8, 9: "converted to bytestring"

[GRAMMAR] The indefinite article "a" is again missing before "bytestring".

§ 3 Argon2 Algorithm

§ 3.2 Argon2 Operation

  • Pp. 4, para. 1 (entire paragraph)

[CLARITY] The operation of the hash function H^x() herein is described vaguely. No formal definition in procedural notation is ever given.

[GRAMMAR] At the end of the paragraph, there are two section references; in both cases the word "Section" is repeated twice.

  • Pp. 4, step 1: "If K,X, or S has zero length it is just absent."

[GRAMMAR] A space after the comma between K and X is missing.

[CLARITY] This sentence does not clarify whether the length counts of the byte strings are also removed from the calculation of M0 when the strings themselves are (they are not). Suggest an explicit statement that the length 0 is still encoded for each zero-length value.

  • Pp. 5, step 5: Compute B[i][j] for all i ranging from (and including) 0 to (not including) p, and for all j ranging from (and including) 2) to (not including) q.

[CLARITY] The phrasing of this step strongly suggests that i represents the outer loop, while j represents the inner loop, filling the block matrix row-by-row. This results in incorrect results and misbehavior of the algorithm; the referenced determination of l and z depends on the block matrix filling in column-by-column order (top-to-bottom), such that j represents the outer iteration loop with respect to i. Suggest rewording to explcitly clarify this requirement.

[GRAMMAR] Additionally, at the end of step 5, the word Section is again repeated.

  • Pp. 5, step 6: "... we repeat the steps however replacing ..."

[GRAMMAR] Consider the phrasing "we repeat the steps. However, we replace..."

  • Pp. 5, step 6: "replacing the computations with the following expression:"

[CORRECTNESS, CRITICAL] The provided expressions are correct for version 1.0 of the Argon2 algorithm but do NOT describe verson 1.3 as claimed. To do so, they must be replaced with:

    B[i][0] = B[i][0] XOR G(B[i][q-1], B[l][z])
    B[i][j] = B[i][j] XOR G(B[i][j-1], B[l][z])

§ 3.3 Variable-length hash function H'

  • Pp. 6: "Tag computation"

[CORRECTNESS] The variable-length hash function is used for computing the output tag, but it is also used for computing the initial 2 blocks in each lane of the main algorithm loop. Labeling the functional block "tag computation" is thus incorrect and misleading. Suggest a label such as "Variable-length hash computation" instead.

§ 3.4 Indexing

  • Pp. 6, para. 1: "The intersection of a slice and a lane is a segment of length q/SL."

[CLARITY] This definition of a segment is never given in formal notation. I recommend the variable name u to represent this definition; I reuse this recommendation in multiple calculations further down.

§ 3.4.1 Getting the 32-bit values J_1 and J_2

  • Pp. 7

[CLARITY] To match other usage in the spec, this section title should use "Computing" in place of "Getting".

§ 3.4.1.1 Argon2d
  • Pp. 7, functional block: J_1 = int32(extract(B[i][j-1], 1)) and proceeding line

[CORRECTNESS] The definition of the "extract()" function given in section 2 clearly states that the index of the selected 32-bit group is zero-based. The body paragraph immediately above these functional definitions additionally states J_1 is given by the first 32 bits of block B[i][j-1]. The definitions provided would yield the second and third 32 bits of the block. The parameters "1" and "2" should be replaced with "0" and "1" respectively.

§ 3.4.1.2 Argon2i
  • Pp.7, para. 1, entire paragraph

[CLARITY] There is no formal definition of how G is applied "in counter mode" to yield the referenced "128 64-bit values". This should, at minimum, be given as something like X = G(ZERO, G(ZERO, INPUT)) (where "INPUT" is the definition of the input block given immediately following).

[CORRECTNESS] As written, the paragraph strongly suggests that the correct mode of operation is to recalculate the X block for every single computation of J1 and J2; this is incorrect. The correct procedure is to calculate the complete block X with the counter i = 1 at the beginning of each segment, then to cycle through the 128 64-bit groups within the block to get values for J1 and J2. If all 128 "addresses" are used without completing a single segment (which will happen whenever q is greater than 512), a new X must be computed with the counter i incremented by one, repeating every 128 blocks until the single segment is complete, at which point any remaining unused values in the block are discarded.

  • Pp.7, para. 2: "The values r, l, sl, m', t, x, i are represented..."

[CORRECTNESS] The reference to x here should instead be a reference to y.

§ 3.4.2 Mapping J_1 and J_2 to reference block index

  • Pp.7, para. 1: "For the firt pass..."

[SPELLING] firt should obviously be first.

  • Pp.8, para. 4: "We are going to take a block from W with a non-uniform distribution over [0, |W|) using the mapping"

[STYLE] The use of "we" here is not in keeping with the style of prose used throughout up to this point (not to mention the lack of punctuation at the end). Suggest instead the impersonal voice, such as "Take a block from W using a non-uniform distribution over [0, |W|) via the mapping:"

  • Pp. 8, para. 6: "The value of z gives the reference block index in W."

[CLARITY] This sentence and the preceeding formal definition refer to a variable named z, named identically to a related value referenced in steps 5 and 6 of the main algorithm loop; these two z values are NOT the same, but one is determined on the basis of the other. I suggest renaming the one in this section to z'; I will refer to it as such henceforward.

[CLARITY] This sentence refers to indexing into W to find the appropriate block index z to use in the main algorithm loop. Given that z' is calculated as an index into [0, |W|), it should be clearly defined that the correct value of z (for use in the main algorithm loop) is computed as: z = (st + z') % q, where st is computed as if r == 0 then st = 0 else st = ((sl + 1) % SL) * (q / SL). st here refers to a "start position", e.g. the offset in blocks into the referenced lane l at which W begins, and r is the index of the current iteration of the main algorithm loop.

[OPINION] The calculations referenced by this section can be formulated as follows. Please note I have not completed a full cross-check of this set of equations as of the time of this writing; they may contain inaccuracies. This is intended as an informative suggestion, not as a normative reference. Should I later find any inaccuracies herein, I will provide corrections as appropriate:

sl = j / u
l = if r == 0 and sl == 0 then i else J_2 mod p
W_sl = if r == 0 then sl else SL
W_ex = if l == i then j % u else min(j % u, 1) - 1
|W| = (W_sl * u) + W_ex
x' = J_1^2 / 2^(32)
y' = (|W| * x') / 2^(32)
z' = |W| - 1 - y'
z = ((min(r, 1) * (((sl + 1) % SL) * u)) + z') % q

§ 3.5 Compression function G

  • Pp. 8, para. 1: "Compression function G is built upon the BLAKE2b round function P."

[GRAMMAR] The definite article "the" is missing from the beginning of this sentence.

[CLARITY] This sentence claims P is the BLAKE2b round function, but the following section (3.6) more correctly defines P as a permutation function based on (but not identical to) BLAKE2b's round function. Suggested rephrase: "The compression function G is built on a permutation function P."

  • Pp. 8, para. 2: "Compression function G(X, Y) operates..."

[GRAMMAR] The "the" definite article is again missing.

Addendum

A significant number of additional grammatical corrections (and a few clairifications) could be offered for text following this point, but there are no further issues of actual correctness or significant omissions requiring immediate attention as far as I could determine. In the interest of not (further) obscuring the more critical concerns, I have stopped here in my notes for now, though I intend to add the additional notations at a later time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment