@kakkun61
Last active August 8, 2020 09:27
Haskell 2010 Language Report

10.3 Layout

Section 2.7 gives an informal discussion of the layout rule. This section defines it more precisely.

The meaning of a Haskell program may depend on its layout. The effect of layout on its meaning can be completely described by adding braces and semicolons in places determined by the layout. The meaning of this augmented program is now layout insensitive.

The effect of layout is specified in this section by describing how to add braces and semicolons to a laid-out program. The specification takes the form of a function L that performs the translation. The input to L is:

  • A stream of lexemes as specified by the lexical syntax in the Haskell report, with the following additional tokens:

    • If a let, where, do, or of keyword is not followed by the lexeme {, the token {n} is inserted after the keyword, where n is the indentation of the next lexeme if there is one, or 0 if the end of file has been reached.

    • If the first lexeme of a module is not { or module, then it is preceded by {n} where n is the indentation of the lexeme.

    • Where the start of a lexeme is preceded only by white space on the same line, this lexeme is preceded by < n > where n is the indentation of the lexeme, provided that it is not, as a consequence of the first two rules, preceded by {n}. (NB: a string literal may span multiple lines – Section 2.6. So in the fragment

      f = ("Hello \  
              \Bill", "Jake")
      

      There is no < n > inserted before the \Bill, because it is not the beginning of a complete lexeme; nor before the ,, because it is not preceded only by white space.)

  • A stack of “layout contexts”, in which each element is either:

    • Zero, indicating that the enclosing context is explicit (i.e. the programmer supplied the opening brace). If the innermost context is 0, then no layout tokens will be inserted until either the enclosing context ends or a new context is pushed.

    • A positive integer, which is the indentation column of the enclosing layout context.

The “indentation” of a lexeme is the column number of the first character of that lexeme; the indentation of a line is the indentation of its leftmost lexeme. To determine the column number, assume a fixed-width font with the following conventions:

  • The characters newline, return, linefeed, and formfeed, all start a new line.
  • The first column is designated column 1, not 0.
  • Tab stops are 8 characters apart.
  • A tab character causes the insertion of enough spaces to align the current position with the next tab stop.
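
The column conventions above can be sketched in Haskell as follows. This is a minimal illustration, not part of the Report: the names `columns`, `nextTabStop` and `lineIndent` are hypothetical, tab stops are assumed to sit at columns 1, 9, 17 and so on, and the first non-blank character of a line is taken as the start of its leftmost lexeme (comments are not treated specially).

```haskell
-- | Pair every character of one source line with its 1-based column,
-- assuming tab stops at columns 1, 9, 17, ... (8 characters apart).
columns :: String -> [(Char, Int)]
columns = go 1
  where
    go _ [] = []
    go col (c : cs)
      | c == '\t' = (c, col) : go (nextTabStop col) cs
      | otherwise = (c, col) : go (col + 1) cs

-- | Column reached after a tab character typed at column @col@.
nextTabStop :: Int -> Int
nextTabStop col = ((col - 1) `div` 8 + 1) * 8 + 1

-- | Indentation of a line: the column of its first non-blank character,
-- or Nothing for a blank line. (Assumption: that character begins a lexeme.)
lineIndent :: String -> Maybe Int
lineIndent s = case [col | (c, col) <- columns s, c /= ' ', c /= '\t'] of
  (col : _) -> Just col
  []        -> Nothing
```

Under these assumptions, `lineIndent "\tfoo"` is `Just 9` and `lineIndent "  x"` is `Just 3`.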

For the purposes of the layout rule, Unicode characters in a source program are considered to be of the same, fixed, width as an ASCII character. However, to avoid visual confusion, programmers should avoid writing programs in which the meaning of implicit layout depends on the width of non-space characters.
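
The three token-insertion rules listed earlier (the {n} token after let, where, do or of; the {n} token before the first lexeme of a module; and the < n > token before a lexeme that starts a line) might be sketched like this. The `Lexeme` and `Tok` types and the function `annotate` are hypothetical names chosen for illustration. The sketch assumes the lexer already supplies 1-based positions, including the line on which each lexeme ends, so that a multi-line string literal does not make the lexeme after it look like the first one on its line.

```haskell
-- | A lexeme with its position (a hypothetical representation).
data Lexeme = Lexeme
  { startLine :: Int     -- line of the first character (1-based)
  , startCol  :: Int     -- column of the first character (1-based)
  , endLine   :: Int     -- line of the last character
  , text      :: String
  }

-- | Ordinary lexemes plus the two indicators: Open n is {n}, Indent n is <n>.
data Tok = Tok String | Open Int | Indent Int
  deriving (Eq, Show)

annotate :: [Lexeme] -> [Tok]
annotate [] = []
annotate ls@(l0 : _) = [Open (startCol l0) | implicitTop] ++ go implicitTop 0 ls
  where
    -- Second rule: the module does not start with "{" or "module".
    implicitTop = text l0 `notElem` ["{", "module"]

    -- afterOpen: a {n} was just emitted; prevEnd: line where the previous
    -- lexeme ended (0 before the first lexeme).
    go :: Bool -> Int -> [Lexeme] -> [Tok]
    go _ _ [] = []
    go afterOpen prevEnd (l : rest) =
      [Indent (startCol l) | startLine l /= prevEnd, not afterOpen]
        ++ [Tok (text l)]
        ++ [Open n | Just n <- [opened]]
        ++ go (opened /= Nothing) (endLine l) rest
      where
        -- First rule: a layout keyword not followed by an explicit "{".
        opened
          | text l `notElem` ["let", "where", "do", "of"] = Nothing
          | otherwise = case rest of
              []                      -> Just 0
              (r : _) | text r == "{" -> Nothing
                      | otherwise     -> Just (startCol r)
```

The third rule is the list comprehension guarded by `startLine l /= prevEnd` and `not afterOpen`: a lexeme gets < n > only when it is first on its line and was not already given {n}.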

The application L tokens [] delivers a layout-insensitive translation of tokens, where tokens is the result of lexically analysing a module and adding column-number indicators to it as described above. The definition of L is as follows, where we use “:” as a stream construction operator, and “[]” for the empty stream.

L (< n > : ts) (m : ms) = ";" : (L ts (m : ms))           if m = n
                        = "}" : (L (< n > : ts) ms)       if n < m
L (< n > : ts) ms       = L ts ms
L ({n} : ts) (m : ms)   = "{" : (L ts (n : m : ms))       if n > m                    (Note 1)
L ({n} : ts) []         = "{" : (L ts [n])                if n > 0                    (Note 1)
L ({n} : ts) ms         = "{" : "}" : (L (< n > : ts) ms)                             (Note 2)
L ("}" : ts) (0 : ms)   = "}" : (L ts ms)                                             (Note 3)
L ("}" : ts) ms         = parse-error                                                 (Note 3)
L ("{" : ts) ms         = "{" : (L ts (0 : ms))                                       (Note 4)
L (t : ts) (m : ms)     = "}" : (L (t : ts) ms)           if m ≠ 0 and parse-error(t) (Note 5)
L (t : ts) ms           = t : (L ts ms)
L [] []                 = []
L [] (m : ms)           = "}" : L [] ms                   if m ≠ 0                    (Note 6)
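
The equations above translate almost line for line into Haskell. The following is a sketch under stated assumptions rather than a reference implementation: `Tok`, `ParseErrorOracle` and `layoutWith` are hypothetical names, failure is reported with `Left`, and the side condition parse-error(t) is delegated to a caller-supplied function that receives the tokens emitted so far and the next token. A real implementation would have to consult the parser at that point.

```haskell
-- | Input tokens: ordinary lexemes, {n} (Open n) and <n> (Indent n).
data Tok = Tok String | Open Int | Indent Int
  deriving (Eq, Show)

-- | Stand-in for parse-error(t): tokens emitted so far, then the next token.
type ParseErrorOracle = [String] -> String -> Bool

layoutWith :: ParseErrorOracle -> [Tok] -> Either String [String]
layoutWith parseError toks0 = go [] toks0 []
  where
    -- out holds the emitted tokens, most recent first; ctx is the stack
    -- of layout contexts (0 marks an explicit brace).
    go :: [String] -> [Tok] -> [Int] -> Either String [String]
    go out toks ctx = case (toks, ctx) of
      (Indent n : ts, m : ms)
        | m == n -> go (";" : out) ts (m : ms)
        | n < m  -> go ("}" : out) (Indent n : ts) ms
      (Indent _ : ts, ms) -> go out ts ms
      (Open n : ts, m : ms)
        | n > m -> go ("{" : out) ts (n : m : ms)                  -- Note 1
      (Open n : ts, [])
        | n > 0 -> go ("{" : out) ts [n]                           -- Note 1
      (Open n : ts, ms) -> go ("}" : "{" : out) (Indent n : ts) ms -- Note 2
      (Tok "}" : ts, 0 : ms) -> go ("}" : out) ts ms               -- Note 3
      (Tok "}" : _, _) -> Left "explicit } does not match an explicit {"
      (Tok "{" : ts, ms) -> go ("{" : out) ts (0 : ms)             -- Note 4
      (Tok t : ts, m : ms)                                         -- Note 5
        | m /= 0, parseError (reverse out) t -> go ("}" : out) (Tok t : ts) ms
      (Tok t : ts, ms) -> go (t : out) ts ms
      ([], []) -> Right (reverse out)
      ([], m : ms)                                                 -- Note 6
        | m /= 0    -> go ("}" : out) [] ms
        | otherwise -> Left "end of input inside an explicit { context"
```

With an oracle that never reports an error, `layoutWith (\_ _ -> False) [Open 1, Tok "f", Tok "=", Tok "1", Indent 1, Tok "g", Tok "=", Tok "2"]` yields `Right ["{","f","=","1",";","g","=","2","}"]`. A guard that fails inside a `case` alternative falls through to the next alternative, which is what lets the clauses keep the same order as the equations.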

Note 1.

A nested context must be further indented than the enclosing context (n > m). If not, L fails, and the compiler should indicate a layout error. An example is:

f x = let  
          h y = let  
  p z = z  
                in p  
      in h

Here, the definition of p is indented less than the indentation of the enclosing context, which is set in this case by the definition of h.

Note 2.

If the first token after a where (say) is not indented more than the enclosing layout context, then the block must be empty, so empty braces are inserted. The {n} token is replaced by < n >, to mimic the situation if the empty braces had been explicit.
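
As a small illustration of this case, consider a class and an instance declared with no members. The names `Marker` and `describe` are made up for the example, and the file is assumed to begin directly with the class declaration, so the enclosing layout context is column 1.

```haskell
-- The lexeme after each `where` (here `instance`, then `describe`) starts
-- in column 1, which is not greater than the enclosing context (also 1).
-- The layout algorithm therefore treats both blocks as empty, as if the
-- source read:  class Marker a where {}; instance Marker Int where {}
class Marker a where

instance Marker Int where

-- Usable only at types with a Marker instance.
describe :: Marker a => a -> String
describe _ = "marked"
```

Because both blocks are empty, `describe (0 :: Int)` type-checks against the instance above and evaluates to `"marked"`.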

Note 3.

By matching against 0 for the current layout context, we ensure that an explicit close brace can only match an explicit open brace. A parse error results if an explicit close brace matches an implicit open brace.

Note 4.

This clause means that all brace pairs are treated as explicit layout contexts, including labelled construction and update (Section 3.15). This is a difference between this formulation and Haskell 1.4.

Note 5.

The side condition parse-error(t) is to be interpreted as follows: if the tokens generated so far by L together with the next token t represent an invalid prefix of the Haskell grammar, and the tokens generated so far by L followed by the token “}” represent a valid prefix of the Haskell grammar, then parse-error(t) is true.

The test m ≠ 0 checks that an implicitly-added closing brace would match an implicit open brace.

Note 6.

At the end of the input, any pending close-braces are inserted. It is an error at this point to be within a non-layout context (i.e. m = 0).

If none of the rules given above matches, then the algorithm fails. It can fail for instance when the end of the input is reached, and a non-layout context is active, since the close brace is missing. Some error conditions are not detected by the algorithm, although they could be: for example let }.

Note 5 implements the feature that layout processing can be stopped prematurely by a parse error. For example:

let x = e; y = x in e'

is valid, because it translates to

let { x = e; y = x } in e'

The close brace is inserted due to the parse error rule above.
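
A concrete instance of the same shape can be checked directly. In this sketch the names `implicitBraces` and `explicitBraces` and the numeric bindings are illustrative stand-ins for e and e'.

```haskell
-- Layout form: the implicit block opened by `let` is closed when the
-- parser meets `in`, where a "}" is the only way to continue.
implicitBraces :: Int
implicitBraces = let x = 20; y = x + 1 in x + y

-- The same expression with the braces written out explicitly.
explicitBraces :: Int
explicitBraces = let { x = 20; y = x + 1 } in x + y
```

Both definitions denote the same expression, and both evaluate to 41.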

The authors and publisher intend this Report to belong to the entire Haskell community, and grant permission to copy and distribute it for any purpose, provided that it is reproduced in its entirety, including this Notice. Modified versions of this Report may also be copied and distributed for any purpose, provided that the modified version is clearly presented as such, and that it does not claim to be a definition of the language Haskell 2010.