1. tokenization
2. parsing
3. checking and elaboration (i.e. producing TypedTree)
- $"..." as plain string
- $"..." as FormattableString
- $"..." as PrintFormat
4. FSharp.Core support (printf.fs)
Code examples:
printf "abc %d def" 3
$"abc {1+1} def"
@$"abc {1+1} def"
$@"abc {1+1} def"
$"""abc {1+1} def"""
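As a quick check of the forms listed above, the following sketch binds each one to a value. It assumes F# 5 or later. `sprintf` is used instead of `printf` so the result can be compared, and the argument is 2 rather than 3 so that all five strings are equal; the binding names are made up for illustration.

```fsharp
// Each binding below evaluates to the same string, "abc 2 def".
let viaSprintf = sprintf "abc %d def" 2          // old-style format string
let regular = $"abc {1+1} def"                   // interpolated
let verbatimAtFirst = @$"abc {1+1} def"          // verbatim interpolated, @ first
let verbatimDollarFirst = $@"abc {1+1} def"      // verbatim interpolated, $ first
let tripleQuoted = $"""abc {1+1} def"""          // triple-quoted interpolated

let allForms =
    [ viaSprintf; regular; verbatimAtFirst; verbatimDollarFirst; tripleQuoted ]
```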
## Tokenization
token, get " -> string (args with NormalString)
token, get $" -> string (args with InterpolatedString)
token, get {, and args.Stack~~InterpolatedString -> token (args.PushABrace())
token, get }, and args.Stack~~InterpolatedString and Braces=1 -> string/vstring/tqstring (args with NormalString)
token, get }, and args.Stack~~InterpolatedString and Braces=N -> token (args.PopABrace())
token, get @" -> verbatimString
token, get @$", $@" -> verbatimString
token, get """ -> tripleQuoteString
token, get $""" -> tripleQuoteString
string, get "{", and args.InterpolatedString --> produce INTERPOLATED_STRING_FRAGMENT, then go to token state + push
vstring, get "{", and args.InterpolatedString --> produce INTERPOLATED_STRING_FRAGMENT, then go to token (args.Push "vstring")
New tokens:
INTERP_STRING_BEGIN_END --> $"cvkjvrkjhrve" $"""vrkjhrvhrewhervj"""
INTERP_STRING_BEGIN_PART --> $"vrwhwver { $"""vrwjrlvjwe {
INTERP_STRING_PART --> } vrewhvrehkjervh {
INTERP_STRING_END --> } vrwkwjervh"
fsc --tokenize test.fs
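
The state transitions and new tokens above can be sketched with a toy tokenizer. This is not the compiler's lexer, only a minimal model under several assumptions: it handles a single `$"..."` literal, ignores escapes, verbatim and triple-quoted forms, and nested strings, and it does not lex the fill expression (the hypothetical `FillSource` case just carries its raw text). The type and function names are invented for illustration.

```fsharp
// Toy model of the four INTERP_STRING_* tokens plus the raw fill text.
type Token =
    | InterpStringBeginEnd of string   // $"text"
    | InterpStringBeginPart of string  // $"text {
    | InterpStringPart of string       // } text {
    | InterpStringEnd of string        // } text"
    | FillSource of string             // hypothetical: unlexed source of a fill

let tokenize (src: string) : Token list =
    if not (src.StartsWith("$\"")) then invalidArg "src" "expected a $\"...\" literal"
    let tokens = ResizeArray<Token>()
    let buf = System.Text.StringBuilder()
    let mutable inString = true    // true: 'string' state, false: 'token' state
    let mutable braces = 0         // open braces seen while in the 'token' state
    let mutable seenFill = false   // has a fragment already been emitted?
    let mutable finished = false
    let mutable i = 2
    while i < src.Length && not finished do
        let c = src.[i]
        if inString then
            match c with
            | '{' ->
                // End of a fragment: emit it, go to token state, push a brace.
                let text = buf.ToString()
                tokens.Add(if seenFill then InterpStringPart text else InterpStringBeginPart text)
                buf.Clear() |> ignore
                seenFill <- true
                inString <- false
                braces <- 1
            | '"' ->
                let text = buf.ToString()
                tokens.Add(if seenFill then InterpStringEnd text else InterpStringBeginEnd text)
                buf.Clear() |> ignore
                finished <- true
            | _ -> buf.Append(c) |> ignore
        else
            match c with
            | '{' ->
                braces <- braces + 1          // nested brace inside the fill
                buf.Append(c) |> ignore
            | '}' when braces = 1 ->
                // Last open brace closed: back to the string state.
                tokens.Add(FillSource(buf.ToString().Trim()))
                buf.Clear() |> ignore
                braces <- 0
                inString <- true
            | '}' ->
                braces <- braces - 1          // pop a nested brace
                buf.Append(c) |> ignore
            | _ -> buf.Append(c) |> ignore
        i <- i + 1
    if not finished then failwith "unterminated interpolated string"
    List.ofSeq tokens
```

The brace counter is what lets a fill such as `{ {| A = 1 |}.A }` contain its own braces without ending the fill early.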
## Parsing
atomicExprAfterType: // Q: WHY THIS ONE
    | interpolatedString
interpolatedStringFill:
    | declExpr
    | declExpr COLON ident %prec interpolation_fill
interpolatedStringParts:
    | INTERP_STRING_PART interpolatedStringFill interpolatedStringParts
interpolatedString:
    | INTERP_STRING_BEGIN_PART interpolatedStringFill interpolatedStringParts
Giving these SyntaxTree extensions:
type SynExpr =
    | InterpolatedString of
        contents: SynInterpolatedStringPart list *
        range: range
type SynInterpolatedStringPart =
    | String of string * range
    | FillExpr of SynExpr * Ident option
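
To make the shape of these extensions concrete, here is a simplified stand-in that compiles on its own. It is an assumption-laden sketch: `range` is dropped, the hypothetical `Expr` type replaces `SynExpr`, and the format identifier is a plain `string option` instead of `Ident option`.

```fsharp
// Simplified stand-ins for SynExpr / SynInterpolatedStringPart (no ranges).
type Expr =
    | Const of int
    | Ident of string
    | Add of Expr * Expr
    | InterpolatedString of InterpolatedStringPart list

and InterpolatedStringPart =
    | String of string
    | FillExpr of Expr * string option   // fill expression and optional format, e.g. "N"

// Roughly what the parser would produce for: $"abc {1+1} def {y:N}"
let tree =
    InterpolatedString
        [ String "abc "
          FillExpr (Add (Const 1, Const 1), None)
          String " def "
          FillExpr (Ident "y", Some "N") ]

// Count the fill expressions in an interpolated string node.
let fillCount (expr: Expr) =
    match expr with
    | InterpolatedString parts ->
        parts
        |> List.filter (function FillExpr _ -> true | String _ -> false)
        |> List.length
    | _ -> 0
```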
## Checking and Elaboration
1. $"..." : overallTy --> Check if overallTy unifies with 'string' etc. as per spec
2. Then put together the fragments into one format string using `%P()` or `%alignmentP(format)` as holes as per spec
3. Do normal format string checking of the overall format string, with %P(..) allowed
--> Extract type information about the format string
4. In the case where $"..." is being used as a string or a PrintfFormat:
Make a call to PrintfFormat<...>(format)
Fill in Captures and CaptureTypes in the PrintfFormat object.
If $"..." is being used as a string then call "sprintf" taking the PrintfFormat as argument
$"abc{x,5}" --> Printf.sprintf (new PrintfFormat("abc%5P()", [| box x |], null))
$"abc{1+1}def" --> Printf.sprintf (new PrintfFormat("abc%P()def", [| box (1+1) |], null))
$"abc%d{1+1}def" --> Printf.sprintf (new PrintfFormat("abc%d%P()def", [| box (1+1) |], null))
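
Step 2, joining the fragments into one format string with `%P()` holes, can be sketched as below. This is only an illustration of the idea, not the checker's code: the `Part` type and `buildFormat` are hypothetical names, fills carry already-evaluated boxed values, and only the alignment form `%<n>P()` is modelled (not `%alignmentP(format)` with a .NET format).

```fsharp
// Hypothetical, simplified view of the checked fragments of $"...".
type Part =
    | Text of string              // literal fragment, may itself contain %d etc.
    | Fill of obj * int option    // captured value and optional alignment

// Build the overall format string and the Captures array.
let buildFormat (parts: Part list) : string * obj[] =
    let sb = System.Text.StringBuilder()
    let captures = ResizeArray<obj>()
    for part in parts do
        match part with
        | Text s -> sb.Append(s) |> ignore
        | Fill (value, alignment) ->
            let hole =
                match alignment with
                | Some n -> "%" + string n + "P()"
                | None -> "%P()"
            sb.Append(hole) |> ignore
            captures.Add(value)
    sb.ToString(), captures.ToArray()

// $"abc{x,5}" with x = 7, and $"abc%d{1+1}def"
let aligned = buildFormat [ Text "abc"; Fill (box 7, Some 5) ]
let mixed = buildFormat [ Text "abc%d"; Fill (box (1 + 1), None); Text "def" ]
```

The resulting pair corresponds to the first two arguments of the `PrintfFormat` constructor calls shown above.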
In the case where $"..." is being used as a .NET FormattableString, different codegen is needed and
more restrictions apply (e.g. no % patterns are allowed), as per spec. Codegen becomes a
call to FormattableStringFactory.Create, e.g.
($"abc {x} {y:N}" : FormattableString)
--> FormattableStringFactory.Create("abc {0} {1:N}", [| box x; box y |])
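
The FormattableString case can be observed from user code. The sketch below assumes F# 5 or later; `x` and `y` are arbitrary sample values, and the explicit `FormattableStringFactory.Create` call is written by hand only to compare against what the type-annotated interpolated string yields.

```fsharp
open System
open System.Runtime.CompilerServices

let x = 42
let y = 1234.5

// The type annotation selects the FormattableString elaboration.
let fs : FormattableString = $"abc {x} {y:N}"

// Hand-written equivalent of the codegen described above.
let viaFactory = FormattableStringFactory.Create("abc {0} {1:N}", box x, box y)

let formatText = fs.Format          // composite format string with {0}, {1:N} holes
let argCount = fs.ArgumentCount     // number of captured arguments
```

Both values render through `String.Format`-style formatting, so `fs.ToString()` and `viaFactory.ToString()` agree under the same culture.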
## printf at runtime
- Given format string object containing
.FormatString (.Value) --> the string, e.g. "abc%d%P()def"
.Captures --> null for a normal old-style printf, non-null for capturing interpolation
.CaptureTypes --> null for a normal old-style printf, non-null if there are %A patterns
- Aim of `sprintf` is EITHER
1. produce a string (if interpolated printf formatting)
2. produce a curried function of the right type (if old-style printf formatting)
- Two phase approach
1. crack the format string into an array of "steps"
1b. if producing a curried function, generate the curried function now `(fun arg1 -> (fun arg2 -> .... <phase2>))`
2. iterate over the steps writing the output fragments
There is a two-level, type-directed cache table:
type Cache<'Printer, 'Residue, '...> =
    static let mutable recent = ...
    static let mutable dict = ConcurrentDictionary....
The Cache holds the results of phase 1
This is how printf has always worked since 2012 or so. The main addition here is that "phase 2" can fill in the arguments
and relevant %A types from Captures/CaptureTypes rather than the arguments of the curried function chain.
The basic runtime action of sprintf is to:
1. Look up cache, populate with phase1 results if needed
2. run phase 2, return the string.
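
The two-phase flow with a cache can be sketched as follows. This is a toy model, not printf.fs: it recognises only `%P()` holes, uses a single `ConcurrentDictionary` keyed by the format string instead of the two-level, type-directed cache, and covers only the interpolated case where the arguments come from Captures. The names `Step`, `crack`, `stepCache` and `interpolate` are hypothetical.

```fsharp
open System
open System.Collections.Concurrent

type Step =
    | Literal of string   // text copied to the output as-is
    | Hole                // a %P() hole, filled from Captures in phase 2

// Phase 1: crack the format string into an array of steps.
let crack (format: string) : Step[] =
    let steps = ResizeArray<Step>()
    let mutable pos = 0
    let mutable next = format.IndexOf("%P()", pos, StringComparison.Ordinal)
    while next >= 0 do
        if next > pos then steps.Add(Literal(format.Substring(pos, next - pos)))
        steps.Add(Hole)
        pos <- next + 4
        next <- format.IndexOf("%P()", pos, StringComparison.Ordinal)
    if pos < format.Length then steps.Add(Literal(format.Substring(pos)))
    steps.ToArray()

// The cache holds the results of phase 1.
let stepCache = ConcurrentDictionary<string, Step[]>()

let interpolate (format: string) (captures: obj[]) : string =
    // 1. Look up the cache, populating it with phase 1 results if needed.
    let steps = stepCache.GetOrAdd(format, Func<string, Step[]>(crack))
    // 2. Run phase 2: write the fragments, taking arguments from the captures.
    let sb = System.Text.StringBuilder()
    let mutable i = 0
    for step in steps do
        match step with
        | Literal s -> sb.Append(s) |> ignore
        | Hole ->
            sb.Append(string (captures.[i])) |> ignore
            i <- i + 1
    sb.ToString()
```

Calling `interpolate "abc%P()def" [| box (1 + 1) |]` twice cracks the format string once; the second call reuses the cached steps.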
## Tooling
1. Extra complication for reporting locations of %d etc. in interpolated strings.
2. Extra complication for making sure we can take a correct continuation from tokenization.
$""" vwhvwerkhj vwekh wvekjh vwe { <--- take continuation at the end of each line
Test cases related to tokenization:
$""" vwhvwerkhj vwekh wvekjh vwe {
#if GOO