@duglin
Last active August 29, 2015 14:08
Docker Parser V2 Proposal
Summary:
- Parse each line of the Dockerfile the way people expect sh to parse it
See the excerpt from the BASH man page below; in short, quotes, spaces, and escapes are taken into account
- Once parsed, the array of words (argv[]) will be passed to the command named on that line
Similar to how the client does it, each command can then use an arg parser (e.g. pkg/flag) to deal
with options if it wants to.
- This means that we may see Dockerfile commands that look like:
COPY -R <src> <dest>
This gives us much more flexibility in the future to add semantics to existing
commands through flags
- Presence of the command USE (instead of FROM) will be the indicator that we're using the new parser/format
- In general we should use this opportunity to "fix" certain commands that don't act the way people
expect. For example COPY/ADD should be redone, and perhaps consider changing some command names,
for example, use EXPORT (or SET) instead of ENV. But these are not critical to the proposal, just things
to consider.
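To make the flag idea concrete, here is a minimal sketch in Python, chosen only for illustration (Docker itself is written in Go and would use something like pkg/flag). It takes the `COPY -R <src> <dest>` line from above, splits it sh-style with the standard library's `shlex.split` as a stand-in for the proposed parser, and hands the resulting words to a command handler. The handler name `parse_copy` and the "recursive" meaning given to `-R` are assumptions made up for this example.

```python
import argparse
import shlex


def parse_copy(argv):
    """Parse the words that follow COPY, as a command handler might."""
    parser = argparse.ArgumentParser(prog="COPY", add_help=False)
    # -R is the flag from the proposal's example; treating it as
    # "recursive" is an assumption for illustration only.
    parser.add_argument("-R", dest="recursive", action="store_true")
    parser.add_argument("src")
    parser.add_argument("dest")
    return parser.parse_args(argv)


line = 'COPY -R "my files" /app'
words = shlex.split(line)      # quotes keep "my files" together as one word
opts = parse_copy(words[1:])   # words[0] is the command name
print(opts.recursive, opts.src, opts.dest)  # True my files /app
```

The point of the sketch is the division of labour: the parser only produces words, and everything about flags lives in the command.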
Details:
- All Dockerfile commands are parsed similarly to how shell parsers work
- Lines are broken up into words, where whitespace is the delimiter - see below
- Whitespace between words is not significant and will be ignored
- Quotes (" and ') as well as \ can be used to escape whitespace as a delimiter
- Double-quoted (") words will have environment variable processing done on them
- Single-quoted (') words will not have environment variable processing done on them
- Once broken into words, the array of words will be passed on to the appropriate command processor (a la argv[])
- There will be no command-specific parsing done by the parser - commands are free to do whatever they need
- Environment variables will be identified by $xxx
- The set of valid variable names will be the same as the shell: [a-zA-Z_][a-zA-Z0-9_]*
- In general people should be prepared for other bash-like features showing up (like "here docs") or support for
${}, which means they should avoid using a syntax that (while valid today) might give different results with a
full bash parser
- The trigger to know we're using this parser instead of the current parser is the use of a new keyword instead
of "FROM":
- Possible replacements: USE, BASE, BASEIMAGE, EXTEND, IMAGE - I like "USE"
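The splitting and quoting rules above can be sketched as a small tokenizer. This is a rough Python illustration, not the proposed implementation. The names `split_words` and `expand` are hypothetical. It assumes that unquoted text gets variable expansion the same way sh does it (the proposal only spells out the quoted cases), that unset variables expand to the empty string, and that backslash escapes inside double quotes are not handled.

```python
import re

# Variable names as given in the proposal: [a-zA-Z_][a-zA-Z0-9_]*
VAR = re.compile(r"\$([a-zA-Z_][a-zA-Z0-9_]*)")


def expand(text, env):
    """Replace $name with its value; unset names become "" (assumption)."""
    return VAR.sub(lambda m: env.get(m.group(1), ""), text)


def split_words(line, env):
    """Split one Dockerfile line into an argv-style list of words."""
    words, cur, in_word = [], [], False
    i, n = 0, len(line)
    while i < n:
        c = line[i]
        if c in " \t":
            # Unquoted whitespace ends the current word, if there is one.
            if in_word:
                words.append("".join(cur))
                cur, in_word = [], False
            i += 1
        elif c == "\\":
            # Backslash makes the next character literal; a trailing
            # backslash is kept as-is.
            cur.append(line[i + 1] if i + 1 < n else "\\")
            in_word = True
            i += 2
        elif c in "\"'":
            end = line.find(c, i + 1)
            if end == -1:
                raise ValueError("unterminated quote")
            body = line[i + 1:end]
            # Only double-quoted text gets variable processing.
            cur.append(expand(body, env) if c == '"' else body)
            in_word = True
            i = end + 1
        else:
            # Unquoted run: expanded like sh would (assumption).
            j = i
            while j < n and line[j] not in " \t\\\"'":
                j += 1
            cur.append(expand(line[i:j], env))
            in_word = True
            i = j
    if in_word:
        words.append("".join(cur))
    return words


env = {"HOME": "/root"}
print(split_words(r"""RUN echo "hi $HOME" '$HOME' a\ b""", env))
# ['RUN', 'echo', 'hi /root', '$HOME', 'a b']
```

The first word of the result would select the command processor and the rest would be its argv[]. An explicit `""` survives as an empty word, which matches the man page text quoted below.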
From the BASH man page - we'll use this as our guide:
Word Splitting
The shell scans the results of parameter expansion, command substitution,
and arithmetic expansion that did not occur within double quotes
for word splitting.
The shell treats each character of IFS as a delimiter, and splits the
results of the other expansions into words on these characters. If IFS
is unset, or its value is exactly <space><tab><newline>, the default,
then any sequence of IFS characters serves to delimit words. If IFS
has a value other than the default, then sequences of the whitespace
characters space and tab are ignored at the beginning and end of the
word, as long as the whitespace character is in the value of IFS (an
IFS whitespace character). Any character in IFS that is not IFS whitespace,
along with any adjacent IFS whitespace characters, delimits a
field. A sequence of IFS whitespace characters is also treated as a
delimiter. If the value of IFS is null, no word splitting occurs.
Explicit null arguments ("" or '') are retained. Unquoted implicit
null arguments, resulting from the expansion of parameters that have no
values, are removed. If a parameter with no value is expanded within
double quotes, a null argument results and is retained.
Note that if no expansion occurs, no splitting is performed.
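As a reading aid for the excerpt, here is a hedged Python sketch of the field-splitting rule it describes, applied to the text produced by an unquoted expansion. The function name `split_fields` is hypothetical. Dropping a final empty field after a trailing delimiter is an assumption about shell behaviour that the excerpt does not state.

```python
import re

IFS_DEFAULT = " \t\n"


def split_fields(text, ifs=IFS_DEFAULT):
    """Split the result of an unquoted expansion into fields using IFS."""
    if ifs == "":
        # Null IFS: no word splitting occurs.
        return [text] if text else []
    ws = "".join(c for c in ifs if c in " \t\n")         # IFS whitespace
    other = "".join(c for c in ifs if c not in " \t\n")  # other IFS chars
    if ws:
        # IFS whitespace at the beginning and end is ignored.
        text = text.strip(ws)
    ws_cls = "[" + re.escape(ws) + "]" if ws else ""
    parts = []
    if other:
        # A non-whitespace IFS char plus adjacent IFS whitespace is one delimiter.
        pad = ws_cls + "*" if ws else ""
        parts.append(pad + "[" + re.escape(other) + "]" + pad)
    if ws:
        # A sequence of IFS whitespace is also one delimiter.
        parts.append(ws_cls + "+")
    fields = re.split("|".join(parts), text)
    if fields and fields[-1] == "":
        # Assumption: a trailing delimiter does not start a new empty field.
        fields.pop()
    return fields


print(split_fields("  a   b\tc\n"))    # ['a', 'b', 'c']
print(split_fields("a:b::c", ":"))     # ['a', 'b', '', 'c']
print(split_fields("a : b", " :"))     # ['a', 'b']
print(split_fields(""))                # []
```

The last call shows why an unquoted parameter with no value disappears from argv[]: splitting the empty string yields no fields at all, whereas a quoted `""` is never split and so is retained.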