@Jarvix
Created April 14, 2012 17:41
RFC X____ J. Kuijpers, Ed.
Jarvix
M. Beermann, Ed.
April 17, 2012
0xSCA: Standards Committee Assembly
Abstract
This document describes an assembly and preprocessor syntax suitable
for the DCPU-16 environment. This syntax is called the 0xSCA, or
Standards Committee Assembly.
This is not a standard.
Kuijpers & Beermann [Page 1]
Assembly Syntactics April 2012
Table of Contents
1. Introduction
1.1. Requirements Language
2. Document Markup
2.1. Filename
2.2. Lines
2.3. Indentation and whitespace
3. Preprocessor Markup
3.1. Comments
3.2. Prefix
3.3. Case insensitivity
3.4. Directives
3.4.1. Inclusion
3.4.1.1. Code
3.4.1.2. Binary
3.4.2. Definitions
3.4.3. Data insertion
3.4.4. Origin relocation
3.4.5. Macros: macro block and macro insertion
3.4.6. Repeat block
3.4.7. Conditionals
3.4.8. Error reporting
3.4.9. Alignment
3.5. Preprocessor inline arithmetic
4. Tokenizer Markup
4.1. Labels
4.2. Case sensitivity
4.3. Inline character literals
5. Conformance
5.1. Recognition of conformance
6. Security Considerations
7. Normative References
Authors' Addresses
1. Introduction
TODO
1.1. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
2. Document Markup
2.1. Filename
Assembly files on the DCPU-16 platform SHOULD have a filename suffix
of either '.dasm' or '.dasm16'. These suffixes were first used by
GitHub [GHBP] to identify DCPU-16 assembly files.
2.2. Lines
An empty line MUST be ignored by the assembler. A line MUST NOT
contain more than one instruction. A line MAY both define a label
and contain an instruction, in this order.
2.3. Indentation and whitespace
Whitespace MUST be allowed between all elements of a line, including
but not limited to opcodes, values, syntactic characters and
preprocessor directives. Both a space (' ' U+0020) and a tab
(U+0009) are considered whitespace characters.
Indenting instructions is RECOMMENDED. Labels and preprocessor
directives SHOULD NOT be indented. The assembler MUST NOT require
indentation to assemble successfully.
3. Preprocessor Markup
3.1. Comments
Comments are used to add information to the code, making it more
readable and understandable. Comments can consist of any characters
in any combination. This document specifies one-line comments only.
Any characters following a semicolon (; U+003B) on the same line are
a comment and MUST be ignored, except when the semicolon resides
within the representation of a string. In that case, the semicolon
MUST NOT be treated as the start of a comment.
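The comment rule can be sketched in Python as follows. This is only an
illustration, not part of the syntax: the function name strip_comment
is hypothetical, and the sketch assumes that strings are delimited by
double quotes and that a backslash escapes the next character, neither
of which this document specifies.

```python
def strip_comment(line: str) -> str:
    """Return `line` without its trailing comment.

    Assumptions (not stated in the document): strings are delimited by
    double quotes, and a backslash inside a string escapes the next
    character.
    """
    in_string = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_string:
            if ch == "\\":
                i += 1  # skip the escaped character
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == ";":
            # First semicolon outside a string starts the comment.
            return line[:i].rstrip()
        i += 1
    return line.rstrip()
```

For instance, strip_comment('.ascii "a;b" ; note') keeps the semicolon
inside the string and drops only the trailing comment.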
3.2. Prefix
Every preprocessor directive starts with a prefix character. This
prefix is used to distinguish preprocessor directives from other
code.
Preprocessor directives MUST start with a dot (. U+002E) or a number
sign (# U+0023); both are permitted for historical reasons.
Documents SHOULD NOT mix the two prefixes, and assemblers SHOULD NOT
accept a mix of them in a single document.
Using a dot is RECOMMENDED to distinguish directives from C
preprocessor syntax.
3.3. Case insensitivity
Assemblers MUST accept directives, definitions and constants without
regard to case.
3.4. Directives
All directives in this section MUST be handled in the order in which
they appear and with respect to their position in the document. To
avoid ambiguity, a dot (.) is used here to describe preprocessor
directives.
3.4.1. Inclusion
3.4.1.1. Code
.include "file"
.include <file>
The former directive MUST include the named file into the current
file. The path is relative to the current file. If the given file
does not exist, compilation MUST be aborted.
The latter includes the file from an implementation-defined location.
Such a file need not exist on disk; the name may instead trigger
implementation-specific behaviour, e.g. inclusion of intrinsics.
3.4.1.2. Binary
.incbin "file"
.incbin <file>
incbin MUST include the specified binary as raw, unprocessed data.
In the former form, the path to the file is relative to the current
file. All labels following this directive MUST be offset by the size
of the file.
The latter form of incbin MUST include the file from an
implementation-defined location.
3.4.2. Definitions
.def name [value]
.undef name
def MUST assign the constant value to name. If the value is omitted,
the literal 1 (one) MUST be assumed.
undef MUST remove the given symbol from the namespace. If the given
symbol does not exist, compilation SHOULD continue and a warning MAY
be emitted.
3.4.3. Data insertion
.word value [,value...]
.byte value [,value...]
.ascii "string"
word MUST store the values literally and unpacked at the location of
the directive.
byte MUST pack (i.e. two bytes per word, first byte is LSB) the
values at the location of the directive.
ascii MUST store the string unpacked (i.e. character is LSB, one word
per character) at the location of the directive.
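The three layouts can be illustrated with a short Python sketch. The
function names are hypothetical. Two behaviours are assumptions that
this document does not specify: out-of-range values are truncated by
masking, and an odd trailing byte leaves the high half of its word
zero.

```python
def emit_word(values):
    """.word: one 16-bit word per value, stored unpacked.

    Assumption: values wider than 16 bits are truncated by masking.
    """
    return [v & 0xFFFF for v in values]


def emit_byte(values):
    """.byte: two bytes per word, first byte in the low (LSB) half.

    Assumption: an odd trailing byte leaves the high half zero.
    """
    words = []
    for i in range(0, len(values), 2):
        low = values[i] & 0xFF
        high = values[i + 1] & 0xFF if i + 1 < len(values) else 0
        words.append((high << 8) | low)
    return words


def emit_ascii(text):
    """.ascii: one word per character, character in the low byte."""
    return [ord(ch) & 0xFF for ch in text]
```

With these definitions, emit_byte([0x12, 0x34, 0x56]) produces the
words 0x3412 and 0x0056, while emit_ascii("Hi") produces 0x0048 and
0x0069.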
3.4.4. Origin relocation
.org address
The org preprocessor directive MUST take an address as its only
argument. Assemblers SHOULD verify that the address fits in 16 bits.
The assembler MUST add this address to the address of all labels,
creating a relocation of the program.
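A minimal Python sketch of the relocation step, assuming labels have
already been collected as offsets from the start of the program. The
function name relocate_labels is hypothetical, and wrapping the sum to
16 bits is an assumption rather than a requirement of this document.

```python
def relocate_labels(labels, origin):
    """Add `origin` to every label offset.

    `labels` maps label names to word offsets from the program start.
    Assumption: resulting addresses wrap around at 16 bits.
    """
    if not 0 <= origin <= 0xFFFF:
        raise ValueError("origin does not fit in 16 bits: %r" % (origin,))
    return {name: (offset + origin) & 0xFFFF
            for name, offset in labels.items()}
```

For example, with an origin of 0x1000 a label at offset 4 resolves to
0x1004.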
3.4.5. Macros: macro block and macro insertion
.macro name([param [,param...]])
code
.end
.ins name([param [,param...]])
The macro directive defines a macro, a parameterized block of code
that can be inserted at any later point. Parameters, if any, are
written in parentheses, separated by commas (,).
The ins directive MUST insert a previously defined macro and replace
the parameters of the macro with the comma-separated arguments
following the name of the macro to insert.
Parameter substitutions can only be constant values and memory
references. Preprocessor directives inside the macro MUST be handled
upon insertion, not definition.
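Macro insertion can be sketched as plain textual substitution. The
Python below is an illustration only: expand_macro is a hypothetical
helper, parameter names are matched as whole words and case-sensitively
(an assumption), and directives inside the body are left untouched so
that they can be handled after insertion.

```python
import re


def expand_macro(params, body, args):
    """Return the macro body with each parameter replaced by its argument.

    `params` and `args` are lists of strings, `body` is a list of lines.
    Substitution is done in a single pass, so an argument that happens
    to contain a parameter name is not expanded again.
    """
    if len(params) != len(args):
        raise ValueError("macro expects %d argument(s), got %d"
                         % (len(params), len(args)))
    if not params:
        return list(body)
    mapping = dict(zip(params, args))
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(p) for p in params) + r")\b")
    return [pattern.sub(lambda m: mapping[m.group(1)], line)
            for line in body]
```

For a macro with parameters dst and val and the body "SET dst, val",
inserting it with the arguments A and [0x8000] yields
"SET A, [0x8000]".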
3.4.6. Repeat block
.rep times
code
.end
The code in the repeat-block MUST be repeated the number of times
specified. 'times' MUST be a positive integer. Preprocessor
directives inside the repeat-block MUST be handled when the
repetition is complete, to allow conditional repetitions.
3.4.7. Conditionals
.if expression
codeTrue
.else
codeElse
.end
isdef(definition)
For the definition of valid expressions, see Section 3.5.
The if clause is REQUIRED. The else clause is OPTIONAL.
If expression consists of a single constant value, then the
comparison 'expression = 1' MUST be assumed.
If expression evaluates to 1, the codeTrue-block MUST be assembled.
In any other case the codeElse-block, if an else clause is specified,
MUST be assembled.
isdef(symbol) can be used in place of expression. isdef MUST evaluate
to 1 if the given symbol is currently defined; otherwise it MUST
evaluate to 0.
Nesting of if directives MUST be supported.
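Nested conditionals can be handled with a stack of enclosing states.
The Python sketch below is one possible approach, not a reference
implementation. Its assumptions: only the dot prefix is used, every
.end in the input closes an .if (no macro or repeat blocks), the input
is well formed, and `evaluate` is a hypothetical callback that reduces
an expression to an integer (the default accepts decimal literals
only).

```python
import re

ISDEF = re.compile(r"^isdef\(\s*(\w+)\s*\)$", re.IGNORECASE)


def select_lines(lines, defined, evaluate=int):
    """Keep only the lines whose enclosing .if/.else branches are active.

    `defined` is a set of lower-cased symbol names, since definitions
    are matched without regard to case.
    """
    out = []
    stack = []  # one (parent_active, condition_true) pair per open .if
    active = True
    for line in lines:
        parts = line.strip().split(None, 1)
        word = parts[0].lower() if parts else ""
        if word == ".if":
            expr = parts[1].strip() if len(parts) > 1 else ""
            match = ISDEF.match(expr)
            if match:
                cond = match.group(1).lower() in defined
            else:
                # Skip evaluation inside an inactive branch.
                cond = active and evaluate(expr) == 1
            stack.append((active, cond))
            active = active and cond
        elif word == ".else":
            parent, cond = stack[-1]
            active = parent and not cond
        elif word == ".end":
            active, _ = stack.pop()
        elif active:
            out.append(line)
    return out
```

An inner .if inside an inactive outer branch never becomes active,
because each stack entry records whether its parent was active.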
3.4.8. Error reporting
.error message
Triggers an assembler error with the message, stopping execution of
the assembler. The message SHOULD be shown in combination with the
filename and line number.
3.4.9. Alignment
.align boundary
Aligns code or data on a doubleword or other boundary.
The assembler MUST add NOPs (0x0000) to the generated machine code
until the alignment is correct. The number of words inserted can be
calculated using the formula: '(boundary - (currentPosition %
boundary)) % boundary' (% indicates modulus). The outer modulus
yields zero words when currentPosition is already aligned.
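A minimal Python sketch of the padding calculation. The function name
is hypothetical. The result is reduced modulo the boundary so that a
position already on the boundary receives no padding words.

```python
def alignment_padding(current_position, boundary):
    """Number of 0x0000 words needed to reach the next multiple of
    `boundary`; zero when `current_position` is already aligned."""
    if boundary <= 0:
        raise ValueError("boundary must be a positive integer")
    return (boundary - current_position % boundary) % boundary
```

For example, at position 5 with a boundary of 4 the assembler would
insert 3 words, and at position 8 it would insert none.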
3.5. Preprocessor inline arithmetic
Source code can include inline arithmetic anywhere a constant value
is permitted. Inline arithmetic may only consist of + (addition), -
(subtraction), * (multiplication), / (integer division) and %
(modulus). Parentheses may be used to group expressions. The
evaluation order MUST be as follows: multiplication, division,
modulus, addition, subtraction.
The following comparison, bitwise and logical operators MUST also be
supported: = (equal, also ==), != (not equal, also <>), < (smaller
than), > (greater than), <= (smaller or equal), >= (greater or
equal), & (bitwise AND), ^ (bitwise XOR), | (bitwise OR), && (logical
AND), || (logical OR) and ^^ (logical XOR). They MUST be evaluated in
this order.
Inline arithmetic MUST be evaluated as soon as possible. The result
MUST be used as a literal value in place of the expression.
4. Tokenizer Markup
4.1. Labels
Labels MUST be single-word identifiers containing only alphabetical
characters (/[A-Za-z]/), digits (/[0-9]/) and underscores (_
U+005F). The label MUST represent the address of the following
instruction or data. A label MUST NOT start with a digit. A label
definition MUST end with a colon (: U+003A). When the label is used,
the tokenizer MUST translate the label into the address it
represents.
Local labels MUST start with a dot (. U+002E) and end with a colon
(: U+003A). A local label MUST be scoped to the region between its
surrounding global labels. Local labels in different scopes MUST be
able to have the same name.
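One way to implement local scoping is to qualify each local label with
the most recent global label. The Python sketch below illustrates the
idea. The names are hypothetical, and it assumes that the part of a
local label after the dot follows the same character rules as a global
label, which this document does not state.

```python
import re

GLOBAL_LABEL = re.compile(r"([A-Za-z_][A-Za-z0-9_]*):")
LOCAL_LABEL = re.compile(r"(\.[A-Za-z_][A-Za-z0-9_]*):")


def qualify_labels(tokens):
    """Map label definitions to unique, scope-qualified names.

    A local label is qualified with the most recent global label, so
    the same local name can be reused in different scopes.
    """
    scope = ""
    names = []
    for token in tokens:
        global_match = GLOBAL_LABEL.fullmatch(token)
        local_match = LOCAL_LABEL.fullmatch(token)
        if global_match:
            scope = global_match.group(1)
            names.append(scope)
        elif local_match:
            names.append(scope + local_match.group(1))
        else:
            raise ValueError("not a valid label definition: %r" % (token,))
    return names
```

Given the definitions main:, .loop:, draw: and .loop:, the two local
labels resolve to the distinct names main.loop and draw.loop.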
4.2. Case sensitivity
Assemblers MUST accept registers and opcodes without regard to case.
Assemblers MUST treat labels as case-sensitive.
4.3. Inline character literals
A character surrounded by apostrophes (' U+0027) MUST be interpreted
as its corresponding 7-bit ASCII value in a word (LSB). An assembler
MUST support at least the ASCII values ranging from 32 to 126
(printable characters).
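A small Python sketch of character literal handling. The function name
is hypothetical. Rejecting characters outside the range 32 to 126 is an
assumption: this document only requires that this range be supported
and does not say what happens to other characters.

```python
def char_literal(token):
    """Return the word value of a literal such as 'A'.

    Assumption: characters outside 32..126 are rejected.
    """
    if len(token) != 3 or token[0] != "'" or token[-1] != "'":
        raise ValueError("not a character literal: %r" % (token,))
    code = ord(token[1])
    if not 32 <= code <= 126:
        raise ValueError("unsupported character: %r" % (token[1],))
    return code  # 7-bit value in the low bits of the word
```

For example, the literal 'A' becomes the word 0x0041.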
5. Conformance
5.1. Recognition of conformance
An assembler, formatter or any other assembly-related program that
is fully compliant with 0xSCA MAY label itself "0xSCA compatible".
When using this label, the program SHOULD include a note of the
version of the RFC it is written against.
6. Security Considerations
This memo has no applicable security considerations.
7. Normative References
[GHBP] Marti, V., "Take Over The Galaxy with GitHub", April 2012,
<https://github.com/blog/1098-take-over-the-galaxy-with-github>.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
Authors' Addresses
Jos Kuijpers (editor)
Jarvix
Email: jos@kuijpersvof.nl
URI: http://www.jarvix.org/
Marian Beermann (editor)
Email: public@enkore.de
URI: http://www.enkore.de/
@aubreyrjones

There is already a lot of DCPU-16 assembly code with labels defined by starting with a colon. It is the "notch style" and has caught on tremendously quickly. It would probably be acceptable to accept labels either starting or ending with a colon, but not both.

In addition, most compilers output local labels with leading dot. This must be supported.

:global_name
:.local_name
  set pc, global_name

@Jarvix

Jarvix commented Apr 14, 2012

It is not hard to rewrite :label to label:, some assemblers could even build in a rewriter that does this for you.
Local labels are a good idea indeed. It would be .label: which is common in ASM languages.
Notch syntax is odd. Moreover, his example was just an example. Instead, everyone jumps on top of it and sees it as The Thing. With 0xSCA we want to fight against that because it is odd syntax.

I'll be adding local labels.

@aubreyrjones

"With 0xSCA we want to fight against that because it is odd syntax."

Hey, I'm not going to argue about the weirdness of the syntax; it's clearly abnormal.

But, I am concerned somewhat by that tone. If the entire community is using a particular syntax, and dozens of tools have already been built to use that syntax, then it seems quite presumptuous for you to say, "No, you're all doing it wrong!"

I'd just as soon the standard say "any token with a colon in it will be considered a label, stripped of the colon".
