@thar0x29a
Last active May 4, 2022 14:07
Document to show changes made in basm compared to bass.

From bass to basm

This document shows changes in basm compared to the original bass Table Assembler, written by Near.

basm combines the Table Assembly Engine from bass with a new frontend (internally called plek). It is not exactly an assembler, but something that I call a Platypus Compiler: something between assembly and BASIC.

The syntax itself mimics the old bass macro syntax wherever it could.

Goal

I started this project with the wish to build a foundation for further bass features that move it up the lane toward being more of a programming language and less of an assembler. The current macro framework of bass even struggles with spaces in 'bad places', and to be honest, I found no way of solving this other than switching from 'replace stuff' to 'parse stuff'. In other words: redo the whole frontend from scratch.

A big step that came with a lot of controversy about syntax, coding style and legacy in general.

But in the end there was absolutely no way around it. After all the times bass has changed its syntax, this is hopefully the last time I have to defend this kind of change. But let's start!

Lazy writings

I added tons of shorthands to indulge our users' laziness.

  • constant can also be const
  • variable can also be var
  • macro or function can also be fun - yes, no hamburgers.
  • architecture can also be arch

Most of these had been around already. We allow any of them.

Everything is a function

Bass had its internal reasons to mix commands and built-in functions. This mix has been removed. Everything is a function now.

print "Hello World\n"  // not working
print("Hello World\n") // working

Include

Include had several special rules in bass. This has been fixed. You can include your stuff wherever and whenever you want, even with constructed file names. Note that an include file will be loaded and parsed just once.
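The 'parsed just once' rule can be pictured with a small Python model. This is only a sketch under assumptions: the SOURCES dictionary stands in for the file system, and the include() helper and its bookkeeping are hypothetical names, not part of basm.

```python
# Hypothetical model of "load and parse each include file only once".
# SOURCES stands in for the file system; none of these names come from basm.
SOURCES = {
    "main.asm": 'include("defs.asm")\ninclude("defs.asm")\nprint("done")',
    "defs.asm": "const ANSWER = 42",
}

parsed_files = []   # order in which files were actually parsed
_seen = set()       # paths that have already been included


def include(path):
    """Parse `path` unless it was included before; return True if parsed."""
    if path in _seen:
        return False          # a repeated include of the same file is a no-op
    _seen.add(path)
    parsed_files.append(path)
    for line in SOURCES[path].splitlines():
        line = line.strip()
        # Follow nested includes; every other line is ignored in this model.
        if line.startswith('include("') and line.endswith('")'):
            include(line[len('include("'):-2])
    return True


include("main.asm")
print(parsed_files)
```

Even though main.asm includes defs.asm twice, the file shows up only once in parsed_files.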

Datatypes

Since I upgraded bass from a macro language to a script language, it also knows some datatypes. The type of a variable or constant is selected implicitly by the value that you store in it.

Built-in Datatypes

  • Integer - internally stored as int64
  • Float - internally stored as double
  • String - internally stored as ASCII string - for now.
  • function - stored as hashmap given by its name and signature
  • array - stored as hashmap given by the used key

Custom Datatypes

under construction

It's planned to allow custom datatypes that

  • have accessible attributes (like the name)
  • can be forced on function parameters
  • allow the overloading of syntax features like +-*/ or parentheses.
  • can be detected by helper functions

They can be used to raise given code from assembly to programming language level. For example, overloading + could simply put an add A, B line into the output.

Even though datatypes are just 'lookup-arrays', they have a huge impact on what's possible in basm and what is not.

Arrays

'Arrays' are hash maps. They can be created using the construction function var myarray = Array.new(1, ..). This syntax has been chosen to be as close as possible to the given array functions Array.size() and Array.sort(). On const arrays everything inside is constant, so it's not possible to add, change, remove or sort their content. Access is possible with the common myarray[index] syntax. Arrays do not have a fixed size. All array keys are treated as strings; other types will be converted.
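To make the key conversion and the const rule more tangible, here is a rough Python model. It is a sketch, not basm's implementation: the class name BasmArray is made up, and the assumption that constructor values get the keys "0", "1", ... is mine.

```python
class BasmArray:
    """Hypothetical model of a basm array: a hash map with string keys."""

    def __init__(self, *values, const=False):
        # Assumption: values passed to the constructor get the keys "0", "1", ...
        self._items = {str(i): v for i, v in enumerate(values)}
        self._const = const

    def __getitem__(self, key):
        return self._items[str(key)]      # 1 and "1" address the same slot

    def __setitem__(self, key, value):
        if self._const:
            raise TypeError("cannot modify a const array")
        self._items[str(key)] = value     # no fixed size: new keys grow the map

    def size(self):
        return len(self._items)


arr = BasmArray(10, 20)
arr["name"] = "basm"       # custom string key
arr[2.5] = "float key"     # stored under the string "2.5"
print(arr[1], arr["1"], arr["2.5"], arr.size())
```

Writing to an array created with const=True raises an error in this model, which mirrors the rule that nothing inside a const array can be added or changed.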

Nothing

There is an internal NULL state called nothing for unknown or unresolved references or values. You can check against it, but it is not meant to be used directly by the programmer.

Namespaces

The namespace feature remains nearly unchanged from the syntax in bass18:

namespace foo.baa {
  fun test() { print("Hello World\n") }
}

foo.baa.test() // works

However, bass always had a different namespace for different things (var, const, macros, namespaces, arrays, etc.). This is NOT the case anymore. Everything shares one namespace now.

The global keyword has been removed, though it might come back as a function.

If and Else

if(a==12) { }
else if(b==42) { }
else { }

Nothing special so far, except that the blocks { } are not optional. They can be empty, though.

Loops

While

var i = 0
while(i<50) {
  i = i+1

  if(i==5) { continue }
  else if(i==18) { break }
  
  print(i)
}

While loops have not changed since bass. continue and break are always handled by the enclosing while loop, not by the local scope of the if statement they appear in.

For (each)

Since we allow custom array keys, there was a need to iterate over arrays item by item. The syntax of for loops differs from the syntax of while loops, but serves the same purpose.

for(var i : Array.new("a", "b", "c", 1, 2, 3, 4, 5)) {
  if(i=="b") { continue }
  if(i==3) { break }
  print("Hello ", i)
}

It is recommended not to manipulate something that is being iterated.

Macros / Functions

There are no macros. There is only Zuul.

Macros have been removed and replaced by functions. While using them, foo(), is nearly identical to before, they are not just copy and paste anymore. Each function

  • gets invoked
  • has its own scope
  • will crawl up the scope tree whenever a symbol cannot be resolved
  • has a return value

var hello = "Hello "
fun greet(name) {
  return hello + name + "\n"
}

print(greet("World"))  // prints 'Hello World\n'

In other words: it should behave like you would expect from a function. Ah, and yes, overloading is still possible.
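The 'crawl up the scope tree' behaviour can be modelled in a few lines of Python. This is a sketch only: the Scope class and its lookup method are hypothetical names, and raising NameError for an unresolved symbol is my simplification.

```python
class Scope:
    """Hypothetical model of one scope in the scope tree."""

    def __init__(self, parent=None):
        self.symbols = {}
        self.parent = parent

    def lookup(self, name):
        # Start in the local scope and walk up until the symbol is found.
        scope = self
        while scope is not None:
            if name in scope.symbols:
                return scope.symbols[name]
            scope = scope.parent
        raise NameError(f"unresolved symbol: {name}")


# Mirrors the greet() example: 'hello' lives in the global scope,
# 'name' only exists inside the scope created for the call.
global_scope = Scope()
global_scope.symbols["hello"] = "Hello "

call_scope = Scope(parent=global_scope)
call_scope.symbols["name"] = "World"

result = call_scope.lookup("hello") + call_scope.lookup("name") + "\n"
print(repr(result))
```

The lookup of hello misses in the call scope and is answered by its parent, which is why greet() can read the global variable without receiving it as a parameter.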

Note: macro as keyword is deprecated and might be removed soon.

Function parameters

Function parameters can be declared in various ways.

fun name(<var|const|ref> <custom_type> argument1, ...) { }

Let's go through this one by one.

Functions can share the same name, as long as they don't share the same number of parameters. This is called function overloading. The compiler will select the right one by the number of arguments. For this reason it's not possible to declare functions that accept a dynamic number of parameters. Just use arrays instead.
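Selecting an overload by argument count boils down to a lookup keyed by name and number of parameters. The following Python sketch illustrates the idea; define() and call() are hypothetical helpers, not basm API.

```python
import inspect

functions = {}   # (name, number of parameters) -> callable


def define(name, func):
    """Register func under its name and parameter count."""
    argc = len(inspect.signature(func).parameters)
    functions[(name, argc)] = func


def call(name, *args):
    """Pick the overload whose parameter count matches the call."""
    key = (name, len(args))
    if key not in functions:
        raise LookupError(f"no function {name} with {len(args)} argument(s)")
    return functions[key](*args)


define("greet", lambda: "Hello\n")
define("greet", lambda name: "Hello " + name + "\n")

print(repr(call("greet")))            # zero-argument overload
print(repr(call("greet", "World")))   # one-argument overload
```

Because the argument count is part of the key, a function with a dynamic number of parameters would have no single slot to live in, which matches the restriction described above.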

Each argument can be declared as var-iable, const-ant or ref-erence. The default is var. While the first two just tell whether you can write to the created variable, the last one serves an additional purpose: the content of this argument will not be the passed value, but the name of the used literal as a string.

fun name(ref a) { 
  // a == "foo"
}
name(foo)

Functions can force a custom_type. This was integrated for use in operator-overloading scenarios. It can also be used for code hardening in general, but please note that basm is not a typed language but a script-ish one.

Finally we have the name of the argument.

This

under construction

The 'this' keyword is planned to be used inside of functions and for loops to access the parent data layer.

Assembly

Assembly code can still 'just' be mixed into the script language. Therefore it is critical for the compiler to understand what's assembly and what is not.

Assembly code may use a lot of stuff that looks like script syntax at first glance. Some commands include dots, others a wide range of braces. To avoid strange errors it is good to know how basm notices 'hey, this is assembly and not macro syntax'. It follows these rules:

  • It is the beginning of a new line, or a new statement
  • This is no declaration, no label, no assignment, no namespace, and also not a command or anything else that 'starts' a line.
  • Then it must be assembly!

To avoid conflicts within the assembly line, all parameters have to be enclosed in braces {}. Yes. All of them. I'm well aware that many assembly users are going to hate this, but some trade-offs had to be made at some point, and this is one of them.

lda.b {MY_CONST_VALUE}
lda.b {getConfig()}
lda.b {100+12}

Directives

Directives insert binary data directly into the target file. db stores 8-bit values, dw stores 16-bit values, dl stores 24-bit values, dd stores 32-bit values and dq stores 64-bit values by default.

The length of directives can be changed by architecture setups. They might also register some additional keywords.

All built-in datatypes can be stored directly with directives, including strings and arrays.

db   3, 2, 1
db   "ABCD Hello World"
db   {Array.new(1,2,3,4,5)}
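The default widths can be illustrated with a small Python model of what ends up in the target file. Treat it as a sketch under assumptions: the emit() helper is hypothetical, little-endian byte order is my assumption (the text does not state it), strings are written one ASCII character per item, and floats are left out.

```python
# Default directive widths in bytes, as listed above.
WIDTHS = {"db": 1, "dw": 2, "dl": 3, "dd": 4, "dq": 8}


def emit(directive, *values, byteorder="little"):
    """Return the bytes a directive would write (byte order is an assumption)."""
    width = WIDTHS[directive]
    mask = (1 << (8 * width)) - 1
    out = bytearray()
    for value in values:
        if isinstance(value, str):
            items = list(value.encode("ascii"))   # one item per character
        elif isinstance(value, (list, tuple)):
            items = list(value)                   # array: one item per element
        else:
            items = [value]
        for item in items:
            # Mask the value so it always fits the directive's width.
            out += (item & mask).to_bytes(width, byteorder)
    return bytes(out)


print(emit("db", 3, 2, 1).hex())
print(emit("db", "ABCD").hex())
print(emit("dw", 0x1234).hex())
print(emit("db", [1, 2, 3, 4, 5]).hex())
```

An architecture setup that changes a directive's length would correspond to a different entry in the WIDTHS table.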

Lookahead

basm still uses a top-down parser, but no longer takes a multi-pass approach. At first sight this was no biggie, since only 'Lookahead' depended on that system. The current feature works a bit differently from before.

When assembly statements or directives face an unknown symbol, it will be marked as missing and pre-filled with the current program counter value. Yes, that's right: with the pc(). If the pc does not have the right bit size for your purposes, this might lead to trouble.

After finishing the whole compile process, basm will try to resolve the missing symbols and rerun the assembly of each affected instruction. Or fail with an error.

So compared to bass it is not a full multi-pass approach, but an on-point solving attempt.
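The placeholder-and-patch idea can be sketched in Python. Everything here is a toy under assumptions: the statement format, the three-byte jmp encoding with opcode 0x4C and the MissingSymbol error are made up for illustration and do not describe basm's internals.

```python
class MissingSymbol(Exception):
    """Raised when a symbol is still unknown after the whole run."""


def encode_jmp(target):
    # Hypothetical encoding: one opcode byte plus a 16-bit little-endian address.
    return bytes([0x4C]) + target.to_bytes(2, "little")


def assemble(statements):
    """statements is a list of ("label", name) or ("jmp", name) tuples."""
    output = bytearray()
    labels = {}
    fixups = []                               # (offset, symbol) pairs to patch later
    for kind, name in statements:
        pc = len(output)
        if kind == "label":
            labels[name] = pc
        elif kind == "jmp":
            target = labels.get(name)
            if target is None:
                fixups.append((pc, name))     # mark the symbol as missing
                target = pc                   # pre-fill with the current pc
            output += encode_jmp(target)
    # After the single pass: re-assemble only the instructions that were marked.
    for offset, name in fixups:
        if name not in labels:
            raise MissingSymbol(name)
        output[offset:offset + 3] = encode_jmp(labels[name])
    return bytes(output)


program = [("jmp", "end"), ("label", "start"), ("jmp", "start"), ("label", "end")]
print(assemble(program).hex())
```

Only the first jmp is patched afterwards, since start is already known when the second jmp is assembled. The sketch also shows why the pc's bit size matters: the placeholder has to encode to the same length as the final value, otherwise every later offset would shift.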

Evaluations

Evaluations between braces { } have the purpose of allowing the construction of an identifier that would not be possible otherwise.

Evaluations can be understood as 'special syntax': less a Swiss army knife than just a filler that was 'available'. Their effect depends strongly on where you use them.

Left-Side Evaluations

Allows you to construct an identifier.

const OFFSET = 1
var {"constructed"+"name"+(OFFSET+2)} = "Hello World"
// equals
var constructedname3 = "Hello World"

Right-Side Assembly Evaluations

As already shown in the assembly section:

  var x = 5
  var x2 = 32

  mov ax, {x*x}     // allows calculation
  mov ax, 25        // equals

  mov ax, {"x"+2}   // allows identifier construction
  mov ax, 32        // equals

Inside assembly lines it is impossible to use any script syntax or identifier unless you wrap it in an evaluation. If the result of the evaluation is the name of an existing identifier, that identifier's value is returned instead.
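Both kinds of evaluation can be mimicked with a plain symbol table in Python. This is a sketch under assumptions: symbols and evaluate() are hypothetical names, and Python needs an explicit str() where basm concatenates a string and a number directly.

```python
symbols = {}   # one shared namespace: identifier -> value

# Left-side evaluation: the identifier itself is computed.
OFFSET = 1
symbols["constructed" + "name" + str(OFFSET + 2)] = "Hello World"


def evaluate(result, symbols):
    """Model of a right-side evaluation inside an assembly line."""
    # If the result names an existing identifier, hand back that value instead.
    if isinstance(result, str) and result in symbols:
        return symbols[result]
    return result


symbols["x"] = 5
symbols["x2"] = 32

print(symbols["constructedname3"])
print(evaluate(symbols["x"] * symbols["x"], symbols))   # plain calculation
print(evaluate("x" + str(2), symbols))                  # constructed identifier x2
```

The last call builds the string x2, finds it in the symbol table and returns 32, which corresponds to the mov ax, {"x"+2} example.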
