lunacookies/syntax_ideas.md

## syntax_ideas.md

      
What is this?

I’m working on a new language (Fjord) for a shell (fj). Although I have some ideas of my own for syntax, I’m not sure if they’re a really bad idea or if they’re fine, so I’ve decided to conduct a ‘sanity check’ of sorts by writing some preliminary ideas down here. Please respond down in the comments with any thoughts you have!
General philosophy

Note: throughout this document I’ll refer to functions, which are what I’m calling commands.
As this is a language for a shell,

brevity is extremely highly valued (the more common something is, the easier it should be to type)
function calls are more common than anything else
string interpolation is also pretty common

Grouping

Expressions can be grouped so that they are evaluated first by wrapping them in parentheses:
(1 + 2) * 5

Function calls

Inspired by languages like Haskell and ML, I think that function calls should use simple juxtaposition:
add 1 2

This greatly reduces typing, which is important, as function calls are the most common operation in a shell. Imagine instead of typing ls /path/to/dir1 /path/to/dir2 you had to use the traditional syntax to call a function, and had to type ls(/path/to/dir1, /path/to/dir2)!
Variable usages

Those same languages from before, Haskell and ML, don’t differentiate syntactically between a variable and a function call without parameters. I would like to avoid this for a number of reasons:

it makes implementation a bit more complex
syntax highlighting that differentiates between functions and variables becomes extremely complex, if not impossible, to do accurately
using a variable has a much lower potential cost than calling a function, so if the two look different it is easier to spot the parts of a program that might be slow

Since functions are used more often than variables in a shell, I decided that variables, rather than functions, should be the ones that need extra syntax beyond just writing their name.
The obvious choice is to prefix variable names with $, as this is used by all kinds of languages for one purpose or another: PHP, Perl, every shell I’ve ever seen, Swift (state properties), Rust (macro_rules!), the list goes on. However, after seeing that the Rust crate quote uses # to interpolate variables, I realised:

# is probably a better choice than $ since it’s easier to type
I can use any syntax I want – the choice isn’t so obvious

After some experimentation, I think that prefixing variable names with . is the ‘best’ choice, without looking too out of place. What do you think?
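To make the distinction concrete, here is a minimal sketch (in Python, not Fjord) of how a reader or syntax highlighter could tell variables from function calls once variables carry a . prefix. The Token type, the classify function and the rule that the first bare word is the function name are all assumptions made for illustration; nothing here is Fjord’s actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Token:
    kind: str  # "call", "variable" or "literal"
    text: str


def classify(line):
    """Classify the whitespace-separated words of a hypothetical Fjord line."""
    tokens = []
    for index, word in enumerate(line.split()):
        # Assumption: a variable is '.' followed by a letter or underscore,
        # so a path such as ./script is not mistaken for a variable.
        is_variable = (
            len(word) > 1
            and word[0] == "."
            and (word[1].isalpha() or word[1] == "_")
        )
        if is_variable:
            tokens.append(Token("variable", word[1:]))
        elif index == 0:
            # Assumption: the first bare word on a line names the function.
            tokens.append(Token("call", word))
        else:
            tokens.append(Token("literal", word))
    return tokens


for token in classify("echo .name 5"):
    print(token.kind, token.text)
```

One thing this sketch makes visible: under this rule a bare word like .bashrc would be read as a variable rather than a file name, so the . prefix would need some way of handling dotfiles.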
Case conventions

How to separate words in different case conventions:

snake_case: hold shift, press the hyphen/underscore key, let go of shift before you start typing the next word
kebab-case: press the hyphen/underscore key
camelCase: hold shift, type the first character of the next word, let go before you type the next character

The final shell will hopefully have tab-completion that supports case-insensitivity, so here is what that list would look like if case isn’t a consideration:

snake_case: hold shift, press the hyphen/underscore key, let go of shift before you start typing the next word
kebab-case: press the hyphen/underscore key
camelCase: nothing

Of course, the tab completion could also intelligently convert hyphens to underscores and vice-versa:

snake_case: press the hyphen/underscore key
kebab-case: press the hyphen/underscore key
camelCase: nothing

Camel case is still the easiest choice to type, so that’s what I’ve decided on. What’s your opinion?
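The completion behaviour described above could be sketched roughly like this in Python. The normalize and complete functions and the list of command names are hypothetical; the sketch simply assumes that completion compares names after dropping case, hyphens and underscores.

```python
def normalize(name):
    """Drop case and word separators so every convention compares equal."""
    return name.replace("-", "").replace("_", "").lower()


def complete(typed, candidates):
    """Return the candidates whose normalized form starts with what was typed."""
    key = normalize(typed)
    return [c for c in candidates if normalize(c).startswith(key)]


# Hypothetical command names, written in camel case.
commands = ["getUserName", "getUserId", "listFiles"]

print(complete("getusern", commands))  # ['getUserName']
print(complete("get-user", commands))  # ['getUserName', 'getUserId']
```

With a completer like this, the user never has to press shift at all, whichever convention the command names themselves are written in.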
Variable definitions

This one is pretty obvious to me:
let name = value

But I guess the equals sign is implied …
let name value

Do you even need let?
name value

Now this looks exactly like a function call. Maybe it’s better with just the equals sign?
name = value

This is the most concise choice, and is also familiar to users of Haskell, Python and Ruby (and probably others). Or is there a better option I haven’t considered?
Function definitions

Most languages have a separate syntax for defining functions and variables:
// JavaScript

function name(param) {
    body
}

var variable = value; // ‘var’ could also be ‘let’, or ‘const’ to declare a constant
// Rust

fn name(param: SomeType) -> AnotherType {
    body
}

let variable = value;
Lots of modern languages have support for lambdas, closures, function literals, anonymous functions, whatever you call them. This leads to a duplication of the ways to define a function:
// Rust

// using normal function declaration syntax
fn name(param: SomeType) -> AnotherType {
    body
}

// using a closure
let name = |param| {
    body
};
// JavaScript

function name(param) {
    body
}

// using an arrow function
let name = (param) => {
    body
};
# Python

def name(param):
    body

# using a lambda
name = lambda param : body
The ‘anonymous function form’ in each of these examples is usually limited in some way compared to a normal function, which stops you from using it for everything. Even so, I still found this duplication a little annoying.
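As one concrete case of such a limitation, a Python lambda may only contain a single expression, while a def body may contain statements. The function names below are made up for illustration.

```python
def describe(n):
    # A def body can contain statements: if/else, assignment, return.
    if n < 0:
        label = "negative"
    else:
        label = "non-negative"
    return f"{n} is {label}"


# A lambda body must be one expression, so the branch has to be
# rewritten as a conditional expression.
describe_lambda = lambda n: f"{n} is {'negative' if n < 0 else 'non-negative'}"

print(describe(-1))         # -1 is negative
print(describe_lambda(-1))  # -1 is negative
```

Both forms define the same function here, but anything that needs a loop, a try block or several statements can only be written with def.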
Haskell uses a very similar syntax for function and variable definitions:
varName = value
functionName param = body
Haskell, however, has the same duplication as before, with a lambda syntax of its own:
name param = body

-- using an anonymous function
name = (\param -> body)
I really like the look and brevity of Haskell’s normal function definitions, but want to avoid the ‘duplication’ with anonymous functions. Initially something like this springs to mind:
name = param {
    body
}

But I plan on adding support for block expressions, which would make the syntax above ambiguous: it could also be read as a call to a function named param, with a single argument whose value is whatever body evaluates to.
So maybe something like this?
name = |param| {
    body
}

It is easier to type almost any character other than |, though:
name = fn param {
    body
}

Maybe it shouldn’t be required that function definitions use a block expression though?
name = fn param body

Some kind of separation would be nice:
name = fn param -> body

Although the arrow looks very nice, it would be easier to type if : is used instead:
name = fn param : body

Now we don’t need the fn any more:
name = param : body

Here’s what this syntax looks like with multiple parameters:
name = param1 param2 : body

Those block expressions can of course be used for the body of a function:
name = param1 param2 : {
    do
    lots
    of
    stuff
}

What do you think function definitions should look like? Do you think that the duplication of syntaxes for normal and anonymous functions is needed for some reason?
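To show how a single definition form could cover both variables and functions, here is a deliberately naive Python sketch that splits a one-line definition of the shape name = params : body. The parse_definition function is hypothetical, and it assumes neither = nor : appears inside the parameters or inside a string, which a real parser could not assume.

```python
def parse_definition(line):
    """Split 'name = params : body' into (name, params, body).

    A definition without a colon is treated as a plain variable,
    i.e. a definition with no parameters.
    """
    name, equals, rest = line.partition("=")
    if not equals:
        raise ValueError("expected '=' in definition")
    params, colon, body = rest.partition(":")
    if not colon:
        return name.strip(), [], rest.strip()
    return name.strip(), params.split(), body.strip()


print(parse_definition("add = a b : a + b"))  # ('add', ['a', 'b'], 'a + b')
print(parse_definition("answer = 42"))        # ('answer', [], '42')
```

In this reading a variable is just the zero-parameter case of the same syntax, which is the property the Haskell-style definitions above already have.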
Strings

Simple enough: use double quotes. I’ve decided against single quotes because too many strings contain single quotes themselves, which would all require escaping.
Personally, I don’t like how some languages give you the choice of quotes, because it leads to inconsistency.
String interpolation

I think string interpolation deserves a special syntax for the string itself, like Python does in its f-strings:
"Hello, Sarah!"   # literal
f"Hello, {name}!" # interpolation
I like how it only takes one extra character to create an f-string, so this is something I hope to copy for Fjord. Besides, string interpolation is one of the most common tasks in a shell. This is in contrast to Ruby, where strings that have interpolations don’t get any differentiation from literals:
"Hello, Sarah!"   # literal
"Hello, #{name}!" # interpolation
Swift has the same ‘problem’:
"Hello, Sarah!"   // literal
"Hello, \(name)!" // interpolation
Here is what string interpolations could look like for Fjord, using the variable usage syntax from before:
f"Hello, .name!"

But this doesn’t allow for arbitrary expressions like all the syntaxes above do, only variables, so some kind of delimiter around the interpolation is needed. After playing around with it for a bit, I came to the conclusion that the curly-brace syntax that Python uses is my favourite.
f"Hello, {.name}!"

This is three extra characters compared to a traditional shell, such as bash in this example:
"Hello, $name!"
This is a little misleading, however, as that syntax can’t be used for interpolating any arbitrary expression like Fjord’s can:
f"Hello, {getUserName}!"

The syntax to do the same thing in bash takes the same number of characters:
"Hello, $(getusername)!" # I wrote it in lowercase because camel case command names look wrong
Do you think that strings containing interpolations should get a different syntax to literals? Do you think that you should be able to interpolate any expression, or is being able to interpolate just variables enough? Are you a fan of Python’s f-string syntax that I nicked for Fjord?
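For a feel of what expanding the curly-brace form involves, here is a small Python sketch. It is only an assumption about how such a string could be evaluated: the interpolate function, the variables and functions tables, and the restriction to either a .variable or a zero-argument function call are all invented for this example.

```python
import re

# Matches one {…} placeholder that contains no nested braces.
PLACEHOLDER = re.compile(r"\{([^{}]*)\}")


def interpolate(template, variables, functions):
    """Expand {.name} from variables and {name} by calling a function."""

    def expand(match):
        expression = match.group(1).strip()
        if expression.startswith("."):
            # Variable usage: look the name up without the leading '.'.
            return str(variables[expression[1:]])
        # Otherwise treat the placeholder as a call with no arguments.
        return str(functions[expression]())

    return PLACEHOLDER.sub(expand, template)


print(interpolate("Hello, {.name}!", {"name": "Sarah"}, {}))
# Hello, Sarah!
print(interpolate("Hello, {getUserName}!", {}, {"getUserName": lambda: "sarah"}))
# Hello, sarah!
```

Supporting truly arbitrary expressions would mean handing the placeholder text to the language’s own expression parser instead of the two hard-coded cases here.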
Options and named function parameters

A convention has arisen over the decades for passing options to commands:
$ command -o # short option name
$ command --option # long option name
$ command --speed=25 # option with value
$ command --speed 25 # most commands support using a space instead of =
$ command --flag --speed 25 "positional arguments follow options"
This isn’t set in stone, so sometimes I’m caught off guard by a command that doesn’t completely follow the convention:
$ find . -name '*foo*' # for some reason find uses a single dash for option names
$ command --speed=25 --speed 25 # some commands accept only one of these forms
$ ls /path/to/directory --long # the GNU utilities support putting options after
                               # positional arguments, while most commands don’t
I’ve realised that the whole ‘options’ convention that has appeared over time bears a striking resemblance to the handling of function parameters in some languages. Apart from being able to pass an option without a value (this can be viewed as equivalent to setting it to true), options are exactly like named parameters which can have default values.
As my main programming language is Rust (which doesn’t have named or default parameters), I’m not really familiar with these concepts. Python’s approach to named parameters and default parameter values seems very reasonable, so maybe Fjord could imitate it:
downloadUrl = url timeout=5 httpVersion=1.1 : doTheThing

These are all equivalent:
downloadUrl "https://google.com"
downloadUrl url="https://google.com" 5
downloadUrl timeout=5 url="https://google.com"

I’m not so sure about that = without any space around it – it kind of irks me. Here’s what it looks like with Swift-/Ruby-style colons:
downloadUrl = url timeout: 5 httpVersion: 1.1 : doTheThing
downloadUrl "https://google.com"
downloadUrl url: "https://google.com" 5
downloadUrl timeout: 5 url: "https://google.com"

It looks a little strange to me without commas separating the arguments, so I think I prefer Python’s style for now.
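For comparison, this is roughly how the Python behaviour being imitated works. The download_url function is a stand-in written for this example (it just returns its arguments), and is_valid_call is a hypothetical helper that only checks whether Python accepts a call syntactically.

```python
def download_url(url, timeout=5, http_version="1.1"):
    # Stand-in body: return the arguments so the calls can be compared.
    return (url, timeout, http_version)


positional = download_url("https://google.com")
mixed = download_url("https://google.com", timeout=5)
named = download_url(timeout=5, url="https://google.com")
assert positional == mixed == named


def is_valid_call(source):
    """Report whether Python accepts the call expression syntactically."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# Python rejects a positional argument placed after a named one.
print(is_valid_call('download_url(url="https://google.com", 5)'))  # False
```

So the second of the three Fjord calls above, where a positional 5 follows a named url, goes beyond what Python allows; Fjord would need its own rule for which parameter such a trailing positional value fills.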
This still doesn’t take into account two features of the command option convention: short option names, and the fact that passing an option without a value is equivalent to setting it to true. If we ignore those two features, here’s what a call to ls that uses a few different parameters could look like:
ls all=true long=true color=never /path/to/dir1 /path/to/dir2

Here’s what that looks like in a traditional shell:
$ ls -al --color=never /path/to/dir1 /path/to/dir2

# With long option names
$ ls --all --long --color=never /path/to/dir1 /path/to/dir2
The traditional version is much, much cleaner. I’m not really sure how to integrate short named parameters and if-you-pass-a-named-parameter-without-a-value-it’s-a-boolean-true, so if you have any ideas, I’d appreciate them.
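One possible direction, sketched in Python purely as a thought experiment, is to keep the traditional spelling and translate it into named parameters behind the scenes. Everything here is an assumption: the parse_options function, the SHORT_NAMES table mapping short names to long ones, and the choice to treat a valueless option as true. The sketch does not handle the space-separated form (--speed 25) or a bare -- separator.

```python
# Hypothetical table that a function definition could supply for its
# parameters, mapping each short option name to the full parameter name.
SHORT_NAMES = {"a": "all", "l": "long"}


def parse_options(args):
    """Split arguments into named parameters and positional arguments."""
    named, positional = {}, []
    for arg in args:
        if arg.startswith("--"):
            key, equals, value = arg[2:].partition("=")
            # An option without '=value' is treated as boolean true.
            named[key] = value if equals else True
        elif arg.startswith("-") and len(arg) > 1:
            # Bundled short options such as -al expand to one flag each.
            for letter in arg[1:]:
                named[SHORT_NAMES[letter]] = True
        else:
            positional.append(arg)
    return named, positional


print(parse_options(["-al", "--color=never", "/path/to/dir1", "/path/to/dir2"]))
# ({'all': True, 'long': True, 'color': 'never'}, ['/path/to/dir1', '/path/to/dir2'])
```

Under this reading, ls -al --color=never and ls all=true long=true color=never would end up as the same call, with the short and valueless forms acting as shorthand for the named-parameter form.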