@lunacookies
Last active May 7, 2020 01:37
Some Syntax Ideas

What is this?

I’m working on a new language (Fjord) for a shell (fj). Although I have some ideas of my own for syntax, I’m not sure if they’re a really bad idea or if they’re fine, so I’ve decided to conduct a ‘sanity check’ of sorts by writing some preliminary ideas down here. Please respond down in the comments with any thoughts you have!

General philosophy

Note: throughout this document I’ll refer to functions, which are what I’m calling commands.

As this is a language for a shell,

  • brevity is extremely highly valued (the more common something is, the easier it should be to type)
  • function calls are more common than anything else
  • string interpolation is also pretty common

Grouping

Expressions can be grouped so that they are evaluated first by wrapping them in parentheses:

(1 + 2) * 5

Function calls

Inspired by languages like Haskell and ML, I think that function calls should use simple juxtaposition:

add 1 2

This greatly reduces typing, which is important, as function calls are the most common operation in a shell. Imagine instead of typing ls /path/to/dir1 /path/to/dir2 you had to use the traditional syntax to call a function, and had to type ls(/path/to/dir1, /path/to/dir2)!

Variable usages

Those same languages from before, Haskell and ML, don’t differentiate syntactically between a variable and a function call without parameters. I would like to avoid this for a number of reasons:

  • it makes implementation a bit more complex
  • syntax highlighting that differentiates between functions and variables becomes extremely complex, if not impossible, to do accurately
  • using a variable has a much lower potential cost than calling a function – it is easier to spot the parts of a program that might be slow if the two look different

Since functions are used more often than variables in the context of a shell, I decided that variables, rather than functions, should get extra syntax beyond just writing their name.

The obvious choice is to prefix variable names with $, as this is used by all kinds of languages for one purpose or another: PHP, Perl, every shell I’ve ever seen, Swift (state properties), Rust (macro_rules!), the list goes on. However, after seeing that the Rust crate quote uses # to interpolate variables, I realised:

  • # is probably a better choice than $ since it’s easier to type
  • I can use any syntax I want – the choice isn’t so obvious

After some experimentation, I think that prefixing variable names with . is the ‘best’ choice, without looking too out of place. What do you think?
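
To illustrate why a sigil helps tooling, here is a minimal sketch in Python (Fjord has no implementation to quote here) of how a highlighter could tell variables from function names one token at a time, with no symbol table. The names classify and classify_line are hypothetical, and splitting on whitespace is a simplifying assumption that ignores strings and grouping.

```python
def classify(token: str) -> str:
    """Classify one whitespace-separated token of a hypothetical Fjord line."""
    # Assumption: a variable is a '.' followed by an identifier.
    if token.startswith(".") and token[1:].isidentifier():
        return "variable"
    # Assumption: any bare identifier is a function (command) name.
    if token.isidentifier():
        return "function"
    return "other"


def classify_line(line: str) -> list:
    """Pair every token on the line with its classification."""
    return [(tok, classify(tok)) for tok in line.split()]


print(classify_line("add .x 2"))
# [('add', 'function'), ('.x', 'variable'), ('2', 'other')]
```

Because the decision depends only on the token's first character, the same rule would work for any other prefix, such as $ or #.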

Case conventions

How to separate words in different case conventions:

  • snake_case: hold shift, press the hyphen/underscore key, let go of shift before you start typing the next word
  • kebab-case: press the hyphen/underscore key
  • camelCase: hold shift, type the first character of the next word, let go before you type the next character

The final shell will hopefully have tab-completion that supports case-insensitivity, so here is what that list would look like if case isn’t a consideration:

  • snake_case: hold shift, press the hyphen/underscore key, let go of shift before you start typing the next word
  • kebab-case: press the hyphen/underscore key
  • camelCase: nothing

Of course, the tab completion could also intelligently convert hyphens to underscores and vice-versa:

  • snake_case: press the hyphen/underscore key
  • kebab-case: press the hyphen/underscore key
  • camelCase: nothing

Camel case is still the easiest choice to type, so that’s what I’ve decided on. What’s your opinion?
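
The kind of forgiving tab completion described above could be sketched roughly as follows, in Python. The function names normalise and complete are made up for illustration, and dropping separators entirely (rather than converting between hyphens and underscores) is an assumption of this sketch, chosen so that camelCase names match as well.

```python
def normalise(name: str) -> str:
    """Fold case and drop '-' / '_' so different spellings compare equal."""
    return name.replace("-", "").replace("_", "").lower()


def complete(typed: str, candidates: list) -> list:
    """Return candidates whose normalised form starts with the normalised input."""
    key = normalise(typed)
    return [c for c in candidates if normalise(c).startswith(key)]


names = ["getUserName", "get_user_id", "list-files"]
print(complete("getuser", names))  # ['getUserName', 'get_user_id']
print(complete("list_f", names))   # ['list-files']
```

With matching like this, the typing cost of the separator matters less, since the user can leave it out and let completion restore the canonical spelling.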

Variable definitions

This one is pretty obvious to me:

let name = value

But I guess the equals sign is implied …

let name value

Do you even need let?

name value

Now this looks exactly like a function call. Maybe it’s better with just the equals sign?

name = value

This is the most concise choice, and is also familiar to users of Haskell, Python and Ruby (and probably others). Or is there a better option I haven’t considered?

Function definitions

Most languages have a separate syntax for defining functions and variables:

// JavaScript

function name(param) {
    body
}

var variable = value; // ‘var’ could also be ‘let’, or ‘const’ to declare a constant

// Rust

fn name(param: SomeType) -> AnotherType {
    body
}

let variable = value;

Lots of modern languages have support for lambdas, closures, function literals, anonymous functions, whatever you call them. This leads to a duplication of the ways to define a function:

// Rust

// using normal function declaration syntax
fn name(param: SomeType) -> AnotherType {
    body
}

// using a closure
let name = |param| {
    body
};

// JavaScript

function name(param) {
    body
}

// using an arrow function
let name = (param) => {
    body
};

# Python

def name(param):
    body

# using a lambda
name = lambda param : body

The ‘anonymous function form’ in each of these examples is usually limited in some way compared to normal functions, which stops you from using it for everything, but I still find the duplication a little annoying.

Haskell uses a very similar syntax for function and variable definitions:

varName = value
functionName param = body

Haskell, however, has the same duplication as before, with a lambda syntax of its own:

name param = body

-- using an anonymous function
name = (\param -> body)

I really like the look and brevity of Haskell’s normal function definitions, but want to avoid the ‘duplication’ with anonymous functions. Initially something like this springs to mind:

name = param {
    body
}

But I plan on adding support for block expressions, which would mean that the syntax above would be confused with a function call with the name param and a parameter whose value is whatever body evaluates to.

So maybe something like this?

name = |param| {
    body
}

It is easier to type almost any character other than |, though:

name = fn param {
    body
}

Maybe it shouldn’t be required that function definitions use a block expression though?

name = fn param body

Some kind of separation would be nice:

name = fn param -> body

Although the arrow looks very nice, it would be easier to type if : is used instead:

name = fn param : body

Now we don’t need the fn any more:

name = param : body

Here’s what this syntax looks like with multiple parameters:

name = param1 param2 : body

Those block expressions can of course be used for the body of a function:

name = param1 param2 : {
    do
    lots
    of
    stuff
}
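
To get a feel for how little machinery the single-line form needs, here is a rough Python sketch that splits a definition into its name, parameters and body. The function parse_definition is hypothetical. Treating a definition without a colon as a plain variable binding is an assumption of this sketch, not something decided above, and it does not handle colons inside string values or multi-line block bodies.

```python
def parse_definition(line: str):
    """Split 'name = p1 p2 : body' into (name, params, body).

    Assumption: a definition with no ':' is a plain variable binding,
    returned with an empty parameter list.
    """
    name, _, rest = line.partition("=")
    head, sep, body = rest.partition(":")
    if not sep:  # no ':' found, e.g. 'x = 5'
        return name.strip(), [], rest.strip()
    return name.strip(), head.split(), body.strip()


print(parse_definition("name = param1 param2 : body"))
# ('name', ['param1', 'param2'], 'body')
print(parse_definition("x = 5"))
# ('x', [], '5')
```

Seen this way, variables and functions share one definition syntax, and a variable is simply the case with no parameters and no separator.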

What do you think function definitions should look like? Do you think that the duplication of syntaxes for normal and anonymous functions is needed for some reason?

Strings

Simple enough: use double quotes. I’ve decided against single quotes because too many strings contain single quotes themselves, which would all require escaping.

Personally, I don’t like how some languages give you the choice of quotes, because it leads to inconsistency.

String interpolation

I think string interpolation deserves a special syntax for the string itself, like Python does in its f-strings:

"Hello, Sarah!"   # literal
f"Hello, {name}!" # interpolation

I like how it only takes one extra character to create an f-string, so this is something I hope to copy for Fjord. Besides, string interpolation is one of the most common tasks in a shell. This is in contrast to Ruby, where strings that have interpolations don’t get any differentiation from literals:

"Hello, Sarah!"   # literal
"Hello, #{name}!" # interpolation

Swift has the same ‘problem’:

"Hello, Sarah!"   // literal
"Hello, \(name)!" // interpolation

Here is what string interpolations could look like for Fjord, using the variable usage syntax from before:

f"Hello, .name!"

But this doesn’t allow for arbitrary expressions like all the syntaxes above do, only variables, so some kind of delimiter around the interpolation is needed. After playing around with it for a bit, I came to the conclusion that the curly-brace syntax that Python uses is my favourite.

f"Hello, {.name}!"

This is three extra characters compared to a traditional shell, such as bash in this example:

"Hello, $name!"

This is a little misleading, however, as that syntax can’t be used for interpolating any arbitrary expression like Fjord’s can:

f"Hello, {getUserName}!"

The syntax to do the same thing in bash takes the same number of characters:

"Hello, $(getusername)!" # I wrote it in lowercase because camel case command names look wrong

Do you think that strings containing interpolations should get a different syntax to literals? Do you think that you should be able to interpolate any expression, or is being able to interpolate just variables enough? Are you a fan of Python’s f-string syntax that I nicked for Fjord?
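
For reference, expanding the {.name} form is straightforward. The following Python sketch handles only variable placeholders, not arbitrary expressions; interpolate and PLACEHOLDER are hypothetical names, and the rule for what counts as a valid variable name is an assumption.

```python
import re

# Assumption: a placeholder is '{.' + identifier + '}'.
PLACEHOLDER = re.compile(r"\{\.([A-Za-z_][A-Za-z0-9_]*)\}")


def interpolate(template: str, variables: dict) -> str:
    """Replace each '{.name}' with str(variables['name']).

    An unknown variable raises KeyError rather than expanding to nothing.
    """
    return PLACEHOLDER.sub(lambda m: str(variables[m.group(1)]), template)


print(interpolate("Hello, {.name}!", {"name": "Sarah"}))  # Hello, Sarah!
```

Supporting arbitrary expressions would mean handing the text between the braces to the language's own expression parser instead of a regular expression.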

Options and named function parameters

A convention has arisen over the decades for passing options to commands:

$ command -o # short option name
$ command --option # long option name
$ command --speed=25 # option with value
$ command --speed 25 # most commands support using a space instead of =
$ command --flag --speed 25 "positional arguments follow options"

This isn’t set in stone, so sometimes I’m caught off guard by a command that doesn’t completely follow the convention:

$ find . -name '*foo*' # for some reason find uses a single dash for option names
$ command --speed=25 --speed 25 # some commands accept only one of these forms
$ ls /path/to/directory --long # the GNU utilities support putting options after
                               # positional arguments, while most commands don’t

I’ve realised that the whole ‘options’ convention that has appeared over time bears a striking resemblance to the handling of function parameters in some languages. Apart from being able to pass an option without a value (this can be viewed as equivalent to setting it to true), options are exactly like named parameters which can have default values.

As my main programming language is Rust (which doesn’t have named or default parameters), I’m not really familiar with these concepts. Python’s approach to named parameters and default parameter values seems very reasonable, so maybe Fjord could imitate it:

downloadUrl = url timeout=5 httpVersion=1.1 : doTheThing

These are all equivalent:

downloadUrl "https://google.com"
downloadUrl url="https://google.com" 5
downloadUrl timeout=5 url="https://google.com"

I’m not so sure about that = without any space around it – it kind of irks me. Here’s what it looks like with Swift-/Ruby-style colons:

downloadUrl = url timeout: 5 httpVersion: 1.1 : doTheThing
downloadUrl "https://google.com"
downloadUrl url: "https://google.com" 5
downloadUrl timeout: 5 url: "https://google.com"

It looks a little strange to me without commas separating the arguments, so I think I prefer Python’s style for now.

This still doesn’t take into account how the command option convention has short option names and how if you don’t give an option a value it’s equivalent to setting its value to true. If we ignore those two features, here’s what a call to ls that uses a few different parameters could look like:

ls all=true long=true color=never /path/to/dir1 /path/to/dir2

Here’s what that looks like in a traditional shell:

$ ls -al --color=never /path/to/dir1 /path/to/dir2

# With long option names
$ ls --all --long --color=never /path/to/dir1 /path/to/dir2

Much, much cleaner. I’m not really sure how to integrate short named parameters and if-you-pass-a-named-parameter-without-a-value-it’s-a-boolean-true, so if you have any ideas, I’d appreciate it.
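
One way to connect the two worlds is to translate named parameters into the conventional option syntax when calling an external program. The Python sketch below shows one possible mapping; to_argv is a hypothetical helper, and the rules (true becomes a bare flag, false is omitted, everything else becomes --name=value) are assumptions rather than a settled design. The option names simply mirror the ls example above.

```python
def to_argv(command: str, named: dict, positional: list) -> list:
    """Translate named parameters into conventional long options.

    Assumed rules: True becomes a bare flag, False is omitted, and any
    other value becomes '--name=value'. Positional arguments come last.
    """
    argv = [command]
    for name, value in named.items():
        if value is True:
            argv.append(f"--{name}")
        elif value is False:
            continue
        else:
            argv.append(f"--{name}={value}")
    return argv + list(positional)


print(to_argv("ls", {"all": True, "long": True, "color": "never"},
              ["/path/to/dir1", "/path/to/dir2"]))
# ['ls', '--all', '--long', '--color=never', '/path/to/dir1', '/path/to/dir2']
```

Short option names have no place in this mapping, which is one reason their integration remains an open question.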

@tobimd

tobimd commented Mar 30, 2020

I think you might have misunderstood me: I was saying that I like using alphabetic words for !, && and ||, but not for < > == != <= >=. So > would still be taken.

You are right, sorry.

5 gt 6 eq false

is just too horrible. What do you think?

Personally I don't mind, but I guess a major problem would be the inconsistency of having some logical operators be symbols and others be words. So either all words or none would be best, I guess.

I think I would prefer using -> or :: or some other symbol to separate parameters from the function body while keeping Python-style operators.

Now that I think of it, let's assume we decide on replacing !, && and || with the Python-style operators (not, and & or), with the rest being their respective symbols.

If we were to use > as a separator for parameters and body inside a function definition, would it be a problem when deciding if it's being used as a logical operator or as a separator? Because, technically, to be used as a logical operator, there needs to be a literal value or a variable (which we know will be preceded by a dot .) on both sides of the operator. So when defining a function, there would just be parameters without a . (dot) and no literals.

I guess the pros are that it's just one character to type, and relatively easy to use (can be written with just one hand), and the cons are that looking at it for the first time, may lead to confusion as to why there are 2 uses for that symbol ("Why are we comparing a parameter with something unrelated?").

Please correct me if I'm wrong, it's hard to keep track of all the possible outcomes and uses that may break the idea. Also, on a side note, even though we can't assume everyone will use syntax highlighting, most of these issues can be resolved by clear syntax highlighting, so I think it's a good idea to keep in mind that. I mean, we all know we can't start coding on a new editor without using syntax highlighting because it just burns our eyes otherwise.

@omega16

omega16 commented Mar 31, 2020

I think much of this can be decided based on the language internals.

Functions

Suppose you support partial application. Then it makes more sense to use a currying-like syntax, and treat variables as constant functions.

At least in the sense you can use normal like reduction.

By example :

Suppose we have

g a1 a2 a3 = a1*a2*a3
f a b = g (a+b)

And then we want to evaluate

f k l m z

  1. Take the leftmost, outermost function (in this case "f")
  2. Check its arity (the arity of f is 2)
  3. Take from left to right as many arguments as needed to evaluate (in this case "k" and "l"), or as many as you can if it needs more arguments than are available
  4. Apply or partially apply it to the chosen arguments (in this case "f k l" = g (k+l))
  5. Substitute the result of the application or partial application and repeat until there are no more arguments

Complete process :

f k l m z
(f k l) m z
g (k+l) m z
(k+l)*m*z

Of course, you first need to desugar things like (k+l) to (+ k l). Supposing this, it ends up like

(k+l)*m*z
(* (* (+ k l) m) z)

To evaluate, you just need to check whether "k" and "l" make sense in this context as arguments to "+". So if "+" is the usual "+" on integers (no overloading), check that the arity of "k" is 0 and that its value is an integer, then replace k with its value (same for l).

What's my point? If you use normal reduction for functions, then things like partial application, currying and no distinction between functions and variables make sense (with currying it makes much more sense to use something like Haskell's "$" operator).

That means some syntax choices would mean more or less work to translate for your language internals.

So I recommend first choosing an evaluation model, then choosing a syntax that translates best to your model (or not, but in that case you would want to define the translation process as explicitly as you can). Then you will know which syntactic sugar is convenient and which is not, all based on ease of use and ease of implementation.
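
The arity-driven reduction steps above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the Fn class and apply function are hypothetical names, plain Python values stand in for variables, and k, l, m, z are given the concrete values 1, 2, 3, 4.

```python
class Fn:
    """A function value with an explicit arity, so it can be applied step by step."""

    def __init__(self, arity, body):
        self.arity = arity
        self.body = body


def apply(fn, args):
    """Apply fn to args left to right, consuming 'arity' arguments at a time."""
    args = list(args)
    while args:
        if not isinstance(fn, Fn):
            raise TypeError(f"{fn!r} takes no arguments, but {len(args)} remain")
        if len(args) < fn.arity:
            # Too few arguments: return a partial application that waits for the rest.
            taken, f = args, fn
            return Fn(f.arity - len(taken), lambda *rest: f.body(*taken, *rest))
        taken, args = args[:fn.arity], args[fn.arity:]
        fn = fn.body(*taken)  # substitute the result and keep going
    return fn


g = Fn(3, lambda a1, a2, a3: a1 * a2 * a3)
f = Fn(2, lambda a, b: apply(g, [a + b]))  # f a b = g (a+b), a partial application of g

print(apply(f, [1, 2, 3, 4]))  # f 1 2 3 4 -> g 3 3 4 -> 36
```

Here apply(f, [1, 2]) on its own yields a new Fn of arity 2, which is the partial application g (1+2) waiting for its remaining arguments.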

Blocks

If the language would be used as shell, well, It would be used on some command line?
If so, some syntax to the blocks that makes sense (at least to me) is to allow some indentation or line break rules.

By example , I like the bash assumption of extension to next line for the block, so you can do some like

for i in "abcdef"
do 
echo $i
done

in the command line. I like it since much of the time I forgot the "do" , and use of next line allows me to add it after the line break.

Strings and named commands

Is like for functions, How will you use them? Will you like to encode strings in some easy mutable structure? Will you generate a new string every time you use them? How would the command be passed to external process? As it comes? Some processing that allow other features?
Thinking of how it works inside, would allow you to choose between a lot of syntax or at least to reduce to simple cases when you can use your own taste to choose.

@lunacookies
Author

@tubi-carrillo

I think you might have misunderstood me: I was saying that I like using alphabetic words for !, && and ||, but not for < > == != <= >=. So > would still be taken.

You are right, sorry.

No problem :)

a major problem would be the inconsistency of having some logical operators be symbols and others be words. So either all words or none would be best, I guess.

Hmm, I hadn’t thought about that aspect. The main reason I’d prefer for it to be mixed is that it is easier to type, and is easier to understand. Operators like < and == have an easily and well-understood connection with their meaning, while, to me at least, &&, || and ! seem more arbitrary.

If we were to use > as a separator for parameters and body inside a function definition, would it be a problem when deciding if it's being used as a logical operator or as a separator? Because, technically, to be used as a logical operator, there needs to be a literal value or a variable (which we know will be preceded by a dot .) on both sides of the operator. So when defining a function, there would just be parameters without a . (dot) and no literals.

Yep, you’re right, there wouldn’t be any ambiguity.

I guess the pros are that it's just one character to type, and relatively easy to use (can be written with just one hand), and the cons are that looking at it for the first time, may lead to confusion as to why there are 2 uses for that symbol ("Why are we comparing a parameter with something unrelated?").

I see this as the main downside of using > for this purpose. IMO -> conveys more ‘arrowness’, while > looks more like an arbitrary symbol, but I’ll have to play around with it for a bit to decide. I guess an upside to using -> is that it’s already used in Haskell for function type signatures and lambdas, but that doesn’t really matter because it’s just a matter of getting used to it.

Also, on a side note, even though we can't assume everyone will use syntax highlighting, most of these issues can be resolved by clear syntax highlighting, so I think it's a good idea to keep in mind that.

Good point, they could be highlighted differently. I’ve noticed that Xcode highlights operators in Swift as a function from outside of your project, which makes sense as they’re actually functions that can be overloaded. Maybe binary and unary operators can be highlighted as language functions, while all the other ones (. for variables and whatever gets chosen for function parameter/body separation) can be highlighted as punctuation. Sorry, I’m getting ahead of myself :)

@lunacookies
Author

lunacookies commented Mar 31, 2020

@omega16

Functions

Suppose you support partial application.

Would you say this is a useful feature to have in a shell?

Then it makes more sense to use a currying-like syntax, and treat variables as constant functions.

At least in the sense you can use normal like reduction.

By example :

Suppose we have

g a1 a2 a3 = a1*a2*a3
f a b = g (a+b)

And then we want to evaluate

f k l m z

  1. Take the leftmost, outermost function (in this case "f")
  2. Check its arity (the arity of f is 2)
  3. Take from left to right as many arguments as needed to evaluate (in this case "k" and "l"), or as many as you can if it needs more arguments than are available
  4. Apply or partially apply it to the chosen arguments (in this case "f k l" = g (k+l))
  5. Substitute the result of the application or partial application and repeat until there are no more arguments

Complete process :

f k l m z
(f k l) m z
g (k+l) m z
(k+l)*m*z

It was really interesting to read through this, thank you for that! I haven’t really looked into currying and partial application, but it definitely seems very interesting.

Of course, you first need to desugar things like (k+l) to (+ k l). Supposing this, it ends up like

(k+l)*m*z
(* (* (+ k l) m) z)

To evaluate, you just need to check whether "k" and "l" make sense in this context as arguments to "+". So if "+" is the usual "+" on integers (no overloading), check that the arity of "k" is 0 and that its value is an integer, then replace k with its value (same for l).

What's my point? If you use normal reduction for functions, then things like partial application, currying and no distinction between functions and variables make sense (with currying it makes much more sense to use something like Haskell's "$" operator).

That means some syntax choices would mean more or less work to translate for your language internals.

I get what you’re saying, but don’t you think it’s useful to visually see the difference (if it’s different syntactically then syntax highlighting can change colour here too) between constant access (cheap) vs function call (potentially expensive)?

So I recommend first choosing an evaluation model, then choosing a syntax that translates best to your model (or not, but in that case you would want to define the translation process as explicitly as you can). Then you will know which syntactic sugar is convenient and which is not, all based on ease of use and ease of implementation.

This sounds like a good idea to me. Is there some kind of list (maybe list of categories?) of evaluation models, so that I can see which might be most appropriate for a shell?

Blocks

If the language would be used as shell, well, It would be used on some command line?

Yes.

If so, some syntax to the blocks that makes sense (at least to me) is to allow some indentation or line break rules.

By example , I like the bash assumption of extension to next line for the block, so you can do some like

for i in "abcdef"
do 
echo $i
done

in the command line. I like it since much of the time I forgot the "do" , and use of next line allows me to add it after the line break.

A while ago I was thinking about how I could possibly add that feature that all REPLs seem to have, where they go into a special mode when they are waiting for you to close something, e.g. a fi in bash or the end of an indentation level in Python. I realised that an easy way to add something similar to this is to create a keybind that adds a newline in the input itself, allowing the user to split their input into multiple lines. I was thinking that maybe ctrl-enter might be a nice option for this.

Strings and named commands

Is like for functions, How will you use them?

Named parameters will be used mainly to slightly alter the behaviour of functions, e.g. an ls function may have a named parameter all that, when set to true, also shows hidden files.

Will you like to encode strings in some easy mutable structure? Will you generate a new string every time you use them?

Sorry, I’m not really sure what you mean by this.

How would the command be passed to external process? As it comes? Some processing that allow other features?

I’m really undecided about how to handle external programs at the moment, but I’m leaning towards translating the syntax from Fjord to the convention that most commands seem to use today.

Thinking of how it works inside, would allow you to choose between a lot of syntax or at least to reduce to simple cases when you can use your own taste to choose.

I would definitely prefer to have as little syntax as possible, mainly because it simplifies the language and makes it easier to type.

@liljencrantz

That’s an interesting design idea, I’ve never heard about that before. What would happen if you went to ‘execute’ (1 2 3 4)? Would it ‘return’ 1? Or would it throw an error that 1 doesn’t take any parameters?

If you do that in Lisp, Lisp will be cross with you. In crush, I've done things differently.

First of all, I added a command val that simply outputs its argument, so val 3 puts 3 in the stream. Secondly, if a list has exactly one element, and that element is not executable, then it gets implicitly converted into a call to val with that element. That means that in Crush, you can write my_variable := (find /); my_variable | head 1. The first command will create a long-running thread that outputs all files in your system, and it will stick the table_stream containing that output into the variable my_variable. The second command will send that stream as input to the head command, which will output the first line. Note that you can execute the head command like that many times, and each time you do, it will output the next line of the stream. Also note that the find thread will output a few lines until the channel becomes full and then block until somebody starts consuming the channel.

And what if you are just inspecting a variable? I assume that, if it is a REPL, you could use the following:

crush> var_name
value

Yup, that works because of the implicit val thing I mentioned above.

Or is every variable a ‘function’ that takes no parameters? My mind is kind of broken by this :)

I am old. I have stopped caring about which case convention people use, I just really wish they picked a consistent one and stuck with it.

I definitely agree that consistency is more important than anything else – I just thought that if I needed to pick one, I might as well pick one suited to the use case. What do you think of making the use of a case convention other than ‘the chosen one’ a warning?

Makes sense. I wish compilers did that in general.

To catch bugs from typos fast. Exactly like how, in Rust, you have to declare a variable using let before you're allowed to reassign it using =.

How does having a separate syntax for declaration and assignment prevent typos? Is it that it catches a situation in which you meant to declare a variable earlier, but forgot to, so try to assign without declaration?

It protects you against things like

crush> fooo := X
crush> foo = Y # Typo, I meant fooo here!

But even more importantly, it protects you against

crush> fooo := X
# 10000 lines of unrelated code
crush> fooo := Z # This now becomes a new variable, so we will not accidentally clobber the other fooo. Thanks, explicit declarations!

It makes sense to me in Rust, though, because the whole mutability/immutability thing, as well as the type, has to be decided at the binding’s entry point.

If that was the only reason, we'd be able to use foo = bar in situations where you don't need mutability, and we could have the syntax of mut foo = bar in the much rarer cases where mutability is needed.

I think long/short options and prefixing options with -/-- are to be considered workarounds, rather than good designs. I could imagine adding prefix matching on option names, like in GNU getopt_long, so if you only have one argument that begins with the letter 'a', it would be enough to say a=fnurple instead of add-humongous-cow=fnurple, but in general, I think that emulating getopt-style argument passing is a bad idea.

I think that whole ‘unambiguous prefix’ thing is a recipe for unmaintainable code, so I’ll be avoiding that!

getopt-long has had that feature since forever, and people seem to not be abusing it too badly. But maybe that's because they also have the option of using the short options, hard to say.
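
For concreteness, the ‘unambiguous prefix’ rule under discussion can be sketched in a few lines of Python. This is not how getopt_long or Crush implement it; resolve_option and the option names other than add-humongous-cow are made up for illustration.

```python
def resolve_option(prefix: str, options: list) -> str:
    """Return the single option that starts with prefix, or raise ValueError."""
    if prefix in options:  # an exact match always wins
        return prefix
    matches = [o for o in options if o.startswith(prefix)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown option: {prefix}")
    raise ValueError(
        f"ambiguous option {prefix!r}: could be {', '.join(sorted(matches))}"
    )


options = ["add-humongous-cow", "remove-cow", "recursive"]
print(resolve_option("a", options))  # add-humongous-cow
```

The maintainability worry shows up directly here: resolve_option("r", ...) would have worked before a second option starting with r was added, and raises an error afterwards.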

But maybe with tab completion a lack of short option names won’t be such a hindrance. What about options without values? Do you think it’s worth adding a shorthand for =true, simply because of how common it is?

What would you think of such a shorthand:

ls all=true long=true
ls all= long=

+1, I have been thinking of making --foo be an alias for foo=true which is exactly the same idea. I think your suggestion looks nicer, but I don't see a way to make it parseable. Because of the lack of separators between expressions, it's impossible to use the same operator both as an infix and postfix operator. That's also why Crush needs to use neg instead of - for negating a number: - is already used as an infix subtraction operator.

If you're planning on using . as a sigil for variables, how are you planning on talking about file names? Like, if I want to touch the file 'foo.lock' what do I write? How about wildcards? How do you do the equivalent of cat *.txt?

I plan on adding some kind of a special syntax for filenames, possibly using single quotes:

Interesting. Right now, crush allows you to do exactly that. 'foo' is a file named foo.

But also a bit inconvenient, no? Super-common shell operations turn into a chore, e.g. cd .. becomes cd '..', cat foo.txt becomes cat 'foo.txt'. Not a deal breaker, but definitely feels annoying enough to avoid for me.

Crush mostly solves this by also importing the content of the current directory into your namespace, so if you have a file named foo.txt in your current directory, that means that there is a variable named foo.txt in your namespace. Files in Crush support a / operator that works like the . operator in e.g. Rust, but does file lookup instead of regular member lookup.

@liljencrantz

liljencrantz commented Mar 31, 2020

@arzg
A while ago I was thinking about how I could possibly add that feature that all REPLs seem to have, where they go into a special mode when they are waiting for you to close something, e.g. a fi in bash or the end of an indentation level in Python. I realised that an easy way to add something similar to this is to create a keybind that adds a newline in the input itself, allowing the user to split their input into multiple lines. I was thinking that maybe ctrl-enter might be a nice option for this.

Try out fish. It detects if you have an unterminated block command and its editor goes into multiline mode. You can still move the cursor between lines and edit the whole command.

@lunacookies
Author

@liljencrantz

That’s an interesting design idea, I’ve never heard about that before. What would happen if you went to ‘execute’ (1 2 3 4)? Would it ‘return’ 1? Or would it throw an error that 1 doesn’t take any parameters?

If you do that in Lisp, Lisp will be cross with you. In crush, I've done things differently.

First of all, I added a command val that simply outputs its argument, so val 3 puts 3 in the stream. Secondly, if a list has exactly one element, and that element is not executable, then it gets implicitly converted into a call to val with that element. That means that in Crush, you can write my_variable := (find /); my_variable | head 1. The first command will create a long-running thread that outputs all files in your system, and it will stick the table_stream containing that output into the variable my_variable. The second command will send that stream as input to the head command, which will output the first line. Note that you can execute the head command like that many times, and each time you do, it will output the next line of the stream. Also note that the find thread will output a few lines until the channel becomes full and then block until somebody starts consuming the channel.

And what if you are just inspecting a variable? I assume that, if it is a REPL, you could use the following:

crush> var_name
value

Yup, that works because of the implicit val thing I mentioned above.

Do you think that this might be overcomplicating things? Because it sure seems very complex to me :)

I am old. I have stopped caring about which case convention people use, I just really wish they picked a consistent one and stuck with it.

I definitely agree that consistency is more important than anything else – I just thought that if I needed to pick one, I might as well pick one suited to the use case. What do you think of making the use of a case convention other than ‘the chosen one’ a warning?

Makes sense. I wish compilers did that in general.

At the moment Fjord doesn’t have a warnings system, so you have to use CAMEL CASE THE ONLY CASE CONVENTION or your program will error out :)

To catch bugs from typos fast. Exactly like how, in Rust, you have to declare a variable using let before you're allowed to reassign it using =.

How does having a separate syntax for declaration and assignment prevent typos? Is it that it catches a situation in which you meant to declare a variable earlier, but forgot to, so try to assign without declaration?

It protects you against things like

crush> fooo := X
crush> foo = Y # Typo, I meant fooo here!

But even more importantly, it protects you against

crush> fooo := X
# 10000 lines of unrelated code
crush> fooo := Z # This now becomes a new variable, so we will not accidentally clobber the other fooo. Thanks, explicit declarations!

Is something like this an example?

let mut counter = 0;
for i in 0..100 {
    counter += 1;
    let counter = 0; // This doesn’t have any effect because it’s a new variable that disappears at the end of the scope
}

I think long/short options and prefixing options with -/-- are to be considered workarounds, rather than good designs. I could imagine adding prefix matching on option names, like in GNU getopt_long, so if you only have one argument that begins with the letter 'a', it would be enough to say a=fnurple instead of add-humongous-cow=fnurple, but in general, I think that emulating getopt-style argument passing is a bad idea.

I think that whole ‘unambiguous prefix’ thing is a recipe for unmaintainable code, so I’ll be avoiding that!

getopt-long has had that feature since forever, and people seem to not be abusing it too badly. But maybe that's because they also have the option of using the short options, hard to say.
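To make the ‘unambiguous prefix’ idea concrete, here is a minimal sketch in Rust. The function name match_option and the option names are hypothetical, and this is neither Crush's nor getopt_long's actual code. It only illustrates the rule under discussion: a prefix is accepted when exactly one option starts with it, and in this sketch an exact match always wins.

```rust
/// Resolve a possibly abbreviated option name against the known options.
/// Hypothetical helper for illustration only.
fn match_option<'a>(prefix: &str, options: &[&'a str]) -> Result<&'a str, String> {
    // An exact match wins, even if it is also a prefix of a longer option.
    if let Some(exact) = options.iter().copied().find(|o| *o == prefix) {
        return Ok(exact);
    }

    // Otherwise collect every option that starts with the given prefix.
    let candidates: Vec<&str> = options
        .iter()
        .copied()
        .filter(|o| o.starts_with(prefix))
        .collect();

    match candidates.as_slice() {
        [only] => Ok(*only),
        [] => Err(format!("unknown option: {}", prefix)),
        _ => Err(format!(
            "ambiguous option {}: could be {}",
            prefix,
            candidates.join(", ")
        )),
    }
}

fn main() {
    let options = ["add-humongous-cow", "verbose", "version"];
    println!("{:?}", match_option("a", &options));
    println!("{:?}", match_option("ver", &options));
}
```

The maintainability worry shows up in the second call: "ver" stops working as soon as a second option sharing that prefix exists, so adding an option can break scripts that relied on the abbreviation.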

But maybe with tab completion a lack of short option names won’t be such a hindrance. What about options without values? Do you think it’s worth adding a shorthand for =true, simply because of how common it is?

What would you think of such a shorthand:

ls all=true long=true
ls all= long=

+1, I have been thinking of making --foo be an alias for foo=true which is exactly the same idea.

I was thinking of making it so that when you tab-complete a named parameter, it automatically inserts the = ready for you to give it a value. This would make it easier to default to true.

I think your suggestion looks nicer, but I don't see a way to make it parseable.

I’m not really sure I understand why this isn’t parseable though. If no whitespace is allowed around the =, if it sees a space after = then it knows to default to true.

Because of the lack of separators between expressions, it's impossible to use the same operator both as an infix and postfix operator. That's also why Crush needs to use neg instead of - for negating a number: - is already used as an infix subtraction operator.

Don’t lots of languages use - for negation and minus though?

If you're planning on using . as a sigil for variables, how are you planning on talking about file names? Like, if I want to touch the file 'foo.lock' what do I write? How about wildcards? How do you do the equivalent of cat *.txt?

I plan on adding some kind of a special syntax for filenames, possibly using single quotes:

Interesting. Right now, crush allows you to do exactly that. 'foo' is a file named foo.

But also a bit inconvenient, no? Super-common shell operations turn into a chore, e.g. cd .. becomes cd '..', cat foo.txt becomes cat 'foo.txt'. Not a deal breaker, but definitely feels annoying enough to avoid for me.

That definitely seems quite annoying. I would like to avoid having to escape common filename characters like (, and ) because it seems so … wrong to have to do that. The only way I see that would allow avoiding this is some kind of delimiter before and after a filename, which takes up at least two extra characters.

Well, the closing quote could be auto-inserted (like the pair matching in most editors), and maybe the opening one could be part of the tab-completion suggestion?

Crush mostly solves this by also importing the content of the current directory into your namespace, so if you have a file named foo.txt in your current directory, that means that there is a variable named foo.txt in your namespace. Files in Crush support a / operator that works like the . operator in e.g. Rust, but does file lookup instead of regular member lookup.

That does seem like a cool idea, but I fear that it might be quite fragile – imagine not being able to define a variable if there happens to be a file with that name in the current directory.

@lunacookies
Author

@liljencrantz

A while ago I was thinking about how I could possibly add that feature that all REPLs seem to have, where they go into a special mode when they are waiting for you to close something, e.g. a fi in bash or the end of an indentation level in Python. I realised that an easy way to add something similar to this is to create a keybind that adds a newline in the input itself, allowing the user to split their input into multiple lines. I was thinking that maybe ctrl-enter might be a nice option for this.

Try out fish. It detects if you have an unterminated block command and its editor goes into multiline mode. You can still move the cursor between lines and edit the whole command.

The problem I have is I’m not sure how to detect an unterminated block command. I’m using the Nom library for parsing, which has a bunch of parsers in the category of ‘streaming’. I believe this is made for networking, where all the data may not have arrived yet, so the parser can say that more data is needed. Perhaps this could be used to model an incomplete user input?
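As a rough sketch of the idea, independent of any parser library: the REPL could classify each input as complete, incomplete, or invalid, and only switch to multiline mode in the incomplete case. Everything below is an assumption made for illustration. The names InputStatus and check_input are made up, and the sketch pretends that blocks are delimited only by ( ) and { }, which may not match Fjord's real syntax. It does not use Nom.

```rust
/// Outcome of checking one piece of REPL input (hypothetical type).
#[derive(Debug, PartialEq)]
enum InputStatus {
    /// Every opened delimiter has been closed: safe to execute.
    Complete,
    /// Something is still open: keep reading lines.
    Incomplete,
    /// A closing delimiter appeared without a matching opener.
    Invalid,
}

/// Track open delimiters on a stack and report what state the input is in.
fn check_input(input: &str) -> InputStatus {
    let mut open = Vec::new();

    for c in input.chars() {
        match c {
            '(' | '{' => open.push(c),
            ')' | '}' => {
                let expected = if c == ')' { '(' } else { '{' };
                if open.pop() != Some(expected) {
                    return InputStatus::Invalid;
                }
            }
            _ => {}
        }
    }

    if open.is_empty() {
        InputStatus::Complete
    } else {
        InputStatus::Incomplete
    }
}

fn main() {
    println!("{:?}", check_input("echo (add 1 2)"));
    println!("{:?}", check_input("echo (add 1"));
    println!("{:?}", check_input("echo add 1)"));
}
```

A streaming parser would give the same three-way split without a separate pass: success, a hard error, or a request for more data, with the last case mapped to multiline mode.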

@liljencrantz

Do you think that this might be overcomplicating things? Because it sure seems very complex to me :)

Maybe. I feel the syntax is really convenient and almost always does what you want, though.

Is something like this an example?

let mut counter = 0;
for i in 0..100 {
    counter += 1;
    let counter = 0; // This doesn’t have any effect because it’s a new variable that disappears at the end of the scope
}

I guess. If you're dealing with multiple scopes, I believe it's really important to be able to differentiate between reassigning to a variable in an outer scope vs declaring a new variable in an inner scope that has the same name as some other variable.
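Here is a runnable version of that Rust snippet, wrapped in a function (count_iterations is a name made up for this sketch) so the difference can be observed. The outer counter is reassigned on every iteration, while the inner let only creates a short-lived variable that shadows it until the end of the loop body.

```rust
/// Returns the value of the outer `counter` after the loop.
#[allow(unused_variables)]
fn count_iterations() -> i32 {
    let mut counter = 0;
    for _ in 0..100 {
        // Reassigns the variable declared in the outer scope.
        counter += 1;
        // Declares a new variable that shadows the outer one until the
        // end of this iteration; the outer `counter` is untouched.
        let counter = 0;
    }
    counter
}

fn main() {
    println!("{}", count_iterations()); // prints 100
}
```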

I’m not really sure I understand why this isn’t parseable though. If no whitespace is allowed around the =, if it sees a space after = then it knows to default to true.

But then you've suddenly made your language whitespace sensitive in a way that I believe will trip up a lot of people. If foo bar = baz is different from foo bar=baz, there will be endless errors as a result.

Because of the lack of separators between expressions, it's impossible to use the same operator both as an infix and postfix operator. That's also why Crush needs to use neg instead of - for negating a number: - is already used as an infix subtraction operator.

Don’t lots of languages use - for negation and minus though?

Those languages separate function call arguments with something, usually commas. This is unambiguous:

foo 1, 2, -3

This is ambiguous unless you make your language white space sensitive:

foo 1 2 - 3

I really don't want to make the language more whitespace sensitive than absolutely needed, because that tends to come with huge cans of worms.
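To illustrate, here is a small Rust sketch that reads the same argument tokens under both interpretations. The function names read_infix and read_prefix are hypothetical, and the sketch only handles integers and -. It is not how Crush parses anything.

```rust
/// Treat `-` as infix subtraction between the previous and next argument.
fn read_infix(tokens: &[&str]) -> Vec<i64> {
    let mut args: Vec<i64> = Vec::new();
    let mut iter = tokens.iter();
    while let Some(tok) = iter.next() {
        if *tok == "-" {
            let lhs: i64 = args.pop().expect("missing left operand");
            let rhs: i64 = iter
                .next()
                .expect("missing right operand")
                .parse()
                .expect("not a number");
            args.push(lhs - rhs);
        } else {
            args.push(tok.parse().expect("not a number"));
        }
    }
    args
}

/// Treat `-` as negation of the argument that follows it.
fn read_prefix(tokens: &[&str]) -> Vec<i64> {
    let mut args: Vec<i64> = Vec::new();
    let mut iter = tokens.iter();
    while let Some(tok) = iter.next() {
        if *tok == "-" {
            let operand: i64 = iter
                .next()
                .expect("missing operand")
                .parse()
                .expect("not a number");
            args.push(-operand);
        } else {
            args.push(tok.parse().expect("not a number"));
        }
    }
    args
}

fn main() {
    // The arguments of `foo 1 2 - 3`, split on whitespace.
    let tokens = ["1", "2", "-", "3"];
    println!("{:?}", read_infix(&tokens)); // [1, -1]: two arguments
    println!("{:?}", read_prefix(&tokens)); // [1, 2, -3]: three arguments
}
```

Both readings are consistent with the token stream, so without separators the parser has to pick one meaning for - or start looking at whitespace.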

That definitely seems quite annoying. I would like to avoid having to escape common filename characters like (, and ) because it seems so … wrong to have to do that. The only way I see that would allow avoiding this is some kind of delimiter before and after a filename, which takes up at least two extra characters.

Well, the closing quote could be auto-inserted (like the pair matching in most editors), and maybe the opening one could be part of the tab-completion suggestion?

Crush mostly solves this by also importing the content of the current directory into your namespace, so if you have a file named foo.txt in your current directory, that means that there is a variable named foo.txt in your namespace. Files in Crush support a / operator that works like the . operator in e.g. Rust, but does file lookup instead of regular member lookup.

That does seem like a cool idea, but I fear that it might be quite fragile – imagine not being able to define a variable if there happens to be a file with that name in the current directory.

You can. Filenames live in an outermost scope, so any variable will shadow them. So potentially, you have the opposite problem, that you meant to reference a file and instead you had a variable lying around with the same name. To protect yourself against that, you can simply prefix your filename with ./, and there is no ambiguity.
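Roughly, the lookup order can be pictured like this. This is a simplified Rust sketch and not Crush's actual implementation: the names Binding and resolve are made up, variables are plain strings, and the current directory is modelled as a list of file names.

```rust
use std::collections::HashMap;

/// What a name resolved to (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Binding {
    Variable(String),
    File(String),
}

/// Look a name up: variables first, then files in the current directory.
/// A leading `./` skips the variables entirely.
fn resolve(
    name: &str,
    variables: &HashMap<String, String>,
    files_in_cwd: &[&str],
) -> Option<Binding> {
    if let Some(path) = name.strip_prefix("./") {
        // Explicit file reference: no variable can shadow it.
        return if files_in_cwd.contains(&path) {
            Some(Binding::File(path.to_string()))
        } else {
            None
        };
    }
    if let Some(value) = variables.get(name) {
        // Variables live in inner scopes, so they shadow file names.
        return Some(Binding::Variable(value.clone()));
    }
    if files_in_cwd.contains(&name) {
        // Outermost scope: the contents of the current directory.
        return Some(Binding::File(name.to_string()));
    }
    None
}

fn main() {
    let mut variables = HashMap::new();
    variables.insert("foo.txt".to_string(), "42".to_string());
    let files = ["foo.txt", "bar.txt"];

    println!("{:?}", resolve("foo.txt", &variables, &files));
    println!("{:?}", resolve("./foo.txt", &variables, &files));
    println!("{:?}", resolve("bar.txt", &variables, &files));
}
```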

That said, I definitely consider this file naming solution to be an experimental feature. I'm trying it out to see how well it works in practice.
