🚧 This document is a work in progress. The spec, examples, and wording are not complete or finalized. Let me know if you find mistakes or anything confusing Discussion is welcome. Feel free to skip the background section if you really want to cut to the chase.
- Background
- Proposal
Here's where i'm coming from.
Language | Used For | Last Used | Used For |
---|---|---|---|
JS/TS | ~3 yr | ongoing | personal & work |
Java | ~2.5 yr | 2021 | work |
C++ | ~1 yr | 2021 | personal |
C# | ~0.67 yr | 2021 | work |
Python | ~0.3 yr | 2019? | personal |
I have read a little about Rust and Kotlin, but have not used them yet.
I have not studied programming language design, and I don't know anything about other OOP languages that I haven't listed above. So perhaps there are better existing solutions to the problems I've listed that I haven't heard of. Some languages have partial solutions to the problems I want to solve, which I will mention later.
Here are the things that I find satisfaction in when reading and writing code:
- Controlling scoping and mutability
- Brevity
- Visual Aesthetic
I am not trying to come up with an "optimal" solution with the freedom to radically depart from the current status quo. I am trying to come up with a terser way to express scoping and mutability constraints in the languages that I have worked with- specifically Java, TypeScript, and C#. I want to minimize non-syntactic changes.
I am not thinking of scripting languages such as Python or JavaScript. I don't feel comfortable enough with C++ to believe that the idea I will propose will be compatible with it.
Here are things I personally feel bothered by concerning scoping, reassignment, and mutability control in the OOP languages I have used.
Some people have linked me to cool Raku things like its is built
and is rw
traits, which I still need time to digest and internalize.
-
Chained keywords such as "
public final
" take up a fair bit of space. For classes with many fields, there end up being big blocks of keywords.- The C++ label style takes up much less space, but it creates another problem: I can't tell what a field's scoping is just by looking at the line it's on. I need to scan upward to find out.
-
The meaning of the word "protected" is not intuitive. Private and public are not as bad, but in a sense, I think they are all "symbolic" in that the word's english meaning does not directly indicate it's meaning in the programming language. Thinking of the conventional keywords as symbols opens up discussion for choosing better symbols.
-
The modifiers don't align column-wise at their right edge. This is really minor, but I feel really unsatisfied whenever I see it happen.
- In Java, and C++, I sometimes need to make trivial getters and setters just to enforce different read/write permissions at different scopes.
- I guess in Java-land and maybe C++, you'd be writing all those accessors anyway, but I don't like the heavy method syntax.
- Kotlin and C#'s properties and TypeScript accessors have cleaner syntax. They allow for specifying different access levels for the getter and setter.
- I like the above property / accessor solutions. They are simple and powerful, but I still think they could be terser.
- C++ (and structs in C#) have the
const
keyword to get an immutable view or copy of an object, but I'm not aware of a similar mechanism in Java-like languages. And sometimes I want to make a field's value publicly readable, but only allow mutations privately- not for any possible performance gains, but for the sake of interface design. Again, the current pervasive OOP languages require some acrobatics to achieve this.- In C++, you could use a const reference, but in languages that don't allow marking methods as non-mutating, you would need to define an interface for an immutable view of an object and make up-casting getters. This includes standard containers and custom containers and classes.
I'm making an exception here to my rule of minimizing non-syntactic changes because controlling mutability is one of my key goals.
Punctuation | General Meaning | How to Remember |
---|---|---|
_ |
Hidden | Some naming conventions use underscores to indicate privacy or internalness |
. |
Readonly | Periods are widely used for object member access |
= |
Read-Write | The equals sign is widely used to assign values to variables and members |
I also like that there's somewhat of a vertical gradient where the shortest punctuation mark has the most strict meaning, and the tallest has the most permissive meaning.
There are alternative choices of characters I thought of including using "r" and "w" for readonly and read-write respectively. I think using letters is equally valid and may help with readability. I like the visual appearance of the punctuation, and that it has zero ties to non-programming languages, which may be more friendly for non-english speakers. On the other hand, on QWERTY keyboards, the period is far away from the underscore and equal sign, which I don't like.
One thing I don't like is that "_
" and ".
" are kind of hard to tell apart. A solution to this could be to use the "$
" sign for read-only, which harkens to bash-scripting.
Inheritance of Hidden-ness for Composition
Any contained thing is hidden at the same scopes for which its container is hidden. Specifically, objects, variables, functions, and namespaces. This should either be enforced by tooling such as the compiler, interpreter, or IDE, or by requiring inherited hidden marks to be omitted.
// Good
_.. namespace NamespaceA {
}
The punctuation marks are arranged in tuples, where each entry corresponds to a scope. The most outer scope is the first entry, and the most inner scope is the last entry.
Here are the kinds of modifiers I have chosen to exclude from this proposal due to their meanings not being associable/variable with multiple scopes. These are better left for each language to decide what keyword to use.
-
function purity/const-ness (restrictions on the ability for a function to change the state of anything declared at the top level).
-
abstractness
-
overriding / implementing
-
Whether a method is allowed to mutate the associated instance.
When applied to top-level variables, top-level functions, namespaces, and classes, they are a 3-tuple of {package-exported, package-internal, file/"module"-internal}.
For functions and namespaces, "=
" is not allowed.
For classes, "=
" is used to control whether the class is extensible at each level.
- TypeScript project references will be considered as packages in this spec.
When applied to local variables, they are a 1-tuple of {local/closure}. By "closure scope", I mean when a variable is captured by a nested function (like JavaScript functions or C++ lambdas).
- (Or maybe it would be useful to separate local and closure modifiers? I'm not sure right now. I haven't thought much about interactions with closures yet. Consider my thoughts here half-half-baked).
When applied to declarations of object fields, they are a 3-tuple or 2-tuple of {public, protected, private} (I haven't decided on whether to include one for internal, which would be a 4-tuple). read-write specifies whether the variable can be reassigned (mutability is a separate matter).
_.
private final_=
private..
public final.=
private with public getter==
public
__.
private final__=
private_..
protected final_.=
private with protected getter_==
protected...
public final..=
private with public getter.==
protected with public getter===
public
When applied to object methods, the tuple has similar meanings to object fields, except "=
" is not allowed.
When applied to type specifiers for instance objects (not functions or primitives), it is a N-tuple where N is the same as the length of the tuple of whatever holds the object (variable, parameter, field, another object).
- These can further restrict the scoping and mutability of an object's publicly accessible fields and methods.
- The "
_
" character will probably not have many use cases here. I would recommend disallowing it. - If "
.
" is used, that means that at that scope, no fields of the object can be reassigned or mutated, and no mutating methods can be called at the given scope.
// TODO
🚧 This section is likely to change or be removed.
Here are some random examples showing what it looks like:
- Some basic usages:
... f0 [..=] num
_.= f0 [...] str
... f0 [.==] ..= obj
- How nesting works:
... f0 [.==] [...] ..= obj
I'm just going to stop writing the Java parallels. You get the picture. They are very long...= f0 [..=] [===] [.==] str
... f0 [.==] [.==] [...] num
- When the field is hidden from public and protected, up to that part of the array access tuple is omitted:
_.. f0 [.=] num
__. f0 [=] .= obj
... f0 [.==] [_..] [.=] [_.] [=] = obj
It will probably be useful to code authors to default object mutability tuples to the tuple of whatever holds the object (variable, field, another object such as a collection or promise).
I won't impose defaults for other parts of this spec. I'd rather leave it up the language. Languages that emphasize convenience make things public and writeable by default, and languages that emphasize safety make things private and readonly by default.
In the case that a language chooses to provide no defaults, a pragma to "#pub_mut_all_the_things
" would probably be useful for scripting purposes.
-
The terseness could possibly hinder readability. There's a lot of meaning packed into just a few characters. It may be hard to unpack all that meaning when skimming code, or when tired.
- My response: Yep. Maybe it's just something that takes time to get used to, and then afterward it would be easy to read? Perhaps IDE's could help with some kind of color coding. It may also help if the tuples are optionally allowed to have spaces between the punctuation marks.
-
With all the options that it exposes, it would be easy to get overwhelmed.
- My response: Yep. It might help to just start off by making things super strict, and then opening things up as needed (as long as you think critically as you open things up).
-
It enables use cases / making guarantees that may not be really useful/meaningful/helpful. Ex. Enforcing that a variable holding an array cannot be reassigned, but allowing the array contents to be mutated.
- My response: That's for sure. Perhaps that's something that's best covered by linting rules or best-practice guides? I want to come up with a list of poor-practice cases before I move forward with any decision making. Help here would be appreciated.
- To be fair, the mechanism I'm proposing doesn't add any new poor-practice cases that couldn't be achieved in other common OOP languages. It just shortens the grammar for doing it... which could be a bad thing for a code author who doesn't know what they want... Hm...
- My response: That's for sure. Perhaps that's something that's best covered by linting rules or best-practice guides? I want to come up with a list of poor-practice cases before I move forward with any decision making. Help here would be appreciated.
Feel to share your thoughts in the discussion section of this gist, or in this reddit thread.
I haven't thought about how this syntax could possibly used to improve method scoping and overriding permissions, but I intend to try. It certainly should not be usable on the array-entry-access operator or the iterator operator, since that should be left for the instance declaration variable to decide.
How could this be integrated into existing codebases? Would that even be possible? I'm guessing it would be similar to a Kotlin situation: Create a language that compiles to Java bytecode.
I need to settle on what to call the entries of the tuples. Ideas: "knob", "scome", "modifier", "permission". I think permission is good, but there are other "permissions" that this spec doesn't cover such as function purity.