Many Cocoa programmers, myself included, want a more modern programming language than Objective-C, and a more modern programming environment than Cocoa. There's been some talk about what would better fit the bill, and engineers are ready to build that solution, but the question first needs to be answered – what are the most important problems to solve?
I've created the following unordered list of hypothetical goals for improving Cocoa/Objective-C, and would like to see how important each one is to the community. If you're interested in sharing your opinion, please fork this gist and order the list below. Feel free to add anything that you think might be missing.
Thanks!
- Functional programming, where function application is the main way to invoke code. This does not refer to using objects in a functional style.
- Referential transparency, closely related to (and sometimes loosely called) purity or immutability.
- Easy parallelization, meaning it should be easy to make code run concurrently without ill effect.
- Algebraic data types, in addition to or instead of Objective-C objects and inheritance.
- Statically typed, to catch type errors at compile-time. This does not necessarily imply a C-like type system – many modern languages support type inference, among other things.
- Powerful metaprogramming, to automate code generation or extend the language with itself.
- Easy to learn (although subjective, it's still interesting to rank it against other goals).
- Self-hosting, where the toolchain is written in the language/environment itself.
- Compiled, instead of interpreted. This could mean compilation to an intermediate language (like Objective-C, C, or LLVM IR), not necessarily machine code.
- Performant – in other words, how important is performance relative to expressiveness and flexibility?
- Not Cocoa-specific, so that it is easy to port or slightly modify code to run without Cocoa frameworks. This mostly just implies the use of a language that already exists and is used for other purposes.
- An existing body of open source code, available for use within a Cocoa project.
- Can be invoked from Cocoa code – not having this implies always invoking Cocoa instead.
- Can invoke Cocoa code – not having this implies always being invoked from Cocoa instead. Note that invoking Cocoa and being invoked from Cocoa are not mutually exclusive (but at least one is necessary).
- Can define Objective-C classes, protocols, categories, etc. in the language itself. This doesn't imply anything about being able to link to custom Objective-C code.
- Usable for AppKit/UIKit GUIs, in addition to being usable for a model layer (which is a requirement).
Note that this list does not, and is not meant to, capture the advantages and disadvantages of each line item.
A requirement should be compiling down to C (preferably; I'll explain why in a second) or LLVM. Compiling down to C would allow us to embed C right in the language itself, covering corner cases in much the same way that the asm() construct does in C. The embedded code would be inlined right into the compilation result. This would liberate the language syntax from having to mirror C's constructs just to meet the goal of interfacing with C. Plus, since we can already write objc apps entirely in C (if we're masochistic and missing the point), I think C is a fair target for computer-generated code. The flip side of this suggestion is targeting LLVM, where interfacing with C and writing LLVM IR can be a bit much.
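To make the embedded-C idea concrete, here is a rough sketch of what a C-emitting compiler might produce. Everything in it is an assumption made for illustration: the source syntax in the comment (including the `c { ... }` block), the `lang_` name prefix, and the function names are all invented, not part of any existing tool.

```c
#include <stdint.h>

/*
 * Hypothetical input in the imagined language (syntax invented for this sketch):
 *
 *     func rotateLeft(value: UInt32, bits: UInt32) -> UInt32 {
 *         c { return (value << (bits & 31u)) | (value >> ((32u - bits) & 31u)); }
 *     }
 *
 *     func mix(a: UInt32, b: UInt32) -> UInt32 {
 *         rotateLeft(a, 8) ^ b
 *     }
 */

/* Emitted for rotateLeft: the body of the c { ... } block is pasted verbatim. */
uint32_t lang_rotateLeft(uint32_t value, uint32_t bits) {
    /* begin verbatim embedded C */
    return (value << (bits & 31u)) | (value >> ((32u - bits) & 31u));
    /* end verbatim embedded C */
}

/* Emitted for mix: ordinary language code, translated to C by the compiler. */
uint32_t lang_mix(uint32_t a, uint32_t b) {
    return lang_rotateLeft(a, 8u) ^ b;
}
```

Because the generated function is plain C, the C compiler is free to inline `lang_rotateLeft` into `lang_mix`, and the language itself never needs shift or mask operators of its own for this case.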
Remember, not everything needs to be compiled ahead of time. You can do as much as you can ahead of time, then, with the help of a lightweight profiler, detect particular hotspots in your program (which may extend across method call boundaries) and compile those just in time, leaving the rest of the code (the code that isn't hot) on the slow path; this saves time in the JIT. Additionally, the JIT can support monomorphic and polymorphic inline caches on the language side of things, reducing the cost of hitting libobjc (which would be one layer out). If a call site is seeing more than one type, a fixed number of cache lines at the call site can be used to cache object -> selector -> IMP tuples for faster access. Obviously, special care must be taken here in the context of whatever memory management subsystem is used.
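A minimal sketch of such a call-site cache, in plain C so it stands alone. All names here (`call_site`, `pic_entry`, `call_site_lookup`, the `toy_` runtime) are hypothetical, the types are stand-ins for the runtime's Class, SEL and IMP, and the slow path is a toy function rather than a real lookup through libobjc. The sketch also assumes the cache is keyed on the receiver's class rather than on the object itself, and that round-robin replacement is acceptable once the fixed number of entries is full.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the runtime's Class, SEL and IMP types. */
typedef const void *class_ref;
typedef const char *selector_ref;
typedef int (*imp_fn)(void);

#define PIC_ENTRIES 4 /* fixed number of cache lines per call site */

typedef struct {
    class_ref cls;
    imp_fn imp;
} pic_entry;

typedef struct {
    selector_ref sel;      /* a call site always sends the same selector */
    size_t count;          /* entries in use; 1 means the site is still monomorphic */
    size_t next_victim;    /* round-robin replacement once the cache is full */
    unsigned long misses;  /* how often the slow path was taken */
    pic_entry entries[PIC_ENTRIES];
} call_site;

/* Slow path, standing in for a full lookup through the runtime. */
typedef imp_fn (*slow_lookup_fn)(class_ref cls, selector_ref sel);

imp_fn call_site_lookup(call_site *site, class_ref cls, slow_lookup_fn slow) {
    assert(site != NULL && slow != NULL);
    for (size_t i = 0; i < site->count; i++) {
        if (site->entries[i].cls == cls) {
            return site->entries[i].imp; /* fast path: cache hit */
        }
    }
    site->misses++;
    imp_fn imp = slow(cls, site->sel);
    if (imp == NULL) {
        return NULL; /* unknown method: nothing to cache */
    }
    size_t slot;
    if (site->count < PIC_ENTRIES) {
        slot = site->count++;
    } else {
        slot = site->next_victim;
        site->next_victim = (site->next_victim + 1) % PIC_ENTRIES;
    }
    site->entries[slot].cls = cls;
    site->entries[slot].imp = imp;
    return imp;
}

/* Toy runtime used only to exercise the cache. */
const int toy_class_a = 0, toy_class_b = 0; /* their addresses serve as class identity */
int toy_describe_a(void) { return 1; }
int toy_describe_b(void) { return 2; }

imp_fn toy_slow_lookup(class_ref cls, selector_ref sel) {
    if (strcmp(sel, "describe") != 0) return NULL;
    if (cls == &toy_class_a) return toy_describe_a;
    if (cls == &toy_class_b) return toy_describe_b;
    return NULL;
}
```

A call site would be initialised as `call_site site = { .sel = "describe" };` and consulted with `call_site_lookup(&site, &toy_class_a, toy_slow_lookup)` before each send. The sketch deliberately leaves out invalidation: in a real system, cached entries would presumably have to be flushed whenever a method implementation is replaced or a class goes away, which is where the memory management caveat above comes in.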
On the memory management note, the language grammar should free the developer as much as possible from having to annotate things with `__autoreleasing`, `__bridge`, or `__block`, by being smarter when analyzing the graph. We shouldn't have to declare these things in most cases if the compiler were smart enough. I believe this is a worthy goal as well.
Finally, encourage less use of global/instance state. There are many ways one can go about this, for instance full parametric polymorphism outside of objects (in helper functions). However, this creates a duality in the system that increases the cognitive burden on users of the language, so it is probably not a good idea. Alternatively, the language could model only behaviour, and not state: given access control specifiers (public, private, ...), it would automatically generate ivars or whatnot for the user, but not allow direct access to them. There must be a mechanism to subvert these protections if the user needs, for instance, a custom setter; but this can be achieved trivially with my first suggestion (embedded C). Only naming rules for the generated code have to be specified.
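As a rough illustration of that last point, here is what the generated C might look like for a class with one property. The `Counter` type, the accessor names, and the `<Class>__ivar_<property>` naming rule are all invented for this sketch; the only claim is that once such a rule is published, a hand-written setter in an embedded C block can reach the generated storage.

```c
#include <assert.h>
#include <stdlib.h>

/* Generated storage. The user never names this field in the language itself;
 * its name follows a hypothetical rule: <Class>__ivar_<property>. */
typedef struct Counter {
    int Counter__ivar_count;
} Counter;

/* Generated lifecycle helpers. */
Counter *Counter_new(void) { return calloc(1, sizeof(Counter)); }
void Counter_free(Counter *self) { free(self); }

/* Generated getter. */
int Counter_count(const Counter *self) {
    assert(self != NULL);
    return self->Counter__ivar_count;
}

/* Custom setter, as the user might write it inside an embedded C block.
 * It subverts the access protection only by knowing the naming rule. */
void Counter_setCount(Counter *self, int value) {
    assert(self != NULL);
    if (value < 0) {
        value = 0; /* the user's own rule: clamp negative values to zero */
    }
    self->Counter__ivar_count = value;
}
```

Code written in the language proper would only ever see `Counter_count` and `Counter_setCount`, so the state stays hidden unless someone opts in to the escape hatch.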