btbytes/rich-already-answered-that.md

## rich-already-answered-that.md

      
    Raw
  

              rich-already-answered-that.md
            
          
    Rich Already Answered That!

A list of commonly asked questions, design decisions, reasons why Clojure is the way it is as they were answered directly by Rich (even when from many years ago, those answers are pretty much valid today!).
Feel free to point friends and colleagues here next time they ask (again). Answers are pasted verbatim (I've made small adjustments for readibility, but never changed a sentence) from mailing lists, articles, chats.
How to use:

The link below in the summary jumps at the answer on this page.
The link on the question itself points back at the original post.

Summary

Should I use a namespace to bundle functions to model a domain object like a "customer" or "product"?
Why no pattern matching?
Why no tail call optimization?
Why you don't accept pull requests?
Are there cons cells in Clojure?
Why Clojure does not implement something like CLOS?
deftype, gen-class, proxy, reify, etc. Why there are so many ways to define a type?
Couldn't locals be mutable?
Why there is a "rest" and a "next"?
What is the difference between collections and sequences?
Why lazy sequences are cached?
Aren't chunked sequences completely subverting the point of laziness?
Why cannot "last" be fast on vector?
Why Clojure is not licensed under GPL, MIT, etc?
I want to study SICP and use Clojure instead of Scheme. Is that a good idea?
Why did you choose to make your language a Lisp dynamically typed as opposed to ML?
What lessons did you learn porting some data structures from Okasaki ML into Java?
Shouldn't (nth nil 1) throw outofbound exception?
Why Clojure doesn't have a generic insert, lookup, append that works the same on all collections?
Why Clojure compiler is single pass? Aren't many possible optimizations lost this way?
What is the convention for arguments order?
How do you pronounce assoc?
Why doesn't contains? do what I expect on vectors and lists?
Why commas are allowed as whitespaces?
How Clojure transients work?
How can I achieve programming mastery?

Should I use a namespace to bundle functions to model a domain object like a "customer" or "product"?

There's a presumption built-in to that question that I think needs to be addressed first. That is, that there should be specialized, named, data structures/classes corresponding to imagined information-based entities in a system, and a partitioning of functions by their association with same entities, either in classes or modules. I've come to the conclusion after 20 years of C++/Java/C#/CLOS/OODBMS that this is a bad idea.
OO is not bad when there is an actual entity - i.e. a stream, a socket, a window, etc to which an object corresponds, or in a simulation of actual entities. That's where it was born and where it shines. When applied to the information/data in a system, however, it causes code to be 10x as large and 1/10th as useful.
For instance, don't customers buy products? Which should own the functions that involve both? Do databases have different functions for different tables? One would hope not. And if a DB query returns a recordset with fields from more than one table, should a custom entity be defined that maps to it? No - ORM is an even worse idea.
Data is data, and having very few general data structures to hold it and very many general functions to manipulate it is the key. That is the Clojure philosophy. (I'm not claiming that this is unique to Clojure, btw)
Look at the way clojure.xml processes XML - it deals with an external entity, a SAX parser fronting some XML source, gets the data out of it and into a map, and it's done. There aren't and won't be many more functions specific to XML. The world of functions for manipulating maps will now apply. And if it becomes necessary to have more XQuery-like capabilities in order to handle XML applications, they will be built for maps generally, and be available and reusable on all maps. Contrast this with your typical XML DOM, a huge collection of functions useful nowhere else.
Similarly, resultset-seq pulls the data out of a db query and into a sequence of maps right away, where it can be manipulated as generally as possible. That isn't to say you can't or shouldn't tag your data with some information/metadata helping you discern its source, use, or categorization in one or more contexts. But you should avoid the ghettoization of your logic, lest you have to repeat it endlessly.
That's a long way to say you might have to think a bit differently about your problem in Clojure in order to achieve the desired result of writing less, more powerful, code.
Why no pattern matching?

If you are talking about the aspect of pattern matching that acts as a
conditional based upon structure, I'm not a big fan. I feel about them
the way I do about switch statements - they're brittle and
inextensible. I find the extensive use of them in some functional
languages an indication of the lack of polymorphism in those
languages. Some simple uses are fine - i.e. empty list vs not, but
Clojure's if handles that directly. More complex pattern matches end
up being switches on type, with the following negatives:

They are closed, enumerated in a single construct.
They are frequently replicated.
The binding aspect is used to rename structure components because the language throws the names away, presuming the types are enough. Again, redundantly, all over the app, with plenty of opportunity for error.

Polymorphic functions and maps are, IMO, preferable because:

They are open, no single co-located enumeration of possibilities.
There is no master switch, each 'type' (really situation, with Clojure's multimethods) owner takes care of themselves.
Maps allow for structures with named components where the names don't get lost.

I'd much rather use (defrecord Color [r g b a]) and end up with maps
with named parts than do color of real*real*real*real, and have to
rename the parts in patterns matches everywhere (was it rgba or argb?)
I'm not saying pattern matching is wrong or has no utility, just that
I would hope it would be less needed in idiomatic Clojure code.
Certainly I'd want the binding part of pattern matching independent of
the conditional part, and I would make any destructuring-bind for
Clojure work with all of the data structures, not just lists, i.e.
vectors and maps and metadata.
Why no tail call optimization?

When speaking about general TCO, we are not just talking about
recursive self-calls, but also tail calls to other functions. Full TCO
in the latter case is not possible on the JVM at present whilst
preserving Java calling conventions (i.e without interpreting or
inserting a trampoline etc).
While making self tail-calls into jumps would be easy (after all,
that's what recur does), doing so implicitly would create the wrong
expectations for those coming from, e.g. Scheme, which has full TCO.
So, instead we have an explicit recur construct.
Essentially it boils down to the difference between a mere
optimization and a semantic promise. Until I can make it a promise,
I'd rather not have partial TCO.
Some people even prefer 'recur' to the redundant restatement of the
function name. In addition, recur can enforce tail-call position.
Full TCO optimizes all calls in the tail position,
whether to the same function or another. No language that uses the JVM
stack and calling convention (e.g. Clojure and Scala) can do TCO since
it would require either stack manipulation capabilities not offered in
the bytecode spec or direct support in the JVM. The latter is the best
hope for functional languages on the JVM, but I'm not sure is a
priority for Sun as tail-calls are not idiomatic for JRuby/Jython/
Groovy.
IMO, either a language has full TCO or it simply doesn't support using
function calls for control flow, and shouldn't use the term. I didn't
implement self-tail-call optimizations because I think, for people who
rely on TCO, having some calls be optimized and some not is more a
recipe for error than a feature.
So, Clojure has recur, which clearly labels the only situations that
won't grow the stack, apologizes for no real TCO, doesn't pretend to
support function calls for control flow, and anxiously awaits TCO in
the JVM, at which point it will fully support it.
In addition to recur, there are lazy sequences and lazy-cons.
To the extent you can structure your processing using the lazy
sequence model, including making your own calls to lazy cons, you can
write traditional-looking functional and recursive solutions that not
only don't grow the stack, but also don't create fully-realized result
lists on the heap, a terrific efficiency boost. I am not contending
that this is as general as TCO, but it has other nice properties, like
the ability to make results that are not tail calls (e.g. list
builders) space efficient.
Why you don't accept pull requests?

I prefer patches. I understand some people don't. Can't we just agree to disagree? Why must this be repeatedly gone over?
Here's how I see it. I've spent at least 100,000x as much time on Clojure as you will in the difference between producing a patch vs a pull request. The command is:
git format-patch master --stdout > your-patch-file.diff

There are two sides to change management - production/submission, and management/assessment/application/other-stewardship. People who argue that the process should be optimized for ease of submission are simply wrong. What if I said patches were twice as efficient for me to assess and manage as pull requests? (it's more than that) Do the math and figure out how the effort should best be distributed.
I don't think asking for patches is asking too much, and I truly appreciate the people who are going to the extra effort. And, I respect the fact that people disagree, and they might decide not to participate as a result. However, I don't need to justify my decision over and over. How about a little consideration for me, and the other list participants? There is a real diluting effect of get-it-off-my-chest messages such as these on the value of the list and the utility of reading it for myself and others.
Sometimes it's better to save it for your bartender :)
Are there cons cells in Clojure?

There certainly is a cons cell, it's called Cons, and cons produces one. It differs from CL cons cell in not being an arbitrary pair (i.e. the rest must be a sequence or nil).
Because Clojure supports a seq abstraction, there is heterogeneity in the implementations. In fact there are two cons-cell-like data structures:
user=> (class (cons 4 '(1 2 3)))
clojure.lang.Cons
user=> (counted? (cons 4 '(1 2 3)))
false
user=> (class (conj '(1 2 3) 4))
clojure.lang.PersistentList
user=> (counted? (conj '(1 2 3) 4))
true
So, list? is more specific because seqs are more general. seq? is probably what is desired. cons onto an IPersistentList could be made to return another IPersistentList, but currently doesn't.
Why Clojure does not implement something like CLOS?

The reason Clojure doesn't have an elaborate system like CLOS, or even like Python et al is that all of those systems have limits and quirks and hardwiring one into a language is a bad idea, IMO.  Clojure's hierarchies and multimethods are just libraries - there's no reason you couldn't define a CLOS-like object system for Clojure (although any such system will have problems integrating Java's type hierarchy because there is no definitive superclass ordering). I'm just trying to tease out the useful ingredients of OO and make them available a la carte.
Two things contribute to the sometimes-convenient default resolution policy of CLOS:
One is left-to-right argument order precedence, which roughly translates to vector precedence in Clojure (although the mapping to a vector need not follow arg order). The other is default precedence of classes based upon left-to-right declaration in defclass, and topological sort based thereupon.
Right now, both matching and precedence follow isa?, but the latter need not, or could be supplemented. Here are the characteristics I think are important:


It should be possible to superimpose a parent on an existing child without modifying the child. This is a critical feature of Clojure's a la carte hierarchies and one which I have desired in every OO language (including CLOS) I have ever worked in. Requiring a client of (independently developed) A and B to have to modify A in order to have it work with B is unworkable.


It should work transparently with Java hierarchies, which also implies the above.


It should support independent and order-independent assertions about the hierarchy. Anything based upon canonic master definitions is brittle and inextensible.


What I like about preference declarations is that you are acknowledging a definite ambiguity in a pure derivation hierarchy
rather than poisoning the notion of hierarchy with directionality.
That said, perhaps the preference specifications at the method level aren't as reusable/convenient as preferences declared at another level or somewhere else.
I do care about making this easier to use, but arguments from familiarity are insufficient - we have to acknowledge the tradeoffs involved. I am not willing to repeat this behavior of CLOS for the sake of convenience:
(defclass a () ())
(defclass b () ())
(defclass c (a b) ())
(defclass d (b a) ())
(defclass e (c d) ())

(make-instance 'e)

Error: Error during finalization of class #<STANDARD-CLASS E
216B409B>: Cannot compute class precedence list for class: #<STANDARD-
CLASS E 216B409B>
Yuck.
deftype, gen-class, proxy, reify, defprotocol: why there are so many ways to define a type in Clojure?

A big part of the design thinking behind these features went like this.
Clojure is built on a set of abstractions, and leverages/requires that the host platform provide some sort of high-performance polymorphism construct in order to make that viable. That said, Clojure was bootstrapped on the host language and didn't really provide similar constructs itself (multimethods are more powerful but slower), leaving people that wanted to do things similar to what I did, in order to write Clojure and its data structures, to either write Java or use Clojure interop to, effectively, write Java in Clojure clothing.
So I took a step back and said, what part of Java did I need in order to implement Clojure and its data structures, what could I do without, and what semantics was I willing to support - for Clojure - i.e. not in terms of interop. What I ended up with was - a high- performance way to define and implement interfaces. What I explicitly left out was - concrete derivation and implementation inheritance.
reify is Clojure semantics and proxy is Java/host semantics. Why doesn't it replace proxy? Because proxy can derive from concrete classes with constructors that take arguments. Supporting that actually brings in a ton of semantics from Java, things I don't want in Clojure's semantics. reify should be possible and portable in any port of Clojure, proxy may not.  Will the performance improvements of reify make it into proxy? Probably at some point, not a priority now.
Prefer reify to proxy unless some interop API forces you to use proxy. You shouldn't be creating things in Clojure that would require you to use proxy.
defstruct is likely to be completely replaced by deftype, and at some point could be deprecated/removed.
Prefer deftype to defstruct, unconditionally.
AOT deftype vs gen-class touches on the same Clojure semantics vs Java/ host semantics, with the objectives from before - support implementing interfaces but not concrete derivation. So, no concrete base classes, no super calls, self-ctor calls, statics, methods not implementing interface methods etc. Will the performance improvements of deftype make it into gen-class? Probably at some point, not a priority now. Like proxy, gen-class will remain as an interop feature.
Prefer deftype to gen-class unless some interop API forces you to use gen-class.
There will be a definterface similar to and probably replacing gen-interface, with an API to match deftype.
So, with definterface, deftype, and reify you have a very clean way to specify and implement a subset of the Java/C# polymorphism model, that subset which I find clean and reasonable, with an expectation of portability, and performance exactly equivalent to the same features on the host. I could have stopped there, and almost did. But there are three aspects of that polymorphism model that aren't sufficient for Clojure:

It is insufficiently dynamic. There is a static component - named interfaces, that must be AOT compiled.
Client code must use the interop style (.method x), and type hints, in order to tap into the performance.
It is 'closed' polymorphism, i.e. the set of things a type can do is fixed at the definition time of the type. This results in the 'expression problem', in this case the inability to extend types with new capabilities/functions.

We've all experienced the expression problem - sometimes you simply can't request/require that some type implement YourInterface in order to play nicely with your design. You can see this in Clojure's implementation as well - RT.count/seq/get etc all try to use Clojure's abstraction interface first, but then have hand-written clauses for types (e.g. String) that couldn't be retrofitted with the interface.
multimethod, OTOH, don't suffer from this problem. But it is difficult to get something as generic as Clojure's multimethods to compete with interface dispatch in Java. Also, multimethods are kind of atomic, often you need a set of them to completely specify an abstraction. Finally, multimethods are a good story for the Clojure side of an abstraction, but should you define a valuable abstraction and useful code in Clojure and want to enable extension or interoperation from Java or other JVM langs, what's the recipe?
defprotocol takes a subset of multimethod power, open extension, combine it with a fixed, but extremely common, dispatch mechanism (single dispatch on 'type' of first arg), allow a set of functions constituting an abstraction to be named, specified, and implemented as group, and provide a clear way to extend the protocol using ordinary capabilities of the host (:on interface).
Prefer using protocols to specify your abstractions, vs interfaces.
This will give you open extension and a dynamic system. You can always make your protocol reach any type, and, you can always make your protocol extensible through an interface using :on interface. In particular note, calls to a protocol fn to an instance of the :on interface go straight through, and are as fast as calls using (.method #^AnInterface x), so there is no up-front performance compromise in choosing protocols.
The possibility of extending from abstract classes in Java was an important consideration in the deftype/protocol design. One reasonable argument for concrete implementation is abstract superclasses, especially when used correctly. And Clojure's implementation does use them. Some of the problems with abstract classes are:

They create a hierarchical type relationship (for no good reason).
Unless you are going to open that huge can of worms that is multiple concrete inheritance, you only get a single inheritable implementation.
They, too, are closed. If you are going to allow open extension, but implementation reuse requires derivation, there is an open/closed mismatch.

Protocols are designed to support hierarchy-free, open, multiple, mechanical mixins. This is enabled by the fact that extend is an ordinary function, and the mappings of names to implementation functions are ordinary maps. One can create mixins by simply making maps of names to functions. And one can use mixins in an ad hoc manner, merging and replacing functions using ordinary map manipulation:
(extend ::MyType AProtocol
  (assoc a-mixin-map :a-fn-to-replace a-replacement-fn))
I think people will find this quite powerful and programmable.
Couldn't locals be mutable?

One of the primary objectives of Clojure is that it makes it straightforward to build programs that behave well in the context of multiple threads. That means that any mutation facilities must be thread aware, as are Clojure's Vars, Refs and Agents. If you use these constructs (and Clojure's persistent data structures) and avoid Java side-effects, you can write robust multi-threaded programs easily.
If locals were variables, i.e. mutable, then closures could close over mutable state, and, given that closures can escape (without some extra prohibition on same), the result would be thread-unsafe. And people would certainly do so, e.g. closure-based pseudo-objects. The result would be a huge hole in Clojure's approach.
Without mutable locals, people are forced to use recur, a functional looping construct. While this may seem strange at first, it is just as succinct as loops with mutation, and the resulting patterns can be reused elsewhere in Clojure, i.e. recur, reduce, alter, commute etc are all (logically) very similar. Even though I could detect and prevent mutating closures from escaping, I decided to keep it this way for consistency. Even in the smallest context, non-mutating loops are easier to understand and debug than mutating ones. In any case, Vars are available for use when appropriate.
I would hope that programmers without prior functional programming experience would grow accustomed to Clojure's functional style, and, even if they never do multithreaded programming, would benefit from the more robust and easier-to-understand programs it yields.
Why there is a "rest" and a "next"?

It can be very difficult to enumerate (or even remember :) all of the contending tradeoffs around something like Clojure's nil handling.
The is no doubt nil punning is a form of complecting. But you don't completely remove all issues merely by using empty collections and empty?, you need something like Maybe and then things get really gross (IMO, for a concise dynamic language).
I like nil punning, and find it to be a great source of generalization and reduction of edge cases overall, while admitting the introduction of edges in specific cases. I am with Tim in preferring CL's approach over Scheme's, and will admit to personal bias and a certain comfort level with its (albeit small) complexity.
However, it couldn't be retained everywhere. In particular, two things conspire against it. One is laziness. You can't actually return nil on rest without forcing ahead. Clojure old timers will remember when this was different and the problems it caused. I disagree with Mark that this is remains significantly complected, nil is not an empty collection, nil is nothing.
Second, unlike in CL where the only 'type' of empty collection is nil and cons is not polymorphic, in Clojure conj is polymorphic and there can only be one data type created for (conj nil ...), thus we have [], {}, and empty?. Were data structures to collapse to nil on emptying, they could not be refilled and retain type.
At this point, this discussion is academic as nothing could possibly change in this area.
The easiest way to think about is is that nil means nothing, and an empty collection is not nothing. The sequence functions are functions of collection to (possibly lazy) collection, and seq/next is forcing out of laziness. No one is stopping you from using rest and empty?, nor your friend from using next and conditionals. Peace!
What is the difference between collections and sequences?

If you look at the docs for Collections you'll see that they all must support 3 functions:

count
conj
seq

and that seq returns an ISeq instance (Java interface) on the collection. That is the critical point - a seq is a logical sequential view of a collection and not (necessarily) the collection itself.
There are about 20 concrete implementations of the ISeq interface in Clojure, supporting sequential access to all of its data structures. One way to think of the seq on a collection is as a cursor or iterator over the collection. The ISeq Java interface in turn has 2 methods, first and rest, provided by all implementations.
The Clojure library has 2 functions, first and rest, that are defined on collections and specified to first call seq on their argument. Calling seq on something that is already a seq just returns it.
So [1 2 3] is not a seq, nor is {:a 1 :b 2}, but first and rest and most of the sequence library work on them because they first obtain a seq on the collection using the seq function. Otherwise it would be quite tedious to have to call seq everywhere. Thus, while they are functions of collections (i.e. they take collections), they operate on
the sequential views of those collections, thus the name sequence functions.
Some collections are their own sequential view, i.e. they implement ISeq directly because they inherently have linked structure (e.g. lists):
(instance? clojure.lang.ISeq [1 2 3])
-> false
(instance? clojure.lang.ISeq '(1 2 3))
-> true

(.getClass [1 2 3])
-> clojure.lang.PersistentVector

(.getClass (seq [1 2 3]))
-> clojure.lang.PersistentVector$ChunkedSeq

(.getClass '(1 2 3))
-> clojure.lang.PersistentList

(.getClass (seq '(1 2 3)))
-> clojure.lang.PersistentList
Of course, it's quite straightforward for ISeqs to behave like collections (i.e. implement count, conj and seq), and so they do, etc - the Clojure library is designed to be maximally polymorphic. In the end, unless you are interested in extending the sequence library in Java, it's not important to know the type hierarchy. Generally, all the sequence functions work on all of the collections. cons is an exception because it creates linked structure, and I want to avoid the
accidental creation of linked structure on top of associative structure, whereas all of the other sequence functions are view-only or create independent structure. Not that there's anything wrong with (cons 1 (seq {:a 1 :b 2 :c 3})) if that's what the user really intends.
Why lazy sequences are cached?

I think it's very important not to conflate different notions of sequences. Clojure's model a very specific abstraction, the Lisp list, originally implemented as a singly-linked list of cons cells. It is a persistent abstraction, first/second/third/rest etc, it is not a stream, nor an iterator. Lifting that abstraction off of cons cells doesn't change its persistent nature, nor does lazily realizing it. After my experimentation with a non-caching version, I am convinced it is incompatible with the abstraction.
If a seq was originally generated from an imperative source, you need it to be cached in order to get repeatable read persistence, if it was calculated at some expense, you need to cache it in order to get the performance characteristics you would expect from a persistent list. An abstraction is more than just an interface.
Everything with side-effects breaks. Lots of performance problems, clear multiple traversals, but often casual multiple local references to (rest x) and other parts of the seq, causing multiple evaluations and corresponding runtime multipliers. In short, an entirely different set of things you need to be careful of, and far more often.
Aren't chunked sequences completely subverting the point of laziness?

Were avoiding some non-terminating/exploding computation the only point of laziness, that might be true. But it's not. Remember, Clojure is primarily a strict/eager language. The only bits of Clojure that are lazy (other than explicit delays) are the sequence functions. This has several implications.
First, there simply aren't many non-terminating computations floating around in Clojure code. Second, there is another benefit to lazy sequences, of primary importance to Clojure: avoiding full realization of interim results. This isn't a simple question of a binary eager/lazy switch, it is rather about the granularity of laziness and sequential processing. Should the default be one item? I am increasingly convinced that the answer for languages like Clojure is no. See for instance this work, which I think is relevant to Clojure: MonetDB/X100: Hyper-Pipelining Query Execution
I don't disagree about the importance of an alternative providing item-at-a-time processing, and would have been reluctant for this to go forward without it were it not for the fact that the majority of users of Clojure have been using these chunked sequences without problems or complaint for months. I'm simply not done yet with an official API, but have provided an unofficial one for those who've asked.
Why cannot "last" be fast on vector?

It is quite easy to come up to some part of Clojure, with some need, and some current perspective, and find it unsatisfying. It certainly is not perfect. But a good presumption is that there might be more context, more constraints, or more issues in play than one might recognize at first glance.
Mark was most correct in identifying that the algorithmic complexity dominates this decision. As far as the functions being defined in terms of abstractions: while true, the abstractions are still something that could be refactored. They are a means, not an end.
History: 'last' is a Lisp function. In Common Lisp it operates on lists only. It is a function that, there, advertises slowness by its mere presence. Code using it has a presumption it will only be used on short lists, and in fact it generally is; in macros and things that process code. People reading that code can presume the same.
I do think that, in general, it is bad to have a single polymorphic function that, depending on the target arguments, transitions between different algorithmic complexity categories (N.B. that is not equivalent to using polymorphism to get better performance by leveraging some implementation specifics, it is a specific subset). 'nth' for seqs is a counterexample, was requested by the community, and I still think is somewhat of a mistake. At least, it would be good to also have an 'elt' with specific non-linear complexity (for the reasons I give below).
Usually, there is a 'fast' function and someone wants it to work on something that would be in a slower category. This case is the opposite, 'last' is slow and people want it to be faster where possible. The end result is the same.
Why not just be as polymorphic as possible? In Common Lisp, there is a set of functions that lets you use lists as associative maps. The performance is not good as things get larger (i.e the sky falls). While this made sense at one point in time (when there were only lists), is it still a good idea now? Should we have functions that expect maps accept lists?
In my experience, having everything work on/with lists made it really easy to get a system together that worked on small data sets, only to have it crash and burn on larger ones. And then, making it perform well was extremely challenging. Some of that was because of the lack of interface-conforming implementations of e.g. hashmaps to swap in. but another category of problem was not being able to see what calls mattered for performance, when lists would still be ok etc. Thus Clojure 'forces' you to make some of these decisions earlier, and make them explicit. Even so, it is still highly polymorphic.
At the moment, I'm not sure it would be bad to have 'last' behave better for vectors et al while still promising in documentation to be not necessarily better than linear. But I do know this - it is seriously unimportant. We all have much better things to do, I hope, than argue over things like this.
There is a perfectly adequate and well documented way to get the last element of a vector quickly, and code for which that matters can (and should, even if last were made faster) use it. Code for which it doesn't matter can use 'last' on vectors because - ipso facto it doesn't matter! People reading either code will know what is and isn't important, and that seems to matter most.
Why Clojure is not licensed under GPL, MIT, etc?

MIT and BSD are not reciprocal licenses. I want a reciprocal license.
But I don't want the license to apply to, or dictate anything about,
non-derivative work that is combined with mine, as GPL does. I think
doing so is fundamentally wrong.
The fact that GPL is not compatible with that approach is a problem
with GPL, and for users of GPL software.
I will not be using any 'customized' license. Using a well known
license intact is the only way to make it easy for users to vet the
license for use. Anything else requires lawyers for me and them.
I will not be dual licensing with GPL or LGPL. Both licenses allow the
creation of derived works under GPL, a license I cannot use in my
work. Allowing derived works I cannot use is not reciprocal and make
no sense for me.
I want to study SICP and use Clojure instead of Scheme. Is that a good idea?

Everyone's experience will be different, of course. Here's my 2 cents:
I don't think SICP is a book about a programming language. It's a book
about programming. It uses Scheme because Scheme is in many ways an
atomic programming language. Lambda calculus + TCO for loops +
Continuations for control abstraction + syntactic abstraction (macros)
+ mutable state for when you need it. It is very small. It is
sufficient.
The book really deals with the issues in programming. Modularity,
abstraction, state, data structures, concurrency etc. It provides
descriptions and toy implementations of generic dispatch, objects,
concurrency, lazy lists, (mutable) data structures, 'tagging' etc,
designed to illuminate the issues.
Clojure is not an atomic programming language. I'm too tired/old/lazy
to program with atoms. Clojure provides production implementations of
generic dispatch, associative maps, metadata, concurrency
infrastructure, persistent data structures, lazy seqs, polymorphic
libraries etc etc. Much better implementations of some of the things
you would be building by following along with SICP are in Clojure
already.
So the value in SICP would be in helping you understand programming
concepts. If you already understand the concepts, Clojure lets you get
on with writing interesting and robust programs much more quickly,
IMO. And I don't think the core of Clojure is appreciably bigger than
Scheme's. What do Schemers think?
I think the Lisps prior to Clojure lead you towards a good path with
functional programming and lists, only to leave you high and dry when
it comes to the suite of data structures you need to write real
programs, such data structures, when provided, being mutable and
imperative. Prior Lisps were also designed before pervasive in-process
concurrency, and before the value of high-performance polymorphic
dispatch (e.g. virtual functions) as library infrastructure was well
understood. Their libraries have decidedly limited polymorphism.
Alas, there is no book on Clojure yet. But, to the extent Schemes go
beyond the standard to provide more complete functionality (as most
do), there are no books on that either. Just docs in both cases.
Learning Scheme or Common Lisp on your way to Clojure is fine. There
will be specifics that don't translate (from Scheme - no TCO, false/
nil/() differences, no continuations; from CL - Lisp-1, symbol/var
dichotomy). But I personally don't think SICP will help you much with
Clojure. YMMV.
Why did you choose to make your language a Lisp (dynamically typed) as opposed to ML?

I think the ML, and even more so, Haskell, are extremely interesting type systems and languages. But, personally, I haven't found their type systems to be sufficiently expressive or sufficiently dynamic for my needs. I like dynamic polymorphism, I like heterogeneous collections. I'm sure there are probably people working on extensions to GHC to allow for similar things, but I don't have time for that. I don't think the benefits of those type systems outweigh their costs, for me. And, Haskell is already doing a brilliant job in delivering that kind of solution.
OTOH, I think immutability is a huge benefit, so Clojure is an effort to bring that over to a dynamic language. (Erlang is another). I think ML's restriction of mutation to atomic refs (arrays aside) is a good idea and was definitely inspiration for Clojure, which adds transactional semantics.
What lessons did you learn porting some data structures from Okasaki ML into Java?

I won't argue for the verbosity of Java vs. SML or Haskell, and I readily acknowledge the beauty of the original formulation (although the remove code, not in the book, isn't pretty in any language). But it was an interesting exercise to move code from a pattern-matching language to a non-pattern-matching language. This was my experience:
I started with pretty much a transliteration of the original, replacing datatype tags with enums and matches with conditionals. The result was none too fast. Also, I wanted to optimize the memory use, as I knew many nodes would be non-branching leaves, and often maps would be used as sets, i.e. without values. (Also, at that time I didn't have persistent hash maps, and thought the rb tree would have to serve all the purposes for which hash maps now serve). One could imagine extending the Tree ADT (of the SML version) from E|T to E|T| Leaf|EmptyT|EmptyLeaf or some such, but, without polymorphism, the effect on the pattern matching is a multiplicative growth of combinations.
So, given I was in Rome (Java), I decided to do the idiomatic thing, and pursued a hybrid strategy, maximizing the use of polymorphism, (which I know is optimized in Java), and minimizing the use of conditionals. (The remove code is still pretty much conditionals, as the logic is too impenetrable to safely recast :)
The result is the hierarchy of Nodes, implementing the memory usage strategy, and, for add/lookup at least, mostly polymorphic method dispatch. The result is more efficient and much faster. The logic is distributed - each node type takes care of itself to some degree. It was, IMO, easier to add the with/without value/branches variations this way to maintain a conditional approach, with or without patterns. The result might seem a bit verbose, but comparing it to the Okasaki book example is not apples-to-apples. You would have to have an SML/Haskell version that:

Was a map, not just a set.
Had the remove code.
Had optional values, and used no storage for no value.
Didn't store branches for leaves.
Allowed the option of comparing elements or supplying a comparator.

I'm sure all of this is elegantly possible, but it won't be the pretty little thing in the book anymore.
Here are some of my takeaways:

Adding types can have a multiplicative, rather than an additive, effect on pattern matches.
Without a pattern-match optimizer, naive mechanical translation of patterns to if-else constructs can produce slow code.

The latter is particularly important. I'm sure SML/Haskell/Erlang optimize pattern matching, and while it may be easy to
superficial behavior with a macro, the optimizers are likely non-trivial.
Shouldn't (nth nil 1) throw outofbound exception?

Thank goodness it doesn't (and first/rest etc) or speculative
destructuring (and many other Clojure idioms) would be impossible.
It is a CL influence to have semantics for many functions when applied
to nil rather than generate errors, and, once one knows the rules, it
can be used to great effect to produce elegant code. I've mentioned this before: http://people.cs.uchicago.edu/~wiseman/humor/large-programs.html
Why Clojure doesn't have a generic insert, lookup, append that works the same on all collections?

Also answers: why only some functions are more polymorphic (e.g. into, conj, count) while others are more type specific (e.g. contains?, assoc)?
It is an important aspect of Clojure that, in general, performance guarantees are part of the semantics of functions. In particular, functions are not supported on data structures where they are not performant.
You can't just swap out sequences or lists for maps or sets in a program. So, you need to use the right data structure for the job, and then the right functions will be available. When it makes sense, some functions are maximally polymorphic (e.g. seq, or into). But lookup, under any name, shouldn't be, IMO, so it isn't in Clojure. Similarly there is no insert at the beginning of vectors, append to end of lists, lookup for values of maps, insertion in the middle of sequences etc. I simply don't want to provide the tools for people to write programs in Clojure with N^2 performance - that benefits no one. But I've seen such performance degradation in many a Lisp program as people build upon O(n) functions on lists.
If you have a lookup table in your program, please use a set or map, or, if by index, a vector. You will have get, contains? or nth available, performance will be good, and that will be clear from the code.
seq-contains? exists because sometimes, say in a macro, you have a known short sequence and need to test for the presence of something.  There's no need to copy it into a set. And people have been rolling their own includes? etc for this job. Now everyone can use seq- contains? for that. Its poor performance is indicated by its name and documentation. Should someone try to take a piece of code based on seq- contains? and try to scale it, they will see that function as an obvious bottleneck.
Your type-check argument is a straw man that doesn't hold up in Clojure practice. People tend to use the right data structures for the job and don't choose algorithms via type checks.
Clojure is not particularly duck-type-y, in general things are not unified by the presence of same-named functions but by underlying shared abstractions. Just because protocols can be used to make things reach many types doesn't mean they should be used to unify things that aren't uniform.
Why Clojure compiler is single pass? Aren't many possible optimizations lost this way?

The issue is not single-pass vs multi-pass. It is instead, what constitutes a compilation unit, i.e., a pass over what?
Clojure, like many Lisps before it, does not have a strong notion of a compilation unit. Lisps were designed to receive a set of interactions/forms via a REPL, not to compile files/modules/programs etc. This means you can build up a Lisp program interactively in very small pieces, switching between namespaces as you go, etc. It is a very valuable part of the Lisp programming experience. It implies that you can stream fragments of Lisp programs as small as a single form over sockets, and have them be compiled and evaluated as they arrive. It implies that you can define a macro and immediately have the compiler incorporate it in the compilation of the next form, or evaluate some small section of an otherwise broken file. Etc, etc. That "joke from the 1980's" still has legs, and can enable things large-unit/multi-unit compilers cannot. FWIW, Clojure's compiler is two-pass, but the units are tiny (top-level forms).
What Yegge is really asking for is multi-unit (and larger unit) compilation for circular reference, whereby one unit can refer to another, and vice versa, and the compilation of both units will leave hanging some references that can only be resolved after consideration of the other, and tying things together in a subsequent 'pass'. What would constitute such a unit in Clojure? Should Clojure start requiring files and defining semantics for them? (it does not now)
Forward reference need not require multi-pass nor compilation units. Common Lisp allows references to undeclared and undefined things, and generates runtime errors should they not be defined by then. Clojure could have taken the same approach. The tradeoffs with that are as follows:

less help at compilation time
interning clashes

While #1 is arguably the fundamental dynamic language tradeoff, there is no doubt that this checking is convenient and useful. Clojure supports 'declare' so you are not forced to define your functions in any particular order.
#2 is the devil in the details. Clojure, like Common Lisp, is designed to be compiled, and does not in general look things up by name at runtime. (You can of course design fast languages that look things up, as do good Smalltalk implementations, but remember these languages focus on dealing with dictionary-carrying objects, Lisps do not). So, both Clojure and CL reify names into things whose addresses can be bound in the compiled code (symbols for CL, vars for Clojure). These reified things are 'interned', such that any reference to the same name refers to the same object, and thus compilation can proceed referring to things whose values are not yet defined.
But, what should happen here, when the compiler has never before seen bar?
(defn foo [] (bar))
or in CL:
(defun foo () (bar))
CL happily compiles it, and if bar is never defined, a runtime error will occur. Ok, but, what reified thing (symbol) did it use for bar during compilation? The symbol it interned when the form was read. So, what happens when you get the runtime error and realize that bar is defined in another package you forgot to import. You try to import other-package and, BAM!, another error - conflict, other-package:bar conflicts with read-in-package:bar. Then you go learn about uninterning.
In Clojure, the form doesn't compile, you get a message, and no var is interned for bar. You require other-namespace and continue. I vastly prefer this experience, and so made these tradeoffs. Many other benefits came about from using a non-interning reader, and interning only on definition/declaration. I'm not inclined to give them up, nor the benefits mentioned earlier, in order to support circular reference.
What is the convention for arguments order?

One way to think about sequences is that they are read from the left, and fed from the right:
<- [1 2 3 4]
Most of the sequence functions consume and produce sequences. So one way to visualize that is as a chain:
map<- filter<-[1 2 3 4]
and one way to think about many of the seq functions is that they are parameterized in some way:
(map f)<-(filter pred)<-[1 2 3 4]
So, sequence functions take their source(s) last, and any other parameters before them, and partial allows for direct parameterization as above. There is a tradition of this in functional languages and Lisps.
Note that this is not the same as taking the primary operand last. Some sequence functions have more than one source (concat, interleave). When sequence functions are variadic, it is usually in their sources.
I don't think variable arg lists should be a criteria for where the primary operand goes. Yes, they must come last, but as the evolution of assoc/dissoc shows, sometimes variable args are added later.
Ditto partial. Every library eventually ends up with a more order- independent partial binding method. For Clojure, it's #().
What then is the general rule?
Primary collection operands come first.That way one can write -> and its ilk, and their position is independent of whether or not they have variable arity parameters. There is a tradition of this in OO languages and CL (CL's slot-value, aref, elt - in fact the one that trips me up most often in CL is gethash, which is inconsistent with those).
So, in the end there are 2 rules, but it's not a free-for-all. Sequence functions take their sources last and collection functions take their primary operand (collection) first. Not that there aren't are a few kinks here and there that I need to iron out (e.g. set/select).
How do you pronounce assoc?

It's assosh, as in assoc(iate) this key with this value.
Why doesn't contains? do what I expect on vectors and lists?

Sequential lookup is not an important operation. Clojure includes sets and maps, and if you are going to be looking things up you should be using them. contains? maps to java.util.Set.contains. The fact that java.util.Collection also has contains is, IMO, a mistake, as you really can't write code to an interface with dramatically varying performance characteristics. So, lookups in sets and maps take priority and get the best name - contains? People write naive programs that do linear lookups with contains() in other languages, and get correspondingly bad n-squared performance - that's not an argument for encouraging that in Clojure. If they get an explanation of why it's a bad idea, and how to use sets and maps, that's fine.
Why commas are allowed as whitespaces?

The motivation came from having map literals. I wanted to avoid syntax, e.g. ':'s between key and value or mandatory commas, as well as structure, grouping key-value pairs with parens, for instance. But printing large maps with uniform keys and values with no grouping makes it really easy to lose your place: {1 2 3 2 5 3 7 2...}
Commas as whitespace give you the best of both worlds. You can enter in short maps or maps with distinct keys and values, e.g. {:fred 41 :ethel 42} with no commas, but use commas whenever they help. By default, maps print with commas between key/value pairs: {1 2, 3 2, 5 3, 7 2...}. Because it is not map-specific syntax, you can use them anywhere, making changing a map to a vector a matter of changing the {} to [].
People have started to use them elsewhere, esp. in let bindings, as Clojure doesn't have the traditional parens around each binding, and there too, it is optional, non-structural, a matter of personal preference and need. (let [a 1 b 2]...) might by fine but (let [x a, b y, c z] ...) might be clearer than without commas.
How Clojure transients work?

Transients are related to doing functional programming in that, if you are otherwise using persistent data structures, conversion to/from transients is O(1). There is no full copy to produce the mutable version (nor the other way), nor is use of the mutable version in any way unsafe for its persistent source. In order to do this, there needs to be a symbiotic design. It is not simply a matter of "we'll let you switch to these other, different, mutable data structures".
Certainly, using transients is not functional programming, nor are they persistent. But they are an important part of the story for those who might otherwise eschew functional programming and immutable data for fear that, should they encounter a performance bottleneck, they will have to change their data structures and the shape of their code.
Clojure's persistent vectors and hashmaps are trees with high branching factors. Creating a transient version means copying the root node into a new object that implements the transient interface, otherwise sharing the rest of the structure. As edits are made, the transient will clone nodes for which it doesn't own a writable copy. Once it has its own copy of a node, subsequent edits affecting that node can be made in place. When done editing, the root is cloned again, this time into an object implementing the persistent interface, and a flag is flipped in the transient disabling further use. Essentially the transient just builds a persistent in place, there is no change tracking or collapsing.
The threading and passing requirement is designed to ensure that the code preserves the same shape as functional code it replaces.
How can I achieve programming mastery?

Mastery comes from a combination of at least several of the following:

Knowledge
Focus
Relentless considered practice over a long period of time
Detected, recovered-from failures
Mentorship by an expert
Always working slightly beyond your comfort/ability zone, pushing it ever forward

Musicians get better by practice and tackling harder and harder pieces, not by switching instruments or genres, nor by learning more and varied easy pieces. Ditto almost every other specialty inhabited by experts or masters. A wide variety of experiences might lead to well-roundedness, but not to greatness, nor even goodness. By constantly switching from one thing to another you are always reaching above your comfort zone, yes, but doing so by resetting your skill and knowledge level to zero.
One can become a great developer in any general purpose language, in any domain, on any platform. And, most notably for the purposes of this discussion, such a developer can carry that greatness across a change in any of them. What skills then are so universally useful and transportable in software development? Two are: the ability to acquire knowledge, and the ability to solve problems.
How does one get better at acquiring knowledge and solving problems? Not by acquiring a lot of superficial knowledge nor solving a lot of trivial problems (a la your 'achievements'), but by acquiring ever deeper knowledge and solving ever harder problems.
You should take heed your phrase 'leveling up'. You don't level up by switching games all the time, but by sticking with one long enough to gain advanced skills. And, you need to be careful to recognize the actual game involved. Programming mastery has little to do with languages, paradigms, platforms, building blocks, open source, conferences etc. These things change all the time and are not fundamental. Knowledge acquisition skills allow you to grok them as needed. I'd take a developer (or even non-developer!) with deep knowledge acquisition and problem solving skills over a programmer with a smorgasbord of shallow experiences any day.
Will this lead to an as-easily-realized improvement strategy based upon boolean achievements? Probably not.