@weavejester
Last active June 21, 2021 04:36
Homoiconicity in Clojure

Many programming languages are designed to favour writing code over writing literal data. For example, in Python the syntax for accessing an instance variable on an object is:

student.name

But to access a value in a dictionary data structure:

student["name"]

In this case, Python's syntax clearly favours working with objects over working directly with data structures.

The syntax of a programming language steers developers toward using certain tools over others. Contrast Haskell's concise partial application of a curried function:

(+ 1)

With anonymous classes in Java 7:

new Function<Integer,Integer>() {
  @Override
  public Integer apply(Integer x) {
    return x + 1;
  }
}

Even if we knew nothing else about Haskell, we might infer that Haskell is a language that makes liberal use of higher-order functions, whereas Java 7 does not.

So if we want to build a programming language around data, then we want to avoid syntax that favours writing code over writing data structures. We want writing literal data to be at least as easy as writing code.

The most straightforward solution to this problem is to make the language homoiconic. If your code and data share the same syntax, then by definition your syntax can't favour writing code over data. This, I think, is the most compelling reason why Clojure is a homoiconic language.
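As a minimal sketch of what "share the same syntax" means in practice, the snippet below reads a piece of source text, inspects the result with ordinary sequence functions, and only then evaluates it. The names `form` and `result` are illustrative and not taken from any library.

```clojure
;; Reading source text yields an ordinary list, not an opaque syntax tree.
(def form (read-string "(+ 1 2)"))

(list? form)   ; the code is a list
(first form)   ; its first element is the symbol +
(rest form)    ; the remaining elements are the numbers 1 and 2

;; Evaluating that same list runs it as code.
(def result (eval form))
```

The same three-element list is a value to `first` and `rest`, and a program to `eval`.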

Non-homoiconic solutions may be possible, but they run into two significant problems.

The first problem is that they require additional syntax. Clojure uses lists and symbols to represent code, but these data types have uses beyond being evaluated. Symbols in particular complement keywords. A keyword is an identifier that designates itself, whereas a symbol is an identifier that designates something else.
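The difference between the two kinds of identifier can be sketched at the REPL. This assumes the default setup in which `clojure.core` is referred into the current namespace; the names `k`, `s` and `f` are illustrative.

```clojure
;; A keyword evaluates to itself.
(def k :name)
(keyword? k)   ; true
(= k :name)    ; true

;; A quoted symbol is plain data: an identifier we can hold and inspect.
(def s 'inc)
(symbol? s)    ; true

;; Evaluating the symbol looks up what it designates, here the function inc.
(def f (eval s))
(f 1)          ; 2
```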

For example, symbols are used to represent variables in Datomic's query syntax:

[:find ?student :where [?student :student/name "Alice"]]

Or in Ataraxy's routing syntax:

{[:get "/users/" name] [:get-user name]}

Both of these examples represent Clojure data structures, rather than Clojure code.
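To make that concrete, here is a sketch that treats the Datomic-style query above as an ordinary value. It needs no Datomic dependency, since the query is only inspected and never run; `query` and `variables` are hypothetical names introduced for illustration.

```clojure
;; The quote keeps the reader's output as data instead of evaluating it.
(def query '[:find ?student :where [?student :student/name "Alice"]])

(vector? query)            ; the query is an ordinary vector
(first query)              ; :find, a keyword
(symbol? (second query))   ; ?student is an ordinary symbol

;; Collect the distinct logic variables: symbols whose name starts with "?".
(def variables
  (->> (tree-seq coll? seq query)
       (filter symbol?)
       (filter #(= \? (first (name %))))
       distinct))
```

Nothing here is specific to queries. The same walk would work on any nested Clojure data, which is the point: the symbols are just elements of a vector.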

So if we want to separate code from data, to have a data-orientated language that is not homoiconic, then we either need to stop using symbols as literal tokens altogether, or introduce new syntax that can distinguish symbols used in code from symbols used in data structures.

But let's say we decide to do this. We next run into the problem of how to write syntax that separates code and data, but does not favour writing code. Unfortunately, while we can always use code to define data structures:

(hash-map :a 1, :b 2) 

If the language is not homoiconic, the reverse isn't true; we can't use literal data structures to define code. So there's an inherent disparity, and by segregating code and data, we risk users becoming more familiar with writing code than writing data. With a homoiconic language, there's no such problem.
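In Clojure both directions are available, which can be sketched as follows. The names `call-form`, `extended` and `built` are illustrative.

```clojure
;; Code producing data: the function call and the literal yield equal maps.
(= (hash-map :a 1, :b 2) {:a 1, :b 2})   ; true

;; Data producing code: a quoted list literal describes the same call.
(def call-form '(hash-map :a 1, :b 2))
(list? call-form)                        ; true

;; The form can be edited with ordinary sequence functions before evaluation.
(def extended (concat call-form [:c 3]))
(def built (eval extended))
```

After evaluation `built` is the map `{:a 1, :b 2, :c 3}`, constructed from a program that was itself assembled as data.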


johanatan commented Jan 2, 2017

I think you should read the article on Wikipedia on homoiconicity (or SICP or any classic Lisp book).
https://en.wikipedia.org/wiki/Homoiconicity

You seem to be confusing the presence of rich data literals with homoiconicity. In Clojure's case the data literals are in fact homoiconic, but that is due to (a) Hickey starting with a homoiconic language to begin with (a fact for which we should ask ourselves "why?") and (b) his wishing to extend that language without breaking homoiconicity (a wise choice, as it preserves the benefits of homoiconicity, namely macros). JavaScript has object literals, yet it is not homoiconic because its "code is not written as a basic data structure of the language itself".

Homoiconicity does nothing for you unless you intend to metaprogram (once again, please refer to Wikipedia or a classic Lisp book on this point), and you can have rich data literals without homoiconicity (as already demonstrated).


johanatan commented Jan 2, 2017

"If the language is not homoiconic, the reverse isn't true; we can't use literal data structures to define code."

This is a bit reversed (as homoiconicity is about code [function calls and their parameters] being easily interpreted as data) but even assuming its truth, how is this a benefit? The benefit of homoiconicity is that code is represented as plain data structures and thus can be easily transformed or manipulated directly within the language itself (i.e., metaprogrammed [via macros in the case of Lisp]).

@weavejester
Author

I'm not sure how to explain this any better. We seem to be talking past each other.

Can you provide a rough sketch of a programming language syntax that is data-orientated but not homoiconic?
