@non /
Last active Mar 26, 2017

answer @nuttycom

What is the appeal of dynamically-typed languages?

Kris Nuttycombe asks:

I genuinely wish I understood the appeal of unityped languages better. Can someone who really knows both well-typed and unityped explain?

I think the terms well-typed and unityped are a bit of question-begging here (you might as well say good-typed versus bad-typed), so instead I will say statically-typed and dynamically-typed.

I'm going to approach this article using Scala to stand in for static typing and Python for dynamic typing. I feel like I am credibly proficient in both languages: I don't currently write a lot of Python, but I still have affection for the language, and have probably written hundreds of thousands of lines of Python code over the years.

Losing static guarantees

Obviously the biggest problem with writing Python compared to Scala is that you have many fewer static guarantees about what the program does. I'm not going to sugarcoat this -- it's a big disadvantage.

Most of the other advantages have to be understood in terms of this. If you value compile-time guarantees you may be tempted not to acknowledge the advantages. I think this is a mistake. If you really want to understand what makes writing Python appealing (or even fun), you have to be willing to suspend disbelief.

Less rope

At least when I was writing Python, the right way to think about Python's types was structural: types are strong (a string can't become a number for example), but best understood as a collection of capabilities. Rather than asserting that x is a Collator, and then calling .collate() once that fact is established, we just call .collate() directly.
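
As a rough sketch of what "a collection of capabilities" means in practice (the class and function names below are invented for illustration and do not come from any real library), two unrelated classes can be used interchangeably as long as both offer .collate():

```python
class AlphabeticalCollator:
    """Orders strings alphabetically, ignoring case."""

    def collate(self, items):
        return sorted(items, key=str.lower)


class LengthCollator:
    """Shares no base class with the one above, only the method name."""

    def collate(self, items):
        return sorted(items, key=len)


def collated(collator, items):
    # No isinstance check and no declared interface:
    # anything that has a .collate() method is acceptable here.
    return collator.collate(items)


words = ["pear", "Fig", "banana"]
alphabetical = collated(AlphabeticalCollator(), words)
by_length = collated(LengthCollator(), words)
```

If an object without .collate() is passed in, the failure only shows up as an AttributeError at the call site, which is the trade-off described in the previous section.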

Because types are not enforced, you would not tend to use them to guide API decisions. Compared to languages like Scala or Java, Python strongly encourages APIs that don't require exotic types, many parameters (unless sensible defaults can be provided for almost all of them), or deeply-nested structures.

In an interesting way, it keeps you from going all-in on object-oriented programming. You only want to create a class when a casual API user will understand how it works and what they would use it for. Otherwise, you tend to prefer using static methods (or similar) that can act on simpler data types. Similarly, there is strong pressure to use the standard collection types (lists, sets, dictionaries) in almost all cases.

This has a number of consequences:

  • You rarely have to wade through huge class hierarchies with poor documentation
  • APIs tend to be strongly-focused on what you want to do
  • You get much more mileage out of learning the default collections' APIs
  • Custom collections feel stronger pressure to conform to default APIs
  • You can assume most data has a useful string representation
  • You rarely have to worry about baked-in limitations of your types

To address the last point: you rarely have to worry about someone baking in the wrong collection type, or numeric type. If you have a type that behaves similarly, you can use that instead.
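
A small sketch of that last point, assuming a hypothetical mean helper: because the function only relies on addition and division, it works with floats, exact fractions, and decimals without anyone having chosen a numeric type in advance.

```python
from decimal import Decimal
from fractions import Fraction


def mean(values):
    # Hypothetical helper: needs only "+" (via sum) and "/ int",
    # so any numeric type supporting those operations will do.
    values = list(values)
    return sum(values) / len(values)


float_mean = mean([1.0, 2.0, 4.5])
exact_mean = mean([Fraction(1, 3), Fraction(2, 3)])
decimal_mean = mean((Decimal("0.1"), Decimal("0.3")))  # a tuple works too
```

The same function accepts a list, a tuple, or a generator, since it only asks for something iterable.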

(A corollary here is that someone who comes to Python from e.g. Java may tend to produce APIs that are very hard to use.)

Distinctions without a difference

In Scala or Java, you often end up with tons of classes that are essentially data containers. Case classes do a great job of minimizing the boilerplate of these. But in Python, all of these classes are just tuples. You don't have to try to come up with names for them, or anything else. It is really liberating to be able to build modules that are much smaller, by virtue of not worrying about having to give things names, or which class to use.
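
For instance (a minimal sketch with a made-up function name), a function that returns three related values can simply return a tuple, and the caller unpacks it without any result class being declared:

```python
def min_max_mean(values):
    # Returns three related values as a plain tuple; no named
    # container class is needed.
    values = list(values)
    return min(values), max(values), sum(values) / len(values)


# The caller names the parts at the point of use.
low, high, average = min_max_mean([3, 9, 6])
```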

Abstraction as pattern recognition

Abstracting over things like variance becomes much simpler. It's interesting that many Python programmers see the major difference between lists and tuples as immutability (tuples are immutable), but it makes sense when you consider that both can be iterated over, indexed by number, and have no limitations on their size. Compare this to the difficulties of correctly expressing and abstracting over product types in Scala. Even with Shapeless' help, it is a lot of work.

More generally, finding abstractions in Python feels much more like pattern recognition. If you see two stanzas of code that are essentially the same, it is trivial to abstract over their differences and write a common method. This is true even when the differences are down to:

  • Field or method names used
  • Arity of functions or tuples
  • Classes instantiated
  • Class or package names
  • Imports needed

In static typing, these sorts of abstractions involve figuring out how to relate the types of ASTs as well as the AST shape itself. It doesn't feel as much like a pure abstraction or compression problem as it does in a dynamic language.
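
To make the "pattern recognition" idea concrete, here is a hedged sketch (the helper and field names are hypothetical) in which two near-identical loops are folded into one function by turning the field names and the container class into ordinary parameters:

```python
from collections import OrderedDict
from types import SimpleNamespace


def total_by(records, key_field, value_field, container=dict):
    # The field names and the class to instantiate are just arguments.
    totals = container()
    for record in records:
        key = getattr(record, key_field)
        totals[key] = totals.get(key, 0) + getattr(record, value_field)
    return totals


orders = [
    SimpleNamespace(customer="ann", amount=5),
    SimpleNamespace(customer="bob", amount=2),
    SimpleNamespace(customer="ann", amount=1),
]
visits = [
    SimpleNamespace(page="/home", seconds=30),
    SimpleNamespace(page="/home", seconds=12),
]

order_totals = total_by(orders, "customer", "amount")
visit_totals = total_by(visits, "page", "seconds", container=OrderedDict)
```

Nothing relates the record types to each other; the function works for any object that happens to have the named attributes.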

Speculative programming

John De Goes made a point about fixing dynamic programs one-error-at-a-time, versus hundreds of compiler errors at once. I think he's right about that, but I don't think it does justice to why the approach sometimes feels better.

One of the first things we have to learn as programmers is how to emulate what the "machine" (a computer, interpreter, VM, whatever) is going to do. We learn to trace through programs imagining counters incrementing, data being allocated, functions being called, etc. We can use print to view intermediate values, we can raise exceptions to halt the process at an intermediate point, etc.

There are arguments that this is the wrong way to do programming. I find some of them convincing. But even for most people using static types, this is how they determine what their program will do, how they assemble it out of other programs, how they debug it, etc. Even those of us who wish programming was more like writing a proof do this.

One advantage Python has is that this same faculty that you are using to create a program is also used to test it and debug it. When you hit a confusing error, you are learning how the runtime is executing your code based on its state, which feels broadly useful (after all, you were trying to imagine what it would do when you wrote the code).

By contrast, writing Scala, you have to have a grasp on how two different systems work. You still have a runtime (the JVM) which is allocating memory, calling methods, doing I/O, and possibly throwing exceptions, just like Python. But you also have the compiler, which is creating (and inferring) types, checking your invariants, and doing a whole host of other things. There's no good way to peek inside that process and see what it is doing. Most people probably never develop great intuitions around how typing works, how complex types are encoded and used by the compiler, etc. (Although in Scala we are fortunate to have a lot of folks like Stephen Compall, Miles Sabin, and Jason Zaugg who do and are happy to talk about it.)

Not having to learn (or think about) this whole parallel system of constraints and proofs is really nice. I think it's easy for those of us who have learned both systems to ignore the intellectual cost to someone who is getting started.

An obvious question is why we have to mentally emulate a machine at all. In the long run I'm not sure we do. But with the current offering of statically-typed languages most folks are likely to use, I think we still do.

Where's the fire?

People are often confused that many scientists seem to love Python. But I think it makes sense.

Static typing is most useful in large, shared codebases where many of the main risks are misusing someone else's API, failing to refactor something correctly, or dealing with long-lived codebases full of deeply-nested interacting structures.

By contrast, a scientist's main concerns are probably mathematical errors (most of which the type system won't catch), methodological problems (even less likely to be caught) and overall code complexity. They are also unlikely to maintain code for very long periods of time or share codebases. This is someone for whom an empirical (and dynamic) runtime debugging process probably seems more pleasant than trying to understand what the type system and compiler are complaining about. (Even after their program compiles they will probably need to do the runtime testing anyway.)


I don't plan to stop writing code in Scala, Haskell, or Rust (or even C). And when I write Python these days, I do find that I miss the static guarantees and type-driven development. But I don't hate writing Python, and when I'm writing Scala I still find things to envy.


An excellent write-up.


This is one of the more detailed and constructive contributions I've seen on this topic. Language designers and advocates on all sides would do well to read it. Thanks for taking the time to write it up.


And easier metaprogramming

Ovid commented May 8, 2015

Great write-up! For more thoughts, read what to know before debating type systems.

What I explain to people when I am asked about this is that computer scientists have reasonable disagreements about type systems, but computer programmers have unreasonable disagreements. (There's also a subtle play on words there, but people often miss the distinction about "reason").

mkolod commented May 8, 2015

Great write-up! I do have one qualification, though. Let's assume for the time being that by "data scientist," I mean a person familiar with statistics and machine learning who writes code to analyze data and generate insights (it's not that obvious since in the Hadoop community, even people doing ETL are often mislabeled as data scientists - it's not a precise term). In this sense of the word, a data scientist has to be distinguished from a machine learning engineer - the latter builds long-lived code which forms the algorithmic and infrastructural foundation on which data scientists can work (ML engineers are also very good data scientists of course, since they know the ML algorithm internals). As an ML engineer, I would choose Scala over Python. The reasons are numerous:

  • many people will be using my code
  • the runtime has to be fast (if you write Cython, you might as well be writing Scala)
  • the runtime has to be concurrent (let's ignore Twisted, Python's multiprocessing API, etc. which are essentially hacks).

That said, of course the data science ecosystem on the JVM is much more limited - yes, you can use JavaCV, NLP libraries like Stanford CoreNLP, etc., but there's no real scikit-learn equivalent, etc. But, as an ML engineer, I'd be tasked with building that stuff, which is different from consuming it the way data scientists do.

BTW, there's no criticism here of one group or another - just a clarification of there being two data science-related audiences - the builders and the users. Since the builders (ML engineers) are creating a long-lived foundation, Erik's argument in favor of static typing applies here without reservations. It was merely a clarification for non-data-science folks.


You lost me here:

In Scala or Java, you often end up with tons of classes that are essentially data containers. Case classes do a great job of minimizing the boilerplate of these. But in Python, all of these classes are just tuples. You don't have to try to come up with names for them, or anything else. It is really liberating to be able to build modules that are much smaller, by virtue of not worrying about having to give things names, or which class to use.

  • You compare maybe the best dynamic language (Python) with a rather poor strawman for a somewhat typed language (Scala). Compare Python with Haskell for fairness!
  • You can also use unnamed tuples in Haskell with the advantage that the compiler does type inference on the tuple's type.
  • You can also just give the thing a name and thus make your code much more readable:
data Person = Person { 
                       firstName :: String  
                     , lastName :: String  
                     , age :: Int  
                     } deriving (Eq, Show, Read) 

I'd rather deal with a type signature Person than (String, String, Int).

alvare commented May 9, 2015

Really good write up! A lot of truth too.

Though, I agree with @thkoch2001, a fight against Haskell would be way harder.
I mean, with lenses in MonadState you can do imperative things Python just dreams of!


Productivity is the main appeal for those who are not firstly programmers but use programming to get something done such as Engineers or Scientists.

Learning a dynamic language such as Python tends to make one productive quicker and with less language to learn. When you do need to bring in the big guns of dedicated programmers, one's Python knowledge helps when explaining what is needed and the engineer or scientist can add that any code in other languages be correctly embedded in Python to ease its use.

odersky commented May 10, 2015

If I extrapolate your argument, the ideal situation would be a language with an expressive type system (like Scala) where library-writers nevertheless have the restraint to prefer simple Python-like APIs. Would you agree with that statement?


Python is a fantastic scripting language for little self-contained utilities like this, software automation tasks, hobby robotics, and science.

But let's get real here, and clarify for beginners who might be taken in by the Python hype. I'm throwing down the gauntlet. Is there seriously any professional software engineer here who's willing to say that they'd recommend a dynamically-typed language for their team to take on a 20k+ SLOC project? (other than the case of JavaScript for running in the browser)

Python is cute and very fun to use, but it's a niche tool. Static typing has such ridiculous value that you have to jettison something gigantic like garbage collection in order to find a serious statically typed language that's less suitable for large-scale software development than any dynamically-typed language. I'm writing Scala at work (hi @odersky) and the type system blows me away with how useful it is. You can write a complicated, deeply-nested chain of functional calls and get instant feedback on the presence of mistakes without cutting out the code and running through all the branches in a REPL - just poke at it until it compiles, and then you know it works. And the ease of refactoring changes everything - one click in IntelliJ vs potentially hours of grepping through a big codebase.

MareinK commented May 10, 2015

@thkoch2001: Can this not be mostly achieved using namedtuple?

from collections import namedtuple
Person = namedtuple('Person', ['firstName', 'lastName', 'age'])
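
A short usage sketch of the definition above (the sample values are placeholders): the result still behaves like a plain tuple while also offering named, read-only fields.

```python
from collections import namedtuple

Person = namedtuple('Person', ['firstName', 'lastName', 'age'])

# Sample values chosen only for illustration.
p = Person('Ada', 'Lovelace', 36)

# It still unpacks and compares like an ordinary tuple...
first, last, age = p
same_as_tuple = (p == ('Ada', 'Lovelace', 36))

# ...but fields are also reachable by name, and "updates" build a new value.
older = p._replace(age=37)
```

Unlike the Haskell record, nothing checks that age is an integer; the names are documentation and convenience, not a static guarantee.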

@briangordon You undermine your argument by immediately excluding a dynamic language.


@Paddy3118 Do you mean my reference to JavaScript? I meant obviously there are going to be giant frontend JavaScript projects, just because there were no other options 5 years ago.

non commented May 10, 2015

@mkolod I agree entirely with your assessment. I was trying to address the fact that my friends who are physicists, economists, sociologists, etc. overwhelmingly seem to be using Python for doing their work rather than other languages. I think for someone who is operationalizing statistics and machine-learning, it's much more common to see something like Scala being used.

non commented May 10, 2015

@thkoch2001 I think you assumed I was trying to make an argument that people should be using dynamic types. This was really just intended to try to explain to @nuttycom (or someone else who can't understand why anyone would enjoy programming in a dynamically-typed language) what feels good about it.

I don't agree with your characterization of Scala as a poor strawman for a statically-typed language. Beyond assuming that I needed a strawman to try to argue static types are bad (which is not something I even believe) I have much more experience writing Scala than Haskell. Again, I was taking two languages I know very well and trying to explain the appeal of Python.

As far as the section you quoted, my only point is that Haskell must care about the arity of your tuples. It's trivial for Python programmers to abstract over (1, 2, 3), (1, 2, 3, 4), and (1, 2, 3, 4, 5) in a way that it isn't for someone who has to regard those as three separate types. See this Stackoverflow question for what I'm talking about here.
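
A minimal sketch of the arity point (the function names are invented for this example): the same two helpers accept tuples of length three, four, or five, because Python treats them all as one kind of value.

```python
def append_total(row):
    # Works for a tuple of any length; the result is one element longer.
    return row + (sum(row),)


def adjacent_sums(row):
    # Sums each neighbouring pair; the result is one element shorter.
    return tuple(a + b for a, b in zip(row, row[1:]))


rows = [(1, 2, 3), (1, 2, 3, 4), (1, 2, 3, 4, 5)]
with_totals = [append_total(row) for row in rows]
neighbours = [adjacent_sums(row) for row in rows]
```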

Anyway, I'm disappointed that you read my article as attacking static-typing. As someone who spends 95% of their day writing code in statically-typed languages, that wasn't my intention.

non commented May 10, 2015

@odersky I'm not sure. I definitely feel that way about standard libraries.

As far as third-party libraries, I tend to have much more of a "live and let live" view. There are several constituencies for APIs:

  • domain experts who want lots of control, will tolerate lots of complexity
  • casual users who are mostly happy with defaults and hate complexity
  • professionals who may need to override defaults, but prefer to avoid complexity

An implied argument from my post is that static typing is so useful that once you understand the type system, you can easily understand someone else's domain or object model. This makes it easier to design libraries for domain experts which are still relatively accessible. However, this assumes familiarity and expertise with the type system.

Please don't read too much into this post. I'm not planning on giving up my type classes, higher-kinded types, or algebras anytime soon!


This reads more like a comparison between structural and nominal typing than between static and dynamic typing. The speculative programming bit isn't really related either: you can have static checking and still have the compiler defer the actual type errors to runtime by treating them as warnings rather than errors (like Haskell's -fdefer-type-errors). A statically typed Python could still preserve all these advantages. The real question that was being asked is what you have to give up to get static typing.

I also suspect that the bit about scientists is a second-order effect. Python is the best language for which the scientific and numerical libraries are mature enough to justify using it. Scientists probably don't really care too much about whether or not Python is statically or dynamically typed. If the next Python were statically typed they'd probably still use it. (EDIT: I don't disagree with the point you made though, I think it is valid, and salient, that static type checking probably buys them less).

The two points of view on why you actually want dynamic typing that I've heard of follow two basic lines of reasoning. The first is that there is no reason not to use static typing if you have infinite time and wisdom as dynamic typing can be embedded in it anyway through explicit unityping; however static type checking is hard to implement well (evidenced by the plethora of languages with poor static type systems) and if you make too many mistakes in the design of your language this will bite the users.

The second is that by deferring type checking until runtime, more information is available and you're going to be better at JITing an efficient specialized implementation and dealing with new types introduced by hot-loaded code that may make some code have sensible meanings where it didn't before. This can be seen as an emphasis on "late-binding", so late that the program is already running. Even in this case you can still probably justify having some kind of static type system so that the programmer doesn't have to write a bunch of checks into their code, this is more an argument against doing type erasure to get rid of RTTI; the RTTI might be useful for doing new, potentially more useful type checking at runtime.

My expectation is that in the future this distinction will be considered inconsequential as static type systems are developed that, in practice, work the way you expect things like Python to work. You'd really just be introducing some extra static checking and eliminating some need for manual type checking by the programmer. For a good enough implementation of static, structural typing with RTTI, a root type, and late binding, you're not going to notice any tradeoffs other than the fact that you have to run a static checker (the language implementers will, though), and I'm hoping that's where we are headed.


@barendventer: and if this future type checker can also handle duck typing at runtime then I would have it all!!!!

comex commented May 11, 2015

I had a small encounter with Rust the other day that reminded me of this whole debate. Not too interesting, really, not some case where a dynamic language avoids massive horrid ugliness through exquisite cleverness, but a paper cut - paper cuts matter. I had a parser which took a bunch of fields in a certain format from a binary file, interpreted them, and copied them to corresponding fields in a data structure. It was pretty straightforward; each field got a line like this:

self.modtab = self.file_array("module table", ds.modtaboff, ds.nmodtab, dylib_module_size);

But I had decided I wanted to extend it to support writing as well as reading. I started to write another function that had some similar looking code going in the opposite direction, but since there are a bunch of fields and some complexity to how the fields are nested (which I'll gloss over), I decided the result would be better and more readable if I declared a list of fields in one place and used it in both the reading and writing functions, in a more abstract fashion. To do this I'd need to specify in some fashion which struct field and which field from the binary format (which is also just structs) should be used. Rust, like most languages, does not have any built-in concept of a 'first-class field', i.e. something like &Struct::field which you could then apply to multiple objects. I knew I could simulate it with a lambda going from a pointer to the object to a pointer to the field:

|x: &mut MachO| &mut x.dyld_weak_bind

but of course I didn't want to write that out for every field (that is, twice for each native-struct to format-struct mapping, since each side has a field), so I needed to write a macro...

But hold on, let's take stock for a second at how the situation I'm facing would differ in other languages.

  • Most static languages don't have macros out of the box.
  • You might just write the lambdas out (this would work in C++ too), but...
  • In any static language without pointers* or runtime reflection, you need two functions, one to get and one to set, or else to add some complexity to the interface somewhere, so this starts to look pretty ugly without macros.
  • Runtime reflection in a static language is basically a way to make the language dynamic when you need to, but even if it exists, if it happens to be arcane or verbose or it's not common enough for you to know how it works from memory then you probably won't want to use it...
  • Macros work if you have them, but they are somewhat more likely to confuse the reader than even method_missing type stuff, especially in languages such as Rust where they have weird limitations and aren't just text substitution. (Of course C style text substitution has many many problems of its own, so.)
  • Oh, and you could always just turn the whole struct into a Map<String, Foo> (on one side at least), but that just makes the rest of the code feel weird and out of place. (This is the fallacy that people who call dynamic typing "unityping" tend to fall into [not that the term implies the fallacy, but the blog post that started it rather strongly does]: You can treat dynamic typing as a special case of static typing, dynamically typed objects as a special case of hash maps, but unless you actually go the whole hog and use that subset of the language for your code [which is probably syntactically ugly at the least], instances where it could make some bit of code nicer are nearly impossible to actually realize.)
  • By the way, I glossed over something: in a static language, you probably want static performance, right? The above describes ways of handling the fields at runtime, but it's sure tempting to find some way to have it all compile down to what you would manually write, which is nice and elegant except it would probably make the whole thing 5x as complicated. I mean, I know it's a stretch to say dynamic languages being slow is a benefit, but in a small way it is, if it means you've already given up the hope of speed (especially in purely interpreted langs like Python) and don't have to worry about your code living up to performance standards...

You can certainly do it in any of the semi-problematic ways above, but if it's uglier than just doing reading and writing in the dumb non-abstract way, you probably won't; you'll do the easy thing and it will make the code "better", within the confines of that language, but worse compared to a theoretical optimum.

And in most dynamically typed languages this is super easy. There is no need to decide whether to do it the easier[?] way (boilerplate) or the elegant way (abstract field access) because the elegant way is easy, just two function calls:

getattr(obj, field)
setattr(obj, field, value)
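
As a hedged sketch of how those two calls could serve both directions from a single field list (the field names and helper functions below are hypothetical, loosely modelled on the parser described above, and are not the actual code):

```python
from types import SimpleNamespace

# Hypothetical (native_field, format_field) pairs, declared once.
FIELD_MAP = [
    ("modtab_offset", "modtaboff"),
    ("modtab_count", "nmodtab"),
]


def copy_fields(src, dst, pairs):
    # Fields are just strings, so one loop handles every mapping.
    for src_field, dst_field in pairs:
        setattr(dst, dst_field, getattr(src, src_field))


def read_into(native, raw):
    # Reading: format struct -> native struct (pairs are flipped).
    copy_fields(raw, native, [(fmt, nat) for nat, fmt in FIELD_MAP])


def write_from(native, raw):
    # Writing: native struct -> format struct.
    copy_fields(native, raw, FIELD_MAP)


raw_in = SimpleNamespace(modtaboff=64, nmodtab=3)
native = SimpleNamespace()
read_into(native, raw_in)

native.modtab_count = 4
raw_out = SimpleNamespace()
write_from(native, raw_out)
```

A misspelled field name in FIELD_MAP would only surface as an AttributeError when that mapping is exercised, which is the cost of skipping the compiler.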

And it's not just field access; you get analogous, rarer but more dramatic comparisons with other things like boilerplate object shapes or functions (in the many cases that generics don't cover). With a dynamic language I can say: "I will never do something gross just because it pleases the compiler; I will take a hard line on boilerplate." Sure, you're giving up the myriad ways compilers can help you if you follow their wishes. But you can say that.

Another way of saying this: Dynamic languages have a more expressive type system than static languages. It's just that you have to do all the type checking in your head. :)

Anyway, I did end up using a macro; I'm not saying they're that arcane or evil (though I've seen a few different people in the Rust mailing list clamor for their removal on roughly those grounds), just a bit steeper of a hill than I'd like. In my particular case it wouldn't have been that bad except for some Rust-specific issues which aren't really pertinent. Simple enough looking:

lbit!(dyld_weak_bind, LC_DYLD_INFO, dyld_info_command, weak_bind_off, weak_bind_size, 1),

For the record, I'm not saying this issue somehow makes the advantages of static typing irrelevant. It has various advantages and neat features even in the small, and I find it credible (don't have much personal experience) that coding in the large, type checking's assurances of (at least basic) consistency are indispensable. (This conflict is one reason I find gradual type systems interesting.) I am merely stating one aspect of the appeal of dynamic typing; there are a few others.

* Or Haskell lenses, though AFAIK they currently require the comparatively unpopular Template Haskell to actually autogenerate a lens for each record field. I think lenses are pretty neat.

ep1032 commented May 11, 2015

This is a great gist, but one thing I think that "where's the fire" nearly got, but stopped short of, is that dynamically typed languages by default have less of an impedance mismatch with business logic than statically typed languages do.

This is an idea that Douglas Crockford talks about a lot. As we all know, when you're working in a statically typed OOP language (I'll leave out functional languages, because I have little experience with them), your programming process is greatly defined by identifying areas of responsibility and delegating that work into different objects, each of which can be understood independently (because encapsulation!).

That is great, and has a lot of advantages. But it also means that from the first moment code is written, the programmer needs to think greatly about what the structure of the application should be (which is often the worst time to do so, as the programmer is likely least knowledgeable about the subject matter in the beginning). This, of course, can result in rework and re-factoring as they become more knowledgeable, and original assumptions are corrected.

More importantly, it means that every programmer that later works with or modifies the solution has to go through the additional mental process of "I understand what the changes are in the logic that need to occur, now how do I translate that into our application's structure?", which now includes all the type choices that were made.

A single change within a statically typed language can result in a cascade of type changes through multiple classes (hopefully not). The realization that two fields should be passed around together requires type updating through a solution. Every time a subclass is required, significant thought needs to be put into making sure the original base class is indeed still a good abstraction given the new information you have about its necessary child types. Etc.

By contrast, in a dynamically typed language, neither of the first two problems exist, as a simple change to an object will propagate through the solution by default. And the third problem is language dependent, depending on how your dynamically typed language handles classes and inheritance.

The result of the above few paragraphs is that while a statically typed language requires the programmer to start making assumptions about areas of responsibility, encapsulation, and project structure from the first moment of programming, in a dynamically typed language, the programmer is free to concentrate first on getting the initial business logic right, as these sorts of responsibilities are easier to put off until slightly later in the code writing process, as rapid type changes or architecture changes are less strongly tied into the code.

I'm not arguing that one approach is better than the other, of course. And there are ways to handle the benefits and challenges of the above from both sides of the coin. But it is an interesting difference.


What is an "exotic type"?

non commented May 13, 2015

@chris-martin For the purposes of this essay, I just meant a type a reader is unlikely to have encountered before, which is specific to the API in question and/or not obviously useful. I feel like I see this in Java APIs a lot. For example, think of all the classes that live in and now consider an author who just wants to iterate over the lines in a file. I would argue that many of the types involved in the "correct" Java/Scala solution feel exotic, at least compared to:

for line in open(path, 'r'):

Obviously you can write a Scala library that makes this particular example easier (I wrote one myself) but the argument there is that statically-typed languages may be more prone to this issue.

That said, of course the data science ecosystem on the JVM is much more limited - yes, you can use JavaCV, NLP libraries like Stanford CoreNLP, etc., but there's no real scikit-learn equivalent, etc. But, as an ML engineer, I'd be tasked with building that stuff, which is different from consuming it the way data scientists do.

@mkolod your thoughts exactly echo my own. I use Python mostly, but for at least a good portion of the time I would prefer to be writing in Scala. However, the libraries for Scala/JVM for data science aren't as well oiled as Python ones. A good example is the seamless integration between pandas, sklearn, matplotlib, and seaborn.



This reads more like a comparison between structural and nominal typing than between static and dynamic typing.

From a purely rational PoV sure, but then you get languages like Go (which is a statically compiled structural language) that get a huge amount of flak for similar reasons.


@nuttycom, I wonder why nobody notices an elephant in the room - refactoring code in dynamic languages. Maybe people don't know what refactoring is and why it is important? I am sure you do know, but to those who don't, let me give you a simple example.

Say, you have a project of 100 classes. One of the classes has a property "id". You realize that it should be called "position" not "id" and you want to rename it. Safely. However there are 4 other classes with property "id" and in all those classes the name "id" is correct. It means that you cannot just "search/replace" id in the entire project. Oh, also important: all those five classes are used everywhere, in all 100 others. Now then: how do you rename it safely in a dynamic language? You can try, then run the program and if you're lucky - you'll see the errors. But what if there is one piece of code that uses this "id" somewhere in the menu you rarely access? I know how to do it in a typed language: go ahead and just rename the property, then hit "compile" and see all the errors, go fix them one by one and you're done.
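
The scenario can be sketched in a few lines of Python (the class and function names are hypothetical): after the rename, the commonly used path works, while the rarely used path still mentions the old name and fails only when it is finally executed.

```python
class Item:
    def __init__(self, position):
        self.position = position  # this attribute used to be called "id"


def render_main(item):
    # Updated during the rename.
    return f"item at {item.position}"


def render_rare_menu(item):
    # Missed during the rename: nothing complains until this runs.
    return f"item #{item.id}"


item = Item(3)
main_view = render_main(item)

try:
    render_rare_menu(item)
    stale_error = None
except AttributeError as exc:
    stale_error = str(exc)
```

Importing the module and running the main path both succeed; only a test or a user that reaches render_rare_menu exposes the stale reference.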

And it's only the simplest refactoring case. There is also "extract class", "inline class", "duplicated code removal", etcetera, etcetera.

So, maybe there is some magic tool to do it safely in a dynamic language that I am not familiar with? Please share.

Or maybe, people think "refactoring is not that important"? I beg to differ, because when you deal with big projects, refactoring is inevitable at some point.

@nuttycom, do you have an answer for it?
