Skip to content

Instantly share code, notes, and snippets.

@cgag
Forked from garybernhardt/gist:44b7063110fc423edb4d
Last active August 29, 2015 14:25
Show Gist options
  • Save cgag/1387e2f83abc578073e3 to your computer and use it in GitHub Desktop.
Save cgag/1387e2f83abc578073e3 to your computer and use it in GitHub Desktop.

Things that programmers don't know but should

(A book that I might eventually write!)

Gary Bernhardt

I imagine each of these chapters being about 2,000 words, making the whole book about the size of a small novel. For comparison, articles in large papers like the New York Times average about 1,200 words. Each topic gets whatever level of detail I can fit into that space. For simple topics, that's a lot of space: I can probably walk through a very basic, but working, implementation of the IP protocol. More subtle topics will get less detail: for RSA, maybe I'd (1) give a rough sketch of the number theory, (2) show that we can treat arbitrarily large byte arrays as numbers, and (3) say "combine them in the obvious way".

1. Computation

  • Computing by changing: Turing machines
  • Computing by constructing: lambda calculus
  • Turing equivalence and the Church-Turing thesis
  • Limits of computation: the halting problem and computability
  • Languages for computers: the chomsky hierarchy

2. Hardware

  • Computers are made of rocks: transistors, logic gates, and adders
  • Processors: registers, fetch-decode-execute, pipelines
  • Memory: memory hierachy and the MMU
  • Spinning rust: HDDs and SSDs

3. Kernels

  • Isolating computations: processes
  • Ordering computations: CPU scheduling
  • Kernel space vs. user space: system calls; memory allocation
  • Interfacing to the world: drivers and HALs

4. Fundamental networking

  • Addressing computers: IP
  • Fire-and-forget: UDP
  • Reliable delivery: TCP
  • Wiring computers together: Ethernet

5. Application networking

  • Email: SMTP
  • The web: HTTP
  • Domain names: DNS

6. Cryptography

  • The workhorses of modern cryptography: hash functions
  • One key: symmetric cryptography
  • Two keys: public key cryptography

7. Data integrity

  • Centralized data integrity: the relational model and normalization
  • Integrity with clusters of perfect computers: two-phase commit
  • Decentralized integrity with unreliable computers: eventual consistency and CRDTs

8. Decentralization

  • The first decentralized network: Gnutella
  • The highest-volume application protocol: BitTorrent
  • Distributed, high-performance, resilient data storage: Kademlia
  • Consensus between untrusted parties: Block chains

9. Holy grails

  • Attempts at omni-protocols: CORBA, COM, WS-*
  • Attempts at omni-virtual-machines: Parrot, JVM, ASM
  • The three great programming languages: Lisp, ML, C

I wrote this index out in full in about an hour, along with a couple more sections not shown here. When I finished and read through it, I realized that most of it corresponds directly to the standard computer science curriculum. In order, the chapters correspond to these standard courses:

  1. Automata and computability (sometimes called "theoretical computer science")
  2. Computer architecture
  3. Operating systems
  4. Networks
  5. Networks again, sort of
  6. Cryptography
  7. Databases
  8. (No equivalent)
  9. (No equivalent)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment