njt/hints-for-csd.md Secret

## hints-for-csd.md

      
    Raw
  

              hints-for-csd.md
            
          
    #Hints For Computer System Design, by Butler W. Lampson
[Hints for Computer System Design http://research.microsoft.com/en-us/um/people/blampson/33-Hints/WebPage.html]
This article is a collection of advice on writing good systems.  It's written in the early 80s from the feel of things: examples are the Alto, Tenex, Multics, etc.  Examples are low-level compared to modern papers (CPU, tape-disk-RAM optimisation, compilers, etc.) and very reminiscent of what I encountered at uni in early 90s.
The paper explores the separation between interface and implementation.  Of note:


consumers of an interface have expectations of performance, which are rarely documented. ("reasonable" is usually not documented anywhere).


service must have a fairly predictable cost, and the interface must not promise more than the implementer knows how to deliver. Especially, it should not promise features needed by only a few clients, unless the implementer knows how to provide them without penalizing others This is the 80s coder version of the product management dictum to keep scope down so that you can maximise delight.


There's a sweet story of the Tenex system raising exceptions ("traps") when accessing invalid virtual pages, and the password checking function for disk access checking the password one char at a time. Pass the checker a string with just one character in a good page, and the next page invalid. If the first char matches, you'll get the page trap. Repeat with two chars in a good page before an invalid next page ... and you've got a much more efficient password cracker than the system designers intended (64_n_ vs 128(n/2))


Example of picking the wrong interface: GetiThField to return the ith field in a document, and search was built by calling GetiThField(0) ... (1) ... (2) ...


Make it fast, rather than general or powerful. I hadn't heard that breakdown before. (Not sure what the difference between general and powerful is)


When a low level of abstraction allows something to be done quickly, higher levels should not bury this power inside something more general. The purpose of abstractions is to conceal undesirable properties; desirable ones should not be hidden. Example is the Alto disk hardware which has a stream mode for


When a low level of abstraction allows something to be done quickly, higher levels should not bury this power inside something more general. The purpose of abstractions is to conceal undesirable properties; desirable ones should not be hidden.  I feel like this turns on "desirable" - the example they give (Altos disk drive streaming a cylinder into memory) isn't widespread today, so perhaps it's not as desirable so much as cute?


"procedure arguments" is the phrase for passing a function pointer into a function so you can have generalised functions.  But the example they give is the Spy monitoring system at Berkeley which did some basic checks for evil, then installed user-supplied patches into the supervisor (kernel). Modern eyes water where 80s eyes thought that was clever.


DSL (a "specialized language") is alternative to function pointers. specialized language, however, if it is more amenable to static analysis for optimization.  I'm not sure optimization is the primary goal in modern systems, at least not for performance: now we're much more concerned with security, validity, reliability, ...


I like the idea that locks are successful because they're incredibly simple, and programmes around those simple things do the real work.


When a system grows to more than 250K lines of code the amount of change becomes intolerable; even when there is no doubt about what has to be done, it takes too long to do it. There is no choice but to break the system into smaller pieces related only by interfaces that are stable for years. Traditionally only the interface defined by a programming language or operating system kernel is this stable.


"Plan to throw one away" is common advice, but the author relates it to stealing from/learning from other systems -- you don't necessarily have to have written the system you throw away.


One way to improve performance is to increase the number of assumptions that one part of a system makes about another the idea that secrecy of implementation is often waived because knowledge of innards permits faster code (e.g., if a routine always produces an ordered list, then searching that list can be log n)


Divide and conquer as a way to manage scarce resources: When resources are limited the method takes a slightly different form: bite off as much as will fit, leaving the rest for another iteration.


Even in 1983 the problems of distributed systems were known: A small amount of data can easily be replicated locally by writing it on two or more disk drives [28]. When the amount of data is large or the data must be recorded on separate machines, it is not easy to ensure that the copies are always the same.


The normal case must be fast. The worst case must make some progress. They talk about crashing the system if a rare bug happens, because better to have it mostly fast and occasionally down than always down.


Many attempts have been made to analyze programs after the fact and optimize the disk transfers, but as far as I know this has never worked. The dynamic analysis done by demand paging is always at least as good.  I found this interesting: big data vs fairness. Also comes up in time-switching: rather than try to budget different slice times to different applications dynamically and "intelligently" to maximise use of CPU, give same size timeslices to all and it's easier. The overhead of decision-making (and the accuracy of past vs now) against ongoing costs.


I know what caches are, but hints are interest. Hints are stashed values that MIGHT be useful, and there's a cheap test to see if they are. E.g., instead of caching every return value by all arguments, if the function is often called in a row with the same arguments then stash the last return value, check args against those from last call, and reuse the response if so.


A cache has three values.  (1) parameters corresponding to (2) a return value, and (3) when it should be invalidated.


(work in progress)


{Nat Torkington - 20160720}
nathan@torkington.com | https://github.com/njt | @gnat