p31 Claude Shannon systematized use of binary numbers into math/logic that computers use<div><br><div><div>well known for founding&nbsp;digital circuit&nbsp;design theory in 1937 (wikipedia)<br></div></div></div>
p31 Ted Nelson "Description of computer ""a box that follows a plan""<div><br></div><div><b>Theodor Holm Nelson</b>&nbsp;(born June 17, 1937) is an American pioneer of&nbsp;information technology, philosopher and sociologist. He coined the terms&nbsp;<i>hypertext</i>&nbsp;and&nbsp;<i>hypermedia</i>&nbsp;in 1963 and published them in 1965. (wikipedia)<br></div>"
p32 harvard architecture Mark1, early computer (1944), stored instructions and data separately
p32 protean <div><div><div><i>adjective</i></div><div><b></b></div></div></div><div><div><div><div><div><div>tending or able to change frequently or easily.</div></div></div></div></div></div>
p32 John von Neumann Guy who had insight that computer programs are data and should be stored in same memory system
p33 system memory ~ long row of storage compartments for data. Each compartment is addressed.
p34 bitness size of internal data word and operations<div><br></div><div>32 bit systems = 4 bytes</div><div>64 bit systems = 8 bytes</div>
p34 registers storage locations on CPU<div><br></div><div>each holds single value</div>
p35 program counter register special purpose register<div><br></div><div>holds the address of the next machine instruction to be brought in from memory for execution</div>
p35 status register (aka flags register) holds value divided into single bits or groups of bits<div><br></div><div>holds status (1,0) of something CPU has just done</div>
p35 stack pointer register "holds address of ""last-in-first-out"" stack"
p35 accumulator register holds result of arithmetic and logical operations
p36 system bus <div>pathway between CPU and memory</div><div><br></div><div>consists of electrical conductors (lines), each carries 1 bit of info</div>
p36 instruction sets group of machine instructions the CPU understands<div><br></div><div>specific to family of CPUs (intel, ARM)</div>
p42 Radix/Base common typographical conventions "Binary:<div>011010B - trailing B (or b)</div><div>0b011010 - leading 0b</div><div>""%011010"" - leading %</div><div><br></div><div>F2E5H - trailing (H)</div><div>$F2E5 - leading $</div><div>0xF2E5 - leading 0x</div>"
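The notations on this card can be normalized mechanically. A minimal Python sketch, assuming only the six conventions listed above; `parse_literal` is a hypothetical helper name used for illustration, not something from the book:

```python
def parse_literal(text):
    """Parse a binary or hex literal written in one of the conventions above."""
    t = text.strip()
    lower = t.lower()
    # Prefix forms are checked first because they are unambiguous.
    if lower.startswith("0x"):
        return int(t[2:], 16)
    if lower.startswith("0b"):
        return int(t[2:], 2)
    if t.startswith("$"):
        return int(t[1:], 16)
    if t.startswith("%"):
        return int(t[1:], 2)
    # Suffix forms: H is tested before B because B is also a hex digit.
    if lower.endswith("h"):
        return int(t[:-1], 16)
    if lower.endswith("b"):
        return int(t[:-1], 2)
    raise ValueError(f"unrecognized literal: {text!r}")


for literal in ["011010B", "0b011010", "%011010", "F2E5H", "$F2E5", "0xF2E5"]:
    print(literal, "->", parse_literal(literal))
```

The suffix forms are inherently ambiguous (a hex value written with a leading `0B` and a trailing `H` would be misread here), which is one reason the prefix forms are easier to parse.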
p46 loadable kernel modules mechanism linux kernel adapts to different hardware<div><br></div><div>ex: device drivers, filesystems</div>
p46 ARM custom CPUs feature ARM hardware allows chip designer to create custom CPUs<div><br></div><div>Cortex-A15 supports arbitrary numbers of cores in clusters of four</div>
p48 flip-flops single bit storage units of early computers consisting of vacuum tubes
p50 remanence Ability to retain a magnetic field over time
p50 coercivity energy required to change a magnetic field
p50 each core of magnetic core memory plane contains (4) x wire - one dimension to select a core from a plane<div>y wire - one dimension to select a core from a plane</div><div>sense wire - allows system to sense state of core</div><div>inhibit wire - allows system to set state of core</div>
p52 destructive read type of magnetic core memory operation, forces value at x/y indexed core to 0 when reading
p53 Jack Kilby <div>Guy who invented integrated circuit by adding resistors and wafers on silicon</div><b><div><b><br></b></div>Jack St. Clair Kilby</b>&nbsp;(November 8, 1923 – June 20, 2005) was an American electrical engineer who took part (along with&nbsp;Robert Noyce) in the realization of the first&nbsp;integrated circuit&nbsp;while working at&nbsp;Texas Instruments&nbsp;(TI) in 1958.
p53 field-effect devices Devices in which electron flow is controlled by electric fields<br><br>opposed to older BJT (bipolar junction transistor) which use small current flows to control larger current flows
p53 flip-flop logic circuit with an output that can be in one of two states<div>can be switched between states by pulse or voltage change on an input</div>
p54 memory addressing job done by circuitry to convert binary memory addresses to x/y coordinates of memory cells on a matrix<div><br></div><div>address lines, data lines, control lines</div>
p55 decoder logic element that accepts a binary number as input and selects one output line
p56 bus group of digital lines connecting a memory system of any kind to a computer
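A decoder's behavior can be modeled as a one-hot selection. This is a sketch of the logic only, not of any particular chip; the function name `decode` is made up for illustration:

```python
def decode(address, n_bits):
    """Model an n-bit decoder: exactly one of 2**n_bits output lines goes high."""
    if not 0 <= address < (1 << n_bits):
        raise ValueError("address does not fit in n_bits")
    return [1 if line == address else 0 for line in range(1 << n_bits)]


# A 2-bit address selects one of four output lines.
print(decode(2, 2))
```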
"p57 memory chip ""depth""" number of storage locations in a memory chip or system
"p57 memory chip ""width""" number of bits at each storage location in a memory chip or system
p59 Robert H Dennard 1968 invented DRAM using minuscule capacitors to store bits as presence/absence of charge
p59 DRAM "Dynamic Random Access Memory<div><br></div><div>Dynamic because made with capacitors which leak charge and therefore have to be ""refreshed"" periodically</div>"
p60 DRAM cells contain (2) single MOS transistor<div>single capacitor</div><div><br></div>
p60 transistor has 3 connections - the gate: electrical switch toggle that either connects the source to the drain or insulates them<div>- the source</div><div>- the drain</div>
p60 word line a common connection to all cell transistor gates
p60 bit line "common connection to all cells in a DRAM column transistor drain leads<div><br></div><div>end of each column bit line has ""sense amplifier"" which helps detect presence/absence of charge&nbsp; (0/1)</div>"
p62 DRAM row refreshed every - anytime cell is read<div>- every 5-50 milliseconds to prevent electron leakage</div>
p62 access time time between the moment a memory access is requested by the CPU and the moment the access is completed
p62 bandwidth amount of data transferred to or from memory per unit time
p62 CAS Column Address Strobe
p62 RAS Row Address Strobe
p62 WE write enable command line
p62 OE output enable command line
p63 SDRAM synchronous DRAM<div><br></div><div>key innovation splitting DRAM cell matrix into independent banks</div>
p63 pipelining SDRAM technique of overlapping operations on multiple banks<div>keeps data bus always busy</div>
p64 SDRAM column width typically 8, 16, 32 bits. Read from/written to as a unit.
p64 SIMMs single in-line memory modules<div><br></div><div>existed until late 1990's</div><div><br></div><div>can transfer 32 bits to/from data bus at one time</div>
p64 DIMMs dual in-line memory modules<div><br></div><div>dominated since 2000</div><div><br></div><div>can transfer 64 bits to or from the data bus at one time</div>
p64 SODIMM smaller DIMM used for compactness in laptops<div><br></div><div>72 pin 32 bits wide</div><div>144 pin 64 bits wide</div>
p65 rank each side of a DIMM<div><br></div><div>bus-addressable memory block</div><div><br></div><div>a group of memory chips sharing the same chip-select control line</div>
p66 SDR SDRAM Single Data Rate SDRAM<div><br></div><div>first generation of SDRAM</div><div><br></div><div>DDR SDRAM is Double Data Rate SDRAM</div>
p66 edge rate number of times a wire can change from 0 to 1 in a second
p67 How to sensibly interface higher internal bandwidth of SDRAM to slower bus require that memory accesses occur as short bursts running from a starting address to some number of adjacent addresses<div><br></div><div><br></div>
p67 prefetching process where when SDRAM has read a column, subsequent columns can be read without requiring another time-consuming access to the array<div><br></div><div>this works out because sequential reads are most common&nbsp;</div>
p69 ECC error correcting code, prevent memory corruption from background radiation
p69 Richard Hamming "Invented Hamming code in 1950<div><br></div><div>Hamming code is mechanism used in modern computers to detect simultaneous ""bad bits"" in 64 bit words</div><div>SECDED single error correcting and double error detecting</div>"
p70 LPDDR2 single-ended (unterminated) buses "power saving feature. Eliminates power loss in terminator resistors used by ""regular"" DDR memory. Downside is lower bus speed."
p71 self-refresh "allows memory controller to delegate task of refreshing the arrays to the SDRAM itself<div><br></div><div>lets CPU and other stuff go into ""deep sleep""</div>"
p71 ball-grid array type of IC packaging<div><br></div><div>industrial process very precisely aligns tiny balls of solder and IC connectors</div>
p72 data cache block of fast memory lying between CPU and system memory<div><br></div><div><br></div>
p72 locality of reference general principle in computer science&nbsp;<div><br></div><div>computer operations tend to cluster together</div><div><br></div><div>1. same data accessed now will probably be accessed again in the future</div><div>2. over short spans of time, reads/writes tend to cluster in same general area of memory</div><div>3. memory locations tend to be accessed sequentially</div>
p74 cache lines unit fixed-size blocks memory is read and written with cache<div><br></div><div>RPi it's 32 bytes</div><div><br></div><div><br></div>
p74 cache tag "each cache line has a ""cache tag"" -- allows cache controller to determine where in system memory the cache line came from"
p75 valid and dirty bits two 1 bit flags stored in each cache line<div><br></div><div>valid: whether valid data is present in that cache line</div><div>dirty: indicates some data in the line has been changed and needs to be written back to memory</div>
p76 direct mapping simplest cache mapping technique, uses modulo mapping of memory block index to cache index
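The modulo mapping can be sketched in a few lines of Python. The 32-byte line size is the RPi figure from the cache lines card; the number of lines (512) is an assumption picked for illustration, and `direct_map` is a hypothetical name:

```python
LINE_SIZE = 32    # bytes per cache line (RPi figure from these notes)
NUM_LINES = 512   # assumed cache size in lines, for illustration only


def direct_map(address):
    """Split a byte address into (tag, cache index, offset within line)."""
    block = address // LINE_SIZE   # index of the memory block
    index = block % NUM_LINES      # modulo mapping picks the cache line
    tag = block // NUM_LINES       # tag records which block occupies that line
    offset = address % LINE_SIZE
    return tag, index, offset


# Two addresses exactly one cache-size apart land on the same line with
# different tags, which is the situation that can lead to thrashing.
print(direct_map(0x0000))
print(direct_map(LINE_SIZE * NUM_LINES))
```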
p78 thrashing failure of cache, repeated fetches from system memory
p79 associative mapping memory stored at cache addressed by a hash of the memory contents<div><br></div><div>all locations searched in parallel</div><div><br></div><div>kinda slow, uses a lot of CPU die space</div>
p80 set-associative cache "organizes cache lines into sets, most common today<div><br></div><div>system memory addresses modulo mapped to ""set addresses"" -&gt; then blocks in set checked in parallel for match</div>"
p81 cache write policies "two general approaches to keeping cache and memory consistent<div>write-through - anytime data word in cache line is changed, cache line is written immediately</div><div>write-back - ""dirty"" line written back to memory only when evicted</div>"
p81 write-through any time data word in cache line changed, written immediately
p81 write-back """dirty"" cache line written back to memory only when evicted"
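The difference between the two policies shows up in how many writes reach system memory. A toy sketch, assuming a cache with a single line; the class `OneLineCache` is invented for illustration and does not model a real cache controller:

```python
class OneLineCache:
    """Toy single-line cache that counts writes reaching system memory."""

    def __init__(self, write_back):
        self.write_back = write_back
        self.tag = None
        self.dirty = False
        self.memory_writes = 0

    def write(self, tag):
        if self.tag != tag:
            self.evict()              # make room for the new line
            self.tag = tag
        if self.write_back:
            self.dirty = True         # defer the memory write until eviction
        else:
            self.memory_writes += 1   # write-through: memory updated every time

    def evict(self):
        if self.dirty:
            self.memory_writes += 1   # dirty line is written back exactly once
            self.dirty = False
        self.tag = None


write_through = OneLineCache(write_back=False)
write_back = OneLineCache(write_back=True)
for _ in range(10):
    write_through.write(7)
    write_back.write(7)
write_back.evict()
print(write_through.memory_writes, write_back.memory_writes)
```

Ten writes to the same line cost ten memory writes under write-through but only one under write-back, at the price of memory being stale until the eviction.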
p82 pages virtual address space (view of memory) divided into sections (sometimes 4kb)
p82 paging process by which pages are traded between system memory and disk
p82 pagefile Area on disk dedicated to storing pages
p83 page table structure in system memory, describes address space of process<div><br><div>each entry: one page, what frame (if any), what operations allowed</div></div><div><br></div><div>if page swapped out, marked invalid. (access leads to page fault)</div>
p88 multi-level page table two-level page tables used by processes to save space<div>most significant 10 bits point to entry in first level which then points to second level</div>
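The address split can be sketched as bit manipulation. This assumes a 32-bit virtual address, 4 KB pages (12 offset bits) and a 10/10/12 split; the top 10 bits come from the card above, while the second-level width is an assumption and real ARM page tables use different field widths:

```python
def split_virtual_address(va):
    """Split a 32-bit virtual address into (level-1 index, level-2 index, offset)."""
    level1 = (va >> 22) & 0x3FF   # most significant 10 bits: first-level entry
    level2 = (va >> 12) & 0x3FF   # next 10 bits: entry in the second-level table
    offset = va & 0xFFF           # low 12 bits: byte within the 4 KB page
    return level1, level2, offset


print(split_virtual_address(0x00403004))
```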
p88 TLB translation lookaside buffer<div>fully/highly associative cache in the processor used with two-level page table lookups</div>
p94 Federico Faggin Guy at intel who led team that developed 4004 microprocessor, first commercial mass-produced single chip CPU
p94 altair 8800 first truly useful personal computer, used intel 8080 chip
p96 four most basic logic gates NOT<div>AND<br>OR<br>XOR</div>
p96 CMOS complementary metal-oxide semiconductor
p96 NMOS N-channel Metal Oxide Semiconductor<div><br></div><div>conduct when their gate input is high +V</div>
p96 PMOS P-channel Metal Oxide Semiconductor<div><br></div><div>conduct when their gate input is low 0V</div>
p97 propagation delay of logic path logic gates impose a delay, time for output to respond to change<div><br></div><div>propagation delay of logic path is sum of delays on longest path from input to output</div>
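The "sum of delays on the longest path" rule can be sketched as a recursive walk over a gate network. The circuit and its delays below are made up for illustration:

```python
# Hypothetical circuit: gate name -> (delay in ns, names of gates feeding it).
CIRCUIT = {
    "not1": (1, []),
    "and1": (2, ["not1"]),
    "or1": (2, []),
    "xor1": (3, ["and1", "or1"]),
}


def path_delay(gate, circuit=CIRCUIT):
    """Delay from the primary inputs to this gate's output along the slowest path."""
    delay, inputs = circuit[gate]
    slowest_input = max((path_delay(g, circuit) for g in inputs), default=0)
    return delay + slowest_input


# The slowest route to xor1 is not1 -> and1 -> xor1.
print(path_delay("xor1"))
```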
p100 CPU instruction contains (2) operation code (opcode)<div>1+ operands</div><div><br></div><div><br></div>
p100 program counter pointer inside CPU that contains the memory address of the currently executing instruction
p101 branch instructions instructions that jump forward/backward in sequence of machine instructions
p103 base pointer location in memory where a stack begins
p104 subroutine sequence of actions in a program that is executed as a group and given a name
p105 microcode way machine instructions were implemented in early decades of computing<div><br></div><div>each machine instruction composed of combinations of microinstructions (saved space, slower)</div>
p106 pipelining overlapping execution of machine instructions for higher throughput in CPU
p111 branch prediction and speculative execution techniques CPUs use to avoid pipeline control hazards
p111 modified harvard architecture splitting level 1 cache into two separate caches, one for data and one for machine instructions<div><br></div><div>avoids pipeline control hazards / resource conflicts</div>
p112 interlock "instruction decode logic hardware that detects and resolves data dependence and resource conflict pipeline hazards<div><br></div><div>inserts ""bubble"" into pipeline to delay if hazard present</div>"
p112 ARM11 pipeline stages consists of 8 stages and 3 possible paths<div><br></div><div>- integer execution path</div><div>- multiply-accumulate path</div><div>- load/store path</div><div><br></div>
p114 superscalar execution CPU architecture pipeline with parallelism (performs 2 pipeline steps in same cycle)
p115 out-of-order execution feature of modern superscalar CPUs to dynamically reorder incoming instruction stream to maximize parallelism and avoid interlocks
p115 SIMD instructions single instruction, multiple data<div><br></div><div>parallelism technique</div><div><br></div><div>instructions that operate on multiple data items at once</div>
p116 vectors "one dimensional array of data values arranged such that a given architecture's SIMD instructions can act on them<div><br></div><div>(opposed to ""scalars"" or single values)</div><div><br></div><div>typically 2-16 values in length, varying width</div>"
p116 key SIMD benefit SIMD explicitly declares computations are independent and therefore avoids need for expensive interlock logic
p118 endianness little endian, least significant byte stored at lowest address in memory<div><br></div><div>big endian, least significant byte stored at highest address in memory</div>
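Byte order is easy to see by serializing one 32-bit value both ways. A short sketch using Python's built-in `int.to_bytes`; the value 0x12345678 is arbitrary:

```python
value = 0x12345678

# Index 0 of each bytes object corresponds to the lowest memory address.
little = value.to_bytes(4, "little")
big = value.to_bytes(4, "big")

print("little:", little.hex(" "))
print("big:   ", big.hex(" "))

# Reading the bytes back with the wrong byte order yields a different number.
misread = int.from_bytes(little, "big")
print(hex(misread))
```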
p119 abstruse difficult to understand; obscure.
p119 common endiannesses networks: big endian<div>most microprocessors: little</div><div>ARM11: bi-endianness</div>
p121 key RISC advantage run quickly, simple execution -&gt; easier to apply techniques like pipelining&nbsp;
p121 RISC growth RISC instruction sets have gradually grown and now have roughly the same number of instructions as CISC
p122 register file / register set title given to a CPU's registers as a group
p122 load/store architecture "CISC allowed direct system memory access<div>RISC remove ""memory-access powers"" so that instructions only act on registers</div><div><br></div><div>load/store architecture involves small cadre of machine instructions that only access memory</div>"
p121 RISC legacy - expanded register files<div>- load/store architecture</div><div>- orthogonal machine instructions</div><div>- separate caches for instructions and data</div>
p123 orthogonal machine instructions feature of RISC where all instructions are the same length<div><br></div><div>in 32 bit arch, always 1 32 bit word</div>
p123 separate caches for instructions and data slight divergence from John Von Neumann's principle that machine instructions and data should be stored together<div><br></div><div>performance advantages for having separate caches</div>
p124 ARM originally: Acorn RISC Machine<div>later: Advanced RISC Machine<br><div><br></div><div>microprocessor design developed by Acorn Computers in Cambridge, England</div><div><br></div><div><br></div></div>
"p125 ARM ""core""" in ARM universe, a core is a CPU that may be incorporated into a custom device (system-on-a-chip SoC)
p126 ARMv6 ISA sets (3) ARM, Jazelle, Thumb<div>ARM most frequently used</div><div><br></div><div>Jazelle allowed direct execution of bytecode, deprecated ~2011</div><div><br></div><div>Thumb is 16 bit architecture for low end devices. Generally used in embedded systems.</div>
p130 CPU protected mode provides operating system kernel with privileged access to system resources<div><br></div><div>prerequisite for implementing a true operating system</div><div><br></div><div><br></div>
p130 user mode user application execution CPU mode
p130 supervisor mode operating system kernel code execution mode
p131 interrupts signals from hardware devices outside the CPU indicating the device requires attention
p131 exceptions anomalous events within the CPU that require special handling (generally in cooperation with the OS)
p131 ARMv6 TrustZone "feature that uses ""system monitor"" CPU mode to create isolated memory regions (worlds)<div><br></div><div>used by DRM to prevent ""sniffing"" of unencrypted data</div>"
p131 kernel space the memory that contains the kernel and associated data
p131 userland the memory and software environment where user applications run
p132 link register "<div>used to execute fast subroutine calls using one of group of instructions called ""Branch with Link""</div><div><br></div>R14 on ARMv6"
p133 CPSR current program status register<div><br></div><div>stores info on what CPU is doing (or has recently done) at any particular instant</div><div><br></div><div>* too complicated for this book, mostly used by compilers and assemblers *</div>
p134 CPSR condition flags 5 bits used as tests by conditional branch instructions<div>NZCVQ</div>
p135 interrupt/exception (event) CPU handling process 1. CPU immediately changes processor modes<div>2. stores current program counter in link register</div><div>3. stores CPSR in SPSR (saved program status register)</div><div>4. sets program counter to address within the vector table</div>
p135 vector table table of 8 unconditional branch instructions that direct CPU to handlers<div><br></div><div>located at bottom or top of address space (why? speed? organization?)</div>
p136 purpose of banked registers used when exceptions/interrupts occur to store state required to resume user-mode program
p137 two types of interrupts regular (IRQ - Interrupt Request)<div>fast (FIQ - Fast Interrupt Request)</div>
p137 fast interrupt differences 1. handler code is located in vector table itself<div>2. occurs without needing to access system memory</div>
p137 software interrupts - doesn't immediately interrupt what CPU is doing<div>- subroutine call, enter supervisor mode in managed way, generally for communicating with OS kernel</div><div><br></div><div><br></div>
p138 interrupt priority handling 1 can be disabled (setting bits in CPSR (F disables FIQ, I disables IRQ))<div><br></div><div>2 higher priority exceptions can override (1-6 levels)</div><div><br></div><div>3 all interrupts of same priority automatically disabled when handler running</div><div><br></div><div>4 interrupts cannot be disabled by userland software</div>
p139 conditional instruction execution feature of ARM CPUs<div><br></div><div>4 bits of 32bit ARM instructions express condition codes</div><div><br></div><div>15 possible condition codes - correspond to condition flags in CPSR</div><div><br></div><div>2 advantages, makes some instructions unnecessary and helps avoid mispredicted branch execution disrupting the instruction pipeline</div>
p142 underflow/overflow overflow - value too large to express in 80 bits<div>underflow - value so small cannot be expressed in 80 bits</div><div>denormal - used to express values from underflows at lower precision</div>
p143 ARM coprocessor interface ARM11 has generalized coprocessor interface (up to 16 coprocessors)<div><br></div><div>coprocessor instructions are compiled into executables</div>
p142 coprocessor separate specialized execution unit, usually has instruction set of its own<div><br></div><div>Intel's 8087 was one of the earliest (featured in New Mind youtube video)</div>
p143 coprocessor interface core sends all instructions to all coprocessors, coprocessors decode and decide to ignore or handle<div><br></div><div>coprocessor sends signal if instruction accepted</div>
p144 TCM <div>Tightly Coupled Memory</div><div><br></div><div>The TCM is designed to provide low-latency memory that can be used by the processor without the unpredictability that is a feature of caches.</div><div><br></div><div>optional, not in RPi</div><div><br></div><div><br></div>
p143 ARM11 System Control Coprocessor - exposes large suite of registers used to config and control ARM core mechanisms like cache, direct memory access, MMU, TrustZone<div><br></div><div>- manages TCM if present</div><div><br></div>
p144 MRC/MCR MRC - move to ARM register from coprocessor<div><br></div><div>MCR - move to coprocessor from ARM register</div><div><br></div><div>machine code instructions for interfacing with coprocessors</div>
p144 floating point operations computer mathematics on fractional values
p144 VFP11 Vector Floating Point Coprocessor floating point coprocessor on ARM11 core<div><br></div><div>VFPv2 instruction set architecture - implements IEEE 754 standard binary floating point arithmetic</div>
p144 vector one dimensional array of same type data items
p145 multiply-and-accumulate specialized floating point operation used often in DSP<div><br></div>
p145 instruction emulation mechanism to handle coprocessor instructions when coprocessor in question isn't present<div><br></div><div>if instruction designated for nonpresent coprocessor, trigger undefined instruction exception - exception handler has subroutine to emulate</div>
p146 cortex family processors "4 broad ""profiles""<div>Cortex-R real-time embedded system service automobile, industrial controllers</div><div>Cortex-M small, inexpensive, low-power cores</div><div>Cortex-A (smartphones, tablets, e-book readers, etc)</div><div>SecureCore</div>"
p146 out-of-order execution (OOE) CPU can determine when machine instruction has to wait for operands, set aside until available, other instructions can execute in the meantime<div><br></div><div>dispatched - placed in queue after decoded</div><div>issued - sent to execution units</div>
p147 big.LITTLE feature in ARM where there are 2 types of cores<div>- high performance OOE, multi-issue</div><div>- lower performance in-order, single issue</div>
p148 NEON cortex ARM family coprocessor for SIMD
p148 lanes logical groupings of bits treated as separate quantities during SIMD math
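Lanes can be imitated in software by treating one 32-bit word as four independent 8-bit values. This is only a model of the idea, not NEON itself; `add_lanes_u8` is a hypothetical name:

```python
def add_lanes_u8(a, b):
    """Add four unsigned 8-bit lanes packed into 32-bit words, wrapping per lane."""
    result = 0
    for lane in range(4):
        shift = lane * 8
        lane_sum = ((a >> shift) & 0xFF) + ((b >> shift) & 0xFF)
        result |= (lane_sum & 0xFF) << shift   # a carry never leaks into the next lane
    return result


a, b = 0x01FF0203, 0x01010101
print(hex(add_lanes_u8(a, b)))       # lane-wise add: the 0xFF lane wraps to 0x00
print(hex((a + b) & 0xFFFFFFFF))     # ordinary add: the carry spills into the top byte
```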
p103 stack pointer "indicates memory location on stack to be accessed next (also called ""top of the stack"")"
p152 photolithography process process which uses short-wavelength UV light + set of photographic masks to chemically impose patterns on a silicon wafer
p152 resist (masking operation) photosensitive chemical that's applied to a wafer of silicon as part of the photolithography process
p152 doping silicon process where silicon wafer exposed to various chemicals to infuse small quantities of boron and phosphorus to alter chemical properties of silicon
p153 process geometry defining parameter of a fabrication process - the size of the smallest components created on the silicon die
"p153 digital logic expressed in silicon ""levels""" cell &lt; macrocell &lt; core<div><br></div><div>cell - single logical element (gate, inverter, flip flop)</div><div>macrocells - registers, adders, memory blocks, etc</div><div>cores - subsystem level, processors, caches, coprocessors</div>
p154 hard ip's blocks that have been tested and laid out for masks in certain geometry<div><br></div><div>sometimes licensed by design houses</div>
p154 HDL "hardware description language<div><br></div><div>how ""soft IP"" is delivered</div>"
p154 RTL Register Transfer Level<div><br></div><div>HDL expresses logic in abstract form called RTL</div><div><br></div><div>describes hardware in terms of registers (built from flip-flops) and combinatorial logic built from simple logic gates</div>
p154 Verilog and VHDL two popular HDL - Hardware Description Languages
p154 netlist IP that's synthesized into network of individual gates
p154 floorplanning creating a tentative layout for a SoC<div><br></div><div>commences after having finished netlist that defines the entire device both logically and electrically</div><div><br></div><div><br></div>
p155 routing final step of SoC design, creating data paths, clock distribution paths, power distribution paths
p155 AMBA Advanced Microcontroller Bus Architecture<div><br></div><div>standards for creating and reusing IP</div><div><br></div><div>de facto standard for on-chip buses, especially for SoC that implement ARM cores</div><div><br></div><div><br></div>
p156 AMBA protocols bus architecture definitions<div><br></div><div>each includes specs for physical connections between cores and logic that governs data movement</div>
p156 AXI Advanced Extensible Interface<div><br></div><div>part of AMBA3 spec</div><div><br></div><div>used in RPi</div>
p156 ready-valid signaling control of unidirectional data flow over a bus uses ready-valid signaling<div><br></div><div>upstream end asserts (set high, logic 1) a valid signal if has data</div><div>downstream end asserts (set high, logic 1) a ready signal if ready to accept data</div><div>^ transfer during clock cycle IFF both high</div>
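The handshake rule (a transfer happens in a clock cycle only when both signals are high) can be sketched as a cycle-by-cycle simulation. The function `simulate` and its inputs are invented for illustration and ignore everything else an AXI channel carries:

```python
from collections import deque


def simulate(words, ready_pattern):
    """Return (cycle, word) pairs for each transfer that completes."""
    pending = deque(words)
    log = []
    for cycle, ready in enumerate(ready_pattern):
        valid = bool(pending)      # upstream asserts valid while it has data
        if valid and ready:        # transfer only when both signals are high
            log.append((cycle, pending.popleft()))
    return log


# Downstream is not ready in cycles 0 and 3, so the data simply waits.
print(simulate(["a", "b", "c"], [0, 1, 1, 0, 1]))
```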
p156 5 AXI3 channels - read address channel<div>- read data channel</div><div>- write address channel</div><div>- write data channel</div><div>- write response channel</div>
p157 register slices "1/3 types of bus components<div><br></div><div>temporary memory for data moving through a bus</div><div><br></div><div>can be combined to allow ""pipelining"" of data, similarly to how pipelining works in CPUs</div>"
p157 arbiters 1/3 AXI3 bus components<div><br></div><div>merge multiple upstream buses into single downstream</div><div><br></div><div>allows multiple masters to interchange data with single slave</div><div><br></div><div><br></div>
p158 splitters 1/3 AXI3 bus components<div><br></div><div>divide single upstream bus into multiple downstream buses</div><div><br></div><div>allows single master to exchange data with multiple slaves</div>
p171 Dennis Ritchie Invented C language in 1972 while working at Bell Labs (replaced now vanished B)
p176 register allocation a process that rewrites intermediate compiled code to code that fits in the registers of the real cpu
p176 cross compilation compiler feature to generate code that targets different CPU architectures
p178 BNF Backus-Naur Form<div><br></div><div>Descriptive notation of set of rules that are used to build an AST</div>
p174 compiler preprocessing 1/7 text-based manipulation of incoming source code<div><br></div><div>remove comments, expand macros, conditional exclude stuff (debugging code), include files</div>
p175 compiler lexical analysis 2/7 compiler's lexer scans preprocessed code and identifies language features<br>
p175 compiler parser 3/7 scans stream of tokens outputted by lexer, checks if tokens follow structural rules of language<div><br></div><div>outputs an AST</div>
p175 compiler semantic analysis "4/7 compiler checks AST to ensure syntactically correct program is meaningful<div><br></div><div>ex: checks whether variables/constants of supported types used together in ways that make sense ex: ""false + a""</div>"
p176 compiler intermediate code generation "5/7 generates linear sequence of instructions that express logic of program<div><br></div><div>not generally machine code of target arch, instead artificial instruction set of a ""virtual machine"" that may have many registers (maybe as many registers as the program calls for)</div>"
p176 compiler optimisation 6/7 uses intermediate code to optimize, looks for code duplication and tries to rearrange intermediate code instructions -- faster, more compact
p176 compiler target code generation 7/7 converts intermediate code to sequence of native machine instructions that can execute on a specific CPU
p181 loop invariant computation in a loop that doesn't change per loop cycle (b*c example) - hoisted out of loop by simple optimization<div><br></div><div><br></div>
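Hoisting the b*c example can be shown as a before/after pair. Python itself does not perform this optimization; the second function is written by hand to show what an optimizing compiler would effectively produce, and both function names are made up:

```python
def before_hoisting(values, b, c):
    out = []
    for x in values:
        out.append(x + b * c)   # b * c is recomputed on every iteration
    return out


def after_hoisting(values, b, c):
    product = b * c             # loop invariant computed once, outside the loop
    out = []
    for x in values:
        out.append(x + product)
    return out


print(before_hoisting([1, 2, 3], 4, 5))
print(after_hoisting([1, 2, 3], 4, 5))
```

The hoisted value has to stay live for the whole loop, which is one way this optimization interacts with register pressure.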
p181 register pressure number of values that need to be remembered at any given point in the program
p182 induction variable elimination optimization, reduce intermediary operations and variables using induction
p183 register allocation step performed by target code generation<div><br></div><div>finding a register for each value computed by program between point defined and last point used</div>
p183 instruction scheduling performed by compiler target code generation - ordering machine instructions to avoid triggering interlocks
p184 linker combines object code files into an executable<div><br></div><div><br></div>
p198 two's complement notation used by vast majority of architectures<div><br></div><div>to represent negative number, write regular binary representation and invert every bit and add 1</div><div><br></div><div>00000011 3</div><div>11111101 -3</div>
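The invert-and-add-1 recipe can be checked with a short sketch. The 8-bit default width matches the example on the card; both function names are hypothetical:

```python
def to_twos_complement(value, bits=8):
    """Return the two's complement bit string of value at the given width."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError("value does not fit in the given width")
    mask = (1 << bits) - 1
    if value >= 0:
        return format(value, f"0{bits}b")
    inverted = (-value) ^ mask                    # invert every bit of the magnitude
    return format((inverted + 1) & mask, f"0{bits}b")   # then add 1


def from_twos_complement(text):
    """Interpret a bit string as a signed two's complement number."""
    value = int(text, 2)
    if text[0] == "1":                            # top bit set means negative
        value -= 1 << len(text)
    return value


print(to_twos_complement(3), to_twos_complement(-3))
print(from_twos_complement("11111101"))
```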
p199 mantissa significant bits of a floating point number
p199 exponent the magnitude of a floating point number
p199 IEEE 754 floating point number standard, 1985<div><br></div><div>defines several floating point formats that may be used in programming languages</div><div><br></div><div><br></div>
p144 why floating point dedicated coprocessors used a lot in scientific, engineering and games<div><br></div><div>since must express values with many significant figures, registers larger than 32 bits required</div>
p224 GCC GNU Compiler Collection
p224 dynamic dispatch used in statically typed object oriented languages to determine correct method to call by inspecting an object instance
p225 gcc build process "preprocessing - gcc uses cpp (C preprocessor)<div>compiling - gcc turns preprocessed C files into intermediate code (gcc does this)</div><div>assembly - translates assembly language to native object code (gcc uses GNU assembler ""as"")</div><div>linking - convert and bind together object code files into code executable (gcc uses GNU linker ""ld"")</div>"
p233 ASCII American Standard Code for Information Interchange 1963
p235 significance of paper tape in history of computing brought the ASCII character encoding system out of telecommunications and made it the standard for non-mainframe computing
p237 flux transitions boundary between two magnetic domains<div><br></div><div>can be more easily detected than the domains themselves, this is exploited to represent binary data</div>
p237 bit cell region on magnetic medium in which a single bit is encoded (can consist of 1-2 flux transitions and magnetic domains)<div><br></div><div>all same physical length</div>
p238 bit rot process where orientation of magnetic domains in storage media can spontaneously flip due to thermal effects, corrupting data&nbsp;
p239 longitudinal to perpendicular recording transition in hard drive magnetic encoding technique that led to increased density
p241 HD sector divided into (4) sync field - marks beginning of sector<div><br></div><div>address mark field&nbsp; - sector number, position, status info</div><div><br></div><div>data field - contains sector data, generally 512 or 4096 bytes</div><div><br></div><div>ECC field (error correcting code) - 50 bytes parity information</div>
p242 zone bit recording technique that divides platter's tracks into zones and places more sectors in zones closer to the rim.<div><br></div><div>keeps number of bits per linear unit roughly constant from hub to the rim (more data)</div>
p242 cylinder set of all tracks that lie under the heads at any given time
p242 LBA logical block addressing<div><br></div><div>system used to locate data on hard drives</div><div><br></div><div>each sector gets a LBA number, hard drive controller translates that into cylinders, heads, sectors&nbsp;</div>
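The LBA card above describes a translation between a flat sector number and a cylinder/head/sector triple. A minimal sketch of the classic arithmetic follows, assuming a fixed logical geometry; the function names and the `heads_per_cylinder` / `sectors_per_track` parameters are illustrative, and real controllers using zone bit recording do not expose a fixed geometry like this.

```python
def chs_to_lba(cylinder, head, sector, heads_per_cylinder, sectors_per_track):
    """Convert a cylinder/head/sector triple to a logical block address.

    Sectors are numbered from 1 in CHS addressing; cylinders and heads from 0.
    """
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


def lba_to_chs(lba, heads_per_cylinder, sectors_per_track):
    """Convert a logical block address back to (cylinder, head, sector)."""
    cylinder, remainder = divmod(lba, heads_per_cylinder * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return cylinder, head, sector_index + 1


if __name__ == "__main__":
    # Assumed geometry for illustration: 16 heads, 63 sectors per track.
    print(lba_to_chs(1008, 16, 63))
    print(chs_to_lba(1, 0, 1, 16, 63))
```

The two functions are inverses of each other for any valid address in the assumed geometry.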
p242 low level formatting process where magnetic markers defining tracks and sectors must be laid down on all platter surfaces
p242 partitioning divides drive into separate logical regions, can operate independently
p243 high level formatting sets up mechanism for organizing drives sectors into folders and files. This is done according to requirements of OS components called file systems
p245 single ended signalling each data path travels over a single wire<div>(PATA, ps/2, vga video, etc)</div><div><br></div>
p245 differential signalling technique developed to mitigate crosstalk<div><br></div><div>each data path requires two wires. signal encoded as difference between voltage levels of two wires. Interference affects both at once, changing relative to the ground but preserving the difference</div>
p247 MBR Master Boot Record<div><br></div><div>Contained in first sector of partitioned device</div><div><br></div><div>contains</div><div>1. bootloader, loads OS kernel into RAM</div><div>2. partition table - table of partition descriptors</div>
p248 MBR partition table default 4 entries<div><br></div><div>each entry:</div><div>- partition size and type</div><div>- start sector</div><div>- end sector</div>
p248 extended partition primary partition modified to allow it to act as a partition container<div><br></div><div><br></div>
p249 filesystems tables associating file and directory names with blocks of storage space (contiguous groups of sectors or allocation units)
p250 GPT GUID partition tables<div>new drive organization tech, each partition assigned a 128-bit GUID (122 of those bits are random in a version 4 GUID)</div><div><br><div>GUID - globally unique identifier</div></div>
p250 how GPT mitigates danger of damaging MBR creates multiple instances of partition tables and crucial data scattered across the drive<div>uses CRC (cyclic redundancy check) to assist in reconstructing damaged data</div><div><br></div>
p251 RPi boot sequence BCM2835 boot ROM runs VPU and loads first stage boot loader `bootcode.bin`<div>that loads `start.elf`</div><div>start.elf loads either kernel.img or kernel7.img</div>
p260 endurance the number of times a flash memory cell may be written to
p260 SLC vs MLC Single Level Cell&nbsp;<div>Multi Level Cell</div><div><br></div><div>flash memory cells, MLC stores multiple bits in a single cell by measuring variance in charge levels</div>
p261 NOR flash slower write/erase<div>less dense</div><div>faster read</div><div>in-place execution of code</div><div>commonly used for firmware embedded devices</div>
p262 NAND flash "accessed in ""pages"" 512-4096 bytes<div>read/written in pages, erased in blocks</div><div>faster to write and erase</div><div>more dense</div><div>slower read</div>"
p263 strings groupings of cells in NAND flash, 32 or 64 in series, accessed together
p264 NAND page each corresponding bit in a large number of strings treated as a unit called a page.<div><br></div><div>smallest unit that can be read/write single operation</div>
p264 NAND block all the cell strings that span a page
p265 flash erase-before-write erasing sets bits to 1<div>1 bits not written to cells unless part of the erase process, only 0 bits written</div>
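The erase-before-write rule above can be modelled as a bitwise AND: programming can only clear bits, and only an erase sets them back to 1. This is a toy sketch with hypothetical names (`program`, `erase_block`), treating each cell group as one byte; it is not how any particular flash controller is implemented.

```python
ERASED = 0xFF  # an erased byte has every bit set to 1


def program(current, value):
    """Program a byte: bits can go from 1 to 0 but never from 0 to 1."""
    return current & value


def erase_block(block):
    """Erase a whole block, returning every byte to all 1 bits."""
    return [ERASED] * len(block)


if __name__ == "__main__":
    block = erase_block([0] * 4)
    block[0] = program(block[0], 0xA5)  # works: the byte was erased first
    block[0] = program(block[0], 0x5A)  # overwrite without erase corrupts it
    print(hex(block[0]))
```

Writing a second value over an already programmed byte leaves only the bits that were 1 in both values, which is why the block has to be erased before it is rewritten.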
p265 FTL "flash translation layer<div><br></div><div>interposes itself between filesystem and ""raw"" flash storage, prevents any single block from being written too much and running out of endurance</div>"
p266 flash FTL BAT Flash Translation Layer<br>Block Aging Table<div><br></div><div>used to implement wear leveling</div><div><br></div><div>dynamic wear leveling</div>
p266 write amplification FTL may sometimes erase multiple blocks to perform a write<div><br></div><div>FTL tries to minimize this</div>
p267 flash garbage collection "background task, gathers live pages and consolidates them on fresh blocks<div><br></div><div>erases blocks for later writing&nbsp;</div><div><br></div><div>done during ""quiet time""</div>"
p267 TRIM command used by OS/SATA interface to tell a FTL a page is deleted, doesn't actually do a delete
p273 ALOHAnet first wireless computer network, deployed by the University of Hawaii in 1971
p274 OSI "Open Systems Interconnection standard 1974<div><br></div><div>way of creating ""big picture"" view of many smaller ideas within larger idea of networking</div>"
p276 OSI Presentation Layer "little misleading, not about displaying data<div><br></div><div>data conversion, how data will be ""presented"" to host on other end</div><div><br></div><div>ASCII/UTF</div>"
p277 data encapsulation process where OSI model layers add headers to data block passed down from layer above them
p277 PDU protocol data unit<div><br></div><div>chunk of data handled by a particular layer of the OSI model</div>
p278 session layer opens actual communication session with other host<div><br></div><div>determines full duplex/half duplex</div>
p279 TCP connection oriented protocol<div><br></div><div>sequence number, checksum</div><div><br></div><div>uses window field for flow control (each end specifies how much data to accept)</div><div><br></div><div>multiplexing with port fields in headers</div>
p279 UDP simpler than TCP<div><br></div><div>header contains only four fields: source port, destination port, length, checksum</div>
p279 network layer primary concern is routing<div><br></div><div>routers big component</div>
p281 OSI Data Link Layer manages flow of data over direct connections<div><br></div><div>reorganises data into frames</div><div><br></div><div>MAC media access control</div>
p282 OSI Physical Layer "Where it ""gets physical""<div>frames from data link layer received as strings of bits, converted to signals in physical medium</div><div><br></div><div>transmissions book ended by preamble and delimiter bits</div><div><br></div><div>groups of bits are called ""symbols?""</div>"
p282 Ethernet Networking entity spans data link and physical layers<div><br></div><div>came from Xerox PARC labs Palo Alto, CA 1973, commercialized 1980, standardized 1983</div>
p284 MAC address Idea originated with Ethernet<div><br></div><div>every device attached has unique 48 bit address (expressed 6 groups of 2 hexadecimal digits)</div>
p285 collision detection in shared medium ethernet "if two pulses from two nodes enter the cable at the same time, the pulses ""add"" electricity and signal voltage is higher than normal"
p285 truncated binary exponential backoff algorithm used to vary distribution of backoff period based on collision frequency
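A minimal sketch of the backoff rule on the card above, assuming the classic shared-medium Ethernet parameters (the exponent stops growing after 10 collisions, and the frame is abandoned after 16 attempts). The function name and its parameters are illustrative.

```python
import random


def backoff_slots(collisions, cap=10, max_attempts=16, rng=random):
    """Pick how many slot times to wait after the given number of collisions.

    The wait is a uniform random integer in [0, 2**k - 1], where
    k = min(collisions, cap). The range doubles with each collision
    until it is truncated at the cap.
    """
    if collisions >= max_attempts:
        raise RuntimeError("too many collisions, giving up on this frame")
    k = min(collisions, cap)
    return rng.randrange(2 ** k)


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 15):
        print(n, backoff_slots(n))
```

Because the range widens as collisions repeat, a busy medium spreads retransmissions over a longer period, while a quiet one retries almost immediately.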
p285 CSMA/CD Carrier Sense Multiple Access with Collision Detection<div><br></div><div>protocol used for collision detection and backoff determination</div>
p286 manchester encoding used in ethernet for data encoding<div><br></div><div>each data bit encoded in a clock cycle transition (positive-negative vice versa)</div><div><br></div><div>bits are represented by transitions, not 0v for 0 and +1v for 1 (also gives us zero DC component)</div><div><br></div><div><br></div>
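A small sketch of Manchester encoding as described above, using the IEEE 802.3 convention (a 1 is a low-to-high transition in the middle of the bit period, a 0 is high-to-low). The levels are modelled as -1 and +1 half-bit samples; the function names are illustrative.

```python
def manchester_encode(bits):
    """Encode bits as pairs of half-bit levels with a mid-bit transition."""
    levels = []
    for bit in bits:
        # 802.3 convention: 1 -> low then high, 0 -> high then low
        levels.extend((-1, +1) if bit else (+1, -1))
    return levels


def manchester_decode(levels):
    """Recover bits from half-bit level pairs, rejecting missing transitions."""
    bits = []
    for first, second in zip(levels[0::2], levels[1::2]):
        if (first, second) == (-1, +1):
            bits.append(1)
        elif (first, second) == (+1, -1):
            bits.append(0)
        else:
            raise ValueError("no mid-bit transition: not valid Manchester data")
    return bits


if __name__ == "__main__":
    encoded = manchester_encode([1, 0, 1, 1])
    print(encoded)
    print(sum(encoded))  # average level is zero: no DC component
```

Every bit spends half its period at each level, which is where the zero DC component comes from, and the guaranteed mid-bit transition lets the receiver recover the clock.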
p286 4B/5B scheme used by 100BASE-TX Fast Ethernet<div><br></div><div>encodes 4 data bits into 5 bits for transmission, encoding done using static table</div><div><br></div><div>designed to provide at least single level transition for every four bits of data, ensures self clocking</div>
p288 MLT-3 multi-level transmit using 3 levels<div><br></div><div><br></div>
p289 how 1000BASE-T simultaneous bidirectional transmission over same set of conductors each receiver subtracts the (known) output of the local transmitter from the voltage observed on the line, leaving only incoming signal (if any)
p290 PAM-5 5 level pulse amplitude modulation<div><br></div><div>dense data transmission encoding using 5 levels of voltages</div>
p290 amplitude modulation data encoded as varying voltage level in a signal
p291 differential signalling NICs transmit data by encoding 2 voltages on 2 cables, data is difference
p295 cut-through switching technology where network switch inspects incoming packet only until it has complete destination address and immediately begins forwarding to that host.
p298 ip address higher and lower order octets "higher order octets contain ""network address""<div><br></div><div>lower order octets contain ""host address""</div><div><br></div><div>larger networks with more than 255 hosts devote more octets to host portion of the address</div>"
p298 subnetwork mask four-octet bit pattern that specifies the split between the network portion and the host portion of an ip address
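The mask card above can be sketched with a bitwise AND: the mask's 1 bits select the network portion and its 0 bits select the host portion. This example uses Python's standard `ipaddress` module; the function name `split_address` and the sample addresses are illustrative.

```python
import ipaddress


def split_address(address, mask):
    """Split an IPv4 address into (network address, host number) using a mask."""
    addr_bits = int(ipaddress.IPv4Address(address))
    mask_bits = int(ipaddress.IPv4Address(mask))
    network = ipaddress.IPv4Address(addr_bits & mask_bits)
    host = addr_bits & ~mask_bits & 0xFFFFFFFF  # keep only the host bits
    return str(network), host


if __name__ == "__main__":
    print(split_address("192.168.1.37", "255.255.255.0"))
    print(split_address("10.20.30.40", "255.255.0.0"))
```

With a 255.255.0.0 mask the lower two octets both belong to the host portion, which is how larger networks get room for more hosts.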
p299 demultiplexing splitting a single stream of incoming packets into multiple streams of packets based on port numbers for delivery to individual processes
p302 NAT "Network Address Translation<div><br></div><div>translates local, non-routable IP addresses into global, routable IP addresses</div><div><br></div><div>uses ""extended"" ip addresses which consist of connecting a TCP port number to a local ip address</div>"
p303 port forwarding feature of router which associates a port with a local ip address
p306 fading when wavefronts of wifi signals interfere with each other in unpredictable ways
p306 diversity reception using multiple antennas on wifi equipment, positioned one wavelength apart. Receiver samples signals on both antennas and chooses the stronger of the two.
p309 beacon frame 802.11 management frame, broadcast periodically to let stations know that the network is there&nbsp;
p311 SSID service set identifier<div><br></div><div><br></div>
p313 CSMA/CA "Carrier Sense Multiple Access/Collision Avoidance<div><br></div><div>wifi can't be full-duplex, when broadcasting station can't ""hear"" other stations</div>"
p313 DIFS distributed inter-frame space<div><br></div><div>strategy wireless networks use to time transmissions such that they avoid collisions</div><div><br></div><div>uses random backoff periods with a dynamic window</div>
p313 SIFS short inter-frame space<div><br></div><div>part of MAC-level acknowledgement and retransmission protocol used in wifi</div><div><br></div><div>after station successfully receives a frame, waits a SIFS, transmits ACK</div>
p314 NAV "network allocation vector<div><br></div><div>physically sensing medium requires power, so wifi frames contain ""duration"" fields. allows to indicate how long frame transmission will occupy space</div>"
p314 hidden node problem when nodes connected through an access point are positioned such that they can't sense each other and can't implement CSMA/CA<div><br></div><div>fix is RTS/CTS</div>
p314 RTS/CTS request to send / clear to send<div><br></div><div>fix for hidden node problem</div><div><br></div><div>first performs handshake with receiving station by sending RTS, waits for CTS frame in response</div>
p315 fragmentation threshold specifies the maximum size of frame that may be transmitted in one piece<div><br></div><div>larger frames higher risk of encountering interference/collisions</div><div><br></div><div>frames larger than FT broken into numbered series of fragments and individually transmitted/ACK'd</div>
p318 digital modulation schemes for transmitting data BASK - binary amplitude-shift keying (aka OOK on-off keying)<div>BFSK - binary frequency-shift keying</div><div>BPSK - binary phase-shift keying</div><div>DBPSK - differential binary phase-shift keying</div>
p318 QAM "quadrature amplitude modulation<div><br></div><div>combine amplitude and phase modulation&nbsp;</div><div><br></div><div>digital QAM scheme = discrete (phase, amplitude) values used to transmit symbols, represented as a ""constellation""</div>"
p320 spread spectrum technique used by wifi to spread a signal across a wider bandwidth, reduces interference and spectral power density
p324 probe request frame client adapter sends out a probe request frame to all APs within range, used in active scanning<div><br></div><div>if SSID field is null, all APs in range may send a response back</div>
p326 supplicant WPA-2 implemented in clients as piece of software called a supplicant
p338 3 types of interrupts hardware<div><br></div><div>software</div><div><br></div><div>traps - occur when CPU detects errors</div>
p340 what controls the design of the OS? The hardware of the computer - its physical structure.
p341 major components of a CPU ALU&nbsp;<div><br></div><div>Process Registers - small amounts of working memory that supply input and accept output from ALU</div><div><br></div><div>Control Unit - accepts instructions from OS</div>
p341 OS 4 major functions Process Management<div><br></div><div>Memory Management</div><div><br></div><div>File System Management</div><div><br></div><div>Device Management</div>
p350 UEFI Unified Extensible Firmware Interface<div><br></div><div>Newer replacement for BIOS bootloader</div>
p359 quantisation artefacts visible steps in brightness or color
p360 H.261 First widely used video compression standard, developed by ITU (International Telecommunication Union)<div><br></div><div><br></div>
p360 MPEG Moving Picture Experts Group<div><br></div><div>Formed 1988</div>
p361 Y'CbCr Colorspace<div><br></div><div>Y' luma</div><div><br></div><div>Cb Cr Chroma<br><br>Y' computed as weighted sum of original RGB values<br><br>Y' = 0.257R + 0.504G + 0.098B + 16</div><div>Cr = 0.439R - 0.368G - 0.071B + 128</div><div>Cb = -0.148R - 0.291G + 0.439B + 128</div><div><br></div><div><br></div>
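A minimal sketch of the conversion on the card above, assuming 8-bit R'G'B' inputs in 0-255 and the standard BT.601 studio-range coefficients (in which the Cb row is -0.148, -0.291, +0.439, so each chroma row sums to zero). The function name is illustrative.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit R'G'B' to studio-range Y'CbCr (BT.601 coefficients)."""
    y = 0.257 * r + 0.504 * g + 0.098 * b + 16
    cb = -0.148 * r - 0.291 * g + 0.439 * b + 128
    cr = 0.439 * r - 0.368 * g - 0.071 * b + 128
    return round(y), round(cb), round(cr)


if __name__ == "__main__":
    print(rgb_to_ycbcr(0, 0, 0))        # black
    print(rgb_to_ycbcr(255, 255, 255))  # white
    print(rgb_to_ycbcr(255, 0, 0))      # pure red
```

Any grey input (R = G = B) lands on Cb = Cr = 128, because the chroma coefficients cancel and only the offset remains.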
p363 I Frames intra-frames<div><br></div><div>stored in a way that allows them to be decoded by themselves, without reference to any other frame in the video</div><div><br></div><div>I frames encoded very similar way to JPEG image format</div>
p364 P Frame Predicted Frames<div><br></div><div>depend on image data from preceding I or P frame</div><div><br></div><div>just describe bits that have changed</div><div><br></div><div>divided into macroblocks like I frames</div><div><br></div><div><br></div>
p364 motion search procedure where encoder looks at each macroblock in an image and tries to find similar macroblock-sized areas in preceding frame<div><br></div><div><br></div>
p364 P macroblock A macroblock in a P Frame
p364 prediction error (residual) difference in a macroblock from current frame to preceding frame<div><br></div><div><br></div>
p364 interpolation "scheme used by decoder to calculate ""missing"" pixel values that lie halfway between the real pixel values (motion vector can be half-pixel)"
p365 B Frames bi-directional frames<div><br></div><div>can contain elements from preceding I/P frame and subsequent I/P frame</div><div><br></div><div><br></div>
p365 Video Files Encoding Frame Order Encoder doesn't write frames to the file in the order they appear on screen<div><br></div><div>Reference frames are always before the frames that are predicted off them</div>
p366 GOP Group of Pictures<div><br></div><div>An I frame and successive P and B frames</div>
p371 RLE Run-Length Encoding<div><br></div><div>Basically scientific notation for compression. Instead of storing forty 0's, stores the value 0 once with a repeat count of 40</div><div><br></div><div><br></div>
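A minimal sketch of run-length encoding as described above, storing each run as a (symbol, count) pair. The pair representation and function names are one illustrative choice; real formats pack the counts into bytes in various ways.

```python
from itertools import groupby


def rle_encode(data):
    """Collapse each run of equal symbols into a (symbol, run_length) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(data)]


def rle_decode(pairs):
    """Expand (symbol, run_length) pairs back into the original sequence."""
    return [symbol for symbol, count in pairs for _ in range(count)]


if __name__ == "__main__":
    print(rle_encode([0] * 40))
    print(rle_encode(list("aaabcc")))
```

Data without long runs gains nothing, since every lone symbol still costs a full pair.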
p371 Huffman Encoding Assigns variable-length codes to symbols: symbols that occur more often get shorter codes, so the data takes fewer bits overall.<div><br></div><div><br></div>
p373 interlaced video video technique to increase the apparent frame rate of a video. Commonly used in broadcast TV.
p374 lumi masking process where encoder preferentially removes detail from very bright or very dark areas of image (taking advantage of human visual system inability to distinguish)<div><br></div>
p375 H.264 (1) main improvement Previous standards, B frames could be predicted off two adjacent reference frames<div><br></div><div><div>H.264 increases this to 16 nearby frames</div></div><div><br></div><div>motion vectors 1/4 pel precision</div>
p378 CTUs Coding Tree Units<div><br></div><div>Replacement for macroblocks (H.264) used in H.265</div>
p382 PSNR Peak Signal to Noise Ratio<div><br></div><div>Computational approach to assessing video quality</div>
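PSNR is usually computed from the mean squared error between a reference frame and the decoded frame: PSNR = 10 · log10(peak² / MSE), in decibels. A minimal sketch follows, assuming frames are flat sequences of 8-bit samples; the function name and the `peak` parameter are illustrative.

```python
import math


def psnr(reference, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames: no noise at all
    return 10 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    original = [52, 55, 61, 66, 70, 61, 64, 73]
    lossy = [54, 55, 60, 67, 69, 62, 64, 72]
    print(round(psnr(original, lossy), 2))
```

Higher values mean the decoded frame is numerically closer to the reference, though the score does not always track perceived quality.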
p389 Jim Clark professor at Stanford<div><br></div><div>developed the geometry engine in 1979, foundation of modern hardware to accelerate 3D modelling</div><div><br></div><div><br></div>
p390 OpenGL Graphics standard created and released by SGI in 1992<div><br></div><div>Provides abstraction layer above underlying hardware for portability (downside: performance cost)</div>
p390 Direct 3D competitor to OpenGL from Microsoft<div><br></div><div><br></div>
p391 fixed-function hardware pipeline a collection of processing stages tightly mapped to dedicated set of logic gates<div><br></div><div>gave way to programmable hardware pipeline</div><div><br></div><div>PHPs now underpin all modern graphics processors</div>
p392 high level 4 stages graphics pipeline 1. Vertex processing<div>2. Rasterization</div><div>3. Fragment Processing</div><div>4. Output merging</div><div><br></div><div><br><div><br></div></div>
p392 vertex processing stage 1/4 graphics pipeline<div><br></div><div><br></div>
p397 OpenGL ES 3 basic modelview transformations "translation - simple, adds offset to each component of position vector<div><br></div><div>scaling - multiplies each component by scale factor, resizing</div><div><br></div><div>rotation - opengl uses ""right handed"" coordinate system, curling fingers on right hand shows direction of positive rotation around axis</div>"
p401 specular reflection rays of light reflected almost entirely in one direction
p401 diffuse reflection scatters light in all directions
p403 vertex shading process of transformation and lighting in OpenGL ES 2.0+
p412 tile-based rendering "rendering scheme used on mobile devices to cope with lower memory and bandwidth limitations<div><br></div><div>output frame-buffer divided into array of ""tiles"" (squares or rectangles)</div><div><br></div><div>each tile then rendered separately</div>"
p413 culling rejection of objects that are not visible to user early in graphics pipeline to improve performance
p413 clipping process where only portion of objects that lie within the viewing volume are rendered
p423 heterogeneous architectures architectures that aim to make use of compute elements beyond just the CPU (most commonly CPU + GPU). Goal of these systems is to ensure passing of data between the CPU and GPU is efficient, usually via shared memory.
p429 transducer a device that converts variations in air pressure to an electrical waveform that changes in frequency and amplitude to match sounds (a microphone)
p499 Douglas Engelbart Developed the mouse at the Stanford Research Institute (SRI) in the 1960s
p455 USB device classes part of USB standard, software drivers for class codes