Something about operating systems.

Example of what the MMU does

| Segment | Base | Limit |
|---------|------|-------|
| 0       | 132  | 300   |
| 1       | 1672 | 150   |
| 2       | 1960 | 90    |
| 3       | 445  | 78    |
| 4       | 598  | 696   |

Physical address?

  • (0, 133) -> 265 (base 132 + offset 133)
  • (3, 86) -> Trap: illegal (offset 86 exceeds limit 78)
  • (4, 200) -> 798 (base 598 + offset 200)

Logical address -> (segment, offset):

  • 1801 -> (1, 129) (1801 - base 1672 = 129)
  • 2029 -> (2, 69) (2029 - base 1960 = 69)
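
A minimal C sketch of this translation, with the segment table above hard-coded:

#include <stdio.h>

struct segment { int base; int limit; };

/* the segment table from the example above */
struct segment seg_table[] = {
    {132, 300}, {1672, 150}, {1960, 90}, {445, 78}, {598, 696}
};

/* translate (segment, offset) to a physical address; -1 means trap */
int translate(int seg, int offset) {
    if (offset >= seg_table[seg].limit) {
        printf("(%d, %d) -> Trap: illegal offset\n", seg, offset);
        return -1;
    }
    return seg_table[seg].base + offset;
}

int main(void) {
    printf("(0, 133) -> %d\n", translate(0, 133)); /* 265 */
    translate(3, 86);                              /* trap */
    printf("(4, 200) -> %d\n", translate(4, 200)); /* 798 */
    return 0;
}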

2-Level Page Table

page # page offset

|11-bit | 11-bit | 10-bit |

|----------------32-bit-----------------------|

Suppose each page table entry size: 64bit = 8 Byte

Question: how much space is needed by the page tables for a process with a 128 MB virtual address space?

The page size = 2^10 = 1KB

number of entries 1st level page table: 2^11 = 2048 = 2K

number of entries 2nd level page table: 2^11 = 2048 = 2K

128 MB / 1 KB = 2^27 / 2^10 = 2^17 = 131072 pages

131072 / 2048 = 64 second-level page tables

(64 + 1) * 16 KB = 1040 KB // answer (each table holds 2^11 entries * 8 B = 16 KB; the +1 is the 1st-level table)
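
The same arithmetic as a quick sanity check in C:

#include <stdio.h>

int main(void) {
    long page_size   = 1L << 10;          /* 1 KB pages (10-bit offset)   */
    long entries     = 1L << 11;          /* 2K entries per page table    */
    long entry_size  = 8;                 /* 64-bit = 8-byte entries      */
    long vspace      = 128L << 20;        /* 128 MB virtual address space */

    long pages       = vspace / page_size;     /* 2^17 = 131072 pages     */
    long lv2_tables  = pages / entries;        /* 64 second-level tables  */
    long table_size  = entries * entry_size;   /* 16 KB per table         */
    long total       = (lv2_tables + 1) * table_size; /* +1 first-level   */

    printf("total = %ld KB\n", total >> 10);   /* 1040 KB */
    return 0;
}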

Banker's algorithm

https://www.geeksforgeeks.org/operating-system-bankers-algorithm/

Definition

Need: n * m Array.

  • Need[i, j] = Max[i, j] - Allocation[i, j]
  • Need[i, j] = k; i is the process id, j is the resource id, k is the number of instances
  • e.g. if Need[i, j] = k, then Pi needs k more instances of Rj to complete its work

* When process Pi's request is granted, the following state must be updated:

Available = Available - Request_i

Allocation_i = Allocation_i + Request_i

Need_i = Need_i - Request_i

Example

5 processes P0 to P4 with 3 resource types: A (10 instances), B (5 instances), C (7 instances)

TO-DO:

  1. First compute the Need array
  2. Starting from P0, check whether Available >= Need
  3. If not, that process must wait for resources to be released
  4. If yes: Available = Available + Allocation
  5. Move on to the next process, cycling through them like a circular queue

At time T0:

|    | Allocation (A B C) | Max (A B C) | Available (A B C) | Need (A B C) |
|----|--------------------|-------------|-------------------|--------------|
| P0 | 0 1 0              | 7 5 3       | 3 3 2             | 7 4 3        |
| P1 | 2 0 0              | 3 2 2       |                   | 1 2 2        |
| P2 | 3 0 2              | 9 0 2       |                   | 6 0 0        |
| P3 | 2 1 1              | 2 2 2       |                   | 0 1 1        |
| P4 | 0 0 2              | 4 3 3       |                   | 4 3 1        |

... After a few iterations ...

...

After all finish:

|    | Allocation (A B C) | Max (A B C) | Available when it finishes (A B C) | Need (A B C) |
|----|--------------------|-------------|------------------------------------|--------------|
| P0 | 0 1 0              | 7 5 3       | 3 3 2 -> 7 5 5                     | 7 4 3        |
| P1 | 2 0 0              | 3 2 2       | 5 3 2                              | 1 2 2        |
| P2 | 3 0 2              | 9 0 2       | 10 5 7                             | 6 0 0        |
| P3 | 2 1 1              | 2 2 2       | 7 4 3                              | 0 1 1        |
| P4 | 0 0 2              | 4 3 3       | 7 4 5                              | 4 3 1        |

(The Available column shows its value at the moment each process finishes, following the safe sequence; P0's row shows the initial 3 3 2 and the value after P0 finishes.)

Safe sequence: P1 | P3 | P4 | P0 | P2
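
A compact C sketch of the safety check that produces this sequence (the matrices are hard-coded from the example above):

#include <stdio.h>
#include <stdbool.h>

#define NP 5   /* processes P0..P4      */
#define NR 3   /* resource types A B C  */

int alloc[NP][NR] = {{0,1,0},{2,0,0},{3,0,2},{2,1,1},{0,0,2}};
int need [NP][NR] = {{7,4,3},{1,2,2},{6,0,0},{0,1,1},{4,3,1}};
int avail[NR]     = {3,3,2};

int main(void) {
    bool finish[NP] = {false};
    int seq[NP], found = 0;

    while (found < NP) {
        bool progressed = false;
        for (int i = 0; i < NP; i++) {
            if (finish[i]) continue;
            bool ok = true;
            for (int j = 0; j < NR; j++)
                if (need[i][j] > avail[j]) ok = false;
            if (!ok) continue;
            /* Pi can run to completion and release its allocation */
            for (int j = 0; j < NR; j++) avail[j] += alloc[i][j];
            finish[i] = true;
            seq[found++] = i;
            progressed = true;
        }
        if (!progressed) { printf("UNSAFE state\n"); return 1; }
    }
    printf("Safe sequence:");
    for (int i = 0; i < NP; i++) printf(" P%d", seq[i]);
    printf("\n");   /* prints: P1 P3 P4 P0 P2 */
    return 0;
}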

Example: P1 requests (1, 0, 2) more

Apply the update rules (*) above:

Available = Available - Request_1

Allocation_1 = Allocation_1 + Request_1

Need_1 = Need_1 - Request_1 // don't forget to update the Need array

At time T0 (updates shown inline):

|    | Allocation (A B C) | Max (A B C) | Available (A B C) | Need (A B C) |
|----|--------------------|-------------|-------------------|--------------|
| P0 | 0 1 0              | 7 5 3       | 3-1 3-0 2-2       | 7 4 3        |
| P1 | 2+1 0+0 0+2        | 3 2 2       |                   | 1-1 2-0 2-2  |
| P2 | 3 0 2              | 9 0 2       |                   | 6 0 0        |
| P3 | 2 1 1              | 2 2 2       |                   | 0 1 1        |
| P4 | 0 0 2              | 4 3 3       |                   | 4 3 1        |

Becomes:

|    | Allocation (A B C) | Max (A B C) | Available (A B C) | Need (A B C) |
|----|--------------------|-------------|-------------------|--------------|
| P0 | 0 1 0              | 7 5 3       | 2 3 0             | 7 4 3        |
| P1 | 3 0 2              | 3 2 2       |                   | 0 2 0        |
| P2 | 3 0 2              | 9 0 2       |                   | 6 0 0        |
| P3 | 2 1 1              | 2 2 2       |                   | 0 1 1        |
| P4 | 0 0 2              | 4 3 3       |                   | 4 3 1        |

...

After a few iterations ...

...

|    | Allocation (A B C) | Max (A B C) | Available when it finishes (A B C) | Need (A B C) |
|----|--------------------|-------------|------------------------------------|--------------|
| P0 | 0 1 0              | 7 5 3       | 2 3 0 -> 7 5 5                     | 7 4 3        |
| P1 | 3 0 2              | 3 2 2       | 5 3 2                              | 0 2 0        |
| P2 | 3 0 2              | 9 0 2       | 10 5 7                             | 6 0 0        |
| P3 | 2 1 1              | 2 2 2       | 7 4 3                              | 0 1 1        |
| P4 | 0 0 2              | 4 3 3       | 7 4 5                              | 4 3 1        |

Safe sequence: P1 | P3 | P4 | P0 | P2

| Process | Arrival time | Burst time |
|---------|--------------|------------|
| P1      | 0            | 30         |
| P2      | 10           | 15         |
| P3      | 20           | 24         |
| P4      | 30           | 10         |
| P5      | 5            | 59         |

Round Robin (quantum = 8):

| P1 | P5 | P1 | P2 | P5 | P3 | P1 | P4 | P2 | P5 | P3 | P1 | P4 | P5 | P3 | P5 | ...
0    8    16   24   32   40   48   56   64   71   79   87   93   95   103  111  ...
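
A small C simulation that reproduces this Gantt chart. It is a sketch: it assumes the tie-break rule noted later in the RR section (new arrivals are enqueued before the preempted process) and that the CPU is never idle, which holds for this workload:

#include <stdio.h>

#define N 5
#define Q 8                               /* time quantum */

int arrive[N] = {0, 10, 20, 30, 5};       /* P1..P5 arrival times */
int burst [N] = {30, 15, 24, 10, 59};     /* P1..P5 burst times   */

int queue[1024], head = 0, tail = 0;
int admitted[N];

/* enqueue every not-yet-admitted process that has arrived by time t,
   in arrival order; arrivals go in BEFORE the preempted process */
void admit(int t) {
    for (;;) {
        int best = -1;
        for (int i = 0; i < N; i++)
            if (!admitted[i] && arrive[i] <= t &&
                (best < 0 || arrive[i] < arrive[best]))
                best = i;
        if (best < 0) break;
        admitted[best] = 1;
        queue[tail++] = best;
    }
}

int main(void) {
    int rem[N];
    for (int i = 0; i < N; i++) rem[i] = burst[i];

    int t = 0;
    admit(t);
    while (head < tail) {
        int p = queue[head++];
        int run = rem[p] < Q ? rem[p] : Q;
        printf("t=%3d: P%d runs for %d\n", t, p + 1, run);
        t += run;
        rem[p] -= run;
        admit(t);                          /* new arrivals first ...    */
        if (rem[p] > 0) queue[tail++] = p; /* ... then re-queue current */
    }
    printf("t=%3d: all done\n", t);
    return 0;
}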

Deadlock detection algorithm

This is different from the Banker's algorithm.

Rules and steps of the algorithm:

  1. Init Work = Available

  2. If Allocation_i = 0, Finish_i = true; else Finish_i = false

  3. Find a process with Finish_i == false and Request_i <= Work; if none is found, go to the FINAL STEP

  4. Work = Work + Allocation_i; Finish_i = true; then go back to STEP 3 and keep searching

  5. FINAL STEP: if Finish_i == false for some i ==> DEADLOCK (Pi is deadlocked); else ==> ALL IS WELL

Example: NO DEADLOCK

|    | Alloc. (A B C) | Req. (A B C) | Avail. (A B C) |
|----|----------------|--------------|----------------|
| P0 | 0 1 0          | 0 0 0        | 0 0 0          |
| P1 | 2 0 0          | 2 0 2        |                |
| P2 | 3 0 3          | 0 0 0        |                |
| P3 | 2 1 1          | 1 0 0        |                |
| P4 | 0 0 2          | 0 0 2        |                |

  1. Select P0: Finish = true, Avail. updates to 0 1 0.
  2. Select P2: Req = 0 0 0, which is <= Avail. 0 1 0; Finish = true, Avail. updates to 3 1 3.
  3. Select P1: Finish = true, Avail. updates to 5 1 3.
  4. Select P3: Finish = true, Avail. updates to 7 2 4.
  5. Select P4: Finish = true, Avail. updates to 7 2 6.

Result:

  • all processes end with Finish = true, so NO DEADLOCK

FS Allocation Methods

Purpose:

  • allocate disk blocks for files, utilize space effectively

Three common ways: contiguous, linked, indexed.

Contiguous

Each file owns a set of contiguous disk blocks

  • Good performance for sequential / direct access
  • Simple
  • When the file grows, finding space is a problem (dynamic storage-allocation problem -> external fragmentation)
  • Best-fit / first-fit > worst-fit
  • The file size must be known at file creation -> overestimating it causes a large amount of internal fragmentation

Extent

Allocate disk blocks contiguously, but in extents

  • one file consists of multiple extents

  • each extent records:

    • location
    • block count
    • ptr to the first block of the next extent
  • Internal fragmentation is still a problem (if extents are too large)

  • External fragmentation occurs (if extents of varying sizes are allocated and deallocated)

Linked (List)

Uses a linked-list data structure; the blocks of one file can be scattered anywhere on the disk

  • the directory knows the HEAD and TAIL

  • the file ends with a NULL ptr

  • each node (block) contains a ptr to the next node (block)

  • No compaction needed - no external fragmentation

  • Grows easily

  • Bad for direct access (only good for sequential access)

  • Extra disk space for the ptr: a 4-byte ptr per 512-byte block ≈ 0.78%

  • Not reliable: a ptr can be lost (bad block)

FAT (File Allocation Table)

A variation of linked allocation

  • The beginning of the volume has a FAT section

  • The table has 1 entry for every disk block, indexed by block id

  • The directory entry holds the block id of the file's first block

  • The table entry indexed by that block id contains the block id of the next block, and so on until the last block is reached

  • An unused block has the value 0 in its table entry (allocating a new file means finding 0 entries)

  • The FAT can be cached, improving direct-access time
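
A toy sketch of following a FAT chain; the table contents are made up, and -1 stands in for FAT's real end-of-chain marker:

#include <stdio.h>

#define END_OF_CHAIN -1   /* stand-in for FAT's real end-of-chain value */

/* fat[b] holds the block id of the block that follows block b in a file;
   0 = free block. The contents below are made up for illustration. */
int fat[16] = {0};

int main(void) {
    /* a file whose directory entry says "first block = 2": 2 -> 5 -> 11 */
    fat[2] = 5; fat[5] = 11; fat[11] = END_OF_CHAIN;

    printf("file blocks:");
    for (int b = 2; b != END_OF_CHAIN; b = fat[b])
        printf(" %d", b);
    printf("\n");
    return 0;
}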

Indexed

Brings all the pointers together into an Index Block

Each file has its own index block, containing a vector of pointers to its data blocks / disk block addresses

  • Direct access

  • No external fragmentation (any free block can satisfy a request for space)

  • Extra space needed - pointer overhead is greater than with linked allocation

The index block is usually one disk block (512 Byte)

If the file is too large and the pointers do not fit in one index block, we can use:

  • Linked scheme - link multiple index blocks together (forming a linked list)
  • Multilevel scheme - a 1st-level index block points to a set of 2nd-level index blocks, which then point to the data blocks
    • on a 32-bit system with 4K blocks, one block stores 1,024 4-byte (32-bit) pointers. With 2-level indexing this allows 1024 * 1024 = 1,048,576 data blocks, which equals a file of 4 GB.
  • Combined scheme
    • direct blocks for small files
    • indirect blocks for large files (i.e. multilevel)

Performance

  • Declare the file access type at creation -> select contiguous allocation for direct-access files or linked allocation for sequential-access files
  • Indexed
    • If the index block is already in memory, access can be made directly. However, keeping index blocks in memory requires considerable space.
    • If the index block is not in memory, we have to read the index block first and then the desired data block

Free-Space Management

The OS keeps a free-space list to track available disk blocks.

Bit vector

requires extra space:

if block size = 4 KB (2^12 byte)

and disk size = 1 TB (2^40 byte),

n = disk_size / block_size = 2^28 bits of bitmap = 32 MB
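
A minimal sketch of the bit-vector bookkeeping (1 bit per block; here 1 means free):

#include <stdio.h>
#include <stdint.h>

#define NBLOCKS 1024
uint32_t bitmap[NBLOCKS / 32];   /* bit = 1 means the block is free */

void set_free(int b) { bitmap[b / 32] |=  (1u << (b % 32)); }
void set_used(int b) { bitmap[b / 32] &= ~(1u << (b % 32)); }

/* return the first free block, or -1 if the disk is full */
int find_free(void) {
    for (int w = 0; w < NBLOCKS / 32; w++) {
        if (bitmap[w] == 0) continue;     /* skip fully-used words fast */
        for (int bit = 0; bit < 32; bit++)
            if (bitmap[w] & (1u << bit))
                return w * 32 + bit;
    }
    return -1;
}

int main(void) {
    for (int b = 0; b < NBLOCKS; b++) set_free(b);
    set_used(0); set_used(1); set_used(2);
    printf("first free block: %d\n", find_free());   /* 3 */
    return 0;
}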

Linked (Free Space) List

  • Same logic as linked allocation: connect the free blocks with a linked list

Grouping

Basically a linked list, but the difference is...

N free blocks are grouped into 1 node; the first N-1 of them are actually free, while the Nth contains the addresses of another group of free blocks.

Counting

If the contiguous-allocation algorithm or extents are used, we can use this method.

  • keep the address of the first free block and a count of the following contiguous free blocks
    • e.g. 0xACE268AE 15 means that starting at that address, there are 15 contiguous free blocks

Efficiency

  • Disk-allocation and directory algorithms
  • Types of data kept in file’s directory entry
  • Pre-allocation or as-needed allocation of metadata structures
  • Fixed-size or varying-size data structures

Performance

  • keep data and metadata close
  • buffer cache
  • asynchronous
  • Free-behind and read-ahead – techniques to optimize sequential access
  • Reads are frequently slower than writes, since written contents are usually buffered in the cache and the disk sorts its output queue by disk address.

Hierarchical Page Tables (2-level)

a.k.a Forward Mapped Page Table

Assume { 32-bit machine } with { 4K page size }

4 KB = 2^12, so the page offset is 12 bits

m = 32, n = 12, so the page number = m - n = 20 bits

To use a 2-level page table, we page the page table itself:

---------------------------------------------
|  p1     |    p2    |    page offset       |
---------------------------------------------
   10bit      10bit        12 bit
  • p1 is an index into the outer page table
  • p2 is the displacement within the page of the inner page table

Suppose a process uses 256 MB memory.

That is 2^28 bytes (calculation: log2(256 * 1024 * 1024) = 28).

Each page is 4 KB (2^12).

So the number of pages the program needs = 2^28 / 2^12 = 2^16

Second-level page tables needed = 2^16 / 2^10 = 2^6 = 64 tables (each has 2^10 entries * 4 B = 4 KB)

Remember to add the 1st-level page table (4 KB)

Total space = (1 + 64) * 4 KB = 65 * 4 KB = 260 KB
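
A small sketch of how the MMU (or a software page-table walker) slices a 32-bit address under this 10/10/12 layout:

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint32_t vaddr = 0x12345678;              /* an arbitrary address */

    uint32_t p1     = (vaddr >> 22) & 0x3FF;  /* top 10 bits: outer index  */
    uint32_t p2     = (vaddr >> 12) & 0x3FF;  /* next 10 bits: inner index */
    uint32_t offset =  vaddr        & 0xFFF;  /* low 12 bits: page offset  */

    printf("p1 = %u, p2 = %u, offset = 0x%03x\n", p1, p2, offset);
    return 0;
}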

Keyword, Terminology

Microkernel

  • Remove all non-essential components -> move them to user level
    • Makes the kernel smaller; the kernel codebase is easier to maintain and to port
    • Performance hit: IPC requires switching between user mode & kernel mode
    • Adding a new service does not require modifying the kernel
    • Secure: most services run in user mode (restricted privilege)
    • Simpler, improves reliability
    • IPC method: message passing

Process scheduler

Long-term scheduler (aka job scheduler):
selects the processes that should be enqueued to the ready queue.

Short-term scheduler (aka CPU scheduler):
selects the process to execute next from the ready queue (dequeue) & allocates the CPU to run it.

Degree of multiprogramming:
the number of processes a system can run concurrently.

Medium-term scheduler:
the swapping scheduler; swap-in & swap-out.

Types of processes:
I/O-bound processes & CPU-bound processes.

Process Creation

  1. fork()
    Creates a new child process that duplicates the entire address space of the parent (see the sketch after this list).
    -> returns 0 to the child process;
    -> returns the pid (always > 0) of the child to the parent process;
    -> returns -1: no child was created, errno is set;

  2. exec()
    Replaces the process's memory space with a new program (used after fork()).

  3. exit()
    Terminates the current process.

  4. wait()
    Suspends execution of the current process until one of its children terminates;
    wait(NULL) means don't fetch the child's exit code;
    wait(&status) means fetch the child's exit code;

  5. abort() Terminates the whole program, causing all children to terminate (cascading termination); sends the SIGABRT signal.
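
A minimal sketch combining fork(), exec(), and wait() on Linux/Unix (the child just runs echo here):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void) {
    pid_t pid = fork();

    if (pid < 0) {                  /* fork failed; errno is set */
        perror("fork");
        exit(1);
    } else if (pid == 0) {          /* child: fork() returned 0 */
        execlp("echo", "echo", "hello from child", (char *)NULL);
        perror("execlp");           /* only reached if exec failed */
        _exit(127);
    } else {                        /* parent: fork() returned the child's pid */
        int status;
        wait(&status);              /* suspend until the child terminates */
        if (WIFEXITED(status))
            printf("child %d exited with %d\n", (int)pid, WEXITSTATUS(status));
    }
    return 0;
}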

Things that happen during process creation

  • New process control block
  • Memory allocation
  • Complete copy of parent memory and I/O state (stdout, stdin, file handles)
    • Same
      • Global & local variables
      • Process resources
      • Address space
    • Diff
      • PID
      • Running time
      • Running State

Zombie

If a terminated child's parent never issues wait(), the child's PCB is not released even though the child process has finished.
Unix does not clean up zombie processes by default, unless:
  - the user issues a 'kill' system call
  - the system reboots

Orphans

If the parent terminates without wait()ing for its child processes to finish, all its child processes become orphans.
The init process becomes the new parent of those orphans and calls wait() periodically to reclaim the orphans' resources.

File Descriptor (fd)

fd(0)   stdin
fd(1)   stdout
fd(2)   stderr

Pipes

One of the IPC (inter-process-communication) mechanisms.
Allow two processes to communicate.

Unidirectional & Bidirectional

  • Single direction: data streams from left to right
  • Bi-direction: data can stream both from left to right and from right to left

Half-duplex & Full-duplex

  • The first can only either read or write at a time; the second can read and write at the same time;

Anonymous pipe
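
A minimal parent-to-child sketch using pipe() (unidirectional: the parent writes, the child reads):

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    int fd[2];                      /* fd[0] = read end, fd[1] = write end */
    if (pipe(fd) < 0) { perror("pipe"); return 1; }

    if (fork() == 0) {              /* child: the reader */
        close(fd[1]);               /* close the unused write end */
        char buf[64];
        ssize_t n = read(fd[0], buf, sizeof buf - 1);
        buf[n > 0 ? n : 0] = '\0';
        printf("child got: %s\n", buf);
        close(fd[0]);
        return 0;
    }

    close(fd[0]);                   /* parent: the writer */
    const char *msg = "hi via pipe";
    write(fd[1], msg, strlen(msg));
    close(fd[1]);                   /* EOF for the reader */
    wait(NULL);
    return 0;
}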

Named pipe (FIFO)

  • https://en.wikipedia.org/wiki/Named_pipe
  • No parent-child relationship is necessary between processes
  • Bidirectional
  • Half-duplex or Full-duplex
  • Multiple processes can use single named pipe for IPC purpose
  • CMD for Unix/Linux create named pipe: mkfifo {your_named_pipe}
    • Act like a normal file
    • Delete: rm {your_named_pipe}

IPC

Model

  • Shared Memory Mapping
    • read() / write() to a shared-memory region
  • Message Passing (bus systems, e.g. D-Bus)
    • send() / receive()
    • Better performance on multicore systems
    • Can go across the network
    • Direct & indirect communication

Direct & Indirect

  • Direct: 1-to-1 connection
  • Indirect: through a mailbox; multiple links/connections

Synchronization

Blocking = synchronous

  • Blocking send: the sending process is blocked until the target process receives the message.
  • Blocking receive: the receiving process is blocked until a message is available (comes in).

NonBlocking = asynchronous

Amdahl's Law

  • S - serial portion (1 - S = parallel portion)
  • N - # of processing cores
  • Speedup <= 1 / (S + (1 - S) / N)
  • e.g. S = 0.25, N = 4: Speedup <= 1 / (0.25 + 0.75/4) ≈ 2.29

TCB (Thread Control Block)

  • Exec state (registers, PC, stack ptr)
  • Scheduling info (thread state, priority, cpu burst time)
  • Pointer to enclosing process -> PCB
| Shared by all threads | Standalone (kept in the TCB)                  |
|-----------------------|-----------------------------------------------|
| Contents of memory    | CPU registers                                 |
| I/O state             | Execution stack (parameters, temporaries, PC) |

Multithreading Model

  • https://www.studytonight.com/operating-system/multithreading
  • Many to One
    • Efficiency
    • When a thread blocks, the entire process (all other threads) is blocked. Only one thread can access the kernel at a time, so multiple threads cannot run in parallel on multiprocessors.
  • One to One
    • One blocking system call blocks only its own thread: more concurrency
    • Overhead of creating a kernel thread for each user thread
  • Many to Many
    • A combination of parallelism and efficiency
    • A blocking kernel system call does not block the entire process
    • The number of kernel threads may be specific to an application or a particular machine
  • Two-level Model
    • Many to Many + One to One

Thread Cancellation

  • Asynchronous cancellation
    • terminates the target thread at once
    • no matter what stage it is in (it might be in the middle of a shared-data update, which would corrupt the data)
  • Deferred cancellation
    • keeps polling to find a suitable time to terminate the target thread
    • achieves thread safety

Thread Creation
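
A minimal POSIX threads (pthreads) sketch of creating and joining threads; compile with -pthread:

#include <stdio.h>
#include <pthread.h>

void *worker(void *arg) {
    int id = *(int *)arg;
    printf("thread %d running\n", id);
    return NULL;
}

int main(void) {
    pthread_t tid[2];
    int ids[2] = {0, 1};

    for (int i = 0; i < 2; i++)
        pthread_create(&tid[i], NULL, worker, &ids[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(tid[i], NULL);   /* wait for both threads to finish */
    return 0;
}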

Thread Scheduling

  • LWP (light weight process) 1 to 1 bind to kernel thread
  • Many to Many & Many to One Model:
    • The thread library schedules user-level threads onto LWPs; a.k.a process-contention scope (PCS)
    • The kernel schedules kernel threads onto CPUs directly; a.k.a system-contention scope (SCS)
      • The one-to-one model uses SCS only

Scheduling Concept

  • CPU burst followed by I/O burst ( CPUBurst->I/OBurst->CPUBurst->I/OBurst...)
  • CPU utilization = keep the CPU as busy as possible
  • Throughput = number of processes that complete execution per time unit
  • Turnaround time = the time from arrival to completion (includes waiting time)
  • Waiting time = how long a process waits doing nothing in the ready queue

Dispatcher

  • the module that gives control of the CPU to the process selected by the short-term scheduler; this involves:
    • Context switching (save the state of the current process, then switch tasks)
    • Switching from kernel mode back to user mode
    • Jumping to the proper location in the user program to restart it, as indicated by its new state
    • The dispatcher has latency :(

Scheduling Method

FCFS

  • Convoy effect = short processes stuck behind a long one
    • consider: one CPU-bound process plus many I/O-bound processes <--- everything gets stuck

SJF

  • Greedy algorithm, which is optimal --> minimum average waiting time for the scheduled processes
  • Determining the next CPU burst: exponential averaging (see the sketch after this list)
    • t_n = length of the n-th CPU burst (already finished)
    • τ_{n+1} = predicted value of the next CPU burst
    • α, 0 <= α <= 1; usually α = 0.5
    • τ_{n+1} = α * t_n + (1 - α) * τ_n
    • computed incrementally at run time
  • The prediction is used as a priority, which is one of the scheduling policies
  • Can be preemptive or non-preemptive
  • SRTF = preemptive SJF
  • Problem: Starvation; low-priority processes may never execute (if high-priority processes keep coming in)
  • Solution: Aging; increase the priority of a process as time progresses.
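
A quick sketch of the exponential-averaging predictor above, with α = 0.5, an initial guess τ = 10, and a made-up burst history:

#include <stdio.h>

int main(void) {
    double alpha = 0.5;
    double tau = 10.0;                        /* initial prediction tau_0 */
    double t[] = {6, 4, 6, 4, 13, 13, 13};    /* measured CPU bursts      */

    for (int n = 0; n < 7; n++) {
        /* tau_{n+1} = alpha * t_n + (1 - alpha) * tau_n */
        tau = alpha * t[n] + (1 - alpha) * tau;
        printf("after burst %d (t = %g): next prediction = %g\n",
               n, t[n], tau);
    }
    return 0;
}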

RR

  • Time quantum, q
  • q usually 10ms to 100ms, context switch < 10 microseconds
  • If q is too large, it becomes FIFO (FCFS)
  • If q is too small, too many context switches are triggered, causing overhead
  • Note a special case: when the current process's quantum expires just as a new process arrives at the ready queue, the new process is enqueued first, and only then is the current process appended to the tail of the ready queue

Multilevel feedback Queue

  • Preemptive: Queue1 to QueueN (second level to bottom level) are preempted by a process arriving for Queue0 (top level)
  • Aging: RR q=8 -> RR q=16 -> FCFS
  • Note a special case: when the current process at Level_N (N > 0) is preempted before using up its whole quantum (slice), it stays at the same level and is inserted at the tail of that queue; it is NOT demoted to Level_N+1

Asymmetric multiprocessing

  • One processor does the scheduling
  • The others only execute user code

SMP (Symmetric multiprocessing)

  • Each processor schedules itself
  • Either all of them share a common ready queue,
  • or each of them owns a private ready queue

Processor affinity

A process has affinity for the processor it is currently running on, i.e. the processor's caches still hold the process's data from the last run, which pays off (cache hits) if the process is scheduled onto the same processor again.
Soft affinity: the OS attempts to keep a process running on the same processor, but does not guarantee it.
Hard affinity: a process can specify a subset of processors on which it may run.

Race condition

  • Multiple processes modify shared data concurrently; the result depends on the order in which the processes run.
  • This causes data inconsistency.

Critical Section

  • A segment of code in which:
    • a process may be changing shared variables
    • TO-DO: make sure no other process can be in its critical section while one process is executing in its critical section

Solutions

Consider~

  • Mutual Exclusion
    • While one process occupies the critical section, no other process may enter it; two processes are never inside the critical section at the same time.
  • Progress
    • When no process is executing in the critical section, processes that want to enter must not be blocked from doing so; one of the candidate processes must be selected to enter, and this selection cannot be postponed indefinitely.
  • Bounded waiting
    • The time a process waits to enter the critical section must be bounded, not infinite. In other words, a process cannot occupy the critical section and never come out.

Approaches~

  • Preemptive: allows preemption of a process while it is running in kernel mode; not free from race conditions, and increasingly difficult on SMP architectures.
  • Non-preemptive: a process runs until it exits kernel mode, blocks, or voluntarily yields the CPU. This is essentially free of race conditions in kernel mode.

General Structure for multiprocess programming

do {
    // entry section
        // critical section
    // exit section
        // remainder section
} while(true);

Peterson's Solution For Critical Section Problem

  • Atomic instruction : cannot be interrupted
  • Functions and var that used:
    • load()
    • store()
    • int turn; // indicates whose turn it is to enter the C.S.
    • bool flag[2]; // flag[n] == true means process n is ready and requesting to enter the C.S.
  • Code:
// In process P_{i}
do {
    flag[i] = true;
    turn = j; // even if a context switch occurs here, mutual exclusion still holds
    while (flag[j] == true && turn == j); // while P_{j} wants in and it is P_{j}'s turn, P_{i} waits
        // critical section
    flag[i] = false;                   // after P_{i} is done with the C.S., reset it for P_{j} to use
        // remainder section
} while (true);
...
// In process P_{j}
do {
    flag[j] = true;
    turn = i;
    while (flag[i] == true && turn == i); // while P_{i} wants in and it is P_{i}'s turn, P_{j} waits
        // critical section
    flag[j] = false;                   // after P_{j} is done with the C.S., reset it for P_{i} to use
        // remainder section
} while (true);

Synchronization Hardware : atomic hardware instruction:

Test And Set (TAS) Instruction

http://web.cecs.pdx.edu/~walpole/class/cse513/slides/3.pdf

The variable lock is initialized to false; then we have the function definition:

volatile bool lock = false; // global variable

bool test_and_set(bool *target){
    bool original_input = *target;
    *target = true;     // always sets it to true; spinning waiters keep re-setting it
    return original_input;
}

and,

do{
    while (test_and_set(&lock)); /* can only get in when lock == false */
    /* critical section */
    lock = false;  // release the lock, let another process in
    /* remainder section */
} while (true);

Problems

  • Notice that this solution does NOT guarantee Bounded waiting / fairness.
    • Any process may be unlucky and wait forever before it gets its turn to enter the C.S.
    • When the lock is released, the chance of acquiring it is not fair.
  • High bus traffic: waiters keep setting lock to true while spinning.
Bounded-waiting Test And Set (TAS) solution

do{
    waiting[i] = true;
    key = true;
    while( waiting[i] && key == true)  // <----- wait or continue
        key = test_and_set(&lock);
    waiting[i] = false;
    
    /* critical section */
    
    /* release lock or let others continue */
    j = (i + 1) % n; 
    while( (j != i) && !waiting[j] )
        j = (j + 1) % n;
    if (j == i) { lock = false; }
    	else { waiting[j] = false; }
    
    /* remainder section */
} while (true);

How does this achieve bounded waiting / fairness?

  1. It finds the next process that is in the waiting state.
  2. The code from j = (i + 1) % n to while ((j != i) && !waiting[j]) ... searches for the next process that is waiting;
  3. If that process turns out to be the current process itself, nobody else is waiting, so release the lock.
  4. If it is a different process, set that process's waiting[j] = false (it then enters the C.S. without re-acquiring the lock).

Compare And Swap (CAS) Instruction

https://en.wikipedia.org/wiki/Compare-and-swap

volatile bool lock = false;

/** 
  * If input is equal to expected value, then set it to new value.
  * It will always return the original value of param {input}.
  */
bool compare_and_swap(bool *input, bool expected, bool new_value){
    bool original_input = *input;
    if (*input == expected)
        *input = new_value;
    return original_input;
}

and,

do{
    /* if lock is false, we can enter the C.S. */
    while (compare_and_swap(&lock, false, true) == true);
    /* critical section */
    lock = false; // release
    /* remainder section */
} while (true);

Actually CAS and TAS are very similar; the difference:

  • TAS changes the contents of a memory location and returns its old value as a single atomic operation.
  • CAS atomically compares the contents of a memory location to a given value and changes them to a new value only if they match.
  • ...which means that with CAS, waiters don't keep setting lock to true while spinning.

Problems

  • Same problem as the previous solution: it is NOT bounded-waiting / fair.

  • To achieve that, the fix is exactly the same as the bounded-waiting TAS solution.

Mutex Lock

http://welkinchen.pixnet.net/blog/post/47071066-spinlock-%26-mutex-%26-semaphore-%E7%9A%84%E4%BD%9C%E7%94%A8%E5%92%8C%E5%8D%80%E5%88%A5

'Mutex' is a general, abstract concept of a lock;

a spin lock is one synchronization mechanism for mutual exclusion;

but in real life, the Win32 and Linux APIs do have a mutex lock, and note that mutex = binary semaphore.

In the Linux API:

A mutex is extremely similar to a semaphore: when a process fails to acquire the lock, it calls sleep() and is enqueued on a waiting queue, which makes the kernel context-switch to another process; when that other process releases the lock, the previous process is woken up. It is also sleep-waiting.

Spin Lock

http://blog.hitripod.com/synchronization-mechanism-comparison-spinlock-mutex/

http://www.cs.cornell.edu/courses/cs4410/2015su/lectures/lec06-spin.html

  • Repeatedly checking if the lock is available using atomic instruction (like test_and_set())
  • Kind of Busy Waiting, not sleeping
  • Bad things: waste CPU cycles due to busy waiting
  • Good things: Avoid overhead from OS process rescheduling of context switching
    • Performance better than Semaphore
volatile bool available = true; /* true = lock is free */

/* NOTE: the check-then-set below is not atomic as written; a real
   spin lock does this step with an atomic instruction such as
   test_and_set() */
void acquire_lock(){
    while ( !available ); /* busy waiting */
    available = false;
}

void release_lock(){
    available = true;
}

do {
    acquire_lock();
    /* critical section */
    release_lock();
    /* remainder section */
} while (true);

Semaphore

https://en.wikipedia.org/wiki/Semaphore_(programming)

  • Must: no more than one process may execute wait() and signal() on the same semaphore at the same time (this is itself a critical-section problem).
    • Disabling interrupts would work on a single-processor system.
    • Otherwise, wait() and signal() must be placed inside a critical section.
  • Semaphore with busy waiting
    • The busy-waiting implementation's value never becomes negative; the waiting-queue implementation below can go negative.
    • If it is negative, its magnitude is the number of processes currently waiting on the semaphore (e.g. -10 means 10 processes are waiting).

Counting Semaphore

Used to control access to a resource consisting of a finite number of instances. The semaphore value S is initialized to the number of available resources.

Binary Semaphore

Semaphore value can be either 0 or 1; behaves like a mutex lock (spin lock).

volatile int S = 1; /* 1 = free */

void wait(){
    while( S <= 0); /* busy wait */
    S--;
}

void signal(){
    S++;
}

Semaphore with no Busy waiting

Each semaphore has an associated waiting queue; each node contains a value and a pointer to the next node (just a queue implemented as a linked list).

Operation

  • block() (a.k.a sleep() )
    • place the process invoking the operation on the appropriate waiting queue
  • wakeup()
    • remove one of processes in the waiting queue and put it in ready queue
/* semaphore object */
typedef struct{
    int value;
    struct process *list;
}semaphore;

void wait(semaphore *S){
    S->value--;
    if ( S->value < 0 ){
        // add the calling process to the waiting queue
        // (current_process stands for the caller's PCB - pseudocode)
        enqueue(current_process, S->list);
        block(); // sleep
    }
}

void signal(semaphore *S){
    S->value++;
    if (S->value <= 0){
        // remove head process from waiting queue
        process *P = dequeue(S->list);
        wakeup(P); // wake it up
    }
}

So, what is the difference between them?

Semaphore Vs Spin Lock Vs Mutex

  1. A semaphore can allow a configurable number of acquisitions (multiple processes in the critical section, with deadlock as a possible side effect); a spin lock CANNOT.
  2. A mutex lock can only be released by the process that acquired it; a semaphore has no such restriction, any process can release it.
  3. A spin lock performs badly on a single-processor system: it wastes resources, slows down the owner of the lock, and can cause deadlock.
    • A spin lock is good for protecting a code segment for a short time.
    • A semaphore is good for long-lasting, large code-segment protection.
  4. A spin lock busy-waits (no context switch); semaphores and mutexes sleep-wait.

Deadlock

http://mropengate.blogspot.com/2015/01/operating-system-ch7-deadlock.html

http://www.csie.ntnu.edu.tw/~swanky/os/chap5.htm

Two or more processes wait indefinitely for an event that can be caused by only one of the waiting processes.

How Deadlock happen?

All 4 of these conditions must hold:

  • Mutual exclusion

  • Hold and wait

    • a process holds some resources while waiting for resources held by other processes
  • No preemption

    • a process cannot forcibly take resources held by another process
  • Circular waiting

    • there exists a set of processes {P0, P1, ..., PN} such that P0 waits for P1, P1 waits for P2, ..., and PN waits for P0, forming a circular wait

Starvation

Indefinite blocking: a process is never woken up (wakeup()) from the semaphore's waiting queue.

Priority Inversion (a.k.a Unbounded Priority inversion)

Consider there are 3 process:

  • LP (low-priority process) requires and is accessing the C.S.

  • MP (medium-priority process)

  • HP (high-priority process) requires the C.S.

  1. The scheduling problem: the lower-priority process holds a lock on a C.S. resource that HP wants.
  2. Normally, after LP finishes and leaves the C.S., HP continues executing and enters the C.S.
  3. But if MP comes in: MP requests neither the lock nor the C.S., yet MP still preempts LP's execution.
  4. HP preempts MP, but HP wants the C.S. and the lock is held by LP, so HP can do nothing until LP releases the lock.
  5. MP then keeps running: since MP has higher priority than LP, LP cannot release the lock until MP finishes, which keeps HP blocked.
  6. Finally MP finishes, LP runs and releases the lock, and then HP executes.
  7. Resulting execution order: MP -> LP -> HP

Solution - Priority-inheritance protocol

  • All processes accessing a C.S. needed by a higher-priority process inherit that higher-priority process's priority number.

  • Once they are finished with the C.S., their priorities revert to the originals.

God Damn Problem 1 - Bounded-Buffer Problem (a.k.a Producer-consumer problem)

Assume the pool consists of n buffers, each capable of holding one item.

Producer and consumer share the following:

int n;
semaphore mutex = 1; // mutual exclusion for access to the buffer pool
semaphore empty = n; // counts empty buffers
semaphore full  = 0; // counts full buffers

// RECALL THAT :D
/* semaphore object */
typedef struct{
    int value;
    struct process *list;
}semaphore;

void wait(semaphore *S){
    S->value--;
    if ( S->value < 0 ){
        // add the calling process to the waiting queue
        // (current_process stands for the caller's PCB - pseudocode)
        enqueue(current_process, S->list);
        block(); // sleep
    }
}

void signal(semaphore *S){
    S->value++;
    if (S->value <= 0){
        // remove head process from waiting queue
        process *P = dequeue(S->list);
        wakeup(P); // wake it up
    }
}


/* producer process */
do {
    //...
    /* produce an item in next_produced */
    //...
    wait(empty);
    wait(mutex);
    //...
    /* add next produced to the buffer */
    //...
    signal(mutex);
    signal(full);
} while (true);

/* consumer process */
do {
    wait(full);
    wait(mutex);
    // ...
    /* remove item from buffer to next_consumed */
    // ...
    signal(mutex);
    signal(empty);
    // ...
    /* consume the item in next_consumed */
    // ...
} while(true);

God Damn Problem 2 - Readers-Writers Problem

https://en.wikipedia.org/wiki/Readers–writers_problem#First_readers-writers_problem

A data set is shared among multiple concurrent processes:

  • Readers - only read the data set (read-only)
  • Writers - can do both r/w

Problems?

Resolve these:

  • Only one writer may access the shared data at a time;

  • Many readers may read it at the same time.

int Dataset[] = { 1, 2, 3, 4};
semaphore rw_mutex = 1;
semaphore mutex = 1;
int read_count = 0;


/* writer */
do {
    wait(rw_mutex);
    // ...
    /* writing is performed */
    // ...
    signal(rw_mutex);
} while (true);

/* reader */
do {
    wait(mutex);
    read_count++;
    if (read_count == 1) { wait(rw_mutex); }
    signal(mutex);
    //...
    /* reading is performed */
    //...
    wait(mutex);
    read_count--;
    if (read_count == 0) { signal(rw_mutex);}
    signal(mutex);    
}while (true);

First variation

No reader is kept waiting unless a writer has already gained access to the object. This can cause starvation for writers and delay updates to the object.

Second variation

Once any writer is ready, it performs its update as soon as possible: if a writer is waiting to access the object (i.e. a reader is currently reading), NO new reader may start reading; new readers must wait until the writer has updated the object. After the current reader finishes, the writer gets its turn; after the writer finishes, the new readers get theirs.

Visualize the 2nd Variation:

reader (reading)  <-- writer wants (waits)

...here, a new reader comes:

reader (reading)  <-- writer wants (waits) <-- new reader (waits)
reader (finishes) <-- writer wants (waits) <-- new reader (waits)
writer (writing)  <-- new reader (waits)
new reader (reading)

  • Starvation is still possible in both variations
    • Can be solved by kernel-provided reader-writer locks: multiple processes may concurrently acquire a reader-writer lock in READ mode, but only 1 process may acquire it in WRITE mode.

God Damn Problem 3: Dining (Philosophers) Sucker Problem

https://zh.wikipedia.org/wiki/%E5%93%B2%E5%AD%A6%E5%AE%B6%E5%B0%B1%E9%A4%90%E9%97%AE%E9%A2%98

One of the most interesting problem abstractions I have ever seen LOL

Philosopher

  • thinking
  • eating
  • does not interact with the others
  • requires 2 chopsticks to eat (can only take the ones on the left and right)

If philosophers (processes) = 5 and bowls of rice (shared data set) = 5, then the number of chopsticks (semaphores) = 5

How Deadlock occur?

The original problematic Philosopher Process (Deadlock)

// i is the index of the current philosopher
do {
    // a philosopher needs 2 chopsticks to eat
    wait(chopstick[i]);          // pick up the left one first
    wait(chopstick[(i+1) % 5]);  // then the right one

    /** Eat la eat la ... **/

    // eating finished, release both chopsticks
    signal(chopstick[i]);         // release left
    signal(chopstick[(i+1) % 5]); // release right

    // they start to think alone

} while (true);

What are the problems:

  • we must guarantee that no 2 neighbors eat at the same time
  • among the 5, only 2 people (non-neighbors) can eat at once
  • if all 5 philosophers want to eat at the same time:
    • each of them grabs the left chopstick and waits for the right one... (or the reverse)
    • but the right chopstick is the neighbor's left chopstick (which is already picked up)
    • they all wait for the right chopstick until one of them finishes eating... but NO ONE can
    • DEADLOCK

Or... you can try to solve it by:

  • philosophers who hold one chopstick and are waiting for the other must release the chopstick they already hold after 5 minutes

Deadlock is solved!!! but...

  • If the 5 of them enter the restaurant at the same time and pick up their left chopsticks at the same time, they deadlock again, so they all wait 5 minutes concurrently
  • After waiting, they all release their chopsticks at the same time and wait another 5 minutes together
  • ...and then they all pick up their left chopstick again... LOL
  • an endless loop, and no one gets to eat (this is livelock)
  • a low-probability event, but it does happen!

Solution

1. Waiter

The waiter is an observer who knows which chopsticks are in use.

  • philosophers need the waiter's permission to pick up chopsticks

Example:

If philosophers A and C are eating

-> B cannot eat because neither chopstick next to B is free

-> D and E each have only 1 free chopstick next to them, so they cannot eat

If D just picked up a chopstick anyway, deadlock could occur

So D must ask the waiter, and the waiter makes him wait until 2 chopsticks are free (guaranteeing he can eat), preventing deadlock.

It is also called Monitor solution:

// Monitor act like an observer - The Waiter
Monitor DiningPhilosophers{
    enum{Thinking,Hungry,Eating} state[5];
    Condition self[5];
    
    void pickup(int i){
        state[i]=Hungry;
        // check if I able to eat
        test(i);
        // I cannot eat, so I wait
        if(state[i]!=Eating) self[i].wait();
    }
    
    void putdown(int i){
        // I finish eating, then I think
        state[i]=Thinking;
        // ask left&right neighbors
        // do they want or able to eat?
        test((i+4)%5);
        test((i+1)%5);
    }
    
    /** 
    	when neighbors are not eating 
        and I am hungry, I eat
    **/
    void test(int i){
        if ((state[(i+4)%5] != Eating) &&
            (state[i] == Hungry) &&
            (state[(i+1)%5] != Eating))
            self[i].signal(); // don't wait, eat
    }
    
    void initialize(){
        for(int i=0;i<5;++i)
            state[i]=Thinking;
    }
}

Note that the condition variable self here represents a philosopher, and its waiting queue (for suspension) holds at most 1 process.

Each philosopher (process) should run like:

DiningPhilosophers.pickup(i);
/** if able to eat, eat here **/
DiningPhilosophers.putdown(i);

2. Resource hierarchy

Label the chopsticks with IDs, in order.

  • Philosophers may only pick up the lower-ID chopstick first, then the higher-ID one.
  • After eating, a philosopher always releases the higher-ID chopstick first.
  • If we have 5 men and all 5 want to eat, the last man cannot, because his lower-ID chopstick has already been picked up; he must wait for his neighbor to finish eating (prevents deadlock perfectly)

3. Chandy & Misra 's Solution

Philosophers who want to eat but hold no chopsticks get a chopstick coupon.

  • a philosopher holding a coupon gives it to one who has finished eating, and that one hands his chopstick to the coupon holder who wants to eat.
  • NO ONE may hold TWO coupons.
    • This makes sure that whoever holds the chopsticks can eat.

Monitor

Provides better management of synchronization

Schematic

  • shared data defined inside the monitor: NOT the critical-section data itself, but things like state, status, condition variables, priority, id, etc.
  • shared data is changed only through the monitor's methods (functions)
  • initialization code (like a constructor)
  • only one process can be active inside the monitor at a time
  • has an attached entry queue (incoming processes)
  • holds multiple condition variables

Condition Variables

Each condition variable represents a condition (similar to a semaphore), and it holds a queue of processes.

Condition x, y, z ...;

So each condition variable also has these 2 functions:

x.wait()

Moves the currently active process into the suspend list; it must release the monitor to the other processes.

x.signal()

  • Resumes one of the processes that invoked x.wait() earlier

  • Does nothing if no wait was called before

  • Different from a semaphore's signal() (a semaphore increments a number; this moves a process from the suspend queue back to the ready queue)

Case: Handle process switch in Monitor

If P calls x.signal() while Q is suspended in x.wait()...

Possible options

  • Signal & wait:

    when P calls x.signal(), one process in x's waiting queue (process Q in this example) is resumed immediately; even if P has not finished, it is forcibly suspended and placed into the waiting queue, and Q gets to run

  • Signal & continue:

    when P calls x.signal(), the monitor knows that process Q should be resumed, but does not resume it immediately; Q is resumed only after P has run to completion

Using Semaphore to Create a Monitor?

Possible.

/** init = 1 **/
Semaphore mutex;

/** represents the entry queue;
    a signaling process also suspends itself
    and goes here
    init = 0
**/
Semaphore next;

/** counts the number of processes suspended on next
**/
int next_count = 0;
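
With mutex, next, and next_count, each monitor procedure F and each condition variable x can be built as in the following sketch. This is the usual textbook construction; x_sem and x_count are per-condition bookkeeping introduced here for illustration, and "body of F" is a placeholder:

/* every monitor procedure F is bracketed like this */
wait(mutex);
    /* ... body of F ... */
if (next_count > 0)
    signal(next);    /* hand the monitor to a suspended signaler */
else
    signal(mutex);   /* otherwise reopen the monitor entry */

/* per condition x: Semaphore x_sem (init = 0); int x_count = 0; */

/* x.wait() */
x_count++;
if (next_count > 0) signal(next);
else                signal(mutex);
wait(x_sem);         /* sleep until someone signals x */
x_count--;

/* x.signal() : signal & wait semantics */
if (x_count > 0) {
    next_count++;
    signal(x_sem);   /* wake one waiter on x ... */
    wait(next);      /* ... and suspend ourselves on next */
    next_count--;
}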

Problems with Semaphore

1. Incorrect use:

reverse the order: ❌ signal(a) ... wait(a)

wrong pair: ❌ wait(a) ... wait(a) / wait(a) ... signal(b)

missing: ❌ wait(a) ...

2. Deadlock

3. Starvation

Resource Graph

Denote definition:

Resource types R1, R2, ..., Rm => CPU cycles, memory space, I/O devices;

each resource Ri has Wi instances, and;

each process uses a resource in the order: request, use, release.

Remember Graph Theory Basic?

Request Edge - Directed Edge Pi -----> Rj

Assignment Edge - Directed Edge Rj -----> Pi

(Figure: a resource-allocation graph with a cycle; the image is not included here.) That graph has a deadlock.
Why? Because P3 requests R2, which is held by P2, and at the same time P2 requests R3 (which is held by P3), forming a cycle.

Chapter 10 FS

Basic File Attribute

  • Name
  • Identifier
  • Type (MIME)
  • Location
  • Size
  • Permission, Access Control List
  • Time, date
  • Checksum

Information like these are kept in Directory structure. Part of it can be cached in RAM.

Open File Lock

Shared lock - reader lock - multiple processes can acquire it concurrently

Exclusive lock - writer lock - only one process can hold it at a time to write

Single-Level directory

Two-Level directory

Separate dir for each user

  • NO grouping capability

Tree structured directory

  • Efficient searching

  • Group capability

Acyclic-Graph directory

  • Using Link (symbolic link and hard link)

  • Difficult to ensure no cycles in acyclic graph

General Graph directory

  • Allow links only to files, not to subdirectories

  • Every time a new link is added, a cycle-detection algorithm must run to determine whether a cycle exists - time consuming

Chapter 11 Implementing FS

Layered FS

application program > logical FS > file organization module > basic FS > I/O control > device driver

I/O control, device driver

device drivers and interrupt handlers; issue commands to the disk head

Basic FS

  • manages the caches and buffers used for block allocation, freeing, and replacement

File organization module

  • translates logical block addresses to physical block addresses

  • manages free disk space and disk block allocation

Logical FS

  • metadata - the FS structure data: data about data
  • manages the directory structure
  • translates file name -> file number / file handle by maintaining File Control Blocks

File Control Block

Inode (index node) - in the Unix file system

contains information about permissions, ownership, and the location of the file's contents on disk

Two things must be considered to implement a FS

On disk structure

boot-loader, num of blocks, location of free blocks

  • Boot control block (per volume)
    • contains the bootloader; the first block, called the boot block (UFS) or partition boot sector (M$)
  • Volume control block (per volume)
    • contains volume/partition data; called the superblock (UFS) or master file table (M$)
  • Directory structure (per FS)
    • file name and associate inode number
  • FCB (inode) (per file)

In memory information

Used for performance improvement via caching. Includes data loaded at mount time and updated during FS operations.

  • mount table - contains data about each mounted volume
  • system-wide / per-process open-file tables
    • contain the inodes (File Control Blocks) of all open files / of the files a process has opened
  • buffers hold inodes when r/w is required

Directory implementation

Linear List

  • simple
  • bad time complexity due to linear search
    • can be improved by caching
    • or kept ordered with a linked list / B+ tree
  • e.g. Btrfs uses B+ trees, Ext4 uses a vector table

Hash Table

  • O(1) search time
  • collision problems
  • only good if the size is fixed, or with a chained-overflow method

... Continue with FS Allocation Methods.md

Chapter 12 mass storage system

HDD

  • positioning time, aka. random access time = seek time + rotational latency

    • seek time = time for disk arm move to desired cylinder
    • rotational latency = time for desired sector to rotate under disk head
  • Head crash = disk head making contact with disk surface. Then the Disk GG.

  • I/O bus

    • EIDE (IDE), SATA, USB Fibre Channel, SCSI, SAS
    • Disk controller - built-in cache
      • Data transfer at drive - between cache and disk surface
      • Data transfer to host - between cache and host controller
    • Host controller - end of the buses (e.g. SATA controller), using memory-mapped I/O ports, send commands via message to Disk controller perform disk I/O operations.
  • Performance

    • Average Seek time

      • 1/3 * number of tracks
      • fastest: 3ms + 2ms = 5ms
      • slow: 9ms + 5.56ms = 14.56ms
    • Latency based on spindle speed

      • Latency for one full rotation = 60 / RPM (seconds)
      • Average latency = latency * 0.5
    • Average I/O time

      • average access time+ (amount to transfer / transfer rate) + controller overhead

      • e.g. 5 ms (average seek time) + 4.17 ms (7200 RPM disk) + 4 KB (data to transfer) / 1 Gb/s (transfer rate) + 0.1 ms (controller overhead)

        = 9.27 ms + 0.03 ms = 9.30 ms

        • 1 KB/s = 1024 * 8 = 8192 bps
        • 1 Gbps = 2^30 bps = 1,073,741,824 bps
        • 1,073,741,824 / 8192
        • ==> 1 Gb/s = 131072 KB/s, so 4 KB takes ≈ 0.03 ms

SSD

  • single-level cell (SLC), multilevel cell (MLC)
  • Require faster bus interface (SATA is too slow, need PCIE)
  • Pros
    • more reliable => no mechanical moving parts
    • much faster => no seek time and rotation latency
    • consumes less power
  • Cons
    • expensive per MB
    • less capacity
    • shorter life span

Magnetic Tape

  • extreme slow (random) access time
  • always used for data backup
  • large capacities

Disk Structure

  • Drives are addressed as a vector of logical blocks
    • logical block - the smallest unit of transfer
    • usually 512 bytes on traditional HDDs
    • modern HDDs / SSDs use 4K blocks
  • The vector of logical blocks is mapped onto the sectors of the disk sequentially

NAS (Network-Attached Storage)

  • Implemented via remote procedure calls (RPCs)
    • over network protocol: TCP/UDP based
      • e.g. NFS, CIFS
  • Performance is less efficient than direct-attached storage protocols, like the FC used by SANs
    • consumes network bandwidth; increases latency

SAN (Storage Area Network)

  • FC (Fibre channel) - SAN (Storage area network)
    • arbitrated loop (FC-AL) can address 126 devices, including drives and controllers
  • Provide flexibility
    • multi hosts and multiple storage arrays can attach to same SAN
    • storage can dynamically allocated to hosts
  • High performance
    • large bandwidth
    • block-level data transfer - faster
    • Drive act as direct attach device

Disk Scheduling

https://www.geeksforgeeks.org/disk-scheduling-algorithms/

Why it is important?

  • Multiple I/O requests may arrive by different processes and only one I/O request can be served at a time by disk controller. Thus other I/O requests need to wait in waiting queue and need to be scheduled.
  • Two or more request may be far from each other so can result in greater disk arm movement.
  • Hard drives are one of the slowest parts of computer system and thus need to be accessed in an efficient manner.

How to Calculate DISK ACCESS TIME?

Disk Access Time = Seek Time + Rotational Latency + Transfer Time

More details:

  • Seek Time: the time taken to move the disk arm to the track where the data is to be read or written. A disk-scheduling algorithm that gives a lower average seek time is better.
  • Rotational Latency: the time taken by the desired sector of the disk to rotate into position under the read/write head. A disk-scheduling algorithm that gives lower rotational latency is better.
  • Transfer Time: the time to transfer the data. It depends on the rotation speed of the disk and the number of bytes to be transferred.

FCFS (first-come-first-serve)

Just follow the request order to move disk head.

Pros:

  • Every request has a fair chance
  • No indefinite postponement

Cons:

  • Seek time is NOT optimal
  • May not provide best possible service

SSTF (shortest-seek-time-first)

Choose the pending request closest to the current head position, whether forward or backward.

Pros:

  • shortest seek time. Disk access time is optimal.

Cons:

  • Possible starvation

SCAN Scheduling (Elevator algorithm)

The disk arm moves in one direction and services the requests in its path until it reaches the end of the disk, then reverses.

Characteristic :

  • Requests in the mid-range are serviced more often
  • Requests arriving just behind the disk arm have to wait

Pros:

  • High throughput
  • Low variance of response time
  • Average response time

Cons:

  • Long waiting time for requests at locations the disk arm has just visited.

C-SCAN (Circular-SCAN Scheduling)

Same as SCAN scheduling, but when the disk head reaches one end it does NOT reverse direction; it jumps to the other end of the disk and starts servicing requests from there.

Pros:

  • More uniform waiting time compared to SCAN

LOOK and C-LOOK

  • Another version of SCAN and C-SCAN

Difference:

  • The disk head does not go all the way to the end of the disk
  • ...it only goes as far as the last request location to be serviced
  • And for C-LOOK, when it reaches the last request, it jumps back to the FIRST REQUEST at the other end

Pros:

  • Prevent extra delay for going to the end

Which Disk-scheduling algo should I use?

  • the common case - go SSTF
  • heavy load on the disk - go LOOK / C-LOOK
    • less likely to cause starvation
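
A small C comparison of total head movement under FCFS vs SSTF, using the classic textbook request queue with the head starting at cylinder 53:

#include <stdio.h>
#include <stdlib.h>

/* the classic textbook request queue */
int req[] = {98, 183, 37, 122, 14, 124, 65, 67};
#define NREQ ((int)(sizeof req / sizeof req[0]))

int fcfs(int head) {
    int moved = 0;
    for (int i = 0; i < NREQ; i++) {
        moved += abs(req[i] - head);
        head = req[i];
    }
    return moved;
}

int sstf(int head) {
    int done[NREQ] = {0}, moved = 0;
    for (int k = 0; k < NREQ; k++) {
        int best = -1;                 /* nearest pending request */
        for (int i = 0; i < NREQ; i++)
            if (!done[i] &&
                (best < 0 || abs(req[i] - head) < abs(req[best] - head)))
                best = i;
        moved += abs(req[best] - head);
        head = req[best];
        done[best] = 1;
    }
    return moved;
}

int main(void) {
    printf("FCFS: %d cylinders\n", fcfs(53));   /* 640 */
    printf("SSTF: %d cylinders\n", sstf(53));   /* 236 */
    return 0;
}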

Disk Management

Low-level formatting

a.k.a physical formatting

  • divide a disk into sectors
    • for disk controller r/w
    • 512 bytes each sector
      • header & trailer - sector number, ECC (error-correction code)

Partition

  • Separates the disk into multiple groups of cylinders; each partition records its own data structures

Logical formatting

  • stores file-system data

Cluster

  • chunk of blocks

  • Increase efficiency

    • Disk I/O done in blocks
    • File I/O done in clusters

Raw Disk

  • A disk partition used as a large sequential array of logical blocks, with NO file system.
  • For applications that want to do their own block management and keep the OS out of the way (databases, for example) - raw I/O bypasses all file-system services such as space allocation, file names, and directories

Sector sparing

  • invisible to OS
  • used to handle bad blocks

RAID (Redundant Array of Independent Disks)

  • many cheap disks composed together
  • for high reliability via redundancy & high data-transfer rates
  • mirroring
  • NVRAM (solid-state nonvolatile RAM) can ride out power failures -> add it to the RAID

Parallelism & striping

goals:

  • increase the throughput of many small accesses via load balancing
  • reduce the response time of large accesses

Bit-level striping:

  • With 8 disks running bit-level striping, write bit i of each byte to disk i.
  • The array of 8 disks is treated as a single disk with sectors 8 times larger; the access rate goes up 8x.
  • The number of disks should be a multiple of 8, or divide 8
    • e.g. with 4 disks, bits i and 4+i of each byte are stored on disk i (say i = 0: bits 0 and 4 go to the same disk)

Block-level striping:

  • a file's blocks are striped across multiple disks
  • with n disks, block i of a file goes to disk (i mod n) + 1

Common RAID Structure

RAID 0:

  • no redundancy
  • block-level striping, pure performance gain

RAID 1:

  • only redundancy
  • duplicate each disk, called mirroring / shadowing

RAID 2:

  • Hamming code, Error correction code

RAID 3:

  • one disk store parity
  • bit interleaved parity

RAID 4:

  • one disk store parity
  • block interleaved parity

RAID 5:

  • distributed parity
  • block-level striping

RAID 6:

  • similar to RAID 5, but adds one more parity block

Hybrid RAID (0 + 1 or 1 + 0)

  • high performance & high redundancy

More feature

  • snapshot: saves the state before the last update, for recovery
  • replication: automatic duplication of writes between separate sites, for redundancy and disaster recovery
  • hot spare: on disk failure, a spare disk is automatically swapped into the RAID

Chapter 8 Memory Management

Common sense:

  • a program is brought from disk into memory => the process runs
  • main memory, cache, and registers are the only storage the CPU can access directly
  • the CPU accesses secondary storage (HDD) via I/O controllers

Hardware Address Protection

Using Base registers and Limit registers

  • Define the address space (start & end) of a process
  • The CPU ensures that every memory access is within the range:
  • greater than or equal to Base & smaller than Base + Limit

Address Binding

Source code addresses are symbolic.

Compiled code addresses are bound to relocatable addresses

  • e.g. "14 bytes from the start of this module"

The linker / loader binds relocatable addresses to physical addresses

  • 74000 + 14 = 74014

Each binding maps one address space to another (1-to-1 binding)

Program Address Binding Stage

Compile time:

If the target memory address is known in advance, absolute code can be generated (e.g. a boot loader); the code must be recompiled if the program's starting location in RAM changes.

Load time:

The compiler generates relocatable code if the memory location is not known at compile time (most programs). Just reload the code to incorporate a changed value (linking).

Execution time:

If the process can be moved during execution from one memory segment to another. This needs hardware support for address maps (base & limit registers); examples are dynamic libraries (*.so).

Logical & Physical Address Space

  • Logical address - generated by the CPU; the virtual address

  • Physical address - the address seen by the memory unit, produced by the MMU (Memory-Management Unit)

Logical and physical addresses are the SAME under compile-time and load-time address-binding schemes;

... they DIFFER only under the execution-time address-binding scheme.

Dynamic loading (loading a routine only when it is first called) provides better memory-space utilization.

Swap

A temporary backing store (usually a disk drive)

  • Increases the degree of multiprogramming

Swap-in, Swap-out

A lower-priority process is swapped out so that a higher-priority process can be loaded (swapped in) and executed

Main Memory Allocation

http://users.cs.cf.ac.uk/O.F.Rana/os/lectureos8/node4.html

Contiguous Method

  • Traditional - just check against base and limit, then add directly.
  • Multiple-partition allocation
    • the number of partitions limits the degree of multiprogramming
    • Hole - a block of available memory
    • Variable partitions - jobs use as much space as they need
    • allocate a large-enough hole to a newly arriving process
    • when a process exits (frees its memory), adjacent free partitions are combined

Dynamic Storage Allocation Problem

First fit

Allocate first available hole

Best fit

Allocate the tightest (smallest) available hole

Worst fit

Allocate largest available hole

  • First-fit / Best-fit >>>>>>> Worst-fit ... in term of speed and storage utilization

External Fragmentation

Total memory space exists to satisfy a request, but it is not contiguous (enough space, just not contiguous)

Internal Fragmentation

Memory allocated to a process may be larger than requested; this size difference is memory internal to a partition that is not being used (allocating too much space creates internal waste)

Compaction

Reduce external fragmentation.

Shuffle memory contents to place all free memory together in one large block.

  • only works if relocation is dynamic
  • done at execution time
  • expensive

Segmentation

It avoids _internal fragmentation_ only; external fragmentation still exists.

Program is a logical unit including:

main program, procedure, function, local variables, global variables, stack, symbol table, arrays...

! Just divide these parts and allocate memory for each of them.

Logical address = < Segment-number , offset >

Segment table

Maps 2D logical addresses to a 1D vector of physical addresses; each entry consists of a base and a limit. (The table itself is located via the STBR, segment-table base register, and its length via the STLR, segment-table length register.)

Protection

Each entry in segment table with:

  • Validation bit = 0 ( illegal segment )
  • read/ write/ exec permission

Sharing

Different process can share same segment (e.g dynamic linking library)

Paging

Aims to solve external fragmentation and the problem of varying-size memory chunks

Divide physical memory into fixed-sized block called "Frames"; usually size is power of 2 between 512 Byte to 16 MB

Divide logical memory into blocks of same size, called "Pages"

Keep tracking free frames available in Main Memory

  • Page-table base register (PTBR) - points to the page table's start address
  • Page-table length register (PTLR) - indicates the size of the page table
  • Page number - used as an index into the page table, which contains the base address of each page in physical memory
  • Page offset - combined with the base address to give the MMU the actual address
  • Logical address space = 2^m
  • Page size = 2^n
|--------------------|----------------------|
| page number        |   page offset        |
|--------------------|----------------------|
      m - n 			      n

e.g. logical address space = 16 Byte = 2^4; page size = 4 Byte = 2^2

Workflow

Requires 2 memory accesses. p = "page number", d = "displacement (offset)", f = "frame number"

Internal fragmentation will occur

  • since the program size is usually not evenly divisible by the page size, the last page is not 100% fully used

Page Memory Protection

Page also have protection, each entry with

  • Validation bit = 0
  • read/ write/ exec permission

TLB (Translation look-aside buffers) / Associative Memory

... solves this problem:

every data/instruction access requires 2 memory accesses

  • one for the page table
  • one for the data/instruction

If the page is NOT in the TLB, get the frame from the page table in memory, and also load the entry into the TLB for faster access next time.

If the page IS in the TLB's associative registers, get the frame out directly.

Effective Access Time (EAT)

EAT = (TLB_search_time + memory_access_time) * hit_ratio + (TLB_search_time + 2*memory_access_time) * (1- hit_ratio) 

where:

ε = TLB lookup time (unit: ns), α = hit ratio (suppose 80%)

Consider α = 80%, ε = 20 ns for the TLB search, 100 ns per memory access:
EAT = 0.80 x 120 + 0.20 x 220 = 140 ns

Shared Page

  • Read-only page can share among multiple process
  • Reentrant code is non-self-modifying code
  • Useful for IPC (inter-process-communication)

Chapter 9 Virtual Memory

  • Virtual memory separates user logical memory from physical memory.

  • Only part of the program needs to stay in main memory for execution.

  • Allows address spaces to be shared by several processes

  • Allows for more efficient process creation, as pages can be shared during process creation

Virtual Memory Address Space

Memory Map
---------------------
|     Stack         |
---------------------
|      vvvvv        |  Virtual Memory
|  Virtual Memory   |   is between
|      ^^^^         |  Stack and Heap
---------------------
|     Heap          |
*                   *

Demand Paging
