Operating Systems (Third Edition): Chapter 10
Created April 10, 2015 16:49
# Introduction
- Solves the problem of limited memory space by creating the illusion that
  more memory exists than is available in the system.
- Two types of addresses in virtual memory systems:
  * Virtual addresses: referenced by processes
  * Physical (real) addresses: describe locations in main memory
# Evolution of Memory Organization
├── Real Memory Systems
│   ├── Multiprogramming systems
│   │   ├── Fixed-partition multiprogramming
│   │   │   ├── Absolute
│   │   │   └── Relocatable
│   │   └── Variable-partition multiprogramming
│   └── Single-user dedicated systems
└── Virtual Memory Systems
    ├── Pure paging
    ├── Pure segmentation
    └── Combined paging and segmentation
# Virtual Memory: Basic Concepts
- Virtual memory systems contain special-purpose hardware called the Memory
  Management Unit (MMU) that quickly maps virtual addresses to real
  addresses.
- Virtual Address Space (V): range of virtual addresses that a process may
  reference.
- Real Address Space (R): range of physical addresses available on a
  particular computer system.
- Normally, |V| >> |R|
- Two-level storage scheme
  * OS must store parts of V for each process outside the main memory.
  * OS shuttles portions of V between main memory (and caches) and
    secondary storage.
- Dynamic Address Translation (DAT)
  * Converts virtual addresses to physical addresses during execution.
  * Systems that use DAT exhibit the property that contiguous addresses
    in a process's virtual address space need not be contiguous in
    physical memory. This property is called artificial contiguity.
  * Artificial contiguity simplifies programming by enabling a process to
    reference its memory as if it were contiguous, even though its data
    and instructions may be scattered throughout main memory.
# Block Mapping
- Virtual memory is organized into blocks.
- If the blocks are of fixed size, they are called pages, and the
  technique is called paging.
- If the blocks are of different sizes, they are called segments, and the
  technique is called segmentation.
- In a virtual memory system with block mapping, the system represents a
  virtual address v as an ordered pair (b,d), where b is the number of the
  block in which the referenced item resides, and d is the displacement
  (offset) from the start of block b at which the referenced item is
  located.
- Translation of virtual memory address v=(b,d) to a real address r
  * The system maintains in memory a block map table for each process.
  * The process's block map table contains one entry for each block of
    the process, in sorted order.
  * A high-speed special-purpose register called the "block map table
    origin register" stores the base address (in real memory) of the
    process's block map table.
  * The system adds the block number b to the base address a of the
    process's block map table to form the real address of the entry for
    block b in the block map table.
  * This entry yields the address b' of the start of block b in main
    memory.
  * The system then adds the displacement d to the block start address b'
    to form the desired real address r.
  * For the virtual address v=(b,d), r=b'+d, where b' is stored in the
    block map table cell located at main memory address a+b.
  * The add operations and lookups performed during translation of a
    virtual memory address to a real address must be done in special
    hardware to gain a significant speedup.
- Let v=(b,d) be represented by 32 bits, with the displacement d
  specified by n bits. The system will then have 2^(32-n) blocks.
  * n=6:  Number of blocks = 2^(26), which is huge.
          Block size is small.
          Limited internal fragmentation.
  * n=24: Number of blocks = 2^(8), which is very small.
          Block size is huge.
          Significant internal fragmentation.
  * n=12: This choice is quite popular in paging systems, yielding a page
          size of 2^(12) = 4096 bytes.
          Moderate internal fragmentation.
# Paging
- A virtual address in a paging system is an ordered pair v=(p,d), where
  p is the number of the page in virtual memory on which the referenced
  item resides, and d is the displacement within page p at which the
  referenced item is located.
- Paging uses fixed-size block mapping.
- When a page is transferred from secondary storage to main memory, it is
  placed in a memory block called a page frame, p' (of the same size as a
  page).
- A page frame begins at a main memory address that is an integral
  multiple of the fixed page size p_s.
- DAT under paging: | |
* v=(p,d) results in a lookup of the page p in the "page map table" or | |
simply "page table" to determine the page frame p'. The real address | |
is obtained by the equation r = (p * p_s) + d | |
* For a system which uses n bits to represent both real as well as | |
virtual addresses, there is an ingenuine technique which can avoid | |
the potentially slow addition and multiplication operations. | |
If the displacement is represented by m bits, the page number will | |
be represented by the most-significant n-m bits. The real address | |
can be represented as r=(p',d) and can be formed by concatenating | |
p' and d, placing p' in the most-significant bits of the memory | |
address and d in the least significant bits. | |
Ex: In a 32-bit system (n=32) with page size 4096 bytes (m=12), page | |
frame number is represented by 20 bits and displacement represented | |
by 12 bits. The 32-bit real address can be formed by concatenating | |
the two binary strings. | |
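The concatenation trick can be shown with bit operations for the 32-bit, 4 KiB-page case above (the page-to-frame mapping is an assumed example):

```python
# Paging DAT by bit concatenation: n = 32 address bits, m = 12
# displacement bits, so the page number occupies the top 20 bits.

PAGE_SHIFT = 12
PAGE_MASK = (1 << PAGE_SHIFT) - 1       # 0xFFF

page_table = {3: 17}                    # virtual page 3 -> frame 17 (assumed)

def translate(v):
    p = v >> PAGE_SHIFT                 # most-significant 20 bits
    d = v & PAGE_MASK                   # least-significant 12 bits
    p_prime = page_table[p]
    return (p_prime << PAGE_SHIFT) | d  # concatenate p' and d: no add/multiply

v = (3 << PAGE_SHIFT) | 0x2A            # page 3, displacement 0x2A
r = translate(v)                        # frame 17, same displacement
```

Because the page size is a power of two, the shift-and-or is exactly the multiplication r = (p' * p_s) + d, only cheaper.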
- One benefit of paging is that not all pages belonging to a process must
  reside in main memory at the same time.
- When a process references a page that is not in main memory, the
  processor generates a "page fault", which invokes the OS to load the
  missing page into memory from secondary storage.
- Typical Page Table Entry (PTE)
  +-------------------+---------------------------+-----------------------+
  | Page resident bit | Secondary storage address | Page frame number     |
  |                   | (page not in main memory) | (page in main memory) |
  |         r         |             s             |          p'           |
  +-------------------+---------------------------+-----------------------+
  r=1, if page is in memory
  r=0, if page is not in memory
  To reduce the amount of memory consumed by PTEs, many systems contain
  only one cell to store either the page frame number or the secondary
  storage address, depending upon the value of the resident bit.
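The single-cell PTE design can be sketched as follows (field names and widths are illustrative, not from any particular architecture):

```python
# Space-saving PTE sketch: one cell holds either the page frame number
# (page resident) or the secondary storage address (page not resident),
# discriminated by the resident bit.

class PTE:
    def __init__(self, resident, value):
        self.resident = resident    # r=1: page is in main memory
        self.value = value          # p' if resident, else s

    def frame(self):
        """Return p'; referencing a non-resident page is a page fault."""
        if not self.resident:
            raise LookupError("page fault: page is on secondary storage")
        return self.value
```

On a fault, the OS would use `value` as the secondary storage address s to fetch the page, then rewrite the entry with r=1 and the new frame number.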
# Paging Address Translation by Direct Mapping
- Dynamic address translation under paging is similar to block address
  translation.
- A process references virtual address v=(p,d).
- DAT adds the process's page table base address b (stored in the "page
  table origin register") to the referenced page number p.
- (b+p) forms the main memory address of the PTE for page p.
- The PTE indicates that virtual page p corresponds to page frame p'.
- The system concatenates p' with displacement d to form real address r.
- Known as direct mapping because the page table contains one PTE for
  every page in V.
- If a process contains n pages in V, then the direct-mapped page table
  for the process contains entries successively for page 0, page 1, ...,
  page n-1. Hence direct mapping is similar to array indexing, and the
  system can locate any PTE in O(1).
- The page table can become huge due to the typically large size of V.
  Since the page table is stored in main memory, each page reference
  requires a complete main memory cycle, which can degrade performance.
  Furthermore, due to the large size of the table, it cannot be stored in
  the cache.
- Due to the huge size of direct-mapped page tables, most entries are
  unused, resulting in table fragmentation.
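The array-indexing character of direct mapping can be sketched like this (the table contents are assumed; unused entries model table fragmentation):

```python
# Direct mapping: the page table is a dense array with one PTE per
# virtual page, so locating the PTE for page p is plain O(1) indexing.

page_table = [None, 8, 3, None]   # pages 0..3; None = not resident/unused

def lookup(p, d, page_size=4096):
    p_prime = page_table[p]       # entry at offset p from the table base
    if p_prime is None:
        raise LookupError("page fault")
    return p_prime * page_size + d
```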
# Paging Address Translation by Associative Mapping
- Maintaining the entire page table in cache memory is often not viable.
- Increases the performance of DAT by placing the entire page table into
  content-addressed associative memory (rather than location-addressed
  memory).
- Every PTE in associative memory is searched simultaneously.
- Content-addressed cache memory is also prohibitively expensive.
# Paging Address Translation by Direct/Associative Mapping
- Compromise between cost and performance.
- Most PTEs are stored in direct-mapped tables in main memory.
- The most recently used PTEs are stored in a high-speed set-associative
  cache memory called a Translation Lookaside Buffer (TLB).
- If a PTE is not found in the TLB (called a TLB miss), the DAT mechanism
  searches the direct-mapped page table in main memory.
- Can yield high performance with a relatively small TLB due to temporal
  locality, which assumes that a page referenced by a process recently is
  likely to be referenced again soon.
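A minimal sketch of the direct/associative combination, with the TLB and the page table both modeled as dictionaries (mappings assumed):

```python
# TLB in front of a direct-mapped page table. A TLB hit avoids the
# main-memory page table access; repeated references to the same page
# (temporal locality) keep the hit rate high.

page_table = {0: 5, 1: 9, 2: 4}   # assumed page -> frame mappings
tlb = {}
hits = misses = 0

def translate(p, d, page_size=4096):
    global hits, misses
    if p in tlb:                  # TLB hit: fast path
        hits += 1
        p_prime = tlb[p]
    else:                         # TLB miss: walk the page table in memory
        misses += 1
        p_prime = page_table[p]
        tlb[p] = p_prime          # cache the PTE for subsequent references
    return p_prime * page_size + d

for _ in range(10):               # temporal locality: same page, 10 times
    translate(1, 0)
# only the first reference misses; the remaining nine hit in the TLB
```

A real TLB is small and set-associative with a replacement policy; the dictionary here ignores capacity for simplicity.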
# Multilevel Page Tables
- Multilevel page tables enable the system to store in discontiguous
  locations in main memory those portions of a process's page table that
  the process is actively using.
- Hierarchical structure of page tables.
- Each level contains a table that stores pointers to tables in the level
  below.
- The bottom-most level comprises tables containing the page to
  page-frame mappings.
- In most systems, each table in the hierarchy is the size of one page,
  which enables the operating system to transfer page tables between main
  memory and secondary storage easily.
- Can reduce memory overhead compared to a direct-mapping system.
- The overhead incurred by multilevel page tables is the addition of
  another memory access to the page mapping mechanism. However, due to
  the availability of a high-speed TLB and the use of locality, multilevel
  page tables achieve high performance with extremely low TLB miss rates.
- The Intel IA-32 architecture supports two levels of page tables.
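A two-level walk can be sketched for a 32-bit address split 10/10/12, roughly the IA-32 directory/table/offset layout (the table contents are assumed):

```python
# Two-level page table walk: bits 31..22 index the outer table (page
# directory), bits 21..12 index an inner page table, bits 11..0 are the
# displacement. Only inner tables that are actually used need to exist.

OFFSET_BITS, TABLE_BITS = 12, 10

inner = {5: 77}                   # inner index 5 -> frame 77 (assumed)
outer = {1: inner}                # outer index 1 -> that inner table

def translate(v):
    top = v >> (OFFSET_BITS + TABLE_BITS)               # bits 31..22
    mid = (v >> OFFSET_BITS) & ((1 << TABLE_BITS) - 1)  # bits 21..12
    d = v & ((1 << OFFSET_BITS) - 1)                    # bits 11..0
    frame = outer[top][mid]       # two memory accesses instead of one
    return (frame << OFFSET_BITS) | d

v = (1 << 22) | (5 << 12) | 0x123
r = translate(v)
```

The two dictionary lookups correspond to the extra memory access the notes mention; the TLB hides this cost on the hot path.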
# Inverted Page Tables
- Inverted page tables store one PTE in memory for each page frame in the
  system.
- Inverted relative to traditional page tables because the PTEs are
  indexed by page frame number rather than by virtual page number.
- Uses hash functions to map virtual pages to PTEs.
- Hashing can lead to collisions, so inverted page tables implement a
  chaining mechanism to resolve them. When a hash value maps an item to a
  location that is occupied, the item is placed in another location and
  linked into a chain from the occupied entry; a lookup follows the chain
  until it finds the entry for the referenced page.
- The chaining mechanism increases address translation time by increasing
  the number of times memory must be accessed.
- Collisions can be reduced by increasing the range of the hash function.
  A hash anchor table (HAT) increases the range of the hash function by
  adding another level of indirection.
- The Intel IA-64, HP PA-RISC and PowerPC architectures implement
  inverted page tables.
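An inverted page table with chaining can be sketched as follows (frame count, hash choice, and free-frame selection are all simplified assumptions):

```python
# Inverted page table: one entry per page frame, indexed by frame
# number. A hash of the virtual page picks a starting frame; collisions
# are chained through a "next" link stored in the entry.

NFRAMES = 8
table = [None] * NFRAMES          # frame -> [page, next_frame] or None

def insert(page):
    i = hash(page) % NFRAMES
    if table[i] is None:          # slot free: no collision
        table[i] = [page, None]
        return
    while table[i][1] is not None:     # walk to the end of the chain
        i = table[i][1]
    free = table.index(None)           # pick any free frame (simplified)
    table[free] = [page, None]
    table[i][1] = free                 # link the collision into the chain

def lookup(page):
    i = hash(page) % NFRAMES
    while i is not None and table[i] is not None:
        if table[i][0] == page:
            return i                   # the frame number is the index
        i = table[i][1]                # chain traversal: extra accesses
    raise LookupError("page fault")
```

Each chain link followed in `lookup` is an additional memory access, which is the translation-time cost the notes describe; a hash anchor table would widen the hash range to shorten these chains.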
# Sharing in a Paging System
- Sharing in multiprogramming systems reduces the memory consumed by
  programs that use common data and/or instructions.
- Requires the system to identify each page as sharable or non-sharable.
- Some OSs use the copy-on-write technique. Initially the system maintains
  only one copy of each shared page in memory for all processes. If a
  process attempts to modify the shared page, the OS creates a copy of
  the page and applies the modification to it. The unmodified copy of the
  page remains mapped into the address space of every other process
  sharing the page.
- Some architectures provide each PTE with a read/write bit. When the RW
  bit is off, the page can be read but not modified. When the RW bit is
  on, the page can be both read and modified.
- Copy-on-write can be implemented by marking each shared page as
  read-only. When a process attempts to write to such a page, the
  processor on which it executes causes a page fault, at which point the
  copy-on-write is performed.
- When the OS moves a shared page to secondary storage, it must update
  the corresponding PTEs for every process sharing that page. For a large
  number of processes sharing a page, this can incur significant
  overhead.
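The copy-on-write mechanism can be sketched as follows (pages modeled as bytearrays, the protection fault as a simple flag check; all names are illustrative):

```python
# Copy-on-write sketch: processes share one read-only copy of a page;
# the first write triggers a simulated protection fault that gives the
# writer its own private copy.

shared_page = bytearray(b"hello")

class Process:
    def __init__(self):
        self.page = shared_page       # mapped shared, read-only
        self.writable = False

    def write(self, i, byte):
        if not self.writable:         # simulated read-only page fault
            self.page = bytearray(self.page)   # OS copies the page
            self.writable = True      # private copy is now read/write
        self.page[i] = byte

a, b = Process(), Process()
a.write(0, ord("j"))                  # a faults and gets a private copy
# a now sees "jello"; b still sees the unmodified shared page "hello"
```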