@banacorn
Last active Aug 29, 2015

Advanced Cache Performance Optimization

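In the table, "-" means the quantity decreases and "+" means it increases.

| Optimization | Miss rate | Miss penalty | Hit time | Bandwidth | Note |
|---|---|---|---|---|---|
| compiler optimization | - | | | | |
| critical word first and early restart | | - | | | |
| merging write buffers | | - | | | |
| small & simple cache | + | | - | | |
| way prediction | | | - | | |
| hardware prefetching | - | - | | | |
| compiler prefetching | - | - | | | |
| pipelined caches | | | + | + | |
| multibanked caches | | | | + | |
| non-blocking caches | | - | | + | |
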
  • Reduce miss rate:
    • compiler optimization
  • Reduce miss penalty:
    • critical word first and early restart
    • merging write buffers
  • Reduce hit time:
    • small and simple first level caches
    • way prediction
  • Reducing miss penalty or miss rate via parallelism:
    • hardware prefetching
    • compiler prefetching
  • Increasing cache bandwidth:
    • pipelined caches
    • multibanked caches
    • non‐blocking caches

Cache Performance

Miss rate = (Miss count / Instruction count) / (Memory access count / Instruction count)
AMAT = Hit time + Miss rate * Miss penalty
CPU time = Clock cycle time * Instruction count * (CPI + (Miss count / Instruction count) * Miss penalty)
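
The three formulas above can be written as small functions. This is a minimal sketch: the function names and the sample numbers are assumptions made for illustration, not measurements. The miss penalty is taken in clock cycles for `cpu_time`, and `base_cpi` stands for the CPI without memory stalls.

```python
def miss_rate(misses_per_instruction, accesses_per_instruction):
    """Miss rate = misses per instruction / memory accesses per instruction."""
    return misses_per_instruction / accesses_per_instruction


def amat(hit_time, miss_rate_value, miss_penalty):
    """Average memory access time, in the unit of hit_time and miss_penalty."""
    return hit_time + miss_rate_value * miss_penalty


def cpu_time(clock_cycle_time, instruction_count, base_cpi,
             misses_per_instruction, miss_penalty_cycles):
    """CPU time including the stall cycles caused by cache misses."""
    stall_cpi = misses_per_instruction * miss_penalty_cycles
    return clock_cycle_time * instruction_count * (base_cpi + stall_cpi)


# Hypothetical numbers: 1.5 accesses and 0.03 misses per instruction,
# a 1-cycle hit, a 100-cycle miss penalty, a 1 ns clock, 1e9 instructions.
rate = miss_rate(0.03, 1.5)
print("miss rate:", rate)
print("AMAT (cycles):", amat(1, rate, 100))
print("CPU time (s):", cpu_time(1e-9, 1e9, 1.0, 0.03, 100))
```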

6 Basic Cache Optimizations

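| Optimization | Miss rate | Miss penalty | Hit time | Note |
|---|---|---|---|---|
| larger block size | - | + | | |
| larger cache size | - | | + | |
| higher associativity | - | | + | |
| multilevel cache | | - | | |
| reads over writes | | - | | |
| avoid address translation | | | - | |
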
  • Reducing miss rate:
    • larger block size
    • larger cache size
    • higher associativity
  • Reducing miss penalty:
    • multilevel caches
    • giving reads priority over writes
  • Reducing hit time:
    • avoid address translation when indexing the cache

Increase block size

  • Pros
    • Reduce compulsory misses
  • Cons
    • Increase miss penalty
    • Increase conflict and capacity misses if the cache is small
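
Both effects can be seen in a toy model. The sketch below assumes a direct-mapped cache and made-up access patterns. `direct_mapped_misses` and the addresses are hypothetical, chosen only to make the trade-off visible.

```python
def direct_mapped_misses(addresses, cache_bytes, block_bytes):
    """Count misses for a stream of byte addresses in a direct-mapped cache."""
    num_lines = cache_bytes // block_bytes
    lines = {}  # line index -> block address currently stored in that line
    misses = 0
    for address in addresses:
        block = address // block_bytes
        index = block % num_lines
        if lines.get(index) != block:
            misses += 1
            lines[index] = block
    return misses


# Pro: a sequential scan in a roomy cache only takes compulsory misses,
# and there are fewer of them when each block covers more bytes.
scan = range(0, 1024, 4)
for block_bytes in (16, 32, 64):
    print("scan", block_bytes, direct_mapped_misses(scan, 4096, block_bytes))

# Con: two interleaved streams in a tiny 64-byte cache. With 64-byte blocks
# there is a single line, so the streams keep evicting each other.
interleaved = []
for i in range(0, 64, 4):
    interleaved += [i, 1040 + i]
for block_bytes in (16, 64):
    print("interleaved", block_bytes, direct_mapped_misses(interleaved, 64, block_bytes))
```

In this model the scan takes 64, 32 and 16 misses as the block grows, while the interleaved pattern goes from 8 misses with 16-byte blocks to 32 misses (every access) with 64-byte blocks.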

Increase cache size

  • Pros
    • Reduce capacity and conflict misses
  • Cons
    • Increase hit time

Increase associativity

  • Pros
    • Reduce conflict misses
  • Cons
    • Increase hit time
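
A small simulation can illustrate the reduction in conflict misses. This is only a sketch under assumptions: LRU replacement, block addresses as input, and a hypothetical helper named `lru_misses`. It does not model the hit-time cost.

```python
from collections import OrderedDict


def lru_misses(block_addresses, num_blocks, ways):
    """Count misses in a set-associative cache with LRU replacement."""
    num_sets = num_blocks // ways
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for block in block_addresses:
        lru = sets[block % num_sets]
        if block in lru:
            lru.move_to_end(block)       # mark as most recently used
        else:
            misses += 1
            if len(lru) == ways:
                lru.popitem(last=False)  # evict the least recently used block
            lru[block] = True
    return misses


# Blocks 0 and 8 map to the same line of an 8-block direct-mapped cache.
ping_pong = [0, 8] * 10
print("direct mapped:", lru_misses(ping_pong, num_blocks=8, ways=1))
print("2-way:", lru_misses(ping_pong, num_blocks=8, ways=2))
```

With one way, the two blocks evict each other on every access. With two ways they share a set, and only the two compulsory misses remain.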

Multilevel cache

  • Pros
    • Reduce miss penalty
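
Why a second level reduces the penalty can be seen by expanding AMAT: the L1 miss penalty becomes the average access time of L2. The sketch below assumes a local L2 miss rate (misses in L2 divided by accesses to L2), and all numbers are hypothetical.

```python
def amat_two_level(hit_l1, miss_rate_l1, hit_l2, local_miss_rate_l2, memory_penalty):
    """AMAT with an L2 cache between L1 and main memory (times in cycles)."""
    miss_penalty_l1 = hit_l2 + local_miss_rate_l2 * memory_penalty
    return hit_l1 + miss_rate_l1 * miss_penalty_l1


# Hypothetical numbers: 1-cycle L1 hit, 5% L1 miss rate,
# 10-cycle L2 hit, 20% local L2 miss rate, 100-cycle memory access.
without_l2 = 1 + 0.05 * 100
with_l2 = amat_two_level(1, 0.05, 10, 0.2, 100)
print("AMAT without L2:", without_l2)
print("AMAT with L2:", with_l2)
```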

Giving Priority to Read Misses

  • Write through:
    • Simple approach: a read miss waits until the write buffer is empty
    • Better: on a read miss, check the write buffer contents and let the read proceed if there is no conflict
  • Write back:
    • Copy the dirty block to the write buffer, do the read, then do the write
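
The write-buffer check can be sketched as follows. This is a toy model, not a description of any real hardware: the `WriteBuffer` class, its method names, and the dictionary standing in for memory are all assumptions.

```python
class WriteBuffer:
    """Toy FIFO write buffer sitting between the cache and memory."""

    def __init__(self, memory):
        self.memory = memory   # dict: address -> value
        self.pending = []      # queued (address, value) writes, oldest first

    def write(self, address, value):
        """Queue a write instead of stalling until memory accepts it."""
        self.pending.append((address, value))

    def drain_one(self):
        """Retire the oldest queued write to memory, if there is one."""
        if self.pending:
            address, value = self.pending.pop(0)
            self.memory[address] = value

    def read_miss(self, address):
        """Serve a read miss without waiting for the buffer to empty."""
        # The newest queued write to this address holds the current value.
        for pending_address, value in reversed(self.pending):
            if pending_address == address:
                return value
        # No conflict: the read may go to memory ahead of the queued writes.
        return self.memory.get(address)


memory = {0x10: 1, 0x20: 2}
buffer = WriteBuffer(memory)
buffer.write(0x10, 99)
print(buffer.read_miss(0x10))  # value comes from the buffer, memory is still stale
print(buffer.read_miss(0x20))  # no conflict, value comes from memory
```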

Avoid translating address

4 memory hierarchy questions

  • Block placement
  • Block replacement
  • Block identification
  • Write strategy
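
For placement and identification, a byte address is usually split into a tag, a set index and a block offset. A minimal sketch, assuming a set-associative cache. `split_address` and the sizes used are hypothetical.

```python
def split_address(address, block_bytes, num_sets):
    """Split a byte address into (tag, set index, block offset)."""
    offset = address % block_bytes         # position inside the block
    block_address = address // block_bytes
    index = block_address % num_sets       # placement: which set the block goes to
    tag = block_address // num_sets        # identification: compared on lookup
    return tag, index, offset


# Hypothetical cache: 64-byte blocks, 128 sets.
tag, index, offset = split_address(0x12345, block_bytes=64, num_sets=128)
print(tag, index, offset)
```

When the block size and the number of sets are powers of two, these divisions and remainders are just bit-field selections on the address.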

Symmetric shared-memory

Write-invalidate cache coherence protocol for a write-back cache

| State | Request from CPU | Request to the bus & Action | Destination | Note |
|---|---|---|---|---|
| Invalid | Read Miss | Read Miss | Shared | |
| Invalid | Write Miss | Write Miss | Exclusive | |
| Shared | Read Hit | | Shared | |
| Shared | Read Miss | Read Miss | Shared | |
| Shared | Write Hit | Invalidate | Exclusive | |
| Shared | Write Miss | Write Miss | Exclusive | |
| Exclusive | Read Hit | | Exclusive | |
| Exclusive | Read Miss | Read Miss + Write Back | Shared | |
| Exclusive | Write Hit | | Exclusive | |
| Exclusive | Write Miss | Write Miss + Write Back | Exclusive | old block is written back; the line then holds the new block |

| State | Request from Bus | Destination | Note |
|---|---|---|---|
| Shared | Invalidate | Invalid | |
| Shared | Read Miss | Shared | |
| Shared | Write Miss | Invalid | |
| Exclusive | Invalidate | Invalid | |
| Exclusive | Read Miss | Shared | |
| Exclusive | Write Miss | Invalid | |
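
The protocol can be exercised with a small state machine. The sketch below makes simplifying assumptions: it tracks a single block, so the replacement cases (a miss while the line is Shared or Exclusive) do not occur, and it ignores data and write-backs. The names `CPU_TRANSITIONS`, `BUS_TRANSITIONS` and `access` are hypothetical.

```python
# CPU side for one block: (state, operation) -> (bus request or None, next state).
CPU_TRANSITIONS = {
    ("Invalid", "read"): ("read miss", "Shared"),
    ("Invalid", "write"): ("write miss", "Exclusive"),
    ("Shared", "read"): (None, "Shared"),
    ("Shared", "write"): ("invalidate", "Exclusive"),
    ("Exclusive", "read"): (None, "Exclusive"),
    ("Exclusive", "write"): (None, "Exclusive"),
}

# Snooping side: (state, bus request) -> next state.
BUS_TRANSITIONS = {
    ("Shared", "invalidate"): "Invalid",
    ("Shared", "read miss"): "Shared",
    ("Shared", "write miss"): "Invalid",
    ("Exclusive", "invalidate"): "Invalid",
    ("Exclusive", "read miss"): "Shared",
    ("Exclusive", "write miss"): "Invalid",
}


def access(states, cpu, operation):
    """Apply one CPU access to the per-cache states; return the bus request."""
    request, states[cpu] = CPU_TRANSITIONS[(states[cpu], operation)]
    if request is not None:
        for other in range(len(states)):
            if other != cpu:
                # A cache whose copy is Invalid ignores bus traffic for the block.
                key = (states[other], request)
                states[other] = BUS_TRANSITIONS.get(key, states[other])
    return request


states = ["Invalid", "Invalid"]
access(states, 0, "read")    # CPU 0 loads the block
access(states, 1, "read")    # CPU 1 loads it too, both copies are Shared
access(states, 0, "write")   # CPU 0 writes, CPU 1's copy is invalidated
print(states)
```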