MRU Cache
"""
PROBLEM: Implement MRU Cache
- i.e. cache the MOST-RECENTLY-USED-OR-ACCESSED, and drop the LEAST-RECENTLY-USED
- FIXED-size cache
initialized with a size, never to exceed that size
- put(key, val), O(1)
- get(key) -> val (or None), O(1)
TEST CASES:
testCache = Cache(3)
testCache.put("A", 1)
testCache.put("B", 2)
testCache.put("C", 3)
print testCache.get("B") # 2
testCache.put("D", 4)
print testCache.get("A") # One of these should be None
print testCache.get("B")
print testCache.get("C")
print testCache.get("D")
"""
"""
SUMMARY:
- Lookup with DICT using String:Node reference
- Nodes referenced are contained in a PriorityQ implemented as a DEQUE
- NODE key=String, value=Integer
- need random access lookup for Get/Put via DICT
- need to determine which Node to evict in PriorityQ;
then lookup corresponding DICT entry to remove
So KEY has to be part of Node.
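(A minimal sketch of that layout, for illustration only and separate from the
implementation below; MiniNode is a hypothetical stand-in for the Node class:)

    from collections import deque

    class MiniNode:
        def __init__(self, key, value):
            self.key = key        # KEY kept on the Node so an eviction can clean the DICT
            self.value = value

    lookup = {}
    q = deque()
    node = MiniNode("A", 1)
    lookup[node.key] = node        # DICT entry points at the Node object ...
    q.appendleft(node)             # ... and the Q holds that SAME reference
    evicted = q.pop()              # evict from the TAIL of the Q ...
    lookup.pop(evicted.key, None)  # ... then use evicted.key to remove the DICT entry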
CACHE EVICTION POLICY VARIATIONs from SIMPLEST to REAL-WORLD:
1) RECENTLY WRITTEN-INSERTED
- if you just want to evict ANY node, a simple List/array implementation that
evicts the last-appended entry is OK; but then the cache policy retains the OLDEST
WRITTEN-INSERTED entries rather than the most recently USED-ACCESSED ones, which is bad
- a LIST implemented as an Array in Python makes removal from the FRONT
expensive due to copying; whereas adding to the END is cheap due to amortized
block allocation
2) ACCESS RECENCY: Most recently accessed CACHED; stale accesses evicted
- if we want to evict nodes by ACCESS recency -- e.g. retain the most recently accessed --
- i.e. FIFO according to ACCESS order, then use a Doubly-Linked-List or DEQUE,
since removal from the HEAD and adding to the TAIL are constant-time operations
- this is the approach the cache implementation below uses
3) ATTRIBUTE VALUE: Use HEAP, with value according to ATTRIBUTE (see the heapq sketch after this list)
- Price (cache High-Price, so evict low-price with MIN-Heap)
- Time Written (cache recently written with smallest time difference from NOW, so evict largest time difference with MAX-Heap)
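(A minimal heapq sketch for variation 3, assuming a hypothetical price attribute;
heapq is a MIN-heap, so the LOWEST price pops first as the eviction victim.
Illustration only, not used by the cache implementation below:)

    import heapq

    price_heap = []
    heapq.heappush(price_heap, (19.99, "itemA"))  # (price, key) tuples compare by price first
    heapq.heappush(price_heap, (4.99, "itemB"))
    heapq.heappush(price_heap, (9.99, "itemC"))
    evict_price, evict_key = heapq.heappop(price_heap)  # -> (4.99, "itemB"), the LOW-price entry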
TRICK:
- the MOST RECENTLY ACCESSED node is REMOVED from its current position in the Priority Q,
then MOVED to the HEAD of the Q
- the DICT is UNCHANGED, as it holds a reference to the Node, which didn't itself change, EXCEPT when the Q is FULL:
- the Node at the END of the Q is EVICTED on a full Q,
then a LOOKUP on the Node's KEY is needed to REMOVE it from the DICT!
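(The same TRICK as a few lines of deque operations, illustration only:)

    from collections import deque

    q = deque(["C", "B", "A"])   # HEAD ... TAIL == most- to least-recently used
    q.remove("B")                # O(n) removal from its CURRENT position
    q.appendleft("B")            # re-insert at the HEAD => now the most recent
    evicted = q.pop()            # on a full Q, the TAIL entry ("A") is the eviction victim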
ATTENTION TO:
- keep data IN-SYNC on each PUT modification: Q, DICT, COUNT;
on GET only the Q order changes, while DICT and COUNT are unaffected
- need to EVICT TAIL on FULL Q case;
then LOOKUP corresponding KEY in DICT to RELEASE reference!
WHAT MATTERS:
- DICT always points to Node directly for random-access for GET
- PUT needs to check for FULL and use
PriorityQ with EVICTION POLICY:
- On Eviction from the PriorityQ, the Node must carry both
KEY and VALUE, so the KEY can be used for
lookup and eviction of the corresponding Entry from the DICT!
- SIMPLEST to have VALUE on DICT reference SAME Node structure as Q!
- DEQUE since insert-delete at HEAD and TAIL are needed;
AND want efficient DELETE so need BACK ptr!
- update COUNT on PUT only
DETAILS:
get => lookup Dict is NOT modified, instead PriorityQ is
- retrieves Node value if it exists from DICT
- REMOVE the accessed item from its current position in the PriorityQ
- ADD it to the HEAD to TRACK the Most-Recent access
- otherwise returns None if it does not exist
set => PriorityQ modified, THEN lookup Dict is synced
- if key EXISTS:
  - lookup the Node in the DICT, and UPDATE its value
  - REMOVE that Node from its current position in the PriorityQ
- if key does NOT exist:
  - create a new Node
  - if FULL at capacity:
    - REMOVE the stale item from the TAIL of the PriorityQ,
      then lookup its KEY in the DICT to REMOVE the corresponding entry
- in BOTH cases, INSERT the Node at the HEAD of the PriorityQ and point the DICT entry at it
SIMPLIFYING ASSUMPTIONS:
- ignore concurrent access; otherwise have to copy and return MODIFIED
collection each time, leaving original dictionary intact
- ignore WEAK references on DICT key which allow garbage collection for
stale keys
- NO DEEP copies required; just REFERENCES to VALUE objects saved;
shallow copy OK. eg not copy.deepcopy(), but dict.copy()
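(If weak references WERE wanted: string keys cannot be weakly referenced, but the
standard weakref module offers WeakValueDictionary, so a DICT entry disappears once
no strong reference to its Node remains; a hedged sketch, not used below:)

    import weakref

    class WNode:
        def __init__(self, key, value):
            self.key, self.value = key, value

    weak_lookup = weakref.WeakValueDictionary()
    n = WNode("A", 1)
    weak_lookup["A"] = n         # entry lives only while n is strongly referenced
    del n                        # once collected, "A" silently drops out of weak_lookup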
LANGUAGE-SPECIFIC SUPPORT:
- JAVA
- LinkedHashMap with ctor parameter for AccessOrder (True for Access, False for Insertion)
- PriorityQueue with ctor parameter for a Comparator, where the DEFAULT returns -1 for
L < R giving a MIN HEAP; multiply the comparison result by -1 to REVERSE the order and convert to a MAX HEAP
- Java Generics PECS rule: Producer Extends, Consumer Super
Comparator<? super E>
- https://stackoverflow.com/questions/2723397/what-is-pecs-producer-extends-consumer-super
- PYTHON
- heapq supports a min-heap by default,
but needs customization for a MAX-heap
- i.e. heapq takes no comparator parameter, so you either negate/wrap keys or write a custom heap for the contained element type
*** BUT I CUSTOMIZED THIS MYSELF -- WHOOHOO!
https://github.com/DagnyTagg2013/CTCI-Python/blob/master/HEAP/binheap.py
- OR from queue import PriorityQueue (synchronized and slower),
where elements are typically added as TUPLEs for automatic field-by-field ordering;
but you may also need to multiply by -1 to REVERSE the ordering for a MAX heap
- https://stackoverflow.com/questions/407734/a-generic-priority-queue-fo-python
- https://docs.python.org/3/library/heapq.html
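(A small sketch of the MAX-heap workaround noted above: heapq has no comparator
parameter, so store negated keys and negate again on the way out; illustration only:)

    import heapq

    ages = [3, 11, 7]
    max_heap = [-a for a in ages]
    heapq.heapify(max_heap)             # smallest negated value == largest original value
    largest = -heapq.heappop(max_heap)  # -> 11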
"""
from collections import deque


class Node:

    def __init__(self, key=None, value=None):
        self.key = key
        self.value = value

    def __repr__(self):
        status = "({},{})".format(self.key, self.value)
        return status
class MRUCache:

    def __init__(self, maxSize=3):
        self.capacity = maxSize
        self.currSize = 0
        self.lookup = {}
        self.priorityQ = deque()

    def __repr__(self):
        # ATTN print syntax!
        status = '\nMRUCache\n {}, {}\n{}\n{}'.format(self.capacity, self.currSize, self.lookup, self.priorityQ)
        return status
    def get(self, key):
        # RETURN NONE if not in lookup cache
        foundNode = self.lookup.get(key, None)
        # if node found, reorder ACCESS-ORDER
        if foundNode:
            # FIND/REMOVE corresponding node reference from priority Q,
            # (should be SAME reference as above!)
            # then put it into the FRONT of Q
            self.priorityQ.remove(foundNode)
            self.priorityQ.appendleft(foundNode)
            # RETURN the cached value, per the spec get(key) -> val
            return foundNode.value
        # RETURN None if not found in the lookup cache
        return None
    def put(self, key, value):
        currNode = None
        # UPDATE value if ALREADY in lookup cache
        if key in self.lookup:
            currNode = self.lookup[key]
            # ATTN: Dict only saves a REFERENCE to the Node, so modify the
            # value on that Node!
            currNode.value = value
            # REMOVE the corresponding node from the Q so the COMMON-CODE
            # below can re-insert it at the FRONT of the Q
            # - NOTE this is the SAME Node reference as ABOVE!
            # ATTENTION: TIME TO SEARCH is O(n) on DEQUE
            self.priorityQ.remove(currNode)
        # otherwise, CREATE new Node, and ADD it to lookup cache
        else:
            # create new node
            currNode = Node(key, value)
            self.currSize = len(self.priorityQ)
            # ATTN: check FWD bound before EVICTION
            if (self.currSize + 1) > self.capacity:
                # ATTN: keep the Q in sync with its corresponding DICT Entry!
                # IFF we're at CAPACITY, first remove the STALE node at the
                # END of the priorityQ, then remove its corresponding entry
                # from the lookup cache
                staleNode = self.priorityQ.pop()
                # ATTN: this is the way to REMOVE from a DICT in Python!
                self.lookup.pop(staleNode.key, None)
        # COMMON-CODE,
        # the accessed/updated/added node ALWAYS gets added to the HEAD of the Q,
        # AND exists in the lookup cache keyed by the Node key
        self.priorityQ.appendleft(currNode)
        self.lookup[currNode.key] = currNode
        # UPDATE size from the Q, so updating an existing key does not over-count
        self.currSize = len(self.priorityQ)
# ************ TEST DRIVER ******************
print("\n*** INIT CACHE")
testCache = MRUCache(3)
print(testCache)
print("\n*** PUT A")
testCache.put("A", 1)
print(testCache)
print("\n*** PUT B")
testCache.put("B", 2)
print(testCache)
print("\n*** PUT C")
testCache.put("C", 3)
print(testCache)
print("\n*** GET B")
print(testCache.get("B")) # 2
print(testCache)
print("\n*** PUT D")
testCache.put("D", 4)
print(testCache)
print("\n*** GET A")
print(testCache.get("A")) # THIS should be None
print(testCache)
print("\n*** GET B")
print(testCache.get("B"))
print(testCache)
print("\n*** GET C")
print(testCache.get("C"))
print(testCache)
print("\n*** GET D")
print(testCache.get("D"))
print(testCache)
print("\n*** UPDATE B")
testCache.put("B", 100)
print(testCache)