Introduction
In earlier blog, I have discussed about Least Recent Use implementation, and pointed out:
- we can use a linked list to store Data where the end/tail has the most recent used data, when the begin/head has the least used data. (We can do reverse, of course).
- Query data requires moving its current position to the end/tail since its becomes the most recent used data.
- Double linked list helps moving in O(1) since we only need to modify the "links"
prev
andnext
- If the current position is the head, we need to delete the oldHead to avoid mem leak.
- Double linked list helps moving in O(1) since we only need to modify the "links"
- Adding data always push the data to back/end/tail.
- If linked list is at capacity, then remove the front/head/begin.
In this blog I will discuss and implement Least Frequently Use cache "step by step". I describe the rudimentary solutions and approaches that fail but should leads to improvement and final solution.
Concepts
Least Recent Use cache performs 2 operations with basic requirements:
addData
: if cache is at capacity, remove the least frequently used; if there is a tide, remove the least recent used.queryData
: after a data item is queried, increment the count for that data item.
Use linked list to store item based on usage frequency
At first, a double linked list is used to store data item based on usage frequency:
- we choose the back/tail to store the least used and least recent used item.
- we choose the front/head to store the most used and possibly most recent used item.
The following illustrates the case that we add up data while cache is not yet at capacity, data items are added at the front, on the right:

If an item is queried, it is moved to the front/right/head. Suppose that we have Data_2 queried 4 times and Data_3 queried 3 times, and Data_2 queried 2 times, then we have:

Now if the new data item is added,
- should it be added at the back instead of the front?
- if cache is at capacity, should be we remove Data_1 although its count is greater than newly added item?
There is no good answer to the above questions. At this point, it is clear to see that using linked list to store items based on usage frequency is not a good idea.
Using 2 hash maps and linked list
The common accepted method uses a linked list but needs 2 hash maps:
- linked list stores the items based on least recent use, not frequency. The linked list stores items that have save frequency usage but in order of recent use.
- 1st hashmap: to remove item based on the least use, we maintain the variable
minFrequency
. In order to get the frequency usage of each item quickly and compared to minFrequence, we maintain a frequency map where frequency is the key, and value is the linked list of items that have same frequency. Note that the value doesn't point to each node in the linked list, but the whole linked list. - 2nd hashmap: function queryData gets item based on provided key. Therefore, we need a 2nd hashmap to access individual items quickly from keys. Hashmap has key which is the iterator to individual items in the linked list.

Let think of each operations and different scenarios:
addData
: when we add a new data to the cache that has not reached capacity, we increment the freq to 1 and push to linked list at front.
queryData
: when a data item is query, freq is incremented. The item is removed from the linked list of items that has same frequency and put into a new linked list. Thus we performupdateFrequency
. For example, in the below figure, ifData_3
is queried, it should be moved to the linked list withfreq=2
, and put at the tail, right afterData_
1.updateFrequency
performs such operation.

addData
: when we add a new data to the cache that is at capacity, we removed the item of the front of the linked list of items whose frequency isminFreq
. Even whenminFreq
is greater than 1, we still remove the item.
Implementation notes:
- I create Data struct within class LFUCache
- The LinkedList stores Data object instead of pointer so that I don't have to worry about memory issue. Every time a data item is added more move to different list, a new copy of data is made in put to the list.
- a minFreq variable is used. It is updated when:
- a new data item is added to the list. minFreq=1
- the new data item is queried.
Code:
class LFUCache {
private:
struct Data {
int key, value, freq;
Data(int k, int v) : key(k), value(v), freq(1) {}
};
int capacity, minFreq;
// key = key, value = iterator to data item in linked list
std::unordered_map<int, std::list<Data>::iterator> dataList;
// key = freq, value = LinkedList of data items of same freq
std::unordered_map<int, std::list<Data>> freqList;
void updateFrequency(std::list<Data>::iterator nodeIter) {
int key = nodeIter->key;
int value = nodeIter->value;
int freq = nodeIter->freq;
freqList[freq].erase(nodeIter);
if (freqList[freq].empty()) {
freqList.erase(freq);
if (minFreq == freq) {
minFreq++;
}
}
freq++;
freqList[freq].emplace_front(key, value);
dataList[key] = freqList[freq].begin();
dataList[key]->freq = freq;
}
public:
LFUCache(int capacity) : capacity(capacity), minFreq(0) {}
int queryData (int key) {
if (dataList.find(key) == dataList.end()) {
return -1;
}
auto dataIter = dataList[key];
int value = dataIter->value;
updateFrequency(dataIter);
return value;
}
void addData (int key, int value) {
if (capacity == 0) return;
if (dataList.find(key) != dataList.end()) {
auto dataIter = dataList[key];
dataIter->value = value;
updateFrequency(dataIter);
} else {
if (dataList.size() == capacity) {
auto dataIter = freqList[minFreq].back();
dataList.erase(dataIter.key);
freqList[minFreq].pop_back();
if (freqList[minFreq].empty()) {
freqList.erase(minFreq);
}
}
freqList[1].emplace_front(key, value);
dataList[key] = freqList[1].begin();
minFreq = 1;
}
}
};
Conclusion
LFU cache could be remembered with following main points:
- 1 linked list stores recent used items with same frequency just like in LRU.
- 1 hashmap provides O(1) access to linked list of items with same frequencies. The key is frequency, value is the linked list.
- 1 hashmap provides O(1) access to item in a particular linked list.
addItem
add items with freq=1 to linked list, or it has to move the item from one linked list to another linked list.queryItem
needs to search for Data item from the data hashmap, (look for data item from key) and has to move the item from one linked list to another linked list.- a helper function
updateFrequency
is needed to perform moving item from one linked list to another linked list.