A computer’s cache memory is a small, specialized volatile memory that offers high-speed data access to a processor. It stores frequently used programs, data, and application content, and it is the fastest memory in a computer.

Originally, the cache was a separate chip on the motherboard; in modern systems it is embedded directly in the processor or sits between the processor and the random access memory (RAM). Cache is expensive per byte, but it significantly improves the performance of the computer.

Cache is used by cache clients such as the CPU, applications, web browsers, and the operating system. Because main bulk storage cannot always keep up with the demands of these clients, a cache is used to reduce data access time, decrease latency, and improve application performance.

In this article, we will discuss the types of cache memory and their functions, how cache works, and the algorithms and policies it uses. Let us begin.

What are the types of cache memory?

Caching is mainly of two types: memory caching and disk caching.

Memory caching

A memory cache is a portion of memory made of high-speed static RAM (SRAM). It is also called a cache store or RAM cache.

Many programs access the same data and instructions repeatedly; this is where the memory cache plays its role.

By storing as much information as possible in SRAM, the computer reduces how often it must access the slower, cheaper dynamic RAM (DRAM).
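The same principle appears in software caching. As a minimal, illustrative Python sketch (the function and its cost are hypothetical), the standard library's functools.lru_cache keeps recent results in memory so repeated calls skip the slow work:

from functools import lru_cache

@lru_cache(maxsize=128)  # keep up to 128 recent results in memory
def expensive_lookup(key):
    # Stand-in for slow work such as a disk read or heavy computation.
    return key * 2

expensive_lookup(10)  # computed once and stored in the cache
expensive_lookup(10)  # answered from the cache; the function body is skipped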

L1 & L2 caches

Memory caches are also built into the architecture of microprocessors. The fastest and smallest of these, built directly into the microprocessor’s chip, is called the level 1 or L1 cache.

In this way, memory is accessed at the speed of the microprocessor rather than at the speed of the memory bus.

The level 2 or L2 cache is a specialized, small, fast memory bank that was historically located on the motherboard; accessing it is roughly twice as fast as accessing main memory.

Both L1 and L2 caches are built from SRAM, but the L2 cache is larger.

Translation Lookaside Buffer (TLB)

The TLB is a kind of memory cache that stores recent translations of virtual memory addresses to physical addresses, speeding up virtual memory operations.

When a program refers to a virtual address, the CPU must translate it into a physical address. It checks the TLB first; if the translation is not found there, the system walks the page tables in main memory to find the physical address.

Virtual-to-physical translations are added to the TLB as they are resolved. Because the TLB is on the processor, address retrieval is fast, which reduces latency.
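As a rough illustration, this lookup order can be sketched in Python. The page numbers and frame values below are made up, and a plain dictionary stands in for the page-table walk:

tlb = {}                            # small, fast cache of recent translations
page_table = {0x1: 0xA, 0x2: 0xB}   # made-up virtual-page -> physical-frame map

def translate(virtual_page):
    if virtual_page in tlb:           # TLB hit: the fast path
        return tlb[virtual_page]
    frame = page_table[virtual_page]  # TLB miss: slow page-table walk
    tlb[virtual_page] = frame         # cache the translation for next time
    return frame

translate(0x1)  # miss: walks the page table, then caches the result
translate(0x1)  # hit: answered straight from the TLB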

Disk caching

Disk caching works on the same principle as memory caching; the difference is that instead of SRAM, a disk cache uses conventional main memory.

Recently accessed data from the disk, along with adjacent sectors, is stored in a memory buffer. When a program wants to read information from the disk, it first checks the disk cache to see whether the data is already there.

Disk caching can improve application performance dramatically, since accessing a byte of data in RAM is far faster than accessing a byte on a hard disk.
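The read path can be sketched as a simple read-through cache in Python. This is only an illustration; the block size is an arbitrary choice and the file path is hypothetical:

BLOCK_SIZE = 4096   # arbitrary block size for this sketch
disk_cache = {}     # (path, block number) -> bytes, held in main memory

def read_block(path, block_no):
    key = (path, block_no)
    if key in disk_cache:             # cache hit: no disk access needed
        return disk_cache[key]
    with open(path, "rb") as f:       # cache miss: go to the disk
        f.seek(block_no * BLOCK_SIZE)
        data = f.read(BLOCK_SIZE)
    disk_cache[key] = data            # keep the block for future reads
    return data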

How does cache work?

Fig 1.0: Memory organization

When an operation is to be performed, the data and commands for that operation are moved from a slow storage device (hard disk or CD drive) to a faster one.

This faster device is RAM, specifically DRAM, which can supply the data and commands the processor needs at a much higher rate than slow storage devices can.

Although RAM is faster than the storage devices, the processor works at a much quicker pace still, and RAM cannot supply instructions at that rate.

Therefore, another, faster memory, the cache, steps in to keep pace with the processor.

Cache is itself a type of RAM, namely SRAM, which is both faster and more expensive than DRAM.

The cache is divided into levels that vary in speed and size, typically two or three of them.

From RAM, the data is sent to the third-level cache (L3 cache), which is faster than RAM but slower than the L2 cache.

To speed things up further, the second-level or L2 cache is used. In modern processors, the L2 cache is built into the chip itself.

The first-level or L1 cache resides at the core level and holds the most commonly used data, commands, and instructions. It is built into the processor and is the fastest of all cache memory.

Therefore, to perform an action or execute a command, the processor first checks its data registers. If the required instruction is not found there, it looks in the L1 cache, then in the subsequent cache levels.

If the required data is found in a cache, it is called a cache hit. If it is not found in any cache level, it is called a cache miss, which delays execution and slows down the system.

On a cache miss, the processor looks for the data in RAM, and if it is not there either, it fetches it from the storage device.
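This search order can be sketched in a few lines of Python. The sketch below is purely illustrative: plain dictionaries stand in for the hardware levels, and the key and value are made up.

l1, l2, l3, ram, disk = {}, {}, {}, {}, {"x": 42}

def fetch(key):
    # Check each level from fastest to slowest.
    for name, level in (("L1", l1), ("L2", l2), ("L3", l3),
                        ("RAM", ram), ("disk", disk)):
        if key in level:
            print(name, "hit")
            l1[key] = level[key]  # promote the value into the fastest cache
            return level[key]
    raise KeyError(key)

fetch("x")  # missed in every cache level; found on disk and promoted to L1
fetch("x")  # L1 hit this time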

This lookup sequence is summarized pictorially below.

Fig 1.1: Memory access path

How does cache improve the performance of a system?

You are now acquainted with the terms cache hit and cache miss. You also know that cache accelerates the process of data access in a system and improves its performance.

But what are the principles behind it? How this is done, and which data is evicted from the cache to make room for new entries, is determined by caching algorithms and cache policies.

Let us see what they are.

Caching algorithms

Cache algorithms govern how the cache is maintained. The common ones are as follows:

Least Frequently Used (LFU)

LFU keeps track of how often each entry is accessed; the item with the lowest count is evicted first.
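A minimal LFU sketch in Python (the capacity handling is simplified for illustration):

class LFUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = {}     # key -> value
        self.counts = {}   # key -> access count

    def get(self, key):
        self.counts[key] += 1          # each access raises the count
        return self.data[key]

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            victim = min(self.counts, key=self.counts.get)  # lowest count
            del self.data[victim], self.counts[victim]
        self.data[key] = value
        self.counts.setdefault(key, 0)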

Least Recently Used (LRU)

LRU keeps recently accessed items at the top of the cache. When the cache reaches its limit, the least recently accessed entries are removed.
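LRU is straightforward to sketch in Python with collections.OrderedDict, which remembers insertion order; this is an illustrative toy, not a production cache:

from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()  # most recently used entry sits at the end

    def get(self, key):
        self.data.move_to_end(key)     # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        elif len(self.data) >= self.capacity:
            self.data.popitem(last=False)  # evict the least recently used
        self.data[key] = value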

Most Recently Used (MRU)

MRU removes the most recently accessed items first. This approach works best when older entries are more likely to be accessed again.
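MRU is the mirror image of LRU: in an OrderedDict-based sketch like the one above, eviction simply pops from the opposite end.

from collections import OrderedDict

class MRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()  # most recently used entry sits at the end

    def get(self, key):
        self.data.move_to_end(key)    # this entry becomes the MRU
        return self.data[key]

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            self.data.popitem(last=True)  # evict the most recently used
        self.data[key] = value
        self.data.move_to_end(key)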

What are the different cache policies?

The different cache policies are as follows:

Write-around cache

A write-around cache writes data directly to storage, bypassing the cache entirely. This prevents the cache from being flooded during heavy write I/O.

The significant disadvantage is that read operations are slower: newly written data is not cached until it is read back from storage.

Write-through cache

A write-through cache writes data to both the cache and storage. Newly written information is always in the cache and can therefore be read back quickly.

However, a write is not considered complete until the data has been written to both the cache and storage, which can add latency to write operations.

Write-back cache

In a write-back cache, all writes go to the cache, and the data is copied to storage later. Both read and write operations have low latency under this policy.

However, data can be lost if the cache fails before it is committed to storage.
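The three policies can be contrasted in one illustrative Python sketch, with plain dictionaries standing in for the cache and the backing storage:

cache, storage, dirty = {}, {}, set()

def write_around(key, value):
    storage[key] = value   # bypass the cache entirely

def write_through(key, value):
    cache[key] = value     # the write completes only after BOTH succeed
    storage[key] = value

def write_back(key, value):
    cache[key] = value     # fast: touch only the cache
    dirty.add(key)         # remember to persist this entry later

def flush():
    for key in dirty:      # later, copy dirty entries to storage
        storage[key] = cache[key]
    dirty.clear()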

What are some of the popular uses of cache?

Some popular uses of caching are described below.

Cache server

A cache server is a dedicated server or service that saves web pages and other internet content locally. It is also known as a proxy cache.

Flash cache

Flash cache stores data temporarily on NAND flash memory chips. It uses solid-state drives (SSDs) to serve data requests much faster than a cache on a hard disk drive (HDD) could.

Persistent cache

A persistent cache helps prevent data loss during a system reboot or crash. Data is often flushed to battery-backed DRAM as an additional layer of defense against data loss.

Conclusion

A computer’s cache is a small, fast, and relatively expensive memory that accelerates data access for the processor by storing frequently used programs, applications, and data. If the cache already holds the data the processor requests, the processor does not need to fetch it from main memory or the hard disk.

Cache is mainly of two types: memory cache and disk cache. A cache is also divided into levels that differ in speed and size.

To improve performance, the cache relies on caching algorithms and cache policies. Caching works so well that every modern computer incorporates it.

Initially, it was implemented on the motherboard, but with progress in processor design, it has been integrated into the processor.
