Difference between buffer cache and page cache in Linux

People sometimes assume that the page cache and the buffer cache are one and the same thing; however, there is a subtle difference between the two.

Let's have a look at the definitions and differences.

The page cache stores pages of files to optimize file I/O performance.
The buffer cache stores disk blocks to optimize block I/O.

What each cache stores differs significantly between kernels older than Linux 2.4 and kernels from 2.4 onwards.

Let's look at their behavior before and after the kernel 2.4 release.

Behavior of Page and Buffer Cache prior to Linux kernel version 2.4
Prior to the 2.4 kernel, the two caches were separate.

File data was stored in the page cache, and disk blocks were stored in the buffer cache.
Because the disk blocks usually hold the same file data, the same data ended up being cached twice.

Hence the data was held in memory twice, once in each cache.

To avoid this duplication, the page and buffer caches were reworked in the 2.4 kernel release.

Behavior of Page & Buffer Cache from Linux kernel 2.4 Onwards
Starting from kernel version 2.4, the contents of the page & buffer caches are unified.

The virtual memory (VM) subsystem now carries out I/O through the page cache, caching files' data as pages.
If cached data has both a file and a block representation, as most data does, then the buffer cache simply points to the data in the page cache.

Hence only one instance of file data gets cached in memory.
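
The unified layout can be pictured with a small toy model. This is only an illustrative sketch, not kernel code: the class name `ToyUnifiedCache`, the 4096-byte page size, the 1024-byte block size, and the block numbers are all assumptions made for the example. The point it shows is that the block entries are views into the page's memory rather than separate copies.

```python
class ToyUnifiedCache:
    """Toy model: one page cache, with block entries that alias into it."""

    PAGE_SIZE = 4096   # assumed page size
    BLOCK_SIZE = 1024  # assumed block size, so four blocks per page

    def __init__(self):
        self.pages = {}    # (file_id, page_index) -> bytearray holding the data
        self.buffers = {}  # block_number -> memoryview into one of the pages

    def cache_page(self, file_id, page_index, data, first_block):
        """Store one page of file data and map its disk blocks onto it."""
        if len(data) != self.PAGE_SIZE:
            raise ValueError("data must be exactly one page long")
        page = bytearray(data)
        self.pages[(file_id, page_index)] = page
        view = memoryview(page)
        for i in range(self.PAGE_SIZE // self.BLOCK_SIZE):
            start = i * self.BLOCK_SIZE
            # A slice of a memoryview shares memory with the page: no copy.
            self.buffers[first_block + i] = view[start:start + self.BLOCK_SIZE]
        return page


cache = ToyUnifiedCache()
page = cache.cache_page("example.txt", 0, bytes(4096), first_block=100)

# Change the data through the file (page) representation...
page[1024:1029] = b"hello"
# ...and the block representation sees the same bytes, because it is the same memory.
print(bytes(cache.buffers[101][:5]))
```

Block 101 covers bytes 1024 to 2047 of the page in this model, which is why the write at offset 1024 shows up at the start of that block view.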

We can now define the page and buffer caches as they exist in Linux 2.4 and later kernels:

1. What is page cache?
The page cache holds pages of file contents, both program code and data. In practice, the executables and libraries of our running applications are file-backed, so they are served from the page cache as well. When a process reads part of a file that is not yet cached, the kernel reads those pages from disk and adds them to the cache.

On the next access, the kernel first checks the page cache and goes to disk only if the pages are not there. Caching file data from disk in this way makes subsequent I/O faster.
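
A rough way to observe this from user space is to read the same file twice and compare the timings. The sketch below is an experiment under assumptions, not a benchmark: the helper names `timed_read` and `compare_reads` are made up for this example, and `os.posix_fadvise` with `POSIX_FADV_DONTNEED` is only a hint that the kernel may ignore. If the temporary directory lives on an in-memory filesystem such as tmpfs, both reads will come from memory and no difference should be expected.

```python
import os
import tempfile
import time


def timed_read(path):
    """Read the whole file and return (elapsed_seconds, bytes_read)."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        n = len(f.read())
    return time.perf_counter() - start, n


def compare_reads(size=8 * 1024 * 1024):
    """Write a scratch file, ask the kernel to evict it, then read it twice."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(os.urandom(size))
            f.flush()
            os.fsync(f.fileno())  # dirty pages cannot be dropped, so flush first
            if hasattr(os, "posix_fadvise"):
                # Hint only: ask the kernel to drop this file's cached pages.
                os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)
        first = timed_read(path)   # may have to go to disk
        second = timed_read(path)  # likely served from the page cache
        return first, second
    finally:
        os.remove(path)


if __name__ == "__main__":
    (t1, n1), (t2, n2) = compare_reads()
    print(f"first read:  {n1} bytes in {t1:.6f} s")
    print(f"second read: {n2} bytes in {t2:.6f} s")
```

On a disk-backed filesystem the second read is usually faster, but the exact numbers depend on the storage device, the filesystem, and whatever else the machine is doing.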

2. What is buffer cache?
The buffer cache remains because the kernel still needs to perform some I/O in terms of blocks, not pages.
Blocks usually also represent file data, so most of the buffer cache simply points to data in the page cache.
However, a small amount of block data is not file data, for example filesystem metadata and raw block I/O, and is represented solely by the buffer cache.
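
On a running system, the two sizes can be inspected through /proc/meminfo, where the kernel documentation describes the `Buffers` field as storage for raw disk blocks and the `Cached` field as the page cache for files. The following is a minimal sketch for reading those two fields; the function names `parse_meminfo` and `cache_summary` are hypothetical helpers introduced for this example.

```python
import os


def parse_meminfo(text):
    """Return {field: number} for 'Name:   value kB' lines (most values are in kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def cache_summary(path="/proc/meminfo"):
    """Return the Buffers and Cached values reported by the kernel."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    return {"Buffers": info.get("Buffers"), "Cached": info.get("Cached")}


if __name__ == "__main__":
    if os.path.exists("/proc/meminfo"):
        print(cache_summary())
    else:
        print("/proc/meminfo not found; this sketch assumes a Linux system")
```

On a typical system `Buffers` is much smaller than `Cached`, which matches the description above: most cached data is file data that lives in the page cache.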

If you found this interesting, please post your comments, feedback, or suggestions below to help improve this further.
