Managing NFS and NIS, 2nd Edition - Mike Eisler [90]
Having file blocks cached on the server during writes poses a problem if the server crashes. The client cannot determine which RPC write operations completed before the crash, violating the stateless nature of NFS. Writes cannot be cached on the server side, as this would allow the client to think that the data was properly written when the server is still exposed to losing the cached request during a reboot.
Ensuring that writes are completed before they are acknowledged introduces a major bottleneck for NFS write operations, especially for NFS Version 2. A single Version 2 file write operation may require up to three disk writes on the server to update the file's inode, an indirect block pointer, and the data block being written. Each of these server write operations must complete before the NFS write RPC returns to the client. Some vendors eliminate most of this bottleneck by committing the data to nonvolatile, nondisk storage at memory speeds, and then moving data from the NFS write buffer memory to disk in large (64 kilobyte) buffers. Even when using NFS Version 3, the introduction of nonvolatile, nondisk storage can improve performance, though much less dramatically than with NFS Version 2.
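The cost of completing a write before acknowledging it can be illustrated on a local filesystem. The following is a minimal sketch in Python, not NFS server code: stable_write and demo are hypothetical names introduced here, and os.fsync stands in for whatever mechanism a server uses to force data and metadata to disk before replying.

```python
import os
import tempfile


def stable_write(path, offset, data):
    """Write data at offset; return only after it has reached stable storage.

    Hypothetical helper: models a server that must not acknowledge a write
    while the data still sits in a volatile cache.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        written = 0
        while written < len(data):
            # os.write may write fewer bytes than requested, so loop.
            written += os.write(fd, data[written:])
        # Block until the operating system reports the file is on disk.
        # This wait is the bottleneck described in the text.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written  # only now would a reply be sent to the client


def demo():
    """Perform two stable writes to a scratch file and return its contents."""
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "data")
        stable_write(path, 0, b"hello")
        stable_write(path, 5, b" world")
        with open(path, "rb") as f:
            return f.read()
```

Every call pays for a full synchronization to disk, which is why moving the acknowledgment point to nonvolatile memory, as some vendors do, removes most of the delay.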
Using the buffer cache and allowing async threads to cluster multiple buffers introduces some problems when several machines are reading from and writing to the same file. To prevent file inconsistency with multiple readers and writers of the same file, NFS institutes a flush-on-close policy:
All partially filled NFS buffers are written to the NFS server when a file is closed.
For NFS Version 3 clients, any writes that were done with the stable flag set to off are forced onto the server's stable storage via the commit operation.
This ensures that a process on another NFS client sees all changes to a file that it is opening for reading:
Client A                        Client B
open( )
write( )
NFS Version 3 only: commit
close( )
                                open( )
                                read( )
The read( ) system call on Client B will see all of the data in a file just written by Client A, because Client A flushed out all of its buffers for that file when the close( ) system call was made. Note that file consistency is less certain if Client B opens the file before Client A has closed it. If overlapping read and write operations will be performed on a single file, file locking must be used to prevent cache consistency problems. When a file has been locked, the use of the buffer cache is disabled for that file, making it more of a write-through than a write-back cache. Instead of bundling small NFS requests together, each NFS write request for a locked file is sent to the NFS server immediately.
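The difference between write-back buffering with flush-on-close and the write-through behavior used for locked files can be shown with a toy model. This is only a sketch under simplifying assumptions: ToyServer and ToyClientFile are hypothetical classes, writes are append-only, and attribute caching, the commit operation, and the real locking protocol are all ignored.

```python
class ToyServer:
    """Stands in for an NFS server: maps file names to stored bytes."""

    def __init__(self):
        self.files = {}


class ToyClientFile:
    """One client's open handle on a file (hypothetical, append-only model)."""

    def __init__(self, server, name, locked=False):
        self.server = server
        self.name = name
        self.locked = locked
        self.pending = bytearray()  # dirty data not yet sent to the server

    def _send(self, data):
        # Deliver data to the server, appending to what is already stored.
        current = self.server.files.get(self.name, b"")
        self.server.files[self.name] = current + bytes(data)

    def write(self, data):
        if self.locked:
            # Locked file: buffering is bypassed, so every write goes out now.
            self._send(data)
        else:
            # Unlocked file: data is held in the client-side buffer.
            self.pending += data

    def read(self):
        # A reader sees only what the server holds.
        return self.server.files.get(self.name, b"")

    def close(self):
        # Flush-on-close: push any buffered data before the close completes.
        if self.pending:
            self._send(self.pending)
            self.pending.clear()


server = ToyServer()

# Client A writes without a lock; Client B opens too early, then again later.
client_a = ToyClientFile(server, "report")
client_a.write(b"new data")
seen_before_close = ToyClientFile(server, "report").read()
client_a.close()
seen_after_close = ToyClientFile(server, "report").read()

# With a lock, the write is visible on the server without waiting for close.
locked_writer = ToyClientFile(server, "log", locked=True)
locked_writer.write(b"entry")
seen_while_locked = ToyClientFile(server, "log").read()
```

In this model seen_before_close is empty, seen_after_close holds the full write, and seen_while_locked holds the data even though the writer has not closed the file, mirroring the three cases described above.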
Server-side caching
The client-side caching mechanisms — file attribute and buffer caching — reduce the number of requests that need to be sent to an NFS server. On the server, additional cache policies reduce the time required to service these requests. NFS servers have three caches:
The inode cache, containing file attributes. Inode entries read from disk are kept in-core for as long as possible. Being able to read and write these attributes in memory, instead of having to go to disk, makes the get- and set-attribute NFS requests much faster.
The directory name lookup cache, or DNLC, containing recently read directory entries. Caching directory entries means that the server does not have to open and re-read directories on every pathname resolution. Directory searching is a fairly expensive operation, since it involves going to disk and searching linearly for a particular name in the directory. The DNLC works at the VFS layer, not at the local filesystem layer, so it caches directory entries for all types of filesystems. If