Managing NFS and NIS, 2nd Edition - Mike Eisler [87]
The client process performs an NFS read RPC. If the client and server are using NFS Version 3, the read request asks for a complete 32-kilobyte NFS buffer; otherwise it asks for an 8-kilobyte buffer. The client process goes to sleep waiting for the RPC request to complete. Note that the client process itself makes the RPC, not an async thread: the client can't continue execution until the data is returned, so nothing is gained by having another thread perform its RPC. However, the operating system will schedule async threads to perform read-ahead for this process, getting the next buffer from the remote file.
The server receives the RPC packet and schedules a kernel server thread to handle it. The server thread picks up the packet, determines the RPC call to be made, and initiates the disk operation. All of these are kernel functions, so the server thread never leaves the kernel. The server thread that was scheduled goes to sleep waiting for the disk read to complete, and when it does, the kernel schedules it again to send the data and RPC acknowledgment back to the client.
The reading process on the client wakes up, and takes its data out of the buffer returned by the NFS read RPC request. The data is left in the buffer cache so that future read operations do not have to go over the network. The process's read( ) system call returns, and the process continues execution. At the same time, the read-ahead RPC requests sent by the async threads are pre-fetching additional buffers of the file. If the process is reading the file sequentially, it will be able to perform many read( ) system calls before it looks for data that is not in the buffer cache.
Changing the number of async threads and server threads, or the NFS buffer sizes, affects the behavior of the read-ahead (and write-behind) algorithms. The effects of varying the number of daemons and the NFS buffer sizes are explored as part of the performance discussion in Chapter 17.
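The read sequence described above can be modeled with a small simulation. This is only a sketch under loud assumptions: FakeServer, Client, read_ahead, and sequential_read are hypothetical names invented for illustration, the "async threads" are simulated by pre-fetching synchronously (so every pre-fetch is assumed to finish before the reader needs it), and each read( ) is assumed to fall inside a single NFS buffer. It is not how any real NFS client is implemented.

```python
class FakeServer:
    """Hypothetical stand-in for an NFS server; counts the read RPCs it serves."""

    def __init__(self, data):
        self.data = data
        self.rpcs = 0

    def read(self, offset, count):
        self.rpcs += 1
        return self.data[offset:offset + count]


class Client:
    """Hypothetical NFS client with a buffer cache and simulated read-ahead."""

    def __init__(self, server, bufsize=32 * 1024, read_ahead=1):
        self.server = server
        self.bufsize = bufsize        # 32 KB, the Version 3 size quoted above
        self.read_ahead = read_ahead  # buffers to pre-fetch past the current one
        self.cache = {}               # buffer index -> bytes
        self.blocking_rpcs = 0        # RPCs the reading process had to sleep on

    def _fetch(self, index):
        self.cache[index] = self.server.read(index * self.bufsize, self.bufsize)

    def read(self, offset, count):
        """Return up to count bytes; the request must not span two buffers."""
        index = offset // self.bufsize
        if index not in self.cache:
            # Cache miss: the process itself issues the RPC and waits.
            self.blocking_rpcs += 1
            self._fetch(index)
        # Stand-in for the async threads: pre-fetch the following buffers.
        # (Simulation shortcut: the file size is read straight off the server.)
        file_size = len(self.server.data)
        for nxt in range(index + 1, index + 1 + self.read_ahead):
            if nxt not in self.cache and nxt * self.bufsize < file_size:
                self._fetch(nxt)
        start = offset % self.bufsize
        return self.cache[index][start:start + count]


def sequential_read(client, size, chunk=4096):
    """Read size bytes from offset 0 in chunk-sized read() calls."""
    return b"".join(client.read(off, chunk) for off in range(0, size, chunk))
```

In this model, reading a 256-kilobyte file in 4-kilobyte pieces with read_ahead=0 makes the process wait on one RPC per buffer, while read_ahead=1 makes it wait only on the first; the server handles the same number of read RPCs either way. A real client's pre-fetches can lose the race with the reader, so the benefit in practice depends on the number of async threads and the buffer size.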
Caching
Caching involves keeping frequently used data "close" to where it is needed, or preloading data in anticipation of future operations. Data read from disks may be cached until a subsequent write makes it invalid, and data written to disk is usually cached so that many consecutive changes to the same file may be written out in a single operation. In NFS, data caching means not having to send an RPC request over the network to a server: the data is cached on the NFS client and can be read out of local memory instead of from a remote disk. Depending upon the filesystem structure and usage, some cache schemes may be prohibited for certain operations to guarantee data integrity or consistency with multiple processes reading or writing the same file. Cache policies in NFS ensure that performance is acceptable while also preventing the introduction of state into the client-server relationship.
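The write side of this idea, holding changes in memory so that several consecutive writes go to disk in one operation, can be sketched as follows. WriteBehindCache and its counters are hypothetical names, and a Python dictionary stands in for the disk; the sketch illustrates general write-behind caching, not the NFS client's actual cache.

```python
class WriteBehindCache:
    """Hypothetical block cache that delays writes until flush()."""

    def __init__(self, disk):
        self.disk = disk        # dict: block number -> bytes (stand-in for a disk)
        self.cache = {}         # block number -> most recent contents
        self.dirty = set()      # blocks changed in memory but not yet on disk
        self.disk_reads = 0
        self.disk_writes = 0

    def read(self, block):
        # Only the first read of a block goes to the disk.
        if block not in self.cache:
            self.disk_reads += 1
            self.cache[block] = self.disk[block]
        return self.cache[block]

    def write(self, block, data):
        # Update memory only; later writes to the same block overwrite it.
        self.cache[block] = data
        self.dirty.add(block)

    def flush(self):
        # One disk write per dirty block, however many times it was changed.
        for block in sorted(self.dirty):
            self.disk[block] = self.cache[block]
            self.disk_writes += 1
        self.dirty.clear()
```

Three write( ) calls to the same block followed by one flush( ) produce a single disk write. The cost is that the disk copy is stale until the flush, which is the kind of consistency concern that leads some cache schemes to be prohibited for certain operations.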
File attribute caching
Not all filesystem operations touch the data in files; many of them either get or set the attributes of the file such as its length, owner, modification time, and inode number. Because these attribute-only operations are frequent and do not affect the data in a file, they are prime candidates for using cached data. Think of ls -l as a classic example of an attribute-only operation: it gets information about directories and files, but doesn't look at the contents of the files.
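To make the distinction concrete, here is a minimal sketch of an attribute-only listing, loosely in the spirit of ls -l. The helper names describe and list_long are hypothetical; os.lstat and os.listdir are standard Python library calls. The point is that the file is never opened and its contents are never read.

```python
import os
import time


def describe(path):
    """Collect a file's attributes without opening or reading the file."""
    st = os.lstat(path)  # attribute lookup only
    return {
        "inode": st.st_ino,
        "owner_uid": st.st_uid,
        "length": st.st_size,
        "modified": time.strftime(
            "%Y-%m-%d %H:%M:%S", time.localtime(st.st_mtime)
        ),
    }


def list_long(directory):
    """Map each name in a directory to its attributes, sorted by name."""
    return {
        name: describe(os.path.join(directory, name))
        for name in sorted(os.listdir(directory))
    }
```

Calling list_long on a directory with many entries issues one attribute lookup per entry, which is why operations of this kind benefit so much from cached attributes.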
NFS caches file attributes on the client side so that every getattr operation does not have to go all the way to the NFS server. When a file's attributes are read, they remain valid on the client for some minimum period of time, typically three seconds. If the file's attributes remain static for some maximum period, normally 60 seconds, they are flushed from the cache. When an application on the NFS client modifies an NFS attribute, the attribute is immediately written back to the server. The only