
Managing NFS and NIS, 2nd Edition - Mike Eisler [236]


Once the disk device driver has a read or write request, only a media failure causes the operation to return an error status. Any other failures, such as a permission problem, or the filesystem running out of space, are detected by the filesystem management routines before the disk driver gets the request. From the point of view of the read( ) and write( ) system calls, everything from the filesystem write routine down is a black box: the application isn't necessarily concerned with how the data makes it to or from the disk, as long as it does so reliably. The actual write operation occurs asynchronously to the application calling write( ). If a media error occurs — for example, the disk has a bad sector brewing — then the media-level error will be reported back to the application during the next write( ) call or during the close( ) of the file containing the bad block. When the driver notices the error returned by the disk controller, it prints a media failure message on the console.

A similar mechanism is used by NFS to report errors on the "virtual media" of the remote fileserver. When write( ) is called on an NFS-mounted file, the data buffer and offset into the file are handed to the NFS write routine, just as a UFS write calls the lower-level disk driver write routine. Like the disk device driver, NFS has a driver routine for scheduling write requests: each new request is put into the page cache. When a full page has been written, it is handed to an NFS async thread that performs the RPC call to the remote server and returns a result code. Once the request has been written into the local page cache, the write( ) system call returns to the application, just as if the application were writing to a local disk. The actual NFS write is synchronous to the NFS async thread, allowing these threads to perform write-behind. A similar process occurs for reads, where the NFS async thread performs some read-ahead by fetching NFS buffers in anticipation of future read( ) system calls. See Section 7.3.2 for details on the operation of the NFS async threads.

Occasionally, an NFS async thread detects an error when attempting to write to a remote server, and the error is printed (by the NFS async thread) on the client's console. The scenario is identical to that of a failing disk: the write( ) system call has already returned, so the error must be reported on the console and in the next similar system call.

The format of these error messages is:

NFS write error on host mahimahi: No space left on device.
(file handle: 800006 2 a0000 3ef 12e09b14 a0000 2 4beac395)
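When many of these messages accumulate in a console log, it can help to pull out the host, the error text, and the file handle mechanically. The following is a minimal sketch in Python; the pattern is inferred only from the sample message above, so the exact layout on a given release is an assumption, and parse_nfs_write_error is a hypothetical helper name.

```python
import re

# Pattern inferred only from the sample message shown above; real console
# output may differ between releases, so treat this layout as an assumption.
NFS_WRITE_ERROR = re.compile(
    r"NFS write error on host (?P<host>\S+): (?P<reason>.+?)\.?\s*"
    r"\(file handle: (?P<handle>[0-9a-fA-F ]+)\)"
)


def parse_nfs_write_error(text):
    """Return (host, reason, handle_words), or None if text does not match."""
    match = NFS_WRITE_ERROR.search(text)
    if match is None:
        return None
    return (
        match.group("host"),
        match.group("reason"),
        match.group("handle").split(),  # the handle is kept as opaque words
    )


sample = (
    "NFS write error on host mahimahi: No space left on device.\n"
    "(file handle: 800006 2 a0000 3ef 12e09b14 a0000 2 4beac395)"
)

if __name__ == "__main__":
    print(parse_nfs_write_error(sample))
```

The sketch deliberately leaves the file handle as a list of opaque words; it makes no claim about what the individual fields mean.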

The number of potential failures when writing to an NFS-mounted disk exceeds the few media-related errors that would cause a UFS write to fail. Table 15-1 gives some examples.

Table 15-1. NFS-related errors

Error                     Typical Cause
Permission denied         Superuser cannot write to remote filesystem.
No space left on device   Remote disk is full.
Stale filehandle          File or directory has been removed on the server without the client's knowledge.
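Because these errors can surface after write( ) has already returned, an application that cares about its data has to check every later call on the file as well. The sketch below, in Python, checks the result of each write, then calls fsync and close and checks those too. The function name write_checked and the pairing of errno codes (EACCES, ENOSPC, ESTALE) with the causes in Table 15-1 are assumptions made for illustration, and flushing with fsync is one common way to force pending data out before close, not something the text above prescribes.

```python
import errno
import os

# Likely causes, paraphrased from Table 15-1. Pairing them with these errno
# codes is an assumption based on the usual Unix message strings.
LIKELY_CAUSE = {
    errno.EACCES: "superuser cannot write to remote filesystem",
    errno.ENOSPC: "remote disk is full",
    errno.ESTALE: "file or directory removed on the server",
}


def write_checked(path, data):
    """Write data to path. Return None on success, or (errno, hint) on failure."""
    fd = None
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        view = memoryview(data)
        while view:
            written = os.write(fd, view)  # may write fewer bytes than asked
            view = view[written:]
        os.fsync(fd)   # ask the system to push cached data out now
        os.close(fd)   # a deferred error can also be reported here
        fd = None
    except OSError as exc:
        if fd is not None:
            try:
                os.close(fd)
            except OSError:
                pass  # the first error is the one worth reporting
        return exc.errno, LIKELY_CAUSE.get(exc.errno, os.strerror(exc.errno))
    return None
```

A caller would treat any non-None result as a failed write, even if every individual write call appeared to succeed.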

Both the "Permission denied" and the "No space left on device" errors would have been detected on a local filesystem, but the NFS client has no way to determine if a write operation will succeed at some future time (when the NFS async thread eventually sends it to the server). For example, if a client writes out 1 KB buffers, then its NFS async threads write out 8 KB buffers to the server on every eighth call to write( ). Several seconds may go by between the time the first write( ) system call returns to the application and the time that the eighth call forces the NFS async thread to perform an RPC to the NFS server. In this interval, another process may have filled up the server's disk with some huge write requests, so the NFS async thread's attempt to write its 8 KB buffer will fail.
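The timing in this example can be made concrete with a toy model. The Python sketch below is not how any NFS client is implemented; FakeServer and WriteBehindFile are hypothetical classes that only mimic the behavior described: 1 KB writes collect in a local buffer, the eighth write triggers one 8 KB transfer, and a failure of that transfer is reported on the next call rather than on the write that caused it.

```python
import errno

PAGE_SIZE = 8 * 1024  # 8 KB transfer unit from the example above


class FakeServer:
    """Hypothetical stand-in for the remote disk: it only counts free bytes."""

    def __init__(self, free_bytes):
        self.free_bytes = free_bytes

    def rpc_write(self, data):
        if len(data) > self.free_bytes:
            raise OSError(errno.ENOSPC, "No space left on device")
        self.free_bytes -= len(data)


class WriteBehindFile:
    """Toy client that caches writes and sends one full page at a time."""

    def __init__(self, server):
        self.server = server
        self.buffer = bytearray()
        self.pending_error = None  # failure from an earlier transfer
        self.rpc_count = 0

    def _flush(self, size):
        page = bytes(self.buffer[:size])
        del self.buffer[:size]
        self.rpc_count += 1
        try:
            self.server.rpc_write(page)  # stands in for the async thread's RPC
        except OSError as exc:
            self.pending_error = exc     # remembered, not raised yet

    def _report_pending(self):
        if self.pending_error is not None:
            exc, self.pending_error = self.pending_error, None
            raise exc

    def write(self, data):
        self._report_pending()           # surface any earlier failure first
        self.buffer += data              # "returns" once data is cached
        while len(self.buffer) >= PAGE_SIZE:
            self._flush(PAGE_SIZE)
        return len(data)

    def close(self):
        if self.buffer:
            self._flush(len(self.buffer))
        self._report_pending()


def demo():
    server = FakeServer(free_bytes=PAGE_SIZE)
    f = WriteBehindFile(server)
    chunk = b"x" * 1024
    for _ in range(7):
        f.write(chunk)                   # seven writes, nothing sent yet
    rpcs_after_seven = f.rpc_count
    server.free_bytes = 0                # another process fills the disk
    f.write(chunk)                       # eighth write sends 8 KB, which fails
    try:
        f.close()                        # the failure is reported here
    except OSError as exc:
        return rpcs_after_seven, f.rpc_count, exc.errno
    return rpcs_after_seven, f.rpc_count, None
```

In this model the eighth write( ) still returns normally; the application learns of the failure only from close( ) or from its next write( ).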

If you are consistently seeing NFS writes fail due to full filesystems or permission problems, you can usually chase down the user or process that is performing the writes by identifying the file being written. Unfortunately, Solaris does not provide
