Managing NFS and NIS, 2nd Edition - Mike Eisler [262]
If you choose to ignore this advice and use soft-mounted NFS filesystems, you should at least make NFS clients more tolerant of soft mounts by increasing the retrans mount option. Increasing the number of attempts to reach the server makes the client less likely to produce an RPC error during brief periods of heavy server load.
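For example, a soft mount with retrans raised above its usual value might look like the following sketch (the hostname, paths, and the value 10 are illustrative, not recommendations from a particular vendor):

```
# Command line (illustrative host and paths):
#   mount -o soft,retrans=10 wahoo:/export/home /mnt/home
#
# Equivalent /etc/vfstab entry on a Solaris client:
wahoo:/export/home  -  /mnt/home  nfs  -  yes  soft,retrans=10
```

With a larger retrans, a soft mount rides out short bursts of server load before giving up and returning an RPC error to the application.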
Adjusting for network reliability problems
Even a lightly loaded network can suffer from reliability problems if older bridges or routers joining the network segments routinely drop parts of long packet trains. Older bridges and routers are most likely to affect NFS performance if their network interfaces cannot keep up with the packet arrival rates generated by the NFS clients and servers on each side.
Some NFS experts believe it is a bad idea to micro-manage NFS to compensate for network problems, arguing instead that these problems should be handled by the transport layer. We encourage you to use NFS over TCP, and allow the TCP implementation to dynamically adapt to network glitches and unreliable networks. TCP does a much better job of adjusting transfer sizes, handling congestion, and generating retransmissions to compensate for network problems.
Having said this, there may still be times when you choose to use UDP instead of TCP to handle your NFS traffic.[3] In such cases, you will need to determine the impact that an old bridge or router is having on your network. This requires another look at the client-side RPC statistics:
% nfsstat -rc
Client rpc:
Connection-oriented:
calls      badcalls   badxids    timeouts   newcreds   badverfs
1753569    1412       3          64         0          0
timers     cantconn   nomem      interrupts
0          1317       0          18
Connectionless:
calls      badcalls   retrans    badxids    timeouts   newcreds
12252      41         334        5          166        0
badverfs   timers     nomem      cantsend
0          4321       0          206
When timeouts is high and badxids is close to zero, it implies that the network, or one of the network interfaces on the client, server, or intermediate routing hardware, is dropping packets. Some older host Ethernet interfaces are tuned to handle page-sized packets and do not reliably handle larger packets; similarly, many older Ethernet bridges cannot forward long bursts of packets. Older routers or hosts acting as IP routers may have limited forwarding capacity, so reducing the number of packets sent for any request reduces the probability that these routers will drop packets that build up behind their network interfaces.
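As a rough sanity check, the retransmission rate can be computed directly from the Connectionless counters; a rate of more than a few percent suggests packets are being dropped somewhere along the path. A minimal sketch (the awk pipeline is ours, not part of nfsstat, and the pasted-in numbers come from the sample output above; on a live client you would pipe `nfsstat -rc` output through the same filter):

```shell
# Compute the UDP retransmission rate from the "Connectionless" statistics.
# The two lines pasted in below are the header and counter rows shown above.
printf '%s\n' \
    'calls badcalls retrans badxids timeouts newcreds' \
    '12252 41 334 5 166 0' |
awk '
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i }   # map header names to columns
    NR == 2 { printf "retrans rate: %.1f%%\n", 100 * $(col["retrans"]) / $(col["calls"]) }
'
```

Here 334 retransmissions out of 12,252 calls is a rate of about 2.7%, high enough to start looking for a lossy link or an overloaded router.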
The NFS buffer size determines how many packets are required to send a single, large read or write request. The Solaris default buffer size is 8KB for NFS Version 2 and 32KB for NFS Version 3. Linux[4] uses a default buffer size of 1KB. The buffer size can be negotiated down, at mount time, if the client determines that the server prefers a smaller transfer size.
Compensating for unreliable networks involves changing the NFS buffer size, controlled by the rsize and wsize mount options. rsize determines how many bytes are requested in each NFS read, and wsize sets the number of bytes sent in each NFS write operation. Reducing rsize and wsize eases the peak loads on the network by sending shorter packet trains for each NFS request. Spacing the requests out increases the probability that an entire request reaches the server or client intact on the first transmission, smoothing the overall load on the network and server over time.
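A mount that reduces the buffer sizes for a UDP mount might look like this sketch (the hostname, paths, and the 2KB value are illustrative; 2048 bytes is simply four 512-byte disk blocks, one reasonable reduced value to try):

```
# Command line (illustrative host and paths):
#   mount -o proto=udp,rsize=2048,wsize=2048 wahoo:/export/home /mnt/home
#
# Equivalent /etc/vfstab entry on a Solaris client:
wahoo:/export/home  -  /mnt/home  nfs  -  yes  proto=udp,rsize=2048,wsize=2048
```

With 2KB buffers, an 8KB NFS Version 2 read that would otherwise arrive as a single burst of fragments is broken into four smaller requests, each of which fits in far fewer packets.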
The read and write buffer sizes are specified in bytes. They are generally made multiples of 512 bytes, based on the size of a disk block. There is no requirement that either size be an integer multiple of 512, although using an arbitrary size can make the disk operations on the remote host less efficient. Write operations performed on non-disk block aligned buffers require the NFS server to read the block, modify the block,