Tracing NFS activity
[W.I.P.]
The problem
You need to benchmark a new, pre-production NFS server. "Standard" benchmarks
(bonnie, iozone, plain old dd) are not enough, because the workload they
generate differs from the "real" one you expect to have in a full production
environment.
So you need to stress-test the new server with a workload as similar as
possible to the "real" one, which means you first need to measure precisely
what your workload is.
Partial solution
Collect data
tcpdump -c 1000000 -i bond0 -p -s 256 -w /local_scratch/pcap/bond0.pcap \
    -C 16 host 10.2.0.100 and udp and 'ip[6:2] & 0x1fff = 0' and \
    not multicast and not broadcast
- this assumes you are doing NFS-over-UDP and there is no other significant UDP
  traffic; change the filter to suit your setup
- -s 256: according to the tcpdump manpage 192 should be enough, but with that
  value filenames are sometimes truncated
- 'ip[6:2] & 0x1fff = 0': this selects only non-fragmented packets and first
  fragments; everything else is useless, since later fragments carry no
  RPC header
- do not save the pcap files on a network-mounted filesystem; save to a
  local fs and retrieve them later
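To make the fragment test concrete: bytes 6-7 of the IP header hold three flag bits followed by a 13-bit fragment offset (counted in 8-byte units), and the mask 0x1fff keeps only the offset. The following is a minimal sketch of the same check in shell; the function names are made up for illustration and the hex words in the usage note are hand-written examples, not captured data.

```sh
# Succeed (exit 0) when the flags+offset word from IP header bytes 6-7 has a
# fragment offset of zero, i.e. the packet is unfragmented or is the first
# fragment. Same test as the capture filter 'ip[6:2] & 0x1fff = 0'.
# Argument: the 16-bit word as a hex number, e.g. 0x2000.
frag_offset_is_zero() {
    [ $(( $1 & 0x1fff )) -eq 0 ]
}

# Print the fragment offset in bytes (the field counts 8-byte units).
frag_offset_bytes() {
    echo $(( ($1 & 0x1fff) * 8 ))
}
```

With these, 0x4000 (DF set, offset 0) and 0x2000 (MF set, offset 0: a first fragment) both pass the test, while 0x20b9 (MF set, offset 185 units = 1480 bytes) and 0x00b9 (last fragment at the same offset) do not.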
Analyze data
tcpdump -n -r bond0.pcap8 -vvv
or, if you are interested only in the workload (and not in the client it originates from):
tcpdump -n -r bond0.pcap8 -vvv | egrep -o '[^ ]+ fh [^ ]+ .+'
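A first summary of the extracted lines is a count per operation type. The following is a minimal sketch, assuming each extracted line starts with the NFS operation name followed by the word fh (which is what the pattern above selects); the function name count_nfs_ops is hypothetical.

```sh
# Read lines of the form "<op> fh <handle> <rest>" on stdin and print
# "<count> <op>", most frequent operation first (ties sorted by name).
count_nfs_ops() {
    awk '$2 == "fh" { n[$1]++ } END { for (op in n) print n[op], op }' |
        sort -k1,1nr -k2,2
}
```

It can be appended to the pipeline above:

tcpdump -n -r bond0.pcap8 -vvv | egrep -o '[^ ]+ fh [^ ]+ .+' | count_nfs_ops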
This ugly piece of code will split the NFS filehandle into the slightly more
meaningful format inode/generation/pad:
(see also "NFS File Handle Security" for a more complete decode of the FH)
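As an illustration of such a split, here is a minimal sketch in shell. It rests on a layout that is purely an assumption made for the example: the handle is given as a hex string whose first 8 digits are the inode, the next 8 the generation (both read as big-endian numbers), and the remainder padding. Real file handles are server- and filesystem-specific, so the offsets will almost certainly need adjusting; the function name split_fh is hypothetical.

```sh
# Split a hex-encoded file handle into inode/generation/pad.
# ASSUMED LAYOUT (hypothetical, adjust for your server):
#   hex digits  1-8  : inode       (big-endian)
#   hex digits  9-16 : generation  (big-endian)
#   hex digits 17-   : padding, printed unchanged
split_fh() {
    fh=$1
    # Need at least 16 hex digits for the two numeric fields.
    [ "${#fh}" -ge 16 ] || { echo "handle too short: $fh" >&2; return 1; }
    inode_hex=$(printf '%s' "$fh" | cut -c1-8)
    gen_hex=$(printf '%s' "$fh" | cut -c9-16)
    pad=$(printf '%s' "$fh" | cut -c17-)
    # printf %d accepts C-style 0x constants and prints them in decimal.
    printf '%d/%d/%s\n' "0x$inode_hex" "0x$gen_hex" "$pad"
}
```

Under that assumed layout, split_fh 0001e2400000002a00000000 prints 123456/42/00000000.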
WiP
Next, one needs to:
- map inodes to paths
- possibly merge consecutive read/write operations
- create a filesystem with "more or less" the same structure (i.e. the same directory
hierarchy and filenames; contents can be borrowed from /dev/zero)
- finally generate some code that produces the same workload
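Two of these steps, mapping inodes to paths and building a look-alike filesystem, can be sketched in shell. This is only a starting point under stated assumptions: it relies on GNU find (-printf is not POSIX) and on head -c, it must be run on the server against the exported directory, it breaks on file names containing newlines, and all three function names are made up for illustration.

```sh
# Print "<inode><TAB><path>" for everything under a directory, so that
# inode numbers decoded from file handles can be looked up later.
inode_map() {
    find "$1" -xdev -printf '%i\t%p\n'
}

# Print the path(s) recorded for one inode in a map file from inode_map.
# Usage: inode_to_path MAPFILE INODE
inode_to_path() {
    awk -F '\t' -v ino="$2" '$1 == ino { print $2 }' "$1"
}

# Recreate the directory hierarchy and file names of SRC under DST,
# filling every file with zeros up to its original size.
# Usage: clone_skeleton SRC DST
clone_skeleton() {
    src=$1
    dst=$2
    tab=$(printf '\t')
    mkdir -p "$dst"
    # Directories first, so every file below has a parent to land in.
    ( cd "$src" && find . -type d -print ) | while IFS= read -r d; do
        mkdir -p "$dst/$d"
    done
    # Then files: find reports "<size><TAB><relative path>".
    ( cd "$src" && find . -type f -printf '%s\t%p\n' ) |
    while IFS=$tab read -r size f; do
        head -c "$size" /dev/zero > "$dst/$f"
    done
}
```

For example, inode_map /export > /local_scratch/inodes.tsv followed by inode_to_path /local_scratch/inodes.tsv 123456 would print the path that currently has inode 123456 (both paths and the inode number are placeholders). Hard-linked files show up once per name, and the map goes stale as soon as files are created or deleted, so it should be taken close to the capture.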