03-21-2013 05:34 AM
HP TCP/IP Services v5.7-ECO4 on IA64-VMS 8.4 with all the latest ECOs installed. The NFS server exports a directory (with subdirectories of different sizes) on a regular ODS-2 volume. When I read a subdirectory from the client (SUSE Linux), the directory entries come back sorted by some strange, unrecognizable principle: not by name, not by size, not by creation date. The side effect is poor directory read performance. If I try to read a big subdirectory (say, 10000 entries), the NFS server sits in the CUR state for several hours, eating the whole processor. Obviously it tries to sort this directory. On the other hand, TCP/IP Services v5.4 on AXP-VMS 7.3 reads the same directory for the same client normally.
So the question: which NFS server knob causes this unnecessary sort, and how do I switch it off? I didn't find any.
03-21-2013 06:14 AM
What's your ls command, any option, any alias? Does it colorize the output? In my environment switching off colorizing makes a significant difference (20 seconds versus 4, for about 700 files).
When you see a strange sort order, you may already have turned off local sorting. Usually the directory output is sorted by the local ls, depending on whatever you have specified in LANG, LC_COLLATE, etc., and independent of the NFS server sort order, if there is any.
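To tell the two orders apart on the client, something like the following may help. It is only a sketch in Python (not the tool anyone in this thread used); the function names are made up for illustration, and the temporary directory stands in for the NFS mount point. The first function returns names in whatever order the directory read delivers them, the second applies a plain client-side sort by code point, which for ASCII names should match what ls does in the C locale.

```python
import os
import tempfile


def raw_order(path):
    """Names in the order the filesystem (or NFS server) returns them.

    No sorting is done on the client side.
    """
    with os.scandir(path) as entries:
        return [entry.name for entry in entries]


def client_sorted(path):
    """Same names, sorted locally by code point."""
    return sorted(raw_order(path))


# Demo on a local temporary directory (assumption: replace it with
# the real mount point to see the server's order).
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("b.txt", "a.txt", "c.txt"):
        open(os.path.join(demo_dir, name), "w").close()
    print("raw:   ", raw_order(demo_dir))      # order depends on the filesystem
    print("sorted:", client_sorted(demo_dir))  # alphabetical
```

If the raw order is already odd but the sorted one is fine, the strange order comes from below the client tool, not from ls or the locale settings.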
As you may know, VMS directories are already sorted, by name.
03-21-2013 06:27 AM
Sorry, but I don't know what colorizing means. But I guess it relates to the client, right? We use Perl's plain readdir function on the client, without any options. And it's not the client that sorts the directory; the server does. The same client reads the same directory with the same command normally from TCPIP 5.4 on VMS 7.3. It's definitely the server that tries to sort the directory, although it was not asked to. And I don't know why.
03-21-2013 06:57 AM - edited 03-21-2013 07:58 AM
Can you reproduce this behaviour using the 'ls' command on the NFS client?
Does TCPIP SHOW NFS show any unusually high counters?
What does $ MONITOR FILE report? And $ MONITOR MODE?
Did you read the chapter about 'Managing the File Name Cache' in the TCP/IP Services Mgmt Guide?
03-21-2013 08:08 AM - edited 03-21-2013 08:11 AM
I can more or less reproduce these symptoms on a slow PersonalAlpha running OpenVMS V8.3 and TCPIP V5.7 ECO 3. With 800 files in a directory, performance ($ DIR DNFSx:[000000...] from an OpenVMS NFS V2 client) is acceptable. After creating another 800 files, performance on the NFS server node is so bad that I get SYSTEM-F-TIMEOUT on the NFS client.
The CPU is completely saturated (be aware, it's a PersonalAlpha emulator) and NFS server is BUSY in EXEC mode all the time, mostly in TCPIP$CFS_SHR and LIBOTS.
03-21-2013 09:00 AM
If enabled - some Linux distributions do this via an alias - ls uses colors to show different file types. That requires more data than just the list of files. It may even cause the ls utility to ask the server for data per file.
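A rough way to see that difference from the client is to compare a names-only read with one that also stats every entry. This is only a sketch in Python with made-up function names, and the temporary directory is a stand-in for the NFS mount. Over NFS the per-entry stat may or may not turn into extra requests to the server, depending on the client's attribute caching and the protocol version, so treat the numbers as a hint only.

```python
import os
import tempfile
import time


def names_only(path):
    """Enumerate entries without asking for per-file attributes."""
    with os.scandir(path) as entries:
        return [entry.name for entry in entries]


def names_with_stat(path):
    """Enumerate entries and stat each one, as a colorizing ls has to."""
    result = []
    with os.scandir(path) as entries:
        for entry in entries:
            info = os.lstat(entry.path)
            result.append((entry.name, info.st_mode))
    return result


def timed(func, path):
    """Return (number_of_entries, elapsed_seconds) for one call of func."""
    start = time.perf_counter()
    listing = func(path)
    return len(listing), time.perf_counter() - start


# Demo on a local temporary directory (assumption: use the real
# mount point instead to measure the NFS case).
with tempfile.TemporaryDirectory() as demo_dir:
    for i in range(50):
        open(os.path.join(demo_dir, f"f{i:03d}.dat"), "w").close()
    print("names only:", timed(names_only, demo_dir))
    print("with stat: ", timed(names_with_stat, demo_dir))
```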
Sorry, from "When I read the sub-directory" it wasn't obvious to me what you were doing.
03-21-2013 10:40 PM
Volker, can you see the strange sort order on small directories too? If yes, then it's exactly the same behavior we see here. And the obvious culprit is the unnecessary sort. I had a wild theory (inspired by ) that the client somehow transfers the Russian locale/collating sequence to the server and the server tries to sort the directory according to it. But your experiment killed it, as you hardly have a Russian locale on your client.
Yes, I've read the TCPIP management and tuning manuals back and forth and tried to pull all the described knobs (well, all the knobs which look relevant to me, actually) with no success. There are a lot of undocumented sysconfig parameters in the nfs and vfs sections and the cryptic tcpip$cfs_modus_operandi logical, but I compared the values with the ones from TCPIP 5.4, which works normally, and found them the same. Sigh.
03-22-2013 12:35 AM
Doing a DIR/COL=1 on a small directory (from an OpenVMS NFSv2 client) to an OpenVMS V5.7 ECO 3 NFS server shows the same order of files as the local DIR/COL=1 on the NFS server node. The files are on an ODS-5 disk mounted with /STRUCT=5 from the NFSv2 client. The NFS export has options=Name_cvt.
Behaviour of a TCPIP V5.5 NFS server seems to be similar (running on a faster emulator). It's not causing timeouts, but MONI FILE shows high attempt rates on the various resources.
03-22-2013 05:55 AM
Well, I tried to mount it from another VMS system to rule out the locale/collating sequence theories. The behaviour for a VMS client is a bit different from the Linux one: the sort order is normal alphabetic, but the server hangs on the big directory as well. The NFS server is looping in EXEC mode, and the attempt rate according to MON FILE is zero. So I'm stuck with no more ideas. By the way, I should redefine "small" and "big": the small directory has 2300 files and the big one has 134000 files.
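To find where between "small" and "big" the listing time stops growing in proportion to the number of files, a test along these lines could be run from the Linux client. It is a sketch in Python under assumptions: the function names and the file name pattern are made up, and the temporary directory stands in for a scratch directory on the NFS mount. Start with modest sizes, since a too-big directory may tie up the server for hours.

```python
import os
import tempfile
import time


def populate(path, count, prefix="file"):
    """Create `count` empty files named like file000000.dat in `path`."""
    for i in range(count):
        with open(os.path.join(path, f"{prefix}{i:06d}.dat"), "w"):
            pass


def time_listing(path):
    """Return (entry_count, seconds) for one full directory read."""
    start = time.perf_counter()
    with os.scandir(path) as entries:
        count = sum(1 for _ in entries)
    return count, time.perf_counter() - start


# Demo sizes are deliberately small; on the real mount one would step
# towards the sizes from this thread (800, 1600, 2300, ...).
with tempfile.TemporaryDirectory() as scratch:
    for size in (100, 200, 400):
        sub = os.path.join(scratch, f"n{size}")
        os.mkdir(sub)
        populate(sub, size)
        count, seconds = time_listing(sub)
        print(f"{count:6d} entries: {seconds:.4f} s")
```

If the time per entry jumps sharply at some size, that size is a good candidate to report together with the MONITOR output.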