Jim Carter's Bugfixes

Back Version NFS Client Sees Files Owned by Nobody

James F. Carter
2013-03-19

Symptom:

I have a back-version client with nfs-client-1.2.3 (or earlier) that uses NFSv4 to mount from an upgraded server running nfs-kernel-server-1.2.6 (the problem is said to start with version 1.2.4). The client sees the server's files as owned by nobody:nobody, with obvious baleful effects. For example, try this:

Command:> ls -l /net/tabbycat/m1/testfile #Execute on client
-rw-r--r-- 1 nobody nobody 10 Mar 19 15:37 /net/tabbycat/m1/testfile
The output should have been:
-rw-r--r-- 1 source source 10 Mar 19 15:37 /net/tabbycat/m1/testfile

The other version combinations work properly and give the expected output (client version mounting from server version): 1.2.3 mounting 1.2.3, 1.2.6 mounting 1.2.6, or 1.2.6 mounting 1.2.3.

What's happening:

With NFSv3 the server reports the numeric user ID and group ID of each file to the client. Therefore it is necessary for the user database to be in sync on the two machines. /etc/passwd and /etc/group have to be identical. Sun's NIS or LDAP can make synchronization easier, unless you're mounting across domains like I am.
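To check whether the numeric IDs really agree, the UID and GID of a user can be looked up on each machine and compared. The following is a minimal sketch, not part of the original fix: the function name uid_gid_of is made up for illustration, and it reads a passwd-format file, which covers local accounts only.

```shell
# Print "uid:gid" for a login name from a passwd-format file.
# Usage: uid_gid_of NAME [PASSWD_FILE]   (PASSWD_FILE defaults to /etc/passwd)
# Returns non-zero if the name is not found.
uid_gid_of() {
    awk -F: -v name="$1" '
        $1 == name { print $3 ":" $4; found = 1 }
        END        { exit !found }
    ' "${2:-/etc/passwd}"
}
```

Run `uid_gid_of source` on both the client and the server; with NFSv3 the two results have to match. If the accounts come from NIS or LDAP, `getent passwd source | cut -d: -f3,4` should give the same information.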

NFSv4 has a new daemon, rpc.idmapd, which translates UIDs and GIDs to and from alphabetic names of the form name@domain. Thus, assuming the domain is the same on the client and the server, only the alphabetic names have to be in sync.
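Whether the domain matches can be checked by reading it from the idmapd configuration on both machines. This is a sketch under assumptions: rpc.idmapd normally takes the domain from the Domain setting in /etc/idmapd.conf and falls back to the host's DNS domain when the setting is absent, but the file location may differ on your distribution. The function name idmap_domain is invented for this example.

```shell
# Print the value of the uncommented "Domain = ..." line in an idmapd.conf file.
# Usage: idmap_domain [CONF_FILE]   (CONF_FILE defaults to /etc/idmapd.conf)
# Prints nothing if no Domain line is set.
idmap_domain() {
    sed -n 's/^[[:space:]]*Domain[[:space:]]*=[[:space:]]*//p' \
        "${1:-/etc/idmapd.conf}"
}
```

If `idmap_domain` prints different values on the client and the server, or prints nothing on one of them, the name@domain strings may not match and names can end up mapped to nobody.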

Formerly (including version 1.2.3), NFSv4 servers never used idmapd when the security style was sec=sys, which is the default. Starting on a non-obvious date, presumably with v1.2.4, they started using it. As of 2012-01-09 (I'm not sure what version this is), this patch for NFS adds a new parameter to the server driver whose default setting claims to revert to the v1.2.3 behavior; in other words, if you want idmapd you have to ask for it. The parameter is called nfs4_disable_idmapping; its values are 0 or 1, with 1 as the default, and both the nfsd.ko and nfs.ko kernel modules have it. This parameter, of course, is not present in version 1.2.3.
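On a running system the current setting can be inspected through sysfs. The sketch below assumes the usual location for module parameters, /sys/module/MODULE/parameters/, and the function name show_idmap_param is hypothetical. The optional argument exists only so the function can be pointed at a different directory tree.

```shell
# Report nfs4_disable_idmapping for the nfs (client) and nfsd (server) modules.
# Usage: show_idmap_param [SYSFS_MODULE_DIR]   (defaults to /sys/module)
show_idmap_param() {
    sysroot="${1:-/sys/module}"
    for mod in nfs nfsd; do
        f="$sysroot/$mod/parameters/nfs4_disable_idmapping"
        if [ -r "$f" ]; then
            # Boolean module parameters are typically displayed as Y or N.
            printf '%s: %s\n' "$mod" "$(cat "$f")"
        else
            # Module not loaded, or too old to have the parameter.
            printf '%s: parameter not present\n' "$mod"
        fi
    done
}
```

To change the value persistently, a line such as `options nfsd nfs4_disable_idmapping=0` in a file under /etc/modprobe.d/ (the file name is your choice) should take effect the next time the module is loaded.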

My experience is that on OpenSuSE 12.2 dated about 2012-08-29 (NFS-1.2.6) the parameter has no effect on my issue; the result is the same with it set to 0 or to 1 or unset (default 1) on the v1.2.6 server.

How to fix:

See this post on an OpenSuSE networking forum for a solution. The original poster (OP) is omattiaso, and the post is dated 2013-01-24. He has my exact situation and outcome. Knurpht recommends a scorched-earth solution: reverting to NFSv3. To do that, edit /etc/nfsmount.conf on the client, then uncomment and alter the setting to read Nfsvers=3.

This fixed the problem for the OP, and also fixed it for me.

Evidently /etc/nfsmount.conf is re-read on each mount, so it is not necessary to reboot the client after you change the file.
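After remounting, it is worth confirming that the mount really came up as NFSv3. A minimal sketch, assuming the kernel lists the negotiated version as a vers= (or nfsvers=) option in /proc/mounts; the function name nfs_mount_versions is made up for illustration.

```shell
# List each NFS mount point with the protocol version from its mount options.
# Usage: nfs_mount_versions [MOUNTS_FILE]   (defaults to /proc/mounts)
nfs_mount_versions() {
    awk '$3 ~ /^nfs4?$/ {
        vers = "unknown"
        n = split($4, opt, ",")
        for (i = 1; i <= n; i++)
            if (opt[i] ~ /^(nfs)?vers=/) {
                sub(/^(nfs)?vers=/, "", opt[i])
                vers = opt[i]
            }
        print $2, vers
    }' "${1:-/proc/mounts}"
}
```

With the fix in place, the line for /net/tabbycat/m1 should show version 3. `nfsstat -m` reports similar information. For a one-off test that does not touch /etc/nfsmount.conf, the version can also be forced on the command line with `mount -t nfs -o vers=3 SERVER:/EXPORT /MOUNTPOINT`, where the server, export and mount point are placeholders.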

Here's a checklist of configuration issues for rpc.idmapd mentioned in various forum postings. I put varying amounts of work into them without fixing my problem, but other people apparently had these issues and got relief from the fixes.

The NFS system has a cache for ID translations, with a timeout of 600 seconds. The cache includes negative entries, that is, keys that are translated to nobody. When testing idmapd fixes, it is never clear whether the cache has been cleared: idmapd has code to clear it at startup, but the symptoms suggested that it may have kept its maps. It's safest to reboot both the server and the client (test machines or virtual machines) after each configuration change.

