A Critique of Kerberos and AFS
James F. Carter
2002-05-06
At UCLA-Mathnet we use Sun's NIS for user authentication, and thus
everyone's encrypted password is exposed to all users on local systems
without a root exploit. For distributed file service we use Sun's
NFS, whose access control is host-based and depends on user authentication
on the client; thus it is insecure to export filesystems to machines
not under our direct control, and spoofing exploits are not hard.
Neither protocol works well when run through a firewall. We are therefore
investigating alternative systems for authentication and distributed
files. The conclusion is that Kerberos and AFS are the leading contenders.
What possibilities are there for authentication? I restrict the scope
to only those that have PAM modules, since PAM is essential in Solaris
and much preferred in Linux.
- NIS: Sun's Network Information Service distributes (among other things)
the traditional password file from a central server to all authenticating
clients. The encrypted passwords are available to anyone, with an
easy command line interface. NIS is what we are using now, and want
to replace.
- NIS+: It uses Kerberos for the authentication component, but there
is a lot of baggage that comes with it. We tried to install NIS+;
it worked, but we feared that the excess baggage would break and we
would not be able to fix it, so we reverted to NIS.
- Miscellaneous alternatives: Besides putting a traditional UNIX passwd
file on every client, you can store name-password pairs in a SQL database
or an LDAP server, or you can use a Windows NT PDC via SMB, or an IMAP
server. None of these are useful for us.
- Kerberos: It has been around for a long time and is well understood.
Beyond just matching up a user ID and password, it can be used for
access control to Kerberos-aware services, and there is a way to propagate
tickets to remotely spawned sessions. Windows NT uses Kerberos for
its authentication component.
- Radius (RFC 2865) is widely used by embedded systems like routers,
and by ISPs to authenticate dial-in clients. A Radius server is available
for Linux. On the wire the presented password is obscured by a not
particularly strong cipher. The password is typically memorized but
there is some indication of smart card implementations too. Radius
would be a contender if it met our other needs.
- TACACS (RFC 1492): Similar to Radius but without the encryption. Cisco
offers a product with proprietary extensions. Not considered further.
- One-time passwords: S/KEY or OPIE (RFC 1760 or 2289) transform a memorized
password by repeatedly hashing it with MD4 or MD5 (plus a random seed), thus providing
a sequence of values, each used only once, that represent the password
and which can be sent over the wire without any benefit to an eavesdropper.
But the sequence has finite length and can only be resynced on a network
free of hackers. This is not worth the bother.
- Physical objects: The SecurID card by RSA shows a number which is
a function of time, used as a password; the server can reproduce the
number. The iButton has a matching reader and contains a readonly
serial number plus user-defined information. Other smart cards exist
but PAM modules for them are not known. None of these is likely in
our environment.
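The one-time password scheme in the list above can be sketched as a hash chain. This is a simplified illustration, not the actual S/KEY or OPIE algorithm: the real protocols fold each digest to 64 bits and encode it as short words, which is omitted here. The function and class names, the password and the seed are invented for the example.

```python
import hashlib


def otp(secret: bytes, seed: bytes, n: int) -> bytes:
    """Hash seed+secret n times (simplified; real S/KEY folds each digest to 64 bits)."""
    value = seed + secret
    for _ in range(n):
        value = hashlib.md5(value).digest()
    return value


class Server:
    """Hypothetical server side: stores only the last accepted value, never the password."""

    def __init__(self, initial: bytes, count: int):
        self.stored = initial  # the value after `count` hashes
        self.count = count     # the next login must present hash number count-1

    def login(self, presented: bytes) -> bool:
        # The chain is exhausted once the next value would be the unhashed secret.
        if self.count <= 1:
            return False
        # Hashing the presented value once must reproduce the stored value.
        if hashlib.md5(presented).digest() != self.stored:
            return False
        self.stored = presented
        self.count -= 1
        return True


secret, seed = b"memorized password", b"ke1234"
server = Server(otp(secret, seed, 100), 100)

first = otp(secret, seed, 99)
print(server.login(first))                   # True: a fresh value is accepted
print(server.login(first))                   # False: an eavesdropper's replay is rejected
print(server.login(otp(secret, seed, 98)))   # True: the next value in the sequence
```

An eavesdropper who captures one value cannot compute the next, because that would require inverting the hash. The `count` field shows why the sequence has finite length: once it runs down, a new chain has to be installed on the server over a channel that is trusted.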
Of these authentication services, Radius and Kerberos are the major
contenders. For Kerberos the tie-in with both Windows NT and Solaris
is a big attraction, and also, AFS (if we choose to install it) requires
Kerberos for access control. So let's look more closely at the advantages
of Kerberos.
- Users are authenticated with strong cryptographic keys. Brute force
attacks are unlikely to work. If a hacker does a root exploit on a
machine, Kerberos credentials are useless if the user is not logged
on, and if the tickets of a logged-in user are stolen, they are valid
only for a limited time. Of course a Trojan horse can be installed
to capture users' pass phrases.
- Authentication information is never passed on the net in clear text,
nor is it available to ordinary users in bulk. With NIS, you can just
do ``ypcat passwd'' and have all the encrypted passwords, cracking
them at your leisure.
- Kerberos includes an access control component, for applications programmed
to use it.
- Users can authenticate manually at a remote site, if needed, while
remaining authenticated locally. Blanket cross-realm trust can also
be set up.
- Kerberos authentication can be honored on remote sites for access
control, provided the relevant servers and clients are aware of Kerberos.
For telnet and ftp, kerberized versions are part of the standard distribution.
Kerberized rsh, rcp and rlogin are also available but are deprecated because the
tickets are transferred in clear text. OpenSSH can be compiled with
Kerberos v4 support (I don't know about Ylonen ssh). Besides ssh,
the standard distribution provides daemons that can forward generic tickets and
X Window connections securely. AFS file sharing includes access control
through Kerberos as an integral part.
- Kerberos has remote administration based on Kerberos tickets and an
access control list; you don't have to be root, nor on the server
machine, to administer it.
- Slave servers are a standard feature. Incremental updating of the
database is an ``advanced feature'' which appears to work; bulk updates
(as done for NIS) are actually the normal mode of operation.
- It is possible to dump the database in ASCII form, and restore it.
- Windows NT (and successors) use Kerberos for authentication by default.
The Windows machine can be configured to use a UNIX Kerberos server.
I don't know if Windows knows about slave servers; probably it does.
If this were done, UNIX and Windows accounts would be controlled as
a unit, and password changes on one would be immediately seen on the
other. (I haven't seen discussion of UNIX authenticating to a Windows
Kerberos server; but there is a PAM module for SMB authentication
which ultimately uses Kerberos.)
So what is wrong with Kerberos that might prevent us from using it?
- I have seen criticism of the Solaris and Linux PAM modules for Kerberos;
however, this may be old information, since there were no bug reports
for the PAM module in a recent Debian distro.
- Recent source code for both AFS and ssh uses the deprecated Kerberos
version 4, not version 5. It would appear that the developers of these
subsystems are not aggressively keeping them up to date with Kerberos.
However, the Kerberos 5 server can serve Kerberos 4 clients.
- Ticket forwarding can be a problem: if you log in to machine A and
get a set of tickets, then from there you use ssh to execute a command
relevant to Kerberos on machine B, you will need tickets on B, and
most likely you won't have them. The same problem happens if you queue
a batch job. Tickets are saved in /tmp of the local machine, in a
file readable only by the user (and root), and have to be propagated
using the provided client and daemon, manually or as part of a script.
Ssh can propagate AFS-related tickets by itself, but the documentation
says that other tickets are not propagated.
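The ticket forwarding and batch job problems both come down to where a ticket is cached and when it expires. A toy model, assuming a made-up `Ticket` record, a hypothetical user `alice` and an arbitrary ten-hour lifetime; real tickets also carry encrypted session keys, which are left out here:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Ticket:
    """Hypothetical stand-in for a Kerberos ticket: owner, caching host, validity window."""
    principal: str
    host: str            # machine whose /tmp holds the ticket file
    issued: datetime
    lifetime: timedelta

    def valid_at(self, when: datetime) -> bool:
        # Valid from the moment of issue up to, but not including, the expiry time.
        return self.issued <= when < self.issued + self.lifetime


def can_use(ticket: Ticket, host: str, when: datetime) -> bool:
    """A process can use the ticket only on the caching host and inside the window."""
    return ticket.host == host and ticket.valid_at(when)


login = datetime(2002, 5, 6, 9, 0)
t = Ticket("alice", "A", login, timedelta(hours=10))

print(can_use(t, "A", login + timedelta(hours=1)))    # True: interactive work on A
print(can_use(t, "B", login + timedelta(hours=1)))    # False: ssh to B without forwarding
print(can_use(t, "A", login + timedelta(hours=14)))   # False: batch job started overnight
```

Forwarding fixes the second case by placing a copy of the ticket on B. The third case needs either a longer-lived or renewable ticket, or some way to obtain fresh credentials when the job starts.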
I would conclude that Kerberos is likely to work; the transition to
Kerberos is likely to be transparent for most but not all of our users;
Kerberos will be much more secure than what we have; and we will get
additional capabilities which may be very useful.
What are we looking for in a distributed filesystem? We want its access
control to be hard to circumvent, and we want it to work through an
aggressive firewall. It must be moderately scalable in the sense that
we have 32 fileservers with 168 exported filesystems among them. Encrypting
file content in transit would be nice but is not essential. Replicated
readonly volumes for software (with transparent failover) would be
a helpful addition. Also helpful would be sufficiently robust security
that off-campus machines beyond our control could mount our filesystems
with low risk to us. We must either have a dedicated Windows client
or the data must be exportable via Samba.
In a web search I found a number of cluster filesystems. I don't think
they're relevant here; their intended use is for high-performance
parallel I/O, as in a Beowulf cluster, or for Storage Area Networks
(i.e. fileserver appliances), or for enormous worldwide databases.
I'm listing them here so in future searches they can be recognized
and ignored.
- GASS from the Globus Toolkit. For Beowulf.
- PVFS, Parallel Virtual File System, from Clemson. For Beowulf.
- Veritas Distributed Filesystem. Proprietary, for a commercial SAN
product.
- GFS Global Filesystem, a SAN block-level cluster design.
- APGrid Datafarm, targeted for petabytes of data spread over thousands
of nodes.
- OIF, Oracle Internet Filesystem. Stores files in an Oracle database.
There are, however, a few filesystems targeted at our kind of application.
- Sun's NFS is the distributed filesystem that Mathnet currently uses.
Its disadvantages are that users are identified on the insecure client,
not the server; data is transferred by UDP, which is fine for the interior
LAN but not if a serious firewall has to be traversed; it has no encryption
(though supposedly Sun's implementation can be made to encrypt); and
it is not particularly fast, having no local cache.
- AFS is the best-established distributed filesystem. It has been
around since 1984, and OpenAFS is under active development. A number
of organizations, big and small, use it and are happy with it. Its
features, advantages and disadvantages are discussed below, but in
summary: It has excellent security, encryption and access control.
It has client side caching, server replication, and tolerance of server
failures (if replicated). Apparently it is reasonably scalable. It
works directly on the raw device, needing special information in the
inodes. It is not much slower than NFS (faster when the cache can
be utilized). It is available for Solaris and Linux; and there is
a Windows client (not server).
- Coda filesystem. It has client side persistent caching and server
replication. It has good security for authentication, encryption,
and access control. It is tolerant of network and server failures,
and is designed so clients can be disconnected for mobile operation.
It has good scalability. It uses native filesystem formats such as UFS
or ext2, but the files are organized idiosyncratically, not really
accessible except via the Coda client. It is available for Linux and
Solaris. It is under active development and is a fairly mature product.
It is not particularly fast when writing, which requires a synchronous
round-trip to the server.
- InterMezzo filesystem. The leader of the Coda team did a ground-up
redesign to make it faster. It uses an existing disc filesystem format
and driver; it is known to work on ext2, ext3 and tmpfs. Its kernel
module is included in the Linux kernel starting with 2.4.15. I believe
there is no Solaris kernel module. I'm afraid that it's kind of alpha
level, but it could be a contender in a few years.
- DCE/DFS, an OSF standard set of services and interfaces for client-server
applications. Uses Kerberos-5 for authentication, and includes a distributed
file system with local caching and AFS-style access control and write
synchronization. DCE includes a lot more good stuff, including portable
RPC, a thread library, naming service (for files on remote machines),
communication with alien cells, and time synchronization. It isn't
clear whether you have to take the whole package to get any part of
it. SGI may be in charge of development.
Of these filesystems, NFS is the devil that we know, which all others
have to measure up to. AFS is a strong contender, used at a number
of sites to do what we're trying to do. Coda and DCE/DFS may or may
not be contenders also, but our lack of familiarity with them puts
risk in their columns. InterMezzo is for the future.
So if we replace NFS at all, we'll almost certainly replace it with
AFS. What are its advantages?
- AFS uses Kerberos version 4 tickets, issued by a daemon on or adjacent
to the fileserver, to determine if users have permission to read particular
files. ``Honor system'' user identification is a major weakness of
Sun's NFS. By ``honor system'' I mean that if a machine is not under
our administrative control, because it is foreign, or is a personal
laptop, or is infested by a hacker, our choices are to trust the user
identities it passes out, or to not export files to it.
- File content is encrypted in transit. Eavesdropping is not a major
threat for Mathnet, but is important for other organizations.
- AFS protocols are organized so users can connect to the fileserver
through a firewall. It is feasible, and is common practice, for users
on foreign systems to mount AFS volumes: either world-readable ones
or, if the users have authenticated themselves to the fileserver,
restricted ones.
- AFS has more flexible access control than standard UNIX: the directory's
owner can specify an arbitrary access control list, if desired.
- For readonly (software residence) volumes that are replicated, AFS
has automatic and transparent failover when a server dies.
- For writeable (home directory) volumes, there is a local write-back
cache on the client. A standard feature is a backup copy of all
files with copy-on-write semantics, so the version as of the last
backup is kept automatically, unobtrusively, and without using much
actual disc space as long as most files are not modified.
- A client is available to mount AFS volumes on Microsoft Windows NT,
but Windows directories cannot be exported as AFS volumes.
- A PAM module is available that converts Kerberos credentials to AFS
tokens at login time.
- We won't be alone in using AFS. Here are a few other users, mostly
hype from the OpenAFS web site (selected for positive results):
- Duke University, serving student home directories in campus labs.
Solaris 2.6 through 8 on servers; Windows and MacOS-X clients. Has
cross-realm trust between Kerberos-5 and Win2K-XP Active Directory.
- KTH EE Department (Sweden; newly expanded installation), 1 Tbyte of
data on several HP Alpha machines running Tru64 UNIX. 400 PC or Mac
clients, 15 Solaris clients. No server crashes in the 6 months that
they've used it.
- A high school in Germany: 1.6 Tbyte on 4 servers, 100 to 150 PC-type
clients running Linux and Win2K. 1500 student users.
- CMU Computing Services Division: Solaris 2.6 servers; Solaris 7 and
8 clients; also Linux and WinNT. Doesn't say how many.
- Stanford University: All student home directories, accessible from
the dorms, public labs and departmental installations. Solaris servers,
Windows and Mac clients (and presumably Linux). They have used it
at least since 1996. Trouble-free from the user's point of view.
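The copy-on-write backup mentioned among the advantages above can be pictured with a toy model. This is only a sketch of the idea, not the AFS on-disk format; the `Volume` class and the file names are invented for illustration.

```python
class Volume:
    """Toy volume mapping file names to contents, with one copy-on-write backup."""

    def __init__(self):
        self.live = {}     # what users read and write
        self.backup = {}   # state as of the last snapshot

    def write(self, name: str, data: str) -> None:
        # Only the live entry is rebound; the backup still refers to the old content.
        self.live[name] = data

    def snapshot(self) -> None:
        # Shallow copy: the backup shares content with the live volume,
        # so taking it costs one table of references, not a second copy of the data.
        self.backup = dict(self.live)

    def diverged(self) -> int:
        """Number of backed-up files whose content now differs from the live volume."""
        return sum(1 for name, old in self.backup.items()
                   if self.live.get(name) != old)


vol = Volume()
vol.write("thesis.tex", "draft 1")
vol.write("notes.txt", "todo list")
vol.snapshot()
vol.write("thesis.tex", "draft 2")

print(vol.backup["thesis.tex"])   # draft 1
print(vol.live["thesis.tex"])     # draft 2
print(vol.diverged())             # 1
```

Extra space is consumed only for files that change after the snapshot, which is why the backup stays cheap when most files are left alone.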
AFS sounds like a really wonderful replacement for NFS, so why don't
we, and everyone else, embrace it wholeheartedly?
- Kerberos version 4 is the old version, and is potentially vulnerable
to replay attacks, though such attacks are not prominent in the security
mailing lists. However, it would appear that Kerberos-5 can support
AFS in a compatibility mode.
- NFS operates ``on top of'' local filesystems; in other words it moves
the data between machines but has nothing to do with how it is stored
on the fileserver. On the other hand, AFS volumes are specially formatted
for AFS, and are accessible only through AFS, even locally. Thus an
organization deploying AFS has to trust it from the beginning, and,
for each volume being converted, has to make an instantaneous
transition from no AFS to all AFS.
- AFS file permissions are not the same as UNIX. While AFS access control
may be ``better'', scripts and procedures that assume a UNIX background
may find surprises if run on AFS. Only the ``owner'' UNIX permissions
have any effect.
- Remote authentication and forwarding of tickets is not necessarily
transparent to the user. This is unimportant if the user works on
one client (workstation) at a time, but where multiple machines are
involved, new procedures may have to be developed and learned by the
users. This is particularly a problem for asynchronous jobs, i.e.
submitted to a batch queuing system, and starting or finishing after
the user has logged out, invalidating his tickets. This can be dealt
with, but the procedure has to be researched (by us) and learned (by
the user).
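The kind of surprise that AFS permissions can spring on UNIX-minded scripts is easiest to see side by side. The sketch below is a simplified model: it assumes that reading a file needs the `r` and `l` rights in the directory's access control list plus the file's owner read bit, and it ignores negative ACL entries. The function names and the user `alice` are invented for the example; `system:anyuser` is the AFS group that stands for everyone.

```python
import stat

# AFS directory rights: read, lookup, insert, delete, write, lock, administer.
AFS_RIGHTS = frozenset("rlidwka")


def unix_other_can_read(mode: int) -> bool:
    """Classic UNIX check for a user who is neither the owner nor in the group."""
    return bool(mode & stat.S_IROTH)


def afs_can_read(dir_acl: dict, identities: set, mode: int) -> bool:
    """Simplified AFS check: directory ACL plus the file's owner read bit.

    dir_acl maps a user or group name to a string of rights letters.
    """
    rights = set()
    for name, letters in dir_acl.items():
        if name in identities:
            rights |= set(letters) & AFS_RIGHTS
    owner_read = bool(mode & stat.S_IRUSR)
    return {"r", "l"} <= rights and owner_read


acl = {"system:anyuser": "rl", "alice": "rlidwka"}
visitor = {"system:anyuser"}

# Mode 0600 looks private under UNIX, but the directory ACL lets anyone read it.
print(unix_other_can_read(0o600), afs_can_read(acl, visitor, 0o600))   # False True
# Mode 0004 is world-readable under UNIX, but the owner read bit is off.
print(unix_other_can_read(0o004), afs_can_read(acl, visitor, 0o004))   # True False
```

A script that protects a file with `chmod 600` is relying on the group and other bits, which have no effect here; under this model the protection has to come from the directory's access control list instead.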
So what should we do now?
- Set up a Kerberos realm and have the MCG staff use it for UNIX authentication.
Make sure we can make it function.
- Have the MCG staff use the UNIX Kerberos for Windows authentication.
- Assuming we're going to commit to Kerberos, redesign our root access
paradigm to take advantage of Kerberos security.
- Think very hard whether the advantages of AFS -- and they are real --
are enough to justify abandoning NFS. An initial step would be to
put the MCG home directories on an AFS filesystem, and install just
the clients globally. A prerequisite would be to have Kerberos operating
reliably.
- In particular, have Windows home directories of MCG staff served from
AFS. At all the AFS sites with testimonials, this was the biggest
part of the use of AFS.
- A Radius to Kerberos proxy server might be useful for controlling
administrative access to our routers and, possibly, printers.