Authentication means presenting (to your computer system) the
credentials that prove your identity, such as a password, a fingerprint,
or a smart card.
With the advent of PAM, SASL and GSSAPI, authentication has become somewhat
more manageable on UNIX/Linux systems, but we still have a long way to go
before we achieve the "seamless integration" desired by our users, to
borrow a phrase from Microsoft Windows, where such integration
has been a design goal from the beginning. However, we want to co-opt what
Windows does well, and not to be infested by what Windows does badly.
The paradigm that I want to discuss is where the user authenticates only once,
and all the servers in the system believe in that one authentication.
I call this transitive authentication, because trust in the identity
crosses over from the initial authentication to subsequent service activities.
Other authors refer to it as single sign-on.
I'm going to use the client-server metaphor: a server is
designed to provide some service; a client wants the service, to
which it is entitled; and an enemy also
wants the service but is not entitled to it.
Authorization is the process by which the server is made aware which clients it should provide the service to. This implies that the clients each need to have an identity, and the server needs an authorization list, or some substitute, of identities that it should (or should not) serve. The process by which the server's owner decides which clients to authorize is important but is (mostly) beyond the scope of this document: our interest begins at the point where the identity appears on the authorization list.
Authentication is the process by which the client asserts an identity and convinces the server that it is the referent of this identity. Enemies will try to steal the identity, that is, to lie to the server that the enemy is the client which is the referent of the identity, whereupon the service will be provided to the enemy.
It is fairly common for one client to act as a proxy for another; for example the boss has his secretary read his e-mail, and so the mail server needs to know that the secretary is authorized to receive, as a proxy, the services which normally would be given only to the boss.
It is also fairly common for one user to have multiple roles with differing security and reporting requirements. For example it is not a big problem if an enemy gets to read a system administrator's routine mail about office parties and the like, but if the same administrator authorizes services and establishes identities, and if the enemy were to perform those activities by stealing the administrator's identity, the organization would be in big trouble.
Authentication is particularly a problem when the client and server are on different network connected computers. In this case there is also the issue of privacy: enemies can steal information off the network, and authentication data as well as payload data is vulnerable. However, privacy is not an authentication issue and will not be discussed here except insofar as authentication is an intrinsic part of establishing a secure connection.
The phrase transitive authentication means that the client authenticates once, and when he requests subsequent services the servers are aware of and believe in the prior authentication. Generally the initial authentication takes work; at the very least it requires typing a password, presenting biometric data, or inserting a physical token. Users greatly resist authentication if it's frequent, and several services don't work at all unless the user can authenticate to them transitively.
Multi-stage transitive authentication means that the client transitively authenticates to one service, which then obtains different services using the client's credentials. This behavior is an advantage when the service is complex and has multiple parts, but if the client does not trust the server very much, it may be prudent for him to sabotage multi-stage transitive authentication.
Strong authentication cannot be faked by an enemy unless he
cheats, whereas weak authentication can be broken by an enemy using data that
can be stolen in the normal course of business
and using a reasonable
amount of computer resources. For example the traditional UNIX hashed password
was strong when UNIX was first created, but is weak today, because the password
file has to be publicly readable, and it takes only 15 seconds for a modern
processor to crack the low-quality passwords used by typical clueless users.
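To put a number on that claim, here is a back-of-the-envelope estimate in Python; the guess rate is an assumption standing in for whatever hardware the enemy brings, not a measured figure.

    # Rough crack-time estimate for an 8-character all-lowercase password.
    keyspace = 26 ** 8              # about 2.1e11 candidate passwords
    guesses_per_second = 1e10       # assumed rate; depends on hash and hardware
    worst = keyspace / guesses_per_second
    print("worst case %.0f s, average %.0f s" % (worst, worst / 2))

The answer, on the order of ten or twenty seconds, is consistent with the 15-second figure above.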
An example of cheating
that happens all too frequently would be
stealing a secret encryption key protected by UNIX file permissions, which
requires a root exploit. I use phrases like "the server knows for sure"
or "authentication cannot be faked"
in the sense that it takes a root
exploit to trick the server and fake the authentication. On several occasions
important Linux servers have been the subject of root exploits, causing
major upsets and remediation efforts by the global Linux community. Although
a well-maintained Linux machine is a lot more secure than a laptop that is
run out of the box (not under Linux) and never updated, the administrator of
a site that has something to lose should take with a grain of salt the
statements below that certain mechanisms are invulnerable, and should not take
for granted the security of his operating system. There is also the
possibility that an authorized administrator goes over to the dark side
and voluntarily lets enemies onto his system.
In a number of services the activity consists of a sequence of transactions
of the same general type; e.g. successive mail messages are retrieved, or
various database queries are performed. Presently client software often uses
a poor substitute for transitive authentication: the client does
service-specific authentication once (e.g. gives an identity and a password),
and the software saves the information and sends it on every transaction. The
disadvantages here are, first, that the service has to do the full work of
authenticating the client every time, whereas true transitive authentication
could take less work. But worse, the client user needs to authenticate
separately to each service, every time he starts its client software. On
Microsoft Windows, every such program has a feature to "remember" the
client's identity and password as an easily stolen registry key, and the users
think Windows is far superior to UNIX because of its convenience and
"seamless integration". True transitive authentication would consign such
monstrosities to the garbage heap where they belong.
Each of the credential types has its own issues and problems.
Traditionally passwords have been up to 8 bytes long. By today's standards 64 bits of entropy is about the minimum that can be considered marginally strong. This would be 11 truly random printable bytes including punctuation, or 22 bytes of running English text (at 3 bits per byte). 28 bytes is about the maximum that a skilled user can type reliably error-free every time; this would be running text, and the limit for truly random bytes must be less. In the context of credit or debit cards, some banks can only handle four decimal digits in a password, 13 bits. Thus in general passwords as credentials are not all that strong even for a conscientious user, and can be very weak if the user or the authentication system cuts corners.
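The arithmetic behind these figures is easy to check; here is a small Python computation (the bits-per-symbol values are the estimates used above, not exact measurements):

    import math

    random_printable = math.log2(95)   # ~6.6 bits per printable ASCII character
    running_english  = 3.0             # rough estimate for natural-language text
    decimal_digit    = math.log2(10)   # ~3.3 bits per digit

    print(64 / random_printable)       # ~9.7  -> ten or eleven random characters
    print(64 / running_english)        # ~21.3 -> about 22 bytes of English text
    print(4 * decimal_digit)           # ~13.3 -> a 4-digit PIN is about 13 bits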
Another problem is that users generally worry that they will forget the password, or have poor memorization skills and know that they will not remember it when needed, and so they write it down in a way that is easy to steal (e.g. in their wallet) or trivial to steal (e.g. on a sticky note on the monitor border). This is a particular problem for passwords that are used rarely. If one good password is used every day and transitive authentication gets the client into the rarely used service, the result is more secure than a separate password stuck to the user's monitor. Backup copies of a special password should be in an encrypted file (encrypted with a password unlikely to be forgotten, i.e. the one used for transitive authentication), in a safe, or both.
The major advantage of a password is that it costs nothing to create it (you get what you pay for) and nothing beyond machine time to collect and verify it. Also users are familiar with passwords and it's easy to educate them about passwords. Getting them to do what you tell them is another story. Generally it's easier to get your users to create, memorize and type in one good password once a day, than four or ten of them (needing to give the correct one several times an hour), which they will cut corners on, e.g. appending a different letter to each one, or just using a single (poor) password and not telling you; you wouldn't know because of salted hashes.
Biometrics is seen as a panacea for authentication problems, but of course it isn't. Commonly attempted biometric data includes fingerprints, retina scans, voice recognition, and face recognition. Fingerprints are the most common, having relatively inexpensive readers (US$50 to $200) that provide reasonably informative data. Hard data is not available on how often fingerprints are similar, but it is generally believed that false matches are rare. Retina scans are probably equally reliable, but again, hard data is not widely available. Voice and face recognition are difficult to get right.
Biometric credentials of all kinds have a number of problems:
- The reader software always matches the incoming image against a set of standard images, one per known user. We would prefer that it put out a normalized datum that is supposed to be the same every time the same user is seen, as a password would be, because some authentication schemes require such a datum to use as an encryption key.
- The user's body is not static. For example, a cut finger may invalidate a fingerprint and a stuffed-up nose would invalidate a voiceprint. The authentication system must be able, without losing security, to replace the user's standard image on short notice without access to the old authentication token, and for some uses, e.g. medical, it is particularly important to provide service reliably to an injured or sick user.
- The standard image has to be persistent on the authentication server. Many setups place the "authentication server" in the actual reader or on the computer it's attached to -- when a match is seen, a secondary authentication token (a password) is released for network authentication. This means that both the standard image and the secondary token are very exposed to theft, and the client's image can potentially be replaced by the enemy's image. If the images are encrypted with the secret key of something, such as the administrator account, theft and fraudulent replacement are harder, but you risk locking yourself out of your machine, requiring an authorized root exploit to break in, just as the enemy would do.
- Everywhere you go, you leave fingerprints, voices, and facial images. Only retina images are not publicly available. The enemy can then construct, for example, an artificial finger, and trials of this technique show that it can be accepted by reader software. Revoking a compromised finger is hard: you can't cut it off and grow a new one, as insects and salamanders can.
Biometric credentials have their place, for example when a machine wakes from software suspend and you want unobtrusive assurance that the owner and not a casual thief is waking it up, but you cannot make them a non-bypassable element of authentication, nor rely on them for high security.
The smart card is used for customer authentication in every cell phone (the SIM), is making inroads in the credit card industry, and is used by some companies for authenticating users on their computers. It acts as a key agent, holding a secret key, generally an RSA key. When a server doing authentication sends a message, client software passes it to the smart card, which encrypts or decrypts it. Smart cards have a number of security issues:
The conclusion about primary authentication mechanisms is that none of them matches the strength achievable in inter-computer communication, and all have known weaknesses and exploits against them. For the highest security, two- or three-factor authentication is used: a credential from more than one of these categories.
A seriously security-conscious client will put as many obstructions as possible in the way of an enemy trying to steal his identity. In particular, he will avoid transitive authentication, so if the enemy steals one identity it gives access to only one service but the others remain under the client's control. This means memorizing many passwords, using a different finger in each fingerprint reader, or carrying around a big stack of smart cards. Few users are this conscientious, nor are the services as valuable as one thinks, so the choice of convenience over security may be justified.
The original model of UNIX systems was that certain clients (human users)
have accounts on a specific host, and upon "logging in", that
is, upon authenticating, the client may receive all the services of which that
one host is capable. Thus, most daemons (individual service
provider programs) need not consider authentication and authorization at all:
requests can only come from within the host, and therefore can only come
from authenticated users. Only the login service needs to handle
authentication and authorization, which are commingled: if your authentication
data (encrypted password) is in the password file, you are authorized to be
on the host.
Soon administrators needed to discriminate among users: should students be allowed to print on the professors' printer? Should any user be able to read someone else's mail? Various ad-hoc schemes were developed so services could be authorized for only subsets of the users.
As tasks became more complex we had the spectacle of daemons requesting services from other daemons: for example, the cron daemon nightly runs a job that uses the database daemon to retrieve some data, and then uses the print daemon to produce hardcopy, all without human intervention.
Then network computing reared its fanged head: "the network is
the computer" was a marketing slogan of Sun Microsystems. One or a few
particular hosts could provide certain services, to which any host on the local
net could get access; for example, a single machine might have a lineprinter
that was shared by the whole department, or a user's home directory might
reside on one host and be accessed across the net from any other host. A large
can of worms was then opened:
- Identities and authorization information must be coordinated among all hosts on the local net. This is generally handled by replacing the per-host password file and related files with a distributed database such as Sun's NIS, or LDAP.
- If the enemy gained root access to a host, e.g. by hacking, or by connecting his own host to the net, he could "authenticate" on it as any user he pleased. The other hosts should no longer just believe the client host's operating system when it alleges that a request comes from a particular identity. In addition, there is now a concept of authenticating hosts: these belong to our group, and all others are enemies; or at least, guests with their personal rogue laptops are entitled to lesser services than our own people.
- Requests from legitimate users could arrive over the Internet from anywhere in the world, including hosts infested with viruses and keystroke loggers, mixed in with the evil sendings of millions of enemy zombie bots. Some organizations need to provide a certain degree of service even in the face of such dangers.
Since that time the Internet has exploded, and services have appeared that were undreamed of at the Beginning of Time (1970-01-01 00:00:00 +0000), and many of them have serious authentication issues which need to be discussed.
So here's a list of services, far from complete. I have not distinguished between secure and unencrypted versions of the same service, or multiple variants that provide the same general service, or remote versus local execution.
This is the ancient way of using the computer, giving commands and receiving responses over a serial channel.
Many companies rent computing services to tenants who access them over the net, never being physically present with the hardware. Services include execution, file storage, and Internet connectivity to third party customers.
Only authorized users are allowed in. At the start of the session the user must authenticate. It is common for the user to log in to a home host and then to use (or want to use) transitive authentication to get into others -- particularly hosts in a foreign realm. In the extreme case of bulk computing services all interactions go over the global internet and originate on the client's own host, on which the server cannot enforce security policies.
A variety of programs interact with the user via a Graphical User Interface (GUI).
Authentication issues are identical to the shell prompt case, but numerous programs will originate connections to the graphics server, and enemies are not welcome because they display annoying advertising, or worse, they can get a copy of every keystroke entered including secret information.
Within limits imposed by the site, the user can change his own password and various other account information such as his full name, shell, and GUI type.
In traditional UNIX there is no transitive authentication usable for password changing, and so the old password must be given.
The user's personal files reside on the file server. There are also shared files, particularly software shared (readonly) by all the users.
Filesystem access absolutely requires transitive authentication: imagine having to type a password every time a file is opened. Here's an overview of transitive authentication in traditional UNIX:
- Each file has one owner (a client user) and one group (a set of such users); when created, or later, it is configured to be readable, writable or executable by the general public, by group members, or by its owner, in any combination. Modern systems have POSIX ACLs which allow multiple users or groups to be given permission.
- UNIX systems run multiple processes simultaneously, each of which is owned by one client user.
- File operations are performed by processes, and the process owner's identity governs whether the proposed operation is allowed (see the sketch after this list).
- When a client logs in (authenticates) his process group leader is re-owned to his identity, which is inherited by the spawned processes that actually do the work. This is how the original authentication is propagated transitively for filesystem access.
- Many services receive their requests over UNIX domain sockets which are a variety of file. The service can then be restricted to one user or a set of them by UNIX file permissions -- provided the client and server are on the same host. UNIX domain sockets do not work across the net; network sockets have no access control by file permissions.
- There is quite a variety of network filesystems, with varying quality of transitive authentication. Sun's NFS version 2 relies on the client host's operating system to honestly report the client's identity, and relies on hostbased authentication to restrict service to hosts within the organization (or rogues which have stolen an authorized IP address). AFS uses Kerberos v4 tickets for transitive authentication, which is much more effective (though obsolete). NFS version 4 can also be configured to recognize or require Kerberos v5 tickets.
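Here is the sketch promised above: the heart of the per-process access decision, simplified (no ACLs, no root override, and only the read bit checked), in Python:

    import os, stat

    def may_read(path, uid, gids):
        """Simplified UNIX read-permission check for one process identity."""
        st = os.stat(path)
        if uid == st.st_uid:                     # owner class takes precedence
            return bool(st.st_mode & stat.S_IRUSR)
        if st.st_gid in gids:                    # then the group class
            return bool(st.st_mode & stat.S_IRGRP)
        return bool(st.st_mode & stat.S_IROTH)   # finally, the general public

    # Example: can the current process read /etc/shadow? (Normally not.)
    print(may_read("/etc/shadow", os.getuid(), set(os.getgroups())))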
There is a specialized daemon that can store and retrieve structured information, which in the interesting case is shared by a whole department or company. Generally it is important to keep this information away from unauthorized eyeballs. Modern database engines have an elaborate system of identities and authorizations which is not necessarily coordinated with the host operating system.
The database is the most underused of services, and in my opinion a major reason is the difficulty of authentication. With each database query it's necessary to include an identity and a password, and any automated script needs to have this information hardcoded or hidden in a separate file and exposed to theft. Further, maintaining database authorizations is a nightmare for the organization and mistakes lead to embarrassing information leakage. The database cries out for identities coordinated with the host system and for transitive authentication of those clients.
In addition to shared printers, a site may have shared facilities for creating (burning) discs. Fax transmission is often implemented as a variant of printing.
In all of these services expensive media is consumed, and organizations with privacy or secrecy concerns know that the printed (etc.) information can easily escape from their control. Therefore they generally want to restrict and account for printing. Nonetheless, typical UNIX systems use the "honor system" for print requests: the standard print software fills in the client's identity with the request, and clients generally refrain from using hacked software that puts in some false identity. On the other hand, Microsoft Windows has (optionally) true transitive or password authentication for printers, and there is a Kerberos hack for CUPS on UNIX which is said to interoperate with Windows, though it is not mainstream.
The system holds mail and delivers it to the user for reading.
In traditional UNIX the mail is stored in and read out of a file protected by UNIX file permissions. The problem is locking the file, so mail delivery and deletion of old messages do not happen at the same time, trashing the mailbox. Because of locking issues as well as less-than-optimal authentication, it is not practical for mail readers to modify this file across the net, and so several networked mail servers have been developed; there is also web mail, in which the organization's web server acts as a proxy, formatting the mail as a web page and sending it out to the client's web browser. For all of these services it is bad form if the client has to enter a password for each message. In the normal case mail is served to only one client, the identity to which the mail was sent.
The system accepts outgoing mail from the user. The software and authentication issues are different from the reading side.
A traditional UNIX mail transfer agent (MTA) was willing to forward mail for anyone to anywhere, but with the invention of spam (unsolicited commercial e-mail) mail transfer agents need to restrict service. A typical rule is that an authenticated organization member may send mail to anywhere, whereas non-authenticated clients (or authenticated outsiders) may send to organization members but may not relay spam through the MTA. This means that client software needs to be able to authenticate to the MTA, and this authentication needs to persist over multiple outgoing messages in the session.
E-mail was one of the first internet
applications, and in those days neither authentication nor privacy was either
practical or valued. Today is different. E-mail is handled on networks and by
intermediate servers on which content scanners (Carnivore) are known to
be operating, and spam (unsolicited commercial e-mail) is normally sent with a
forged sender address. Thus the sender and recipient would like to know
authoritatively that only the intended recipient could read the mail and that
the listed sender actually did send it. Some security-conscious organizations
have a policy that all work-related mail leaving the company must be at least
signed, and encrypted if the recipient can handle it.
Several mail client programs offer automated digital signatures and encryption on mail. The sender and recipient need to effectively "authenticate" to themselves to use the feature, and transitive authentication makes the user experience much more pleasant.
The system presents personal information such as a contact list in a form that can usefully be coordinated with the mail system and other communications media. Calendar information is partly in this category and partly in the next.
For the most part it's enough to store personal information in a file protected by UNIX file permissions. The Horde is a web-based personal information manager, webmail server, etc. Its authentication issues are the same as for any other modifiable web pages, with the exception of the calendar component.
While normal people treat their calendar of activities as personal information, in the corporate setting it is common for a person's calendar to be visible publicly and for other people to be able to make demands on the user's time through the calendar -- "making a meeting".
Clearly the person making the meeting needs to be authoritatively authenticated to the system and must be authorized to make demands on his victim.
This means computer-aided realtime interactive communication. There are a wide variety of chat services, from purely textbased to fullbore video conferences. Of particular interest, voice chat (VOIP) can be bridged to the Public Switched Telephone Network (PSTN).
The original chat service is IRC (Internet Relay Chat). It has purely honor system authentication; it was developed in 1988 when authentication was barely relevant. Jabber/XMPP is much newer and has real authentication, which can be bridged to the host operating system or can be administered separately. VOIP services generally require authentication. When a PSTN (land phone) bridge is involved it's particularly important that the computer-side client be authenticated because an outgoing call is expensive, and an incoming call is intended for a specific person.
Users can view a vast variety of information on the web, of varying degrees of authoritativeness. The content is not only text and images, but audio, video, and software. Most web content is for public consumption. But of interest here is content restricted to particular clients. And in most cases the restricted content can be modified by the authorized client. Examples include:
Of all services protected by authentication, customers' bank accounts (and other financial services such as eBay, and stock brokerages) are one of the two which are most productive for working criminals. Users generally choose weak passwords, and more advanced authentication such as biometric or smart cards is unheard of. Users frequently fall for the blandishment of their non-UNIX web browsers to remember their passwords. Better authentication for restricted web content would really help the clients -- and the servers, who legally may be left holding the bag.
Some webservers can provide a service for which they expect to be paid, such as commercial software, music files, or physical goods to be delivered by other means.
This is the other authenticated service that is regularly tapped into by working criminals. To make a credit card purchase from a webserver the client (or enemy) needs only to provide the information from the card (account number, client's name, expiration date, and special security code) plus the billing address. If the enemy has physical possession of the card, e.g. if it was stolen, or if a copy was made while the card was in a physical merchant's possession, the billing address is easy to find, normally being the holder's residence address, which is public record. Authentication beyond the card information would greatly reduce crime.
Both wireless and wired networks can be set up so a client host needs to authenticate before it can send packets to useful destinations. The latest security protocol, WPA, is governed by IEEE 802.11i. Authentication on Ethernet and friends is governed by IEEE 802.1x, which 802.11i uses in one of its modes.
An organization requires wireless authentication for two reasons: so clients not part of the organization, or who have not paid, cannot use the service, and so enemies cannot connect to the network and steal traffic. On wireless nets, however, security (WEP) has turned out to be less effective than promised, and the management of pre-shared keys and/or WPA authentication credentials is more difficult than expected, particularly when the clients are ad-hoc, numerous and temporary, as at a public wireless hotspot, e.g. an airport. Therefore such wireless nets generally do not use authentication or other security on the air interface; rather, if they restrict access at all, their egress router is set to reject all packets except those from specific MAC addresses, and they have a web server that handles authentication or payment, and then adds the client's MAC address to the router's ACL. It's recommended that clients on any network see to their privacy themselves, assuming that enemies could subvert any of the numerous devices that handle the data packets on their way to and from the server.
Trust agents are a unique case in authentication. To
purchase trust, the client authenticates to the trust agent, generally only
once during its digital lifetime, specifying its own Distinguished Name and/or
the hostname of its server, and its RSA public key. If satisfied, the trust
agent then provides a computer-readable certificate, invariably an X.509
certificate, signed by the agent, which states that the principal (client)
named therein has the right to use the included name. The client's transaction
partners are then supposed to trust the agent and believe in this certificate,
thereby trusting that a connection is coming from the partner named in the
certificate and not from an enemy committing fraud. (The procedure for
trusting the certificate is digital and is automated; see "Prime Pairs"
below.) An organization can act as its own trust agent, and will trust itself,
but outside servers and clients generally will not trust it and will reject
certificates that it signs.
Generally the trust agent handles authentication by physical mail. The client sends in copies of various non-computerized documents such as a birth or citizenship certificate or a state charter for a corporation, plus DNS records if a host is being certified.
This is the original UNIX method. The client types his identity followed by the corresponding plaintext password, either in response to prompts on a serial line, or in boxes on a GUI form. (If the client is authenticating to a networked server, the plaintext password can be stolen off the wire unless an encrypted connection is used.) The server does a one-way transformation or hash of the password, and if it matches the hashed password on file, the client gets on. The hash is used because the UNIX password file has to be publicly readable, but the fastest computers of the day could not break the hash function and recover the password. Today, however, a weak password (8 random lower case letters) can be cracked in 15 seconds and a dictionary word can be found instantly. Thus organizations are trying to move to improved authentication methods, by moving the hashes into a separate nonpublic file, by requiring stronger passwords, and by not using hashes at all.
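A minimal sketch of the verify step, using Python's crypt module (POSIX systems only, and removed from very recent Python versions); the password shown is a made-up example:

    import crypt, hmac

    def check_password(candidate, stored_hash):
        # crypt() reuses the salt and algorithm prefix from the stored hash.
        return hmac.compare_digest(crypt.crypt(candidate, stored_hash), stored_hash)

    stored = crypt.crypt("hunter2", crypt.mksalt(crypt.METHOD_SHA512))
    print(check_password("hunter2", stored))   # True
    print(check_password("letmein", stored))   # False

The point of the one-way hash is that this comparison is cheap while recovering the password from the stored string is supposed to be expensive; as noted above, that assumption no longer holds for weak passwords.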
Some services keep the client's password in a file or database without any hashing or encryption. This is not secure and is only tolerable when there is no major consequence if an enemy steals all the users' passwords. An important goal for an authentication method is to put as many barriers as possible in the way of a hacker who breaks into the server.
In this family of authentication mechanisms the client has a collection of passwords each of which is used only once. Thus even if the enemy has a keystroke logger on the client user's host, or steals the password off the net (assuming no encryption), the stolen password does him no good since it cannot be re-used. One method involves printing out a sheet of passwords in a secure setting. In another, a small computer generates a deterministic sequence of pseudorandom numbers at a known frequency such as once every 30 seconds, and the authentication server is also able to compute this sequence as a function of time so as to decide if the password is valid.
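The time-based scheme survives today as TOTP (RFC 6238); here is a minimal sketch showing how the token and the authentication server can independently compute the same short-lived password from a shared secret:

    import hmac, hashlib, struct, time

    def totp(secret, now=None, step=30, digits=6):
        """One-time password: HMAC over the current 30-second time slot."""
        counter = int((time.time() if now is None else now) // step)
        mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
        offset = mac[-1] & 0x0F                 # dynamic truncation, per RFC 4226
        code = int.from_bytes(mac[offset:offset + 4], "big") & 0x7FFFFFFF
        return str(code % 10 ** digits).zfill(digits)

    shared = b"example shared secret"           # provisioned once, securely
    print(totp(shared))                         # both sides compute this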
This is a variant of the plaintext password technique, frequently used on wireless networks, in which all the clients share the same password. The shared key is used to encrypt the network traffic, and hence authentication and privacy are identical. However, traffic is not private from other clients holding the key. It is very simple to authorize a client: just tell him the shared key. One disadvantage is that you can only de-authorize a client (or enemy) by getting all the other clients to change their keys. A worse disadvantage is that the more people know the secret, the more likely that an enemy can steal it.
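In the wireless case this is literally how WPA-Personal turns the shared passphrase into the 256-bit pre-shared key, with the SSID acting as the salt (the passphrase and SSID here are made up):

    import hashlib

    # WPA/WPA2-Personal: PSK = PBKDF2-HMAC-SHA1(passphrase, SSID, 4096 rounds, 32 bytes)
    psk = hashlib.pbkdf2_hmac("sha1", b"correct horse battery", b"ExampleSSID", 4096, 32)
    print(psk.hex())    # every client configured with the passphrase derives this key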
Other services such as IPSec (a VPN or Virtual Private Network for encrypting traffic to a server-gateway) can be configured with more elaborate and stronger shared keys that are unique for each client. Again, the key functions both for authentication and for privacy. However, it takes work to create the key and to transport it securely to the partner. Thus shared keys are rarely used any more with IPSec; RSA keys are used instead. Except, a server may post publicly a shared key that anyone can use to establish a tunnel; the session initiation protocol randomizes the session key so each datastream is private from the others.
This authentication service is mature and is well-liked by those who use it -- specifically Microsoft Windows. Basically the user's password (after hashing) becomes the key for authenticated and encrypted communication with the Authentication Service. Again, authentication and encryption are tied together: the client announces its identity; the Authentication Service looks up its saved copy of that identity's key; and it sends back an encrypted message which that client ought to be able to decrypt. The payload, a Kerberos Ticket Granting Ticket, allows the client to transitively authenticate to any Kerberos-capable service that talks to the same server. Kerberos has several other very desirable features:
- It resists "replays", in which an enemy steals data off the network and retransmits it to authenticate to the same server at a later time.
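A toy sketch of the initial exchange described above, with the Python cryptography package's Fernet standing in for Kerberos' real wire formats and string-to-key algorithm; the principal, realm and password are made-up examples:

    import base64, hashlib
    from cryptography.fernet import Fernet

    def password_key(password, salt):
        # Derive a symmetric "user key" from the password; the server stores
        # this key, never the password itself.
        raw = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
        return base64.urlsafe_b64encode(raw)

    salt = b"MATH.UCLA.EDU" + b"jimc"            # Kerberos salts with realm+principal
    saved_key = password_key("hunter2", salt)    # held by the Authentication Service

    session_key = Fernet.generate_key()          # minted per login
    reply = Fernet(saved_key).encrypt(session_key)   # only the real user can read it

    client_key = password_key("hunter2", salt)   # client re-derives from the password
    assert Fernet(client_key).decrypt(reply) == session_key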
GSSAPI (Generic Security Services Application Program Interface) is a framework for secure authentication between the client and the server. While there are several mechanisms available for GSSAPI, the only one in common use is Kerberos. Thus GSSAPI is often used synonymously with Kerberos, although that usage is not correct.
Rivest, Shamir and Adleman (RSA) developed an algorithm in which a pair of prime numbers, plus some other parameters, functions as a private (secret) key for encrypting or decrypting messages, and their product acts as a key for the inverse transformation. This product is posted publicly and is given to all relevant partners; thus it is called the public key. If an enemy factors the public key he can regenerate the secret key. The procedure to do this is well-known and simple, if you're willing to wait long enough, but the time or resources required are outrageous for commonly used key lengths. Thus RSA-PKI is strong when used for encryption and for authentication.
It is assumed that only the actual client has access to the secret key. If an enemy gets the key, the enemy can impersonate the client.
Only the client can encrypt with the secret key; hence if the server is given a message which it can decrypt with the public key, it knows authoritatively that the message came from the client and not some random enemy. Conversely, the server can encrypt a message with the public key and be certain that only the client will be able to decrypt it.
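A toy numeric example of this inverse-key relationship, with textbook-sized primes (real keys use primes hundreds of digits long, plus padding that this sketch omits):

    p, q = 61, 53                  # the secret prime pair
    n = p * q                      # 3233, the public modulus
    e = 17                         # public exponent
    d = 2753                       # secret exponent: (e * d) % ((p-1)*(q-1)) == 1

    message = 65
    signature = pow(message, d, n)           # only the secret-key holder can do this
    assert pow(signature, e, n) == message   # anyone with the public key can check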
RSA keys are an integral part of X.509 certificates. The client first generates a secret key, and multiplies the factors producing the matching public key. He then appends to it his Distinguished Name: basically, his personal name or hostname, the name of his organization, and his locality. This information is packed up in a Certificate Signing Request and sent, with authentication documents, to the trust agent (Certificate Authority). The trust agent, if satisfied with the documentation and the fee, "forges" the certificate by appending its own Distinguished Name and key identifier, and then making a hash over everything, encrypting it with the trust agent's secret key, and appending that. Any server trusting that agent will have installed its root certificate (public key), and can decrypt the hash, and can compute for itself what the hash should have been: the two should be equal. This proves that the certificate was forged by the trust agent and not an enemy, and that it has not been fraudulently altered afterward.
For authentication, an RSA key is used like this: The client sends in an X.509 certificate. The server checks its signature, and thus trusts that the key goes with the Distinguished Name in the certificate. The server sends a string of random data. The client encrypts this string using its secret key, and sends it back. The server decrypts it with the public key in the certificate; the result should be what was sent over. Now the server knows authoritatively that the client named in the certificate is at the other end of the connection. The server can then provide the service, if the client is authorized. This procedure is used with SSL/TLS (Secure Socket Layer or Transport Layer Security), which is used by webservers and many other services.
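In code, the challenge step looks roughly like this sketch using the Python cryptography package; signing plays the role of the encrypt-with-the-secret-key operation described above, and the key is generated on the spot only to keep the example self-contained:

    import os
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import rsa, padding

    secret_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_key = secret_key.public_key()     # what the certificate carries

    challenge = os.urandom(32)               # server: a string of random data

    pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                      salt_length=padding.PSS.MAX_LENGTH)
    signature = secret_key.sign(challenge, pss, hashes.SHA256())   # client side

    # Server side: verify with the public key; raises an exception if forged.
    public_key.verify(signature, challenge, pss, hashes.SHA256())
    print("client authenticated")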
An important variant is used by SSH (Secure Shell): there is no X.509 certificate, but the server already has a copy of the public key, matched up with the client's identity. (It's stored in the client's home directory. SSH allows several public keys, in case one user will act as a proxy for another.) The client announces its identity first plus a key identifier, and the server retrieves the public key. From there the procedure is the same.
In systems that use cryptographic keys for authentication, generally a user's secret key is encrypted with a password, or is otherwise protected. For transitive authentication to work, the key must be available throughout the session. There are several ways to satisfy this requirement:
SSH and GPG include programs called a key agent. If the user sets it up, when the session begins the user provides his password to the agent and it decrypts and saves in memory the secret key. (Memory is protected by UNIX file permissions; only by a root exploit can an enemy steal the key out of memory.) It is aware when the session ends, whereupon it shreds the key and exits. It can also be configured to forget the key after a time limit, after which the user must give the password again. There is a UNIX domain socket opened by the agent, with UNIX file permissions limiting its use to the owner. Client software can pass messages to the agent (from servers wanting transitive authentication) and it will encrypt or decrypt them. In the absence of the agent the client software will ask for a password every time it is used, to separately decrypt and use the secret key.
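A skeletal sketch of such an agent (hypothetical socket path, and a stub standing in for the real key operations); the security-relevant detail is that the UNIX domain socket is created with owner-only permissions:

    import os, socket

    SOCK = "/tmp/demo-agent.sock"    # hypothetical; real agents use a per-session dir
    secret_key = b"decrypted key material, held only in memory"   # stub

    if os.path.exists(SOCK):
        os.unlink(SOCK)
    os.umask(0o177)                  # socket will be created mode 0600: owner only
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(SOCK)
    srv.listen(1)

    while True:
        conn, _ = srv.accept()       # file permissions limit who can get here
        challenge = conn.recv(4096)
        # A real agent would sign or decrypt the challenge with secret_key here.
        conn.sendall(b"signed:" + challenge)
        conn.close()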
PAM includes a module which starts the SSH agent and uses the login password to decrypt the key, assuming that the same password is used to log in and to encrypt the key. PAM can be configured so this module does actual authentication of the user, that is, if the key can be decrypted then the user gets on, but if it can't then authentication fails and the session cannot begin. In this mode, every user must have a SSH key.
Kerberos does not use a key agent, but it does save the Ticket Granting Ticket in a file protected by UNIX file permissions. Client software can use this ticket to transitively authenticate to servers.
Smart cards act as key agents; in fact the best such cards can create a RSA secret key and never reveal it (unless the enemy physically disassembles and destroys the smart card). The card may or may not be programmed to require a password before it will encrypt or decrypt using the secret key.
Now let's again go over the list of services and see what needs to be done to make them work with transitive authentication.
Kerberos and SSH transitive authentication are both well supported. There are PAM modules to set up and dispose of the Kerberos and SSH credentials (and key agent). SSH can be configured (per session) to propagate either or both credentials to a session on a remote host; the Kerberos Ticket Granting Ticket ends up in a file on the remote host, while a connection to the local key agent is tunneled through the (encrypted) SSH connection, so client software on the remote host can connect to the remote socket but the authentication messages end up going through the local key agent. SSH can make a tunnel for arbitrary ports, specifically the graphics subsystem data feed, preventing enemies from eavesdropping on keystrokes and images.
Kerberos includes a set of servers and clients for protocols requiring authentication such as FTP and Telnet, and the Kerberos credential can be propagated to a remote session; however, these protocols are presently somewhat obsolete, and SSH is preferred for these functions.
PAM gives some assistance in password changing. Specifically, the PAM module for Kerberos includes a password-changing feature that works. However, there is no corresponding module for SSH -- and there should be.
When a user gets a shell session on a remote host which is within the same realm (department, etc.), network file storage is generally set up so the remote host can access the same files as the local host. In other words, transitive authentication works within the realm. But using Sun's NFS version 2 or 3, the file server needs to trust the client identities told to it by the client host (remote or local).
Hosts outside the realm, including personal laptops on the local net, cannot be trusted. There are several network filesystems that can function without hostbased trust. AFS is one; sites that use it like it a lot, but it has issues which make it hard to deploy at new sites. NFSv4 includes GSSAPI (Kerberos) transitive authentication, which solves the problem.
Many database engines, with their own clients or with special-purpose applications calling the vendor's API, can accept transitive authentication with Kerberos. These include PostgreSQL (broken in 2002, allegedly working in v6.4), Oracle (version 9, introduced in 8.x for some x), and Microsoft SQL Server. MySQL does not do Kerberos and appears to have no plans to do so. SQLite depends on UNIX file permissions; the way it is used, Kerberos or similar authentication is irrelevant.
Oracle's client can also do transitive authentication via an X.509 certificate, or several proprietary variants.
Database middleware, on the other hand, does not do Kerberos. This includes unixODBC and Trifox Vortex, middleware used at UCLA-Mathnet.
For MySQL people use a kludge: database access is granted to generic or role accounts, the database password hides in a file, and the clients are authorized to read this file as a proxy for being authorized to use the database. This is what UCLA-Mathnet does too, with Microsoft SQL Server via middleware.
CUPS allows each printer to be configured with
an access control list (ACL) listing the user identities and UNIX groups that
are allowed to print. The client software provides the identity on an
honor-system basis, so "authentication" serves mainly to keep clueless users
off printers where they're not wanted, but is unable to repel a determined
enemy.
Since the client's system identities are used, CUPS authentication is useful only when the client is in the same realm as the print server. CUPS can be configured to require HTTP Basic Authentication (loginID and password) on every transaction, but that defeats the purpose of transitive authentication.
Printing needs improved authentication, but only a few sysops see high value coming from this improvement, so there are only a few patches, including Kerberos for CUPS. But none are headed for the mainstream.
When the mail reader is on the mail storage host, the user's system mailbox is protected by UNIX file permissions. However, most users are on separate workstations or on their home machines, and mail is served over the network. (Similarly for webmail; the webserver does not have user home directories nor mailboxes on it.)
At Mathnet we prefer the IMAP protocol for mail delivery, over its secure port; it can also optionally do a TLS upgrade on the insecure port. Secure POP is also used by some (clueless) users. IMAP is designed so the mail client should open a connection and keep it open, whereas with POP the client can only poll the server periodically. POP clients invariably use the kludge of remembering the user's loginID and password for each poll attempt. Some IMAP clients such as Pine hold the connection open for the whole session, as they should, but others like Osso Mail on the Nokia 770 do the polling thing just like for POP.
There is a patch to the server (both IMAP and POP) that adds GSSAPI (Kerberos) capability. We need to get this installed and working. But only some clients know about GSSAPI. Pine is one. We need to find out which of our preferred clients, such as Thunderbird, can do GSSAPI or can be induced to do so with patches.
Mathnet uses Postfix on its SMTP gateways. It will do a TLS upgrade on any of its incoming ports: optional on 25 and mandatory on 587. The client can then do SASL-type authentication, including GSSAPI, or the client's X.509 certificate from the TLS negotiation can prove the client's identity. (In our configuration the X.509 certificate is optional.) The X.509 certificate is used when sending from a Linux box with its own Postfix daemon. But from Windows, the clients seem to know only about Plain authentication, i.e. a loginID and password for each message. It is believed that Pine can do GSSAPI when sending mail, but I have never actually tried it.
The whole issue of authentication to Postfix needs more research.
The standard for mail privacy and authentication is GPG (Gnu Privacy Guard), a modern version of PGP (Pretty Good Privacy) by Phil Zimmermann. It includes a key agent. The mail is filtered through the gpg program (presumably spawned by the mail reader), which decrypts it and checks the signature, or on sending signs and encrypts the message. There is also support for a detached signature, in which the body of the message, but not the attachments, is signed, and this signature then becomes one of the attachments.
Insofar as PIM data is stored in files or is served over the web, authentication issues are simply special cases of generic file access or web access to restricted data.
Microsoft Exchange presumably uses Microsoft's brand
of transitive authentication when a manager makes a meeting by writing on
other peoples' calendars. If the calendar is served over the web, then
the general solution for web access to restricted data applies to this
application. If the calendar is in UNIX files owned by the subject, which
is the "right" or politically correct way, versus having the files
or database writable because they are owned by a central daemon, then
write access for other people has to be handled through group permissions
and/or ACLs, which can get hard to manage.
Jabber/XMPP can be configured to use the traditional loginID and password, or several digest-type equivalents, and these can be done through SASL or by intrinsic methods (assuming the server can read the required password table). There is a patch to accept GSSAPI via SASL, but I was not able to get it to work. There is also the issue that the client, gaim, doesn't do its part of GSSAPI. Thus transitive authentication is not going to happen for Jabber.
IRC can, but usually doesn't, do password authentication of clients.
A typical setup just gets an identity with "honor system" authentication.
Web service of restricted information requires some kind of authentication. Since the http protocol is stateless, it is necessary to send the authentication credential from the web browser to the server on every page request. But in the https protocol the TLS connection itself can be cached effectively, and the major web browsers and servers do this; thus if an X.509 certificate is involved it persists and need not be sent and checked again. Web browsers generally can remember the credential within a session (web browser execution) and re-send it when another page in the restricted directory is requested. There are four basic types of web authentication:
The client user provides to the web
browser a loginID and password (possibly to be "remembered" in a local
file), which are sent in clear text unless https is in use. True
transitive authentication is not possible. To authorize a client the
server operator obtains his password, hashes it, and saves it in a file in
the restricted directory. This authorization style makes administration
difficult; thus a common way to use Basic authentication is to have only
one loginID and password, and to authorize clients by telling this
credential to them.
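For reference, the Basic credential is merely base64-encoded, not encrypted, which is why https matters; a minimal demonstration (made-up loginID and password):

    import base64

    # What the browser attaches to every request into the restricted directory:
    credential = base64.b64encode(b"jimc:hunter2").decode()
    print("Authorization: Basic " + credential)
    print(base64.b64decode(credential))   # b'jimc:hunter2' -- no secrecy at all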
With https, normally the client does not send a certificate while the server does. But individual directories can be configured to require a certificate. The Distinguished Name in the client certificate is sought in the password storage file in the restricted directory. Thus this authentication style is hard to administer unless there is only one or a few authorized users. In theory X.509 authentication could provide true transitive authentication, but the web browser generally requires a password to decrypt the secret key, plus there are typically several certificates and keys, and the user has to tell the browser which one to send (the browser is too dumb to remember this choice). So the hope of transitive authentication is dashed.
Restricted data such as a person's calendar may be held by a webserver extension (CGI) which runs under a daemon user ID, not the client's identity, and which has its own way of restricting access to the files including an authentication module. Generally this would involve a loginID and password, but in principle some form of transitive authentication could be made to happen.
The module mod_auth_kerb is available from Sourceforge. It can use Basic authentication, validating the loginID and password through the Kerberos authentication server; but also, if supported by the browser, it can do actual Kerberos transitive authentication. Supporting browsers include Microsoft Internet Explorer version 6 and up (another source says 5 and up), and Mozilla with the negotiateauth plugin (source says Mozilla 1.7 and Firefox 0.9). Definitely we have to get this module installed and working, plus the Mozilla plugin.
I believe the authorization method is the same as the others, i.e. looking for the client's identity in the password storage file in the restricted directory.
MSIE has to be configured to use this mechanism. Under Tools - Internet
Options - Advanced, find the Security section and turn on "Enable
Integrated Windows Authentication". Mozilla and Firefox also have a
configuration option, but it appears to be turned on by default, though only
on a secure connection (https). If you want to allow your password and/or
Kerberos credential on an unencrypted link (not recommended), navigate to
the pseudo-URL about:config. Locate and edit the keys
network.negotiate-auth.delegation-uris (that's URIs, not URLs) and
network.negotiate-auth.trusted-uris. The default value of each (a comma
separated list) is just https://. Append the insecure ,http://
to each of them.
Naturally the client needs to have a Kerberos ticket for this to work. On Windows he needs to be logged in to a Win2K (or newer) domain with Active Directory active or with an external Kerberos server configured.
Another possible module is mod_auth_gss; it is referred to in this blog entry. It does generally the same thing as mod_auth_kerb except no fallback to Basic authentication. Its advantage is that it only needs libgss.so and not the complete Kerberos libraries. It is discussed in the context of Solaris-10 and possibly it is usable only on Solaris. It is, however, available from Sourceforge CVS.
Presently transitive authentication does not apply to electronic commerce, but it could and should. The whole issue is discussed later in a separate section. The remaining authenticated services also don't effectively do transitive authentication: Wireless nets are usually set for WEP or WPA shared keys, or if a X.509 certificate is used it's generally not the user's own certificate. Trust agents do their thing just once in the digital lifetime of a server host or, rarely, a client identity, and so transitive authentication is irrelevant.
Having reviewed how to do transitive authentication on numerous specific services, let's now review the various authentication mechanisms generically.
The original authentication method was a loginID and password (Plain or Basic authentication), and all services and client software that can do authentication can do this style. But this is what we're trying to get away from, so I won't discuss it further.
To use X.509 certificates for transitive authentication, the operating system as part of the initial login process needs to start a key agent and load the client's secret key(s) into it. This agent can then encrypt and decrypt challenge strings to do the transitive authentication, as well as participate in setting up an encrypted channel, typically TLS.
The X.509 certificate is very attractive for authentication because:
On the other hand, X.509 certificates have these disadvantages:
For the system administrator the main activity needing transitive authentication is remote shell execution. (You install files by executing a partner-server on the remote site; you perform software package updates by executing the updater on the remote site, etc. etc.) In addition, one of the standard operating modes of the IMAP mail delivery protocol is to remotely execute a partner-server on the mail storage host. For these services SSH is ideal. Here are some of its advantages:
But SSH is no panacea. Here are some of its limitations:
One of the major design goals of Kerberos is transitive authentication, and when client and server software do Kerberos at all, the user experience is excellent. Kerberos has these advantages:
On the other hand, Kerberos has these disadvantages:
- All principals' secret keys reside on the authentication server, so it is vital to protect, and if possible "encrypt", the authentication server's database.
- Propagation of database changes to slave servers is done by periodic full dumps; incremental propagation is an experimental feature. Jimc has implemented an incremental propagation scheme for MIT Kerberos, though it is not mainstream.
Kerberos is the key to practical transitive authentication. RSA keys have their place but will not help us with most of our authentication issues.
It is possible to do transitive authentication on a lot more services than we do now. Here are lists of improvements that we should make. The first set are actually doable.
Finish creating the Kerberos infrastructure. This includes:
Every user (not just a few MCG people) must have a Kerberos identity.
Every host including workstations (not just mail storage sites) and every service on those hosts must have a Kerberos identity. The post_jump script should be able to create these identities when Linux is installed on a new machine.
We need a master site and a slave on PIC, and a slave server on each Math subnet, meaning one more slave. We need to move the existing Math servers to more suitable hosts. Recommended are Sunset, Tupelo, Walnut, Malibu and Laguna.
We should consider, and likely set up, cross-realm trust between Math and PIC.
Login scripts must ensure that the Kerberos credential is initialized on each user login and shredded on logout. [This is done; it works, skipping silently if the user has no Kerberos identity.]
For unobtrusive initial setup, we should install the PAM module that creates a Kerberos identity upon finding it missing, using the login password.
Microsoft Windows should be configured to use the Math and PIC
Kerberos servers. The Windows infrastructure must be authorized to
add principals needed by Windows, e.g. when a new host joins the
"domain" a host principal must be created and its symmetric key
must be written in the registry of the host. We would set up a
special domain to test this initially.
Login scripts also need to start key agent(s) for SSH and GPG. [Done, works, skipped if the user doesn't have any keys of the respective types.]
Install or configure more Kerberized servers and clients. These include:
Configure SSH to use the Kerberos credential to authenticate if it is available. [This is done, and works.]
Include pam_krb5 in the password changing PAM stack. [Done, works.]
Obtain and/or write a password changing bridge from Windows to MIT Kerberos. Samba may have the feature.
Configure all NFS clients to use version 4 including GSSAPI (Kerberos) transitive authentication, and configure servers to accept authentication if proffered, otherwise interpreting the user as <nobody>, for readonly access to public software. [This was working on Simba but got broken in the v10.2 upgrade; need to re-do.] (See the NFSv4 sketch after this list.)
Install and configure the Kerberized IMAP daemon.
Make sure GSSAPI (Kerberos) is configured and working in all our Postfix outgoing mail servers. It is supposed to be built in by default. (See the Postfix sketch after this list.)
Locate a Kerberos authentication patch for Thunderbird on Linux, for IMAP authentication. Make sure Outlook Express and Thunderbird on Windows can and do offer Kerberos as an auth mechanism for IMAP. Similarly, make sure that outgoing SMTP can be authenticated by Kerberos.
Get the GSSAPI patch for the Jabber server to compile. Find a client that will do GSSAPI.
Install mod_auth_kerb on all Apache webservers. Install the negotiateauth plugin for Mozilla / Firefox. Make sure Microsoft Internet Explorer is configured for Integrated Windows Authentication as a pushed-out policy. (An Apache configuration sketch appears after this list.)
Configure all restricted web content to use Kerberos authentication.
We could install the Kerberos patch for CUPS, but this is probably not worth the effort.
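To flesh out some of these items: for SSH, the relevant OpenSSH switches are the following. Delegation forwards the ticket so the remote session can authenticate onward in turn.

    # /etc/ssh/ssh_config (client side)
    GSSAPIAuthentication yes
    GSSAPIDelegateCredentials yes

    # /etc/ssh/sshd_config (server side)
    GSSAPIAuthentication yes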
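For NFSv4, the server export and client mount come down to something like this; host names and paths are made up, and the readonly <nobody> fallback would need a second export stanza whose sec list includes sys.

    # /etc/exports on the server (sketch)
    /export  *.math.ucla.edu(rw,sec=krb5)

    # on the client
    mount -t nfs4 -o sec=krb5 fileserver.math.ucla.edu:/export /mnt/export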
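For Postfix, SASL authentication is switched on roughly as follows; the Cyrus SASL config file location varies by distribution, and the keytab is usually pointed to via the KRB5_KTNAME environment variable.

    # /etc/postfix/main.cf
    smtpd_sasl_auth_enable = yes
    smtpd_sasl_security_options = noanonymous

    # Cyrus SASL: smtpd.conf
    mech_list: gssapi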
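For Apache, a mod_auth_kerb sketch protecting one URL might read as below; the keytab path is an assumption.

    <Location /restricted>
        AuthType Kerberos
        AuthName "Mathnet login"
        KrbMethodNegotiate On
        KrbMethodK5Passwd Off
        Krb5KeyTab /etc/apache2/http.keytab
        Require valid-user
    </Location>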
Additional support is needed for home users.
Our VPN server has pretty drastic firewall rules. These will need to be relaxed to allow Kerberos authentication from home.
On a standalone home Windows box, can it do primary authentication to the Mathnet or PIC domain, receiving our Kerberos credential in the normal way? What holes do we need to make in our firewall to make this happen? Are we then exposed to trans-net attacks? Will we demand that the home user use a VPN?
On a networked remote Windows box, e.g. for our prof when visiting at another institution, can he authenticate on both that place's net and at Mathnet at the same time? Will Windows present the right credential to the two nets?
Similar issues arise for a home Linux box; we need to provide a script to automate authenticating to Mathnet, and if the remote Linux box is part of its own Kerberos realm (lacking cross-realm trust), the script must not wipe out the credential from the machine's home net. (See the sketch after this list.)
On what terms will we allow NFSv4 mounting through our firewall?
Find a Windows client for NFSv4. This is for both work and home.
All the patches and plugins we provide automatically at Mathnet need to be documented and made available for home users to download.
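For the home Linux script just mentioned, the essential trick is to keep the Mathnet ticket in its own credential cache so the machine's native credential survives; the realm name here is an assumption.

    # Acquire a Mathnet ticket without disturbing the default ccache
    export KRB5CCNAME=FILE:/tmp/krb5cc_mathnet_$(id -u)
    kinit user@MATH.UCLA.EDU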
These action items would be very helpful but as a practical matter we are not likely to actually accomplish them.
Modify OpenVPN to use generic GSSAPI for authentication. But there's then a chicken and egg issue: if we demand that the VPN be running before sending packets over the Kerberos auth port, the remote user will not have a Kerberos credential when the VPN starts up.
Make Kerberos authentication work through database middleware including from Windows clients.
Change the Kerberos protocol to use RSA keys. Change the Authentication Server to obtain the corresponding public keys from LDAP, i.e. junk the Kerberos database entirely.
Create GSSAPI and/or SASL mechanisms that use X.509 certificates for authentication.
Locate or create a universal key agent that can serve SSH, GPG, and
OpenSSL software, either storing the secret key(s) itself or acting as
middleware to a smart card. Keychain
is a beginning in this
direction but not exactly what I have in mind.
The present so-called authentication method for credit card sales, both on the web and in person, is completely inadequate and is regularly and rampantly subverted by thieves. A major part of the problem is that the client's complete credential is made public and can be replayed to buy things. Points where the credential escapes are:
A client making an online purchase has a keystroke logger on his computer which reports all the credit card information to its owner.
A client responds to a request by his bank's "security department" to confirm his credit card information. (Phishing.)
Merchant personnel copy the credit card while it is not visible to the client. This is particularly a risk at restaurants.
The datafeed from the merchant to the bank is compromised; this is called skimming. Promiscuous RFID smart cards have a related attack vector.
An online merchant's website is hacked and the credit card information is stolen, sometimes over a considerable interval of time.
Let's go over various existing authentication mechanisms for credit cards:
Originally the merchant used a printing device to transfer the embossed letters on the credit card to a paper document which was supposed to be read by automation. The paper was expensive to handle and was often illegible. The waste (carbon paper) was stolen by thieves, and merchant personnel had physical access to the document for an extended time. These various disadvantages prompted the widespread adoption of magnetic card readers.
Existing credit cards have the account number, client's name, etc. encoded on a magnetic stripe on the back, so it can be harvested by the merchant's reader (and by thieves who tap into the data stream, a procedure called skimming).
There are several variants of smart cards, some smarter than others. First, some emit their payload as-is, like a high-tech magnetic stripe, while others encrypt their payload, and the most advanced ones actually do RSA-type cryptographic services, including being suitable for generic authentication. Second, some cards require electrical contact with the reader per ISO 7810 and ISO 7816, whereas others communicate with a not-too-distant reader by radio at 13.56 MHz according to ISO 14443. Thief-type access at a modest distance has been demonstrated; for this to be practical the card's payload must be in plain text, which it is in some bank deployments. Communication is promiscuous, that is, the card will talk to any reader; I have not heard of cards which require the reader (or the server communicating through the reader) to authenticate before it will respond.
The smarter smart cards include a serial number programmed on the chip during manufacturing, which cannot be changed and which is part of the contained encryption key; thus the enemy cannot reprogram one card to be a copy of another without the collusion of manufacturing personnel.
Some smart cards are able to create (or accept from an outside source) an RSA secret key, and will not reveal it unless the enemy removes the cover and probes the memory cells, which would make the chip unusable as a smart card afterward.
One company sells inexpensive (US$36) smart card readers that can be used on Linux and Windows, and cards for general cryptography for US$28 each.
To my mind the right way to do e-commerce involves three major steps: the client and the merchant each establish trusted identities known to their banks, and they both use these identities to sign a payment agreement which their banks will believe in. In more detail:
The client creates his own secret and public RSA keys (or his bank may create them for him). (A command sketch appears after this list.)
If available in his jurisdiction, e.g. the Netherlands or Finland, he gets his Department of Population Control to certify (sign) his public key, producing an X.509 certificate. Otherwise the certificate will have to be self-signed.
When opening an account at a bank, the client authenticates by whatever means may be convincing. The government certification would help here. When the bank accepts his identity, he gives his certificate to the bank. A helpful and service-oriented bank (I'm being sarcastic here) would extract his public key from the presented certificate and sign it, producing a new X.509 certificate which assures recipients that he has an account with that bank.
Similarly, when he undertakes any debt such as a home mortgage, apartment rental, or cellphone contract, he gives the creditor his certificate. The one signed by the bank would be preferred by the creditor.
The various creditors report personal information about the client to credit reporting agencies, accompanied by the public key. In particular, when the client creates a new secret and public key, e.g. for annual renewal, the public key will be filed with the reporting agencies.
Eventually the client opens a credit or debit card account, and the card issuer receives from him his public key and a signature with the corresponding secret key. This authoritatively matches up the applicant with the records held by the credit reporting agencies: no more opening a credit card account with a stolen identity.
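The key and certificate generation in the first two steps reduces to a couple of OpenSSL commands, e.g. in the self-signed case (file names made up):

    # Generate the client's RSA key pair
    openssl genrsa -out client-key.pem 2048
    # Produce a self-signed X.509 certificate, valid for one year;
    # the client fills in his own Distinguished Name when prompted
    openssl req -new -x509 -key client-key.pem -out client-cert.pem -days 365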
The procedure for making a purchase would go like this. The Distinguished Names of the client and the merchant identify them uniquely and replace today's account numbers.
An integral part of using the card account is that the merchant sends the bill to the client and gets it back, signed. Several mechanisms, not mutually exclusive, can be envisioned for this operation.
An e-mail solution is decidedly low-tech but has the advantage of needing no special hardware and relatively simple software. Disadvantages are that the merchant needs to find out and type in the client's e-mail address, and the client has to have a computer on the net to receive the mail. The client should have a simple shell script that the message is filtered through right from the mail reader, which appends his information and signs the outgoing message using the secret key stored in his key agent. The user interface of Mozilla-type readers such as Thunderbird works better with a plugin that forks off this shell script.
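Such a filter script might be no more than a clear-signing wrapper around GPG, with the running gpg-agent supplying the secret key; the appended fields here are placeholders.

    #!/bin/sh
    # Append the client's information to the bill arriving on stdin,
    # then clear-sign the whole thing; gpg-agent holds the secret key.
    { cat
      echo "DN: CN=A. Client"
      echo "Bank: https://bank.example.net"
    } | gpg --clearsign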
The client navigates to the merchant's website and collects his bill. While a text-only browser might only be able to save the file, after which the client uses a command-line interface to sign it and post it back to the merchant's website, a much more user-friendly method is a browser plugin that does the same thing with one mouse click. The web form method is clearly the preferred way for online shopping, e.g. at amazon.com, since when checking out the client is already on the correct page of the site.
The client inserts a smart card in the merchant's equipment, or (for online shopping) into a reader on his own computer, which sends it the bill through the electrical contacts or over the radio. This is the mode that banks are promoting as a replacement for cards with magnetic stripes. But we don't really want the card to authorize every bill presented to it. Here are my preferred card features:
The card should be provided with the root certificates of a number of trust agents, and should require the merchant to present a certificate signed by one of them, containing the public key corresponding to the secret key by which the bill was signed by the merchant. Thus a thief who snuck a bill in ahead of the merchant would have to give his own true name, making that mode of attack ineffective.
The card should have a small LCD screen, visible when it is in the reader, that can show the merchant's abbreviated name and the amount of the bill. This guards against an RFID card associating with the wrong reader, or errors (or fraud) in the amount to be paid.
The card should have a button, capacitive sensor, etc. so the client can positively indicate that the transaction should proceed.
The card should have storage for content to be appended to bills that it signs, specifically the client's Distinguished Name and bank URL.
The card should be able to store a considerable number of bills sent to it, which the client will copy periodically to his money management records. Particularly it should save those that the client refused to sign, for later forensic investigation.
If the card is also going to be used for general transitive authentication, we can't require a button press on every authentication, e.g. for access to files. The answer may be that each request for authentication should include a purpose, which is appended to the document being signed. Some purposes would be exempt from the button press. If the bank were to receive a bill signed with a file access purpose (as an attempt at fraud), it obviously would not pay.
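In other words, what the card actually signs might look something like this; the field names are invented purely for illustration.

    purpose:   retail-payment        (or file-access, login, ...)
    merchant:  CN=Example Store, O=Example Inc
    amount:    USD 42.17
    client:    CN=A. Client, plus his bank URL
    signature: RSA signature over all of the above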
Competently made cards cannot be cloned, but if the card is physically stolen, how can we prevent the thief from using it? The obvious way is a password. But it must not be entered through the merchant's equipment lest it be revealed to thieves. A keypad on the card could not be complete enough to do a proper password: it would be better to have none than to have a four-digit PIN. There's a big advantage in having a complete PDA-sized computer which holds the secret key and that could include a fingerprint reader and an interface good enough to do passwords. But that idea would not fit at all with the traditional way of using credit cards, nor with the bank's desire to control as much of the credit card process as possible, specifically the secret key.
Another possibility (likely equally unpopular with the users) is for the card to act like any other key agent: it requires a proper password delivered from the client's own computer (which the client must have available, e.g. on a journey), which turns it on for 24 hours, after which the password must be given again.
I object to the mode envisioned by some banks, where the card can remain in the client's pocket: the merchant reaches into this "electronic wallet" and extracts whatever he wants. That's an open invitation to electronic pickpockets and duplicitous merchants.
The same card could be used for online shopping and for generic authentication if the client's computer had an inexpensive card reader. The software interface to the smart card should be the same as for a purely software key agent, so client software need not distinguish them.
Many brands of smart cards can be programmed in Java and have cryptographic and X.509 operations as library routines, so it would not be a big effort to do the various operations described: verifying trust in the merchant's certificate, appending saved client data, and signing the bill.