With the start of the new academic year we need to think about what projects we want to do. I've listed my items, in approximate order of priority as seen by me.
To get the benefits of the new OS release, we need to get it deployed promptly.
All but 3 servers (Arachne, Testudo, Ulanda) are presently upgraded, and Arachne will get its turn very soon. The two backup machines are not in urgent need, but it's best to get the upgrades over with.
A few Bamboos have been upgraded. This needs to be pushed because of version skew: C++ programs compiled on v10.1 use libstdc++-4.1.0, while v9.2 and v8.2 have libstdc++-5.0.x (older despite the higher number). A program compiled on v9.2 will run on v10.1, which provides compat-libstdc++-5.0.7, but a program compiled on v10.1 will not run on v9.2. We need to make this problem go away by upgrading all the Bamboos promptly. Non-Bamboo compute nodes have already been upgraded.
We need to get v10.1 onto workstations, aggressively. Reasons:
The KDE and Gnome menus are full of useless stuff and they lack stuff useful for Mathnet. Also the FVWM menu needs to be reviewed again. We have a list of functions that we want on the menus; now we have to create the menus themselves and deploy them. At the same time, the login scripts should be reviewed thoroughly.
Our handouts and writeups are almost 100% from the last handout campaigns around 1992 and 1996, and are ridiculously out of date. We currently have a "handout" command to display them on a VT100, with lots of features. We should completely rewrite and revise all our handouts, presenting them as web documents.
When a Mathnet program or script is installed, invariably some machines are down, and they miss the update. We have had occasional bugs from this cause. We need a program to automatically review the installed files, and to report when a machine has an old version. The same program can serve a tripwire function, detecting programs that have been altered either accidentally or as a hostile act.
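A rough sketch of the kind of check I have in mind, in Python. Everything here is an assumption for illustration: the manifest format (relative path mapped to a SHA-256 digest), the function names, and the idea that each machine runs the check against a reference manifest built from the master copy. A mismatch cannot by itself distinguish a stale version from an altered file; both are reported as "differs".

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def audit(manifest: dict[str, str], root: Path) -> dict[str, str]:
    """Compare installed files under root against a reference manifest.

    manifest maps a path relative to root to the expected digest
    (hypothetical format). Returns a dict mapping each problem path
    to "missing" or "differs".
    """
    problems = {}
    for rel, expected in sorted(manifest.items()):
        target = root / rel
        if not target.is_file():
            problems[rel] = "missing"
        elif file_digest(target) != expected:
            problems[rel] = "differs"
    return problems
```

Run from cron on each machine, an empty result means the machine is current; anything else would go into a report to us.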
Every time someone has to mess with Ethernet cables in the machine room, it's a nightmare of tracing cables to make sure we have the right ones. At least they don't drape over the top of the racks like they used to. Charlie had a good idea: mark each end of each wire with a serial number. Let's make this happen before the next network upheaval. It would also help to document which ports the various wires are supposed to be plugged into. How about adding a field to equip for the Cisco box and port number? The Cisco config file item for each port is supposed to have a comment telling where it goes. A script could compare these tables and report missing or inconsistent entries so we can fix them.
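Here is a sketch of what the comparison script might look like, in Python. It assumes several things not settled above: that the per-port "comment" is the IOS-style "description" line under each "interface" stanza, that the equip data can be exported as a simple host-to-port mapping for one switch at a time, and that a port's description should contain the host name. The function names are made up for illustration.

```python
def parse_cisco_descriptions(config_text: str) -> dict[str, str]:
    """Map interface name -> description text from an IOS-style config.

    Interfaces with no description line map to an empty string.
    """
    ports = {}
    current = None
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("interface "):
            current = stripped.split(None, 1)[1]
            ports[current] = ""
        elif stripped.startswith("description ") and current is not None:
            ports[current] = stripped.split(None, 1)[1]
        elif stripped == "!":
            current = None  # end of the interface stanza
    return ports


def compare(equip: dict[str, str], ports: dict[str, str]) -> list[str]:
    """Report disagreements between equip and the switch config.

    equip maps host name -> port name (hypothetical export format).
    """
    report = []
    for host, port in sorted(equip.items()):
        if port not in ports:
            report.append(f"{host}: port {port} not in switch config")
        elif not ports[port]:
            report.append(f"{host}: port {port} has no description")
        elif host.lower() not in ports[port].lower():
            report.append(f"{host}: port {port} is described as '{ports[port]}'")
    claimed = set(equip.values())
    for port, desc in sorted(ports.items()):
        if desc and port not in claimed:
            report.append(f"{port}: described as '{desc}' but no equip entry")
    return report
```

The second loop catches the reverse case: a port whose description names something that equip doesn't know about.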
Presently our webmail is set up as a minimal installation, with no memory features such as user preferences, and only the IMP web mail application. I think we now have the infrastructure (sqlite, or possibly Microsoft SQL Server via Trifox Vortex) so we can save user information, and this means we can install additional Horde functions: calendar, address book and task list. Also at minimum, we should upgrade to the current version.
We need to learn whether PDAs can be synced with a Horde calendar or address book, and if so, we need to document the procedure.
Whenever a machine is connected directly to a Cisco port and we need to do O.S. maintenance across the network, the installer invariably gets stuck, because the Cisco box won't open its end of the port until 30 seconds after the host turns the port on. This happens both when booting off the CD and with the (hoped to be) no-hands procedure. Isn't it true that the spanning tree delay can be dispensed with on all except trunk ports? If so, let's do that.
NIS was developed at the dawn of Sun Microsystems' successful foray into network computing. It has worked well, but the modern world has different values, e.g. more CPU power and network bandwidth to expend on directory services, and more dangerous and aggressive security threats. For the most part, the NIS maps are reasonable, that is, they give us information that we need and that we can keep up to date. LDAP is an alternative with better security, reliability and behavior under stress. There is a schema (set of tables) for LDAP which is essentially a drop-in replacement for NIS. We have a long-range plan to replace NIS with LDAP, and perhaps now is the time to get serious about this.
The amount of spam we receive is ridiculous. We want to reduce its effect on our system and on our users. Several alternatives have been discussed at various times. Here are my thoughts about what we ought to do about spam.
First we need to assess the magnitude of the problem. Anecdotal counts, or counts from just one person, aren't going to be helpful; we need to survey the entire department. I propose to review recent history: the last 7 days of maillogs on the department MX, and for all user mailboxes, all messages sitting in mailboxes (including the spam mailbox(es)), received in the last 7 days, and received since messages were last deleted from the mailbox (which generally would be more recent than 7 days). I propose to collect these statistics (avoiding personally identifiable information):
Our governing standards and culture require that we act as a common carrier, not making judgments about content. We can (and do) reject messages that are clearly defective, e.g. where the sender address cannot be replied to, and where we can declare actual hazard, as with executable content, but we can't actually refuse to deliver anything else. This means that there is no way to mitigate our costs at the system level to handle spam. However, we can be helpful to our users in their own efforts to dispose of spam.
It's been pointed out that certain commercial blacklist organizations catch a lot of messages. Corporate culture is a lot different from University culture, and there is a question of political correctness. Before embracing one or more of these services we should check very carefully whether they are too zealous or whether hostile competitors can cause a denial of service by falsely accusing someone of sending spam. However, assuming the answer is favorable, Postfix on the MX can tag messages accordingly, and our standard user's procmailrc can (on the user's responsibility) send such messages to the spam mailbox. Edson posted a review in which the sysop used three such services, and the message was tagged/rejected if any one triggered. The number of messages that the services let through but that SpamAssassin tagged was only about 5% of the number the services caught. (No data on how many SpamAssassin missed that the services caught, which of course depends on the cutoff level.)
It's been proposed that we run Sophos PureMessage on the MX. Let's compare PureMessage with SpamAssassin:
PureMessage may or may not be better than SpamAssassin at catching spam. With PureMessage we get frequent corporate updates of signatures; with SpamAssassin some of us get Bayesian adaptiveness. Any difference in effectiveness has to be tested empirically.
How much CPU power is actually going to spam detection? My anecdotal feeling is that it's not very much; this should be validated with real data. In any case, with PureMessage it would be concentrated on the MX, which could be a problem in fault situations. With SpamAssassin the CPU load is distributed to the various homedir servers.
Presently we use SpamAssassin in the user's account just before delivery; however, I believe it's possible to do system-level tagging with SpamAssassin, and even to have some degree of user customization if file permissions are right. That is, if we choose to do that, which I'm not advocating.
How does PureMessage handle user customization, particularly cut levels, blacklists and whitelists? (Not that whitelists are very helpful in any case.)
Does PureMessage have a tag-only option? In the docs I read several years ago the emphasis was on spam diversion, but I'm sure that it can tag only.
How bad is the system administration burden going to be? I worried then that PureMessage would be a time sink, and I still worry.
Unless PureMessage performs significantly better than SpamAssassin in empirical tests, I'm not enthusiastic about system-level spam tagging.
If we tag spam at the system level, e.g. using blacklists, then we could consider diverting it to a different mailbox. Let's call it /m1/spam/$user.$date, in other words, a public directory same as for the system mailboxes, with the mailboxes separated by date and purged after N days by us, not by the user. One of our major problems is inactive users acting as spam traps and filling up the filesystem; system level diversion and (delayed) deletion would take care of that issue. Also, that would take care of clueless users who don't set up spam filtering themselves: we're still delivering the mail, just in a different place.
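The purge step could be as simple as the following Python sketch. The date format in the mailbox name (YYYY-MM-DD), the function names, and the choice to leave unrecognized files untouched are all assumptions for illustration; N is passed in as keep_days.

```python
from datetime import date, timedelta
from pathlib import Path


def expired_mailboxes(spool: Path, today: date, keep_days: int) -> list[Path]:
    """Return mailboxes named user.YYYY-MM-DD dated more than keep_days ago."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for path in sorted(spool.iterdir()):
        user, sep, stamp = path.name.rpartition(".")
        if not sep or not user:
            continue  # not of the form user.date
        try:
            boxdate = date.fromisoformat(stamp)
        except ValueError:
            continue  # suffix is not a date; leave the file alone
        if boxdate < cutoff:
            expired.append(path)
    return expired


def purge(spool: Path, today: date, keep_days: int) -> list[Path]:
    """Delete expired spam mailboxes and return the paths removed."""
    removed = expired_mailboxes(spool, today, keep_days)
    for path in removed:
        path.unlink()
    return removed
```

A daily cron job calling something like purge(Path("/m1/spam"), date.today(), N) would do the rotation. Splitting on the last dot means user names that themselves contain dots are still handled.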
I'm thinking of delivering to a directory belonging to Mathnet so we clearly have the right to rotate and delete the contents. An alternative is to deliver to, and rotate, ~user/Mail/spam. The advantage is that IMAP clients such as Pine and Horde/IMP can display it with no need to tell them where the spam mailbox is. The disadvantage is that we aren't supposed to be monkeying with files in a user's home directory.
PureMessage has a fairly elaborate mechanism for spam diversion. I wonder if it's too elaborate. I think the best approach is to deliver to a file on the homedir server in parallel with the regular system mailbox, and to read it with the same tools used for the regular mailbox.