We want to monitor the health of our servers in the machine room, and particularly, their temperatures. We want some automated assistance to make this happen.
A web search about temperature monitoring reveals three keystone packages: Cacti, Munin, and ipmitool.
The request was to install on all Linux (SuSE) servers the monitoring packages Cacti and/or Munin, and to configure the servers appropriately (i.e. load device drivers) to make available the various sensor information.
Jimc can do this easily and promptly. But suppose jimc is not around? The purpose of this web page is to document the procedure for a moderately complex installation like this one.
The first step is to search for the packages on the SuSE Build Service, http://software.opensuse.org/search This site is a repo (RPM software repository) for standard packages that would not fit in the main distro, for developers' projects to distribute specialized packages, and for supporting non-SuSE RPM-based distros such as Fedora and CentOS. If the Build Service doesn't have it, try Packman and if they don't have it, Google is your friend :-) Non-SuSE packages should be stored in the Mathnet repo, not SuSE-build.
The searcher will go direct to the package if unique, or will give you a choice list. On the package page, click Show Other Versions (don't try Direct Install on your desktop workstation as an ordinary user). Find your distro version. Ipmitool is in the official release, but often you will have to click Show Unstable Packages and guess which version seems most trustworthy.
Now on distro.math.ucla.edu (currently hosted on Sunset), change directory to /h1/www/htdocs/Distro-repo (currently a symlink to /s1/SuSE). In that directory you will find a file called source.me, which sets some shell variables with directory and command names; source it into your shell. You can view the file to see what variables are available. You need to update this file if the repo is moved elsewhere.
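The mechanics of sourcing look like this. This is a toy stand-in: the variable names (repo_base, sbs) are guesses at what the real source.me defines, and a throwaway copy is created here so the sketch is self-contained.

```shell
# Toy stand-in for sourcing source.me; the real file lives in
# /h1/www/htdocs/Distro-repo on the distro host. Variable names here
# are assumptions, not copied from the real file.
tmp=$(mktemp -d)
cat > "$tmp/source.me" <<'EOF'
repo_base=/h1/www/htdocs/Distro-repo
sbs="perl $repo_base/snarf-build-service"
EOF
. "$tmp/source.me"      # leading dot: run in the current shell so the
                        # variables persist for the commands that follow
echo "$repo_base"
echo "$sbs"
```

The leading dot (or `source` in bash) is the important part; running the file as a child process would set the variables only in that child, not in your working shell.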
Now use the snarf build service script to download the packages. With the -p option it will download to the Mathnet repo instead of SuSE-build. You may download multiple packages in one execution. Mathnet policy is to download both the i586 and x86_64 architecture versions of all packages. On the distro host type the command $sbs, then the URLs (space separated). In your web browser, right-click the 32bit, 64bit or noarch link for the package, select Copy Link Address, and paste into your xterm on the distro host (followed by a space).
Here's an example for ipmitool:
$sbs http://download.opensuse.org/repositories/openSUSE:/11.4/standard/i586/ipmitool-1.8.11-6.3.i586.rpm http://download.opensuse.org/repositories/openSUSE:/11.4/standard/x86_64/ipmitool-1.8.11-6.3.x86_64.rpm
The script will download the packages and file them in the correct
directories in the local repo. At the end of each download it prints
Saved in $pkgfile, using $pkgfile to represent the filename.
The next step is to deal with dependencies. I hope to automate this someday, but so far that's vaporware. Use this command line:
rpm -q --requires -p $pkgfile |& less
It will show a list of requirements, from which you can infer the package names. For libraries you can do something like this, e.g. for ipmitool's requirement for libcrypto.so.1.0.0. First check that the library file is present, here /lib/libcrypto.so.1.0.0 (goody, it's already installed), then ask which package owns it:
rpm -qf /lib/libcrypto.so.1.0.0
(result:) libopenssl1_0_0-1.0.0c-18.42.1.i586 (from this package)
In the requirement list you can ignore all the rpmlib items, and you will quickly learn which libraries are standard (in this case, libc, libm and libreadline). Infer the dependent packages, check if they're already installed or already in one of the local repos (SuSE-distro, SuSE-build, Mathnet), and for those that aren't, find and download those, and analyse dependencies recursively. For some nightmare packages you'll end up downloading 30 dependencies or more, but ipmitool is easy, with all dependencies already available.
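The triage just described could be partly scripted. Here is a hedged sketch: given lines of `rpm -qp --requires` output, it drops the ignorable rpmlib() pseudo-requirements and separates shared-library sonames (candidates for `rpm -qf` lookup) from plain capabilities. The sample lines are typical rpm output shapes, not captured from a real run.

```python
# Sketch of the manual dependency triage: filter and classify the
# requirements printed by `rpm -qp --requires $pkgfile`.
def triage_requires(lines):
    """Return (library sonames, other capabilities), skipping rpmlib() items."""
    libs, caps = [], []
    for line in lines:
        fields = line.split()
        req = fields[0] if fields else ""
        if not req or req.startswith("rpmlib("):
            continue                 # internal to rpm itself; ignore
        (libs if ".so" in req else caps).append(req)
    return libs, caps

sample = [                           # made-up but typical rpm output
    "rpmlib(PayloadFilesHavePrefix) <= 4.0-1",
    "libcrypto.so.1.0.0",
    "libc.so.6(GLIBC_2.4)",
    "/bin/sh",
]
libs, caps = triage_requires(sample)
print(libs)    # -> ['libcrypto.so.1.0.0', 'libc.so.6(GLIBC_2.4)']
print(caps)    # -> ['/bin/sh']
```

Each soname in the first list would then be checked with `rpm -qf` as shown above; anything unresolved goes back through the download-and-analyse loop.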
Now rebuild the repo metadata like this. In source.me you will see that you have variables for the repo directories and for the metadata rebuild command:
DISPLAY='' $mkr $bs
At the end it has to sign the metadata backbone file with GPG and it will want the root password for that. I prefer the text-based query so I suppress X-Windows as shown. If you prefer the X-Windows pop-up box, don't say DISPLAY=''.
Now it would be a good idea to test the installation of your keystone package(s). It's most efficient, i.e. gives you the most useful error messages, if you install all the keystone packages at once. Pick a machine, and do:
zypper refresh #(downloads the new metadata)
zypper install ipmitool
If you got all the dependencies, the package(s) will be installed. If not, it will announce missing dependencies, but stupidly it will only show one per command-line package. If it says Obscure-Package cannot be provided without giving a reason, but you know you downloaded it, try adding that package explicitly on the command line, and it will tell you what recursive dependency is missing. Go back and obtain missing dependencies, recursively. The operative word fragment here is recurs.
While it's OK to run Zypper by hand on one machine for testing, the right way is to add your new keystone packages to Mathnet's package management file, /m1/custom/mathnet.sel. Then the installation process can be automated, and the package will not be deleted in cleanup, and will be preserved in upgrades.
Windows XP (and above) can be monitored via SNMP; see the Munin FAQ entry on how to monitor Windows (Cacti can also poll via SNMP, and the Linux SNMP daemon can provide useful information too). There are monitoring plugins for various parameters in the Linux /proc filesystem. IPMI data can also be retrieved with SNMP or with native plugins, which is our main use case.
Both packages use RRDtool to create the graphs. They periodically poll the hosts being monitored from a master site, store the results, and create graphs or textual displays on demand. RRDtool has its own database engine specialized for circular data, i.e. it is configured to hold samples for a prespecified interval such as one week, and newly added samples overwrite the oldest ones. Multiple archives (RRAs) with different intervals are allowed. RRDtool also has a graph creation engine, which the front-end scripts can call.
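The round-robin idea can be illustrated with a toy fixed-capacity store. This is only a sketch of the overwrite behavior; real RRDtool also consolidates samples (averaging, min/max) when filling its archives, which this omits.

```python
from collections import deque

# Toy illustration of an RRDtool round-robin archive (RRA): a
# fixed-capacity store where each new sample displaces the oldest one.
class RoundRobinArchive:
    def __init__(self, slots):
        self.samples = deque(maxlen=slots)   # oldest entries fall off the front

    def update(self, value):
        self.samples.append(value)

    def fetch(self):
        return list(self.samples)

rra = RoundRobinArchive(slots=3)     # e.g. 3 slots standing in for "one week"
for temp in [41, 42, 43, 44]:        # the fourth sample overwrites the first
    rra.update(temp)
print(rra.fetch())                   # -> [42, 43, 44]
```

Because the storage never grows, the database file for a host stays a fixed size no matter how long monitoring runs, which is the main attraction of the design.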
When storing data, RRDtool can be executed separately on each host data point, or can run as a daemon, with journalling. Data extraction is handled per event, e.g. per graph. Communication with the daemon is normally via a UNIX domain socket, but a TCP socket can be used (with no authentication and with optional host-based access restriction).
Required packages for Cacti:
httpd, i.e. a webserver that can run PHP. Apache is good.
PHP, which is the language Cacti is written in. It also needs several PHP modules.
RRDtool is not listed as a prerequisite, but is used to store the data from the hosts.
mysql, client and server. They don't say anywhere what they store in the SQL database, but it isn't the host data.
net-snmp. This is the standard way to poll hosts for operational data.
For polling hosts there is a script called cmd.php, but if you have a lot of hosts you can install a daemon called Spine, which is compiled code.
Details on Munin:
Munin is written in Perl. It is architecture-independent, but some monitoring plugins depend e.g. on the /proc filesystem and hence are Linux specific. Dependent Perl packages are not arcane.
A program runs on each host (as root) to collect the data, and forwards it to a master site (running as a special user) for storage.
Munin is designed to work fairly well out of the box, requiring relatively little hand labor to set it up.
All hosts (that participate in monitoring) need the Munin client package; only the master site needs the Munin master package.
To deliver the graphs and statistics, the webserver needs to run the Munin master as a CGI, either with native execution or via mod_perl.
Support in Munin of IPv6 depends mainly on whether the dependent Perl packages, particularly Net::Server, support it.
The master's configuration file /etc/munin/munin.conf needs to have all the nodes listed explicitly. We will have to write a script to automate this. The conf format is pretty simple.
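The automation script mentioned above might look like this sketch: it emits one node stanza per hostname in the conf format munin.conf uses. The host list here is made up; the real one would come from Mathnet's host inventory.

```python
# Sketch: generate munin.conf node stanzas from a plain host list, so
# new machines don't have to be typed into the master's conf by hand.
# The stanza fields (address, use_node_name) follow munin.conf's
# standard syntax; the hostnames are hypothetical examples.
def munin_stanza(host):
    """Return one [hostname] entry for the master's munin.conf."""
    return f"[{host}]\n    address {host}\n    use_node_name yes\n"

hosts = ["server1.math.ucla.edu", "server2.math.ucla.edu"]  # hypothetical
conf_fragment = "\n".join(munin_stanza(h) for h in hosts)
print(conf_fragment)
```

The fragment would be appended to (or assembled into) /etc/munin/munin.conf on the master; since the format is line-oriented and simple, regenerating the whole node section from the inventory is probably safer than patching it.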
The node's conf file /etc/munin/munin-node.conf basically is used in the constructor of Net::Server. The node software is a TCP daemon listening on some port, and the master polls it. It provides host-based access control. Normally it runs as root (to read /proc).
The main package has a collection of common plugins, and additional plugins can be found at Munin's GitHub area.
Munin is the name of one of Odin's pet ravens, who would fly over the world and spy for him.