Raspberry Pi Graphics (2022)

James F. Carter <jimc@jfcarter.net>, 2022-02-11

The machines involved in this report are:

Piki, a Raspberry Pi 3B. Started out in audio performance, moved to desktop replacement and development.
Holly, a Raspberry Pi 3B. Started as desktop replacement, became the Wi-Fi access point.
Iris, an Intel NUC 6CAYH. Local configuration management master site, and the host for working on Piki's SD card.
Jacinth, an Intel NUC 6CAYH. The main router.

Back in 2018 I got a pair of Raspberry Pi 3B's, but had a great deal of trouble making the OpenSuSE ARM port work on them, so they ended up in the cold palace. See here for the history of the Raspberry Pi's in 2018, and here for a foray into a desktop replacement or thin client role (2020).

In 2021 I hit a bug in the MediaTek mt76x2u driver (x86_64) for mt7612u in the Terow ROW02FD Wi-Fi NIC on Jacinth. (See Wi-Fi Evolution on CouchNet in 2021 for info about this NIC and reasons for picking it.) The bug fortunately stays hidden on aarch64 (Raspberry Pi), so I moved Holly into Jacinth's cabinet and transplanted the Terow NIC to it.

Now I have no desktop machine and I also have a maintenance problem: current versions of Tumbleweed for ARM do not put out video on HDMI, so I involuntarily have a headless RPi (no monitor, no keyboard, no mouse). See Holly Hosed, Tested Backup about a bad update that made me reinstall the image and lose the video.

Since Holly is mission critical (Wi-Fi access point), I can't just do random interventions and reboots to bring back video. Instead I'm going to resurrect the other RPi (Piki), which will start out as a clone of Holly, but at the end the lessons learned in making it go will be propagated back to Holly.

I mount discs by label, since a UUID is ugly and impossible to type by hand. Since it often happens, for repairs or investigations, that one machine has two or more entire discs mounted at the same time (e.g. ROOT, EFI and SWAP from both Holly and Piki), I've assigned a number to each disc which becomes a suffix on the label. Holly is on disc 03 and Piki is on 12.

Local Configuration Management Scripts

These are the local configuration management (LCM) scripts mentioned in this document, and some miscellaneous scripts:

hostgroup: Hosts are members of various sets such as architecture, operating system versions, and roles. The hostgroup command and/or Perl module evaluates a set expression (intersection, union, etc) and produces a list of hosts in that set.
post_jump: When a new machine is set up or after a major upgrade, post_jump installs locally managed configuration files, installs wanted but missing packages, removes unwanted packages, and numerous other nitpicky details that otherwise would get forgotten. It is named for the post_jump customization script used when installing Sun Microsystems' SunOS and Solaris operating systems.
audit-repos: It installs the appropriate repo definitions (/etc/zypp/repos.d/*.repo) for the target host's OS version and architecture, particularly non-SuSE repos like PackMan. I also have a caching proxy (squid) for package files, and audit-repos appends its URL to baseurls that are supposed to be accessed through it. Thus an update package to be installed on multiple hosts is downloaded (slowly) across the net only once.
audit-pkgs: /m1/custom/couchnet.sel and extra.sel are lists of specifically wanted packages, referred to as keystone packages, in Debian terminology, or @world for Gentoo. The files include the hostgroups where packages are wanted. audit-pkgs can install any missing keystone packages, or can remove unwanted ones (not removing dependencies of keystone packages), or can do an online update or dist-upgrade.
audit-scripts: It uses /m1/custom/scripts.dat and scripts.extra to determine which service units (daemons, sockets, timers, etc.) are wanted, per hostgroups, and enables or disables them as appropriate.
mkkeystone: SuSE's zypper is pretty good about keeping keystone packages installed if you originally install from the installation disc, but I had trouble with an image as the package source; zypper wanted to do the dist-upgrade for Tumbleweed by removing almost all the packages on the host including the kernel. Troubleshooting is hard and this may not be SuSE's fault, but I decided to get the keystone concept under my control by creating a metapackage, per host, that requires all keystone packages for that host. mkkeystone generates that package. It would have been more SuSE-like to create a keystone pattern, but my scripts are aligned to handle packages.
hostdata.db: Information about each host is stored in a database, including its name, IP address(es), hostgroups and SSHFP records. From the database, /etc/hosts, DNS zone files (with DNSSEC), etc. are derived.
remote-time.cgi: This isn't exactly an LCM script, but it extracts the present time from the master site and steps the local clock to match, after sanity checks. It gets about 20msec accuracy. This is essential on a RPi which has no realtime clock, so when booted or when post_jump starts, its clock will be off by days if not months. I normally run chrony, but on a new installation it will not have its LCM configuration file and will not be enabled, so post_jump has to run remote-time.cgi very early.
chkstat.J: Normally chkstat sets permissions from the files in /etc/permissions* and /usr/share/permissions, but it makes judgments whether files have unexpected security issues, and if so it refuses to fix them, specifically their ownership. So I re-implemented chkstat without the paranoia.
reown: When you install OpenSuSE from an installation disc or an image, the numeric UIDs and GIDs are chosen per the order in which the packages are installed, which is most likely unique per installation and certainly will not match what's in my LCM files. The reown script stats every file on the machine, translates the file's UID and GID to alphabetic using the old passwd and group files, translates them back using the new files, and changes the file's ownership if wrong.
diffie.J: To mitigate the Logjam family of exploits, I rebuild the Diffie-Hellman groups monthly for SSH and OpenSSL. This takes a long time, up to 1/2 hour on the RPi's. Normally the housekeeping scripts run at midnight, when something long-running will not bother the humans.
ssync: ssync is my wrapper for rsync. Normally rsync just copies the files, with no feedback about what was copied. I add logging output by inserting this option: --log-format="%o %f"
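
A wrapper along the following lines would be enough to get the logging described for ssync. This is only a sketch: the real ssync is not shown in this report and may well do more. The function body is an assumption; only the --log-format option and its value are taken from the text.

```shell
# Hypothetical minimal stand-in for ssync: rsync plus one log line per file.
# In the format, %o is the operation (send, recv, del.) and %f the file name.
ssync() {
    rsync --log-format="%o %f" "$@"
}

# Usage (paths are the ones used later in this report):
#   ssync -a -O /s1/holly/piki-conf/ /mnt/
```

Recent rsync versions document --log-format as a deprecated alias of --out-format, so either spelling should give the same output.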

Stuffing SD Card #12

This is pretty much identical to the procedure in Holly Hosed, Tested Backup.

Download your image

Locate and download the current XFCE image for aarch64 (or whichever desktop environment you prefer). https://en.opensuse.org/HCL:Raspberry_Pi3 (hardware compatibility list) has a link to http://download.opensuse.org/ports/aarch64/tumbleweed/appliances/openSUSE-Tumbleweed-ARM-XFCE-raspberrypi.aarch64.raw.xz; if the link is broken, which isn't too rare, dig around for the latest version in the containing directory. Actually, for verifying the signature, it's a lot easier if you download using the URL with the file's full name. Today's size: 1.12e9 bytes (1.12 Gb) compressed, took 283sec, 4.2Mb/s.

Also download the SHA256 checksum from $URL.sha256 and the signature (with SuSE's key) called $URL.sha256.asc . To check the signature, first obtain a trusted instance of SuSE's package signing key on your keyring. (Tracing it back to a trust anchor is beyond the scope of this writeup.) Then:

gpg --verify $file.sha256.asc $file.sha256
Then to check the image itself:
sha256sum -c < $file.sha256
It should reply OK (not too swiftly; the file is big).

A commonly seen error is to get the wrong filename. The download site has a symlink to the current version, but $file.sha256 contains the actual filename separated from the hash by two blanks, so it's better to use the real filename in the download URL. (Or rename what you mistakenly downloaded.)
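
To avoid the filename mismatch, the real name can be read out of the checksum file before checking. The sketch below assumes the usual sha256sum layout (64 hex digits, a blank, then a blank or an asterisk, then the name); the function name is made up for illustration.

```shell
# Print the file name recorded on the first line of a sha256sum-style file.
# The hash and its two separator characters are stripped; the rest of the
# line is the name, so names containing blanks survive.
sha256_recorded_name() {
    sed -n '1s/^[0-9a-fA-F]\{64\} [ *]//p' "$1"
}

# Usage, if the image was fetched through the symlink name:
#   mv "$file" "$(sha256_recorded_name "$file.sha256")"
```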

If you don't have the signing key in your keyring, it tells you which key was used; in this case using RSA key B88B2FD43DBDC284. An expert in cyber warfare would laugh at the following suggestion, but if you trust that your distro hasn't been tampered with, look in /usr/lib/rpm/gnupg/keys/ for a key file with that key ID: currently gpg-pubkey-3dbdc284-53674dd4.asc (the first hex number is the last 8 hex digits of the key ID). The file is ASCII and gives the key's owner, which you may or may not believe. More robust ways to trace the key to a trust anchor are way out of scope for this document.
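
The mapping from the key ID reported by gpg to the key file name can be sketched as follows. The function names are invented for this example, and the directory is simply the one named above; other distros keep their keys elsewhere.

```shell
# Derive the rpm-style short key ID (last 8 hex digits, lower case) from the
# long key ID that gpg reports.
short_keyid() {
    printf '%s\n' "$1" | tr 'A-F' 'a-f' | sed 's/.*\(........\)$/\1/'
}

# List candidate key files for that key ID.
# $1 = long key ID, $2 = key directory (optional).
find_key_files() {
    ls "${2:-/usr/lib/rpm/gnupg/keys}"/gpg-pubkey-"$(short_keyid "$1")"-*.asc 2>/dev/null
}

short_keyid B88B2FD43DBDC284    # prints 3dbdc284
```

A matching file could then be loaded with gpg --import, which only moves the trust question to the distro's own package.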

Copy the image to the card

xzcat $image.xz | dd bs=4M of=/dev/mmcblk0 iflag=fullblock oflag=direct status=progress
sync
Rate ~17Mb/sec. Total (uncompressed): 5.91Gb, 373sec, 15.8Mb/sec.

Partition sizes as delivered

1: EFI 67.1Mb fat16
2: SWAP 524Mb
3: ROOT 5.3Gb ext4
Total: 5.9Gb

At first boot, partition 3 will be expanded to fill the card (not swift). The partitions are labeled per their roles. If you have duplicate partition labels, you will need to do something special.

To re-label the partitions

These command lines are going to re-label EFI, SWAP and ROOT. If a host partition has the same label, it isn't clear which wins, but avoid risks: figure out the device path of the partition on the Raspberry Pi's card (not the host), such as /dev/mmcblk0p3, and use that when re-labeling.

fatlabel /dev/disk/by-label/EFI EFI-12 (from dosfstools)
mkswap -L SWAP-12 /dev/disk/by-label/SWAP
tune2fs -L ROOT-12 /dev/disk/by-label/ROOT
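
When the host has partitions with the same labels, the same three commands can be addressed by device path. The helper below is a hypothetical sketch that only prints the commands for review; it assumes the partition order of the image as delivered (1 EFI, 2 SWAP, 3 ROOT).

```shell
# Print (not run) the re-labeling commands for one card, addressed by device
# path. $1 = whole-disk device, $2 = disc number used as the label suffix.
relabel_cmds() {
    dev=$1 suffix=$2
    # Device names ending in a digit (mmcblk0, nvme0n1) take a "p" before
    # the partition number; sda-style names do not.
    case $dev in
        *[0-9]) part=${dev}p ;;
        *)      part=$dev ;;
    esac
    printf 'fatlabel %s1 EFI-%s\n'    "$part" "$suffix"
    printf 'mkswap -L SWAP-%s %s2\n'  "$suffix" "$part"
    printf 'tune2fs -L ROOT-%s %s3\n' "$suffix" "$part"
}

relabel_cmds /dev/mmcblk0 12
```

Once the printed lines look right they can be pasted into a root shell, or piped to sh.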

Host keys and other files

I expect that during troubleshooting I'm going to have to re-image Piki several times, so I set up a directory (/s1/holly/piki-conf/) that I can just copy (rsync) onto the card. Here's what's in it:

./etc/hostname
./etc/passwd shadow group (Beware, numbers don't match the image.)
./etc/passwd.orig and ./etc/group.orig (copies of /etc/{passwd,group} on the pristine image).
./etc/resolv.conf
./etc/ssh/moduli ssh_config sshd_config
./etc/ssh/ssh_host_${TYPE}_key{,.pub} where TYPE = rsa ed25519 ecdsa dsa
./etc/ssl/hostcerts/dhparams.pem h-piki-cft-R2024.{cia,usr}
./etc/ssl/private/h-piki-cft-R2024.{kca,key}
./etc/sysconfig/network/ifcfg-en0 (also eth0, lo, and others, probably not necessary)
./etc/systemd/system/multi-user.target.wants/nscd.service (a symlink to /dev/null, preventing nscd from starting, because it screws up and causes ssh to not exec commands, with no error message)
./etc/apache2/piki.conf ./home/httpd/htdocs/index.shtml and others for Piki's web front page (useful and served only to the local LAN)
./home/jimc/.bashrc and a lot of others from my mini homedir that I put on non-user hosts, including my home-built editor
./m1/custom/extra.sel scripts.extra restarter.conf
./root/.bashrc
./root/.ssh/authorized_keys id_rsa_functest{,.pub}
And backup.pln files that might be missed by my generic installation script.
I'm not adding /etc/krb5/krb5.keytab, post_jump should get this right.

Generating new host keys

Piki needs its own SSH host keys and SSHFP records. To generate the host keys:

ssh-keygen -t rsa -C "piki-2022" -N '' -f ./ssh_host_rsa_key -b 2048
ssh-keygen -t dsa -C "piki-2022" -N '' -f ./ssh_host_dsa_key
ssh-keygen -t ecdsa -C "piki-2022" -N '' -f ./ssh_host_ecdsa_key -b 521
ssh-keygen -t ed25519 -C "piki-2022" -N '' -f ./ssh_host_ed25519_key
I tried -A in place of the types, but it generated no keys, probably because it looked in the default location and was aware that the host already had keys.
The new SSH keys need to go in the above directory of conf files to go onto Piki.

To generate the SSHFP records:

ssh-keygen -r piki -f ./ssh_host_${TYPE}_key
(comes out on standard output.) TYPE = rsa dsa ecdsa ed25519.
Put them in hostdata.db; sign and re-publish the zone files.
Find details in Unbound, DNSSEC and SSHFP.

As part of putting the new SSHFP records in hostdata.db, Piki's IP address assignment will be revived and added to /etc/hosts, which needs to be copied into Piki's conf file directory, as well as being installed on all the other hosts.

Now that Jacinth knows Piki's IP address(es), rebuild kea-dhcp{4,6}.conf so Piki gets its proper IP address by DHCP.

Generating host keys for TLS

The various services that use TLS for transport, such as Apache2 and Postfix, need Piki's host key. Resurrect it from the CFT Certificate Authority and copy the private and public keys into the piki-conf directory in /etc/ssl/{private,hostcerts}. It should be possible to run /etc/ssl/certsetup.sh specifying an absolute path to the cert just deposited, to produce the chain files and symlinks which that script generates.

Stuffing the card

Now we're ready to copy the configuration files onto Piki's card:
mount /dev/disk/by-label/ROOT-12 /mnt
ssync -a -O /s1/holly/piki-conf/ /mnt/
umount /mnt

Fixing file ownership

I wrote the script reown that changes all files' UID and GID to match the alphabetic-numeric mapping in /etc/passwd and /etc/group that was just copied onto the card (the originals being saved). I had planned to let post_jump run it, but I got cold feet and decided to reown on the host, before booting Piki.
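
The name-based translation at the heart of reown can be illustrated with awk. This is not the reown script itself, which is not reproduced in this report; it is a sketch that only lists which numeric IDs would change, and the function name is an assumption.

```shell
# Print "old_id new_id name" for every account whose numeric ID differs
# between the pristine file ($1) and the LCM file ($2). Works for passwd
# (field 3 = UID) and equally for group (field 3 = GID).
id_remap() {
    awk -F: '
        NR == FNR { new[$1] = $3; next }   # first file read is the NEW map
        ($1 in new) && new[$1] != $3 { print $3, new[$1], $1 }
    ' "$2" "$1"
}

# Usage on the mounted card:
#   id_remap /mnt/etc/passwd.orig /mnt/etc/passwd
#   id_remap /mnt/etc/group.orig  /mnt/etc/group
```

Applying such pairs one after another (for example with find -uid and chown) would go wrong whenever one account's new number is another account's old number. Translating each file exactly once, as reown is described as doing, avoids that.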

Boot Piki

Move the SD card into Piki, and turn on power so it can boot.

An image's root partition is only slightly bigger than the minimum to hold the software. But when it is first booted, code in the initrd expands the root partition to fill the rest of the media. Booting Piki took 5 minutes; don't panic. Due to the keys in piki-conf, you should be able to connect to the machine by ssh with publickey authentication. If you do log in to the physical console, the userID is root and the password is linux. You should immediately change it to your normal root password, even though the /etc/shadow file about to be installed will also have the (hash of the) normal root password.

Local Configuration Management: Install Everything

The next step should be to run post_jump on the distro master site. Its major steps are listed here. Improvements were needed in some places; these are tagged with ** and discussed more in the next section.

Sets default command line arguments. Checks that configuration directories exist. Checks that the target is in sane hostgroups, has a FQDN in DNS, etc.
(Re)installs SSH configuration that will allow publickey authentication when the master does SSH to the target (avoiding giving the root password from here on).
Sanity checks on the target, followed by basic infrastructure setup:
- Is it in the hostgroup of its architecture? (The master site knows what architecture it is supposed to have, but has to run ssh on the target to find out what architecture and OS version it actually has.)
- Does it have the OS version for which the master is going to send config files?
- Does it know what kernel it needs?
- Does it even have a kernel? (Or did some prior step delete it?)
- Can it read the repo for that version?
- Can it do transitive authentication on the Kerberos master, to install the keytab with the host keys?
- It then updates the repo definitions in /etc/zypp/repos.d (audit-repos).
- It installs basic programs like rsync and curl, if missing.
- It makes a temporary directory (and cleans up leftovers from previous attempts).
- ** It uses my script reown to map the numeric UID-GID of installed programs to an alphabetic name as known to the system as installed, and then to map this back to numeric per the /etc/passwd and /etc/group that are about to be installed.
- The time is synced (remote-time.cgi), which is essential on a Raspberry Pi which has no realtime clock.
Local configuration directories are created. Includes creating /etc/exports for NFS.
Configuration files are copied to the target using the same procedure as for weekly dist-upgrades (for OpenSuSE Tumbleweed). Includes backing up the originals. Also sets the hostname.
Creates some per-host configuration files. Fixes permissions using chkstat. ** Fixes permissions again using my chkstat.J because chkstat refuses to fix some files with the wrong owner.
More per-host config files: /m1/custom/extra.sel and scripts.extra, if not already present.
Installs wanted but missing packages. (audit-pkgs -i)
** Dis/Enables services (audit-scripts). Verifies that it's not going to remove the wanted kernel. Then it removes unwanted packages. (audit-pkgs -e)
It does a dist-upgrade (Tumbleweed) or online update (Leap). (audit-pkgs -U or -u)
Additional setup.
- Transfer config files again if altered by new or updated packages.
- Start chrony, to sync the time professionally.
- Fix permissions again.
- Run SuSEconfig modules, e.g. font setup.
- Get rid of .rpmnew or .rpmsave files; we've already overwritten the production files.
- Check that the SSL host key is present, warning if not.
- Create or fix /var/spool/mail .
- Warn if /etc/passwd etc. have been monkeyed with.
- Check the Kerberos host key, and reinstall the keytab if missing.
- ** Run daily housekeeping, omitting very long steps.

Improvements to Local Configuration Management

Syncing Numeric UIDs and GIDs: When Linux is installed from an image or installation disc, the numeric UIDs and GIDs are determined by what was installed and their order. They won't match the UIDs and GIDs in the conf files soon to be installed. My script reown will sync them.
Really Fixing Permissions: Permissions are normally fixed by chkstat. But it has a nasty feature that it is suspicious about security implications of file and directory ownership, and it refuses to re-own some directories. I re-implemented chkstat in my script chkstat.J, minus the paranoia, and both are run after packages are installed or updated.
Dis/Enabling Services Before Removing Packages: The Raspberry Pi image uses NetworkManager, but I put Wicked on my RPi's, more practical on non-interactive machines, and NetworkManager was removed. One time there was a crash during the online update (probably because the lease expired for the IP address, and NetworkManager didn't renew it because it had been removed), and when I tried to reboot, the net would not come up, a total loss on this headless machine. So in post_jump I moved audit-scripts just before the removal step, so NetworkManager would be forgotten and Wicked would get enabled. (So why didn't I just plug in a monitor and fix it by hand? Read all about that issue in this very document.)
Daily Housekeeping: Speed of Arthritic Snail: Housekeeping scripts mostly run quickly, but two are long: mandb (index of man pages for quick finding) and generating custom Diffie-Hellman groups to resist the Logjam family of exploits. Together they take about half an hour on a RPi, which is very annoying when you're doing a development installation. Without mandb being run, the man command can fall back to searching the manpath, so mandb can be deferred until the normal daily housekeeping run at midnight. The Diffie-Hellman group files (for OpenSSL and for SSH) are installed from the master site, which is not best practice but is better than using the builtin default group, so again, generating them can be deferred. (I rebuild them monthly, independently on each host, and the master instance will be recognized as being out of date.)
Correct version of U-Boot: The image uses u-boot-rpiarm64 so it can work on the broadest range of RPi's. But mine are Raspberry Pi 3B's. I'm not sure what exactly are the differences between u-boot-rpiarm64 and u-boot-rpi3, but it seems more prudent to use the latter, being the closest match to the actual hardware. This selection is made in /m1/custom/couchnet.sel (LCM).

Problems and Fixes

nscd is poisonous

sshd got into a mode where you do ssh to Piki and it does nothing, sans error messages. I pulled Piki's plug and brought the SD card back to Iris.

piki: /ROOT is mounted on iris: /mnt
journalctl --root=/mnt -u sshd -- Uses implicit pager, probably less.
sshd.service begins by ssh-keygen -A to generate missing host keys, which fails, and sshd never starts. (So how could there be no error message, connection refused, from ssh?)
Message: sshd-gen-keys-start[1279]: No user exists for uid 0. This is while checking for missing server keys. Unit then fails, sshd never started.
Disable nscd: rm /mnt/etc/systemd/system/multi-user.target.wants/nscd.service . Then put the card back in Piki and reboot.
sshd let me on!
Disable permanently: ln -s /dev/null /s1/holly/piki-conf/etc/systemd/system/multi-user.target.wants/nscd.service
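
For comparison, systemd's own notion of masking is a symlink to /dev/null named after the unit, directly in etc/systemd/system, not in the .wants directory. The sketch below does that on a mounted card; it is not what was done above, and the function name is invented.

```shell
# Mask a unit in a mounted (offline) root tree.
# $1 = mount point of the card's root, $2 = unit name.
mask_unit_offline() {
    dir=$1/etc/systemd/system
    mkdir -p "$dir" || return 1
    # Refuse to clobber a real local unit file.
    if [ -e "$dir/$2" ] && [ ! -L "$dir/$2" ]; then
        return 1
    fi
    rm -f "$dir/multi-user.target.wants/$2"   # drop the enablement link, if any
    rm -f "$dir/$2"
    ln -s /dev/null "$dir/$2"
}

# Usage:
#   mask_unit_offline /mnt nscd.service
```

Where the host's systemctl is available, systemctl --root=/mnt mask nscd.service should give the same result.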

This problem occurred part way through post_jump, before audit-scripts was run, which would have disabled nscd. In the past I've had various mysterious problems with nscd, and while it's a great concept, I've made a policy to turn it off. But to investigate this issue I started it again by hand and ran strace.

Command line: strace -f -o nscd.str nscd -F -t 1 >& nscd.err &
It reports EACCES for /etc/group /etc/passwd /etc/nsswitch.conf /usr/etc/nsswitch.conf . (but these files exist as seen by my root session). If it can't read these files, that would explain the error message from sshd-gen-keys-start that there is no user for UID 0.
Backbone process 1555 can open /var/lib/nscd/services (rw) OK. Also /var/lib/nscd/passwd /var/lib/nscd/group /etc/nsswitch.conf (!!) /etc/passwd .
It's starting to look like an effect of containerization or systemd process hardening, where normal files in /etc are not visible in the namespace that nscd's child process locked itself into. Since I'm not going to use nscd anyway, I'm not going to try to fix the process hardening or whatever cause.
Also discovered: /etc/resolv.conf exists but is empty, and DNS names cannot be resolved. I installed Iris' /etc/resolv.conf and "host" works now. Adding /etc/resolv.conf to piki-conf.

pam_ldap requires nslcd

Another similar issue where sshd accepted a connection but did nothing; the message in the journal was:
sshd[5034]: error: PAM: pam_open_session(): Cannot make/remove an entry for the specified session
Shortening a long story: PAM login uses pam_ldap, which relies on nslcd (LDAP cache daemon), which was not running. Enabling and starting nslcd fixed this problem.

Google search for: pam_open_session "Cannot make/remove an entry for the specified session"
Forum post on StackExchange: OP MattBianco, 2014-08-26. The message means that some PAM module returned some error. Very heterogeneous causes; many modules can return many kinds of fatal errors. SELinux enforcing (and misconfigured) is a common problem causing this message.

ConsoleKit is missing

ConsoleKit used to handle giving permission for devices like the keyboard to a user when logging in. But ConsoleKit has been deprecated for a long time in favor of systemd-logind, and I added pam_systemd to my PAM scripts but never removed pam_ck_connector. It's available for x86_64, which is why I wasn't motivated to get rid of it, but it's not available for aarch64. I finally removed pam_ck_connector.so from my PAM scripts.

pam_pwquality is missing

pam_pwquality has replaced pam_cracklib on x86_64, but neither of these is available for aarch64 as far as I can see in the package searcher, nor per zypper info pam_pwquality. PAM is good about missing modules, not dying; it just means that I can't change my password on a Raspberry Pi, and there's a nonfatal error message on each login. Hiss, boo, but I think I can't or shouldn't do anything to fix this issue.

Update: it has reappeared in repo openSUSE-Tumbleweed-Oss! Installed on Piki and Holly.

texlive

texlive is soooo screwed up! Something's posttrans script complains that there are no formatting packages, only binary code. Why do we even have it on the RPi's? I did zypper --non-interactive remove --dry-run texlive, and these packages were to be removed (excluding those named texlive*): keystone-piki texinfo . Obviously texinfo is the culprit. It is a keystone package; nothing else depends on it.

I edited couchnet.sel, wanting texinfo only on hostgroup jim, specifically Jacinth and Xena (my laptop). The texlive family was wanted on hostgroup user, changed to jim also.

Package signing keys

I cleaned up the package signing keys, tossing those that did not sign any packages, and rearranging couchnet.sel to match. This was a big job, necessary for getting the installed packages right, but is off topic for this document.

Bluetooth Setup

/etc/systemd/system/bluetooth-fw.J.path waits for /dev/ttyAMA1 to appear, then starts bluetooth-fw.J.service which runs btattach on this tty. However, with the present kernel, device tree, etc. /dev/ttyAMA1 no longer appears. Even so, the HCI gets registered as /sys/class/bluetooth/hci0. From /var/log/boot.msg, we have:

hci0: BCM43430A1 'brcm/BCM43430A1.hcd' Patch (firmware loaded)
(Bluetooth supposedly comes on, on Holly, but not on Piki.)
Holly has 2147 firmware files; Piki has 447. Running comm:
Only on Piki: None.
On both: 447 (not hard to deduce).
Only on Holly: 1700, a lot of irrelevant ones.
Is bnx ours? I don't think so. raspberrypi/bootloader has a lot, but not likely relevant for Bluetooth.
An easy solution isn't going to pop out; forum scrounging needed.
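
The three counts come from comm on sorted lists of file names. A sketch of that comparison follows; the function name and the way the lists are produced are assumptions.

```shell
# Compare two lists of firmware file names (one name per line) and print the
# three counts that comm distinguishes. $1 = first list, $2 = second list.
fw_compare() {
    a=$(mktemp) b=$(mktemp)
    # comm needs both inputs sorted in the same collation order.
    LC_ALL=C sort -u "$1" > "$a"
    LC_ALL=C sort -u "$2" > "$b"
    printf 'only in first: %s\n'  "$(LC_ALL=C comm -23 "$a" "$b" | wc -l | tr -d ' ')"
    printf 'in both: %s\n'        "$(LC_ALL=C comm -12 "$a" "$b" | wc -l | tr -d ' ')"
    printf 'only in second: %s\n' "$(LC_ALL=C comm -13 "$a" "$b" | wc -l | tr -d ' ')"
    rm -f "$a" "$b"
}

# Usage, with one list per host made by something like
#   (cd /lib/firmware && find . -type f) > piki.lst
#   fw_compare piki.lst holly.lst
```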

Apple AirPods on RPi Bluetooth -- I spotted this tidbit: By default, it cannot be found by Bluetooth scanning. You need to install bluez-auto-enable-devices… I wonder if this is a BLE issue and if it is related to the issue of discovering my new Bluetooth folding keyboard.

Speed test results:

The tests are sha512 sums of big files in the buffer cache, and using dd to copy a block device to /dev/null. Scores are in kbytes/sec. Let N = the number of cores. The total score is 3/8*sha512 + N/8*sha512 + 1/2*IOspeed.

Host	Sha512	x#cores	IOspeed	Total	Rounded
holly	41905	167620	1785	37559	40000
piki	31735	126940	1889	28712	28000
diamond	201352	805408	86335	219350	220000
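
The totals in the table can be reproduced with integer arithmetic. The tabulated values come out only when the I/O speed is counted at half weight, i.e. (3*sha512 + N*sha512 + 4*IOspeed)/8. That weighting is inferred from the table's numbers, and the function name is made up for this sketch.

```shell
# Total score in kbytes/sec from the single-core sha512 rate, the number of
# cores, and the I/O rate. Integer division truncates, as in the table.
total_score() {
    sha=$1 cores=$2 io=$3
    echo $(( (3 * sha + cores * sha + 4 * io) / 8 ))
}

total_score 41905 4 1785      # holly:   37559
total_score 31735 4 1889      # piki:    28712
total_score 201352 4 86335    # diamond: 219350
```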

It's interesting that the two RPi's are not identical in CPU speed, even though they are both the same model and are running the same software. I/O speed is not supposed to influence this test, plus Piki's SD card is faster while its CPU is slower.

Diamond is currently the fastest machine in the house; an Intel NUC 11PAHi5. CPU is an Intel Core® i5-1135G7 @2.4GHz and the disc is an NVMe SSD (about 2.4e9 bytes/sec; can't tell the actual Chinese vendor).

Piki is Ready

The above list condenses 7 repetitions of installing Linux on Piki from scratch. Finally all the steps get done error-free (cross fingers), I have a backup in case graphics fixes trash the card, and Piki is now ready for interventions in the area of graphics.

Raspberry Pi Graphics (Finally)

In hindsight, I have two related issues: /dev/dri/card0 is missing, which the modesetting X-Server driver needs for communicating with the graphics processor, and the X-Server can't fall back to the EFI framebuffer because the fbdev driver is not installed. The first priority is to activate anything capable of displaying video, and then to revive /dev/dri/card0.

The first thing I did was to install xf86-video-fbdev. This produced normal graphical output on the monitor. It's not very fast, but it's a whole lot better than nothing. I installed package glmark2, a graphics benchmark for OpenGL; details below. As a positive control I ran it on Xena with Intel® UHD Graphics; the overall score was 2019, looks like the average of frames per second over 32 tasks which are tuned to all give similar speeds on particular reference hardware. On Piki, with software rendering (no acceleration), fbdev's overall score was 22. Which is usable, though you won't be playing video games on the framebuffer.

So why wasn't xf86-video-fbdev installed in the first place? Because it wasn't configured as a wanted package, and was tossed during post_jump. Not a brilliant maneuver. Fixed.

The Hardware Compatibility List page for Raspberry Pi-3, in the Troubleshooting - Graphics Acceleration section, recommends installing xf86-video-fbturbo to get the framebuffer working. (And a configuration option has to be added to load the relevant module.) I tried it and ran glmark2: the overall score was still 22 and the individual tests' scores were almost the same as with fbdev. Glmark2 may not have been using the aspects (moving and scrolling windows) that fbturbo particularly targets. However, fbturbo on Holly goes a lot faster, 137 FPS, but Holly has no monitor and has related configuration differences which likely give a speed improvement, so the worth of fbturbo is not assured. However, I kept fbturbo configured on both hosts.

Further in the Graphics Acceleration section, package Mesa-dri-vc4 was mentioned. As with fbdev, I installed it (and added it to couchnet.sel) and from /etc/X11/xorg.conf.d/20-kms.conf I removed Option "AccelMethod" "none". However, /dev/dri/card0 did not appear.

How to use the EFI framebuffer if VC4 is not delivering output: In /boot/efi/config.txt replace dtoverlay=vc4-kms-v3d,cma-default with dtoverlay=disable-vc4 . Alternatively or in parallel, add to kernel cmdline: modprobe.blacklist=vc4 (I didn't do either of these -- yet.)
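
Switching the overlay back and forth by hand invites leftovers, so the edit can be scripted. The function below is a hypothetical sketch: it drops any existing vc4-related dtoverlay line from a config.txt-style file and appends the wanted one, leaving other dtoverlay lines alone.

```shell
# Leave exactly one vc4-related dtoverlay line in a config.txt-style file.
# $1 = file (e.g. /boot/efi/extraconfig.txt), $2 = overlay (e.g. disable-vc4).
set_vc4_overlay() {
    scratch=$(mktemp) || return 1
    sed -e '/^dtoverlay=vc4-/d' -e '/^dtoverlay=disable-vc4/d' "$1" > "$scratch" || return 1
    printf 'dtoverlay=%s\n' "$2" >> "$scratch"
    # Overwrite in place so the file keeps its ownership and mode.
    cat "$scratch" > "$1" && rm -f "$scratch"
}

# Usage:
#   set_vc4_overlay /boot/efi/extraconfig.txt disable-vc4
#   set_vc4_overlay /boot/efi/extraconfig.txt vc4-kms-v3d,cma-default
```

The change only takes effect at the next boot, since the firmware reads the file before the kernel starts.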

Various Tries to Activate Graphics

HCL Recommendations

To activate graphics acceleration: (I think they're talking about text mode, either full screen or in non-graphic windows). Install package xf86-video-fbturbo and in /etc/X11/xorg.conf.d/99-fbturbo.conf put:
Section "Module"
    Load "shadow"
EndSection

For accelerated graphical graphics, install Mesa-dri-vc4 and toss Option "AccelMethod" "none" from 20-kms.conf (which I don't have). Apparently you need both interventions.

Finding the packages: xf86-video-fbturbo and Mesa-dri-vc4 are both available in the SuSE download repo, but the latter is marked experimental. Neither has any dependencies. Trouble free installation. I need to add these to couchnet.sel (if they work).

20-kms.conf is gone; no AccelMethod overrides anywhere else either. I added /etc/X11/xorg.conf.d/99-fbturbo.conf per the above instructions, but /usr/share/X11/xorg.conf.d/99-fbturbo.conf also exists and defines a device with the fbturbo driver. Starting out evaluation with both of these files active.

I restarted display-manager, which restarts the X-Server. /var/log/Xorg.0.log had these items:

Modules loaded: shadow* glx fbturbo* fbdevhw* fb (* = newly installed)
fbturbo is using the EFI framebuffer, so it says.
AIGLX: not DRI2 capable, using swrast (software rendering), better than nothing.
No sign of VC4 driver/device. Update: this is a red herring. It would have used the modesetting driver if the screen had been DRI2 capable.
Testing: no sign of HDMI output. But the X-server does start and lightdm does listen on port 5900 (VNC). I could log in to VNC and start a XFCE session. Unlike on Holly.

Digging around… (with partial success)

We have /usr/lib64/xorg/modules/drivers/ = *_drv.so where * = ati dummy fbturbo modesetting radeon. Most, except fbturbo, come from package xf86-video-*. None of these are vc4, and fbdev and vesa are also missing.

Update: The modesetting driver (which we have) is the one that should be used, but it requires DRM (direct rendering), cued by /dev/dri/card0, and will refuse to load if it's missing.
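
That dependency makes for a quick check before restarting the display manager. The sketch below is deliberately simplified: it looks only for the card0 node, whereas the X server's own autoconfiguration considers more than that. The function name is an assumption.

```shell
# Say which X driver can be expected to load, judging only by whether the
# DRM card node exists. $1 = directory to inspect (default /dev/dri).
xorg_driver_hint() {
    if [ -e "${1:-/dev/dri}/card0" ]; then
        echo modesetting
    else
        echo fbdev
    fi
}

xorg_driver_hint
```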

Sidetrack: I see /boot/efi/overlays/disable-vc4.dtbo . Digging in /boot/efi/config.txt , this is not included. Whew! And the vc4 kernel module and dependencies are loaded, see /proc/modules.

"zypper info 'xf86-video-*'" produces these values for *: amdgpu ark ati chips dummy fbdev fbturbo fbturbo-live i128 mach64 mga neomagic nouveau nv qxl r128 savage sis sisusb tdfx v4l vesa voodoo (fbturbo-live description == fbturbo, but from a git source.) None of these are vc4.

Next try: add to /boot/efi/extraconfig.txt: dtoverlay=disable-vc4 . Yay! Items that are working:

Boot messages are shown on HDMI output.
Display manager starts, greeter is there, and VNC port (5900) is open.
I can log in on the physical console (with SuSE default background).
Bluetooth is operational (despite absence of /dev/ttyAMA1).
Selen's Bluetooth keyboard/mouse emulator can connect to Piki. It can move the cursor and send keystrokes, in parallel with the physical keyboard and mouse.
Except for lack of accelerated graphics, Piki is operational.

Neatening things up for installation on Holly: in /etc/lightdm/lightdm-gtk-greeter.conf add or change to background=/m1/custom/background.jpeg

Non-RPi hosts show their custom backgrounds. When I set this up initially, Piki showed its background (the frog) but it faded into SuSE wallpaper with Geeko. Once the user (the one who previously logged in) sets his desktop background, the lightdm greeter no longer uses the Geeko background, leaving the configured custom background visible. This may or may not have to be done separately for VNC and for the physical console.

Holly (VNC) shows a black screen at 1024x768px, probably because I haven't yet done any of these mitigations on Holly.

Updating Holly to the same state as Piki

I'm not going to install from scratch; I'll do post_jump steps by hand. These snapshots were saved:

0121b/descr.txt -- Production Holly, after mega-update. Works.
0205a/descr.txt -- Piki after complete install + checkout.

Diffs between Piki (left) and Holly (right) excluding obvious ones like host keys. These descriptions were updated after several interventions, not showing the state when I first turned to this step.

Piki's /boot/efi/extraconfig.txt has dtoverlay=vc4-kms-v3d; gpu_mem=128. Holly still has dtoverlay=disable-vc4
Found in an old backup copy of extraconfig.txt:
# https://www.raspberrypi.org/documentation/configuration/config-txt/README.md
# To turn on internal audio: (Documented several places.)
dtparam=audio=on
/etc/sysconfig/network/dhcp -- almost every host is different. Needs cleanup.
/etc/sysconfig/network/ifcfg-{en0,lo} are identical (good)
/etc/systemd/system/multi-user.target.wants/nscd.service is missing from both (good). At one point I masked this by linking it to /dev/null, but that's not necessary.
/home: Permissions differ, need to fix.
/home/httpd/htdocs/ front page for Piki, Holly has one, don't mess. Need to revive Piki's front page. [Done.]
/home/jimc (mini homedir, lacks ssync, fixed) NOW all equal
/m1/custom/ extra.sel scripts.extra restarter.conf (compare by hand). All correct.
/root/.bashrc -- old version, fixed.

Key interventions to sync Holly with Piki:

/boot/efi/extraconfig.txt content: Holly still has dtoverlay=disable-vc4; keep it. But on Piki I re-enabled vc4 trying to get acceleration (DRI2) to work.
Install xf86-video-fbturbo Mesa-dri-vc4
/etc/X11/xorg.conf.d/99-fbturbo.conf (loads module shadow)
Holly had /etc/X11/xorg.conf.d/20-kms.conf which tried to use the modesetting driver (and failed). Turned off. Piki never had this.
With all of these, Holly's display manager is back! But of course without 3D graphics.
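Since a stray snippet like 20-kms.conf can silently override the intended driver, it helps to list what each config file asks for. A sketch, assuming drivers are selected by uncommented Driver "name" lines; xorg_drivers is a made-up name.

```shell
# List every active Driver line in the X server's config snippets.
# Usage: xorg_drivers [directory]   (defaults to /etc/X11/xorg.conf.d)
xorg_drivers() {
    dir="${1:-/etc/X11/xorg.conf.d}"
    for f in "$dir"/*.conf; do
        [ -f "$f" ] || continue
        # Lines starting with '#' are comments and do not match.
        sed -n 's/^[[:space:]]*Driver[[:space:]]*"\([^"]*\)".*/\1/p' "$f" |
            while read -r drv; do
                echo "$(basename "$f"): $drv"
            done
    done
}
```

It is worth running on both /etc/X11/xorg.conf.d and /usr/share/X11/xorg.conf.d, since snippets can live in either place.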

Searching for xf86-video-vc4 (and unicorns)

Google found nothing for xf86-video-vc4 with the quotes. The Gentoo wiki page about Raspberry Pi VC4 produced some useful information, including a key quote: The Raspberry Pi 3 VC4 driver is NOT available on 64bit ARM. The RPi Foundation has stated 'we are not working on this, and are unlikely to do so in the near future.' Using the open source vc4-fkms-v3d driver is recommended. (This is in extraconfig.txt.) Here's their checklist:

You need the vc4 kernel module loaded. (Got it.)
You need raspberrypi-firmware. (Got it.)
You need media-libs/mesa (what Gentoo calls it). Probably Mesa-dri-vc4 is what they're referring to. (Got it.)
In /boot/efi/extraconfig.txt you need dtoverlay=vc4-kms-v3d . And they recommend gpu_mem=128 . (Got it.)
Reboot and see what happens.
lsmod; check that vc4 is loaded. Yes it is.
Make sure that /dev/dri/card0 got created. It wasn't, hiss, boo.
Run "glxgears -info" from Gentoo x11-apps/mesa-progs. In the info it should report "GL_RENDERER = Gallium 0.4 on VC4" or similar. If it reports "llvmpipe" it's a half success. "swraster" is a failure.
If you have llvmpipe and fbturbo, remove 'Driver"fbturbo"' from /usr/share/X11/xorg.conf.d/99-fbturbo.conf (or deep-six that file).
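The mechanical parts of this checklist can be scripted. A rough sketch only: check_vc4 is a hypothetical name, the file arguments exist so it can be tried on saved copies, and the glxinfo step assumes that program (from Mesa-demos or similar) is installed and an X display is available.

```shell
# Walk the checklist: kernel module, DRM device node, GL renderer.
# Usage: check_vc4 [modules-file] [dri-node]
check_vc4() {
    modules="${1:-/proc/modules}"
    card="${2:-/dev/dri/card0}"
    status=0
    if grep -q '^vc4 ' "$modules"; then
        echo "vc4 module: loaded"
    else
        echo "vc4 module: MISSING"; status=1
    fi
    if [ -e "$card" ]; then
        echo "$card: present"
    else
        echo "$card: MISSING"; status=1
    fi
    # The renderer string tells VC4 from llvmpipe; skipped without X.
    if [ -n "${DISPLAY:-}" ] && command -v glxinfo >/dev/null 2>&1; then
        glxinfo | grep 'OpenGL renderer string' || true
    fi
    return "$status"
}
```

A nonzero return means the module or the device node is missing, in which case the renderer line is expected to say llvmpipe or another software renderer.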

Jimc finds: Red Hat has a package glx-utils that provides just glxgears and glxinfo, and another package mesa-demos (lower case) with more demos for the Mesa direct rendering libraries. Both of these are available on the SuSE Build Service as community packages, for x86_64 only. But OpenSuSE also has Mesa-demos (capitalized) that includes glxgears and glxinfo and many more, again as a community package (and I recognize the name of the SuSE developer who maintains it), for both x86_64 and aarch64 (ARM). Many but not all of the demos depend on packages glew and libGLEW2_2 or 2_1, depending. I'm installing Mesa-demos on the RPi's and using only the ones, like glxgears, that don't need libGLEW2_x. Someday I should recompile Mesa-demos from source on Tumbleweed, both architectures. But I'm not going to hold up this project to get the complete set of demos.

Normally glxgears is synced with vertical blanking, so the reported frame rate will equal the monitor's vertical refresh rate. To make it run as fast as possible, set this environment variable: vblank_mode=0 glxgears . Also useful is glxgears -info but it shows a gazillion supported extensions. Note, glxgears is not designed as a benchmark and particularly should not be used for purchasing decisions, only to see if accelerated graphics is doing anything. Frame rates:

Xena (Intel CometLake-U GT2 UHD Graphics on intel driver): 7378 FPS.
Piki (llvmpipe software rendering on fbdev driver): 15 FPS.
Holly (llvmpipe software rendering on fbturbo driver): 168 FPS.
Conclusion: at least until I get DRI working, I should use fbturbo.
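To get one number per host out of glxgears, its periodic reports can be averaged. A sketch, assuming each report line ends in "= NN.NNN FPS"; avg_fps is a made-up helper, and this is not necessarily how the figures above were obtained.

```shell
# Average the FPS figures that glxgears prints periodically.
# Reads glxgears output on stdin; lines are assumed to look like
#   301 frames in 5.0 seconds = 60.123 FPS
avg_fps() {
    awk '$NF == "FPS" { sum += $(NF-1); n++ }
         END { if (n) printf "%.0f\n", sum / n; else print "no samples" }'
}
```

For example: vblank_mode=0 timeout 30 glxgears | avg_fps, where timeout (from coreutils) stops glxgears after 30 seconds.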

Piki: Current Issues and Plans (2022-02-07)

When the greeter is showing, the screensaver runs and puts the monitor in black screen mode (if so configured) but doesn't turn it off with DPMS, despite power management configuration to do that. (Workaround: monitor's power button.) A lot of people complain about this; so far I haven't found a fix.

Re-learn how to run glmark2, and get a baseline for non-accelerated X-Windows.

Then try to activate acceleration on Piki.

Results from glmark2

Installing glmark2 on Piki: It's for OpenGL 2.0, but 3.0 is coming soon. OpenGL ES is a nonproper subset of full OpenGL intended for embedded systems (think Android), and it has a matching glmark2-es. Even though the RPi is basically a cellphone motherboard, you run a desktop OS on it and you have full OpenGL, and should test it with glmark2, not glmark2-es. glmark2 is already installed on the x86_64 hosts (except virtual ones). Its info says that it uses only ES compatible API. (So what's the difference from glmark2-es2?)

Results from glmark2 on various hosts. FPS is frames per second, reporting the slowest and fastest values with the corresponding test names. Score seems to be the average FPS over 32 tests. Output from time is also given: elapsed, user and system times in secs. The user and system times are for the client, not counting the server, and can be greater than the elapsed time because the CPU has multiple cores.

Xena, Mesa Intel(R) UHD Graphics (CML GT2)

elap 331 sec, user 100 sec, sys 38 sec, score 2019 FPS
FPS: 187 (terrain) -- 3106 (build (plaster horse))

Piki, llvmpipe (LLVM 13.0.0, 128 bits) (vc4 disabled, with fbturbo)

elap 350 sec, user 664 sec, sys 9 sec, score 22 FPS
FPS: 1 (terrain + refract (glass rabbit)) -- 43 (bump-render (asteroid))

Piki, llvmpipe (LLVM 13.0.0, 128 bits) (vc4 disabled, with fbdev)

elap 349 sec, user 646 sec, sys 9 sec, score 22 FPS
FPS: 1 (terrain + refract (glass rabbit)) -- 43 (bump-render (asteroid))
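The slowest, fastest and mean figures can be extracted from a saved glmark2 run. A sketch, assuming result lines of the form "[scene] options: FPS: N FrameTime: T ms"; glmark2_summary is a hypothetical name, and whether the mean matches the reported Score exactly is my guess, not something the program documents here.

```shell
# Summarize glmark2 output read from stdin: test count, mean FPS,
# and the slowest and fastest tests with their names.
glmark2_summary() {
    awk '
        /FPS:/ {
            # Take the number that follows the "FPS:" token.
            for (i = 1; i < NF; i++) if ($i == "FPS:") fps = $(i+1) + 0
            name = $1 " " $2
            sub(/:$/, "", name)
            if (n == 0 || fps < min) { min = fps; minname = name }
            if (n == 0 || fps > max) { max = fps; maxname = name }
            sum += fps; n++
        }
        END {
            if (n == 0) { print "no results"; exit }
            printf "tests %d, mean %.0f FPS\n", n, sum / n
            printf "slowest %d (%s)\n", min, minname
            printf "fastest %d (%s)\n", max, maxname
        }'
}
```

For example: glmark2 | tee glmark2.log | glmark2_summary keeps the full log while printing the summary.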

Reprise: Jimc's setup from 2018

Jimc's report from 2018: https://forums.raspberrypi.com/viewtopic.php?t=223592

Steps from that howto:

Edit /etc/X11/xorg.conf.d/20-kms.conf and comment out Option "AccelMethod" "none".
Install package Mesa-dri-vc4 which provides the direct rendering module for the X-Server.
In /boot/efi/extraconfig.txt you need dtoverlay=vc4-kms-v3d or dtoverlay=vc4-fkms-v3d ("fake" KMS). There are varying reports which variant works better or doesn't work at all. 2018-era forum posts suggest that fkms works better for streaming video, so I'm trying that one first.
The kernel command line needs a nonzero cma allocation. Based on another forum post I'm using cma=300M (unit of megabytes is required). Different people recommend different values. I suspect without proof that this is an upper bound; the driver is known to expand and contract video RAM dynamically. For this driver gpu_mem is irrelevant and can be left at the default (32 on SuSE, 16 on Gentoo, in megabytes).
Reboot to get the correct dtoverlay and cma value.
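After the reboot it is worth confirming that the cma setting actually reached the kernel. A minimal sketch; cma_value is a made-up name, and if the option is repeated it simply prints the last occurrence.

```shell
# Print the cma= value from a kernel command line.
# Usage: cma_value [cmdline-file]   (defaults to /proc/cmdline)
cma_value() {
    # One option per line, then keep what follows "cma=".
    tr ' ' '\n' < "${1:-/proc/cmdline}" | sed -n 's/^cma=//p' | tail -n 1
}
```

Empty output means the option is absent. On OpenSuSE the setting would typically go into GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by grub2-mkconfig -o /boot/grub2/grub.cfg, but treat that as an assumption about the local setup.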

Doing this on Piki:

Do I have the needed device tree overlay installed? Yes, /boot/efi/overlays/vc4-fkms-v3d.dtbo is a copy of /boot/vc/overlays/vc4-fkms-v3d.dtbo which is owned by raspberrypi-firmware-dt-2022.01.19-1.1.noarch
Mesa-dri-vc4 version 21.3.6-301.1.aarch64 is installed, providing /usr/lib64/dri/vc4_dri.so , the direct rendering module.
About /etc/X11/xorg.conf.d/20-kms.conf: It currently (2022) isn't part of package Mesa-dri-vc4, identified as Eric Anholt's driver. It creates a Device using the modesetting driver. The 2018 version had 'Option "AccelMethod" "none"' which you need to remove to turn on 3D acceleration, but in the absence of this file, nothing needs to be removed.
xf86-video-fbdev and fbturbo are installed but fbturbo is not configured for use.
/boot/efi/extraconfig.txt has dtoverlay=vc4-fkms-v3d
/boot/grub2/grub.cfg and /etc/default/grub in the Linux command line include cma=300M, as does /proc/cmdline.
Rebooting (140 sec) and checking out results:
- No HDMI output to the physical monitor.
- The X-Server is running. lightdm has a greeter on the main display, and is listening on VNC (5900/tcp); it starts another greeter if you connect to the port (vncviewer piki) and the expected graphical content is seen.
- /var/log/Xorg.0.log reports: modesetting and fbdev are autoconfigured (no fbturbo). /dev/dri/card0 is absent and modesetting gets unloaded. fbdev gets initialized normally. DRISWRAST GL provider is initted (software rendering).
- /var/log/boot.msg chatter about vc4: It was bound as input{0,1,3,4,5} (not sure what these do) but drm:vc4_vec_bind failed to get clock: -2 with the result that vc4-drm: probe of soc:gpu failed with error -2. Very likely this is the major culprit.
- Kernel modules loaded per /proc/modules: vc4, vchiq, bcm2835_mmal_vchiq, drm, and other dependencies.
- Forum posts seem to blame bugs in kernel modules, not user setup.
I'm re-enabling fbturbo on Piki.
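The vc4 failure messages are easy to lose in the boot chatter, so a filter helps when repeating this experiment. A sketch, assuming the failures mention both vc4 and "fail" or "error" on one line, as the messages quoted above do; vc4_errors is a hypothetical name.

```shell
# Show vc4-related failures from a saved boot log.
# Usage: vc4_errors [logfile]   (defaults to /var/log/boot.msg)
vc4_errors() {
    # Case-insensitive, so "*ERROR*" and "failed with error" both match.
    grep -iE 'vc4.*(fail|error)' "${1:-/var/log/boot.msg}"
}
```

It also works on a file saved from dmesg. No output (and a nonzero exit status from grep) means no matching failure was logged.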

Trying Out Raspberry Pi OS (RPi-OS)

What I should have done earlier is, put the RPi-OS image on a card and see whether it does 3D acceleration, and if so, what they do right that I'm not doing.

RPi-OS download page. There are about 6 versions; I want Raspberry Pi OS with desktop (64bit), Size: 1.14GB compressed, 4.16GB uncompressed. They include the SHA256 hash as text.
Copy to the SD card (19.7MB/sec):
unzip -p $file.zip | dd bs=4M of=/dev/mmcblk0 iflag=fullblock oflag=direct status=progress
On the first boot it's slow because it's resizing the root partition.
It came up with (correct) IPv4+6 addresses and DNS, courtesy of DHCP and router advertisements.
I'm using it exactly as installed, just with my locale and user password per the provided setup app. Out of the box, 3D acceleration is not enabled.
Chromium is the official web browser. I wanted to install glmark2 and/or Mesa-demos with glxgears etc, but could not find them. There's a Snapcraft package called glmark2-example, some kind of IoT demo so it says, but the description is skimpy and I was hesitant to install it without knowing more about it.
To test 3D acceleration I played several of my video test files using VLC. Without 3D acceleration, all of them used 200% to 300% CPU (multiple cores). None of them could keep up with the 25FPS frame rate even though the VLC window was small, probably 640x480px. Originals were 1080p (1920x1080px).
Checking in /var/log/Xorg.0.log: /dev/dri/card0 was opened and modesetting was the driver used. Glamor was disabled. AIGLX says: Screen 0 is not DRI2 capable, using swrast.
I tried raspi-config. The GUI version doesn't have the Advanced tab that changes the device tree overlay; I used the ncurses one, just raspi-config with no command line arguments, in an Xterm. Legacy means vc4-disable, the installation default. Full KMS is what you want; they've deprecated fake KMS. You also have to tell it to activate 3D acceleration (i.e. remove Option AccelMethod None).
Repeating the video test files on VLC: it was definitely better. CPU loads were 100% to 150% and a lot fewer frames were delayed. But it's not what you would expect on a $700 modern laptop: on Xena doing the same tests, CPU never got over 25% and performance was totally smooth.
Conclusion: It's likely that these tests showed the best the Raspberry Pi 3B can do, and the result is not good enough to make it worthwhile to do a giant project to get DRI working on the RPi 3B. Probably a RPi 4 would be worth it. So I'm closing the 3D acceleration part of this project.
Reverting (on SuSE) to dtoverlay=disable-vc4, with the fbturbo driver.