How to fix ZFS pool not importing at boot
Issue description
You are running a Linux-based machine with ZFS on Linux installed. Everything seems to work correctly, but after restarting your machine, the ZFS pool is not visible. You can still import your pool manually using zpool import poolname or zpool import -a.
In my situation, this issue does not occur with Debian 10 on an x86-based machine, but it does occur on a Raspberry Pi 4 running Ubuntu Server 20.04.1 LTS.
Setup:
- Raspberry Pi 4 (4GB)
- Ubuntu Server 20.04.1
- 32GB microSD card for root filesystem
- 2x 8TB WD Elements external USB 3.0 drives (ZFS mirror)
Potential causes and solutions
When I looked this issue up, I found many potential causes and suggested solutions:
- ensure that all ZFS related services are enabled
- add a startup delay in ZFS related configuration files
- just import the pools (duh)
- recreate the ZFS cache file
In my specific scenario, none of these seemed to do the trick.
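For reference, the first and last suggestions usually boil down to commands like the ones below. This is only a sketch: the function names are my own, the unit names are the ones shipped by the ZFS on Linux packages I am aware of (yours may differ), and tanker is the pool name used later in this post.

```shell
#!/bin/sh
# Sketch of two commonly suggested fixes. Run the functions as root.
# POOL defaults to "tanker", the pool name used in this post.
POOL="${POOL:-tanker}"

enable_zfs_units() {
    # Enable the units involved in importing and mounting pools at boot.
    # Assumption: these unit names match what your distribution ships.
    systemctl enable zfs-import-cache.service zfs-mount.service zfs-import.target zfs.target
}

recreate_cachefile() {
    # Setting the cachefile property makes ZFS rewrite the cache file
    # that zfs-import-cache.service reads at boot.
    zpool set cachefile=/etc/zfs/zpool.cache "$POOL"
}

# Usage (as root):
#   enable_zfs_units
#   recreate_cachefile
```

If both commands succeed and the pool still does not show up after a reboot, the problem is likely elsewhere, as it was for me.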
Investigation
The ZFS pools should be automatically imported on boot by a service named zfs-import-cache.service. The cache file it reads (/etc/zfs/zpool.cache by default) should contain information about the imported pools.
Once you have started your system and don’t see any ZFS pools available, try running systemctl status zfs-import-cache.service or journalctl -u zfs-import-cache.service. This should show you whether the automatic import succeeded.
The pool tanker is a ZFS mirror backed by two 8TB WD Elements external hard drives. They are big, loud and need some time to spin up the platters before you can actually read data off of them. At this point in the investigation, a good friend recommended that I check which disks the system recognized at the moment the import service was starting up.
For that, you need to modify the systemd service by running systemctl edit zfs-import-cache.service and putting these contents in there:
[Service]
ExecStartPre=/usr/bin/lsblk
This addition will run the lsblk command before the command specified in the systemd service is actually executed. The output of this command will be visible in the service logs. After making this addition, reboot your system and run journalctl -u zfs-import-cache.service again. You should be able to see all the drives the system knew about at that point.
In my case, the system recognized the microSD card and an SSD that was connected over USB, but not the hard drives. This was a very clear indication that the system was not waiting until the drives had spun up before attempting to import the ZFS pool.
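systemctl edit stores what you type in a drop-in file, by default /etc/systemd/system/zfs-import-cache.service.d/override.conf. If you would rather script this than use the interactive editor, something like the following should be equivalent. It is a minimal sketch: write_dropin is a helper name I made up, and it overwrites any existing override.conf in the target directory.

```shell
#!/bin/sh
# write_dropin DIR COMMAND
# Write a systemd drop-in into DIR that runs COMMAND before the unit's
# own ExecStart. Overwrites DIR/override.conf if it already exists.
write_dropin() {
    mkdir -p "$1" &&
    printf '[Service]\nExecStartPre=%s\n' "$2" > "$1/override.conf"
}

# Usage (as root), followed by a reload so systemd picks up the change:
#   write_dropin /etc/systemd/system/zfs-import-cache.service.d /usr/bin/lsblk
#   systemctl daemon-reload
```

After a reboot, journalctl -b -u zfs-import-cache.service limits the log output to the current boot, which makes the lsblk output easier to find.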
Hacky solution
To resolve this issue quickly and move on to more interesting things, you can reuse the trick from the troubleshooting step: instead of listing the connected drives, make the systemd service delay its execution by putting a sleep statement in there.
Edit the systemd service again with systemctl edit zfs-import-cache.service and replace the contents with this:
[Service]
ExecStartPre=/usr/bin/sleep 15
This will make the service sleep for 15 seconds before importing the ZFS pool(s). It could be 10, or 30, or whatever value works for you. I chose 15 seconds because it was more than enough time for my disks to spin up.
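A fixed sleep always costs the full delay, even when the drives are ready sooner. One possible refinement, which I have not tested on the Pi, is to poll until the device nodes show up and give up after a timeout. The sketch below assumes that the appearance of the device node is a good enough signal that the drive is usable; wait_for_paths and the paths in the usage comment are hypothetical placeholders.

```shell
#!/bin/sh
# wait_for_paths TIMEOUT PATH...
# Poll once per second until every PATH exists, or fail once TIMEOUT
# seconds have passed. Returns 0 when all paths exist, 1 on timeout.
wait_for_paths() {
    timeout="$1"
    shift
    elapsed=0
    while :; do
        missing=0
        for p in "$@"; do
            [ -e "$p" ] || missing=1
        done
        [ "$missing" -eq 0 ] && return 0
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
}

# Usage inside a script of your own (placeholder device names):
#   wait_for_paths 30 /dev/disk/by-id/YOUR-FIRST-DISK /dev/disk/by-id/YOUR-SECOND-DISK
```

You would save this as a script (say, a hypothetical /usr/local/bin/wait-for-disks that calls wait_for_paths with your own device paths) and point ExecStartPre at it instead of sleep. Note that a failing ExecStartPre command makes the whole service fail, unless the path is prefixed with a minus sign, which tells systemd to ignore the failure.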
Proper solution (probably)
There are probably better solutions, such as messing around with udev rules or enforcing the delay somewhere else in the ZFS pool import chain of operations, but I have not investigated them at this point. Feel free to contact me with a better solution, and I will update the post accordingly.