FS#62288 - [util-linux] fstrim.timer not working because fstrim returns non-zero status on success.

Attached to Project: Arch Linux
Opened by Barafu Albino Cheetah (Barafu_Albino_Cheetah) - Tuesday, 09 April 2019, 18:40 GMT
Last edited by Toolybird (Toolybird) - Sunday, 11 June 2023, 07:41 GMT
Task Type: Bug Report
Category: Packages: Core
Status: Closed
Assigned To: Christian Hesse (eworm)
Architecture: All
Severity: Medium
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 2
Private: No

Details

Description: fstrim is part of the core/util-linux package and ships with a preconfigured systemd .service and .timer. However, this timer may be unusable if the system has an ntfs-3g mount. Other filesystems may be affected as well.

When fstrim is run on all filesystems (fstrim -Av), it queries every filesystem to find out whether it supports TRIM. Filesystems without TRIM support are silently ignored. But some filesystems (ntfs-3g) do not even support being queried about TRIM support. In this case fstrim moves on to the next filesystem, but in the end exits with status code 64, despite having done its job properly.
The provided systemd unit does not account for this case and reports a failure. This makes it impossible to use the fstrim timer on such a machine.

Even worse, if an NTFS USB stick is inserted later and the weekly fstrim run is triggered by the timer, the timer gets deactivated. A GUI user will not even know about this unless they regularly check the systemd logs.


Some logs:

barafu@Shamaniak ~ $ sudo /sbin/fstrim -Av
[sudo] password for barafu:
/storage/deep: 1.5 TiB (1628524642304 bytes) trimmed on /dev/sdd1
fstrim: /storage/windows: FITRIM ioctl failed: Bad file descriptor
/storage/btrfs-root: 57.2 GiB (61409009664 bytes) trimmed on /dev/sdb3
/efi: 87.4 MiB (91637248 bytes) trimmed on /dev/sdb2
/home: 57.2 GiB (61408309248 bytes) trimmed on /dev/sdb3
/: 57.2 GiB (61409284096 bytes) trimmed on /dev/sdb3


barafu@Shamaniak ~ $ systemctl status fstrim.service
● fstrim.service - Discard unused blocks on filesystems from /etc/fstab
Loaded: loaded (/etc/systemd/system/fstrim.service; static; vendor preset: disabled)
Active: failed (Result: exit-code) since Tue 2019-04-09 20:58:38 MSK; 8s ago
Docs: man:fstrim(8)
Process: 3579 ExecStart=/sbin/fstrim -Av (code=exited, status=64)
Main PID: 3579 (code=exited, status=64)

Apr 09 20:57:59 Shamaniak systemd[1]: Starting Discard unused blocks on filesystems from /etc/fstab...
Apr 09 20:58:02 Shamaniak fstrim[3579]: fstrim: /storage/windows: FITRIM ioctl failed: Bad file descriptor
Apr 09 20:58:38 Shamaniak fstrim[3579]: /storage/deep: 1.5 TiB (1628524642304 bytes) trimmed on /dev/sdd1
Apr 09 20:58:38 Shamaniak fstrim[3579]: /storage/btrfs-root: 57.2 GiB (61404303360 bytes) trimmed on /dev/sdb3
Apr 09 20:58:38 Shamaniak fstrim[3579]: /efi: 87.4 MiB (91637248 bytes) trimmed on /dev/sdb2
Apr 09 20:58:38 Shamaniak fstrim[3579]: /home: 57.2 GiB (61404299264 bytes) trimmed on /dev/sdb3
Apr 09 20:58:38 Shamaniak fstrim[3579]: /: 57.2 GiB (61392850944 bytes) trimmed on /dev/sdb3
Apr 09 20:58:38 Shamaniak systemd[1]: fstrim.service: Main process exited, code=exited, status=64/USAGE
Apr 09 20:58:38 Shamaniak systemd[1]: fstrim.service: Failed with result 'exit-code'.
Apr 09 20:58:38 Shamaniak systemd[1]: Failed to start Discard unused blocks on filesystems from /etc/fstab.


Steps to reproduce:
1. Install util-linux
2. Mount an NTFS partition with ntfs-3g
3. systemctl enable fstrim.timer

Workaround:
In fstrim.service, replace "ExecStart=/sbin/fstrim -Av" with "ExecStart=-/sbin/fstrim -Av"
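
The same workaround can be applied as a drop-in instead of editing the unit file itself. The following is only a sketch: the helper name write_fstrim_dropin and the file name override.conf are made up for illustration, and the default directory assumes the usual /etc/systemd/system/<unit>.d drop-in location.

```shell
# Hypothetical helper: write a drop-in that applies the workaround without
# touching the packaged fstrim.service. The target directory can be passed
# as $1; the default below is the usual drop-in location (an assumption).
write_fstrim_dropin() {
    dir="${1:-/etc/systemd/system/fstrim.service.d}"
    mkdir -p "$dir" || return 1
    cat > "$dir/override.conf" <<'EOF'
[Service]
# An empty ExecStart= clears the packaged command; without it a oneshot
# unit would run both the old and the new command.
ExecStart=
# The leading "-" makes systemd ignore a non-zero exit status.
ExecStart=-/sbin/fstrim -Av
EOF
}
```

Run it as root and follow it with "systemctl daemon-reload". Note that the "-" prefix hides every non-zero status, not only 64; the SuccessExitStatus=64 setting in [Service] would be a narrower alternative.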

Closed by  Toolybird (Toolybird)
Sunday, 11 June 2023, 07:41 GMT
Reason for closing:  Not a bug
Additional comments about closing:  See comments
Comment by Dave Reisner (falconindy) - Tuesday, 09 April 2019, 18:42 GMT
From your log:

Apr 09 20:58:02 Shamaniak fstrim[3579]: fstrim: /storage/windows: FITRIM ioctl failed: Bad file descriptor

From the fstrim manpage:

64 some filesystem discards have succeeded, some failed

Looks like you have a problem to fix.
Comment by Barafu Albino Cheetah (Barafu_Albino_Cheetah) - Tuesday, 09 April 2019, 18:52 GMT
This is exactly what I mean. fstrim does not support ntfs-3g and returns an error code when it encounters it, but it still works on all other drives. This should not be considered a failure of "fstrim -Av", because all OTHER filesystems are trimmed properly and no one expects it to trim a Windows partition.
In the current state of things, users with a Windows partition mounted cannot use the fstrim timer at all.
P.S. My Windows mount works fine.
Comment by Barafu Albino Cheetah (Barafu_Albino_Cheetah) - Tuesday, 09 April 2019, 18:56 GMT
Of course, the proper solution would be to fix fstrim itself so that it does not produce the error code on NTFS partitions. But I think editing the .service file is more realistic. It is important because it affects all dual-boot users who have SSDs and have their Windows filesystems mounted.

I was wrong about USB sticks, though. I did some tests, and it seems the device needs to be an SSD for the problem to appear.
Comment by Dave Reisner (falconindy) - Wednesday, 10 April 2019, 16:40 GMT
Wouldn't it be better to fix fstrim itself such that -A silently skips over filesystems that don't support trim?
Comment by Barafu Albino Cheetah (Barafu_Albino_Cheetah) - Thursday, 11 April 2019, 14:28 GMT
fstrim considers it a feature, not a bug. -A does silently skip filesystems that do not support trim. Code 64 is returned when fstrim was unable to determine whether the filesystem has trim support at all. This is documented behavior. I think returning a non-zero code is the right decision in this case, because it is something that may need the user's attention. But it is a warning, not an error, because fstrim still carries out its main function. There is just no concept of a warning return code. It is nice to log a warning, but I suggest that the systemd service should not enter a failed state over it. fstrim provides no visible results of its work, and users will not even notice that their weekly fstrim has turned itself off. Ordinary users do not read logs unless something is broken.
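
To illustrate the "warning, not an error" distinction, here is a minimal sketch of how a wrapper could classify the exit status. It is not part of util-linux: the function name fstrim_status_level is hypothetical, and the status values are the ones listed in fstrim(8) for -a/-A (0 all succeeded, 32 all failed, 64 some succeeded and some failed).

```shell
# Hypothetical helper: map an fstrim exit status to a severity level.
fstrim_status_level() {
    case "$1" in
        0)  echo "ok" ;;        # every filesystem was trimmed
        64) echo "warning" ;;   # some filesystems trimmed, some failed
        *)  echo "error" ;;     # 1 (failure), 32 (all failed), anything else
    esac
}
```

A wrapper script could run "/sbin/fstrim -Av", pass "$?" to this function, log the warning case and exit non-zero only for "error".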
Comment by Dave Reisner (falconindy) - Thursday, 11 April 2019, 16:21 GMT
The manpage states: "Errors from filesystems that do not support the discard operation are silently ignored.", but that doesn't appear to be what's happening here. The code checks for EOPNOTSUPP or ENOTTY as "unsupported", but ntfs seems to return EBADF. Either NTFS needs fixing to return a better error code, or fstrim needs fixing to include EBADF as "not supported" so that the filesystem is silently skipped.
Comment by Juan Simón (j1simon) - Monday, 07 October 2019, 07:33 GMT
I have a similar problem. In my case fstrim skips the NTFS drive (it is mounted as ntfs, not as ntfs-3g), but it fails when it runs trim on the root filesystem.

My fstab:
# /dev/sda2
UUID=6b9723a9-cbc4-4cd9-8487-d3e8125f75c2 / f2fs defaults,lazytime,nodiscard 0 0
# /dev/sda1
UUID=9F54-D6D1 /boot vfat defaults,lazytime,errors=remount-ro 0 2
# /dev/sdb1
UUID=7B2F6F666368F43F /media/Comics ntfs defaults,lazytime,nofail,uid=1000,gid=1000,blksize=4096 0 0
....

/dev/sda is an internal SSD.
/dev/sdb is a Samsung Portable SSD T3 connected by USB.

If I run the same command as in the fstrim.service file, it doesn't fail:

$ sudo /sbin/fstrim --fstab --verbose --quiet
/boot: 201 MiB (210707456 bytes) trimmed on /dev/sda1
/: 12.3 MiB (12935168 bytes) trimmed on /dev/sda2

But when the service file is executed:

$ systemctl start fstrim.service
Job for fstrim.service failed because the control process exited with error code.
See "systemctl status fstrim.service" and "journalctl -xe" for details.

$ systemctl status fstrim.service
● fstrim.service - Discard unused blocks on filesystems from /etc/fstab
Loaded: loaded (/usr/lib/systemd/system/fstrim.service; static; vendor preset: disabled)
Active: failed (Result: exit-code) since Mon 2019-10-07 09:19:38 CEST; 2s ago
Docs: man:fstrim(8)
Process: 53623 ExecStart=/sbin/fstrim --fstab --verbose --quiet (code=exited, status=64)
Main PID: 53623 (code=exited, status=64)

oct 07 09:19:38 juan-pc systemd[1]: Starting Discard unused blocks on filesystems from /etc/fstab...
oct 07 09:19:38 juan-pc fstrim[53623]: fstrim: /: FITRIM ioctl failed: Read-only file system
oct 07 09:19:38 juan-pc fstrim[53623]: /boot: 201 MiB (210707456 bytes) trimmed in /dev/sda1
oct 07 09:19:38 juan-pc systemd[1]: fstrim.service: Main process exited, code=exited, status=64/USAGE
oct 07 09:19:38 juan-pc systemd[1]: fstrim.service: Failed with result 'exit-code'.
oct 07 09:19:38 juan-pc systemd[1]: Failed to start Discard unused blocks on filesystems from /etc/fstab.

Why does it work well when I run it directly from the console but it fails when I run the service file?
Why are the "--verbose" and "--quiet" options set in fstrim.service? Aren't they contradictory?

$ cat /usr/lib/systemd/system/fstrim.service
[Unit]
Description=Discard unused blocks on filesystems from /etc/fstab
Documentation=man:fstrim(8)

[Service]
Type=oneshot
ExecStart=/sbin/fstrim --fstab --verbose --quiet
ProtectSystem=strict
ProtectHome=yes
PrivateDevices=no
PrivateNetwork=yes
PrivateUsers=no
ProtectKernelTunables=yes
ProtectKernelModules=yes
ProtectControlGroups=yes
MemoryDenyWriteExecute=yes
SystemCallFilter=@default @file-system @basic-io @system-service

Comment by Nicolas Dhouailly (Nico60) - Sunday, 13 October 2019, 11:02 GMT
@j1simon, try adding ReadWritePaths=/media/Comics under [Service] in fstrim.service to solve your issue.
Comment by Juan Simón (j1simon) - Sunday, 13 October 2019, 11:33 GMT
I don't know if I should keep writing in this task, because I changed the filesystem of the external USB disk from NTFS to F2FS.
But the strange thing is that fstrim now ignores that disk.
lsblk says this disk (/dev/sdc) doesn't support TRIM:

$ lsblk --discard
NAME   DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
loop0         0        4K       4G         0
loop1         0        4K       4G         0
sda           0        4K       2G         0
├─sda1        0        4K       2G         0
└─sda2        0        4K       2G         0
sdb           0        0B       0B         0
└─sdb1        0        0B       0B         0
sdc           0        0B       0B         0
└─sdc1        0        0B       0B         0

But hdparm does:

$ sudo hdparm -I /dev/sdc | grep TRIM
* Data Set Management TRIM supported (limit 8 blocks)

Who's right?
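
As far as I understand, the two tools look at different things: the DISC-GRAN and DISC-MAX columns of "lsblk --discard" come from what the kernel exposes in sysfs (queue/discard_granularity and queue/discard_max_bytes), while "hdparm -I" prints what the drive itself advertises. A minimal sketch for checking the kernel's view directly; the function name discard_max_bytes and the optional sysfs-root argument are made up for illustration:

```shell
# Hypothetical helper: print the kernel's discard_max_bytes for a block
# device, e.g. "discard_max_bytes sdc". The optional second argument
# overrides the sysfs root (default /sys), which is handy for dry runs.
discard_max_bytes() {
    dev="$1"
    sysfs_root="${2:-/sys}"
    cat "$sysfs_root/block/$dev/queue/discard_max_bytes"
}
```

A zero there generally indicates that the kernel does not consider the device discard-capable, which would match fstrim ignoring the disk regardless of what hdparm reports.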
----------------------

On the other hand, I think this service file is wrong. If I execute "/sbin/fstrim --fstab" directly, it works (although it still ignores the external drive), but it fails when I run the .service file.

$ sudo fstrim --fstab --verbose
/boot: 218,5 MiB (229093376 bytes) trimmed in /dev/sda1
/: 0 B (0 bytes) trimmed in /dev/sda2

$ systemctl status fstrim.service
* fstrim.service - Discard unused blocks on filesystems from /etc/fstab
Loaded: loaded (/etc/systemd/system/fstrim.service; static; vendor preset: disabled)
Active: failed (Result: exit-code) since Sun 2019-10-13 13:22:03 CEST; 3s ago
Docs: man:fstrim(8)
Process: 67015 ExecStart=/sbin/fstrim --fstab --verbose --quiet (code=exited, status=64)
Main PID: 67015 (code=exited, status=64)

Oct 13 13:22:03 juan-pc systemd[1]: Starting Discard unused blocks on filesystems from /etc/fstab...
Oct 13 13:22:03 juan-pc fstrim[67015]: fstrim: /: FITRIM ioctl failed: Read-only file system
Oct 13 13:22:03 juan-pc fstrim[67015]: /boot: 218,5 MiB (229093376 bytes) trimmed on /dev/sda1
Oct 13 13:22:03 juan-pc systemd[1]: fstrim.service: Main process exited, code=exited, status=64/USAGE
Oct 13 13:22:03 juan-pc systemd[1]: fstrim.service: Failed with result 'exit-code'.
Oct 13 13:22:03 juan-pc systemd[1]: Failed to start Discard unused blocks on filesystems from /etc/fstab.

The problem, in my case, is "ProtectSystem=strict" (https://www.freedesktop.org/software/systemd/man/systemd.exec.html#ProtectSystem=): "If set to "strict" the entire file system hierarchy is mounted read-only, except for the API file system subtrees /dev, /proc and /sys..."
I've commented out the line "ProtectSystem=strict" in the service file and now it works well, but it still ignores the external drive.
Comment by oud54036@zzrgg.com (soredake) - Sunday, 29 March 2020, 11:17 GMT
Any progress on this?
Comment by Barafu Albino Cheetah (Barafu_Albino_Cheetah) - Thursday, 10 September 2020, 12:36 GMT
I cannot provide more info, because fstrim stopped complaining about the drive after I added it to a RAID. I know there were no fixes for this, so I guess I shouldn't close it.
Comment by Juan Simón (j1simon) - Thursday, 10 September 2020, 13:24 GMT
For me it works well now: "fstrim from util-linux 2.36"
Comment by Bitwave (bitwave) - Wednesday, 24 May 2023, 18:23 GMT
This sounds similar to a problem with udisks2 and ntfs-3g: https://github.com/util-linux/util-linux/issues/2267
Comment by Toolybird (Toolybird) - Sunday, 11 June 2023, 07:41 GMT
These days filesystems can be skipped with config e.g. [1]

[1] https://github.com/util-linux/util-linux/issues/2040
