FS#75937 - divide error: 0000 btrfs_qgroup_reserve_data

Attached to Project: Arch Linux
Opened by MarkW (moriartynz) - Sunday, 18 September 2022, 19:31 GMT
Last edited by Toolybird (Toolybird) - Saturday, 24 September 2022, 23:24 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To No-one
Architecture x86_64
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:
Core dumps occur when trying to copy a file to an external SMR HDD formatted with btrfs.

The external drives have been working well up to now. It is only over the past couple of days that problems have arisen. The issue is repeatable, even after rebooting. Both external drives have the same issue. The internal drives are formatted with ext4 and do not exhibit issues.

Additional info:
* package version(s)
Kernel: 5.19.7-arch1-1
* config and/or log files etc.
Attached kernel messages from journal in kernel.txt
Attached core dumped message from journal in core_dumped.txt
Attached core dump (zst)
* link to upstream bug report, if any

Steps to reproduce:
sudo mkdir /mnt/backup-drive /mnt/backup-drive2
sudo mount /dev/sde1 /mnt/backup-drive
sudo mount /dev/sdf1 /mnt/backup-drive2
sudo cp /var/backup/full/data/202209* /mnt/backup-drive/

The steps above are specific to the machine in question. However, each file copied is ~1G and the target drives are not full.
df -h yields:
Filesystem Size Used Avail Use% Mounted on
...
/dev/sde1 3.7T 2.4T 1.3T 66% /mnt/backup-drive
/dev/sdf1 3.7T 2.8T 909G 76% /mnt/backup-drive2

Closed by  Toolybird (Toolybird)
Saturday, 24 September 2022, 23:24 GMT
Reason for closing:  Not a bug
Additional comments about closing:  See comments
Comment by Toolybird (Toolybird) - Sunday, 18 September 2022, 22:17 GMT
So IIUC the kernel glitches out which then causes sudo to core dump? If true, then the kernel is of course the root cause and we can excuse sudo for crashing. Do earlier kernels work? i.e. is this a regression? If so, then git bisection is an option to track down the offending commit [1][2].

Either way, seeing as you can reproduce it, it's probably best if you report this upstream to the kernel btrfs maintainers.

[1] https://wiki.archlinux.org/title/Kernel#Debugging_regressions
[2] https://wiki.archlinux.org/title/Bisecting_bugs_with_Git
Comment by MarkW (moriartynz) - Saturday, 24 September 2022, 03:38 GMT
I finally got to the bottom of this issue. The UUIDs of the BTRFS volumes are identical on the two drives. I started to suspect this when attaching only one of the drives worked, but only if done on a clean boot. If the other drive is attached at any stage thereafter, it causes problems. Something is being cached by the kernel driver, and it really gets its knickers in a twist if another, different drive with the same UUID gets connected. I haven't got to the bottom of how the BTRFS volumes ended up with the same UUID, but that's a human discussion!
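
For anyone hitting the same symptom, a minimal sketch of how the duplicate could be spotted. The helper name duplicate_uuids is made up for illustration; the blkid call in the comment is the assumed way to feed it and needs root to probe devices. Note that the member devices of a single multi-device btrfs filesystem legitimately report the same UUID, so a hit is only suspicious when the drives are meant to be separate filesystems, as they are here.

```shell
#!/bin/sh
# duplicate_uuids (hypothetical helper): reads one UUID per line on stdin
# and prints each UUID that occurs more than once.
duplicate_uuids() {
    sort | uniq -d
}

# Assumed usage on a real system, with both drives attached:
#   blkid -t TYPE=btrfs -s UUID -o value | duplicate_uuids
# Any line printed is a UUID shared by two or more btrfs devices.
```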
Comment by Toolybird (Toolybird) - Saturday, 24 September 2022, 23:24 GMT
Thanks for reporting back and nice troubleshooting. After a quick search online, it appears that identical UUIDs are an unsupported btrfs config [1]. Hopefully you can remediate it.

[1] https://unix.stackexchange.com/questions/612486/
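
One possible remediation, sketched under assumptions rather than tested on the reporter's drives: btrfs-progs ships btrfstune, whose -u option rewrites the filesystem's fsid to a new random UUID. It must be run on an unmounted filesystem and can take a long time on a large drive. The helper below only prints the commands so they can be reviewed first; new_fsid_plan is a hypothetical name, and the device and mount point are taken from the steps to reproduce. It is probably safest to run the real commands with only the affected drive attached and a backup at hand.

```shell
#!/bin/sh
# new_fsid_plan (hypothetical helper): print, but do not run, the commands
# that would give one btrfs device a new random filesystem UUID.
new_fsid_plan() {
    dev=$1   # device whose UUID should change, e.g. /dev/sdf1
    mnt=$2   # where it is currently mounted, e.g. /mnt/backup-drive2
    echo "umount $mnt"                  # btrfstune needs the fs unmounted
    echo "btrfstune -u $dev"            # rewrite the fsid with a random UUID
    echo "blkid -s UUID -o value $dev"  # confirm the UUID has changed
}

new_fsid_plan /dev/sdf1 /mnt/backup-drive2
```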
