FS#78646 - [linux] 6.3.4-zen1-1-zen is susceptible to XFS metadata corruption bug

Attached to Project: Arch Linux
Opened by Pawel Kraszewski (PKraszewski) - Monday, 29 May 2023, 16:38 GMT
Last edited by Toolybird (Toolybird) - Monday, 29 May 2023, 22:00 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To No-one
Architecture All
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:

One of my Arch deployments was just hit by the XFS metadata corruption bug mentioned in recent Phoronix articles:

* https://www.phoronix.com/news/Linux-6.3-XFS-Metadata-Corrupt
* https://www.phoronix.com/news/XFS-Patch-For-Linux-6.3


Sibling error report: https://bugzilla.redhat.com/show_bug.cgi?id=2208553

Additional info:
* package version: 6.3.4-zen1-1-zen

* One machine with no damage (so far), switched to linux-lts until a fix is released
* One machine with at least one partition damaged; it is currently running xfs_repair under linux-lts. I'll report the extent of the damage tomorrow.

Steps to reproduce:
* Have an up-to-date core/linux package.
* Perform some writes to an XFS partition.
* The kernel log starts to report "Metadata corruption detected" and writes start to fail.
* On reboot, the partition is in an unmountable state.
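
To check whether a machine has already reached the third step, the kernel log can be searched for the message quoted above. This is a minimal sketch, assuming the log text has been captured beforehand (for example from `journalctl -k` or `dmesg`); the function name and the sample log lines are hypothetical and only illustrate the idea.

```python
# Marker string as quoted in the steps above.
CORRUPTION_MARKER = "Metadata corruption detected"


def find_xfs_corruption(log_text):
    """Return every log line that contains the corruption marker."""
    return [line for line in log_text.splitlines() if CORRUPTION_MARKER in line]


# Hypothetical sample; real kernel log lines will look different in detail.
SAMPLE_LOG = """\
kernel: example line about an unrelated device
kernel: XFS (sdX1): Metadata corruption detected at example_location
kernel: another unrelated line
"""

if __name__ == "__main__":
    for hit in find_xfs_corruption(SAMPLE_LOG):
        print(hit)
```

An empty result only means the message has not been logged yet; it does not show that the filesystem is undamaged.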

Closed by Toolybird (Toolybird)
Monday, 29 May 2023, 22:00 GMT
Reason for closing: Fixed
Additional comments about closing: linux-zen 6.3.4.zen2-1
Comment by loqs (loqs) - Monday, 29 May 2023, 18:12 GMT

Comment by Pawel Kraszewski (PKraszewski) - Monday, 29 May 2023, 18:45 GMT
Thank you for your prompt answer.

Dang, I've just noticed the fix is a few commits older: https://github.com/zen-kernel/zen-kernel/commit/9dd7220bff2cea320116b9de7016bf3a55798978

As the damaged partition is 3 TB and takes an awful lot of time to repair (and I have paid work to do), I'll drop to the LTS kernel until an official, confirmed fix is released. I have verified that the machine at home (which was *not* hit by the bug) has a zero stripe unit and is thus invulnerable, which seems to confirm that whether damage occurs is related to the above-mentioned patch. I'll verify the bad one tomorrow morning CEST.
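
For anyone wanting to repeat the stripe-unit check, here is a minimal sketch of how it could be scripted. It assumes the text comes from running `xfs_info` on the mount point and that the data section reports its stripe unit as `sunit=N`; the function name and the shortened sample output are hypothetical. Treating a zero stripe unit as "not affected" is only my observation from above, not a confirmed rule.

```python
import re


def data_stripe_unit(xfs_info_text):
    """Return the sunit value from the 'data' section of xfs_info output, or None."""
    in_data = False
    for line in xfs_info_text.splitlines():
        # Section lines start with a label ("data", "log", ...);
        # continuation lines have an empty label before the first "=".
        label = line.split("=", 1)[0].strip()
        if label:
            in_data = label == "data"
        if in_data:
            match = re.search(r"\bsunit=(\d+)", line)
            if match:
                return int(match.group(1))
    return None


# Hypothetical, shortened sample; the log section's own sunit must be ignored.
SAMPLE = """\
meta-data=/dev/sdX1 isize=512 agcount=4, agsize=1000 blks
data     =          bsize=4096 blocks=4000, imaxpct=25
         =          sunit=16 swidth=32 blks
log      =internal log bsize=4096 blocks=2560, version=2
         =          sectsz=512 sunit=0 blks, lazy-count=1
"""

if __name__ == "__main__":
    sunit = data_stripe_unit(SAMPLE)
    print("data stripe unit:", sunit)
    print("non-zero stripe unit, possibly affected" if sunit else "zero or unknown stripe unit")
```

The parser tracks which section it is in because the log section carries a separate `sunit` field that says nothing about the data layout.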

I think my second partition at work (smaller, easily restorable, and not critical to my work) might also be vulnerable; I'll perform tests on it.

Best regards,

--
PS: Also, sorry, I mislabeled the zen kernel as core, which of course it isn't.

--
EDIT: 24 hours later xfs_repair is still running. Meh...
