FS#42563 - [linux] [btrfs-progs] kernel 3.17.1-1 with btrfs-progs 3.16.2-1 corrupts fs using btrfs snapshots

Attached to Project: Arch Linux
Opened by Thomas Kuschel (oe1tkt) - Sunday, 26 October 2014, 13:10 GMT
Last edited by Sébastien Luttringer (seblu) - Tuesday, 04 November 2014, 22:50 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To Tobias Powalowski (tpowa)
Thomas Bächler (brain0)
Tom Gundersen (tomegun)
Sébastien Luttringer (seblu)
Architecture All
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 3
Private No

Details

Description:
Using the new kernel 3.17.1-1 with a btrfs filesystem may leave the
filesystem read-only when snapshots are taken (tested with read-only snapshots).

Additional info:
% uname -a
Linux blue 3.17.1-1-ARCH #1 SMP PREEMPT Wed Oct 15 15:04:35 CEST 2014 x86_64 GNU/Linux
% pacman -Qs btrfs-progs
local/btrfs-progs 3.16.2-1
Btrfs filesystem utilities

This bug is already reported here:
http://www.spinics.net/lists/linux-btrfs/msg38151.html

Steps to reproduce:
1. Create btrfs filesystem
2. Use subvolumes, e.g. % btrfs sub create /test
3. Do some copying to the subvolume, e.g. cp <several files> /test/<several files>
4. In another terminal (during copying) % btrfs subvolume snapshot -r /test mysnapshot

Result: The filesystem switches to ro (read-only) and dmesg shows several failures. The created snapshots are possibly corrupt and are not usable.
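
The steps above could be scripted roughly as follows. This is only a sketch under assumptions: the names MNT, SRC, COUNT, DRY_RUN, run and reproduce are hypothetical and not part of the report, step 1 (creating and mounting a scratch btrfs filesystem) is assumed to have been done already, and the script prints the commands by default instead of running them.

```shell
#!/bin/sh
# Sketch of the reproduction steps. All names below (MNT, SRC, COUNT,
# DRY_RUN, run, reproduce) are hypothetical helpers, not from the report.

# Print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reproduce() {
    mnt="${MNT:-/mnt/scratch}"   # assumed: an already mounted scratch btrfs (step 1)
    src="${SRC:-/usr/share}"     # assumed: any directory with many files
    count="${COUNT:-5}"          # number of snapshots to take during the copy

    run btrfs subvolume create "$mnt/test"            # step 2
    run cp -a "$src/." "$mnt/test/" &                 # step 3, in the background
    cp_pid=$!

    i=1
    while [ "$i" -le "$count" ]; do                   # step 4, while cp is running
        run btrfs subvolume snapshot -r "$mnt/test" "$mnt/snap-$i"
        i=$((i + 1))
        [ "${DRY_RUN:-1}" = "1" ] || sleep 1
    done
    wait "$cp_pid"
}
```

To actually run it (as root, on a scratch filesystem only), something like `DRY_RUN=0 MNT=/mnt/scratch reproduce` followed by a look at dmesg should do. As the comments in this task note, the bug is not reliably reproducible this way.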

Workaround / Fix:
Download the latest official btrfs-progs version 3.17 from:
$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/kdave/btrfs-progs.git
then compile and install it on your system (version 3.17) to fix up the filesystem.
Please update the btrfs-progs package to 3.17 as soon as possible.

BR
Thomas

Closed by  Sébastien Luttringer (seblu)
Tuesday, 04 November 2014, 22:50 GMT
Reason for closing:  Fixed
Additional comments about closing:  3.17.2-1
Comment by Sébastien Luttringer (seblu) - Sunday, 26 October 2014, 21:46 GMT
Although btrfs-progs v3.17 reached [core] today, the upstream bug report refers to a corruption problem in kernel 3.17.x when read-only snapshots are used together with some other conditions [1]. I found nothing about a bug related to btrfs-progs, nor about a potential workaround using btrfs-progs v3.17. Could you give me your sources?

I also failed to reproduce it by following your steps on linux 3.17.1 and btrfs-progs 3.16.2. It is not a 100% reproducible bug.

I found a fix from Chris Mason [2] which was not pulled for 3.17.1. It will probably be pulled for 3.17.2 [4].

[1] http://www.spinics.net/lists/linux-btrfs/msg38329.html
[2] https://git.kernel.org/cgit/linux/kernel/git/mason/linux-btrfs.git/commit/?h=stable-3.17&id=babe65ac4dae4598127c5700be00fd97fd06762d
[3] https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/log/?id=refs/tags/v3.17.1
[4] http://www.spinics.net/lists/linux-btrfs/msg38481.html
Comment by Thomas Kuschel (oe1tkt) - Sunday, 26 October 2014, 23:22 GMT
Indeed, it's not easy to reproduce the bug.
I am using a self-made script that automatically takes snapshots every 15 minutes, like the well-known "snapper".
After I upgraded linux from 3.16.x to 3.17.1 with btrfs-progs 3.16.2, btrfs corrupted my file system.
I just downloaded and installed btrfs-progs 3.17-1 from [core]; no issues are found on the filesystem any more.
% btrfs scrub status /
scrub status for 43cb717d-d817-4f16-862a-6ae7c189cfce
    scrub started at Sun Oct 26 22:48:06 2014 and finished after 4834 seconds
    total bytes scrubbed: 123.13GiB with 0 errors
Comment by Vorbote (vorbote) - Tuesday, 28 October 2014, 15:00 GMT
Hi, as far as I can tell from my experiences this weekend with snapper, you can use that app to reproduce the bug. The problem does not seem to be the new tools: I was able to recover a filesystem with btrfs-progs 3.17 that 3.16.2 left for dead. But there is an additional factor: I was downloading some files with aMule. I wonder, @oe1tkt, if you were using something similar (bittorrent, etc.)?
Comment by Sébastien Luttringer (seblu) - Tuesday, 28 October 2014, 19:54 GMT
The fix is in 3.18-rc2.
Comment by Thomas Kuschel (oe1tkt) - Wednesday, 29 October 2014, 00:44 GMT
Thank you very much Sébastien, I already saw the fix in the Linux kernel source tree. I hope it will be fixed in stable 3.17.2 too.
In addition, I attached the script that triggered the bug, so that the behaviour can be reproduced. I modified the script so that with kernel versions 3.17 and 3.17.1 it creates read-write snapshots instead of read-only ones. It has now been running for about 10 hours without problems.
The test system is a netbook with an Intel Atom CPU N550 @ 1.50GHz and an SSD.
% lsblk
NAME      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda         8:0    0 238.5G  0 disk
├─sda1      8:1    0   250M  0 part  /boot
└─sda2      8:2    0 238.2G  0 part
  └─crypt 254:0    0 238.2G  0 crypt /
@vorbote: no torrent, aMule or the like; I was just creating read-only snapshots after booting 3.17.1-1, triggered every quarter of an hour by that script. Thanks for your comment.
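
The attached script is not reproduced here. As a rough illustration of the change described in this comment (read-write snapshots on kernels 3.17 and 3.17.1, read-only ones otherwise), a minimal sketch might look like the following. The function names kernel_affected, snapshot_flags and take_snapshot are hypothetical, and the version patterns assume release strings of the form printed by uname -r on Arch, such as 3.17.1-1-ARCH.

```shell
#!/bin/sh
# Hypothetical sketch, not the attached script.

# Succeeds if the given kernel release is one of the versions this
# task reports as affected (3.17 and 3.17.1).
kernel_affected() {
    case "$1" in
        3.17|3.17-*|3.17.0|3.17.0-*|3.17.1|3.17.1-*) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints the flag to pass to "btrfs subvolume snapshot":
# nothing (read-write) on affected kernels, "-r" (read-only) otherwise.
snapshot_flags() {
    if kernel_affected "$1"; then
        echo ""
    else
        echo "-r"
    fi
}

# Takes a snapshot of subvolume $1 at path $2 using the flag chosen above.
take_snapshot() {
    flags=$(snapshot_flags "$(uname -r)")
    # $flags is left unquoted on purpose so an empty value expands to nothing
    btrfs subvolume snapshot $flags "$1" "$2"
}
```

A timer or cron job could then call something like `take_snapshot / "/.snapshots/$(date +%Y%m%d-%H%M)"` every 15 minutes; the destination path is only an example.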
Comment by Thomas Kuschel (oe1tkt) - Sunday, 02 November 2014, 10:37 GMT
Fixed in the latest Arch Linux: [core] linux 3.17.2-1
Note: Unfortunately the current release (2014.11.01) includes kernel 3.17.1.