FS#80283 - [linux] ntfs3 module flushes changes to files only on unmount

Attached to Project: Arch Linux
Opened by Giovanni Santini (ItachiSan) - Saturday, 18 November 2023, 11:33 GMT
Last edited by Buggy McBugFace (bugbot) - Saturday, 25 November 2023, 20:13 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To No-one
Architecture All
Severity Medium
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:
When using an NTFS partition with `ntfs3`, the user is able to browse and create files.
However, content written to a file cannot be read back until the partition is remounted.
Specifically, the data is written, but reading the file returns nothing;
only forcing a cache drop or remounting the partition makes the content readable again.

Additional info:
* package version: linux 6.6.1
* config and/or log files: none of note, can make some if needed
* link to upstream bug report, if any: unsure where to report, I can open it if needed
* mailing list discussion: https://lists.archlinux.org/hyperkitty/list/arch-general%40lists.archlinux.org/thread/I6ZUTCVDNGVHN6OHLCEFRUGWZQVC4XX4/

Steps to reproduce:
1. Mount the NTFS partition with `ntfs3`, either via `mount` or `udisksctl`
2. Create a file -> file is created properly
3. Write content to the file
4. See the file content -> file is empty (this is the issue)
5. Remount partition
6. See the file content -> file has content

Dropping the caches via sysctl makes the new content readable, but it has to be repeated after every modification to a file:
```
(17:17) giovanni @ ~ $ udisksctl mount -b /dev/nvme0n1p5 -t ntfs3
Mounted /dev/nvme0n1p5 at /run/media/giovanni/Data
(17:17) giovanni @ ~ $ cat /run/media/giovanni/Data/mount_test.txt
Using udisks and ntfs3
(17:17) giovanni @ ~ $ echo "Using udisks and ntfs3" > /run/media/giovanni/Data/mount_test.txt
(17:18) giovanni @ ~ $ cat /run/media/giovanni/Data/mount_test.txt
(17:18) giovanni @ ~ $ sync; sudo sysctl vm.drop_caches=3
vm.drop_caches = 3
(17:18) giovanni @ ~ $ cat /run/media/giovanni/Data/mount_test.txt
Using udisks and ntfs3
(17:18) giovanni @ ~ $ echo "Using udisks and ntfs3 again" > /run/media/giovanni/Data/mount_test.txt
(17:18) giovanni @ ~ $ cat /run/media/giovanni/Data/mount_test.txt
(17:18) giovanni @ ~ $ sync
(17:18) giovanni @ ~ $ cat /run/media/giovanni/Data/mount_test.txt
(17:18) giovanni @ ~ $ sync; sudo sysctl vm.drop_caches=3
vm.drop_caches = 3
(17:19) giovanni @ ~ $ cat /run/media/giovanni/Data/mount_test.txt
Using udisks and ntfs3 again
(17:19) giovanni @ ~ $
```

ntfs-3g has no issues, so the problem is not with the partition itself.
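
The reproduction steps above can be condensed into a small read-back check. This is only a minimal sketch: the function name `check_readback` and the test file name are made up for illustration, and the mount point has to be passed in by the caller. It writes a string to a file and reads it back immediately, with no `sync`, cache drop, or remount in between.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any package): write a string to a file
# under DIR and read it back right away, without sync or remount.
check_readback() {
    local dir=$1
    local file="$dir/readback_test.$$"   # illustrative file name
    local expected="readback test"
    local actual

    # Step 2 and 3 of the report: create the file and write content.
    printf '%s\n' "$expected" > "$file" || return 2

    # Step 4 of the report: read the content back immediately.
    actual=$(cat "$file")
    rm -f "$file"

    if [ "$actual" = "$expected" ]; then
        echo "OK: content readable immediately after write"
        return 0
    fi
    echo "FAIL: expected '$expected', got '$actual'"
    return 1
}
```

Assuming the partition is mounted as in the transcript above, it could be called as `check_readback /run/media/giovanni/Data`. On an affected system the expectation would be the `FAIL` branch, since the file reads back empty.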

Closed by Buggy McBugFace (bugbot)
Saturday, 25 November 2023, 20:13 GMT
Reason for closing: Moved
Additional comments about closing: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/issues/6
Comment by Giovanni Santini (ItachiSan) - Saturday, 18 November 2023, 12:16 GMT
Addition:
linux-lts does not have the issue.
Comment by loqs (loqs) - Saturday, 18 November 2023, 14:56 GMT
Did you open an upstream bug report on bugzilla.kernel.org or the ntfs3@lists.linux.dev mailing list as suggested on the mailing list thread you linked to?
What mount options are you using for the manual mount testing? As you noted linux-lts does not have the issue, have you been able to bisect the issue [1]?

[1]: https://wiki.archlinux.org/title/Kernel#Debugging_regressions
Comment by Giovanni Santini (ItachiSan) - Sunday, 19 November 2023, 16:14 GMT
Hi, no, I didn't open a bug report.
I will do it now since I was away until now.
I will also try to bisect the issue.
Comment by Giovanni Santini (ItachiSan) - Sunday, 19 November 2023, 18:32 GMT
Correction:
I will try linux-mainline first, then do the report and bisecting :)
Comment by loqs (loqs) - Sunday, 19 November 2023, 19:04 GMT
You can obtain linux-mainline prebuilt from [1] or [2]. If the issue is not fixed in mainline, you can use the ALA [3] to determine which release introduced the issue, which should reduce the number of bisection steps. If you need help with the bisection, please ask.

[1]: https://wiki.archlinux.org/title/Unofficial_user_repositories#miffe
[2]: https://wiki.archlinux.org/title/Unofficial_user_repositories#archlinuxcn
[3]: https://wiki.archlinux.org/title/Arch_Linux_Archive
Comment by Giovanni Santini (ItachiSan) - Sunday, 19 November 2023, 20:20 GMT
I built the package myself; the issue is still there.
I will use the `downgrade` script and jump between the latest patch version of each recent kernel release, e.g. 6.5.x, 6.4.x, ...
The regression was most likely introduced in 6.2, given that LTS is 6.1.x and doesn't have the issue.
Comment by Giovanni Santini (ItachiSan) - Sunday, 19 November 2023, 21:41 GMT
I was not able to boot from the kernels 6.2 as they were mentioning that the vfat module was missing.
Comment by Giovanni Santini (ItachiSan) - Sunday, 19 November 2023, 21:49 GMT
I can boot the old "linux" kernel 6.1.12 and it works fine.
I will start bisecting meanwhile.
Comment by loqs (loqs) - Sunday, 19 November 2023, 22:01 GMT
> I was not able to boot from the kernels 6.2 as they were mentioning that the vfat module was missing.
Could you provide some more details on that? Which particular 6.2 package did you try? Was the issue while the boot was still using the initrd or after the boot had switched to the root file-system?
Comment by Giovanni Santini (ItachiSan) - Sunday, 19 November 2023, 22:08 GMT
I tried both 6.2.0 and 6.2.13.
The error was that it was not able to mount the /efi partition due to missing vfat module.
I believe it was in the root file system, since I was able to browse the /lib/modules folder, where I found the actual module.
However, insmod and modprobe didn't do the trick.
Comment by loqs (loqs) - Sunday, 19 November 2023, 22:13 GMT
Did you check the output of `uname -a` when you were dropped to the rescue shell? The most common cause of such an issue is a kernel version mismatch due to the ESP not being mounted when the kernel package was updated.
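
A rough way to check for that kind of mismatch is to compare the running kernel release with the module directories that are installed. The following is a minimal sketch; the function name `modules_match` is hypothetical, and it assumes modules live under `/usr/lib/modules/<release>` as on Arch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a modules directory exists for the
# given kernel release. MODULES_ROOT defaults to /usr/lib/modules.
modules_match() {
    local release=$1
    local root=${2:-/usr/lib/modules}

    if [ -d "$root/$release" ]; then
        echo "match: $root/$release exists"
        return 0
    fi
    echo "mismatch: no $root/$release"
    return 1
}
```

From the rescue shell it could be run as `modules_match "$(uname -r)"`. A mismatch would suggest the booted kernel image is older or newer than the installed package; a match does not rule out other causes.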
Comment by Giovanni Santini (ItachiSan) - Sunday, 19 November 2023, 22:41 GMT
No, I didn't, I can try tomorrow.
I am currently bisecting linux-git but got an error.
I did a clean build of the package and then ran the git bisect, giving all the good and bad tags.
I then started the build via `makepkg -esfi` as recommended, then got asked many questions, to which I always pressed enter, and now I get an error when building...
What should I do?
Comment by loqs (loqs) - Sunday, 19 November 2023, 23:09 GMT
Which commit is failing to build? Could you attach the log for the build failure? (makepkg -L enables logging)
Comment by Giovanni Santini (ItachiSan) - Monday, 20 November 2023, 16:41 GMT
Hi, here is my report:

1. I fetched the AUR sources for `linux-git` and built it with `makepkg -Lsf`
2. Stepped inside the Git folder and started the bisect, marked v6.1 as good and v6.3 as bad
3. In the package folder I ran `makepkg -Lesf`; the build fails saying a header is missing.
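
For readers unfamiliar with the bisect workflow in step 2, here is a self-contained sketch that runs it on a throwaway repository rather than on the kernel tree. Everything in it is illustrative: the function name `demo_bisect`, the file names, and the fact that the "regression" lands in commit 5 are assumptions made for the demo. It uses `git bisect run`, which automates the good/bad marking that is done by hand (build, boot, test) when bisecting a kernel.

```shell
#!/usr/bin/env bash
# Illustrative only: bisect a throwaway repository with eight commits in
# which a fake regression is introduced in commit 5.
demo_bisect() {
    local repo
    repo=$(mktemp -d) || return 1
    (
        cd "$repo" || exit 1
        git init -q .
        git config user.name "bisect demo"
        git config user.email "demo@example.invalid"

        # Commits 1-4 are "good", commits 5-8 are "bad".
        for i in 1 2 3 4 5 6 7 8; do
            if [ "$i" -ge 5 ]; then echo broken > state; else echo working > state; fi
            echo "$i" > counter
            git add state counter
            git commit -q -m "commit $i"
        done

        # Mark HEAD as bad and the root commit as good, then let git drive
        # the search: exit status 0 means good, 1 means bad.
        git bisect start HEAD "$(git rev-list --max-parents=0 HEAD)" >/dev/null
        git bisect run grep -q working state >/dev/null 2>&1

        # refs/bisect/bad now points at the first bad commit.
        git log -1 --format=%s refs/bisect/bad
    )
    rm -rf "$repo"
}
```

In the kernel case the equivalent manual sequence would be `git bisect start`, `git bisect good v6.1`, `git bisect bad v6.3`, followed by a build and test at each step and a `git bisect good` or `git bisect bad` depending on the result.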

I am trying to avoid running `make clean` since the compile time would increase a fair bit.
I also tried to give reasonable answers to the Kconfig prompts, be it yes, no, or module.

I attach the latest build log.

Comment by loqs (loqs) - Monday, 20 November 2023, 17:03 GMT
I think you may need to do a clean build for that issue [1]; I cannot reproduce it in a clean chroot. I would still suggest retrying 6.2.0 after ensuring the ESP is mounted. I can build the bisection kernels for you, but if there is a different issue which prevents testing the packages, that would not help.

[1]: https://lore.kernel.org/lkml/87a5yl4b53.fsf%40oldenburg.str.redhat.com/
Comment by Giovanni Santini (ItachiSan) - Monday, 20 November 2023, 20:02 GMT
I would love to test 6.2.13, but I have no clue why it fails.
In the root shell I ran this:

---
$ uname -a
Linux archlinux-tug 6.2.13-arch1-1 #1 SMP PREEMPT_DYNAMIC Wed, 26 Apr 2023 20:50:14 +0000 x86_64 GNU/Linux
$ cat /proc/filesystems
nodev sysfs
nodev tmpfs
nodev bdev
nodev proc
nodev cgroup
nodev cgroup2
nodev cpuset
nodev devtmpfs
nodev binfmt_misc
nodev configfs
nodev debugfs
nodev tracefs
nodev securityfs
nodev sockfs
nodev bpf
nodev pipefs
nodev ramfs
nodev hugetlbfs
nodev devpts
nodev autofs
nodev efivarfs
nodev mqueue
nodev resctrl
nodev pstore
ext3
ext2
ext4
$ modinfo modinfo /usr/lib/modules/6.2.13-arch1-1/kernel/fs/fat/vfat.ko.zst
filename: /usr/lib/modules/6.2.13-arch1-1/kernel/fs/fat/vfat.ko.zst
author: Gordon Chaffee
description: VFAT filesystem support
license: GPL
alias: fs-vfat
srcversion: 10921C44C2661B8AEC07C0E
depends: fat
retpoline: Y
intree: Y
name: vfat
vermagic: 6.2.13-arch1-1 SMP preempt mod_unload
sig_id: PKCS#7
signer: Build time autogenerated kernel key
sig_key: 4D:E6:E9:2F:73:2C:26:A1:83:35:C9:A5:9E:55:63:98:DD:F0:CA:6E
sig_hashalgo: sha512
signature: 30:65:02:31:00:D0:76:F1:1F:72:F0:2A:FA:8A:60:04:B8:4D:4A:49:
6C:D2:57:78:94:BF:A2:69:30:00:EE:29:DE:9E:40:24:12:43:83:C3:
36:A2:8B:FE:C6:10:F5:43:8C:DD:51:1F:7F:02:30:18:CF:88:7D:68:
6A:05:47:AE:40:EB:48:06:4C:CF:C2:6F:37:44:72:49:E8:17:58:01:
D9:30:CE:88:20:0F:5A:12:03:37:89:20:62:7C:14:4D:F3:F7:FB:95:
5B:EF:3F
---

However, when I tried to load it via modprobe or mount, it said "unknown symbol".
Comment by loqs (loqs) - Monday, 20 November 2023, 20:39 GMT
Did the message include the name of the "unknown symbol"?
Comment by Giovanni Santini (ItachiSan) - Monday, 20 November 2023, 21:19 GMT
It indeed did.
I am attaching my `script` record, catted to a separate file.
I can provide the normal script file but I have no timings inside it.
Comment by loqs (loqs) - Tuesday, 21 November 2023, 01:14 GMT
Can you please attach the output of dmesg from the rescue prompt as well?
Comment by Giovanni Santini (ItachiSan) - Tuesday, 21 November 2023, 07:59 GMT
Sure can do!
I have the same issue with `linux-git` given my commit, so it is useful to fix it :)
Comment by Giovanni Santini (ItachiSan) - Tuesday, 21 November 2023, 09:16 GMT
Here are the logs from the rescue shell.

EDIT: I uploaded the same file twice but can't remove it ^^"
Comment by loqs (loqs) - Tuesday, 21 November 2023, 11:58 GMT
Were the dmesg and journal from after attempting to modprobe vfat?

A workaround for the vfat issue would be to comment out the efi mount entry in fstab or add noauto to that entry's options string. You could also try testing ntfs3 from the rescue shell.
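
As a rough illustration of the `noauto` variant, the sketch below appends `noauto` to the options field of the fstab entry for a given mount point. The function name `add_noauto` is hypothetical and the UUID in the usage note is a placeholder; it prints the modified table to standard output rather than editing any file in place.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an fstab with "noauto" appended to the options
# of the entry mounted at MOUNTPOINT. Reads FILE arguments, or stdin if none.
# Note: awk rebuilds the modified line with single spaces between fields.
add_noauto() {
    local mountpoint=$1
    shift
    awk -v mp="$mountpoint" '
        # Skip comments; only touch complete entries that lack noauto.
        $1 !~ /^#/ && NF >= 4 && $2 == mp && $4 !~ /(^|,)noauto(,|$)/ {
            $4 = $4 ",noauto"
        }
        { print }
    ' "$@"
}
```

For example, `add_noauto /efi /etc/fstab` would turn a line such as `UUID=ABCD-1234 /efi vfat defaults 0 2` into `UUID=ABCD-1234 /efi vfat defaults,noauto 0 2`. The output should be reviewed before it replaces the real `/etc/fstab`.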
Comment by Giovanni Santini (ItachiSan) - Tuesday, 21 November 2023, 14:07 GMT
Yes, I tried the mount and the modprobe and still got the issue.
I also see no ntfs3 filesystem in my logs.
I can regenerate the logs if needed.
Comment by loqs (loqs) - Tuesday, 21 November 2023, 18:11 GMT
What if ntfs3 and vfat are built into the kernel? See diff and src bundle attached; built package linked below:

https://drive.google.com/file/d/1ItZuRnbYGLX1QOqUbwzb2Qg8UBJOeP2J/view?usp=sharing linux-6.2-1.1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/12YhRoM_HAAmRxF2s8CrnWLf583DXFAaE/view?usp=sharing linux-headers-6.2-1.1-x86_64.pkg.tar.zst
Comment by Giovanni Santini (ItachiSan) - Tuesday, 21 November 2023, 22:21 GMT
Thanks! I will try the packages right now and let you know :)
Comment by Giovanni Santini (ItachiSan) - Tuesday, 21 November 2023, 22:35 GMT
Huh. Kernel fault?
See picture attached
Comment by loqs (loqs) - Tuesday, 21 November 2023, 22:55 GMT
No clue. Also, I cannot reproduce the issue locally on 6.6.3, although this is a custom kernel.
Comment by Giovanni Santini (ItachiSan) - Wednesday, 22 November 2023, 08:31 GMT
@loqs the boot or the NTFS error?
I can boot 6.6 kernel with no issue.

I have a last resort: using a live ISO from that period.
Will try that.
Comment by Giovanni Santini (ItachiSan) - Wednesday, 22 November 2023, 08:54 GMT
I confirmed in March ArchISO, with Linux 6.2.1, that the same issue occurs.
See the attached logs. :)
I can finally open an issue upstream :P
Comment by loqs (loqs) - Wednesday, 22 November 2023, 10:34 GMT
I have no clue about the kernel fault. I cannot reproduce the original issue with linux 6.6.3. Has anyone else reproduced the original issue?
Comment by Giovanni Santini (ItachiSan) - Wednesday, 22 November 2023, 12:48 GMT
I don't think so, I have no real clue why this happens either...
I do not see any upstream report.

I've created one, regarding the problem:
https://bugzilla.kernel.org/show_bug.cgi?id=218180

I am not sure whether I need to also mail ntfs3 AT lore.kernel or not.
Comment by loqs (loqs) - Wednesday, 22 November 2023, 12:55 GMT
I would suggest asking on your thread on arch-general if anyone else can reproduce the issue or ask one of the other support channels.
Comment by Giovanni Santini (ItachiSan) - Wednesday, 22 November 2023, 21:38 GMT
Ok, somehow I made the kernel panic disappear.
I did:
1. Switched from systemd to busybox for mkinitcpio
2. Bundled vfat and ntfs3 in the image
3. Removed the package *kernel-module-hooks*
4. Disabled UEFI

I marked 6.1 as good and 6.2 bad for the bisect.
First step worked, now second step.
Comment by loqs (loqs) - Thursday, 23 November 2023, 14:29 GMT
Upstream closed the report as unreproducible on 6.7-rc2 [1].
Edit:
Did you test that a locally built 6.1 kernel does not have the issue?

[1]: https://bugzilla.kernel.org/show_bug.cgi?id=218180#c2
Comment by Giovanni Santini (ItachiSan) - Thursday, 23 November 2023, 22:46 GMT
The kernel panic for the Git kernel is still there... will try a clean build again, hopefully it will fix that.

I noticed the upstream report was closed; I will try to rebuild linux-mainline and see if the issue is still there...

No, I didn't build a local 6.1 kernel, except the one I get for bisecting the bug; that is a 6.1 kernel that works.
Comment by loqs (loqs) - Thursday, 23 November 2023, 23:02 GMT
Are you using the same kernel config for both the mainline and bisection builds? Without another user being able to reproduce the issue there is no way to rule out a configuration issue on your system.
Comment by Giovanni Santini (ItachiSan) - Thursday, 23 November 2023, 23:10 GMT
I do understand that, I would love it to be so...
For the kernel building configuration, I use both the ones in the package.

To be more precise:

For linux-mainline I build as in the package.

For linux-git I did the first build with the package config on the latest config, then I started bisecting.
In one of the first attempts I was not able to boot the kernel at all, so what I normally do is this via a script:

cd src/linux-torvalds
make -s kernelrelease > version
make olddefconfig
make clean

I added the "make clean" due to the kernel stack trace error, will see if that helps.