FS#10512 - x86_64 Disk I/O Issues in Kernel26

Attached to Project: Arch Linux
Opened by Daniel Rammelt (shazeal) - Monday, 26 May 2008, 19:47 GMT
Last edited by Tobias Powalowski (tpowa) - Wednesday, 08 October 2008, 07:38 GMT
Task Type: Feature Request
Category: Kernel
Status: Closed
Assigned To: Tobias Powalowski (tpowa), Thomas Bächler (brain0)
Architecture: x86_64
Severity: Medium
Priority: Normal
Reported Version: 2007.08-2
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 6
Private: No

Details

Description: The default CFS I/O scheduler options in the kernel essentially cause the system to halt during disk I/O operations, due to the way CFS handles root processes: CFS assigns the majority of the I/O slices to root and forgets about the user, so the system becomes unresponsive until disk I/O has ceased. Same-disk transfers, meanwhile, leave the system sluggish with intermittent pauses.

This problem was not readily apparent with the i686 kernel and was only noticed after changing to x86_64 Arch. Similar behavior has been noted in both Gentoo and Ubuntu; see the links below.

The problem is solved by enabling CONFIG_CGROUP_SCHED; see the attached diff.
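(For reference, a minimal sketch of the resulting config fragment; the attached diff itself is not reproduced here, and the surrounding options are assumed from the discussion in the comments below:)

  CONFIG_CGROUPS=y
  CONFIG_GROUP_SCHED=y
  CONFIG_CGROUP_SCHED=y
  # CONFIG_FAIR_GROUP_SCHED is not set   (per the corrected diff in the comments)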

Additional info:

Reproduced on...
- kernel26-2.6.25.4 (Stock Arch)
- zen-kernel26-2.6.25-zen2 (Using Arch config)
- vanilla-kernel-2.6.25.4 (Using Arch config)

Solved by...
- kernel26-2.6.25.4 (Stock Arch config + attached diff)
- zen-kernel26-2.6.25-zen2 (Using BFQ I/O scheduler + Stock arch config)

http://forums.gentoo.org/viewtopic-t-482731-highlight-disk+raid+amd64.html
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/188226

Steps to reproduce:
Use the stock Arch x86_64 kernel and transfer large amounts of data from disk to disk, or perform large same-disk file transfers.
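(A sketch of one way to generate such a load; paths and sizes are illustrative only:)

  # same-disk transfer: write a large file, then copy it on the same disk
  dd if=/dev/zero of=/home/test/bigfile bs=1M count=4096
  cp /home/test/bigfile /home/test/bigfile.copy
  # disk -> disk: copy the file onto a second drive
  cp /home/test/bigfile /mnt/otherdisk/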

Closed by  Tobias Powalowski (tpowa)
Wednesday, 08 October 2008, 07:38 GMT
Reason for closing:  Fixed
Comment by Daniel Rammelt (shazeal) - Monday, 26 May 2008, 19:59 GMT
Seems the attachment did not work? Trying again.
Comment by Daniel Rammelt (shazeal) - Monday, 26 May 2008, 22:23 GMT
Sorry again, the above diff is incorrect: I had set FAIR_GROUP_SCHED by mistake when making the clean diff. Attached is a diff without that option set.
Comment by Aaron Griffin (phrakture) - Tuesday, 27 May 2008, 20:40 GMT
Curious, does this cause segfaults in the ext2 module for you? My x86_64 system used to segfault in ext2 all the time. The hardware is kinda dead now, so it MAY be unrelated, but I'm just checking.
Comment by john mcullan (mullman) - Sunday, 01 June 2008, 14:19 GMT
Will this make the next kernel version/rebuild?
Comment by Alberto Gonzalez (Luis) - Friday, 25 July 2008, 16:18 GMT
So this problem appeared with 2.6.25? That is, it was not present in 2.6.24, before the whole group scheduler merge? In that case, does anyone know the exact options to disable the whole group scheduler and get the same behavior as 2.6.24? I'm asking because 2.6.26 now has:

CONFIG_CGROUPS=y
CONFIG_CGROUP_NS=y
CONFIG_GROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
CONFIG_CGROUP_SCHED=y

And that's causing latency issues again (as expected). So how do we revert to the pre-group-scheduler behavior and avoid the problem completely? Does anyone know? Do we need to enable CGROUP but disable GROUP?
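(A sketch of what that 2.6.24-like revert would look like in the 2.6.26 config, assuming the Kconfig dependencies of that series, where FAIR_GROUP_SCHED and CGROUP_SCHED both sit under GROUP_SCHED:)

  CONFIG_CGROUPS=y              # can stay enabled for other cgroup users
  CONFIG_CGROUP_NS=y
  # CONFIG_GROUP_SCHED is not set
  # (FAIR_GROUP_SCHED and CGROUP_SCHED depend on it and drop out with it)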
Comment by Thomas Bächler (brain0) - Friday, 25 July 2008, 16:33 GMT
I added the options as they were requested here, assuming that they would _fix_ these issues. I won't be able to do anything for over a week now, so please try it out yourselves and report what you find.

I am using x86_64 myself and haven't seen any problems with disk I/O.
Comment by Alberto Gonzalez (Luis) - Friday, 25 July 2008, 18:50 GMT
I'm trying to figure this out... Let's see:

The original poster is confusing CFS (the CPU scheduler) with CFQ (the I/O scheduler). Apparently he has a problem with CFQ, not with CFS. Now, he has found two solutions:

- Change CFQ for BFQ (BFQ is not in mainline; I wonder if changing to anticipatory would also solve his problem)
- Enable a new feature that appeared in 2.6.25, called CGROUPS

So, if I understand correctly, the problem with I/O should exist *before* 2.6.25 and has nothing to do with the CFS scheduler. It's only that the CGROUPS feature seems to solve it (though that's just a workaround for the real bug in CFQ). Is this correct?

Now, regarding the group scheduler in CFS: it is known, and developers warn about it, that it has a price in latency. To disable it and get the exact 2.6.24 behavior, we should not set CONFIG_GROUP_SCHED. This is what was done in 2.6.25, and the latency problems were solved. But then this confusing report raised some doubts. Could the original reporter clarify the situation? Did the problem exist before 2.6.25? Does it go away when changing CFQ for the anticipatory I/O scheduler ("echo anticipatory > /sys/block/{DEVICE-NAME}/queue/scheduler")?
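(For anyone testing: the active scheduler can be checked and switched on the fly via sysfs; a sketch assuming the disk is sda:)

  cat /sys/block/sda/queue/scheduler                  # active scheduler is shown in brackets
  echo anticipatory > /sys/block/sda/queue/scheduler  # run as root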

Thanks.
Comment by Thomas Bächler (brain0) - Friday, 25 July 2008, 20:42 GMT
Okay, so are you saying we should revert to the configuration we had in 2.6.25?
Comment by Alberto Gonzalez (Luis) - Friday, 25 July 2008, 21:30 GMT
Yes, basically I think we should keep that config from 2.6.25.

The original report says that with that config, and with the I/O scheduler changed to BFQ, the problem was solved, so it really seems that his problem is with the CFQ I/O scheduler. I don't think that enabling CGROUPS in the CPU scheduler to work around a bug in the I/O scheduler is the right approach, even if it happens to work.

At least this is my opinion with the given facts.

Comment by Daniel Rammelt (shazeal) - Sunday, 27 July 2008, 07:12 GMT
I admit to confusing CFQ and CFS in my report; however, the issue is with how the scheduler itself handles root user processes under CFQ. The CGROUPS thing "solves" the issue (I admit a real code fix would be preferable) by causing the CPU activity generated by I/O to be handled via CGROUPS, rather than just giving root the flat 2x weighting for CPU time.

I won't argue that it's a great fix, because it's not; this issue has been ongoing since 2.6.19. I have recently switched to an Intel platform and the issue is nonexistent (previously nforce/AMD); it is also nonexistent on i686 using exactly the same kernel config (on the nforce/AMD build). So I get the feeling the issue lies somewhere in the kernel core. That said, I still use CGROUPS, as it stops background tasks running as root from stealing CPU time from my foreground applications.
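(The per-group CPU weighting mentioned above can be inspected and tuned through the cpu cgroup controller when CGROUP_SCHED is enabled; a sketch, with the mount point, group name, and share value chosen for illustration only:)

  mkdir -p /dev/cgroup
  mount -t cgroup -o cpu cgroup /dev/cgroup
  mkdir /dev/cgroup/background
  echo 512 > /dev/cgroup/background/cpu.shares   # half the default 1024 weight
  echo $PID > /dev/cgroup/background/tasks       # move a background task into the group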
Comment by Alberto Gonzalez (Luis) - Sunday, 27 July 2008, 11:21 GMT
Ok, thanks for clarifying the situation. So it is an old bug that occurs on the nforce/AMD platform on x86_64 when using the CFQ I/O scheduler, and it can be solved by enabling CGROUPS in kernels >= 2.6.25, right?

I guess the best way to get a real fix would be to report this upstream. As a workaround (as long as enabling CGROUPS is a better solution for you than just switching to the anticipatory scheduler), maybe we could try enabling CGROUPS but disabling GROUP_SCHED and see if that solves your problem without hurting latency for other users?
Comment by Daniel Rammelt (shazeal) - Sunday, 27 July 2008, 12:40 GMT
TBH, if this is going to cause issues for the people who have no problem with disk I/O under amd64 Arch, I would say drop it completely. This issue (not the CGROUPS thing) has been repeatedly reported upstream since 2.6.19, so it's obviously a tough one to fix.
Comment by Oscar (borkdox) - Monday, 18 August 2008, 12:12 GMT
This bug is still alive. I just did a fresh install. While copying some files from my old /home to my new /home, the desktop becomes unresponsive when opening new applications (Konqueror or OpenOffice, for example). The mouse even lags a little, and opening menus is slower.

I assume this is not the normal behavior.
Comment by Oscar (borkdox) - Wednesday, 20 August 2008, 03:37 GMT
I decided to compare Arch (x86_64) with 32-bit and 64-bit Ubuntu 8.04. 64-bit Ubuntu performed worse, lagging really badly. 32-bit Ubuntu performed about the same, except that there was no mouse lag.

So I think this lag is normal behavior due to high I/O activity on the same drive as root (/). Not going to lie, though: Windows Vista's I/O scheduler provides better interactivity than CFQ in mainline. I guess this behavior will improve as the mainline CPU and I/O schedulers mature.

Sorry for re-opening this.
Comment by Alberto Gonzalez (Luis) - Wednesday, 20 August 2008, 04:29 GMT
Well, it's normal that the system becomes less responsive under high I/O activity. On my old Pentium 4 with a rather slow HD it's not too bad, though, so it probably depends on the hardware (in fact, on my hardware Windows is unusable under any I/O; that's the main reason why I switched to Linux some years ago).

However, you might want to try the anticipatory I/O scheduler instead of CFQ. It works much better for some people. You can change it on the fly with the command "echo anticipatory > /sys/block/{DEVICE-NAME}/queue/scheduler" (the device name should be sda, sdb, etc.) or at boot time by adding "elevator=as" to your kernel line in /boot/grub/menu.lst.
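(A sketch of what the edited kernel line might look like; the root partition and image path are placeholders:)

  # /boot/grub/menu.lst
  kernel /vmlinuz26 root=/dev/sda1 ro elevator=as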
Comment by Tobias Powalowski (tpowa) - Sunday, 05 October 2008, 16:13 GMT
Can we close this again?
Comment by Oscar (borkdox) - Wednesday, 08 October 2008, 02:51 GMT
yes
