FS#77340 - Linux kernel >=6.1 suffers from AMD fTPM stutter
Attached to Project:
Arch Linux
Opened by Jonas Jefe (jonaslorincz) - Tuesday, 31 January 2023, 05:28 GMT
Last edited by Toolybird (Toolybird) - Tuesday, 14 March 2023, 04:04 GMT
Details
Description:
Linux kernel >=6.1 exhibits a stuttering issue that occurs once every few hours. See the following threads for detailed information:
https://www.reddit.com/r/archlinux/comments/zvgev0/audio_stuttering_issues_with_kernel_611/
https://www.reddit.com/r/linux_gaming/comments/zzqaf7/having_intermittent_stutters_with_a_ryzen_cpu/
https://bbs.archlinux.org/viewtopic.php?id=282333

The stutter causes the framerate of the display to drop dramatically and produces bursts in the audio output.

Additional info:
* linux 6.1.0 or later

Steps to reproduce:
* Use Linux kernel >=6.1
* Use AMD Ryzen CPU with fTPM enabled
* Wait for a few hours

Here is a Rust program that will monitor for a potential stutter:

```rs
// Requires the `chrono` crate for timestamps.
use std::{
    thread::sleep,
    time::{Duration, Instant},
};

use chrono::Local;

fn main() {
    // Sleep interval, and the overshoot tolerated before reporting a stutter.
    let test = Duration::from_millis(10);
    let expected_offset = Duration::from_millis(5);
    loop {
        let start = Instant::now();
        sleep(test);
        let elapsed = start.elapsed();
        // Report only when the sleep overshoots by more than the tolerance.
        if elapsed > test + expected_offset {
            println!(
                "[{}] stutter: took {:?}, expected <{:?}",
                Local::now().format("%Y-%m-%d %H:%M:%S"),
                elapsed - test,
                expected_offset
            );
        }
    }
}
```

A burst of many lines printed by this program indicates a stutter (a stutter lasts 1-2 seconds and shows up as ~30ms of extra delay reported by the program).
Closed by Toolybird (Toolybird)
Tuesday, 14 March 2023, 04:04 GMT
Reason for closing: Fixed
Additional comments about closing: linux 6.2.6.arch1-1
It is worth mentioning that, if the TPM is not used on your system (as I suspect is the case for most Arch desktops), there may be an easy way to simply disable it in the BIOS as a workaround. Otherwise, look for a BIOS update that fixes the issue (e.g. https://www.techspot.com/news/94939-bios-update-amd-pcs-fixes-ftpm-related-performance.html).
Additionally, the latest BIOS update for my computer (ASUS G513QY) does not contain the fix.
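Before disabling anything in the BIOS, it can help to check whether the kernel currently sees a TPM and whether the active hardware RNG is backed by it. The following is only a rough sketch: it assumes the TPM shows up as `/sys/class/tpm/tpm0` and that the TPM-backed generator is registered under a name starting with `tpm-rng`; the helper `is_tpm_rng` is made up for illustration.

```rust
use std::fs;
use std::path::Path;

/// Returns true if the hwrng name looks like it is backed by a TPM.
/// Assumption: the TPM driver registers its generator under a name
/// starting with "tpm-rng" (for example "tpm-rng-0").
fn is_tpm_rng(name: &str) -> bool {
    name.trim().starts_with("tpm-rng")
}

fn main() {
    // Assumption: the first TPM chip is exposed as tpm0 in sysfs.
    let tpm_present = Path::new("/sys/class/tpm/tpm0").exists();
    println!("TPM device visible to the kernel: {tpm_present}");

    // The hw_random core exposes the currently selected generator here.
    match fs::read_to_string("/sys/class/misc/hw_random/rng_current") {
        Ok(name) => println!(
            "current hwrng: {} (TPM-backed: {})",
            name.trim(),
            is_tpm_rng(&name)
        ),
        Err(e) => println!("could not read rng_current: {e}"),
    }
}
```

If no TPM device is visible and the current hwrng is not TPM-backed, the fTPM is probably already disabled or not exposed to the kernel.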
What's strange is that this issue doesn't happen in kernel 6.0.19, which I'm using currently.
I know we should look for which part of the code is causing it. At least now we have a clue.
So what exactly happened from 6.0.x -> 6.1.x? Because this build config is enabled in 6.0.x too. Need a bit more time to find out.
I looked around more thoroughly and found this gem 🐸
Every clue I have found so far points to this random/tpm part.
I will do some investigation later; I am still running git bisect to test the kernel.
Run "sudo cat /dev/hwrng > /dev/null" for around 5-15 minutes, and the stutter will show up.
So there must be something that keeps calling the hardware random number generator in 6.1.x.
As time goes on, minor errors presumably pile up and at some point lead to another error (some kind of overflow, I guess?).
Plus, if you use the Rust monitoring program, you will notice timeout errors in 6.1.x even when the system is idle. However, this isn't the case in 6.0.x as long as you are not calling hwrng.
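The two observations above (reading `/dev/hwrng` triggers the stutter, and the sleep monitor detects it) can be combined into one program. This is a minimal sketch and not the code anyone in this thread actually ran: the helper names `drain` and `worst_overshoot` are made up for illustration, the 15-minute run time is taken from the upper end of the range mentioned above, and it must be run as root to open `/dev/hwrng`.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::thread;
use std::time::{Duration, Instant};

/// Reads from `source` and discards the data until `run_for` has elapsed
/// or the source reports end-of-file. Returns the number of bytes read.
fn drain<R: Read>(mut source: R, run_for: Duration) -> io::Result<u64> {
    let deadline = Instant::now() + run_for;
    let mut buf = [0u8; 64];
    let mut total = 0u64;
    while Instant::now() < deadline {
        let n = source.read(&mut buf)?;
        if n == 0 {
            break; // end-of-file
        }
        total += n as u64;
    }
    Ok(total)
}

/// Sleeps for `interval` in a loop until `run_for` has elapsed and returns
/// the largest amount by which a single sleep overshot `interval`.
fn worst_overshoot(interval: Duration, run_for: Duration) -> Duration {
    let deadline = Instant::now() + run_for;
    let mut worst = Duration::ZERO;
    while Instant::now() < deadline {
        let start = Instant::now();
        thread::sleep(interval);
        let overshoot = start.elapsed().saturating_sub(interval);
        worst = worst.max(overshoot);
    }
    worst
}

fn main() -> io::Result<()> {
    // Assumed run time: 15 minutes, the upper end of the range quoted above.
    let run_for = Duration::from_secs(15 * 60);
    let rng = File::open("/dev/hwrng")?; // usually requires root

    // Keep the hardware RNG busy in the background while measuring latency.
    let reader = thread::spawn(move || drain(rng, run_for));
    let worst = worst_overshoot(Duration::from_millis(10), run_for);
    let bytes = reader.join().expect("reader thread panicked")?;

    println!("read {bytes} bytes from /dev/hwrng, worst sleep overshoot: {worst:?}");
    Ok(())
}
```

On an affected kernel, the worst overshoot would be expected to land in the tens of milliseconds, in line with the ~30ms delay reported in the description; on an unaffected one it should stay small.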
The git bisect result shows exactly the same!
I am writing this up in the upstream bug report now; hopefully this problem can be fixed soon.
Edit:
Backported to 6.1.19 and 6.2.6.