FS#18815 - [kernel26] CPU Frequency Scaling Stuck [minimum,randomly]
Attached to Project:
Arch Linux
Opened by orbisvicis (orbisvicis) - Wednesday, 24 March 2010, 07:07 GMT
Last edited by Jan de Groot (JGC) - Tuesday, 28 September 2010, 14:34 GMT
Details
Description:
This is not 100% reproducible, but it seems to happen mostly within 10 minutes of startup or right after a resume. Googling, I've found that many people from a wide range of distributions have run into this issue, but all the reports date back about two years. For example, some links:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/138465
https://bugzilla.kernel.org/show_bug.cgi?id=10564
* ^ exactly identical to my issue

Like most of those reporting, I use a Dell 9400 laptop, with a T2400 Intel Core Duo and an A10 BIOS. More details to follow. Also, I use cpufreqd with the ondemand or performance governors - through the acpi-cpufreq driver - depending on AC power. However, this has nothing to do with cpufreqd or the governors: the frequency is throttled to a minimum whether or not the cpufreqd daemon is running, and with any governor.

When the CPU is throttled, ACPI logs to syslog:
Mar 24 00:57:46 cinnabar logger: ACPI group/action undefined: processor / CPU0
Mar 24 00:57:46 cinnabar logger: ACPI group/action undefined: processor / CPU1

And acpi_listen records the processor moving into P-state 2:
processor CPU0 00000080 00000002
processor CPU1 00000080 00000002

The files at "/proc/acpi/processor/CPU?" confirm the switch. After throttling, the information reflected at "/sys/devices/system/cpu/cpu?/cpufreq/" doesn't change much. The files "/sys/devices/system/cpu/cpu?/cpufreq/cpuinfo_cur_freq" are updated to display the stuck frequency and become immutable. Neither one of:
cpufreq-set "..options.."
cat 1833000 >/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
succeeds in modifying the CPU frequency or the files. However, I can successfully switch governors or drivers - in other words, edit any other information stored in "/sys/devices/system/cpu/cpu?/cpufreq/*" that does not concern the current CPU frequency (either through "cat" or "cpufreq-set"). Afterwards cpufreq-info displays the updated information.
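The acpi_listen lines above can be read mechanically. The following is only a rough sketch, not part of the original report: the function name decode_processor_events is made up for illustration, and it assumes that event type 0x80 is the ACPI processor "performance changed" notification and that the last field carries the platform limit (_PPC), i.e. the index of the fastest P-state the firmware currently allows.

```shell
#!/bin/sh
# Decode acpi_listen "processor" events of the form:
#   processor CPU0 00000080 00000002
# Assumption: type 0x80 means "performance limit changed" and the last
# field is the index (in hex) of the fastest P-state still allowed.
decode_processor_events() {
    awk '
        $1 == "processor" && NF == 4 {
            type = tolower($3)
            sub(/^0+/, "", type)          # strip leading zeros
            if (type == "") type = "0"
            limit = tolower($4)
            sub(/^0+/, "", limit)
            if (limit == "") limit = "0"
            if (type == "80")
                printf "%s: performance limit changed, fastest allowed P-state index is 0x%s\n", $2, limit
            else
                printf "%s: other processor event (type 0x%s)\n", $2, type
        }
    '
}
```

It reads events on standard input, so it could be used as "acpi_listen | decode_processor_events". With the two events quoted above, both cores would be reported as limited to P-state index 0x2, which on this machine is the 1000 MHz step.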
These are some errors I was able to collect:

$ cpufreqd -D -V7
cpufreqd_loop : New Rule ("AC Rule"), applying.
cpufreqd_set_profile : Couldn't set profile "Performance High" set for cpu0 (100-100-performance)
cpufreqd_loop : Cannot set policy, Rule unchanged ("none").

$ cpufreq-set -r -f 1.83Ghz
$ cpufreq-set -r -g performance -u 1.83GHz -d 1.83GHz
:: Setting cpufreq governing rules , cpu 0
Error setting new values. Common errors:
- Do you have proper administration rights? (super-user?)
- Is the governor you requested available and modprobed?
- Trying to set an invalid policy?
- Trying to set a specific frequency, but userspace governor is not available, for example because of hardware which cannot be set to a specific frequency or because the userspace governor isn't loaded?

These are the kernel modules I have loaded. Unloading and reloading any or all of them does not allow me to "unstick" the CPU frequency.

$ lsmod | grep -i freq
cpufreq_powersave         646  0
cpufreq_ondemand         6897  0
acpi_cpufreq             5631  0
freq_table               1955  2 cpufreq_ondemand,acpi_cpufreq
processor               26526  3 acpi_cpufreq

Following is the output of cpufreq-info before and after the throttling.

cpufreq-info before the throttling:
analyzing CPU 0:
  driver: acpi-cpufreq
  CPUs which run at the same hardware frequency: 0 1
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 10.0 us.
  hardware limits: 1000 MHz - 1.83 GHz
  available frequency steps: 1.83 GHz, 1.33 GHz, 1000 MHz
  available cpufreq governors: powersave, ondemand, performance
  current policy: frequency should be within 1.83 GHz and 1.83 GHz.
                  The governor "performance" may decide which speed to use within this range.
  current CPU frequency is 1.83 GHz.

cpufreq-info after the throttling:
analyzing CPU 0:
  driver: acpi-cpufreq
  CPUs which run at the same hardware frequency: 0 1
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 10.0 us.
  hardware limits: 1000 MHz - 1.83 GHz
  available frequency steps: 1.83 GHz, 1.33 GHz, 1000 MHz
  available cpufreq governors: powersave, ondemand, performance
  current policy: frequency should be within 1000 MHz and 1000 MHz.
                  The governor "performance" may decide which speed to use within this range.
  current CPU frequency is 1000 MHz.

List of files in "/sys/devices/system/cpu/cpu?/cpufreq/"

$ ls -lah /sys/devices/system/cpu/cpu?/cpufreq/
/sys/devices/system/cpu/cpu0/cpufreq/:
total 0
drwxr-xr-x 2 root root    0 Mar 24 01:04 .
drwxr-xr-x 7 root root    0 Mar 24 01:04 ..
-r--r--r-- 1 root root 4.0K Mar 24 01:04 affected_cpus
-r-------- 1 root root 4.0K Mar 24 01:08 cpuinfo_cur_freq
-r--r--r-- 1 root root 4.0K Mar 24 01:04 cpuinfo_max_freq
-r--r--r-- 1 root root 4.0K Mar 24 01:04 cpuinfo_min_freq
-r--r--r-- 1 root root 4.0K Mar 24 01:08 cpuinfo_transition_latency
-r--r--r-- 1 root root 4.0K Mar 24 01:08 related_cpus
-r--r--r-- 1 root root 4.0K Mar 24 01:04 scaling_available_frequencies
-r--r--r-- 1 root root 4.0K Mar 24 01:04 scaling_available_governors
-r--r--r-- 1 root root 4.0K Mar 24 01:05 scaling_cur_freq
-r--r--r-- 1 root root 4.0K Mar 24 01:08 scaling_driver
-rw-r--r-- 1 root root 4.0K Mar 24 01:55 scaling_governor
-rw-r--r-- 1 root root 4.0K Mar 24 01:04 scaling_max_freq
-rw-r--r-- 1 root root 4.0K Mar 24 01:04 scaling_min_freq
-rw-r--r-- 1 root root 4.0K Mar 24 01:55 scaling_setspeed
/sys/devices/system/cpu/cpu1/cpufreq/:
... (It's the same as for the other core) ...

Also sensors show very cool temperatures which are expected if running throttled at 1Ghz with fans at maximum. So there does not seem to be any reason for the BIOS to throttle the CPU.

$ sensors
acpitz-virtual-0
Adapter: Virtual device
temp1: +28.5°C (crit = +99.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Core 0: +26.0°C (crit = +100.0°C)

coretemp-isa-0001
Adapter: ISA adapter
Core 1: +25.0°C (crit = +100.0°C)

As far as I can tell there is no other information in any of the log files.
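The "after" output of cpufreq-info shows the policy range collapsed to 1000 MHz while the hardware limits still go up to 1.83 GHz. A minimal sketch of how that condition might be spotted from sysfs is below; it assumes the policy ceiling is exposed in scaling_max_freq and the hardware ceiling in cpuinfo_max_freq (both appear in the listing above), and the function name and its optional directory argument are hypothetical, added only so the check can be pointed at a test directory.

```shell
#!/bin/sh
# Report CPUs whose policy ceiling (scaling_max_freq) is below the
# hardware maximum (cpuinfo_max_freq). Values are in kHz, as in sysfs.
# $1 (optional, hypothetical override): directory holding the cpuN entries.
report_clamped_cpus() {
    root=${1:-/sys/devices/system/cpu}
    for dir in "$root"/cpu[0-9]*/cpufreq; do
        # Skip CPUs without readable cpufreq files (or an unmatched glob)
        [ -r "$dir/cpuinfo_max_freq" ] || continue
        [ -r "$dir/scaling_max_freq" ] || continue
        hw_max=$(cat "$dir/cpuinfo_max_freq")
        policy_max=$(cat "$dir/scaling_max_freq")
        cpu=$(basename "$(dirname "$dir")")
        if [ "$policy_max" -lt "$hw_max" ]; then
            echo "$cpu: policy max $policy_max kHz is below hardware max $hw_max kHz"
        else
            echo "$cpu: policy max matches hardware max ($hw_max kHz)"
        fi
    done
}
```

Calling report_clamped_cpus with no argument reads the live sysfs tree. A clamped result does not say who lowered the ceiling (user, daemon, or firmware), so it is only a first hint.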
Other symptoms include:
. every action (including typing) becomes sluggish
. X uses an abnormally high percentage of CPU
. time ticks faster. For example "sleep XY" after throttling is approximately 1.75 times faster than "sleep XY" before throttling. (This doesn't happen during normal frequency scaling).
. fans are forced to high.

Some information about my system:

$ acpitool -c
CPU type : Genuine Intel(R) CPU T2400 @ 1.83GHz
Min/Max frequency : 1833/1833 MHz
Current frequency : 1833 MHz
Frequency governor : performance
Freq. scaling driver : acpi-cpufreq
Cache size : 2048 KB
Bogomips : 3662.69
Bogomips : 3663.30
# of CPU's found : 2

Processor ID : 0
Bus mastering control : yes
Power management : yes
Throttling control : yes
Limit interface : yes
Active C-state : C0
C-states (incl. C0) : 3
Usage of state C1 : 549980 (10.6 %)
Usage of state C2 : 4622981 (89.2 %)
T-state count : 8
Active T-state : T0

Processor ID : 1
Bus mastering control : yes
Power management : yes
Throttling control : yes
Limit interface : yes
Active C-state : C0
C-states (incl. C0) : 3
Usage of state C1 : 305218 (6.0 %)
Usage of state C2 : 4737969 (93.9 %)
T-state count : 8
Active T-state : T0

$ lshw -c cpu
WARNING: you should run this program as super-user.
*-cpu
  product: Genuine Intel(R) CPU T2400 @ 1.83GHz
  vendor: Intel Corp.
  physical id: 1
  bus info: cpu@0
  version: 6.14.8
  serial: 0000-06E8-0000-0000-0000-0000
  size: 1833MHz
  capacity: 1833MHz
  width: 32 bits
  capabilities: fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc arch_perfmon bts aperfmperf pni monitor vmx est tm2 xtpr pdcm cpufreq
  configuration: id=0
  *-logicalcpu:0
    description: Logical CPU
    physical id: 0.1
    width: 32 bits
    capabilities: logical
  *-logicalcpu:1
    description: Logical CPU
    physical id: 0.2
    width: 32 bits
    capabilities: logical

Additional info:
. I can't figure out why I'm only seeing these symptoms now. There haven't been any important software updates recently, I haven't touched the BIOS or other operating systems, nor have I modified the hardware.

Thoughts:
Since my bug is almost identical to the kernel bug (see second link) of about two years ago, this is probably a kernel issue, most likely caused when the kernel is unaware that the BIOS changes the CPU frequency. I'm not really sure what to do about this. Most likely I am the only one with these symptoms, and I doubt I have the time to test patches and rebuild kernels.
Closed by Jan de Groot (JGC)
Tuesday, 28 September 2010, 14:34 GMT
Reason for closing: No response
Additional comments about closing: No activity in +1 month. Original reporter no longer affected.
"processor.ignore_ppc=1" helps *somewhat*
. seems to keep the processor in the C0 state and prevent (random) switching to other P-states. Therefore, clock frequency and voltage are maintained at maximum, and "cpufreq-set" and "cpufreqd" work as expected.
. does not affect T-states. Therefore, my processor will (randomly) be throttled to a T6 state - at 25% performance. This is veeery slow. (I guess this explains why the system clock was running faster and I/O seemed to lag)
. does not affect BIOS overriding fan speeds. Not even i8kmon can modify the speeds. (it's noisy, but I don't really care)
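For anyone trying the same workaround, here is a small sketch for confirming that the option actually reached the running kernel. It only looks for the literal "processor.ignore_ppc=1" token on the kernel command line; the function name is made up for illustration, and the optional argument exists so a command line can be passed in as text instead of being read from /proc/cmdline.

```shell
#!/bin/sh
# Succeeds (exit status 0) if the kernel command line contains the
# processor.ignore_ppc=1 option quoted in the comment above.
# $1 (optional): command line text; defaults to the running kernel's.
cmdline_has_ignore_ppc() {
    line=${1-$(cat /proc/cmdline 2>/dev/null)}
    # Pad with spaces so the option only matches as a whole word
    case " $line " in
        *" processor.ignore_ppc=1 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Usage would be along the lines of: cmdline_has_ignore_ppc && echo "ignore_ppc is set". The option itself goes on the kernel line of the boot loader configuration and takes effect after a reboot.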
Apparently the acpi-cpufreq module outputs some information, but I haven't gotten around to checking the _PPC ACPI information from my laptop. Perhaps that information could help clarify why this is happening.
As a stop-gap measure, does anyone know how to manually switch T-states?
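As a side note on the T6 figure mentioned above: with the 8 T-states that acpitool reports, 25% matches what one would get if each T-state removes an equal share of the duty cycle. The sketch below only does that arithmetic. The linear-step model is an assumption and the function name is hypothetical; real percentages come from the firmware tables.

```shell
#!/bin/sh
# Approximate performance (in percent, rounded down) of T-state $1 out
# of $2 T-states, assuming evenly spaced duty-cycle steps.
tstate_percent() {
    echo $(( ($2 - $1) * 100 / $2 ))
}
```

For example, "tstate_percent 6 8" gives 25, in line with the T6 observation, and "tstate_percent 0 8" gives 100 for the unthrottled state.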
This doesn't mean the actual cause still isn't valid. To rehash and clarify:
My laptop was 'ratcheted' into the highest (lowest performing) P-state. For example, once it entered a C3 state, it would never fall back to a {C0,C1,C2} state, no matter how thermally cool it was. I mean, I could throw my laptop into the freezer (10-15C) and it still wouldn't switch back to its native P-state. Now, is this a kernel issue or a BIOS issue? I'm not sure.
The other problems listed are invalid:
i8k has been broken since 2.6.33 or 2.6.34, different issue: https://bbs.archlinux.org/viewtopic.php?id=96356
cpu T-state throttling *was* working correctly