FS#10106 - CPU frequency stuck at the lowest one

Attached to Project: Arch Linux
Opened by Nicolas Bigaouette (big_gie) - Monday, 07 April 2008, 17:32 GMT
Last edited by Jan de Groot (JGC) - Tuesday, 17 June 2008, 09:23 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Tobias Powalowski (tpowa)
Thomas Bächler (brain0)
Architecture x86_64
Severity High
Priority Normal
Reported Version 2007.08-2
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 1
Private No

Details

Description:
CPU frequency scaling is not working correctly. On roughly 80% of boots, I am stuck at the lowest frequency. I could not find a common denominator among the "wrong" boots. I'm out of ideas.

> cpufreq-info
cpufrequtils 002: cpufreq-info (C) Dominik Brodowski 2004-2006
Report errors and bugs to linux@brodo.de, please.
analyzing CPU 0:
driver: acpi-cpufreq
CPUs which need to switch frequency at the same time: 0 1
hardware limits: 800 MHz - 2.00 GHz
available frequency steps: 2.00 GHz, 2.00 GHz, 1.60 GHz, 1.20 GHz, 800 MHz
available cpufreq governors: ondemand, performance
current policy: frequency should be within 800 MHz and 800 MHz.
The governor "ondemand" may decide which speed to use
within this range.
current CPU frequency is 800 MHz.
analyzing CPU 1:
driver: acpi-cpufreq
CPUs which need to switch frequency at the same time: 0 1
hardware limits: 800 MHz - 2.00 GHz
available frequency steps: 2.00 GHz, 2.00 GHz, 1.60 GHz, 1.20 GHz, 800 MHz
available cpufreq governors: ondemand, performance
current policy: frequency should be within 800 MHz and 800 MHz.
The governor "ondemand" may decide which speed to use
within this range.
current CPU frequency is 800 MHz.
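
The same limits can also be read straight from sysfs, which makes the clamped policy easy to spot from a script. What follows is only a minimal Python sketch, assuming the usual cpufreq sysfs layout (cpuinfo_max_freq and scaling_max_freq under /sys/devices/system/cpu/cpuN/cpufreq). The function names and the `root` parameter are hypothetical, added so the check can be pointed at a test directory:

```python
from pathlib import Path


def read_khz(path):
    """Return the integer kHz value stored in a cpufreq sysfs file."""
    return int(Path(path).read_text().strip())


def policy_is_clamped(cpu=0, root="/sys/devices/system/cpu"):
    """Return True when the policy ceiling is below the hardware ceiling.

    `root` is a hypothetical parameter (not part of any tool); it defaults
    to the usual sysfs location and can be overridden for testing.
    """
    base = Path(root) / f"cpu{cpu}" / "cpufreq"
    hw_max = read_khz(base / "cpuinfo_max_freq")      # hardware limit
    policy_max = read_khz(base / "scaling_max_freq")  # current policy limit
    return policy_max < hw_max
```

On a boot like the one shown above, `policy_is_clamped(0)` would be expected to return True, since the policy maximum (800 MHz) is below the hardware maximum (2.00 GHz).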

My processor info:
> cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Duo CPU T7300 @ 2.00GHz
stepping : 10
cpu MHz : 800.000
cache size : 4096 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm ida
bogomips : 3997.13
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Duo CPU T7300 @ 2.00GHz
stepping : 10
cpu MHz : 800.000
cache size : 4096 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm ida
bogomips : 3991.28
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

See also:
http://bbs.archlinux.org/viewtopic.php?id=44153


Additional info:
* package version(s): kernel26 2.6.24.4-1, but the problem has existed for a couple of weeks (maybe months?)
* config and/or log files etc.

I am attaching two dmesg logs: "dmesg_20080407_13h26_800MHz.log", from a boot stuck at 800MHz, and "dmesg_20080316_14h34_2GHz.log", from the last time the computer booted at 2GHz.

I really need my 2GHz; I cannot run simulations at 40% of the maximum speed.

To investigate further, I would need cpufreq debugging, which is not enabled in kernel26:
# CONFIG_CPU_FREQ_DEBUG is not set

Thank you.

Closed by  Jan de Groot (JGC)
Tuesday, 17 June 2008, 09:23 GMT
Reason for closing:  Not a bug
Additional comments about closing:  Hardware issue. Next time use the correct power supply.
Comment by Jan de Groot (JGC) - Monday, 07 April 2008, 21:20 GMT
There are more issues with these Dell machines:
http://bugs.archlinux.org/task/9955 lists an issue where only one core comes back online after resume.
Comment by Nicolas Bigaouette (big_gie) - Monday, 07 April 2008, 21:32 GMT
I know, I've reported both :P

Removing the module mentioned in bug #9955 might fix a related issue, as was suggested around the web. But it is not the same problem, or at least not the same symptoms...
Comment by Nicolas Bigaouette (big_gie) - Wednesday, 09 April 2008, 00:29 GMT
I booted my computer this morning on the bus (on battery, nothing attached, wireless switch off) and got 2GHz.

I booted again while attached to the docking station (with USB mouse/keyboard, USB webcam, and eth0 connected; wireless switch still off) and again I have 2GHz. Maybe it's the wireless driver (iwlwifi, which I think is in the kernel?).

I'm attaching a diff of the dmesg output from an 800MHz boot and a 2GHz boot. I'm also attaching a list of loaded modules...

Looking on the web, somebody suggested it could be a buggy DSDT. I'm using a "corrected" one which I compiled myself. It doesn't seem to affect this problem, though...

Somebody suggested a kernel boot option to disable the Windows 2006 (Vista) OSI string (acpi_osi="!Windows 2006"), but I got a kernel panic right at the beginning of the boot process.

Somebody else suggested it could be dust in the GPU fan (I thought Intel's integrated video did not have fans?). My video is an Intel X3100:
00:00.0 Host bridge: Intel Corporation Mobile PM965/GM965/GL960 Memory Controller Hub (rev 0c)
00:02.0 VGA compatible controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (rev 0c)
00:02.1 Display controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (rev 0c)

Comment by Nicolas Bigaouette (big_gie) - Wednesday, 09 April 2008, 00:51 GMT
Comment by Nicolas Bigaouette (big_gie) - Monday, 21 April 2008, 17:54 GMT
I compiled my own kernel with CONFIG_CPU_FREQ_DEBUG=y. Booting it with the "cpufreq.debug=7" kernel option, I get the following dmesg.

Look at lines 402 and up:
freq-table: setting show_table for cpu 0 to ffff81007f1bf900
cpufreq-core: CPU 1 already managed, adding link
cpufreq-core: setting new policy for CPU 0: 800000 - 2001000 kHz
acpi-cpufreq: acpi_cpufreq_verify
freq-table: request for verification of policy (800000 - 2001000 kHz) for cpu 0
freq-table: verification lead to (800000 - 2001000 kHz) for cpu 0
acpi-cpufreq: acpi_cpufreq_verify
freq-table: request for verification of policy (800000 - 800000 kHz) for cpu 0
freq-table: verification lead to (800000 - 800000 kHz) for cpu 0
cpufreq-core: new min and max freqs are 800000 - 800000 kHz

cpufreq-core sets the policy to 800MHz-2GHz, and this is verified by "acpi_cpufreq_verify"; but then another verification request is made and the new limits are set to 800MHz-800MHz...
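
To find every place where this happens in a long dmesg, the log can be scanned for "new min and max freqs" lines whose two values are equal. A minimal Python sketch, assuming the cpufreq.debug line format shown above; the names `LIMITS_RE` and `find_pinned_policies` are hypothetical and not part of any existing tool:

```python
import re

# Matches debug lines such as:
#   cpufreq-core: new min and max freqs are 800000 - 800000 kHz
LIMITS_RE = re.compile(r"new min and max freqs are (\d+) - (\d+) kHz")


def find_pinned_policies(dmesg_text):
    """Return (line_number, freq_khz) for each policy whose min equals its max."""
    pinned = []
    for lineno, line in enumerate(dmesg_text.splitlines(), start=1):
        match = LIMITS_RE.search(line)
        if match and match.group(1) == match.group(2):
            pinned.append((lineno, int(match.group(1))))
    return pinned
```

Feeding it the text of a saved dmesg file (for example via `open(path).read()`) gives the line numbers to look at; an empty list would mean the policy was never pinned to a single frequency in that log.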
Comment by Scott H (stonecrest) - Saturday, 26 April 2008, 06:01 GMT
I'm glad I stumbled upon this; it seems very similar to my situation. I'm on a Dell Latitude C400. For about the past 2-3 months, usually within 5 minutes of booting up, the laptop will suddenly scale back to its lowest frequency and get stuck in this state. Trying to force the frequency/governor, reloading acpi-cpufreq, or any other means has no effect. If I shut down and boot back up, things are mostly fine. But come the next night, the first boot will again have scaling issues. The scaling issue is always accompanied by my fan going into high gear and, likewise, getting stuck in that state.

I'm currently recompiling the kernel with CPU_FREQ_DEBUG and ACPI_DEBUG, and then I'll be rebooting with both "cpufreq.debug=7" and "acpi.debug_level=0x1f". I'll post if these shed any new light on the matter beyond what has already been posted.
Comment by Scott H (stonecrest) - Saturday, 26 April 2008, 06:03 GMT
Also, I should point out that I'm using the 2.6.25 kernel in testing and it doesn't help.
Comment by Scott H (stonecrest) - Sunday, 27 April 2008, 16:16 GMT
I get pretty much the same output from dmesg. It looks like it triggers for you during booting, while it triggers for me around 5 minutes after booting. I searched bugzilla.kernel.org and didn't find a report about this, so I filed one upstream:

http://bugzilla.kernel.org/show_bug.cgi?id=10564
Comment by Andrew Yates (andrewy) - Sunday, 27 April 2008, 17:16 GMT
Has anyone confirmed that this bug exists in the vanilla kernel?
Comment by Scott H (stonecrest) - Sunday, 27 April 2008, 20:27 GMT
Yes, I have triggered the bug with the following line commented out of the kernel26 PKGBUILD:

# Add -ARCH patches
# See http://projects.archlinux.org/git/?p=linux-2.6-ARCH.git;a=summary
# patch -Np1 -i $startdir/src/${_patchname} || return 1
Comment by Nicolas Bigaouette (big_gie) - Friday, 16 May 2008, 14:16 GMT
After less than a month without seeing the bug, I am facing the problem again with 2.6.25.3.

It is easier to reproduce now: when I am plugged in with AC, I'm stuck at 800MHz. If I unplug or dock the laptop, ondemand is back to the full frequency range...
Comment by Franco (francux) - Sunday, 08 June 2008, 16:24 GMT
I had a similar problem several times. I have a Pentium M, and sometimes it began to run at its slowest speed. I reset the BIOS to default settings and everything worked fine...
Comment by Glenn Matthys (RedShift) - Friday, 13 June 2008, 13:52 GMT
If you don't need frequency scaling, you can usually disable it in the system's BIOS. The name may not always be obvious; it may also be called EIST, SpeedStep, Enhanced SpeedStep, ACPI C states, etc. Disabling it in the BIOS will prevent any frequency scaling from happening.

This bug also occurs on systems that only have support for ACPI C states enabled; I experience the same problem on my Core 2 Duo desktop system with an Intel X38 chipset.
Comment by Nicolas Bigaouette (big_gie) - Friday, 13 June 2008, 16:57 GMT
Hmm, I'm still facing this problem. When the computer is:
1) connected to AC: stuck at 800MHz
2) connected to the dock: full range from 800MHz to 2GHz
3) on battery: full range from 800MHz to 2GHz
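
The three cases above could be recorded at each boot by reading the AC state and the policy ceiling together. This is only a minimal Python sketch, assuming a kernel that exposes /sys/class/power_supply (with "type" and "online" files per supply) and the usual cpufreq sysfs files. The function names and the `psu_root` / `cpu_root` parameters are hypothetical. Note that it cannot tell a dock from a plain adapter, nor the adapter's wattage:

```python
from pathlib import Path


def ac_online(psu_root="/sys/class/power_supply"):
    """Return True if any supply of type 'Mains' reports online = 1."""
    for supply in Path(psu_root).iterdir():
        type_file = supply / "type"
        online_file = supply / "online"
        # Batteries and other supplies may lack an 'online' file; skip them.
        if not (type_file.exists() and online_file.exists()):
            continue
        if type_file.read_text().strip() == "Mains":
            if online_file.read_text().strip() == "1":
                return True
    return False


def snapshot(cpu_root="/sys/devices/system/cpu",
             psu_root="/sys/class/power_supply"):
    """Return the AC state and cpu0's policy maximum (kHz) as a dict."""
    max_file = Path(cpu_root) / "cpu0" / "cpufreq" / "scaling_max_freq"
    return {
        "ac_online": ac_online(psu_root),
        "scaling_max_khz": int(max_file.read_text().strip()),
    }
```

Calling `snapshot()` once in each of the three situations and comparing the dictionaries would show whether the 800MHz ceiling tracks the AC state.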

I just played with the BIOS settings, as Glenn suggested. From what I understood of the BIOS text, if I disable SpeedStep it will stay at 800MHz rather than 2GHz. I have not tested it, though.

Inside the BIOS, I reset some warnings about power. I know I had already seen and disabled them, but I wanted to see if they were telling me something. Then on bootup, while connected to AC, I got a warning. It told me that my 65W adapter was not powerful enough and that the battery could take longer to charge. I think the 65W adapter should be powerful enough for the machine, even if it's lower than the 90W default...

But maybe the BIOS sees it as not powerful enough and then disables SpeedStep, while on battery it does not see any adapter and so keeps SpeedStep enabled. On my docking station (only a port replicator, in fact), I have a 90W adapter, so that could be the cause of the problem...

I'm away from home today and will be out for the weekend, so I will not be able to test that at home until Monday night. But I'll try to find a 90W adapter elsewhere to verify.
Comment by Nicolas Bigaouette (big_gie) - Friday, 13 June 2008, 17:15 GMT
OMG that was the problem!!!!!! I just tried with someone else's 90W adapter, and it can go to 2GHz!!!!

But then, I've been using the 65W adapter for almost a year, with the problem appearing only a couple of months ago... a kernel/BIOS update combination, maybe...
Comment by Glenn Matthys (RedShift) - Friday, 13 June 2008, 17:19 GMT
It's probably downscaling because the 65 watt adaptor can't provide enough power to let the processor run reliably at 2 GHz.
Comment by Nicolas Bigaouette (big_gie) - Friday, 13 June 2008, 17:30 GMT
I disabled SpeedStep in the BIOS to check at which frequency it would run. It now runs at 2GHz.

65W should be enough for 2GHz... I've been running with this for a year. It started to act like this only a couple of months ago. I can't remember if it was after a kernel upgrade or a BIOS upgrade... There's a new BIOS, so I'll try it.
Comment by Glenn Matthys (RedShift) - Tuesday, 17 June 2008, 09:07 GMT
Hmm, you're trying to use out-of-spec hardware while the manufacturer clearly recommends otherwise. The laptop probably measures how much power it can get from your adaptor, and since it needs 95 watt for full operation, it tries to get as much power as it can. Because the laptop is trying to draw more power than the adaptor can give, the adaptor is starting to give out (rapid aging) and can't deliver the same amount of power as it could a few months ago. The laptop notices that it gets less power and therefore downscales the processor to reduce power draw and keep the system operating reliably. That's why "it worked" until a few months ago.
