FS#29850 - [linux] 3.3.5 - 3.4.2 Load average much higher double the power usage

Attached to Project: Arch Linux
Opened by Jonas Jelten (TheJJ) - Friday, 11 May 2012, 21:58 GMT
Last edited by Tobias Powalowski (tpowa) - Monday, 05 November 2012, 15:01 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Tobias Powalowski (tpowa)
Thomas Bächler (brain0)
Architecture All
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 24
Private No

Details

Description:

Problem first described here:
https://bbs.archlinux.org/viewtopic.php?pid=1100418

I recently noticed that my power draw and load average were much higher than before.

This has been happening since kernel 3.3.5. My load average at idle is
0.91 0.75 0.68

with just i3 and urxvt running. It was about 0.02 0.01 0.00 before the kernel update.

My power draw has also more than doubled: 7-9 W before, now 15-25 W (at idle).
This also affects the battery runtime slightly (now 3h instead of ~9h).


powertop(2), top, iotop, atop and xrestop show nothing special; I can post details if anyone wants them.


I don't know for sure if it was the kernel update, but I've got two friends with different laptops, and both have the same issues (all three of us have Intel CPUs).


Additional info:
* Linux 3.3.5-1-ARCH #1 SMP PREEMPT Mon May 7 19:57:51 CEST 2012 x86_64 Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz GenuineIntel GNU/Linux


iostat:
Linux 3.3.5-1-ARCH (jjpad) 05/11/2012 _x86_64_ (4 CPU)

avg-cpu: %user %nice %system %iowait %steal %idle
2.65 0.00 0.83 0.04 0.00 96.47

Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 14.90 139.40 71.44 404026 207054


free -m
total used free shared buffers cached
Mem: 7873 1209 6664 0 105 404
-/+ buffers/cache: 699 7173
Swap: 0 0 0

Now, writing the bug report with firefox: load average: 1.04, 0.82, 0.72


Steps to reproduce:
upgrade to 3.3.5, and watch the system load.
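
For anyone who wants to compare numbers across kernels, here is a minimal Python sketch for reading the three load averages and scaling them by CPU count. The helper names (`parse_loadavg`, `per_core`) are made up for illustration, and dividing by the number of CPUs is only a common rule of thumb, not something taken from this report. The sample values are the ones quoted above; on a live Linux system the same parser should accept the contents of /proc/loadavg.

```python
from typing import NamedTuple


class LoadAvg(NamedTuple):
    one: float
    five: float
    fifteen: float


def parse_loadavg(text: str) -> LoadAvg:
    """Parse the first three whitespace-separated fields as load averages."""
    fields = text.split()
    return LoadAvg(float(fields[0]), float(fields[1]), float(fields[2]))


def per_core(load: LoadAvg, cpus: int) -> LoadAvg:
    """Scale each average by the number of logical CPUs (rule of thumb)."""
    return LoadAvg(*(value / cpus for value in load))


if __name__ == "__main__":
    # Values quoted in this report; iostat above shows 4 CPUs.
    # On a live system, pass the contents of /proc/loadavg instead.
    load = parse_loadavg("0.91 0.75 0.68")
    print(load)
    print(per_core(load, 4))
```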

If more information is needed, I will post it.

Closed by  Tobias Powalowski (tpowa)
Monday, 05 November 2012, 15:01 GMT
Reason for closing:  Fixed
Comment by Tobias Powalowski (tpowa) - Sunday, 13 May 2012, 12:30 GMT
Status on 3.3.6?
Comment by Jonas Jelten (TheJJ) - Sunday, 13 May 2012, 15:32 GMT
The problem persists in the stock Arch kernel 3.3.6 from [testing].
Comment by Jonas Jelten (TheJJ) - Thursday, 17 May 2012, 23:27 GMT
Comment by Leonid Isaev (lisaev) - Sunday, 20 May 2012, 18:23 GMT
Yes, it seems that way... I'm not sure that this is a bad thing though: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/985661
Comment by Jonas Jelten (TheJJ) - Monday, 21 May 2012, 23:44 GMT
Part of the problem may be solved with i915 parameters; I'm getting down to 9-12 W with this. Not as good as it once was, but at least the laptop lasts for ~5 h.

What I did was this:
~ $ modinfo i915 | tail -n 16
depends: drm,drm_kms_helper,intel-gtt,i2c-core,video,button,i2c-algo-bit,intel-agp
//{...}
vermagic: 3.4.0-1-ARCH SMP preempt mod_unload modversions
//{...}
parm: i915_enable_rc6:Enable power-saving render C-state 6. Different stages can be selected via bitmask values (0 = disable; 1 = enable rc6; 2 = enable deep rc6; 4 = enable deepest rc6). For example, 3 would enable rc6 and deep rc6, and 7 would enable everything. default: -1 (use per-chip default) (int)
//{...}

This reveals some cool new i915 parameters and, more importantly, that the meaning of the i915_enable_rc6 parameter has CHANGED.


We Intel IGP users set i915_enable_rc6=1 back when it was essential; with the new bitmask meaning, that value now wastes energy.
When I changed it to
=========================================
i915_enable_rc6=7
=========================================
my laptop stopped heating the room with ~5-15W.
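
To make the bitmask concrete, here is a small Python sketch that decodes a value according to the modinfo text quoted above (from the 3.4.0-1-ARCH module; other kernel versions may define the parameter differently). The function name `decode_rc6` and the `RC6_STAGES` table are illustrative only, not part of any kernel interface.

```python
from typing import List

# Bit values as listed in the modinfo output above.
RC6_STAGES = {1: "rc6", 2: "deep rc6", 4: "deepest rc6"}


def decode_rc6(value: int) -> List[str]:
    """Return the RC6 stages selected by an i915_enable_rc6 value."""
    if value < 0:
        # -1 means "use per-chip default" according to the modinfo text.
        return ["per-chip default"]
    return [name for bit, name in RC6_STAGES.items() if value & bit]


if __name__ == "__main__":
    for value in (-1, 0, 1, 3, 7):
        print(value, decode_rc6(value))
```

Under this reading, the old setting of 1 enables only plain rc6, while 7 also enables the two deeper states.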


Ah, by the way, the energy problem was still persistent with the stock 3.4-ARCH kernel and the Intel KMS driver.

The high load is still not fixed.


Powertop Result: total idle, wifi connected, urxvt and i3 running

The battery reports a discharge rate of 9.03 W
The estimated remaining time is 379 minutes

Summary: 31.7 wakeups/second, 0.0 GPU ops/second, 0.0 VFS ops/sec and 0.4% CPU use

Power est. Usage Events/s Category Description
4.44 W 0.0 pkts/s Device Network interface: eth0 (e1000e)
2.93 W 66.7% Device Display backlight
554 mW 11.7 pkts/s Device Network interface: wlan0 (iwlwifi)
68.8 mW 267.3 µs/s 7.8 Interrupt [6] tasklet(softirq)
34.8 mW 80.8 µs/s 4.0 kWork ieee80211_iface_work
23.8 mW 432.0 µs/s 2.7 Process i3status
15.9 mW 11.5 µs/s 1.8 Timer clocksource_watchdog
15.0 mW 5.8 µs/s 1.7 Timer intel_gpu_idle_timer
14.1 mW 214.0 µs/s 1.6 Interrupt [9] acpi
12.3 mW 95.2 µs/s 1.4 Interrupt [1] timer(softirq)
8.82 mW 10.3 µs/s 1.0 kWork pci_pme_list_scan
7.50 mW 96.1 µs/s 0.9 Interrupt [42] i915
6.61 mW 84.3 µs/s 0.8 Process /usr/lib/upower/upowerd
6.17 mW 1.0 ms/s 0.7 Process /usr/bin/X -nolisten tcp vt07 -auth /var/run/slim.auth
etc.
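
As a rough cross-check of the listing above, the three device rows can be converted to watts and compared with the reported discharge rate. This is only a sketch: the helper `to_watts` is made up for illustration, it handles only the two units that appear in this listing, and powertop's per-device figures are estimates rather than measurements.

```python
# Only the units that appear in the listing above.
UNIT_TO_WATTS = {"W": 1.0, "mW": 1e-3}


def to_watts(estimate: str) -> float:
    """Convert a powertop-style estimate such as '554 mW' to watts."""
    number, unit = estimate.split()
    return float(number) * UNIT_TO_WATTS[unit]


# The three device rows from the listing above.
devices = {
    "eth0 (e1000e)": "4.44 W",
    "Display backlight": "2.93 W",
    "wlan0 (iwlwifi)": "554 mW",
}

DISCHARGE_RATE_W = 9.03  # battery discharge rate reported by powertop

total = sum(to_watts(v) for v in devices.values())
share = {name: to_watts(v) / DISCHARGE_RATE_W for name, v in devices.items()}

if __name__ == "__main__":
    print(f"devices total: {total:.3f} W")
    for name, fraction in share.items():
        print(f"{name}: {fraction:.0%} of discharge rate")
```

By this estimate, eth0 accounts for roughly half of the discharge rate even though it shows 0.0 pkts/s.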

Comment by Thomas S (Xight) - Tuesday, 22 May 2012, 12:42 GMT
AMD A8-3870K is affected as well, in both 3.5 & 3.6. Powertop shows nothing unusual; downgrading to 3.4 fixes the issue. I don't have Intel graphics, but I do have an IGP (6550D), so maybe it's an IGP thing.
Comment by Ypnose (Ypnose) - Wednesday, 23 May 2012, 10:23 GMT
Same thing here with a Phenom II X4 on 3.3.5 & 3.3.6. My system is DWM + full CLI (very minimal), and normally my load average is around 0.01 0.03 0.03 at idle.
Yesterday, I noticed the issue. I was around 1.20 1.14 0.98 just with firefox and mumble. Even if the system is in idle (programs disabled), the load is surprisingly high.
Switched to 3.3.4 this morning and it seems to be much better.
Just for your information, I'm running AMD + NVIDIA (Phenom II X4 955 BE + GTX460).
Comment by Tobias Powalowski (tpowa) - Wednesday, 23 May 2012, 13:29 GMT
Status on 3.4.x?
Comment by Jonas Jelten (TheJJ) - Wednesday, 23 May 2012, 15:26 GMT
Still persistent in 3.4-1; just now I have 1.18, 0.83, 0.68 (running i3, firefox, thunderbird, urxvt).
Comment by Leonid Isaev (lisaev) - Wednesday, 23 May 2012, 16:43 GMT
Just to clarify things:
(a) this is not a problem of the kernel using more resources, but rather an issue with how it reports their usage;
(b) I personally haven't noticed any increase in system temperature or decrease in battery life, so those of you who experience higher temps/battery consumption -- look somewhere else.
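
For context on what the reported number is: a Linux load average is, roughly, an exponentially damped moving average of the number of runnable plus uninterruptible tasks, sampled about every 5 seconds. The Python sketch below shows that averaging in floating point. It is a simplified model, not the kernel's code: the kernel uses fixed-point arithmetic and per-CPU bookkeeping, neither of which is modelled here, and the function names are made up for illustration.

```python
import math

SAMPLE_INTERVAL = 5.0  # seconds between samples (approximate)


def update(load: float, active: int, window: float) -> float:
    """One averaging step: decay the old value, blend in the new sample."""
    decay = math.exp(-SAMPLE_INTERVAL / window)
    return load * decay + active * (1.0 - decay)


def simulate(active_counts, window: float = 60.0, start: float = 0.0) -> float:
    """Feed a sequence of active-task counts through the average."""
    load = start
    for count in active_counts:
        load = update(load, count, window)
    return load


if __name__ == "__main__":
    # One task active for a full minute (12 samples), starting from zero.
    print(round(simulate([1] * 12), 2))
```

In this model, one task kept busy for a minute brings the 1-minute average from 0 to about 0.63, so any systematic miscounting of "active" tasks at each sample shifts the reported figure without the system doing more work.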
Comment by Jonas Jelten (TheJJ) - Wednesday, 23 May 2012, 16:45 GMT
>>look somewhere else.
changing i915_enable_rc6=7 helped a lot.
Comment by Raphael Cazenave (Schrod) - Wednesday, 06 June 2012, 14:50 GMT
Hey,
with 3.3.7-1-ARCH, 32-bit, on an AMD X2 5000+,
I have a high load average at idle.
Comment by Leonid Isaev (lisaev) - Thursday, 07 June 2012, 01:04 GMT
@tpowa:
testing/linux-3.4.1-1 fixes the ``problem'' for me... thanks for a quick update.

In idle with only few sshfs mounts (over wpa2 wifi) and open mupdf's load average is like this:
$ uptime
19:52:05 up 3:51, 1 user, load average: 0.02, 0.07, 0.06
After watching an html5 video on the firefox homepage:
$ uptime
19:59:40 up 3:59, 1 user, load average: 0.35, 0.45, 0.24
which is reasonable I guess.

Overall I think that now load averages are reported accurately, albeit of course differently than in versions <=3.3.5.
Comment by Sudhir Khanger (donniezazen) - Thursday, 07 June 2012, 16:34 GMT
I am also noticing a significant increase in power consumption and temperature in 3.4 compared to 3.3. On 3.3 my system would idle at 10 W and 50 °C, but on 3.4 it idles at 15 W and the temperature is very inconsistent, in the 55-60 °C range, which makes the fan run at 3500-4500 RPM.

Kernel - Linux arch 3.4.1-1-ARCH #1 SMP PREEMPT Tue Jun 5 09:05:01 CEST 2012 x86_64 GNU/Linux
Kernel boot line - i915.i915_enable_rc6=1 i915.i915_enable_fbc=1 i915.lvds_downclock=1 drm.vblankoffdelay=1
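
To check which module parameters a boot line actually sets, the tokens can be grouped by module. This is a minimal Python sketch; `module_params` is a hypothetical helper, and it assumes simple space-separated `module.param=value` tokens without quoting. On a running system the same string could be read from /proc/cmdline.

```python
from typing import Dict


def module_params(cmdline: str) -> Dict[str, Dict[str, str]]:
    """Group 'module.param=value' tokens from a kernel command line by module."""
    params: Dict[str, Dict[str, str]] = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        module, dot, name = key.partition(".")
        # Skip tokens without '=' or without a 'module.' prefix in the key.
        if sep and dot:
            params.setdefault(module, {})[name] = value
    return params


# The boot line quoted above.
boot_line = ("i915.i915_enable_rc6=1 i915.i915_enable_fbc=1 "
             "i915.lvds_downclock=1 drm.vblankoffdelay=1")
params = module_params(boot_line)

if __name__ == "__main__":
    for module, values in params.items():
        print(module, values)
```

Note that this boot line still carries i915_enable_rc6=1, the value discussed earlier in this thread.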
Comment by Tobias Powalowski (tpowa) - Wednesday, 13 June 2012, 09:17 GMT
So is this now fixed in 3.4.2 or still there?
Comment by Sudhir Khanger (donniezazen) - Wednesday, 13 June 2012, 15:46 GMT
No, it is not fixed on 3.4.2.
Comment by Ypnose (Ypnose) - Sunday, 17 June 2012, 18:00 GMT
Do you know if kernel's devs are aware about this bug?
Comment by Sudhir Khanger (donniezazen) - Sunday, 17 June 2012, 18:02 GMT
Tobias Powalowski is the Arch maintainer. I am not sure if this bug has been reported upstream.

Sadly, it's been pushed to stable/core with all terrible power problems.
Comment by Tobias Powalowski (tpowa) - Sunday, 17 June 2012, 18:13 GMT
I don't report bugs upstream that I don't hit myself.
Comment by Leonid Isaev (lisaev) - Sunday, 17 June 2012, 18:33 GMT
> Do you know if kernel's devs are aware about this bug?

Yes, read the launchpad report above.

> Sadly, it's been pushed to stable/core with all terrible power problems.

If there are power problems, they are "routine" (because drivers change, kernel's power consumption usually fluctuates as opposed to decreasing monotonically with increasing version) and not related to the loadavg readings.
Comment by indianahorst (indianahorst) - Wednesday, 20 June 2012, 11:06 GMT
Still not fixed in 3.4.3-1. It is really annoying.

Tobias Powalowski: why do you maintain the kernel package when you don't report bugs to the kernel developers?
Comment by Tobias Powalowski (tpowa) - Wednesday, 20 June 2012, 11:22 GMT
I report bugs if I hit them myself; I cannot provide any debugging information for issues I don't experience myself.
Comment by Krzysztof Majzerowicz-Jaszcz (Cristos666) - Sunday, 24 June 2012, 14:09 GMT
I can confirm that with 3.4.4-1 it is not fixed.
Still getting the same issues as described above: high system load when idling, significantly reduced battery life, and CPU temperature increased by almost 10 degrees (when idling).
This happens BOTH on my Lenovo g560 and Asus EeePC 1201NL.
Comment by Karol Błażewicz (karol) - Sunday, 24 June 2012, 15:03 GMT
I'm using linux 3.4.3-1 and things are OK on my 32-bit Arch.
Comment by Krzysztof Majzerowicz-Jaszcz (Cristos666) - Sunday, 24 June 2012, 15:10 GMT
OK, here's some more info:

uname -a
Linux cri-arch 3.4.4-1-ARCH #1 SMP PREEMPT Sat Jun 23 10:53:18 CEST 2012 x86_64 GNU/Linux

lspci
00:00.0 Host bridge: Intel Corporation Core Processor DRAM Controller (rev 02)
00:01.0 PCI bridge: Intel Corporation Core Processor PCI Express x16 Root Port (rev 02)
00:16.0 Communication controller: Intel Corporation 5 Series/3400 Series Chipset HECI Controller (rev 06)
00:1a.0 USB controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 05)
00:1b.0 Audio device: Intel Corporation 5 Series/3400 Series Chipset High Definition Audio (rev 05)
00:1c.0 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 1 (rev 05)
00:1c.1 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 2 (rev 05)
00:1c.2 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 3 (rev 05)
00:1c.4 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 5 (rev 05)
00:1d.0 USB controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 05)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev a5)
00:1f.0 ISA bridge: Intel Corporation Mobile 5 Series Chipset LPC Interface Controller (rev 05)
00:1f.2 SATA controller: Intel Corporation 5 Series/3400 Series Chipset 4 port SATA AHCI Controller (rev 05)
00:1f.3 SMBus: Intel Corporation 5 Series/3400 Series Chipset SMBus Controller (rev 05)
00:1f.6 Signal processing controller: Intel Corporation 5 Series/3400 Series Chipset Thermal Subsystem (rev 05)
01:00.0 VGA compatible controller: NVIDIA Corporation GT218 [GeForce 310M] (rev a2)
01:00.1 Audio device: NVIDIA Corporation High Definition Audio Controller (rev a1)
06:00.0 Network controller: Broadcom Corporation BCM4313 802.11b/g/n Wireless LAN Controller (rev 01)
07:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8101E/RTL8102E PCI Express Fast Ethernet controller (rev 02)
ff:00.0 Host bridge: Intel Corporation Core Processor QuickPath Architecture Generic Non-core Registers (rev 02)
ff:00.1 Host bridge: Intel Corporation Core Processor QuickPath Architecture System Address Decoder (rev 02)
ff:02.0 Host bridge: Intel Corporation Core Processor QPI Link 0 (rev 02)
ff:02.1 Host bridge: Intel Corporation Core Processor QPI Physical 0 (rev 02)
ff:02.2 Host bridge: Intel Corporation Core Processor Reserved (rev 02)
ff:02.3 Host bridge: Intel Corporation Core Processor Reserved (rev 02)

yaourt -Q nvidia
extra/nvidia 302.17-1

more info available on request :)
Comment by indianahorst (indianahorst) - Sunday, 24 June 2012, 15:47 GMT
It is really strange. I hoped this bug would disappear if I changed the kernel from linux 3.4.3 to linux-lts 3.0.36-1. But I noticed that the load was still between 0.5 and 1.0 (or higher), and the power consumption was also 4-5 W higher than normal (normal = core/linux 3.3.3-1).

So the load seems to depend not only on the kernel version being > 3.3.5, but also on some unknown factor. Or is it possible that the scheduler has also been changed in the linux-lts version?
Comment by Sudhir Khanger (donniezazen) - Sunday, 24 June 2012, 16:28 GMT
I tried linux-mainline 3.5rc3 from AUR and it is no better than what we have now. Temperature and load both seem to be very high in 3.5rc3.
Comment by Krzysztof Majzerowicz-Jaszcz (Cristos666) - Sunday, 24 June 2012, 16:34 GMT
I just tried the LTS kernel and still got the same issue - high loads, higher cpu temp, as F Berberich said. Perhaps some changes were backported to LTS line of kernels ?
Comment by Ypnose (Ypnose) - Sunday, 24 June 2012, 19:20 GMT
Someone reported on the forum, he fixed this issue by switching to ck patchset (3.4 kernel).
For now, I stay on 3.3.4.
Comment by Sudhir Khanger (donniezazen) - Sunday, 24 June 2012, 19:36 GMT
Hey Hugues Tranli, I wouldn't say I have seen any improvements on linux-ck 3.4.
Comment by Ypnose (Ypnose) - Monday, 25 June 2012, 20:39 GMT
@Sudhir Khanger: Did you enable the alternate BFQ Scheduler as the wiki says?
Comment by Sudhir Khanger (donniezazen) - Tuesday, 26 June 2012, 04:03 GMT
@Ypnose I first noticed a high temp and high power usage problems on linux-ck. I tried it again, this afternoon, with bfq scheduler, and at one point of time my system was running at 65C instead of 50C. I am currently, happily, running linux-ice from AUR which keeps my system at 50C and less than 10W of power usage. I don't know how to compile a 3.3 kernel. I am looking into it, I should not have upgraded.
Comment by Jonas Jelten (TheJJ) - Sunday, 01 July 2012, 17:27 GMT
still there in 3.4.4-2..
Comment by Steve (Roken) - Wednesday, 04 July 2012, 00:45 GMT
The problem remains on 3.4.4.2 (Phenom II X4 965 BE) stock and ck, whether or not bfs is enabled on ck.
Comment by Krzysztof Majzerowicz-Jaszcz (Cristos666) - Wednesday, 04 July 2012, 08:15 GMT
ACK, 3.4.4-2 both normal and CK (regardless of BFS enabled or not) - problem still persists.
Comment by Peter (sehnpaa) - Sunday, 15 July 2012, 06:24 GMT
Linus just released 3.5-rc7 with "the loadavg calculation fix patch".
https://lkml.org/lkml/2012/7/14/186
Comment by Steve (Roken) - Sunday, 15 July 2012, 12:14 GMT
Afraid not. Just built and installed rc7 - system idle for 20 minutes and 1 minute average still reading 1.76. On 3.2 kernel this would be 0.8.
Comment by Sudhir Khanger (donniezazen) - Sunday, 15 July 2012, 15:28 GMT
09:27:48 up 6:34, 1 user, load average: 0.01, 0.05, 0.06

Temperature is still too much to be usable.
Comment by Ypnose (Ypnose) - Sunday, 15 July 2012, 15:34 GMT
I switched to 3.4.4 because of the glibc update. In spite of the high load average, my CPU isn't warmer than on 3.3.4. (Phenom II X4 955)
And this issue becomes really boring...
Comment by Ypnose (Ypnose) - Sunday, 15 July 2012, 16:48 GMT
Tried 3.5-rc; as we said before, bug still there
Comment by Ypnose (Ypnose) - Sunday, 22 July 2012, 10:06 GMT
What do you think about the 3.4.6 kernel? I saw something about load average in the changelog.
Now, my load seems to be better even if it's a bit different compared to 3.3.4.
It's still higher than before...
Comment by Krzysztof Majzerowicz-Jaszcz (Cristos666) - Sunday, 22 July 2012, 12:49 GMT
Yeah, system load seems better now, however, CPU temp and energy consumption still very high :(.
Comment by Jonas Jelten (TheJJ) - Sunday, 22 July 2012, 13:39 GMT
Load is indeed better now, but I really don't understand what makes the laptop consume that much more energy.
The CPU (Sandy Bridge i5-2520) is in C7 sleep 99% of the time. No 2D/3D load except urxvt.

Anyone already figured out what exactly is wasting the electricity?

My bets: PCIe or i915, but since users with AMD hardware are also affected, I think it's some ACPI or PCIe ASPM issue.

By the way, Bill Gates's opinion on ACPI once was this (1999): http://antitrust.slated.org/www.iowaconsumercase.org/011607/3000/PX03020.pdf
Comment by Krzysztof Majzerowicz-Jaszcz (Cristos666) - Sunday, 22 July 2012, 13:53 GMT
My system was configured with the "pcie_aspm=force" kernel option. Now, with or without this setting, power consumption is still quite high, as is CPU temperature (around 10 °C more than it was before the 3.3.2 kernel).
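
To see which ASPM policy is actually in effect, independent of the boot option, the kernel's policy file can be inspected. The Python sketch below assumes that /sys/module/pcie_aspm/parameters/policy exists on the system and lists the available policies with the active one in square brackets; the sample string and the helper name `active_aspm_policy` are illustrative, not taken from this thread.

```python
import re
from typing import Optional


def active_aspm_policy(text: str) -> Optional[str]:
    """Return the bracketed (active) entry from an ASPM policy line, if any."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Illustrative sample; on a live system, read the text from
    # /sys/module/pcie_aspm/parameters/policy (assumed to exist).
    sample = "default performance [powersave]"
    print(active_aspm_policy(sample))
```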
Comment by Sudhir Khanger (donniezazen) - Sunday, 22 July 2012, 19:27 GMT
Cristos666, the ASPM fix is no longer required. It is included in the latest kernels and has been backported to 3.2 LTS. I use these kernel parameters and they work very well on any pre-3.4 kernel.

https://wiki.ubuntu.com/Kernel/PowerManagement/PowerSavingTweaks
Comment by Ypnose (Ypnose) - Monday, 23 July 2012, 19:05 GMT
Finally, after two days of use, I don't understand what's going on...
Watching videos on Youtube + HTML5 and doing work on terminals, my load is around 1.2 last minute and 0.97 last five minutes.
My setup is minimal as hell (DWM), I really don't get it.
Right now (writing these lines), I have 0.18 0.24 0.23 (compared to 0.01 0.03 0.05 on 3.3.4).
Comment by Leonid Isaev (lisaev) - Monday, 23 July 2012, 19:16 GMT
"Me too" comments don't help unless you provide additional info about your system.

Please note:
(a) DWM/i3/openbox/... are NOT minimal. It's a myth. They are all very inefficient when it comes to window redrawing compared to more advanced window managers in major DEs.
(b) Commit 7490d0a4cfefa16f9d8ce636eb5b2e13d2432db3 in linux 3.4.6.

For example, I have right after logging to flyspray:
$ uptime
14:12:14 up 4:21, 1 user, load average: 0.16, 0.12, 0.08
and after typing this message
$ uptime
14:15:00 up 4:24, 1 user, load average: 0.01, 0.07, 0.06
Comment by Jonas Jelten (TheJJ) - Friday, 10 August 2012, 23:52 GMT
I'm on 3.5.1-1-ARCH now and still having power usage issues. Is anyone else affected?
(x220t, intel i5-2520M)
Comment by Sudhir Khanger (donniezazen) - Friday, 10 August 2012, 23:55 GMT
Yep, I am running 3.3 due to power usage problem.
Comment by Jonas Jelten (TheJJ) - Saturday, 11 August 2012, 01:11 GMT
Please contribute your hardware specs here; this damn bug needs to be wiped out.

https://bbs.archlinux.org/viewtopic.php?pid=1144709
Comment by Greg (dolby) - Monday, 15 October 2012, 02:34 GMT
Is this still a problem with 3.6?
Comment by Steve (Roken) - Monday, 15 October 2012, 06:30 GMT
Certainly here things seem to be much better with 3.6
