FS#79300 - [glibc] 2.38 posix_memalign regression

Attached to Project: Arch Linux
Opened by lopster (lopster) - Saturday, 05 August 2023, 01:30 GMT
Last edited by freswa (frederik) - Thursday, 17 August 2023, 15:52 GMT
Task Type Bug Report
Category Upstream Bugs
Status Closed
Assigned To Giancarlo Razzolini (grazzolini)
freswa (frederik)
Architecture All
Severity Medium
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 25
Private No

Details

Description: This is probably an upstream issue but glibc seems to want bugs to start with the distro. If someone here can at least reproduce this I guess that would be good.

Beginning with the glibc 2.38 update, I started seeing up to 100% CPU utilization in mpv when streaming videos using yt-dlp. I attached gdb to the process and the mpv demux thread is stuck looping with this stack trace on any break:

#0 chunk_ok_for_memalign (p=p@entry=0x7fffc4fd5c00, alignment=alignment@entry=64, nb=nb@entry=32) at malloc.c:5010
#1 0x00007ffff509cc47 in _int_memalign (av=av@entry=0x7fffc4000030, alignment=alignment@entry=64, bytes=bytes@entry=24) at malloc.c:5116
#2 0x00007ffff509d442 in _mid_memalign (alignment=alignment@entry=64, bytes=bytes@entry=24, address=<optimized out>) at malloc.c:3636
#3 0x00007ffff509ed23 in __posix_memalign (size=24, alignment=64, memptr=0x7fffcb91ac20) at malloc.c:5905
#4 __posix_memalign (memptr=0x7fffcb91ac20, alignment=64, size=24) at malloc.c:5889
#5 0x00007ffff5c52126 in av_malloc () from target:/usr/lib/libavutil.so.58
#6 0x00007ffff5c522fe in av_mallocz () from target:/usr/lib/libavutil.so.58
#7 0x00007ffff5c27073 in av_buffer_ref () from target:/usr/lib/libavutil.so.58
#8 0x00005555555d509f in new_demux_packet_from_buf (buf=0x7fffc5b4e700) at ../mpv/demux/packet.c:104
#9 0x00005555555c715a in handle_block (block_info=0x7fffcb91ad40, demuxer=0x7fffc40a0ee0) at ../mpv/demux/demux_mkv.c:2709
#10 demux_mkv_read_packet (demuxer=0x7fffc40a0ee0, pkt=<optimized out>) at ../mpv/demux/demux_mkv.c:2958
#11 0x00005555555b4d5b in read_packet (in=0x7fffc4057f80) at ../mpv/demux/demux.c:2284
#12 thread_work (in=0x7fffc4057f80) at ../mpv/demux/demux.c:2559
#13 0x00005555556d7460 in demux_read_any_packet.isra.0 (demuxer=<optimized out>) at ../mpv/demux/demux.c:2839
#14 0x00005555555cfb0a in do_read_next_packet (demuxer=0x7fffc4024070, src=0x7fffc4024bc0) at ../mpv/demux/demux_timeline.c:276
#15 0x00005555555d001b in do_read_next_packet (src=0x7fffc4024bc0, demuxer=0x7fffc4024070) at ../mpv/demux/demux_timeline.c:374
#16 d_read_packet (demuxer=0x7fffc4024070, out_pkt=0x7fffcb91b720) at ../mpv/demux/demux_timeline.c:394
#17 0x00005555555b4d5b in read_packet (in=0x7fffc408d560) at ../mpv/demux/demux.c:2284
#18 thread_work (in=in@entry=0x7fffc408d560) at ../mpv/demux/demux.c:2559
#19 0x00005555555b5160 in demux_thread (pctx=0x7fffc408d560) at ../mpv/demux/demux.c:2577
#20 0x00007ffff508c9eb in start_thread (arg=<optimized out>) at pthread_create.c:444
#21 0x00007ffff511123c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

chunk_ok_for_memalign is new in glibc 2.38.

Steps to reproduce:

Start a video streaming with a command like this (videos from any site supported by yt-dlp seem to cause it):

mpv --demuxer-max-bytes=300000k https://www.pbs.org/video/australia-animals-with-cameras-637xr7/

Pause the video to allow the demuxer cache to fill.

Watch top/htop to see CPU usage spike. It sometimes helps to pause/resume the video a few times. Eventually it can get stuck at 100% of a core. mpv starts having performance issues as well.
This task depends upon

Closed by  freswa (frederik)
Thursday, 17 August 2023, 15:52 GMT
Reason for closing:  Fixed
Additional comments about closing:  glibc-2.38-3
Comment by Toolybird (Toolybird) - Saturday, 05 August 2023, 01:52 GMT
Possibly related [1], so yeah, looks like an upstream issue.

[1] https://sourceware.org/pipermail/libc-alpha/2023-August/150653.html
Comment by lopster (lopster) - Saturday, 05 August 2023, 02:07 GMT
Thank you, that sounds like it. I did some searching but didn't find that thread.
Comment by Toolybird (Toolybird) - Saturday, 05 August 2023, 02:12 GMT Comment by Toolybird (Toolybird) - Saturday, 05 August 2023, 02:41 GMT
Upstream ticket for tracking [1]

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=30723
Comment by lopster (lopster) - Saturday, 05 August 2023, 19:43 GMT
Use this at your own risk.

Here's an easy binary patch that seems to work around this issue for the arch glibc 2.38-2 build:

Copy /usr/lib/libc.so.6 somewhere. Load it in a hex editor.

The bytes at offset 9ca05 should be hex 74 79.

Change them to EB 79.

Prior to running mpv, execute this command:

export LD_LIBRARY_PATH=/directory_with_modified_libc:$LD_LIBRARY_PATH

This makes glibc take a code path like 2.37 did.
Comment by Philip Müller (philm) - Sunday, 06 August 2023, 12:26 GMT
The issue might got introduced with [24cdd6c71debfd10a9f7cb217fe2a2c4c486ed6f](https://sourceware.org/git/?p=glibc.git;a=commit;h=24cdd6c71debfd10a9f7cb217fe2a2c4c486ed6f). It adds a chunk scanning algorithm to the _int_memalign code path that reduces heap fragmentation by reusing already aligned chunks instead of always looking for chunks of larger sizes and splitting them. The tcache macros are extended to allow removing a chunk from the middle of the list. The goal of that change is to fix the pathological use cases where heaps grow continuously in workloads that are heavy users of memalign. However, posix_memalign now calls chunk_ok_for_memalign (which takes most of time).
Comment by lopster (lopster) - Sunday, 06 August 2023, 16:30 GMT
The workaround below was posted in the mpv report by narpfel. It's probably better than a binary patch to libc.so. Installation of the arch package jemalloc is required.

https://github.com/mpv-player/mpv/issues/12076#issuecomment-1666892864

Another workaround: Using a different allocator, e. g. jemalloc, via LD_PRELOAD, as described here: https://github.com/jemalloc/jemalloc/wiki/Getting-Started

$ LD_PRELOAD=`jemalloc-config --libdir`/libjemalloc.so.`jemalloc-config --revision` mpv ...
Comment by Michael J Evans (mjevans) - Monday, 07 August 2023, 03:41 GMT
I've noticed both the MPV issue and changes in behavior for a common java based game. I'm not at all sure that the behavioral changes a mods heavy Minecraft game are related, but glibc touches far more than just MPV and a game, it also intersects with every other audio, graphics, and other systems library that might be linked against glibc.

The present severity of 'low' is, in my opinion, overly optimistic about how little a potential impact _any_ bug in glibc can be. This is difficult to isolate but might be an unbounded or very bad worst-possible-performance case regression in a core system library.
Comment by lopster (lopster) - Monday, 07 August 2023, 04:12 GMT
I originally posted this as critical priority, which admittedly may have been excessive, because it seems unlikely that mpv is the only impacted app.

I find this glibc change somewhat baffling. It's like a poorly performing garbage collector was put in the aligned allocator code path. Less than ideal since aligned memory is most likely to be used by apps seeking the best performance.
Comment by Toolybird (Toolybird) - Monday, 07 August 2023, 05:33 GMT
> The present severity

As mentioned elsewhere, severity is basically useless for Arch tickets. Upping the severity is not going to magically turn us all into Glibc hackers :) It's an upstream issue after all...so it doesn't get the bug fixed any faster.
Comment by lopster (lopster) - Monday, 07 August 2023, 05:49 GMT
A simple Arch specific patch to 2.38 could revert this change and restore 2.37 performance.
Comment by n_ (n_) - Monday, 07 August 2023, 05:53 GMT
Is it out of the question to rollback to 2.37 in the repo since it's not been too long since 2.38, and then rebuild all packages published after 2023-08-04 16:01 UTC?
Comment by Toolybird (Toolybird) - Monday, 07 August 2023, 06:04 GMT
> A simple Arch specific patch

This might be needed if upstream don't fix it in a timely manner. Feel free to prepare and submit a tested patch for consideration.
Comment by freswa (frederik) - Monday, 07 August 2023, 09:34 GMT
> Is it out of the question to rollback to 2.37 in the repo since it's not been too long since 2.38, and then rebuild all packages published after 2023-08-04 16:01 UTC?

Yes

>> A simple Arch specific patch
>This might be needed if upstream don't fix it in a timely manner. Feel free to prepare and submit a tested patch for consideration.

+1
Comment by lopster (lopster) - Monday, 07 August 2023, 13:06 GMT
OK, I'll put together a patch. It will likely just change line 5078 of malloc/malloc.c from 'if (av != NULL)' to 'if (0)'. That will be the least number of lines changed and restores the 2.37 path.
Comment by loqs (loqs) - Tuesday, 08 August 2023, 14:24 GMT
While only covering one affected application, disabling memalign in ffmpeg is relatively simple.
The revert of the glibc memalign changes is much larger.

I could not reproduce the issue with ffmpeg due to:
$ mpv --demuxer-max-bytes=300000k https://www.pbs.org/video/australia-animals-with-cameras-637xr7/
[ytdl_hook] ERROR: An extractor error has occurred. (caused by KeyError('title')); please report this issue on https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using yt-dlp -U
[ytdl_hook] youtube-dl failed: unexpected error occurred
Failed to recognize file format

Using a synthetic test the glibc patch restored performance.
Comment by lopster (lopster) - Tuesday, 08 August 2023, 16:04 GMT
>I could not reproduce the issue with ffmpeg due to:
>$ mpv --demuxer-max-bytes=300000k https://www.pbs.org/video/australia-animals-with-cameras-637xr7/
>[ytdl_hook] ERROR: An extractor error has occurred. (caused by KeyError('title')); please report this issue on https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using yt-dlp -U

That's still working for me. Maybe it's a region locked video, sorry about that. The issue should happen with any youtube or other yt-dlp supported site video of decent size as well.
Comment by lopster (lopster) - Tuesday, 08 August 2023, 16:10 GMT
Attached is a patch to the Arch glib build repo that disables the problematic chunk scanning code for glibc. It adds 2 patches to the glibc build. One disables the chunk scanning code and the other disables the tests.
Comment by loqs (loqs) - Tuesday, 08 August 2023, 17:11 GMT
@lopster do you think think the ffmpeg code could benefit from making the use of memalign configurable or using mmap once then sub allocating [1]?

[1] https://sourceware.org/pipermail/libc-alpha/2023-August/150771.html
Comment by lopster (lopster) - Tuesday, 08 August 2023, 17:47 GMT
> @lopster do you think think the ffmpeg code could benefit from making the use of memalign configurable or using mmap once then sub allocating [1]?

I saw that discussion and still don't get the thinking here.

I don't know anything about ffmpeg internals but, yes, that would almost certainly be a better way to go. I'm not sure they should have to do that, however. The previous glibc allocation strategy was apparently working for them.

A similar argument could be made for whatever apps they have in mind that are prompting this change due to heap fragmentation. Fragmentation has been an issue forever with environments that don't have garbage collection. I would suggest those apps should employ a special purpose allocator to resolve their issues rather than mess up a general purpose allocator. Maybe the former glibc strategy was wasting memory but in a lot of cases memory is more available than CPU. mpv went from using low CPU and reasonable memory to completely unusable with this change so the tradeoff went way too far in the other direction, in my opinion. mpv is probably not the only impacted app.
Comment by loqs (loqs) - Tuesday, 08 August 2023, 17:54 GMT
A glibc tunable that selected the strategy to optimize for speed or memory usage would be preferable but the discussion does not seem to be heading towards that result.
Comment by Gergő Sályi (gergo94) - Tuesday, 08 August 2023, 19:18 GMT
Hi all! I've been following this since also spotted the issue in mpv.

I hope it gets resolved quickly, but I find this comment by the glibc devs worrying: https://sourceware.org/pipermail/libc-alpha/2023-August/150729.html

I really disagree with that mentality - blaming the API consumers after a change that broke things that previously worked. (quote: "I'm tempted to not fix this, to teach developers to not use posix_memalign() unless they REALLY need it ;-)" )

I mean people literally hotfixing this with LD_PRELOADing alternative allocators like 'jemalloc' to get it back working. I also just tried to LD_PRELOAD 'mimalloc' and mpv works fine in the use case in question here too.

I think glibc malloc allowing itself a to preform horribly in a (maybe suboptimal or even erranous) API use case that otherwise works just OK with both 'jemalloc' and 'mimalloc' would be a bad outcome.
Comment by Gergő Sályi (gergo94) - Tuesday, 08 August 2023, 19:19 GMT
Btw for the better or worse ffmpeg uses posix_memalign as its general purpose allocation routine so all heap data is aligned for SIMD operations:
https://git.ffmpeg.org/gitweb/ffmpeg.git/blob/HEAD:/libavutil/mem.h#l112
https://git.ffmpeg.org/gitweb/ffmpeg.git/blob/HEAD:/libavutil/mem.c#l96

Disabling aligned allocations in ffmpeg entirely will likely impact the performance of SIMD accelerated subroutines (SSE prefers 128 bit, AVX prefers 256 bit alignment). Unaligned SIMD accesses may not crash but hitting page boundaries is not good in SIMD routes that tend to be performance critical.
Comment by loqs (loqs) - Tuesday, 08 August 2023, 20:24 GMT
@gergo94 should the performance not be restored then ffmpeg could provide its own posix_memalign replacement similar to mesa's fallback [1] or swap to a different malloc provider than glibc.

[1] https://github.com/Mesa3D/mesa/blob/mesa-23.0.2/src/util/os_memory_aligned.h#L71
Comment by n_ (n_) - Wednesday, 09 August 2023, 03:12 GMT
Thanks @lopster, your patch works me.
Comment by Philip Müller (philm) - Wednesday, 09 August 2023, 06:42 GMT
This is the wonderful world of open source. I really don't get the glibc developers here. Other applications like ffmpeg and mpv using this posix_memalign functionality since years and now low level upstream decides to change the API. If they want to update the API, they have to continue the old way too and have the new API with a new name so other projects can choose if they want to adopt to the new way or not. Why is there even a discussion to begin with. We all know that most developers are developing in their spare time.

Now every project which used that function in the past have to look for something else? It worked! Why break it? If an API is used, even in a weird way it won't be a bug it will become a feature. Even Torvalds said that at their level for the kernel development. glibc is in a similar realm and should honor that too: Never break user space!

The reason is not a historical one but a practical one. There are many many many programs that run on top of the Linux kernel; if a kernel interface breaks those programs then everybody would need to upgrade those programs.

At Manjaro side we use your patch @lopster, thx for that.
Comment by loqs (loqs) - Wednesday, 09 August 2023, 19:55 GMT
glibc 2.38 with patch from https://inbox.sourceware.org/libc-alpha/877cq4yrpg.fsf%40oldenburg.str.redhat.com applied
https://drive.google.com/file/d/1tAi0w_lvb_mMrhlsaYhplmqBS2Goc3Zw/view?usp=sharing glibc-2.38-2.1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/16pXFwdKLbwjowsuTjl98jYv7JRAJ0mmo/view?usp=sharing lib32-glibc-2.38-2.1-x86_64.pkg.tar.zst

I can no longer reproduce the issue with the patch applied.
Comment by Mauricio Quintela (mfcmquintela) - Wednesday, 09 August 2023, 20:03 GMT
@loqs thanks for providing patched versions (although the glibc-2.38-2.1-x86_64.pkg.tar.zst link shows access denied EDIT: fixed). building with @lopster 's patch was failing tests for me, so was still using the jemalloc workaround
Comment by loqs (loqs) - Wednesday, 09 August 2023, 20:04 GMT
> @loqs thanks for providing patched versions (although the glibc-2.38-2.1-x86_64.pkg.tar.zst link shows access denied)
Sharing on the link should now be fixed. Sorry about that.

https://drive.google.com/file/d/1tAi0w_lvb_mMrhlsaYhplmqBS2Goc3Zw/view?usp=sharing glibc-2.38-2.1-x86_64.pkg.tar.zst
Comment by Mauricio Quintela (mfcmquintela) - Wednesday, 09 August 2023, 20:18 GMT
@loqs Installed patched versions (also rebuilt ffmpeg, libplacebo and mpv, for a sanity check). Same issue described in my github thread [1] where 1 cpu core gets fully loaded is still present, although read speed have returned to normal. I can keep testing for different versions.

[1] https://github.com/mpv-player/mpv/issues/12076
Comment by loqs (loqs) - Wednesday, 09 August 2023, 20:24 GMT
@mfcmquintela can you report that on https://sourceware.org/bugzilla/show_bug.cgi?id=30723 preferably with a step by step so the glibc developers can reproduce it locally.
Comment by Toolybird (Toolybird) - Wednesday, 09 August 2023, 22:17 GMT
@lopster, are you seeing any improvements with the proposed patch [1] from the upstream ticket?

[1] https://inbox.sourceware.org/libc-alpha/877cq4yrpg.fsf%40oldenburg.str.redhat.com/T/
Comment by Gergő Sályi (gergo94) - Wednesday, 09 August 2023, 23:03 GMT
I also tried this [1]. I started from '$ pkgctl repo clone glibc', then adding it to PKGBUILD, building (confirmed in $SRCDIR that patch applied) and installing.

At me for the mpv issue this patch did not help significantly (just by looking at cpu use in htop, I did no bench).

If someone wants to see a flamegraph with mpv already barely keeping up with demuxing at 100% CPU use, I attach mine here taken with this patch [1] applied.

[1] https://inbox.sourceware.org/libc-alpha/877cq4yrpg.fsf%40oldenburg.str.redhat.com/
Comment by lopster (lopster) - Wednesday, 09 August 2023, 23:21 GMT
> @lopster, are you seeing any improvements with the proposed patch [1] from the upstream ticket?

No...the progression of CPU usage might have been a little different just from looking at top but mpv still ended up at 100% utilization with approximately the same amount of data cached (~200MB on my machine).
Comment by n_ (n_) - Wednesday, 09 August 2023, 23:24 GMT
Since the conversation on libc-alpha sounds like there's more allocator changes on the horizon, a question popped into my mind. Is it possible to globally use jemalloc/mimalloc on Arch, at least until glibc's allocator improvements/shakeups settle down in a year or so? FreeBSD is happy with jemalloc. I tried LD_PRELOAD'ing mimalloc and that, unsurprisingly, crashed Firefox which is uses jemalloc exclusively (internally). It was worth a test.
Comment by Gergő Sályi (gergo94) - Wednesday, 09 August 2023, 23:45 GMT
Reproduction with mpv dosn't need a network video stream btw, this will also do:

$ mpv --no-config --cache=yes --demuxer-max-bytes=1G big_local_video_file.mkv

Out of the total cpu time of the mpv process here, about 75 % is in the "demuxer_thread" -> __posix_memalign -> ... -> chunk_ok_for_memalign call path
Comment by lopster (lopster) - Thursday, 10 August 2023, 18:13 GMT
This ffmpeg command can generate a 1 hour long 3MB video to be used with the mpv command that @gergo94 posted to repro the issue:

ffmpeg -f lavfi -i anullsrc -f lavfi -i color=color=black -t 3600 -f matroska test.mkv

Also works in a pipeline:

ffmpeg -f lavfi -i anullsrc -f lavfi -i color=color=black -t 3600 -f matroska - | mpv --no-config --cache=yes --demuxer-max-bytes=1G -
Comment by Meikk99 (Meikk99) - Thursday, 10 August 2023, 21:24 GMT
Applying the patch from [1]
(on top of the previous one from [2])
seems to fix the performance issues with mpv for me.

[1] https://inbox.sourceware.org/libc-alpha/87pm3uajev.fsf%40oldenburg.str.redhat.com
[2] https://inbox.sourceware.org/libc-alpha/877cq4yrpg.fsf%40oldenburg.str.redhat.com
Comment by Toolybird (Toolybird) - Thursday, 10 August 2023, 21:26 GMT
There is now another upstream patch candidate [1] to be applied on top of the first one. Could those affected please test?

[1] https://inbox.sourceware.org/libc-alpha/87pm3uajev.fsf%40oldenburg.str.redhat.com/
Comment by loqs (loqs) - Thursday, 10 August 2023, 22:12 GMT Comment by Gergő Sályi (gergo94) - Thursday, 10 August 2023, 22:46 GMT
> There is now another upstream patch candidate [1] to be applied on top of the first one. Could those affected please test?

I tested as you said and with that now mpv is fixed for me - moreover demuxer performance looks fully back to normal (based on some basic bench I did).
Comment by Gergő Sályi (gergo94) - Thursday, 10 August 2023, 22:53 GMT
And this is expected - the second upstream patch just cuts out the same 2.38-release introduced code path that we were cutting out with @lopster 's patch - isn't it?
Comment by lopster (lopster) - Friday, 11 August 2023, 01:14 GMT
>There is now another upstream patch candidate [1] to be applied on top of the first one. Could those affected please test?

I tested with the latest patch and it fixes the issue for me as well.

>And this is expected - the second upstream patch just cuts out the same 2.38-release introduced code path that we were cutting out with @lopster 's patch - isn't it?

Yes, the latest patch deletes the code that was being bypassed by the patch I put up.
Comment by - (fx333) - Friday, 11 August 2023, 10:53 GMT
I think it's about time for Arch to ship a patch disabling the problematic code path.
This bug can break the performance of potentially every application. Just yesterday I noticed suspicious periods of 100% CPU with a node application after upgrading the system.
People expect their Arch machines to work and not break in obscure ways, nevermind that the workaround is digging up a patch from the bugtracker and rebuilding glibc.

As an additional data point there are 77 libraries and 46 executables on my system that directly import posix_memalign. [1] [2]
Nobody can guarantee that these aren't affected.


[1] cd /usr/bin; file * | grep 'ELF 64-bit.*executable' | cut -d : -f 1 | while read f; do readelf -a "$f" | grep -E 'FUNC.+\bposix_memalign@GLIBC'; done | wc -l
[2] cd /usr/lib; file * | grep 'ELF 64-bit.*shared' | cut -d : -f 1 | while read f; do readelf -a "$f" | grep -E 'FUNC.+\bposix_memalign@GLIBC'; done | wc -l
Comment by lopster (lopster) - Friday, 11 August 2023, 15:53 GMT
> I think it's about time for Arch to ship a patch disabling the problematic code path.

@Toolybird commented on the glibc bug [1] saying that Arch will ship after the patches are committed to the glibc repo. The 2nd patch has been revised to resolve test issues and is in pre-commit review.

https://sourceware.org/bugzilla/show_bug.cgi?id=30723
Comment by loqs (loqs) - Friday, 11 August 2023, 16:25 GMT Comment by Thomas Lübking (luebking) - Friday, 11 August 2023, 20:00 GMT
Thanks, those patches seem to remedy the situation for me.
Another (unexpected) victim seems to have been cpulimit…
Comment by Jimmy Jim Joe Jabadoo (BabyGuinan) - Friday, 11 August 2023, 20:38 GMT
Tested several hours with mpv (local & yt-dlp; also several instances) and the glibc-2.38-2.2-x86_64.pkg.tar.zst and lib32-glibc-2.38-2.2-x86_64.pkg.tar.zst packages: runs total fine.

Same conditions (but just one hour of testing) with glibc-2.38-2.3-x86_64.pkg.tar.zst and lib32-glibc-2.38-2.3-x86_64.pkg.tar.zst: Also fine.
Comment by Michael J Evans (mjevans) - Tuesday, 15 August 2023, 18:38 GMT Comment by pabello (pabel) - Tuesday, 15 August 2023, 20:00 GMT
using both patch's

< https://drive.google.com/file/d/1Pg5WXVzVL5P2Ws-z_ubVQ3npekNerBT7/view?usp=sharing >
&
< https://drive.google.com/file/d/1lZGwjDqRSwNzFORcf_NFmMxQFu3riZHK/view?usp=sharing >

with pacman -U lib32-glibc-2.38-2.3-x86_64.pkg.tar.zst glibc-2.38-2.3-x86_64.pkg.tar.zst

in arch 6.4.10-arch1-1 and using
mpv https://mcdn.daserste.de/daserste/de/master.m3u8

causes the same high cpu load as before/without patches

so the patches will not help me prevent unusal high cpu load for mpv streaming


screenshot htop with mpv https://mcdn.daserste.de/daserste/de/master.m3u8
=> Screenshot_mpv_20230815_205731__x.jpg

in comparison htop watching with vlc the same stream
=> Screenshot_vlc_20230815_205940__x.jpg



___________
$ inxi -CG
CPU:
Info: quad core model: Intel Core i7-8665U bits: 64 type: MT MCP cache:
L2: 1024 KiB
Speed (MHz): avg: 1060 min/max: 400/4800 cores: 1: 900 2: 900 3: 900
4: 2100 5: 900 6: 981 7: 900 8: 900
Graphics:
Device-1: Intel WhiskeyLake-U GT2 [UHD Graphics 620] driver: i915 v: kernel
Device-2: Chicony ThinkPad T490 Webcam driver: uvcvideo type: USB
Display: x11 server: X.Org v: 21.1.8 driver: X: loaded: modesetting
dri: iris gpu: i915 resolution: 1920x1080~60Hz
API: OpenGL v: 4.6 Mesa 23.1.5 renderer: Mesa Intel UHD Graphics 620 (WHL
GT2)

__
ps
observed since about 4 weeks that streaming the <oeffentlich rechtliche> begin stuttering after about 10-30sec resulting in stopping => <Audio device underrun detected> while increasing cpu loads

stuttering slows down with --cache=no in mpv.conf


Comment by Toolybird (Toolybird) - Tuesday, 15 August 2023, 22:59 GMT
> so the patches will not help me

Well, clearly you have an entirely different problem then. Please don't hijack tickets with unrelated spam...take it to the proper support channels (Forum/IRC/Mailing Lists/Reddit/etc) to see if anyone can help troubleshoot. Any more off-topic comments will be deleted.
Comment by Thomas Lübking (luebking) - Wednesday, 16 August 2023, 08:00 GMT
We don't actually know that, "stuttering slows down with --cache=no in mpv.conf" sounds related.
We've only two anecdotal datapoints in favor of the patch, one against that can perfectly falsify the theory.

However, there're oc. multiple factors ignored:
@pabello, 1st off all reboot to ensure a clean slate after the package update to rule out FS cache issues.
Then does --cache=no also completely drop the CPU load?
Do the hack-a-rounds in https://bugs.archlinux.org/task/79300#comment220065 & https://bugs.archlinux.org/task/79300#comment220078 work for you and can you reproduce the issue w/ and w/o the patched libs on a local file following https://bugs.archlinux.org/task/79300#comment220881

If so
$ md5sum /usr/lib/libc.so.6

Otherwise I'd look into whether the stream gets maybe decoded in SW (mpv output) and/or
$ type mpv
$ ldd /usr/bin/mpv | grep libc

@Toolybird, feel free to clean up after us ;)
@pabello, if the bug mods don't like that kind of investigation here, move it to the bbs and dm seth (yes, I'm allowed to suggest that ;)
Comment by pabello (pabel) - Wednesday, 16 August 2023, 11:17 GMT


@Toolybird : last intrusion .. watched the mpv/ffmpeg prob since some weeks and found here the seemingly right track .. may further switch to bbs


@luebking :
< 1st off all reboot to ensure a clean slate after the package update to rule out FS cache issues > => done

< Then does --cache=no also completely drop the CPU load? > => no

< Do the hack-a-rounds in https://bugs.archlinux.org/task/79300#comment220065 & https://bugs.archlinux.org/task/79300#comment220078 work for you and can you reproduce the issue w/ and w/o the patched libs on a local file following https://bugs.archlinux.org/task/79300#comment220881 >

=> one machine thinkpad t490s = patched => unusual high CPU as mentioned in pic <Screenshot_mpv_20230815_205731__x.jpg> .. w/ and w/o variants of mpv.conf .. see below
=> other machine thinkpad yogi 3rd, almost same CPU = unpatched => same unususal high CPU .. w/ and w/o variants of mpv.conf .. see below

stuttering seems to disappear since last <pacman/yay -Syu> in both machines
upgrades done almost daily
both machines working only via wifi
high CPU load while mpvstreaming remains

as usual i remember mpv used below 3% CPU in thinkpads


stuttering and <Audio device underrun detected> never appears on manjaro/desktop via LAN


mpv.conf w/ and w/o <--cache=no> does not drop down/change CPU load
also tried <--demuxer-max-bytes=50MiB>, <--demuxer-max-back-bytes=25MiB> and/or <--ytdl-raw-options=cache-dir=/dev/null>
=> no change .. CPU load still unusual high as seen in pic


patched machine:
$ md5sum /usr/lib/libc.so.6
30c41839a22a17fcfa6001ca98bc3751 /usr/lib/libc.so.6


$ type mpv
mpv ist /usr/bin/mpv


$ ldd /usr/bin/mpv | grep libc
libcdio_paranoia.so.2 => /usr/lib/libcdio_paranoia.so.2 (0x00007f62820ff000)
libcdio_cdda.so.2 => /usr/lib/libcdio_cdda.so.2 (0x00007f62820f6000)
libcdio.so.19 => /usr/lib/libcdio.so.19 (0x00007f62820cb000)
libc.so.6 => /usr/lib/libc.so.6 (0x00007f627f000000)
libcairo.so.2 => /usr/lib/libcairo.so.2 (0x00007f627e0ce000)
libcrypto.so.3 => /usr/lib/libcrypto.so.3 (0x00007f6272200000)
libcairo-gobject.so.2 => /usr/lib/libcairo-gobject.so.2 (0x00007f62713f2000)
libcap.so.2 => /usr/lib/libcap.so.2 (0x00007f626f6ce000)



$ mpv /mnt/Daten/Videos/stream/ipTV.m3u

Playing: https://mcdn.daserste.de/daserste/de/master.m3u8
[ffmpeg/demuxer] hls: Can't support the subtitle(uri: master_subs_webvtt.m3u8)
Video --vid=1 (h264 640x360 50.000fps) (2147 kbps)
(+) Video --vid=2 (h264 1920x1080 50.000fps) (8501 kbps)
Video --vid=3 (h264 1280x720 50.000fps) (5491 kbps)
Video --vid=4 (h264 960x540 50.000fps) (2816 kbps)
Video --vid=5 (h264 480x270 50.000fps) (1311 kbps)
(+) Audio --aid=1 --alang=de (*) (aac 2ch 48000Hz) (1311 kbps)
Audio --aid=2 --alang=klare sprache (aac 2ch 48000Hz) (1311 kbps)
AO: [pulse] 48000Hz stereo 2ch float
VO: [gpu] 1920x1080 yuv420p
AV: 00:06:27 / 00:06:28 (100%) A-V: 0.000
Cannot seek in this stream.
You can force it with '--force-seekable=yes'.
AV: 00:06:27 / 00:06:28 (100%) A-V: 0.000
. . .


¿ please give me a hint > what you mean with < ... and dm seth ... > ?


i hope not to cause further confusion . . .


__________
@Toolybird : thanx for instruction + clearing
pardon me for breaching etiquette
Comment by Mauricio Quintela (mfcmquintela) - Wednesday, 16 August 2023, 11:58 GMT
Tested for several hours and with several kernel versions the 2.38-2.3 patch for both glibc and lib32-glibc.
Everything seems to be running fine. The cpu usage in mpv is back to normal and the read speed (both local and network) back to saturating the connection.
Comment by Thomas Lübking (luebking) - Wednesday, 16 August 2023, 12:46 GMT
@pabello, the issue typically hinges on the mpv cache invocation, so you're most likely facing something entirely different (and I suspect your vaapi installation to be broken, but in doubt just try "--hwdec=vaapi")
Open a thread on the BBS, find the user seth (should not be hard) and send a direct message (mail) linking your thread.

@Toolybird, I've a local copy of this subthread so feel free to just delete all related messages.

Loading...