FS#78777 - [electron] 22 still causes applications to segfault on maximized window when using wayland

Attached to Project: Arch Linux
Opened by Manuel (SunRed) - Wednesday, 14 June 2023, 12:41 GMT
Last edited by Toolybird (Toolybird) - Friday, 27 October 2023, 21:38 GMT
Task Type Bug Report
Category Packages: Extra
Status Closed
Assigned To Caleb Maclennan (alerque)
Architecture x86_64
Severity Very Low
Priority Low
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 1
Private No

Details

Description:

The current version of Electron 22 causes seemingly all applications that use this package to segfault when the window is resized or maximized while running in native wayland mode (`--ozone-platform-hint=auto` or `--ozone-platform-hint=wayland`). This behavior appears to apply only to the Arch package with its dozen patches: the 'electron22-bin' package from the AUR, updated to 22.3.12, packages the official build from GitHub and does not show it. This should be related to the problem in FS#78753 but, contrary to what the comments there state, it is not fixed.
This behavior was already present with `electron` before the 22.3.12 update but I expected this bug to be fixed with the new version.

Additional info:

Running on KDE/kwin 5.27.5, any electron22 application with `--ozone-platform-hint=auto` to use native wayland
https://aur.archlinux.org/packages/electron22-bin updated to 22.3.12 works without problems.

Steps to reproduce:

Run the vscode `code` package with `code --ozone-platform-hint=auto` (to use native wayland instead of xwayland) or any other package that uses system electron with electron(22) from the official repos and try to maximize or snap the window. Reproducible so far on KDE/kwin 5.27.5, I have not checked other wayland compositors yet.
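
A minimal sketch of the reproduction step as a script, assuming the `code` launcher is on `PATH`. The helper names `is_wayland_session` and `repro_command` are made up for illustration; `XDG_SESSION_TYPE` and `WAYLAND_DISPLAY` are the usual session variables and are checked only to avoid testing under X11 by accident. The command is printed rather than executed.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the current session looks like Wayland.
is_wayland_session() {
    [ "${XDG_SESSION_TYPE:-}" = "wayland" ] || [ -n "${WAYLAND_DISPLAY:-}" ]
}

# Print (do not run) the launch command from the steps above.
repro_command() {
    printf '%s\n' "code --ozone-platform-hint=auto"
}

if is_wayland_session; then
    echo "Wayland session detected; reproduce with: $(repro_command)"
else
    echo "No Wayland session detected; this reproduction needs one."
fi
```

After launching, maximize or snap the window to trigger the crash.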

Closed by  Toolybird (Toolybird)
Friday, 27 October 2023, 21:38 GMT
Reason for closing:  Deferred
Additional comments about closing:  Please see comments
Comment by Toolybird (Toolybird) - Wednesday, 14 June 2023, 21:24 GMT
> The current version of Electron 22

Please always specify the *exact* pkg version. Think about the future when someone reviews this ticket. Currently there is a version in [extra] and another in [extra-testing].
Comment by Toolybird (Toolybird) - Wednesday, 14 June 2023, 22:04 GMT
FWIW, I couldn't repro in a fresh Gnome VM. But I was able to repro in a Plasma VM. Interestingly, the first run seemed fine, but then it crashed every time after that. Backtrace attached.
   gdb.txt (4.6 KiB)
Comment by Manuel (SunRed) - Thursday, 15 June 2023, 07:23 GMT
Oh, I am sorry. I will keep that in mind for future reports. Even though the version is mentioned in the problem description, I will say here once again that it was tested using electron/electron22 22.3.12 from the extra repo.
I see an update to version .13 has just been pushed; I will try it out later.
Comment by Caleb Maclennan (alerque) - Friday, 16 June 2023, 13:55 GMT
I am not able to replicate this. I'm on Wayland and using `code` as a test since it is electron22 only:

```console
$ ELECTRON_RUN_AS_NODE=1 exec /usr/lib/electron22/electron /usr/lib/code/out/cli.js /usr/lib/code/code.js --ozone-platform-hint=wayland
```

No problems observed maximising, going full screen, resizing, etc.

What is a more exact test case? What app are you actually using? Can you give me a launch command that doesn't use flag file parsing at all to compare?
Comment by Caleb Maclennan (alerque) - Friday, 16 June 2023, 19:31 GMT
I'm wondering if this isn't an iteration of this bug upstream: https://bugs.chromium.org/p/chromium/issues/detail?id=1442633

Try clearing your ".config/chromium/Default/GPUCache" or wherever your Electron is stashing stuff.
Comment by Toolybird (Toolybird) - Saturday, 17 June 2023, 00:23 GMT
I played with this again after the latest round of updates. It still crashes under Plasma Wayland session. But this time I also tested with 'electron22-bin' package from the AUR...and it still crashes! This contradicts the claims from @reporter. It's gotta be some Plasma related Wayland glitch...but now it does *not* look like an Arch packaging bug.
Comment by Claudia Pellegrino (Auerhuhn) - Monday, 19 June 2023, 16:27 GMT
Still broken on Electron 22.3.13 and VS Code 1.79.2 running under Sway:

$ rm -rf ~/.config/Code\ -\ OSS/{Cache,CachedData,CachedExtensions,CachedExtensionVSIXs,CachedProfilesData,Code\ Cache,DawnCache,GPUCache}
$ pacman -Q code electron22 sway wlroots
code 1.79.2-1
electron22 22.3.13-2
sway-git r7148.c0876290-1
wlroots-hidpi-xprop-git 0.17.0.20230619.041950-1
$ ELECTRON_RUN_AS_NODE=1 /usr/lib/electron22/electron /usr/lib/code/out/cli.js /usr/lib/code/code.js --ozone-platform-hint=wayland

Still segfaults, even with all the cache directories wiped.
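
For anyone repeating the cache wipe with a different app, it can be wrapped in a small function. This is only a sketch: the function name `clear_electron_caches` is made up, and the directory names are simply the ones from the `rm` command above, so other Electron apps may keep their caches elsewhere.

```bash
#!/usr/bin/env bash
# Hypothetical helper: remove Chromium/Electron cache directories below an
# application's config directory. The names are taken from the command above;
# other apps may use a different set.
clear_electron_caches() {
    local config_dir=$1
    local name
    for name in Cache CachedData CachedExtensions CachedExtensionVSIXs \
                CachedProfilesData "Code Cache" DawnCache GPUCache; do
        # ${var:?} aborts instead of expanding to "/<name>" if the argument is empty.
        rm -rf -- "${config_dir:?}/${name}"
    done
}

# Usage, matching the directory wiped above:
# clear_electron_caches "$HOME/.config/Code - OSS"
```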
Comment by Caleb Maclennan (alerque) - Tuesday, 04 July 2023, 20:21 GMT
Still not able to replicate here (Hyprland). Has anybody passed this on upstream? Or poked around for other bug reports around the web? This doesn't seem like a packaging issue. If it turns out there is anything we can do in packaging to mitigate I'm happy to do so, but I have limited time/resources to hunt down a Chromium bug I can't replicate.
Comment by Claudia Pellegrino (Auerhuhn) - Sunday, 09 July 2023, 18:10 GMT
> Has anybody passed this on upstream? Or poked around for other bug reports around the web?

I’ve found these upstream tickets:

- https://github.com/electron/electron/issues/37531 ([Bug]: Crash when opened under Wayland)
- https://github.com/electron/electron/issues/38430 ([Bug]: Continuous segfaults when opening more than one window on wayland)
- https://github.com/microsoft/vscode/issues/181533 (segfault on wayland with 1.78 when window.titleBarStyle=custom)
- https://github.com/microsoft/vscode/issues/184124 (Crash when rebuilding application menu on wayland)

However, I couldn’t find a ticket on the Chromium bug tracker.
According to a stack trace I found elsewhere [1], the call stack of the segfault involves Chromium code, not Electron code.
So wouldn’t the Chromium tracker be the proper place to file a bug?

I’ve also noticed that no one has provided an Electron Fiddle reproduction to the Electron bug tracker, even though maintainers state that they prefer bug reports that include one.
I tried to create a reproduction but failed. I’d need more time to wrap my head around how Electron Fiddle works.

[1]: https://github.com/electron/electron/issues/37531#issuecomment-1474016994
Comment by Claudia Pellegrino (Auerhuhn) - Sunday, 09 July 2023, 18:15 GMT
~~I just re-checked and it looks like someone reported it: https://bugs.chromium.org/p/chromium/issues/detail?id=1462590~~

~~Looks promising to me, might be worth tracking!~~

**Edit:** my mistake, that one’s entirely unrelated, the stack traces are not even remotely similar.
Comment by Claudia Pellegrino (Auerhuhn) - Friday, 18 August 2023, 09:30 GMT
For the record, here’s the stack trace at the time of the crash on my machine:

```bash
$ sudo pacman -Syu code
[…]
$ pacman -Qi code
Name : code
Version : 1.81.1-1
Description : The Open Source build of Visual Studio Code (vscode) editor
Architecture : x86_64
URL : https://github.com/microsoft/vscode
Licenses : MIT
Groups : None
Provides : vscode
Depends On : electron22 libsecret libx11 libxkbfile ripgrep
[…]
$ pacman -Q electron22
electron22 22.3.20-1
$ sudo pacman -U https://geo.mirror.pkgbuild.com/extra-debug/os/x86_64/electron22-debug-22.3.20-1-x86_64.pkg.tar.zst
[…]
$ pacman -Qi electron22-debug
Name : electron22-debug
Version : 22.3.20-1
Description : Detached debugging symbols for electron22
Architecture : x86_64
[…]
$ ELECTRON_RUN_AS_NODE=1 /usr/lib/electron22/electron /usr/lib/code/out/cli.js /usr/lib/code/code.js --ozone-platform-hint=wayland
[the app window opens but immediately closes]
$ coredumpctl gdb
[…]
Core was generated by `/usr/lib/electron22/electron --ozone-platform-hint=wayland /usr/lib/code/code.j'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f547f840db8 in __memcpy_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:761
761 LOAD_ONE_SET((%rsi), PAGE_SIZE, %VMM(4), %VMM(5), %VMM(6), %VMM(7)) #5 0x00007f547f76f9eb n/a (libc.so.6 + 0x8c9eb)
[Current thread is 1 (Thread 0x7f5470985f80 (LWP 3250663))]
(gdb) bt
#0 0x00007f547f840db8 in __memcpy_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:761
#1 0x000055cccf5db415 in printing::PrinterSemanticCapsAndDefaults::Paper::Paper(printing::PrinterSemanticCapsAndDefaults::Paper const&) ()
#2 0x000055ccced0fe1d in content::BrowserContext::GetStoragePartition(content::SiteInstance*, bool) ()
#3 0x000055ccceef218b in content::HostZoomMap::GetZoomLevel(content::WebContents*) ()
#4 0x000055cccf323dad in content::WebContentsImpl::GetPendingPageZoomLevel() ()
#5 0x000055cccf1e6d82 in content::RenderWidgetHostImpl::GetVisualProperties() ()
#6 0x000055cccf1e74c2 in content::RenderWidgetHostImpl::SynchronizeVisualProperties(bool, bool) ()
#7 0x000055cccf1fe219 in content::RenderWidgetHostViewAura::SynchronizeVisualProperties(cc::DeadlinePolicy const&, absl::optional<viz::LocalSurfaceId> const&) ()
#8 0x000055cccf1fdc24 in content::RenderWidgetHostViewAura::SetSize(gfx::Size const&) ()
#9 0x000055cccf201645 in content::RenderWidgetHostViewAura::OnBoundsChanged(gfx::Rect const&, gfx::Rect const&) ()
#10 0x000055ccd0f89189 in aura::Window::OnLayerBoundsChanged(gfx::Rect const&, ui::PropertyChangeReason) ()
```

No matter which exact minor/patch version of Arch’s `electron22` package I have installed, stack frames #0 and #2 are always the same.
#0 is always a segfault in `__memcpy_avx_unaligned_erms`, which is part of glibc, and #2 is always `content::BrowserContext::GetStoragePartition`, which is part of Chromium code.
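
Comparing frames across builds is easier when the symbols are extracted mechanically. A minimal sketch, assuming the `bt` output has been saved to a text file and that frame lines have the form `#N ADDR in SYMBOL (...)` as in the trace above. The function name `frame_symbol` is made up for illustration, and symbols that contain spaces (for example `non-virtual thunk to ...`) are not handled.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read gdb `bt` output on stdin and print the symbol
# name of the requested frame number, without its argument list.
frame_symbol() {
    local frame=$1
    awk -v want="#${frame}" '
        $1 == want && $3 == "in" {
            sym = $4
            sub(/\(.*/, "", sym)   # drop the argument list, keep the name
            print sym
            exit
        }'
}

# Usage, with the backtrace saved as bt.txt (file name is an assumption):
# frame_symbol 0 < bt.txt
# frame_symbol 2 < bt.txt
```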

The symbol in stack frame #1 (`printing::PrinterSemanticCapsAndDefaults::Paper::Paper`) is bogus though. The correct symbol would be the copy constructor
of the `StoragePartitionConfig` class in `content/public/browser/storage_partition_config.cc` [1], as indicated by the stack return address compared to the disassembly.

The source code for stack frame #2 is in `content/browser/browser_context.cc` inside the `GetStoragePartition(SiteInstance*, bool)` method [2], specifically in the first leg of the ternary operator, where the method `SiteInstanceImpl::GetStoragePartitionConfig()` is invoked.

The stack return addresses, when correlated to the disassembled binary, suggest that the `GetStoragePartitionConfig` method [3] in `content/browser/site_instance_impl.cc` may be returning a corrupted `StoragePartitionConfig` object. At least the `partition_domain` field appears to be corrupted, which is the first field that the copy constructor tries to `memcpy`, according to my disassembly.

tl;dr:
Execution flow is going through `browser_context.cc:125` [2], which calls `site_instance_impl.cc:941` [3], which returns a (possibly corrupted?) object.
That object is going through a copy constructor [1], which internally hands off one or two corrupted pointer(s) to `memcpy`, which finally segfaults.

[1]: https://github.com/chromium/chromium/blob/118.0.5949.0/content/public/browser/storage_partition_config.cc#L17-L18

[2]: https://github.com/chromium/chromium/blob/118.0.5949.0/content/browser/browser_context.cc#L125

[3]: https://github.com/chromium/chromium/blob/118.0.5949.0/content/browser/site_instance_impl.cc#L941-L950

Note: Electron v22.3.20 uses Node.js v18.17.1 and Chromium v118.0.5949.0, which is why I’ve pinned all the links to the `118.0.5949.0` Git tag.

Any ideas or clues what to investigate next?
Comment by Claudia Pellegrino (Auerhuhn) - Friday, 18 August 2023, 09:47 GMT
~~This morning, I found a silly workaround, too:~~
Update: just realized that OP has mentioned that already.

Replacing Arch’s `electron22` package with `electron22-bin` (AUR) fixes the crash.
Even copying over just the `/usr/lib/electron22/electron` binary from `electron22-bin` into an affected `electron22` install heals it.

In other words, the issue appears only in Arch Linux’s build for me but not in the upstream binary release package.
That makes me somewhat reluctant to file a bug against the Chromium project.
By copying the two binaries back and forth, I can reproduce this about 95% of the time: the upstream binary works around the crash, the Arch binary brings it back. (Very occasionally, VS Code crashes on startup with seemingly unrelated stack traces.)
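
The back-and-forth swap can be sketched as a pair of shell functions. This is an illustration only: `swap_in` and `swap_back` are made-up names, the location of the upstream binary is left as a placeholder because it depends on where `electron22-bin` was unpacked, and overwriting a file owned by a pacman package is a test hack that the next package update will undo.

```bash
#!/usr/bin/env bash
# Hypothetical helper: replace TARGET with REPLACEMENT, keeping the original
# next to it as TARGET.orig so the swap can be undone.
swap_in() {
    local target=$1 replacement=$2
    # Only save a backup once, so a second swap_in cannot clobber the original.
    [ -e "${target}.orig" ] || cp -p -- "$target" "${target}.orig"
    cp -p -- "$replacement" "$target"
}

# Hypothetical helper: undo swap_in by restoring the saved original.
swap_back() {
    local target=$1
    [ -e "${target}.orig" ] && mv -- "${target}.orig" "$target"
}

# Usage (needs root for /usr/lib; the second path is a placeholder):
# swap_in /usr/lib/electron22/electron /path/to/upstream/electron
# swap_back /usr/lib/electron22/electron
```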

No idea how to go from here though. Ideas?
Comment by loqs (loqs) - Friday, 18 August 2023, 14:57 GMT
@Auerhuhn if you update code to the latest git commit, which uses electron25, can you still reproduce the issue? As the code-git PKGBUILD in AUR is out of date, you can use the attached diff to update the code PKGBUILD.

> That makes me somewhat reluctant to file a bug against the Chromium project.
Chromium only supports the latest stable major release or newer (currently 116), while electron22 uses 108. It is still supported by Electron until 22 becomes EOL on 2023-Oct-10.
Comment by Claudia Pellegrino (Auerhuhn) - Friday, 18 August 2023, 17:11 GMT
@loqs I can reproduce the issue in latest `code` with the patch you provided, albeit with a slightly different stack trace.
Because the segfault always seems to occur inside a C++ copy constructor, I’ll be calling it the *copy constructor crash*:

```
$ coredumpctl gdb
[…]
Core was generated by `/usr/lib/electron25/electron --enable-features=WaylandWindowDecorations --ozone'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x000055f022db3e9b in GURL::GURL(GURL const&) ()
[Current thread is 1 (Thread 0x7ff21b478180 (LWP 266760))]
(gdb) bt
#0 0x000055f022db3e9b in GURL::GURL(GURL const&) ()
#1 0x000055f022db3f3d in GURL::GURL(GURL const&) ()
#2 0x000055f021e2586a in content::SiteInfo::SiteInfo(content::SiteInfo const&) ()
#3 0x000055f021bc8be0 in content::ProcessLock::ProcessLock(content::ProcessLock const&) ()
#4 0x000055f021881397 in content::ChildProcessSecurityPolicyImpl::CanAccessDataForMaybeOpaqueOrigin(int, GURL const&, bool) ()
#5 0x000055f021884086 in content::ChildProcessSecurityPolicyImpl::CanAccessDataForOrigin(int, url::Origin const&) ()
#6 0x000055f02187d55f in content::ChildProcessSecurityPolicyImpl::Handle::CanAccessDataForOrigin(url::Origin const&) ()
#7 0x000055f0219654bd in content::DOMStorageContextWrapper::IsRequestValid(content::DOMStorageContextWrapper::StorageType, blink::StorageKey const&, absl::optional<base::TokenType<blink::LocalFrameTokenTypeMarker> >, content::ChildProcessSecurityPolicyImpl::Handle, base::OnceCallback<void (base::BasicStringPiece<char, std::char_traits<char> >)>) ()
#8 0x000055f021965344 in content::DOMStorageContextWrapper::OpenLocalStorage(blink::StorageKey const&, absl::optional<base::TokenType<blink::LocalFrameTokenTypeMarker> >, mojo::PendingReceiver<blink::mojom::StorageArea>, content::ChildProcessSecurityPolicyImpl::Handle, base::OnceCallback<void (base::BasicStringPiece<char, std::char_traits<char> >)>) ()
#9 0x000055f021e455e2 in non-virtual thunk to content::StoragePartitionImpl::OpenLocalStorage(blink::StorageKey const&, base::TokenType<blink::LocalFrameTokenTypeMarker> const&, mojo::PendingReceiver<blink::mojom::StorageArea>) ()
[…]
```

I can reproduce the copy constructor crash across multiple attempts.
For the record, other apps that require system `electron25`, for example `discord_arch_electron`, exhibit the same crash.

If I have built `electron-bin` and `electron25-bin` from the AUR, then running the following command lines fixes `discord_arch_electron`:

```
$ sudo pacman -R --assume-installed electron electron
$ sudo pacman -R --assume-installed electron25 electron25
$ sudo pacman -Sy --noconfirm electron-bin electron25-bin # The metapackage `electron-bin` is technically not needed. I’m still including it in this example, because Discord specifically requires it
$ sudo ln -s electron25 /usr/bin/electron # we need this temporarily, because `electron-bin` doesn’t seem to provide the symlink that some Electron apps expect
```

Even though Discord is now healed, `code` (latest) still crashes. However, it crashes differently. This time, gdb doesn’t unroll properly for some reason:

```
(gdb) bt
#0 0x000055f253d6575a in ()
#1 0xaaaaaaaaaaaaaa01 in ()
#2 0x0000000100000005 in ()
#3 0x6154c47a7b8598d3 in ()
#4 0x8397d1a06e95d111 in ()
#5 0x00000001aaaaaa00 in ()
#6 0xaaaaaaaa00000000 in ()
#7 0x0000000000000000 in ()
```

Re-installing Arch’s Electron packages brings back the copy constructor crash to `code` (latest) – and to Discord as well:

```
$ sudo pacman -R --assume-installed electron electron-bin
$ sudo pacman -R --assume-installed electron25 electron25-bin
$ sudo rm -f /usr/bin/electron # clean up the symlink we added before
$ sudo pacman -Sy --noconfirm electron electron25
```

```
# After running the app and watching it crash:
$ coredumpctl gdb
[…]
Core was generated by `/usr/lib/electron25/electron --enable-features=WaylandWindowDecorations --ozone'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x000055d84a73fe9b in GURL::GURL(GURL const&) ()
[…]
```

Note that without the `--ozone-platform-hint=wayland` switch, every app works. Even `code` (latest) works, no matter if `electron25-bin` or Arch’s `electron25` is installed.
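
Since dropping the switch avoids the crash, a wrapper could filter it out before starting an affected app. A minimal sketch, assuming the switch arrives as a command-line argument (flags read from a flags file by the launcher are not covered); `strip_ozone_hint` is a made-up name, and arguments containing newlines are not handled.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the given arguments, one per line, without any
# --ozone-platform-hint=... switch, so the app starts without native Wayland.
strip_ozone_hint() {
    local arg
    for arg in "$@"; do
        case $arg in
            --ozone-platform-hint=*) ;;      # drop the Wayland hint
            *) printf '%s\n' "$arg" ;;       # keep everything else
        esac
    done
}

# Usage inside a wrapper script:
# mapfile -t args < <(strip_ozone_hint "$@")
# exec code "${args[@]}"
```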
Comment by Claudia Pellegrino (Auerhuhn) - Friday, 18 August 2023, 17:15 GMT
tl;dr Installing code (latest) didn’t help. It just makes the crash bring its annoying cousin.
Comment by Claudia Pellegrino (Auerhuhn) - Monday, 25 September 2023, 11:56 GMT
Another user resubmitted the bug against Electron:

https://github.com/electron/electron/issues/39449
Comment by Caleb Maclennan (alerque) - Wednesday, 27 September 2023, 14:28 GMT
As far as I can make out this is a bug in Electron, and specifically in a version that is *almost* EOL. I'm happy to apply patches if somebody figures out something to apply to our packaging that would help make it work well with other packages like `code`, but otherwise I have no plans to even try hunting for the problem...
Comment by loqs (loqs) - Wednesday, 27 September 2023, 14:49 GMT
Does adding the parameter --disable-features=PartitionAllocBackupRefPtr help[1]?

[1] https://github.com/electron/electron/issues/39775#issuecomment-1735464518
Comment by Toolybird (Toolybird) - Friday, 27 October 2023, 21:37 GMT
> Does adding the parameter --disable-features=PartitionAllocBackupRefPtr help

Unfortunately not. I've just revisited the current state of play and the issue is still present under Plasma Wayland. Currently, the AUR -bin variant seems fine.

Luckily, "code" is now updated to depend on electron25 and the issue doesn't repro there. The only 2 apps left are "keybase-gui" and "wire-desktop". Seeing as this is basically unfixable at the Arch level, I'm closing this as "Deferred" until either someone comes up with a fix, or the pkg falls out of support and is dropped.
