A tale of cylinders and shadows

Like I wrote before, we at Collabora have been working on improving WebKitGTK+ performance for customer projects, such as Apertis. We took the opportunity brought by recent improvements to WebKitGTK+ and GTK+ itself to make the final leg of drawing contents to screen as efficient as possible. And then we went on to investigate why so much CPU was still being used in some of our test cases.

The first weird thing we noticed was that performance was actually worse on Wayland than under X11. After some investigation we found a lot of time was being spent inside GTK+, painting the window’s background.

Here’s the thing: the problem only showed up under Wayland because in that case GTK+ is responsible for painting the window decorations, whereas in the X11 case the window manager does it. That means all of that expensive blurring and rendering of shadows fell in GTK+’s lap.

During the web engines hackfest, a couple of months ago, I delved deeper into the problem and noticed, with Carlos Garcia’s help, that it was even worse when HiDPI displays were thrown into the mix. The scaling made things unbearably slower.

You might also be wondering why painting window decorations would be such a problem. They should only be repainted when a window changes size or state, which should be pretty rare, right? Right, and that is one of the reasons why we had to make it fast: the resizing experience was pretty terrible. But we’ll get back to that later.

So I dug into that, made a few attempts at understanding the issue and came up with a patch showing that applying the blur was way too expensive. After a bit of discussion with our own Pekka Paalanen and Benjamin Otte we found the root cause: a fast path was not being hit by pixman due to the difference in scale factors between the shadow mask and the target surface. We made the shadow mask’s scale the same as the surface’s and voilà, sane performance.
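To make the scale mismatch a bit more concrete, here is a toy model in Python. It is not pixman or GTK+ code: the `Surface` class, the `composite_path` function and the strings it returns are all made up for illustration. The only point is that when the mask and the target use different device scales, every mask pixel has to be resampled, whereas matching scales allow a straight 1:1 pixel path.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Surface:
    """Toy stand-in for a drawing surface (hypothetical, not a cairo type)."""
    width: int   # logical width
    height: int  # logical height
    scale: int   # device pixels per logical unit

    def pixel_size(self):
        # Size of the backing store in device pixels.
        return (self.width * self.scale, self.height * self.scale)


def composite_path(mask, target):
    """Say which path a toy compositor would take for mask -> target."""
    # Ratio between target pixels and mask pixels.
    ratio = target.scale / mask.scale
    if ratio == 1:
        # Pixels line up one to one, no resampling needed.
        return "fast: 1:1 pixel copy"
    # Pixels do not line up, so the mask must be scaled while compositing.
    return f"slow: resample by {ratio:g}x"


window = Surface(800, 600, scale=2)        # HiDPI target surface
old_mask = Surface(100, 100, scale=1)      # shadow mask at scale 1
new_mask = Surface(100, 100, scale=2)      # shadow mask matching the target

print(composite_path(old_mask, window))
print(composite_path(new_mask, window))
```

In this model the fix amounts to creating the mask with the same scale as the window surface, which is the spirit of what we did, even though the real code paths look nothing like this.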

I keep talking about this being a performance problem, but how bad was it? In the following video you can see how huge the performance impact of this problem was on my very recent laptop with a HiDPI display. The video starts with an Epiphany window running with a patched GTK+ showing a nice demo the WebKit folks cooked up for CSS animations and 3D transforms.

After a few seconds I quickly alt-tab to the version running with unpatched GTK+ – I made the window the exact size and position of the other one, so that it is under the same conditions and the difference can be seen more easily. It is massive.

Yes, all of that slowdown was caused by repainting window shadows! OK, so that solved the problem for HiDPI displays, made resizing saner, great! But why is GTK+ repainting the window decorations even if only the contents are changing, anyway? Well, that turned out to be an off-by-one bug in the code that checks whether the invalidated area includes part of the window decorations.

If the area being changed spanned the whole window width, say, it would always cause the shadows to be repainted. By fixing that, we now avoid all of the shadow drawing code when we are running full-window animations such as the CSS poster circle or gtk3-demo’s pixbufs demo.
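For the curious, this is roughly the shape such a bug can take. The sketch below is Python and entirely hypothetical: the post does not show the actual GTK+ check, so `Rect`, `touches_decorations_buggy` and `touches_decorations` are invented names, and the specific mistake (an inclusive comparison on the far edges) is just one plausible way to be off by one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    width: int
    height: int


def touches_decorations_buggy(damage, content):
    """Hypothetical buggy check: far edges are compared inclusively."""
    return (
        damage.x < content.x
        or damage.y < content.y
        # '>=' is the off-by-one: damage that ends exactly where the
        # content ends is treated as spilling into the shadow.
        or damage.x + damage.width >= content.x + content.width
        or damage.y + damage.height >= content.y + content.height
    )


def touches_decorations(damage, content):
    """Fixed check: only damage that really leaves the content area counts."""
    return (
        damage.x < content.x
        or damage.y < content.y
        or damage.x + damage.width > content.x + content.width
        or damage.y + damage.height > content.y + content.height
    )


# Content area of a window with a 10 pixel shadow border (made-up numbers).
content = Rect(10, 10, 800, 600)
full_window_animation = Rect(10, 10, 800, 600)

print(touches_decorations_buggy(full_window_animation, content))
print(touches_decorations(full_window_animation, content))
```

With the inclusive comparison, damage that spans the full content width or height is reported as touching the decorations, so the shadows get repainted on every frame. The strict comparison lets full-window animations skip the shadow code entirely.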

As you can see in the video below, the gtk3-demo running with the patched GTK+ (the one on the right) is using a lot less CPU and has smoother animation than the one running with the unpatched GTK+ (left).

Pretty much all of the overhead caused by window decorations is gone in the patched version. It is still using quite a bit of CPU to animate those pixbufs, though, so some work still remains. Also, the overhead added to integrate cairo and GL rendering in GTK+ is pretty significant in the WebKitGTK+ CSS animation case. Hopefully that’ll get much better from GTK+ 4 onwards.

8 thoughts on “A tale of cylinders and shadows”

  1. Nice work. So will these fixes help all GTK applications? Will they show up in the next GTK 3.22.x point release, or will we have to wait until 3.24?

    1. Yes, all GTK+ applications benefit at least from the improved performance while resizing!

      There will be no 3.24 as the next GTK+ version is 4. It might change things quite a bit, but both fixes are there. If those code paths survive, the fixes will be present =).

      The improved performance for shadow rendering is already in 3.22.4, the other fix should be in 3.22.5 which should come out at some point.

  2. Pixman… This means that the shadows are calculated on the CPU side, instead of the GPU – which seems entirely wrong from the start, and actually provides a pretty strong reason why client-side rendering of window decorations is plain wrong.

  3. In a similar vein – do most gtk apps paint a white background first and then draw on it ?

    I’ve been in a lot of situations (e.g. long running vms or going into hibernation and coming out again) where it would seem this way.

    Has any work been done to test the draw efficiency of whole apps (at least the most popular ones?).

    1. It’s not a white background, but indeed it does paint a background for most widgets. I don’t think there is a continued effort of testing draw efficiency of apps, unfortunately, only when performance issues like this one are detected.

  4. Thanks. I wondered about the lags, especially with Wayland. And nice to see some benefits as side-effect of Apertis, I remember the “hard” discussion on GUADEC.

    “I don’t think there is a continued effort of testing draw efficiency of apps, unfortunately, only when performance issues like this one are detected.”

    Really!? I thought one of the very reasons for the demo-applications was and is testing of performance?

    1. Well, the demo applications make it easier to test performance, but:

      1) they do not always replicate the complexity and corner cases hit by real applications

      2) there is no automated testing keeping historical data

      3) there are no good metrics for automatically detecting performance degradation

      If there is no automatic testing and good metrics, it’s much more likely you’ll only notice performance problems when you hit the issues in the real world.

      These last few days I’ve seen discussion in GTK+’s IRC channel about coming up with metrics and running perf tests for GTK 4, so that is encouraging =)
