A tale of cylinders and shadows

Like I wrote before, we at Collabora have been working on improving WebKitGTK+ performance for customer projects, such as Apertis. We took the opportunity brought by recent improvements to WebKitGTK+ and GTK+ itself to make the final leg of drawing contents to screen as efficient as possible. And then we went on investigating why so much CPU was still being used in some of our test cases.

The first weird thing we noticed is performance was actually degraded on Wayland compared to running under X11. After some investigation we found a lot of time was being spent inside GTK+, painting the window’s background.

Here’s the thing: the problem only showed under Wayland because in that case GTK+ is responsible for painting the window decorations, whereas in the X11 case the window manager does it. That means all of that expensive blurring and rendering of shadows fell on GTK+’s lap.

During the web engines hackfest, a couple of months ago, I delved deeper into the problem and noticed, with Carlos Garcia’s help, that it was even worse when HiDPI displays were thrown into the mix. The scaling made things unbearably slower.

You might also be wondering why would painting of window decorations be such a problem, anyway? They should only be repainted when a window changes size or state anyway, which should be pretty rare, right? Right, that is one of the reasons why we had to make it fast, though: the resizing experience was pretty terrible. But we’ll get back to that later.

So I dug into that, made a few tries at understanding the issue and came up with a patch showing how applying the blur was being way too expensive. After a bit of discussion with our own Pekka Paalanen and Benjamin Otte we found the root cause: a fast path was not being hit by pixman due to the difference in scale factors on the shadow mask and the target surface. We made the shadow mask scale the same as the surface’s and voilà, sane performance.

I keep talking about this being a performance problem, but how bad was it? In the following video you can see how huge the impact in performance of this problem was on my very recent laptop with a HiDPI display. The video starts with an Epiphany window running with a patched GTK+ showing a nice demo the WebKit folks cooked for CSS animations and 3D transforms.

After a few seconds I quickly alt-tab to the version running with unpatched GTK+ – I made the window the exact size and position of the other one, so that it is under the same conditions and the difference can be seen more easily. It is massive.

Yes, all of that slow down was caused by repainting window shadows! OK, so that solved the problem for HiDPI displays, made resizing saner, great! But why is GTK+ repainting the window even if only the contents are changing, anyway? Well, that turned out to be an off-by-one bug in the code that checks whether the invalidated area includes part of the window decorations.

If the area being changed spanned the whole window width, say, it would always cause the shadows to be repainted. By fixing that, we now avoid all of the shadow drawing code when we are running full-window animations such as the CSS poster circle or gtk3-demo’s pixbufs demo.

As you can see in the video below, the gtk3-demo running with the patched GTK+ (the one on the right) is using a lot less CPU and has smoother animation than the one running with the unpatched GTK+ (left).

Pretty much all of the overhead caused by window decorations is gone in the patched version. It is still using quite a bit of CPU to animate those pixbufs, though, so some work still remains. Also, the overhead added to integrate cairo and GL rendering in GTK+ is pretty significant in the WebKitGTK+ CSS animation case. Hopefully that’ll get much better from GTK+ 4 onwards.

WebKitGTK+ hackfest 5.0 (2013)!

For the fifth year in a row the fearless WebKitGTK+ hackers have gathered in A Coruña to bring GNOME and the web closer. Igalia has organized and hosted it as usual, welcoming a record 30 people to its office. The GNOME foundation has sponsored my trip allowing me to fly the cool 18 seats propeller airplane from Lisbon to A Coruña, which is a nice adventure, and have pulpo a feira for dinner, which I simply love! That in addition to enjoying the company of so many great hackers.

Web with wider tabs and the new prefs dialog
Web with wider tabs and the new prefs dialog

The goals for the hackfest have been ambitious, as usual, but we made good headway on them. Web the browser (AKA Epiphany) has seen a ton of little improvements, with Carlos splitting the shell search provider to a separate binary, which allowed us to remove some hacks from the session management code from the browser. It also makes testing changes to Web more convenient again. Jon McCan has been pounding at Web’s UI making it more sleek, with tabs that expand to make better use of available horizontal space in the tab bar, new dialogs for preferences, cookies and password handling. I have made my tiny contribution by making it not keep tabs that were created just for what turned out to be a download around. For this last day of hackfest I plan to also fix an issue with text encoding detection and help track down a hang that happens upon page load.

Martin Robinson and Dan Winship hack
Martin Robinson and Dan Winship hack

Martin Robinson and myself have as usual dived into the more disgusting and wide-reaching maintainership tasks that we have lots of trouble pushing forward on our day-to-day lives. Porting our build system to CMake has been one of these long-term goals, not because we love CMake (we don’t) or because we hate autotools (we do), but because it should make people’s lives easier when adding new files to the build, and should also make our build less hacky and quicker – it is sad to see how slow our build can be when compared to something like Chromium, and we think a big part of the problem lies on how complex and dumb autotools and make can be. We have picked up a few of our old branches, brought them up-to-date and landed, which now lets us build the main WebKit2GTK+ library through cmake in trunk. This is an important first step, but there’s plenty to do.

Hackers take advantage of the icecream network for faster builds
Hackers take advantage of the icecream network for faster builds

Under the hood, Dan Winship has been pushing HTTP2 support for libsoup forward, with a dead-tree version of the spec by his side. He is refactoring libsoup internals to accomodate the new code paths. Still on the HTTP front, I have been updating soup’s MIME type sniffing support to match the newest living specification, which includes specification for several new types and a new security feature introduced by Internet Explorer and later adopted by other browsers. The huge task of preparing the ground for a one process per tab (or other kinds of process separation, this will still be topic for discussion for a while) has been pushed forward by several hackers, with Carlos Garcia and Andy Wingo leading the charge.

Jon and Guillaume battling code
Jon and Guillaume battling code

Other than that I have been putting in some more work on improving the integration of the new Web Inspector with WebKitGTK+. Carlos has reviewed the patch to allow attaching the inspector to the right side of the window, but we have decided to split it in two, one providing the functionality and one the API that will allow browsers to customize how that is done. There’s a lot of work to be done here, I plan to land at least this first patch durign the hackfest. I have also fought one more battle in the never-ending User-Agent sniffing war, in which we cannot win, it looks like.

Hackers chillin' at A Coruña
Hackers chillin’ at A Coruña

I am very happy to be here for the fifth year in a row, and I hope we will be meeting here for many more years to come! Thanks a lot to Igalia for sponsoring and hosting the hackfest, and to the GNOME foundation for making it possible for me to attend! See you in 2014!

Google’s User-Agent sniffing makes one more victim

Remember when I said Epiphany worked out of the box with Youtube’s WebM? Well, Google has recently decided to deny us WebM, like it did before with Wave, the Pacman doodle, and who knows what else? \o/

Wouldn’t it be nice if Google practiced what they preach?

Update: so it looks like my message went through to the people who needed to see it, and they found a filtering error in the User Agent sniffing code that made it think Epiphany was a too old Safari – I’m told the change will land in Youtube soon, thanks for those paying attention, and working on this! User Agent sniffing keeps being a problem, of course, and there are other stuff to fix, so I will probably still push my patch to spoof the user agent to google services which are still mishandling Epiphany, but it’s good to see some progress being made!

Update2: I started shipping a patch to send the Chrome user agent string to google domains in the Debian package for WebKitGTK+, when the “enable-site-specific-quirks” setting is enabled (which is the case for Epiphany); I already found something we were missing out on =D Google Images seems to have been greatly improved, and now faking being Chrome we are also able to enjoy it:

Google Images improved

WebKitGTK+ and the Web Inspector

When I started working on WebKitGTK+ I was a web developer, writing IT applications using Python and Django, and building features for content portals running Plone (argh). Even though I was an Epiphany user ever since it was forked off Galeon, I still had to use Firefox for my work, because I couldn’t really live without Firebug.

It should come as no surprise, then, that one of my first patches to WebKitGTK+ was actually making the awesome Web Inspector work in our port. After the initial support, though, not a lot has been done to further improve it, partly because it was already good enough for many uses, partly because I somehow started doing non-web development again ;).

These last weeks, through my R&D efforts in Collabora, I have been able to push Web Inspector features and integration a bit further. A simple change that boosts the Inspector’s usability quite a bit is having the nodes that are being hovered highlighted. Along with that, the ability to attach the inspector to Epiphany’s window should make it easier to use for poking the DOM.

The Web Inspector has a number of settings that control its behaviour. Since, for instance, enabling javascript debugging may slow down javascript performance, the inspector usually has it disabled by default, and provides a button to enable it. It also provides an option for always enabling that feature, but that does not work right now, because we are not saving/restoring the relevant settings. A solution to that is in the works using the GSettings infrastructure that was recently merged into glib.

Here’s a simple screencast, showing these improvements in action (click the video to check it out in full size):

WebKitGTK+ and WebM

So you probably heard about WebM, right? It’s the awesome new media format being pushed by Google and a large number of partners, including Collabora, following the release of the VP8 video codec free of royalties and patents, along with a Free Software implementation.

It turns out that if you are a user or developer of applications that use the GStreamer framework, you can start taking advantage of all that freedom right away! Collabora Multimedia has developed, along with Entropy Wave GStreamer support for the new format, and the code has already landed in the public repositories, and is already being packaged for some distributions.

I just couldn’t wait the few days it will take for the support to be properly landed in Debian unstable, so I went ahead and downloaded all of the current packages from the pkg-gstreamer svn repository, built everything after having the libvpx-dev package installed, and went straight to a rather unknown, small video site called Youtube with my GStreamer-powered WebKitGTK+-based browser, Epiphany!:

Youtube showing a webm video in Epiphany

If you’re running Debian unstable, or any of the other distributions which will be lucky to get the new codecs, and support packages soon, you should be able to get this working out of the box real soon now. Check the tips on WebM’s web site on how to find WebM videos on youtube.

WebKit2 and WebKitGTK+

So you’ve seen people talking about WebKit2, perhaps have seen someone claiming it “drops support for Linux“, and you’ve been wondering what the hell that means for WebKitGTK+. Well, welcome to the preemptive Q&A section with WebKitGTK+ maintainers =D. Let’s first explore some history so we can better understand what exactly is going on.

What exactly is WebKit2?

Currently, when we say “WebKit” we really mean one of the ports that are built on top of WebCore using the WebKit layer. WebCore is the part that does all of the hard Web-related work, WebKit an API layer that exposes WebCore functionality in a coherent way, so that the platform-specific ports can expose a public API layer for their applications to use, which is usually also called “WebKit”. This WebKit layer was designed by Apple to build the Mac, and Windows ports it maintains, and was later released as Free Software so that other ports, such as the GTK+, Qt, EFL ports could be built on top of it, instead of having to do all the heavy lifting from WebCore directly.

Current WebKit model

WebKit2 is nothing more than the second version of that interface, with a whole lot of changes on what you can expect from it, and on how it interacts with WebCore, and the platform-specific API and UI. First of all, the first WebKit was not API stable, and that interface was usually not made public by the various ports – they only exposed their platform-specific APIs. WebKit2 is being designed to provide a stable, cross-platform, C-based, non-blocking public API. This is huge. It will allow cross-platform code to be written without having to consider language, and port differences for basic functionality.

The second big change is the API will be made fully non-blocking. Currently most things you do are asynchronous already, but some of them may be completed in a synchronous ways (like, loading a string into WebKit instead of an URI). This is important for responsiveness, and is also a very important need for what comes next: process splitting.

WebKit2 will bring into WebKit proper the concept of splitting the UI process from the Web process, similar to what Chromium has. It also much more awesome than what Chrome has for a large number of reasons, including, but not limited to:

  1. It’s being contributed directly to the WebKit project, in a cross-platform way that lets ports such as WebKitGTK+ take advantage of it, instead of being shipped directly into Safari, like Google does with Chrome;
  2. The process separation goes bellow the API layer, meaning that all complexity involved in managing the process separation is handled by the library, and hopefully none of it leaks to the application using it; that means that applications like Devhelp and Yelp will be able to take advantage of this without having to make their lives more complicated;
There’s a much better diagram in WebKit2’s wiki page, but here goes a simplified version that demonstrates what I’m talking about:
WebKit2 model

What WebKit2 is not?

WebKit2 is NOT a rewrite of the whole WebKit stack. Webcore will continue mostly unchanged, and all ports currently building on top of it will keep working. It is also not a fork – the code lives in the same tree as the current version of WebKit, which will allow us to progressively move towards using this new, improved layer. WebKit2 is not Apple-only, and it is not dropping Linux support. Initial builds of the code that is being landed will likely show up building on Linux in the near future (specially because us porters are already eager to play with it).

What happens to WebKitGTK+?

In the near future, nothing special. We will continue working towards making it feature-complete, more stable, faster, and rocking on it as always. We will, though, start working out how we can best take advantage of WebKit2 in order to provide an even more awesome library for the G world. What this means is you can expect us to have a library that will provide a nice GTK+ widget, just like we have today, with a GObject-based API, like we have today, but that is built on top of this new WebKit2 infrastructure, taking advantage of the process-splitting, and the bigger focus on not blocking the UI thread. This should give us a platform that is more stable, and faster and more responsive than what we already have today.

The API is bound to change, of course, but the WebKit2 version of WebKitGTK+ will be a separate, parallel-installable library, and we will keep supporting the WebKit1 version while we work on making the new one at least as good as the current one. This is long term we’re talking here. We’ll likely see WebKitGTK+ 1.4, and 1.6 come to life before we are satisfied enough with WebKitGTK+2.

We hope this clears some of the doubts up, and lightens your hearts!

The WebKitGTK+ maintainers.

WebKitGTK+ 1.1.90 is out!

We’re coming close to GNOME 2.30 release date, and we are getting ready to branch a stable release off of WebKit’s svn trunk in preparation for that. The idea of the stable branch is to try to maintain, and improve stability, with no additional features going in. Speaking of features, though, if you’ve been paying attention you will have noticed WebKitGTK+ has come a long way, now.

We came from not having basic features such as download support or openning links in new tabs, a more-or-less working HTML5 media implementation, and very few or missing in action developers to a thriving project, that gets more, and more attention, and contributors every day, with advanced features available, and rocking HTML5 media support that leaves little to be desired. It’s been just over one year since we started rolling mostly bi-weekly releases, each adding more awesome features.

There are still many issues, and we are not always equipped as a team to handle all the specifics of the engine ourselves, but I am really happy with the progress we’ve made, and really thankful for the support my employer Collabora has given all the way for this to happen, including the early work on plugins, and many other things before my time as a contributor. When I switched to using Epiphany with the WebKit backend as my default browser back in January 2009, that meant having to deal with a whole lot of misbehaviour, and work-around a lot of painful brokeness. These days I enjoy a snappy, functional browser that makes me happy.

If you haven’t done so yet, go download, and test the newest Epiphany, with the latest WebKitGTK+, and help us make the GNOME 2.30’s web browser rock even more!

WebKitGTK+, and the Page Cache

So, one of the things I get to do during work hours for Collabora is to contribute code, and do maintenance tasks for WebKitGTK+, and have been doing so since early last year, working on all kinds of things, from improving the network backend to handle the real-world web, to fixing scrolling problems, while reviewing patches from the many awesome developers who have been joining us (more on that later =D).

One of the big features I have worked on this past month or so, along with Xan Lopez is the Page Cache. The page cache is a feature of web browsers that makes going back, and forward between pages in the same view very fast. It’s better explained in this post, but to summarize, the idea is that instead of destroying all the work you have done since downloading the resources, and having to reparse/rebuild the structures the view uses to display the page from the cached resources, you hit pause on the page, and store the whole thing as is, and when coming back to it, you just hit play. You can see in the video two instances of Epiphany, one with the page cache enabled, one with it disabled. Easy to see which was has it enabled. Thanks to KiBi for the suggestion regarding a page that shows this easily =D.

We initially thought we had this feature enabled, since our initialization functions (that exists since before the current maintainers were involved) did setup the number of desired pages in the cache, but during the hackfest we held in December we found out we were fooled all this time. Enabling the page cache does make going back faster, but also made lots of things become unstable and crash.

Since then, we have been working on figuring out all the problems, and fixing them, using help from adventurous users of in-development software ;D. I believe we’re now at a point in which I can happily declare the GTK+ port has a working page cache in trunk! If you’re interested in the nasty details, bear with me!

Let me go back in time a bit, and show you what problems we had. First, some background: the GTK+ port deviates a lot from the other ports when it comes to scrolling. This is because, when designing this part of the port, Holger Freyther had a very nice idea in mind: that the WebView should be a first-class citizen GTK+ scrollable widget. Meaning it would use GTK+’s adjustments for scrolling, and be able to interact with any parent scrolling widget, be it a GtkScrolledWindow, or a MokoFingerScroll.

We cannot just throw away all the rest of the scrolling code in WebCore, though, that deals with all the details related to interacting with the DOM, and JavaScript code. This means our WebView contains adjustments that need to be set, and unset on our port’s version of WebKit’s own representation of the view, called the FrameView, to interact with it, and to get updates on the bounds of content, and such. For every load, in the non-page-cache case, a new FrameView is created, the previous one is destroyed – this means we need to set the adjustments on every load.

The problem starts when you have the page cache enabled, because the code path used to do what is called “commit” the load of a cached page (that is, start replacing the content that is currently being displayed by the one that should now be displayed) is completely different, and we were not setting the adjustments on this new view, so we started with that.

But all was not well. We were still having weird behaviour with scrollbars disappearing, and becoming the wrong size, and worse, crashes when “back” was hit. We then started investigating in more detail how it is that the page cache does its magic, to try and figure out the source of all evil.

It turns out that when you leave a page that can be cached, the existing FrameView is no longer destroyed – it is stored as is in a CachedFrame to be restored if you go back, and a new one is created for the new page. This was having the undesired effect of having the adjustment be set in more than one FrameView at once, causing all kinds of (predictable, after we knew for real what was going on) unwanted effects. Thus, we reworked the code to make sure the adjustments are only ever set in one FrameView at once, making sure they are unset when the FrameView is being frozen, and reset when it’s being restored from the page cache.

Last, but not least, it was discovered that going back to a page that contained resources with data: URIs (such as Google results pages which contain a small number of image hits) also caused a crash. This was because our network backend was not storing the data: URI in the ResourceResponse objects it fed into WebCore. The page cache relies on those responses to recreate the requests it uses to artificially replay the load when restoring the page from the page cache, so we fixed that as well.

What can be taken from all this? Building browsers is a lot of hard work! I can’t think how we could deal with this level of complexity without the awesome testing suite of WebKit. The good news is all of those issues I talked about in this post are now covered by the automate tests that run as part of the normal buildbot cycle in our bots, so we’re covered for the future, at least for these specific problems =D.

Cool hack – html5tube

Did I mention I hate flash? I do. It crashes a lot, and is overall a bad thing for the web, in my opinion. But I do enjoy watching videos on the web, and unfortunately, up to this day, flash is what most sites use to show videos. Months ago I read a couple of blog posts with nice hacks to make Firefox able to play youtube videos without using the flash player. Some recent discussions with colleagues at work got me itching to try my hand at something similar for Epiphany.

HTML5Tube working

So I went ahead, and wrote a new extension that does just that – in youtube video pages, it finds the flash player element, and replaces it with an HTML5 video tag pointing to the actual movie file. This causes the internal HTML5 media player built into WebKitGTK+, that is based on GStreamer, to play the movie. That means you only need to have the necessary GStreamer magic, and the extension enabled, to enjoy the movie.

There are some caveats – in-video text messages are gone (though I’m pretty sure we could get them added somehow), playlists, and other places which display videos other than the ‘normal’ video watching page are not handled, youtube needs to think you have the flash plugin installed, at least, so the only way to make it work right now is to actually have a flash plugin installed. I think we could probably get away with the last problem somehow, by looking at what the totem youtube plugin code does, for instance, and replicating it.

WebKitGTK+ HackFest!

The WebKit hackfest is now over, and I think it was a very productive week. Thank you very much to all who attended, to Igalia for organizing the hackfest, and hosting us so well, to Collabora for having sponsored the event, and allowed me to spend the week working on it, and to the GNOME foundation for having payed all of my costs!

Xan blogged about day 0, and also a summary of all that was done, so I’ll focus on the stuff he forgot to mention ;D. The hackfest, for me, started on day -1 with me not allowing Xan to go sleep before he had reviewed a couple patches of mine to fix DOM context menu handling. It always bothered me that Epiphany failed to open right-click menus in some pages, or let pages handle the right click. Well, this is fixed now, and Zimbra users can now have their right click menus, and WoW players can remove talent points from their calculators =P.

It turns out that many of the attendees don’t like pages messing with their context menus, though, and they had some good points to back up their positions (like pages making it hard to save images, for instance), so I implemented a way to force openning the custom menu: Ctrl-rightclick.

We wanted to use a GtkInfoBar to display questions regarding the form saving – our initial implementation always saved all credentials, but that didn’t sound good enough. Xan and I thought it would be very complicated to make this work, because there were assumptions in the code regarding which widget contains which, but it turned out to be quite trivial – making EphyEmbed a descendant of GtkVBox instead of GtkScrolledWindow, fixing a small number of assumptions, and that was it.

The passwords are saved in the GNOME Keyring. It’s interesting to point out that GNOME Keyring seems to be unhappy with the number of passwords a browser stores – Xan’s daemon was hanging, crashing, and spawning a large number of threads. My daemon decided to take up some 300MB of RAM at one point. It’s somewhat funny to see how much a browser pushes the limits of our platform. We are hoping this will improve with the new keyring APIs, and the rewrite that is ongoing. It’s nice to see my browser form passwords in seahorse, though, and be able to manage them like any other.

One more thing worth of notice, although this post is already a bit too big: one of the main concerns people had during the Hackfest was on making build time smaller. Touching a single file in WebCore causes a debug build of 10 minutes on my laptop. Evan Martin and Benjamin Otte made a push at removing unnecessary includes from WebCore, and WebKitGTK+ files, which brough the build time down a bit. They end up inspiring Aroben, from Apple, to go even further into this, and remove many includes from files all over WebKit.

Evan was also able to bring linking time down by making it possible to link libwebkit without having to build all the intermediate libraries, which brought build time down to 1 minute, when touching a single file in WebCore. Behdad and I also started looking into breaking WebCore up into lots of shared libraries for Debug builds, since we don’t care too much about speed penalties in those. None of these experiments got committed yet, but I am hoping we will be having a better time hacking on WebKitGTK+ in the near future.

It was awesome meeting everyone, by the way! See you around =).