WebKitGTK+ 1.1.90 is out!

We’re coming close to the GNOME 2.30 release date, and we are getting ready to branch a stable release off WebKit’s svn trunk in preparation for that. The idea of the stable branch is to maintain and improve stability, with no additional features going in. Speaking of features, though, if you’ve been paying attention you will have noticed that WebKitGTK+ has come a long way.

We came from not having basic features such as download support or opening links in new tabs, a more-or-less working HTML5 media implementation, and very few (or missing-in-action) developers, to a thriving project that gets more attention and more contributors every day, with advanced features available and rocking HTML5 media support that leaves little to be desired. It’s been just over one year since we started rolling mostly bi-weekly releases, each adding more awesome features.

There are still many issues, and we are not always equipped as a team to handle all the specifics of the engine ourselves, but I am really happy with the progress we’ve made, and really thankful for the support my employer Collabora has given all the way for this to happen, including the early work on plugins, and many other things before my time as a contributor. When I switched to using Epiphany with the WebKit backend as my default browser back in January 2009, that meant having to deal with a whole lot of misbehaviour, and work around a lot of painful brokenness. These days I enjoy a snappy, functional browser that makes me happy.

If you haven’t done so yet, go download and test the newest Epiphany with the latest WebKitGTK+, and help us make GNOME 2.30’s web browser rock even more!

WebKitGTK+, and the Page Cache

So, one of the things I get to do during work hours for Collabora is to contribute code and do maintenance tasks for WebKitGTK+. I have been doing so since early last year, working on all kinds of things, from improving the network backend to handle the real-world web to fixing scrolling problems, while reviewing patches from the many awesome developers who have been joining us (more on that later =D).

One of the big features I have worked on this past month or so, along with Xan Lopez, is the Page Cache. The page cache is a feature of web browsers that makes going back and forward between pages in the same view very fast. It’s better explained in this post, but to summarize, the idea is that instead of destroying all the work you have done since downloading the resources, and having to reparse/rebuild the structures the view uses to display the page from the cached resources, you hit pause on the page and store the whole thing as is, and when coming back to it, you just hit play. You can see in the video two instances of Epiphany, one with the page cache enabled, one with it disabled. It is easy to see which one has it enabled. Thanks to KiBi for the suggestion regarding a page that shows this easily =D.

We initially thought we had this feature enabled, since our initialization functions (which have existed since before the current maintainers were involved) did set up the number of desired pages in the cache, but during the hackfest we held in December we found out we had been fooled all this time. Enabling the page cache did make going back faster, but it also made lots of things become unstable and crash.

Since then, we have been working on figuring out all the problems, and fixing them, using help from adventurous users of in-development software ;D. I believe we’re now at a point in which I can happily declare the GTK+ port has a working page cache in trunk! If you’re interested in the nasty details, bear with me!

Let me go back in time a bit, and show you what problems we had. First, some background: the GTK+ port deviates a lot from the other ports when it comes to scrolling. This is because, when designing this part of the port, Holger Freyther had a very nice idea in mind: that the WebView should be a first-class citizen GTK+ scrollable widget, meaning it would use GTK+’s adjustments for scrolling and be able to interact with any parent scrolling widget, be it a GtkScrolledWindow or a MokoFingerScroll.

We cannot just throw away all the rest of the scrolling code in WebCore, though, that deals with all the details related to interacting with the DOM, and JavaScript code. This means our WebView contains adjustments that need to be set, and unset on our port’s version of WebKit’s own representation of the view, called the FrameView, to interact with it, and to get updates on the bounds of content, and such. For every load, in the non-page-cache case, a new FrameView is created, the previous one is destroyed – this means we need to set the adjustments on every load.

The problem starts when you have the page cache enabled, because the code path used to do what is called “commit” the load of a cached page (that is, start replacing the content that is currently being displayed by the one that should now be displayed) is completely different, and we were not setting the adjustments on this new view, so we started with that.

But all was not well. We were still having weird behaviour with scrollbars disappearing, and becoming the wrong size, and worse, crashes when “back” was hit. We then started investigating in more detail how it is that the page cache does its magic, to try and figure out the source of all evil.

It turns out that when you leave a page that can be cached, the existing FrameView is no longer destroyed – it is stored as is in a CachedFrame to be restored if you go back, and a new one is created for the new page. This was having the undesired effect of having the adjustment be set in more than one FrameView at once, causing all kinds of (predictable, after we knew for real what was going on) unwanted effects. Thus, we reworked the code to make sure the adjustments are only ever set in one FrameView at once, making sure they are unset when the FrameView is being frozen, and reset when it’s being restored from the page cache.
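
The invariant described above can be sketched in a few lines. This is only a toy model in Python, not the actual WebKitGTK+ code: the class and method names (Adjustment, set_adjustments, load_new_page, go_back) are made up for illustration, and only the idea that exactly one FrameView holds the adjustments at any time comes from the fix.

```python
class Adjustment:
    """Stand-in for a GtkAdjustment; it only tracks a scroll value."""

    def __init__(self):
        self.value = 0.0


class FrameView:
    """Toy FrameView: holds the adjustments only while it is the live view."""

    def __init__(self, name):
        self.name = name
        self.adjustments = None  # (horizontal, vertical) pair, or None

    def set_adjustments(self, adjustments):
        self.adjustments = adjustments


class WebView:
    """Toy WebView that owns one pair of adjustments for its whole life."""

    def __init__(self):
        self.adjustments = (Adjustment(), Adjustment())
        self.current = None
        self.page_cache = []  # frozen FrameViews, most recent last

    def load_new_page(self, name):
        if self.current is not None:
            # Freezing: unset the adjustments before caching the old view.
            self.current.set_adjustments(None)
            self.page_cache.append(self.current)
        self.current = FrameView(name)
        self.current.set_adjustments(self.adjustments)

    def go_back(self):
        restored = self.page_cache.pop()
        # The view being left loses the adjustments (a real engine would
        # also keep it around for going forward; this sketch drops it).
        self.current.set_adjustments(None)
        self.current = restored
        # Restoring: hand the adjustments back to the thawed view.
        self.current.set_adjustments(self.adjustments)


view = WebView()
view.load_new_page("first")
first = view.current
view.load_new_page("second")
second = view.current
after_navigation = (first.adjustments is None,
                    second.adjustments is view.adjustments)
view.go_back()
after_back = (first.adjustments is view.adjustments,
              second.adjustments is None)
print(after_navigation, after_back)
```

The broken behaviour corresponds to leaving out the two set_adjustments(None) calls, which leaves a frozen view and the live view sharing the same adjustments.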

Last, but not least, it was discovered that going back to a page that contained resources with data: URIs (such as Google results pages which contain a small number of image hits) also caused a crash. This was because our network backend was not storing the data: URI in the ResourceResponse objects it fed into WebCore. The page cache relies on those responses to recreate the requests it uses to artificially replay the load when restoring the page from the page cache, so we fixed that as well.

What can be taken from all this? Building browsers is a lot of hard work! I can’t think how we could deal with this level of complexity without the awesome testing suite of WebKit. The good news is that all of the issues I talked about in this post are now covered by the automated tests that run as part of the normal buildbot cycle in our bots, so we’re covered for the future, at least for these specific problems =D.

Regressions, ah, regressions

There are few things I really hate. One of them is regressions. Regressions are bad because they usually take away things we are used to relying on, and leave us with the idea that perhaps the technical improvements didn’t really improve our lives as users, despite putting less burden on the developers. Software is made for users, after all.

As part of my work on WebKitGTK+, I always keep an eye on regressions, both from previous WebKitGTK+ releases, and those imposed on embedding applications on their migration away from Gecko, and try to focus some of my efforts into lowering their numbers, whenever I can.

In recent times I have worked on removing a few very user-visible regressions in Epiphany, which I see as the most demanding WebKitGTK+ user in GNOME, such as save page not working, missing favicon support, failing to perform server-pushed downloads (such as GMail attachments), and not being able to view source. An example of a regression from a previous version of WebKit also exists: in 1.1.17 we started advertising more than we should as supported by the HTML5 media player, causing downloads to be almost completely broken.

All of these are working if you are using WebKit and Epiphany from trunk/master, so should be on the next development versions of WebKitGTK+ and Epiphany. Other people have also fixed many other regressions; a few examples: Xan has reimplemented the Epiphany customization of the context menu, Frederic Peters provided a work-around for mailto: links while we don’t have SoupURILoader yet, and Joanmarie Diggs keeps rocking on the accessibility front!

If you find regressions, keep them coming! If you have a patch, even better! =)

Next week the WebKitGTK+ team gets together to work furiously on improving WebKitGTK+ in a hackfest sponsored by Collabora and Igalia, and hosted/organized by Igalia. While there I should also get my hands on one of these. Can’t wait! =)

My first patch to WebKitGTK+ committed!

Well, not really my first patch. But the first thing I tried to mess with when I first started looking at WebKitGTK+ was the WebKitNetworkRequest object, because I was fancying the idea of writing stuff such as HTTP transaction monitoring, and things like that. So I wrote a big patch which exposed the internal WebCore object (ResourceRequest) fully through our own object. That was back in early 2008. We have come a long way since, and through all these months I got a broader perception of what kind of APIs we need, and how WebCore works. We also decided on going soup-only, which had a huge impact on what the final patch actually looks like.

The patch which finally got committed this week is, how can I put it, VERY different from what I had originally written. You can take a look at the long discussions about it in the bug report I used to track progress. I think I should point out that Marco Barisione and Christian Dywan were crucial in helping me get going with my contribution to WebKit at that time.

What this change gives us is that a WebKitNetworkRequest now carries more than just the URI for the request (it actually carries with it a reference to the SoupMessage that will be used later in the request processing, which we are planning to expose in the near future). This means that when the WebKit API gives you a request, and you use it to cause a new load (for, say, opening in a new tab), you still get all the headers that were supposed to go with the request, so you don’t lose things such as Referer. So now, after more than 5 years, the bug that complained that Epiphany (and Galeon before that) did not set Referer for new tabs is finally closed.

By the way, this problem was fixed for Mozilla’s browser back in 2002, but the embedding API is still buggy to this day. There is still hope, since there’s an attached patch that fixes the issue, waiting to be reviewed and landed. If anyone is reading, it might be a good opportunity to get this fixed in there as well, so that users of applications that use Gecko’s embedding API can also benefit!

fish, a shell for the year 2000+

There are few things I have changed in my list of favorite applications since I started using Debian with GNOME. Some of the new applications that entered my daily menu did so because of hardware upgrades to my computer: I was using a 486 in 1999, and could not run most of the applications I wanted to run.

The rest was basically following the applications that became the best among their peers in the Debian/GNOME world: Rhythmbox, Epiphany, Evolution, Tomboy, GNOME Terminal, Empathy, etc. Some others have always been close to my heart, and I have rarely felt like changing them: bash and Emacs among them.

It turns out that lately I have been trying more modern things, which do not have to worry about supporting unices that I do not care about and that died more than 10 years ago. One of the things I found along these lines was fish. It is as good as bash, much smaller, does a much better job of splitting functionality out into other programs, and was made for current times.

Among other things, fish knows how to talk to the X clipboard, handles GNU screen well, and knows how to use the FreeDesktop MIME database (with the open command). It gives more complete and descriptive information in things like tab completion, and has a syntax that is similar enough to bash’s that you don’t get lost in silly details, yet simpler and more elegant.

fish also does not emulate old bugs from limited shells such as the Bourne Shell. For example, in bash you have to worry about whether the contents of a variable contain spaces or not. This is an emulated limitation, which comes from the Bourne Shell. An example:

In bash:


kov@abacate:/tmp$ file="a b.txt"
kov@abacate:/tmp$ echo $file
a b.txt
kov@abacate:/tmp$ ls $file
ls: cannot access a: No such file or directory
ls: cannot access b.txt: No such file or directory

In fish:


kov@abacate /tmp> set file "a b.txt"
kov@abacate /tmp> echo $file
a b.txt
kov@abacate /tmp> ls $file
a b.txt
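
The difference between the two transcripts comes down to the argument list that ls ends up receiving. Here is a rough model of it in Python; the two helper functions are hypothetical, and the bash one deliberately ignores globbing and custom IFS values, so take it as an illustration rather than a description of either shell’s implementation.

```python
def bash_unquoted_expansion(value):
    """Rough model of an unquoted $var in bash: split on whitespace."""
    return value.split()


def fish_expansion(value):
    """Rough model of $var in fish: the value stays a single argument."""
    return [value]


file = "a b.txt"
bash_argv = ["ls"] + bash_unquoted_expansion(file)
fish_argv = ["ls"] + fish_expansion(file)

print(bash_argv)  # ls is asked about two files, "a" and "b.txt"
print(fish_argv)  # ls is asked about the single file "a b.txt"
```

In bash the usual way to get the second behaviour is to quote the expansion, as in ls "$file".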

Seeing things like fish reminds me that we are in the year 2008. It is time to leave some things behind and evolve. We still worry about wrapping lines at 80 columns, creating ChangeLog files, and other things that were necessary in the past and that today we only do out of pure tradition. Today we have powerful editors (some of them even ‘old’, such as Emacs, which I still love dearly and do not see myself leaving anytime soon, hehe) and very powerful version control tools. So let’s move forward.

Fun with regular expressions

Today my friend Metal was having a small problem writing a URL mapping for Django. Django uses an idea that I find kind of stupid, which is mapping regular expressions to methods, with captured parts being turned into arguments. It is flexible, but very error-prone, and more complex than I think URL mapping has to be.

Anyway, his method had to receive an email address as its only argument. The mapping he was using was the following:

  • (r'^login/(?P<e>.*)/?$', 'projeto.aplicacao.views.login')

This made the ‘e’ variable of the login method receive the email address when the user accessed the URL /login/endereco@deemail.com. The problem is that sometimes the URL accessed is /login/endereco@deemail.com/ (note the trailing slash). “Ah, but there is that slash after the parenthesis, with the question mark, which is there exactly for this case”, you will say… and I tell you right away that if the slash is present it will be captured into the ‘e’ variable! Bug? Nah, just a peculiarity of how regular expressions work.

Understanding two simple things makes you level up in regular expressions:

  1. REs are matched character by character
  2. * and + are greedy, and will eat everything you give them

A simple example; consider the following text:

"a" b "c"

If you apply the regular expression ".*" to that text, what do you think will match? Let’s see. Note that the double quotes are part of the RE; the single quotes are only there to keep the shell from trying to expand the special characters.

$ echo '"a" b "c"' | egrep --colour '".*"'
"a" b "c"

egrep colored the whole text; this happened because the RE was matched as follows:

  1. take the first character of the RE, "; try it against the first character of the string, "? it matches
  2. take the second character of the RE, ., that is, any character; try it against the second character of the string, a? it matches
  3. take the third character of the RE, *, whoa, now we just repeat the last one as many times as we can…; try it against the third character of the string, "? it matches
  4. still on the third character of the RE, *; try it against the fourth character of the string, [blank_space]? it matches (and the same goes for the fifth, sixth and seventh characters)
  5. still on the third character of the RE, *; try it against the eighth character of the string, c? it matches
  6. still on the third character of the RE, *; try it against the ninth character of the string, "? it matches
  7. the string has run out of characters… now what? well, we take the fourth character of the RE, "; since we are at the end of the string, we go back until we find a character that matches it… we go back to the ninth character of the string, ", it matches, and we are done

This last step, going back to try to match against the string the characters still left in the RE, is called backtracking, and depending on the size of the text being processed it can destroy the performance of the application using the RE. Well… but how do we solve this?

$ echo '"a" b "c"' | egrep --colour '"[^"]*"'
"a" b "c"

What did I do? Basically I replaced the . with an expression that says not ". In other words, I am asking the RE to match any number of characters that are not double quotes. Let’s repeat the process:

  1. take the first character of the RE, "; try it against the first character of the string, "? it matches
  2. take the second character of the RE, [^"], here we have an expression saying that the " character will not do; try it against the second character of the string, a? it matches
  3. take the third character of the RE, *, whoa, now we just repeat the last one as many times as we can…; try it against the third character of the string, "? no
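
The two walkthroughs can be reproduced outside the shell. The snippet below is a small sketch that uses Python’s re module instead of egrep (a swap made only for convenience; the variable names are illustrative). The greedy pattern returns one match spanning the whole text, while the negated character class returns each quoted piece separately.

```python
import re

text = '"a" b "c"'

# Greedy: .* runs to the end of the text and backtracks to the last quote.
greedy_match = re.search(r'".*"', text).group(0)

# Negated class: [^"]* stops at the first closing quote it meets.
quoted_parts = re.findall(r'"[^"]*"', text)

print(greedy_match)  # "a" b "c"
print(quoted_parts)  # ['"a"', '"c"']
```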

The fact that we used an expression that says, simplifying, not " repeated n times made the * operator stop before reaching the end of the text, which saved us a lot of time and also gave the expected result. The solution to Metal’s problem was, therefore, to replace the . with [^/] in his expression. That way the slash, when present, was matched outside the parentheses and was not captured.
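
To check the fix without a Django project, the two patterns can be run directly through Python’s re module. This is just a sketch: the URL strings omit the leading slash, on the assumption that the mapping is matched against the path without it, and the names greedy, fixed and captured are only for illustration.

```python
import re

# Original mapping pattern: .* is greedy and swallows the trailing slash.
greedy = re.compile(r'^login/(?P<e>.*)/?$')
# Fixed pattern: [^/]* cannot cross a slash, so /? gets to match it.
fixed = re.compile(r'^login/(?P<e>[^/]*)/?$')

with_slash = 'login/endereco@deemail.com/'
without_slash = 'login/endereco@deemail.com'

captured = {
    'greedy, with slash': greedy.match(with_slash).group('e'),
    'greedy, without slash': greedy.match(without_slash).group('e'),
    'fixed, with slash': fixed.match(with_slash).group('e'),
    'fixed, without slash': fixed.match(without_slash).group('e'),
}

for case, value in captured.items():
    print(case, '->', value)
```

Only the first case keeps the slash inside the captured group; the other three yield the bare email address.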

It is left as an exercise for the reader to imagine what happens when we try to grab what lies, in a gigantic HTML file, between the <html> tag and the <head> tag using the RE <html>.*<head>; I can tell you up front that it is not efficient at all… anyone up for posting in the comments? =D

Update: coredump reminded me of the *? operator, which is the non-greedy *; it is useful, but see the comments for why it is not a solution for this specific problem.

PolicyKit rules, and, when error handling bites you

PolicyKit, as I said before, is the authentication/authorization framework that will finally replace the gksu hack with a real solution. I am playing with some test code so that I can perhaps contribute to that goal. Everything was going pretty nicely in my tests when I hit a blocker. I spent about two hours trying to figure out the problem. Here’s the code:


sender = dbus_message_get_sender(message);
pk_caller = polkit_tracker_get_caller_from_dbus_name(pk_tracker,
                                                     sender,
                                                     &dbus_error);
if(dbus_error_is_set(&dbus_error))
{
    g_error("Failed to get caller from dbus: %s: %s\n",
            dbus_error.name, dbus_error.message);
    return NULL;
}

When my mechanism goes to check if the caller is allowed to run the action, I first need to get the caller, of course. That code was always failing with this error message:


** ERROR **: Failed to get caller from dbus: org.freedesktop.DBus.GLib.UnmappedError.CkManagerError.Code0: Unable to lookup session information for process '25594'

It took me a long time, and a lot of code reading in PackageKit, to realize the really simple problem with my code: I was assuming that the D-Bus error would only be set if polkit_tracker_get_caller_from_dbus_name failed, which is just not the case. The D-Bus error was set, but the function actually worked, most probably getting the information it needed from something other than ConsoleKit. So that brings us to something like this:


sender = dbus_message_get_sender(message);
pk_caller = polkit_tracker_get_caller_from_dbus_name(pk_tracker,
                                                     sender,
                                                     &dbus_error);
if(pk_caller == NULL)
{
    if(dbus_error_is_set(&dbus_error))
    {
        g_error("Failed to get caller from dbus: %s: %s\n",
                dbus_error.name, dbus_error.message);
        return NULL;
    }
}

And, voilà!