Category Archives: development

Montanha: now keeping an eye on BH’s city councillors, plus some comments about contributions

Some readers may know that in mid-2010 I wrote a program called ‘Montanha’. The original idea was to help me choose a candidate for state deputy by giving me an overview of how the deputies of the time spent their expense allowance (the verba indenizatória). The website of the Assembléia Legislativa de Minas Gerais publishes this information, but in a very inconvenient form that makes it absolutely impossible to get an overview of how the deputies spend the dough. Naturally I put the code online and brought up a public instance so that other people could do the same. After that the great tevaum joined the team, and we have already added an instance for the new legislature, which took office in 2011.

Over the last few days I decided that, with the people of Belo Horizonte paying attention to their councillors because of the controversy over the salary increase and the mayor’s veto, it would be a good time to write the collector and bring up a new Montanha instance to watch the councillors’ spending. Anyone who takes a look will quickly notice the project is still half done: party information for the councillors is still missing, the links say ‘deputados’ (deputies), and so on. But I am faithful to the principle of release early, release often, so I did not want to wait: as soon as data started filling the database I pushed the project out.

Now some comments on questions people have been putting to me:

Cool! If you need help, we’re here!

Thanks! This is a free software project: the code is under the Affero GPL3 and your contribution is welcome. I firmly believe in another principle: talk is cheap; show me the code. I do not intend to organize or coordinate other people’s efforts, so do not expect me to ask for help with something specific or to hold your hand. Feel free to clone the project, make the changes you think should be made, and propose them. I cannot guarantee anything will be merged into my branch, but I am willing to discuss design questions and plans and to answer questions about the code, mainly in the #linux-bh channel on freenode =).

What are the plans for the future?

My immediate TODO is (and feel free to steal any item and do it):

  • add party data to the Câmara Municipal de BH data
  • change the montanha interface so it says ‘parlamentares’ (legislators) instead of ‘deputados’ (deputies)
  • write a collector for the CMBH data from before March 2010
  • improve the linkability of searches: let you send a link to the view of all expenses, for example, with a search already applied
  • write some posts on Observador Político and Trezentos drawing attention to some of the information montanha exposes
  • add more charts, spending over time for example
  • improve the information the system gives about the period covered by the data
  • increase the amount of trivia shown on the legislator details page
  • create a page with details and trivia for suppliers

Why don’t you put this project in Transparência Hacker (or some other group)?

My answer to this kind of question has been ‘why should I?’ It is not that I am a lone wolf, but I think it only makes sense to take part in a specific project if there is a reason to. Visibility does not worry me much: the message always ends up reaching those who are interested and those I am interested in reaching.

I also do not believe that being part of a group, any group, guarantees contributors. As I said, talk is cheap, and I am sure I would find plenty of that in a group, but I believe people who want to contribute will contribute whether or not they are in a group (as Estêvão does). If a large share of the contributions came from people who belong to a group, and it made sense to discuss the project inside it, then I would see the point, for example.

One last concern, this one specific to thack, is that the group’s focus seems very different from mine. My goal is for society to have a tool to watch its legislators. For that to happen the tool needs to have a longer life and be maintained. The data on the previous ALMG legislature, for example, has already been removed from the ALMG site, but Montanha is still there, and society still has access not only to all the data but also to a more reasonable visualization of it. I am not promising to maintain it forever, of course, mainly because I do this in my spare time (in which I also work on Debian and GNOME, eat, sleep and have fun), but my idea is to focus on this one problem and have a good, reasonably long-lived solution.

Most of what I have seen from thack are very cool hacks, but their life seems to be very short: as soon as a hack is ready another cool idea shows up and that one is left behind. Other people have raised this point too, as an example of why groups like thack are not the definitive solution to the open data problem and why app contests do not replace serious work inside government; it is not unusual to find things with years-old data, or that no longer work at all. Note that I have nothing against thack per se, much less against the people in it. I consider them colleagues and friends; I just think we surf different waves, which makes me think I would not add value to the group and vice versa. Obviously I can be convinced otherwise eventually =)

The Blocks C extension and GIO asynchronous calls

So, I intended to be completely away from my computer during my vacation, but hey. I have been interested in the extension Apple added to the C language a little while ago, which introduces the equivalent of closures to C. Today I spent a few minutes looking into it and writing a few tests with the help of clang.

Here’s something I came up with, to use a block as the callback for a GIO asynchronous call:

#include <Block.h>
#include <gio/gio.h>

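/* A block that receives the async result; file and loop are captured from the enclosing scope. */
typedef void (^Block)(GAsyncResult*);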

static void async_result_cb(GObject* source,
                            GAsyncResult* res,
                            gpointer data)
{
    Block block = (Block)data;
    block(res);
}

int main(int argc, char** argv)
{
    g_type_init();

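    if (argc != 2) {
        /* g_error() aborts the program, so report usage errors with g_printerr() instead. */
        g_printerr("Usage: %s <path>\n", argv[0]);
        return 1;
    }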

    GMainLoop* loop = g_main_loop_new(NULL, TRUE);
    GFile* file = g_file_new_for_path(argv[1]);

    g_file_query_info_async(file,
                            G_FILE_ATTRIBUTE_STANDARD_CONTENT_TYPE,
                            G_FILE_QUERY_INFO_NONE, G_PRIORITY_DEFAULT,
                            NULL, async_result_cb, (gpointer) ^ (GAsyncResult* res) {
        GError* error = NULL;
        GFileInfo* info = g_file_query_info_finish(file, res, &error);

        if (error) {
            /* g_warning() does not abort, so the cleanup below actually runs. */
            g_warning("Failed: %s", error->message);
            g_error_free(error);
            g_main_loop_quit(loop);
            return;
        }

        g_message("Content Type: %s",
                  g_file_info_get_attribute_string(info, G_FILE_ATTRIBUTE_STANDARD_CONTENT_TYPE));

        g_object_unref(info);
        g_main_loop_quit(loop);
    });

    g_main_loop_run(loop);
    g_object_unref(file);

    return 0;
}

Pretty neat, don’t you think? To build you need to use clang and have the blocks runtime installed (libblocksruntime-dev in Debian). Here’s the command I use:

$ clang -fblocks -o gio gio.c -lBlocksRuntime `pkg-config --cflags --libs gio-2.0`

“Because I learned in college that allocating memory is expensive”

Another interesting example of what I talked about in an earlier post. Recently someone who had started working on a project I have been working on for some time asked me for my opinion on a patch of his. His original patch had been reviewed by a project colleague, and apart from some small style nits there was a single concern: “why do you want to make this change?” The change was quite simple: it avoided freeing and reallocating a variable in some specific cases.

In this project one of the things people like least is blind optimization. Any change you make can introduce bugs and increase the complexity of the code. If you are going to take that risk, you had better know that it actually makes a difference. That is why optimizations are generally only welcome if they come with a performance test that shows an improvement. If it does not improve anything, why change it? Of course there are exceptions to the rule: some changes end up simplifying the code and are welcome even if the only result is that performance does not get worse.

When I said that blind optimizations are not cool, he told me: “Blind or not, I am making the change because I learned in college that allocations are expensive, mainly because they cause system calls, and so they should be avoided”. Whoa. Memory allocations do involve the kernel, but do all memory allocations cause system calls and the consequent (and expensive) context switch to kernel mode?

Easy to test. If it is true that every allocation generates a system call, the following program will show 100 million system calls when we run it under strace:


#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv)
{
    int i;
    char* data;

    printf("START\n");
    for (i = 0; i < 100000000; i++)
        data = malloc(10);
    printf("END\n");

    return 0;
}

Let’s see:

[...]
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fbe62175000
write(1, "START\n", 6START
)                  = 6
brk(0)                                  = 0x1b83000
brk(0x1ba4000)                          = 0x1ba4000
write(1, "END\n", 4END
)                    = 4
exit_group(0)                           = ?

Say what? Two system calls between the two writes? Yep. It turns out the people who wrote malloc() already know that asking the kernel for memory is expensive, so they made the libc preallocate a larger amount of memory at once and hand out pieces of it as the application asks. Does that mean memory allocation is not expensive? No. It is reasonably expensive even without a context switch; after all, the libc has to do a fair bit of work to track how much is allocated and the size of each allocated piece, so that it can be released later with free(). If instead of calling malloc 100 million times I call it just once and then call memset() 100 million times, the little program gets 10 times faster.

Does that mean we should avoid every allocation that can be avoided? It depends. For starters this is an extreme test, not a real workload. Tests are always better on real workloads (or something closer to real). But above all, if it is going to hurt the readability of the code a lot, make it more complex, or keep memory committed for longer than necessary, there needs to be a performance gain to offset that. And that gain has to be measured, not imagined =).

A clutter port of WebKit

In case you missed the news on webkit-dev, Collabora has been working on developing a clutter port of WebKit. It shares the build system with EFL, and most of the backend code comes from the GTK+ port. That means networking is handled by soup, drawing by cairo, multimedia by GStreamer, and so on.

If you’d like to give it a try, you can clone the repository from gitorious:

$ git clone git://gitorious.org/webkit-clutter/webkit-clutter.git

Then to build it you use cmake. From inside the source code directory do this:

$ mkdir build
$ cd build
$ cmake .. -DPORT=Clutter -DSHARED_CORE=1 -DBUILD_MX_LIB=1
$ make

The BUILD_MX_LIB option is optional – it will build what we call the “Mx toolkit library” in addition to the vanilla one. Then you can test that the library is built and working by running the programs inside the “Programs” directory. Enjoy!

WebKitGTK+ hackfest 2010!

Last week I attended the WebKitGTK+ 2010 hackfest. It was a great opportunity to meet up with the other developers, discuss some plans for the future, and hack away at WebKitGTK+. But, most importantly, to play Street Fighter 2 =). Thanks to Collabora and Igalia for sponsoring the hackfest, Igalia for hosting and organizing it (well done!), and the GNOME Foundation for having sponsored my trip to Coruña!

Unlike last year we didn’t find any big design issues hurting our work on new features (page cache, I’m looking at you!). I also didn’t have any huge plans for new API, although we did manage to get some new stuff in there, like the plugins management API Xan created, and the further work done by Dan in soup. This meant that, from my point of view, the hackfest was a great opportunity to look at refactorings we could do to make the code simpler to understand, change, and even share =). Besides that, I pushed the Debian packaging of the 1.3.x series a bit further.

Coruña was great as always, and I enjoyed going around, eating and hacking there, although I caught a cold in the last few days, which sometimes hindered my ability to stare at the screen for too long. Now that I’m back in the hot Brazilian summer I’ll hopefully get better soon =)

Cheers \o/

Sponsored by the GNOME Foundation

WebKitGTK+ and the Web Inspector

When I started working on WebKitGTK+ I was a web developer, writing IT applications using Python and Django, and building features for content portals running Plone (argh). Even though I was an Epiphany user ever since it was forked off Galeon, I still had to use Firefox for my work, because I couldn’t really live without Firebug.

It should come as no surprise, then, that one of my first patches to WebKitGTK+ was actually making the awesome Web Inspector work in our port. After the initial support, though, not a lot has been done to further improve it, partly because it was already good enough for many uses, partly because I somehow started doing non-web development again ;) .

These last weeks, through my R&D efforts in Collabora, I have been able to push Web Inspector features and integration a bit further. A simple change that boosts the Inspector’s usability quite a bit is having the nodes that are being hovered highlighted. Along with that, the ability to attach the inspector to Epiphany’s window should make it easier to use for poking the DOM.

The Web Inspector has a number of settings that control its behaviour. Since, for instance, enabling JavaScript debugging may slow down JavaScript performance, the inspector has it disabled by default, and provides a button to enable it. It also provides an option for always enabling that feature, but that does not work right now, because we are not saving/restoring the relevant settings. A solution to that is in the works using the GSettings infrastructure that was recently merged into GLib.

Here’s a simple screencast, showing these improvements in action (click the video to check it out in full size):

WebKit2 and WebKitGTK+

So you’ve seen people talking about WebKit2, perhaps have seen someone claiming it “drops support for Linux”, and you’ve been wondering what the hell that means for WebKitGTK+. Well, welcome to the preemptive Q&A section with WebKitGTK+ maintainers =D. Let’s first explore some history so we can better understand what exactly is going on.

What exactly is WebKit2?

Currently, when we say “WebKit” we really mean one of the ports that are built on top of WebCore using the WebKit layer. WebCore is the part that does all of the hard Web-related work, and WebKit is an API layer that exposes WebCore functionality in a coherent way, so that the platform-specific ports can expose a public API layer for their applications to use, which is usually also called “WebKit”. This WebKit layer was designed by Apple to build the Mac and Windows ports it maintains, and was later released as Free Software so that other ports, such as the GTK+, Qt, and EFL ports, could be built on top of it, instead of having to do all the heavy lifting from WebCore directly.

Current WebKit model

WebKit2 is nothing more than the second version of that interface, with a whole lot of changes on what you can expect from it, and on how it interacts with WebCore, and the platform-specific API and UI. First of all, the first WebKit was not API stable, and that interface was usually not made public by the various ports – they only exposed their platform-specific APIs. WebKit2 is being designed to provide a stable, cross-platform, C-based, non-blocking public API. This is huge. It will allow cross-platform code to be written without having to consider language, and port differences for basic functionality.

The second big change is that the API will be made fully non-blocking. Currently most things you do are asynchronous already, but some of them may be completed in a synchronous way (like loading a string into WebKit instead of a URI). This is important for responsiveness, and is also a prerequisite for what comes next: process splitting.

WebKit2 will bring into WebKit proper the concept of splitting the UI process from the Web process, similar to what Chromium has. It is also much more awesome than what Chrome has for a large number of reasons, including, but not limited to:

  1. It’s being contributed directly to the WebKit project, in a cross-platform way that lets ports such as WebKitGTK+ take advantage of it, instead of being shipped directly into Safari, like Google does with Chrome;
  2. The process separation goes below the API layer, meaning that all complexity involved in managing the process separation is handled by the library, and hopefully none of it leaks to the application using it; that means that applications like Devhelp and Yelp will be able to take advantage of this without having to make their lives more complicated.

There’s a much better diagram in WebKit2’s wiki page, but here goes a simplified version that demonstrates what I’m talking about:

WebKit2 model

What is WebKit2 not?

WebKit2 is NOT a rewrite of the whole WebKit stack. WebCore will continue mostly unchanged, and all ports currently building on top of it will keep working. It is also not a fork – the code lives in the same tree as the current version of WebKit, which will allow us to progressively move towards using this new, improved layer. WebKit2 is not Apple-only, and it is not dropping Linux support. Initial builds of the code that is being landed will likely show up building on Linux in the near future (especially because we porters are already eager to play with it).

What happens to WebKitGTK+?

In the near future, nothing special. We will continue working towards making it feature-complete, more stable, faster, and rocking on it as always. We will, though, start working out how we can best take advantage of WebKit2 in order to provide an even more awesome library for the G world. What this means is you can expect us to have a library that will provide a nice GTK+ widget, just like we have today, with a GObject-based API, like we have today, but that is built on top of this new WebKit2 infrastructure, taking advantage of the process-splitting, and the bigger focus on not blocking the UI thread. This should give us a platform that is more stable, and faster and more responsive than what we already have today.

The API is bound to change, of course, but the WebKit2 version of WebKitGTK+ will be a separate, parallel-installable library, and we will keep supporting the WebKit1 version while we work on making the new one at least as good as the current one. This is long term we’re talking here. We’ll likely see WebKitGTK+ 1.4, and 1.6 come to life before we are satisfied enough with WebKitGTK+2.

We hope this clears some of the doubts up, and lightens your hearts!

The WebKitGTK+ maintainers.

Designing from the bottom up

Have you ever seen, in a support channel, a conversation with a novice who has just discovered the power of UNIX that goes like this?

<novice> How can I process the output of a command, so that any number of spaces gets turned into a newline?
<seasoned> What are you trying to do?
<novice> I want to list the contents of a directory, but I want one per line.
<seasoned> ls -1

I have seen this numerous times, even as one of the actors. At times I was the novice, and many times in #debian-br I was the seasoned person trying to get the novice to focus on the problem they were trying to solve, not on the solution they thought was right.

While reading Máirín Duffy’s awesome paper about contributing to Free Software as a designer I couldn’t help but have that image brought back to my memory again and again. Especially when I read this part:

This means the language and even the approach FLOSS projects take to solving problems tend to be focused on implementation and technology rather than starting with a real-life user problem to solve and determining appropriate implementation afterwards.

That does sound like us, and it does sound like many of the solutions we come up with. While I was reading her paper, there was a reference I got very interested in checking. It’s a PDF with no links in it, so I only had the number of the reference. What I had to do was scroll to the end of the paper, find the reference, then somehow come back to the place I was at.

My most immediate thought was ‘you know, maybe evince should have tabs’. Why? Because I could open a new tab, go to the place the reference was at, and to ‘go back’ I just needed to close the new tab. Other options require much more effort – remembering the page I was at, or maybe the scroll offset more or less, and scan for the part of the text I was at. But those are not the only options! I could have the application set a marker on where I am, and have an easy command to go back to that marker, for instance, or evince could provide a way of ‘looking ahead’ without throwing away the current state at all. I’m pretty sure if I look around enough I will find tools that solve this problem in a fairly good way.

Now, I think that is exactly how we ended up with tabs in so many places they do not make sense in, and with so many ad-hoc solutions that solve our problems in half-assed ways. Even in browsers, we tend to use tabs as ad-hoc solutions to real problems we have no real solution to handle yet, such as “I want to check this other thing out real quick, but I do not want to lose any state of this page”, or “I want to check this out, but not right now, so let me open it, and then I’ll come back to it”, or maybe even “I want to look at this now, but since it is going to take a while to load, I might as well let it load in the background, and when I finish reading this I can go look at it”. These are the real problems we have, and I think we need better designs that solve them for real, instead of just patching them with the ad-hoc solution that tabs are.

The other extreme of the spectrum is, of course, not doing something, or even anything for lack of the perfect solution. Using ‘this is not a real solution’ as an excuse to not implement something that could serve as a temporary solution to a problem may cause more frustration than having to deal with the ad-hoc solution that is tested, and being applied to other applications for some time. After all, in many cases the ad-hoc solution can be later replaced with a proper one.

I guess this is another instance of the very difficult problem of balancing different realities: proper design is not always available to start something up, especially if the application is backed by individuals and not by a company or a bigger project that could bring in designers to work on it from the start. In this case having something up and running is usually a very important first step in a free software project – usually required to get enough interest to make it worth designing for.

WebKitGTK+ 1.1.90 is out!

We’re coming close to the GNOME 2.30 release date, and we are getting ready to branch a stable release off of WebKit’s svn trunk in preparation for that. The idea of the stable branch is to maintain and improve stability, with no additional features going in. Speaking of features, though, if you’ve been paying attention you will have noticed WebKitGTK+ has come a long way.

We came from not having basic features such as download support or opening links in new tabs, a more-or-less working HTML5 media implementation, and very few or missing-in-action developers, to a thriving project that gets more and more attention and contributors every day, with advanced features available, and rocking HTML5 media support that leaves little to be desired. It’s been just over one year since we started rolling mostly bi-weekly releases, each adding more awesome features.

There are still many issues, and we are not always equipped as a team to handle all the specifics of the engine ourselves, but I am really happy with the progress we’ve made, and really thankful for the support my employer Collabora has given all the way for this to happen, including the early work on plugins, and many other things before my time as a contributor. When I switched to using Epiphany with the WebKit backend as my default browser back in January 2009, that meant having to deal with a whole lot of misbehaviour, and work around a lot of painful brokenness. These days I enjoy a snappy, functional browser that makes me happy.

If you haven’t done so yet, go download and test the newest Epiphany with the latest WebKitGTK+, and help us make GNOME 2.30’s web browser rock even more!

WebKitGTK+, and the Page Cache

So, one of the things I get to do during work hours for Collabora is to contribute code and do maintenance tasks for WebKitGTK+. I have been doing so since early last year, working on all kinds of things, from improving the network backend to handle the real-world web, to fixing scrolling problems, while reviewing patches from the many awesome developers who have been joining us (more on that later =D).

One of the big features I have worked on this past month or so, along with Xan Lopez, is the Page Cache. The page cache is a feature of web browsers that makes going back and forward between pages in the same view very fast. It’s better explained in this post, but to summarize, the idea is that instead of destroying all the work you have done since downloading the resources, and having to reparse/rebuild the structures the view uses to display the page from the cached resources, you hit pause on the page, and store the whole thing as is, and when coming back to it, you just hit play. You can see in the video two instances of Epiphany, one with the page cache enabled, one with it disabled. Easy to see which one has it enabled. Thanks to KiBi for the suggestion regarding a page that shows this easily =D.

We initially thought we had this feature enabled, since our initialization functions (which have existed since before the current maintainers were involved) did set up the number of desired pages in the cache, but during the hackfest we held in December we found out we had been fooled all this time. Enabling the page cache did make going back faster, but it also made lots of things become unstable and crash.

Since then, we have been working on figuring out all the problems, and fixing them, using help from adventurous users of in-development software ;D. I believe we’re now at a point in which I can happily declare the GTK+ port has a working page cache in trunk! If you’re interested in the nasty details, bear with me!

Let me go back in time a bit, and show you what problems we had. First, some background: the GTK+ port deviates a lot from the other ports when it comes to scrolling. This is because, when designing this part of the port, Holger Freyther had a very nice idea in mind: that the WebView should be a first-class citizen GTK+ scrollable widget. Meaning it would use GTK+’s adjustments for scrolling, and be able to interact with any parent scrolling widget, be it a GtkScrolledWindow, or a MokoFingerScroll.

We cannot just throw away all the rest of the scrolling code in WebCore, though, that deals with all the details related to interacting with the DOM, and JavaScript code. This means our WebView contains adjustments that need to be set, and unset on our port’s version of WebKit’s own representation of the view, called the FrameView, to interact with it, and to get updates on the bounds of content, and such. For every load, in the non-page-cache case, a new FrameView is created, the previous one is destroyed – this means we need to set the adjustments on every load.

The problem starts when you have the page cache enabled, because the code path used to do what is called “commit” the load of a cached page (that is, start replacing the content that is currently being displayed by the one that should now be displayed) is completely different, and we were not setting the adjustments on this new view, so we started with that.

But all was not well. We were still having weird behaviour with scrollbars disappearing, and becoming the wrong size, and worse, crashes when “back” was hit. We then started investigating in more detail how it is that the page cache does its magic, to try and figure out the source of all evil.

It turns out that when you leave a page that can be cached, the existing FrameView is no longer destroyed – it is stored as is in a CachedFrame to be restored if you go back, and a new one is created for the new page. This was having the undesired effect of having the adjustment be set in more than one FrameView at once, causing all kinds of (predictable, after we knew for real what was going on) unwanted effects. Thus, we reworked the code to make sure the adjustments are only ever set in one FrameView at once, making sure they are unset when the FrameView is being frozen, and reset when it’s being restored from the page cache.

Last, but not least, it was discovered that going back to a page that contained resources with data: URIs (such as Google results pages which contain a small number of image hits) also caused a crash. This was because our network backend was not storing the data: URI in the ResourceResponse objects it fed into WebCore. The page cache relies on those responses to recreate the requests it uses to artificially replay the load when restoring the page from the page cache, so we fixed that as well.

What can be taken from all this? Building browsers is a lot of hard work! I can’t think how we could deal with this level of complexity without the awesome testing suite of WebKit. The good news is all of those issues I talked about in this post are now covered by the automated tests that run as part of the normal buildbot cycle in our bots, so we’re covered for the future, at least for these specific problems =D.