My blog has been moved to ariya.ofilabs.com.

Friday, June 10, 2011

your webkit port is special (just like every other port)

One of the most often question I got, "Since browser Foo and browser Bar are using the same WebKit engine, why do I get different feature set?".

Let's step aside a bit. Boeing 747, a very popular airliner, uses Pratt & Whitney JT9D engine. So does Airbus A310. Do you expect both planes to have the same flight characteristics? Surely not, there are other bazillion factors which decide how that big piece of metal actually flies. In fact, you would not expect A310-certified pilot to just jump into 747 cockpit and land it.

(Aviation fans, please forgive me if the above analogy is an oversimplification).

WebKit, as a web rendering engine, is designed to use a lot of (semi)abstract interfaces. These interfaces obviously require (surprise) some implementation. Example of such interfaces are network stack, mouse + key handling, thread system, disk access, memory management, graphics pipeline, etc.

What is the popular reference to WebKit is usually Apple's own flavor of WebKit which runs on Mac OS X (the first and the original WebKit library). As you can guess, the various interfaces are implemented using different native libraries on Mac OS X, mostly centered around CoreFoundation. For example, if you specify a flat colored button with specific border radius, well WebKit knows where and how to draw that button. However, the final actual responsibility of drawing the button (as pixels on the user's monitor) falls into CoreGraphics.

With time, WebKit was "ported" into different platform, both desktop and mobile. Such flavor is often called "WebKit port". For Safari Windows, Apple themselves also ported WebKit to run on Windows, using the Windows version of its (limited implementation of) CoreFoundation library.

Beside that, there were many other "ports" as well (see the full list). Via its Chrome browser (and the Chromium sister project), Google has created and continues to maintain its Chromium port. There is also WebKitGtk which is based on Gtk+. Nokia (through Trolltech, which it acquired) maintains the Qt port of WebKit, popular as its QtWebKit module.

(This explains why any beginner scream like "Help! I can't build WebKit on platform FooBar" would likely get an instant reply "Which port are you trying to build?").

Consider QtWebKit, it's even possible (through customized QNetworkAccessManager, thanks to Qt network modularity) to hook a different network backend. This is for example what is being done for KDEWebKit module so that it becomes the Qt port of WebKit which actually uses KDE libraries to access the network.

If we come back to the rounded button example, again the real drawing is carried out in the actual graphics library used by the said WebKit port. Here is a simplified diagram that shows the mapping:

GraphicsContext is the interface. All other code inside WebKit will not "speak" directly to e.g. CoreGraphics on Mac. In the above rounded button example, it will call GraphicsContext's fillRoundedRect() function.

There are various implementation of GraphicsContext, depending on the port. For Qt, you can see how it is done in GraphicsContextQt.cpp file.

Should you know a little bit about graphics, you would realize that there are different methods and algorithms to rasterize a filled rounded rectangle. A certain approach is to convert it to a fill polygon, another one is to scanline-convert the rounded corner directly. A fully GPU-based system may prefer working with tessellated triangle strips, or even with shader. Even the antialiasing level defines the outcome, too.

In short, different graphic stacks with different algorithm may not produce the same result down to the exact pixel colors. It all depends various factors, including the complexity of the drawing itself.

Now the same concept applies to other interfaces. For example, there is no HTTP stack inside WebKit code base. All network-aware code calls specific function to get resources off the server, post some data, etc. However, the actual implementation is in system libraries. Thus, don't bother trying to find SSL code inside WebKit.

This gets us to this question, "If browser X is using WebKit, why it does not have feature Z?". You may be able to deduce the reason. Imagine a certain graphic stack which glues GraphicsContext for that platform does not implement fillRoundedRect() function, what would happen? Yes, your rounded button suddenly becomes a square button.

As a matter of fact, when someone ports WebKit to a new platform, she will need to implement all these interfaces one by one. Until it is complete, of course not everything would work 100% and most likely only the basics are there. That should feel like putting a jet engine into an airframe that can't fly yet.

"Can we have one de-facto graphics stack that powers WebKit so we always have pixel-perfect rendering expectation?" Technically yes, but practically no. In fact, while Boeing and Airbus may buy the same engine from Pratt & Whitney, they may not want to have the exact same landing gears. Everyone of us wants to be special. A certain system wants to use OpenGL ES, squeeze the best performance out of it and doesn't really care if the selling price goes up. Others want to sacrifice the speed, trim the silicon floor and make the device more affordable. More often, you just have to live with diversity.

And if you want to put aside all the differences, two WebKit ports of the same revision share tons of stuff, especially if they use the same JavaScript engine. They will parse HTML and CSS in the same way, produce the same DOM, yield the same render tree, have the same JavaScript host objects, and so on.

Thus, next time someone shouts "there is no two exact WebKit", you know the story behind it.

7 comments:

jarred said...

Nice post, I like the aviation analogy ;)

Anonymous said...

Maybe...

But still, this does not explain (to me) why Konqueror and rekonq, both with Qtwebkit renderer, behave differently on the same session.

Ariya Hidayat said...

@Anonymous: File a bug with more description. Who can guess what's wrong with your supershort analysis?

Anonymous said...

Cool post! I like it! :)

James Pearce said...

Maybe the 747 is WebKit, and the airlines (who choose how pleasant each flight will be by changing its seating configurations, menu, in-flight entertainment and trim) are the browsers.

But it's also possible I know more about sitting on planes than I do about developing & compiling browsers :-)

Sunny said...

Nice Post, Thanks for sharing your knowledge.

Gaurav M said...

Thanks for this! this always make me wonder