Verso: Writing its own compositor part 1

Wu Yu Wei published on July 20, 2024

In the previous post, I mentioned that we should probably research writing our own compositor to get multi-window support. The compositor is the component in Servo that composites graphical contexts and decides which browsing contexts are displayed on the window's surface. Servo's compositor isn't exposed to users directly, even though it runs in the same thread as the embedder and is a public module. This makes it difficult to get the compositor to do the work we want. And because we send EmbedderEvent through a channel for the compositor to handle, there's always a delay before the compositor handles its messages, which makes it difficult to predict when frames will be rendered. In the hope of understanding more about how to give the compositor multiple rendering contexts, I decided to write at least one as an experiment. Verso#86 is the first step of this exploration.

What has changed

However, this PR hasn't actually written the compositor yet. It just extends the embedder's work to cover what the Servo instance does. So now Verso itself handles all channel and thread creation, and it can call the compositor's methods directly. To be honest, this is not my first choice. I actually prefer using channels to send EmbedderEvent and receive EmbedderMsg. They are very clean enumerations that help users understand what kind of tasks the embedder should handle. But as I said, I want more control so I can understand the whole behavior of the compositor. So Verso now calls the compositor directly and sends ConstellationMsg itself to the constellation thread. Surprisingly, I don't think much has changed. We are still using channels, but now we communicate directly with the Constellation and the Compositor.
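To illustrate the difference between queuing an event and calling the compositor directly, here is a minimal sketch. The Event enum and the Compositor struct are hypothetical stand-ins that I made up for this example; they are not Servo's or Verso's actual types. The only point is that a queued event takes effect when the receiver gets around to draining it, while a direct call takes effect as soon as the method returns.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical stand-in for an embedder event enum (not Servo's `EmbedderEvent`).
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    Resize(u32, u32),
}

/// Hypothetical toy compositor (not Servo's compositor type).
pub struct Compositor {
    pub size: (u32, u32),
    events: Receiver<Event>,
}

impl Compositor {
    /// Create the compositor together with the sender used to queue events for it.
    pub fn new() -> (Self, Sender<Event>) {
        let (tx, rx) = channel();
        (Compositor { size: (0, 0), events: rx }, tx)
    }

    /// Direct call: the new size is in place as soon as this returns.
    pub fn resize(&mut self, width: u32, height: u32) {
        self.size = (width, height);
    }

    /// Channel path: nothing happens until the queue is drained here.
    /// Returns the number of events that were handled.
    pub fn handle_pending(&mut self) -> usize {
        let mut handled = 0;
        while let Ok(event) = self.events.try_recv() {
            match event {
                Event::Resize(width, height) => self.resize(width, height),
            }
            handled += 1;
        }
        handled
    }
}
```

With the channel path, a Resize that has been sent is not visible in size until handle_pending runs, which is the delay described above. With the direct path, the caller decides exactly when the change happens.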

Thread creation

The most obvious benefit is that I now have a clear overview of which threads the embedder has to create. The constellation thread requires channel senders for several of these threads when it is created. We could potentially spawn these threads lazily, or even disable them based on preferences. This could greatly reduce memory usage at startup. Here are the threads that the embedder is responsible for spawning, along with the improvements we could achieve in the future:

Constellation: This is the main station of all Servo components.
Memory and Time Profiler: There's no preference option to toggle them. But since there's a constellation message to disable the profiler, maybe they can be disabled by default.
DevTools: This can be toggled by Opts::devtools_server_enabled.
Webrender: This is Servo's renderer and has its own thread pool. Perhaps we can set the size of the thread pool.
WebGL: It must be created for now because WebXR needs it to create a registry. Perhaps there's a way to disable it by default and only spawn it lazily.
Bluetooth: It doesn't have a preference option either. We should look for a way to spawn it lazily.
Resource threads: This is a thread pool whose size is based on the number of CPU cores. This is the one that really needs to be improved in the net crate because it can be spawned multiple times. There's also a Tokio runtime in net that spawns its own thread pool. I think this is the main reason we saw hundreds of MB of memory usage upon creating the browser. If you have 32 cores, that means 64 threads are created already!
Font Cache
Canvas: It should be spawned lazily as well.
Web Driver: This can be toggled by Opts::webdriver_port.
JS engine: SpiderMonkey also has its own thread pool.

As you can see, that's quite a few threads, and together they already consume megabytes of memory. One improvement we should keep pushing for is spawning only the necessary threads at startup.
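Lazy spawning driven by a preference could look roughly like the sketch below. LazyWorker, its enabled flag, and the String messages are all hypothetical names for illustration; this is not how Servo or Verso spawns its threads today. The idea is simply that the thread and its channel are created on the first message, and never created at all if the preference turns the worker off.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Hypothetical handle to a background worker, such as a canvas or Bluetooth thread.
pub struct LazyWorker {
    /// Assumed preference flag: when false, the thread is never spawned.
    enabled: bool,
    /// `None` until the first message is sent.
    sender: Option<Sender<String>>,
}

impl LazyWorker {
    pub fn new(enabled: bool) -> Self {
        LazyWorker { enabled, sender: None }
    }

    /// Whether the worker thread has been spawned yet.
    pub fn is_spawned(&self) -> bool {
        self.sender.is_some()
    }

    /// Send a message, spawning the worker thread on first use.
    /// Returns false if the worker is disabled or the message could not be delivered.
    pub fn send(&mut self, msg: &str) -> bool {
        if !self.enabled {
            return false;
        }
        let sender = self.sender.get_or_insert_with(|| {
            let (tx, rx) = channel::<String>();
            thread::spawn(move || {
                // The loop ends once every sender has been dropped.
                for _msg in rx {
                    // A real worker would handle the message here.
                }
            });
            tx
        });
        sender.send(msg.to_owned()).is_ok()
    }
}
```

A worker created with LazyWorker::new(true) costs nothing until its first send, and one created with LazyWorker::new(false) never spawns a thread.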

Constellation messages

Another benefit is that we can focus on handling constellation messages only. But I would say this is a subtle benefit: we simply change from EmbedderEvent to ConstellationMsg. I think the real improvement is more about finding redundant parts that can be eliminated. For example, I found that GlFns::load_with is called multiple times during thread creation and event handling, and Verso itself even loads it again when creating a window. Each time, that's another 60 KB included in the binary. We can arrange for it to be called only once. I also found that we can send web driver messages directly to the Constellation without starting a server. Of course, we could also extend EmbedderEvent to do so. But I guess I would never have found this out if I hadn't been able to reach deeper. This lets us send some lightweight messages without spawning a web driver thread for the more trivial use cases.
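One common way to make sure an expensive load runs only once is to cache its result in a OnceLock from the Rust standard library. The sketch below uses a made-up GlTable type and a counter purely to show the pattern; it doesn't call GlFns::load_with, and it isn't the change made in Verso.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for a table of loaded GL function pointers.
pub struct GlTable {
    pub name: &'static str,
}

/// Counts how many times the loader closure actually ran (for illustration only).
static LOAD_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Holds the table after the first successful load.
static GL: OnceLock<GlTable> = OnceLock::new();

/// Return the shared table, loading it on the first call only.
pub fn gl() -> &'static GlTable {
    GL.get_or_init(|| {
        LOAD_COUNT.fetch_add(1, Ordering::SeqCst);
        // A real loader would resolve the GL function pointers here.
        GlTable { name: "gl" }
    })
}

/// Number of times the loader has run so far.
pub fn load_count() -> usize {
    LOAD_COUNT.load(Ordering::SeqCst)
}
```

Every caller that goes through gl() gets the same reference, so thread creation, event handling, and window creation could share one loaded table instead of each loading its own.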

The next step

So right now, this is more like preparation before writing our own compositor. The next step is to replicate the existing compositor in Verso, so we can at least make sure it works. Then, I want to understand which parts belong to which component, primarily the parts from Webrender. Should they all belong to the compositor? Or should they be kept in the window? Can we also just create one compositor and give it different contexts to handle? The whole compositor is around 4000 LOC, so I think it will take a while. But I'm confident we can work through these questions one by one. I'll also publish more posts in this series in the near future.