Spanking Browser for Performance: 100× Speed Improvement

19.07.2024 – Pavel Klavík, Kamila Klavíková

OrgPad’s rendering has changed dramatically over the last 5 years, since we started developing it in December 2018. Originally, we used React and the React Motion library for all animations. Nowadays, we run our own rendering engine, which solves the differential equations of spring motion. It prepares every frame and renders it by a combination of direct DOM mutations and 2D canvas rendering. The resulting speedup was about 100×, and we plan to improve rendering further using WebGL.

#development, #OrgPad, #animations, #rendering, #tech

Humble Beginnings With React

We started coding OrgPad in December 2018. The code is written entirely in Clojure and ClojureScript, which really boosted our programming capabilities in a small team of two people. The entire code base nowadays consists of about 120k lines of code, which would be comparable to 500k-1M lines if it were written in JS or something similar. The code base is also very stable, as evidenced by the following graph generated by Git of Theseus, depicting code changes over six years in our Git repository. OrgPad was the first web application we ever created, and we have learned a great deal about browsers and performance along the way.

OrgPad code is heavily focused on the front end, with about 80k lines of ClojureScript. We use React together with the amazing ClojureScript wrappers Reagent and re-frame. Building the user interface in this way felt like magic. There is one centralized state, describing everything about the displayed document and what is currently visible in the application. On top of it, we define functions called views which describe how various pieces are put on the screen. When the state is updated by functions called events, React + Reagent + re-frame automatically detect which views have to be rerendered. This let us prototype OrgPad very quickly. For example, we created the following version after only a few days.

First screenshot of OrgPad

Benefits of Spring Animations

As we discussed in a previous blog post, animations are very important for OrgPad. We were fully sold on the idea of physics-based spring animation when we watched this talk by React Motion’s creator. These animations offer several amazing benefits:

Natural movement: Spring animations are based on physics and we see similar movements every day.
Ease of setup: Spring animations are really easy to set up. Instead of dealing with arbitrary animation curves and timings, we just set two physics parameters: stiffness (how much the spring pushes in the beginning) and damping (how much the spring slows down when reaching the target).
Interruptible and composable: Good animations are interruptible and composable. While an animation goes from a state A to another state B, it might be canceled back to A, or redirected to another state C. How should the animation continue? When animations are described by arbitrary curves, these problems are tricky to solve. A great animation system can handle these situations, and springs solve them for free. When the animation target changes, the spring keeps its current velocity, still heading towards the old target, while its acceleration now points towards the new target and gradually turns the velocity around.

The great article by Josh Comeau describes spring animations in more detail, including an interactive example at the end. In December 2019, we implemented OrgPad animations using React Motion. This made documents and their changes much more understandable.

The existing approach to animations had a big drawback. Performance is critical for animations: smooth animations look so much better and feel more natural. But updating React + Reagent components on each animation frame has a lot of overhead. React Motion is fine for simpler use cases, for example when one wants to animate a website menu. OrgPad relies heavily on animations since nodes and links move when a document updates, nodes change sizes while opening/closing, and so on. We had to make animations more performant.

Solving Differential Equations of Spring Motion

It all started in summer 2020 when we discovered a blog post by Thai Pangsakulyanont describing how to create a spring animation using keyframe animation in CSS. It nicely describes the physics of spring motion and solves the resulting differential equation using Wolfram Alpha. For comparison, React Motion and most other systems simulate the spring motion, repeatedly advancing the time by a small amount and adjusting the spring position accordingly. Our advantage was that Pavel Klavík used to teach math for about 10 years at Charles University in Prague, including linear algebra and complex analysis. It was quickly apparent to him that we could solve these differential equations in general and immediately know the entire trajectory of a spring.
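For contrast, the step-by-step simulation mentioned above can be sketched roughly as follows. This is only a minimal TypeScript illustration, not React Motion's actual code: the names SpringState and stepSpring are hypothetical, unit mass is assumed, and the stiffness and damping values are picked purely for the example.

```typescript
// Hypothetical sketch of step-wise spring simulation (not React Motion's code).
interface SpringState {
  position: number;
  velocity: number;
}

// Advance the spring by a small time step dt (in seconds) towards `target`.
// Assumes unit mass: acceleration = -stiffness * (x - target) - damping * v.
function stepSpring(
  state: SpringState,
  target: number,
  stiffness: number,
  damping: number,
  dt: number
): SpringState {
  const acceleration =
    -stiffness * (state.position - target) - damping * state.velocity;
  // Semi-implicit Euler: update the velocity first, then move the position.
  const velocity = state.velocity + acceleration * dt;
  const position = state.position + velocity * dt;
  return { position, velocity };
}

// Simulate one second of motion from 0 towards 100 in 1000 small steps.
let spring: SpringState = { position: 0, velocity: 0 };
for (let i = 0; i < 1000; i++) {
  spring = stepSpring(spring, 100, 170, 26, 0.001);
}
```

Note that the position at a given time is only known after all the earlier steps have been computed, which is the limitation a closed-form solution avoids.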

We started working on it in April 2021 and were able to quickly analyze spring motions in OrgPad. Solving the spring trajectory works in two steps:

1. For the given stiffness and damping parameters of the spring, OrgPad computes the two eigenvalues of the system. Depending on these eigenvalues, the spring behaves in one of three different ways:
- Overdamped motion when eigenvalues are real and distinct.
- Critically damped motion when eigenvalues are real and the same.
- Underdamped motion when eigenvalues are complex conjugates.
Since spring parameters are fixed, the eigenvalues need to be computed only once, at the beginning, for each spring type.
2. To compute the spring trajectory, we start from the spring’s initial position and velocity. According to the spring type, the trajectory is a certain function with unknown coefficients A and B. We just need to solve a 2×2 system of linear equations to find these coefficients. We can also quickly estimate how long it will take for the spring to come close enough to rest, so the animation can be stopped.
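The two steps above can be sketched in TypeScript as follows. This is an illustration under stated assumptions rather than OrgPad's actual code: unit mass is assumed, x denotes the displacement from the target, and the names makeSpring, SpringSolver and Trajectory are hypothetical. The 2×2 system for A and B is small enough to be solved by hand in each case.

```typescript
// Closed-form solution of x'' + damping * x' + stiffness * x = 0,
// where x is the displacement from the target (unit mass assumed).
type Trajectory = (t: number) => number;
type SpringSolver = (x0: number, v0: number) => Trajectory;

// Step 1: classify the spring once from its eigenvalues
// (-damping +- sqrt(damping^2 - 4 * stiffness)) / 2.
// Step 2: the returned solver fits A and B to the initial displacement x0
// and initial velocity v0.
function makeSpring(stiffness: number, damping: number): SpringSolver {
  const discriminant = damping * damping - 4 * stiffness;
  const tolerance = 1e-9;

  if (discriminant > tolerance) {
    // Overdamped: x(t) = A * e^(l1 t) + B * e^(l2 t), l1 and l2 real, distinct.
    const root = Math.sqrt(discriminant);
    const l1 = (-damping + root) / 2;
    const l2 = (-damping - root) / 2;
    return (x0: number, v0: number) => {
      // Solve A + B = x0 and l1 * A + l2 * B = v0.
      const b = (v0 - l1 * x0) / (l2 - l1);
      const a = x0 - b;
      return (t: number) => a * Math.exp(l1 * t) + b * Math.exp(l2 * t);
    };
  }

  if (discriminant < -tolerance) {
    // Underdamped: eigenvalues alpha +- i * omega,
    // x(t) = e^(alpha t) * (A * cos(omega t) + B * sin(omega t)).
    const alpha = -damping / 2;
    const omega = Math.sqrt(-discriminant) / 2;
    return (x0: number, v0: number) => {
      // Solve A = x0 and alpha * A + omega * B = v0.
      const a = x0;
      const b = (v0 - alpha * x0) / omega;
      return (t: number) =>
        Math.exp(alpha * t) *
        (a * Math.cos(omega * t) + b * Math.sin(omega * t));
    };
  }

  // Critically damped: one repeated eigenvalue l, x(t) = (A + B * t) * e^(l t).
  const l = -damping / 2;
  return (x0: number, v0: number) => {
    // Solve A = x0 and l * A + B = v0.
    const a = x0;
    const b = v0 - l * x0;
    return (t: number) => (a + b * t) * Math.exp(l * t);
  };
}
```

With this shape, the eigenvalue work happens once per spring type in makeSpring, and each retargeting only calls the returned solver with the current displacement and velocity.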

With this knowledge, it was relatively easy to turn this into working code. We just needed to use it to make OrgPad rendering faster.

New Rendering Engine

The first idea was to use solutions of spring motions to generate CSS keyframe animations, send them to the browser and let it do all the drawing. However, we have to supply a lot of keyframes to make the piecewise linear approximation accurate; otherwise the small inaccuracies are perceived by the eye as lags in the rendering. Furthermore, there is a significant overhead in generating this CSS, sending it to the browser and parsing it there. Especially when the animation target is changing rapidly, for example when dragging a node, one spends a lot of time generating future positions which won’t be used anyway. And animations of various elements in OrgPad have to be synchronized. For example, when a node is moving, the animation of attached links has to match precisely. For all these reasons, we generate CSS keyframe animations only for trivial animations, like opening/closing the side menu or fading panels in and out.

Our experiments led us to create a new rendering engine for OrgPad which mostly avoids React. We still use React to place the nodes there initially, and when a node is opened, its displayed content is inserted into the DOM by React. But updating animations is done outside of React. The rendering engine runs the animations by mutating DOM elements directly and drawing into an on-screen canvas. We use the window.requestAnimationFrame function, which allows one to run rendering code right before the browser is ready to put another frame on the screen. The engine knows precisely which nodes and links are currently animating; in each frame, it updates the state of their animations and renders everything accordingly.
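A frame loop of this kind might look roughly like the following TypeScript sketch. It is a simplified illustration, not OrgPad's engine: RenderLoop, FrameAnimation and FrameScheduler are hypothetical names, and the scheduler is passed in so the same code can run outside a browser.

```typescript
// Hypothetical sketch of a frame loop that only runs while something animates.
type FrameCallback = (timeMs: number) => void;
type FrameScheduler = (callback: FrameCallback) => void;

interface FrameAnimation {
  // Applies the animation state for the given time (for example by mutating
  // a DOM element or drawing into a canvas); returns true once it has finished.
  update(timeMs: number): boolean;
}

class RenderLoop {
  private active: FrameAnimation[] = [];
  private scheduled = false;
  private schedule: FrameScheduler;

  constructor(schedule: FrameScheduler) {
    this.schedule = schedule;
  }

  add(animation: FrameAnimation): void {
    this.active.push(animation);
    this.requestFrame();
  }

  private requestFrame(): void {
    if (this.scheduled) return; // at most one pending frame request
    this.scheduled = true;
    this.schedule((timeMs: number) => this.frame(timeMs));
  }

  private frame(timeMs: number): void {
    this.scheduled = false;
    // Update every running animation and drop the finished ones.
    this.active = this.active.filter((animation) => !animation.update(timeMs));
    // Ask for another frame only while something is still animating.
    if (this.active.length > 0) this.requestFrame();
  }
}
```

In a browser, the scheduler would be a thin wrapper such as (cb) => window.requestAnimationFrame(cb), so that updates run right before the next frame is put on the screen.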

Replacing SVG Links With Canvas

In the original React rendering, links were drawn using SVG elements. By pure chance, we were able to discuss OrgPad rendering performance with Pavel Panchekha. He pointed out that SVG rendering is sluggish in browsers. It was also quite buggy: links often flickered when zooming into an OrgPage. Pavel suggested that we rewrite it with 2D canvas rendering: have a single 2D canvas placed in the background, into which all links visible in the current frame are drawn as bitmaps. It took us a day to have a working prototype (an advantage of being a startup which moves fast) and the speedup was huge.

HTML Structure and Shadow DOM

HTML in an OrgPad document is organized as follows. There is a single wrapper element representing a gray canvas on which the CSS transform property is set: translation for the current camera position and scaling according to the current zoom level. Inside this element, there is one absolutely positioned element for each node. Sadly, there is (or at least used to be) a bug in browsers where changing the transform on the wrapper element recalculates unchanged styles for all nodes which significantly slows down the rendering process. We have mitigated this by placing all nodes inside a Shadow DOM which forces the browser to separate CSS calculations.

Zoom Animation

Another interesting detail is how the zoom level is animated. When one works out the math, it is not the zoom level but its reciprocal 1/zoom that must be animated by a spring. Great animations for zoom are absolutely essential; without them, interactions with OrgPad would be very confusing. Most software gets this wrong: for example, the zoom level jumps from one state to another without animation, making it unclear what just happened. Also, zooming should feel the same at different zoom levels, so each step has to multiply/divide the current zoom level by a fixed constant. Another important detail is that zooming should preserve the position under the mouse cursor, so we can efficiently move from one part of the document to another by quickly zooming out and then in somewhere else. People often tell us that they would love to see OrgPad’s amazing zoom implemented in other tools as well. Needless to say, we have put a lot of effort into it.
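The geometry behind these zoom rules can be sketched in TypeScript as follows. This is an illustration under assumptions, not OrgPad's code: it assumes a wrapper transform of the form screen = world * zoom + translation, and the names Camera, zoomAt and interpolateZoom are hypothetical. In the real engine the interpolation parameter would come from a spring rather than being supplied directly.

```typescript
// Hypothetical camera model: screen = world * zoom + (tx, ty).
interface Camera {
  zoom: number; // scale applied to the wrapper element
  tx: number; // horizontal translation in screen pixels
  ty: number; // vertical translation in screen pixels
}

// Multiply the zoom level by a fixed factor while keeping the world point
// under the cursor at the same screen position.
function zoomAt(
  camera: Camera,
  cursorX: number,
  cursorY: number,
  factor: number
): Camera {
  return {
    zoom: camera.zoom * factor,
    tx: cursorX - (cursorX - camera.tx) * factor,
    ty: cursorY - (cursorY - camera.ty) * factor,
  };
}

// Move between two zoom levels by interpolating the reciprocal 1/zoom.
// `progress` is 0 at the start and 1 at the target.
function interpolateZoom(from: number, to: number, progress: number): number {
  const reciprocal = (1 - progress) / from + progress / to;
  return 1 / reciprocal;
}
```

For example, zooming in by a factor of 1.5 with the cursor at (110, 220) changes the translation so that the world point previously under the cursor is still drawn at (110, 220).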

Snapping Links

There were a lot of details we wanted to get right. For example, when creating a link, the link can be dragged anywhere onto the gray canvas, which creates a new attached node, or it can be dragged onto an existing node to connect to it. An important visual detail is that when the link switches to a nearby node, it jumps to the center of this node. This gives the user an immediate visual indication that the link snapped to a node. Also, when a link is dragged to the edge of the screen, the view slowly scrolls so the link can be connected to a node which is currently off-screen.

Massive Performance Boost

Overall, this rewrite of OrgPad rendering was a huge success. We worked on this change for over 3 months, and by the end the old version of OrgPad felt so slow that it was almost impossible to use anymore. We do not have exact numbers, but rendering was about 10× faster. On larger documents, instead of rendering a few frames per second, the rendering was mostly smooth (say 30 fps). We were quite happy with the result for a while. Unfortunately, when you improve the rendering performance, the occasional lag spikes become even more noticeable and annoying. So further improvements were needed.

Bad Browser Performance for Many Rendered Layers

In summer 2022, we were investigating again how to improve rendering performance. We spent a lot of time playing with the Chrome profiler (as you always should when you want to improve performance), turning things on and off in the rendering, etc. Link rendering into canvas was very fast already, so turning links off had no effect. On the other hand, rendering many layers, one for each node, was super slow.

Browser rendering works in stages, by building a series of caches. When a website changes, certain caches are invalidated and have to be recomputed. For example, moving a layer on the screen preserves its pixels, so only the composition of layers has to be redone. On the other hand, changing the size of a layer requires style recomputation, layout, painting, etc. To reach 60 fps rendering, it is suggested to preserve as many caches as possible and to spend at most 10 ms in JS on each frame, to give the browser enough time to prepare the rendering.

This might work for a simple website but not for OrgPad. Rendering was really slow even when not much was happening, for example when one was just zooming in/out of a large document. In this case, only the transform of the single wrapper element is changed and the background canvas with all links is updated (which does not matter for performance; the behavior is the same with links turned off). All layers are preserved, cached by the browser as large bitmaps.

Rendering Only Visible Layers

So what is going on? We can check the profiler in Chrome running on our MacBook Pro M1 machine. We measured a fairly large document with simple nodes (see the video comparison at the end) where we just zoom in and out repeatedly. Since we want to render at 120 fps, each frame has to be finished within 8.3 ms. First, we spend about 2 ms in JS, updating the camera position and redrawing the background canvas with all links. Then the browser takes another 10 ms for the other rendering stages and finishes by composing layers on the GPU for another 30 ms (4 frames!). Altogether, it is capable of producing only 1 frame out of 5, running at about 24 fps. And this is a super fast machine and the nodes are just boxes with titles.

Something is seriously wrong here. A funny fact is that Safari’s rendering used to be much faster and this problem did not happen there. That changed recently, when Safari switched to a composition that is broken in a similar way to Chrome’s.

When zoomed in, only a few nodes are visible, but the performance remains poor. It turns out that the browser is still composing all these layers even when they are completely off-screen. We have quickly added some geometry computation to our rendering engine so it knows exactly which nodes and links are visible in each frame. Then we set display: none for off-screen nodes and switch it back to display: block when they appear on the screen. The speed improvement of this change was massive when zoomed in.
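The visibility check can be sketched in TypeScript as follows. This is a simplified illustration, not OrgPad's implementation: it assumes axis-aligned node rectangles in world coordinates and a camera of the form screen = world * zoom + translation, and the names Rect, visibleWorldRect and displayValues are hypothetical.

```typescript
// Hypothetical axis-aligned rectangle in world coordinates.
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// True when the two rectangles overlap.
function intersects(a: Rect, b: Rect): boolean {
  return (
    a.x < b.x + b.width &&
    b.x < a.x + a.width &&
    a.y < b.y + b.height &&
    b.y < a.y + a.height
  );
}

// The part of the world currently on screen, assuming
// screen = world * zoom + (tx, ty).
function visibleWorldRect(
  screenWidth: number,
  screenHeight: number,
  zoom: number,
  tx: number,
  ty: number
): Rect {
  return {
    x: -tx / zoom,
    y: -ty / zoom,
    width: screenWidth / zoom,
    height: screenHeight / zoom,
  };
}

// The CSS display value each node should get in the current frame.
// In a browser, each value would be assigned to element.style.display.
function displayValues(nodes: Rect[], viewport: Rect): string[] {
  return nodes.map((node) => (intersects(node, viewport) ? "block" : "none"));
}
```

A real engine would presumably write to the DOM only for nodes whose visibility actually changed between frames, to avoid needless style invalidation.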

This is absolutely ridiculous. The browser already knows all this information when it composes layers, so doing a basic pruning by checking which layers are completely off-screen should be completely automatic. No wonder browsers are in general so slow.

Enhanced Image Rendering

Since our rendering engine already knows exactly what is visible in each frame, we also worked on improving image rendering. We wanted to support inserting large images into OrgPad, as they can be easily zoomed and explored. On the other hand, rendering many large images gets very slow. Therefore, when an image is uploaded to OrgPad, we process it in the backend and scale it down to thumbnails of various sizes, from about 100 px up to 5000 px (or the source image size).

Different thumbnail sizes for an uploaded image

When a document is loaded, we immediately load all the smallest image thumbnails so they can be displayed quickly. Additional thumbnail sizes are loaded as needed. The rendering engine knows precisely which images are visible and what the current zoom level is, so it chooses the optimal thumbnail size. We also sometimes wait for animations to finish before switching to a newly loaded image since rasterizing it immediately on the GPU would cause a small lag spike. This change improved both the usability of OrgPad and the performance of its rendering.
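Choosing a thumbnail size can be sketched in TypeScript as follows. This is a guess at one reasonable rule, not OrgPad's actual selection logic: it picks the smallest available thumbnail that is at least as wide as the image appears on screen. The function name chooseThumbnail and the intermediate sizes in the usage note are hypothetical.

```typescript
// Pick the smallest thumbnail width that still covers the on-screen width
// of the image; fall back to the largest one when none is big enough.
// Returns 0 when no thumbnails are available.
function chooseThumbnail(
  availableWidths: number[],
  displayedWidth: number, // image width in document units
  zoom: number // current zoom level
): number {
  const neededWidth = displayedWidth * zoom;
  const sorted = [...availableWidths].sort((a, b) => a - b);
  let chosen = 0;
  for (const width of sorted) {
    chosen = width;
    if (width >= neededWidth) break;
  }
  return chosen;
}
```

With hypothetical sizes [100, 400, 1600, 5000], an image 300 units wide would use the 400 px thumbnail at zoom 1 and the 100 px thumbnail at zoom 0.2.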

Avoiding HTML Rendering

Given the slow performance of composing many HTML layers, a natural suggestion is to avoid using these layers. Instead of HTML, we can draw nodes directly into the background canvas, similarly to links. We started investigating this in the fall of 2022.

Moving away from HTML rendering entirely is difficult. Ever since we started with OrgPad, we wanted to have rich content inside nodes which browsers offer very easily. Each node can contain formatted text, images, videos, even other embedded websites and other OrgPad documents. This makes OrgPad really powerful but rendering is tricky.

Other applications usually start with simple rendering and gradually add more formatting and content functionality. We decided to mix these two approaches. In a typical OrgPad document, most nodes are very simple. For example, they are just titles without any content at all. Or they are just images placed directly onto the gray canvas. So we can create our own faster rendering for simple nodes directly into the canvas, while using the existing HTML rendering for complex nodes. Over time, we plan to expand the number of node types with fast canvas rendering. However, we can’t entirely avoid mixing with HTML rendering because we want to support embedded websites within content. These websites are sandboxed by the browser for security reasons.

Fast Rendering of Nodes With Just a Title

We started by rendering nodes with only a title, either with no content or when the node is closed. When the state of the node changes and during certain animations, we switch to HTML rendering. These nodes consist of a colored title, colored background and borders created by combining several box-shadows. The elevated borders are important for OrgPad since they indicate to the user whether there is extra content hidden inside the node.

As we were experimenting with the performance of 2D canvas rendering, we discovered the following:

Rendering colored backgrounds as rectangles with border-radius is super fast.
Rendering shadows is a little bit slower but still manageable.
Rendering text with ctx.fillText is really slow, especially while changing the zoom level, which results in changing the font-size, likely breaking some browser font caches.

So we cannot render all nodes from scratch in each frame.

Instead, we create a canvas cache for each node and only draw these cached bitmaps in each frame. These caches might take a few hundred MB of memory for a larger document. But surprisingly, we actually save memory because the browser also creates similar caches for HTML rendering, which are even larger. To save some memory, the maximum resolution of a canvas cache is limited. When we zoom in a lot, beyond this resolution, we draw the visible nodes from scratch in each frame. The advantage here is that only a few nodes are visible at that point, so we can afford this.

As for shadows, there are four types of shadows in the browser: CSS box-shadows, CSS drop-shadow filter, SVG shadow filters and 2D canvas shadows. There are subtle differences in what parameters they take (box-shadow has spread, for example) and what they mean (radius vs diameter). Also, box-shadow allows multiple shadows to be applied at once while multiple shadows in a drop-shadow filter are rendered sequentially: the first shadow is blurred and added, so it affects the second one, and so on. Even the standard is very vague, allowing the browser vendors to change the rendering a little bit.

It required a lot of magic constants to make 2D canvas shadows look close enough to CSS box-shadows. Curiously, it is not possible to draw just the shadow of a path with 2D canvas: the path itself is always drawn as well, which makes it difficult to draw multiple shadows. This can be solved by a neat trick where the path is offset outside the canvas so only its shadow is rendered inside.
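The offset trick can be sketched in TypeScript as follows. This is a minimal illustration, not OrgPad's code: it uses a plain rectangle instead of a rounded path, assumes the context has an identity transform, and declares only the small subset of the 2D context that it touches (ShadowCanvas, ShadowParams and drawShadowOnly are hypothetical names). It relies on the fact that shadowOffsetX and shadowOffsetY are given in canvas pixels, so a large offset can bring the shadow back into view.

```typescript
// Minimal subset of CanvasRenderingContext2D used by this sketch, so that
// the code does not depend on DOM type definitions.
interface ShadowCanvas {
  fillStyle: unknown;
  shadowColor: string;
  shadowBlur: number;
  shadowOffsetX: number;
  shadowOffsetY: number;
  save(): void;
  restore(): void;
  fillRect(x: number, y: number, width: number, height: number): void;
}

interface ShadowParams {
  color: string;
  blur: number;
  offsetX: number;
  offsetY: number;
}

// Draw only the shadow of a rectangle: the rectangle itself is shifted to
// the left of the canvas, and the shadow offset is increased by the same
// amount so the shadow lands where it is wanted.
function drawShadowOnly(
  ctx: ShadowCanvas,
  x: number,
  y: number,
  width: number,
  height: number,
  shadow: ShadowParams
): void {
  // Far enough that the right edge of the shifted rectangle is off-canvas.
  const shift = Math.max(0, x + width) + 1;
  ctx.save();
  ctx.fillStyle = "#000"; // the shape must be opaque to cast a full shadow
  ctx.shadowColor = shadow.color;
  ctx.shadowBlur = shadow.blur;
  ctx.shadowOffsetX = shadow.offsetX + shift;
  ctx.shadowOffsetY = shadow.offsetY;
  ctx.fillRect(x - shift, y, width, height);
  ctx.restore();
}
```

Calling this function several times with different parameters layers multiple shadows without the shape being painted over them; the shape itself can then be drawn once on top.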

Comparison of HTML and canvas rendering

Fast Text Rendering With Bitmap Fonts

To render text, we first need to know the size of a node and the layout of its title. One of the essential subsystems we added to OrgPad very early on is called autoresize. Its goal is to compute the nicest looking size of a node, depending on its content. To do this, we render the node content off-screen while testing multiple widths. For each width, the browser computes the corresponding height. Autoresize searches this space of potential sizes and chooses the one which looks the best. Over the years, we have made several improvements to the system. When rendering titles, we wrap each letter (Unicode grapheme) in a span. After rendering, we can measure the position of each letter, learn the positions of line breaks, etc.

Next, we create our own bitmap fonts stored in memory canvases called letter atlases. When a new letter is encountered for the first time, we use slow canvas text rendering to add it to the letter atlas. Then we copy letters from a letter atlas to the node canvas cache or to the screen canvas. We have to do extra work because normal letters have to be copied in the title color of the node, but colored emojis have to be preserved. So for each letter we also check if it is a colored emoji by rendering it to a separate canvas.
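The bookkeeping side of a letter atlas can be sketched in TypeScript as follows. This is a strong simplification and not OrgPad's implementation: it assumes every letter fits into a fixed-size square cell, ignores colors and emoji detection, and only tracks where each letter lives. The names LetterAtlas and AtlasSlot are hypothetical.

```typescript
// Position of a letter's cell inside the atlas canvas, in pixels.
interface AtlasSlot {
  x: number;
  y: number;
}

// Hypothetical atlas with fixed-size cells laid out in rows.
class LetterAtlas {
  private slots = new Map<string, AtlasSlot>();
  private cellSize: number;
  private columns: number;

  constructor(cellSize: number, columns: number) {
    this.cellSize = cellSize;
    this.columns = columns;
  }

  // Returns the cell for a letter. `isNew` is true the first time a letter
  // is seen, which is when the caller would render it once into the atlas
  // canvas with the slow ctx.fillText call.
  lookup(letter: string): { slot: AtlasSlot; isNew: boolean } {
    const existing = this.slots.get(letter);
    if (existing !== undefined) return { slot: existing, isNew: false };
    const index = this.slots.size;
    const slot: AtlasSlot = {
      x: (index % this.columns) * this.cellSize,
      y: Math.floor(index / this.columns) * this.cellSize,
    };
    this.slots.set(letter, slot);
    return { slot, isNew: true };
  }
}
```

Once a letter is in the atlas, drawing it is a bitmap copy, for example with the nine-argument form of ctx.drawImage taking the atlas canvas and the slot rectangle as the source.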

Letter atlas for two titles

Massive Performance Boost, Again

After implementing these canvas caches and rendering nodes with them, we got another 10× improvement in rendering speed. Below we show profiler output while zooming in the same OrgPage as above. Each frame can be produced in about 5 ms, which gives the desired 120 fps performance. We only occasionally drop a frame when the browser is busy with something else. You can also check a video comparing HTML rendering and rendering using canvas caches.

We are also currently caching images placed directly on the gray canvas. We plan to expand the number of cached node types in the future, but this already helps a lot.

This approach is certainly not without issues. There are currently subtle rendering differences when switching between canvas and HTML rendering. This could be solved by at least rendering node titles and backgrounds completely using canvas, including all animations. Another problem is that programming anything inside a browser is super flaky and one cannot really depend on it. On rare occasions, the browser decides that your page is consuming too much memory and erases all your canvases. They do not even become undefined: stuff can be drawn into them again, just the old content is gone. We plan to add some detection code which will check for this and restore letter atlases and caches. This happens more frequently on iOS devices, where memory limitations are very strict.

Future Rendering With WebGL

Whenever we want to improve performance, the solution is often to go one layer lower and replace some existing code with code created just for our own use case. So instead of depending on React to sync changes, we have written our own rendering engine which orchestrates everything in each frame. And instead of rendering nodes with HTML, we have written our own rendering into canvases.

The next reasonable step is to go further and implement our own low-level rendering directly on the GPU. We believe we can achieve a 100× performance improvement in this way, by drawing everything directly without any caches (aside from bitmap fonts, which are also used in computer games). Also, the memory requirements of OrgPad should become much smaller. We will see how this approach works when we have more time to explore it further.