Operation: Smooth

Mozilla performance notes - a personal view

Tabstrip #5 - TART, Talos Stress, Smooth Scrolling

Talos stress

Talos tsvg and tscroll are about to be replaced with tsvgx and tscrollx, respectively (bug 897054). The main difference is that the “x” tests stress Firefox by iterating animations as fast as possible, AKA “ASAP mode”.

The old tests performed some animation and then reported the overall (or average per-frame) duration. However, they used intervals which did not stress Firefox at all, making the results almost meaningless with respect to SVG/scroll performance, and mostly sensitive to timing changes instead.

Stressing Firefox exposed some issues such as paint starvation (bug 880036).

Hopefully the new tests will correlate better with our actual performance. Joel Maher did (and still does) a lot of work on the Talos side of things with these new tests (a l-o-t!)

There’s a more technical explanation in this dev.platform post.

TART - Tab Animation Regression Test

After previous work which went into improving tab animations in Firefox, it’s time to put it into Talos. TART is implemented as an addon which works either standalone or from within Talos. It measures frame intervals during different animation cases, and it works equally well on mozilla-central and on the UX branch.

TART uses ASAP mode to iterate animations at an unlimited frame rate, so it can expose differences even where, under normal conditions, the animation would have been capped at 60Hz.

Joel Maher did most of the talos side of things, and currently the addon awaits review.

  • TART tests animation on 3 main cases:
    • “simple” - open/close of about:blank.
    • “icon” - open/close of a blank html page with favicon and a long title.
    • “newtab” - open with and without newtab preload.

“simple” is tested at DPI scaling of 1.0, “icon” at scaling of 1.0 and 2.0, and “newtab” at default scaling.

  • Each animation produces 3 result values:
    • “all” - Overall average interval.
    • “half” - Average interval over 50% of the designated duration - from the end of the animation backwards.
    • “error” - The difference between the designated duration and the actual one.

Typically, “half” is the value which corresponds to the stable part of the animation, once it has hopefully settled. Most tab animations have a first frame (or a few) which is longer than the rest, so the “all” value, while not meaningless, doesn’t tell the whole story.
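
As a rough illustration of the three result values, here is a minimal sketch in JavaScript. It is not TART's actual code: the function name `summarizeAnimation`, its parameters, the way the “half” window is selected, and treating “error” as an absolute difference are all assumptions made for this example.

```javascript
// Hypothetical helper, not taken from TART itself.
// intervals: recorded frame intervals in ms, in the order they occurred.
// designatedMs: the duration the animation was supposed to take, in ms.
function summarizeAnimation(intervals, designatedMs) {
  const sum = (arr) => arr.reduce((a, b) => a + b, 0);
  const actualMs = sum(intervals);

  // "half": walk backwards from the end of the animation and collect
  // frames until they cover 50% of the designated duration.
  const tail = [];
  let covered = 0;
  for (let i = intervals.length - 1; i >= 0 && covered < designatedMs / 2; i--) {
    tail.push(intervals[i]);
    covered += intervals[i];
  }

  return {
    all: actualMs / intervals.length,
    half: sum(tail) / tail.length,
    // Assumption: reported as an absolute difference.
    error: Math.abs(actualMs - designatedMs),
  };
}

// One slow first frame followed by steady 10ms frames.
const result = summarizeAnimation([50, 10, 10, 10, 10, 10], 100);
console.log(result);
```

In this made-up run the slow first frame pulls “all” up to roughly 16.7ms, while “half” stays at 10ms, which is why “half” is usually the more telling number.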

  • Some issues which had to be resolved while working on TART:
    • Inactive tabs are throttled, so to get accurate timing (intervals, end event), the test is executed from the Chrome window (rather than from the tab itself).
    • At DPI scaling of 2.0 at the talos window size, Australis can’t open a second tab without shrinking the first (so the test would measure the animation of more than one tab). Solution: keep the TART tab pinned.
    • On OS X, “ASAP mode” doesn’t work with OMTC. Solution for now: disable OMTC for TART.
    • In ASAP mode, mostly on OS X, paints can be starved (bug 880036). Solution: use a temporary pref to prevent this (bug 884955).
    • At DPI scaling of 2.0, Australis can only have 3 tabs before it starts overflowing the tabstrip (e.g. bug 891450). This made it very hard to test multi-tab shrink/expand in a comparable fashion on m-c and Australis, and eventually this test was dropped for TART v1.

You can watch TART at bug 848358. Other than a review, it also awaits bug 888899 which should enable “ASAP” (stress) mode on OS X as well.

Absolute scroll smoothness on Windows

I was asked by Taras to look into scroll smoothness on Windows. Apparently, even after vlad’s great timing improvement, the timer filter removal and making vsync work on Windows, Firefox still almost never scrolls 100% smoothly on Windows.

I did some extensive research into scroll performance of Firefox and other browsers on various platforms, and posted the results at bug 894128. You’ll also find there a bookmarklet to test scroll performance on any browser.

One interesting observation is that it’s very hard to programmatically detect whether the scroll is absolutely smooth. Frame intervals apparently don’t tell the whole story (though they are not meaningless), and neither does their standard deviation. To judge whether the scroll is 100% smooth, it has to be assessed visually.

  • Some interesting observations from this research:
    • On OS X, even with 50% perf per core (and half the cores as well) compared to a fast Windows system, Firefox scrolls much smoother than on Windows, even with OMTC off.
    • Firefox may occasionally degrade scroll for several frames for no apparent reason (the content is homogeneous). GC/CC?
    • OMTC typically doubles the throughput on OS X. On Windows, OMTC is not yet mature enough to test this.
    • Very low interval variance is not required for 100% smooth scroll (though IE and Chrome mostly manage very low variance compared to Firefox).
    • Looks like Chrome and IE use something like APZC during scroll. IE is especially exceptional: it practically never drops a frame even on extremely low-end HW (Atom z2760), and also almost never shows blanks during scroll.

Jet has already shown interest in looking into it. Hopefully some day Firefox will scroll as smoothly as it can get…

Tabstrip #4 - Vsync, Newtab, Talos, Paint Starvation

I’ve been slightly behind with my blog updates and there has been some great progress recently, so this post covers a bit more than usual.

Vsync

Vsync finally landed on Windows Nightlies not long ago. This means that Firefox will now synchronize its paints with the actual refresh rate of the monitor (if available), which is essential for properly smooth animations. This will work (and is enabled by default) on Windows Vista and later with DWM enabled (“Aero” themes). This also works with WebGL content such as Epic Citadel (demo).

Without vsync, Firefox uses 60hz refresh rate by default, which works relatively decently with 60hz displays, but fails badly at displaying properly smooth animations on monitors with other refresh rates (50hz on quite a few laptop displays, 100hz monitors, etc).
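
To see why a fixed 60hz paint rate looks uneven on a display with a different refresh rate, here is a small simulation sketch. It is an idealized model and not Firefox code: it assumes that each display refresh simply shows the most recent paint, and the function name `frameAdvancePerRefresh` is made up for this example.

```javascript
// Hypothetical simulation: for each display refresh, by how many paint frames
// does the animation advance? A steady sequence of 1s would be smooth.
function frameAdvancePerRefresh(paintHz, displayHz, refreshCount) {
  const advances = [];
  let previous = 0;
  for (let j = 1; j <= refreshCount; j++) {
    // Paint k happens at k / paintHz seconds and refresh j at j / displayHz
    // seconds, so the latest paint at or before refresh j has index
    // floor(j * paintHz / displayHz). Multiplying before dividing keeps
    // exact multiples exact.
    const latest = Math.floor((j * paintHz) / displayHz);
    advances.push(latest - previous);
    previous = latest;
  }
  return advances;
}

console.log(frameAdvancePerRefresh(60, 50, 10)); // [1, 1, 1, 1, 2, 1, 1, 1, 1, 2]
console.log(frameAdvancePerRefresh(60, 60, 5));  // [1, 1, 1, 1, 1]
```

In this model, painting at 60hz on a 50hz display skips one animation frame on every fifth refresh, which shows up as a regular hitch.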

However, the current implementation synchronizes with the main display only, so if using a multi-monitor setup, Firefox windows on secondary monitors might not gain from this.

You can control the refresh rate manually by modifying layout.frame_rate in about:config (and restarting Firefox). The default value is -1, which means “Use vsync if available or 60hz otherwise”. Any other integer value > 0 will force that specific refresh rate.

Naturally, if Firefox can’t animate specific content quickly enough to keep up with the designated rate, then the actual rate will be lower.
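
The layout.frame_rate rule described above can be written down as a tiny function. This is only a sketch of the description in this post, not the refresh driver's code; the name `effectiveFrameRate` and the `vsyncHz` parameter are hypothetical, and values which the post doesn't describe are deliberately rejected.

```javascript
// Hypothetical helper mirroring the description above; not Firefox code.
// pref: the value of layout.frame_rate.
// vsyncHz: the refresh rate obtained via vsync, or null if unavailable.
function effectiveFrameRate(pref, vsyncHz) {
  if (pref === -1) {
    // Default: use vsync if available, or 60hz otherwise.
    return vsyncHz !== null ? vsyncHz : 60;
  }
  if (Number.isInteger(pref) && pref > 0) {
    // Any positive integer forces that specific rate.
    return pref;
  }
  // Other values are not described in the post, so this sketch refuses them.
  throw new RangeError("unsupported layout.frame_rate value: " + pref);
}

console.log(effectiveFrameRate(-1, 50));   // 50
console.log(effectiveFrameRate(-1, null)); // 60
console.log(effectiveFrameRate(100, 50));  // 100
```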

about:newtab

about:newtab is the default page when opening a new empty tab - with the thumbnails of recently visited pages. It was affecting tab animations badly, to the point of displaying as little as a 3-frame slideshow on low-end systems while animating the tabstrip after pressing the + button to add a tab.

Tim Taubert from the Firefox team has done some extensive work recently on several bugs which fall into two categories:

  • Enable newtab preload by default. Newtab preload means that Firefox prepares the newtab page in the background, such that when opening a new tab it’s already rendered and can therefore be displayed quickly with very little processing. It has been working for a while but was disabled by default due to accessibility issues. Tim removed this blocker, simplified the preload mechanism, and made sure that background preload doesn’t happen during animation. This landed about a week ago and newtab preload is now enabled by default on nightlies.

  • Prevent unnecessary reflows of the newtab page. Reflow is the process of re-calculating the layout of the page, e.g. after window resize, and it’s relatively CPU intensive. Tim discovered that newtab causes unnecessary reflows in several cases, some of them related to preload. He fixed them and also added an automatic test to detect if new reflows are accidentally introduced over time.

The result of all this work is that animating a new tab with the newtab page is now almost as smooth as opening a blank tab, with a few further potential improvements to be worked on soon. Well done!

Talos

Bug 590422 (remove timers averaging filter) landed more than two months ago and should greatly improve timer accuracy, but it caused a few regressions which are still being worked on. Talos tscroll was already modified as a result (that change landed about 2 weeks ago), and recently tsvg regressions from this bug were examined as well.

As it turned out, tsvg was rendering SVG animations at 100-200ms intervals, and measuring/reporting the overall runtime as representative of SVG performance. In 4 of the 7 animation tests within the tsvg suite, more than 95% of the time was spent inside setTimeout with the browser completely idle. This resulted in tests which in practice represent the accuracy of timers more than anything else.

To make sure the overall runtime represents actual SVG performance, we should iterate the animation frames as fast as possible. We could do this with setTimeout(loop, 0), but since Firefox renders to screen at 60hz, and it’s usually smart enough to delay reflow calculations until it’s absolutely necessary, we could end up fully processing only one in every few frames when the SVG iterations are quicker than 60Hz, and in tsvg they’re much quicker than that.

To make sure each frame is fully processed on each iteration, we should therefore use requestAnimationFrame (fondly nicknamed rAF) - which makes sure that once we’re back at the callback, whatever layout changes we introduced were already fully processed and flushed by Firefox. However, rAF iterates at 60hz by default (or at the vsync rate), which is usually not fast enough to stress our SVG tests. The solution is to set layout.frame_rate to a high enough value (during these tests), which causes Firefox to iterate and fully process each frame as fast as possible. Joel Maher helps a lot with testing and deploying the new tsvg tests, and we hope to land them soon.
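
For illustration, an rAF-driven measurement loop could look roughly like the sketch below. This is not the actual tsvg code: the function `measureFrames` and all of its parameters are hypothetical. The scheduler and the clock are passed in, so that in a browser they can be requestAnimationFrame and performance.now(), while elsewhere they can be replaced with stand-ins.

```javascript
// Hypothetical sketch of an rAF-driven measurement loop; not the tsvg code.
// step(i): applies the i-th animation change (e.g. modifies the SVG).
// schedule(cb): requests the next frame; in a browser this would be
//               (cb) => requestAnimationFrame(cb).
// now(): clock in ms; in a browser this could be () => performance.now().
// done(intervals): called with the recorded frame intervals when finished.
function measureFrames(frameCount, step, schedule, now, done) {
  const intervals = [];
  let last = 0;
  let i = 0;

  function onFrame() {
    const t = now();
    if (i > 0) {
      // Time from applying the previous change until this callback.
      intervals.push(t - last);
    }
    last = t;
    if (i === frameCount) {
      done(intervals);
      return;
    }
    step(i++);
    schedule(onFrame);
  }

  schedule(onFrame);
}

// In a browser, usage might look like this (updateSvg is a made-up name):
// measureFrames(100, updateSvg, (cb) => requestAnimationFrame(cb),
//               () => performance.now(), (intervals) => console.log(intervals));
```

With the default frame rate such a loop would report intervals of about 16.7ms regardless of how cheap each frame is, which is why the frame rate has to be raised for the test to measure anything useful.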

Making tsvg iterate ASAP also brought us back to tscroll. The recent change of tscroll was to use rAF instead of setTimeout for scroll iterations. However, while it represents scroll smoothness under normal conditions, it doesn’t detect small regressions - since Firefox isn’t actually being stressed and the scroll doesn’t limit our iteration frequency. But stressing the browser with ASAP iteration will lose the real-world simulation of normal refresh rates. So we don’t have a single approach to satisfy both of these needs.

We’re still considering this, but it seems talos will eventually run tscroll twice: once @60hz to make sure we can keep up in real-world scenarios, and once with ASAP iterations to detect performance regressions at much higher resolutions.

Paint starvation

While working on tsvg with ASAP iterations, we noticed that Firefox sometimes appears frozen for the entire animation (typically less than 1 second) - it only shows the last animation frame. While this happened when rAF was set to iterate ASAP, we discovered that there are scenarios under perfectly normal conditions which would exhibit exactly the same symptom.

This is now bug 880036. It shows that if, for some reason, Firefox can’t iterate frames as fast as it wants to (60hz by default), then it may spend too much time on these iterations and prevent other event types from being handled on time. The most noticeable one is the paint event, since it makes Firefox appear frozen, but possibly other events are starved as well.

A patch which seems to fix the issue was already posted, but since this freeze appears to ‘fix’ itself after about 2s, we want to first understand this unfreeze mechanism, and possibly fix it instead of posting a patch to cover up for another bug.

Private Firefox Sync Server in 5 Minutes

Well, theoretically in 5 minutes.

The official python server is there. The instructions on this page are for setting up a small 3rd party weave (Firefox sync) PHP server which is compatible with current Firefox 21 and nightlies as far as I can tell, and which I started using on my own systems as a replacement for the Mozilla sync server.

I’m not associated with this server in any way, and I’m not a security expert. Use at your own risk.

The short version - with sqlite:

  1. Have a web server with HTTPS and php5+sqlite (if self-signed cert, make sure to permanently accept it before Firefox sync setup).
  2. Create a directory for weave (e.g. /var/www/weave), and put the files from this repository in it. Make sure the web server has write access to it (e.g. sudo chown www-data /var/www/weave).
  3. Browse to https://your.domain.com/weave/index.php (sqlite selected by default), click OK.
  4. The server is now ready and operational. Sync URL is https://your.domain.com/weave/index.php/ (note the trailing ‘/’).
  5. Once an account has been created by setting up sync from Firefox, you can disable further new account registrations in settings.php (new pairings with existing accounts will still work).
  6. If using sqlite, make sure that weave_db is not accessible from outside (e.g. using .htaccess).
  7. If required, to reset the server and delete the accounts: delete weave_db and settings.php at the weave directory (and go to step 3).
  8. Migration to a new server: Unlink all devices (Tools>Options>Sync>Unlink), set up sync on one device with the new server, then pair the other devices normally. If you don’t wish to share all items (Addons/Passwords/etc), make sure to click “Sync Options” in the setup/pair dialogs, since it resets to the defaults more often than expected.

Tabstrip Animation Progress #3

Quick update on recent progress.

Vsync from a thread

Following the first experimental vsync implementation on Windows (using GetVBlankInfo which is non-blocking and synchronous), I’ve been experimenting with another implementation which vlad suggested: Run a thread which uses WaitForVBlank (blocking until next vblank) and post the timing event to the main thread.

The conclusions are here. In a nutshell: WaitForVBlank works nicely (when it works), but also blocks the main thread, and while some circumvention is possible (using sleep), it’s still far from ideal. Thanks go (yet again) to Bas for helpful tips on Windows APIs and integration into the Layer manager.

Since OMTC might integrate WaitForVBlank into its pipe more naturally, it was decided to land the original approach - which is very lightweight and works reasonably well, especially when using the main/single monitor - and reconsider the other approach with OMTC.

Expect vsync on Windows nightlies soon.

New tab previews hurt tab animation

That’s bug 843853. I decided to put it on hold for now, as it appears that not enough resources can be directed towards it. It would have been nice, however, to see a more practical list of priorities - one which more carefully takes resources into account. Hopefully, we’ll have such a list soon.

Timer filter removal regressions

Bug 590422 (remove averaging filter for timers - which probably hurts timing more than it helps) landed some time ago, but still rears its head. The most recent is lost frames while decoding video.

Apparently what happens is that the filter removal also hurts average timer intervals when high resolution timers are not enabled, and video decoding uses timers without making sure they’re in “high-res mode”. The fix was quick and everyone’s happy.

This also brought a new discussion about automatically enabling high-res timers according to some heuristics on the requested timeouts (and possibly also the request source). While it sounds useful, the main argument against it is that high-res timers have slightly higher power draw, and that it could be abused by content pages which use setTimeout(0) while what they actually want is “soon enough”, and therefore get much more frequent callbacks than required.

This discussion is still ongoing, and you’re welcome to contribute more factors to consider.

Tabstrip animation project goals

There has been some discussion on refining the project goals, to make them both useful from a user perspective (rather than a mostly synthetic bar) and also reasonably achievable. We feel comfortable with this definition, which, in a nutshell, is:

  • Have smooth enough (50FPS or more) animation during the last part of the animation.
  • Make sure this happens in some concrete cases we care about (such as newtab open, close a tab and get into a gmail tab, etc).

What’s next?

  • Land Windows vsync.

  • Get a clear priorities list for newtab slowdown.

  • Implement a regression test which could measure our discrete tab animation goals (tab animation telemetry is useful, but has lower resolution than we’d like), and give each a score. Still not sure if as a talos test, but right now talos appears to be the proper place for it.

Tabstrip Animation Progress #2: Vsync, Newtab Page Rendering, Lightweight Theme

Tabstrip animation - Progress #2

The previous post introduced the Tabstrip animation project, and some work which has been done so far. This post reviews some more recent related progress.

Vsync

During the Paris snappy work week, Bas and I discussed animation smoothness, and specifically the default 60hz refresh rate which Firefox currently uses. We’ve conducted a very unscientific poll among some attendees at the meeting, and found out that 4 out of 9 laptops had 50Hz monitor refresh rate. The rest had 60. For those with 60Hz, the current refresh system is acceptable, even if not perfect (actually … Often, developers choose 60 Hz as the refresh rate, not knowing that the enumerated refresh rate from the monitor is approximately 60,000 / 1,001 Hz …), but those with 50Hz monitors will always get a very noticeable jitter during any animation.

So Bas wrote a patch to expose some vblank timing info on Windows via widget, and I updated the refresh driver to use this info. The result is decent vsync synchronization on monitors with all refresh rates (you can test the slightly outdated Windows try build, which hasn’t landed).

However, by using the vblank timing info and targeting a timer just past it, we’re forced to “lose” some 2-3ms in order to always hit the correct vsync phase - by targeting a bit later than the actual future vblank signal, to make sure we account for the inherent inaccuracies of our timer system (integer ms interval, callback delays/too-early, etc).
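
The timer targeting described above can be sketched as follows. This is an illustration only, not the refresh driver patch: the function `nextTimerTarget`, its parameters, and the 3ms default margin are assumptions chosen to match the “2-3ms” mentioned above.

```javascript
// Hypothetical sketch; names and the 3ms default margin are assumptions.
// lastVblankMs: timestamp of a known past vblank.
// intervalMs: time between vblanks (e.g. 20 for a 50Hz display).
// nowMs: current time, on the same clock as lastVblankMs.
function nextTimerTarget(lastVblankMs, intervalMs, nowMs, marginMs = 3) {
  // Number of whole refresh intervals elapsed since the known vblank.
  const elapsed = Math.floor((nowMs - lastVblankMs) / intervalMs);
  const nextVblank = lastVblankMs + (elapsed + 1) * intervalMs;
  // Aim slightly past the vblank, and round up since timers use integer ms.
  return Math.ceil(nextVblank + marginMs);
}

console.log(nextTimerTarget(1000, 20, 1005)); // 1023
console.log(nextTimerTarget(1000, 20, 1020)); // 1043
```

The margin is the time which is “lost” on every frame: the paint always starts a few ms after the vblank it is synchronized to.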

Vladimir didn’t like this approach, and so now we’re trying a different one with a thread which blocks on actual vblank signal, and then posts an event when we’ve hit it (no patch yet).

However, this approach is also not perfect. As Bas noted, the earlier approach allows easier association with different windows/monitors, is possibly easier to integrate with a fallback to the current code path, and probably also allows easier disabling of vsync in cases where we can’t animate fast enough, or when gamers want to reduce input/output lag to the minimum possible.

The jury is still out on this, but you can follow the Windows vsync implementation and the global vsync tracking bug.

New tab page

Tim Taubert has been working on improving new tab page rendering, which often competes with the tab open animation for resources, and may degrade it, especially on slow systems.

While at it, we got distracted by a patch which regressed animation smoothness after changing its trigger from asynchronous (setTimeout 0) to synchronous. The main arguments were that setTimeout delays the animation start more than we should, and therefore synchronous is the way to go. On the other hand, setTimeout causes a relatively small delay, and could result in considerably smoother animation, since its callback is typically (but not always) handled after the entire current event queue is handled.

Eventually, we settled on using requestAnimationFrame, which is asynchronous and will not delay the first animation frame, but may still be handled earlier than setTimeout, especially on slow systems. The numbers support this theory (15 animation frames using setTimeout, 7 frames after changing to synchronous, and 12 frames after switching to rAF).

Looking back at this, however, this was a distraction and a somewhat premature discussion. The reason is that these measurements were taken with the about:blank page, and not with our typical newtab page (which, on slow systems, is still too slow to actually measure improved animation when using it).

Note to self: Identify distractions and premature discussions before they consume too much time.

Back to new tab rendering. Trying different semi-random approaches didn’t get us too far:

E.g. we tried replacing XUL flexbox with CSS3, which gave some improvement but added complexity. We’ve also considered delaying the tab animation until after newtab rendering settles, or delaying newtab rendering until after the animation is done (or even disabling tab animation on slow systems, especially now that we have tab animation smoothness measuring infrastructure).

We should probably get back to basics and profile the interaction between tab animation and newtab page rendering (which is not so easy, since a lot of stuff is happening when a tab is animating concurrently with new tab page rendering, but we should still do that), and get some real numbers so that we can point some real fingers ;)

Yet another note: While the newtab rendering is apparently high priority for both the performance team and the fx-team, in hindsight, it has made too little progress. Maybe we have too many high priority items for the resources we can afford, and we need to get back to re-prioritizing the high priority items?

Lightweight Themes (LWT, previously: Personas)

MattN and mconley are implementing Lightweight themes support for Australis.

As part of their ongoing effort to keep performance in check, they’ve performed animation measurements, comparing performance on various systems and getting some real numbers. They’re summarizing their results in this spreadsheet.

Their current results with LWT are that tab animation is on par with Australis (in the above spreadsheet, columns AL-AP, rows 46-58), which itself is almost on par with the current theme. That is a great achievement, considering Australis is much more complex.

Keep up the great work! :)

Tabstrip Animation Project #1 - Introduction

I’ve had a few slow weeks during which I attended to some personal matters. Hopefully I’m back at full steam again.

Here’s a summary of recent events.

Tabstrip animation project

As part of the recent Performance team changes (1, 2), I’m now the tech lead for the tabstrip animation project. It’ll require some additional coordination and regular blog posts, but otherwise, work continues as usual.

The goal of the tabstrip animation project:

Make tabstrip animations as smooth and as snappy as possible, both visually and perceptually, by identifying and removing bottlenecks, deficiencies, and perception issues. The ultimate goal here is 60 FPS on a recent Atom CPU on all platforms, with minimal delays.

So far, this work touched several different subjects:

Talos - Test in a Browser, Noise Detection

Talos is a performance testing framework used by Mozilla. It’s invoked on every checkin to the main code repository for early detection of performance regressions. You can find many interesting talos notes on Joel Maher’s blog.

Talos tests are pretty simple - they perform some task within a browser, then report a completion duration (or a comma-separated list of those) via the tpRecordTime javascript talos call. Talos then logs and processes those values, tries to ignore the noisy bits, and comes up with average and StdDev values (it also applies a similar process over several runs of the same test) - which can then be compared to other runs of the same test on different versions/platforms of Firefox.
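
As a rough idea of this kind of processing, here is a minimal sketch. The filtering which Talos actually applies is not described here, so dropping the first value as “noise” is purely an assumption, as are the function name `summarize` and the use of the population standard deviation.

```javascript
// Hypothetical sketch of summarizing reported durations; not Talos code.
// values: array of reported durations in ms.
function summarize(values) {
  // Assumption: treat the first value as warm-up noise and ignore it.
  const kept = values.length > 1 ? values.slice(1) : values;
  const mean = kept.reduce((a, b) => a + b, 0) / kept.length;
  // Population variance of the kept values.
  const variance = kept.reduce((a, v) => a + (v - mean) ** 2, 0) / kept.length;
  return { mean, stdDev: Math.sqrt(variance) };
}

console.log(summarize([120, 98, 100, 102]));
```

Here the slow first run (120) is ignored and the remaining values average to 100.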

Run and develop talos tests - without talos

While updating the tscroll talos test, I found that running the test in a browser outside of the talos framework could be useful. The page could be reloaded to re-run the test, allowing quicker iterations of the code. While this isn’t as sterile as the talos run environment, it’s still useful during development. This required substituting tpRecordTime with my own, and also displaying various statistics on the collected data.

I’m not the only one who deployed such tactics, and one can find commented-out alerts and statistics calculations (which are not reported to talos) in various talos tests.
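
A stand-in for tpRecordTime could look something like the sketch below. It is not the code I used: the helper `makeRecorder` is hypothetical, and the real tpRecordTime signature is assumed here to accept either a number or a comma-separated string of durations.

```javascript
// Hypothetical stand-in for running a test page outside of talos.
// log(message): where to display the statistics, e.g. console.log.
function makeRecorder(log) {
  const recorded = [];
  function tpRecordTime(value) {
    // Assumption: value is a number or a comma-separated list of numbers.
    const values = String(value).split(",").map(Number);
    recorded.push(...values);
    const min = Math.min(...values);
    const max = Math.max(...values);
    log("recorded " + values.length + " values, min " + min + ", max " + max);
  }
  return { tpRecordTime, recorded };
}

// Only install the stand-in when the framework has not provided the real one.
if (typeof globalThis.tpRecordTime === "undefined") {
  globalThis.tpRecordTime = makeRecorder(console.log).tpRecordTime;
}
```

With this in place, the test page can call tpRecordTime the same way in both environments, and reloading the page re-runs the test with fresh statistics.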

Help Wanted - Slow Touchpad Scroll in Firefox?

Bug 829952 says that scrolling using the touchpad feels slow on some laptops.

Turns out that it’s quite hard to provide a consistent scroll experience on the various platforms which Firefox supports, due to many laptop and touchpad manufacturers, different drivers, OS configurations etc. We could have provided some (or a lot of) configuration UI, but the best solution would be to get it as good as possible out of the box.

And to do that, we need data - and you might be able to help.

- Hello World -

When your manager does your work for you, it means something has gone wrong along the way. So here is me fixing it - finally I’ve set up a blog. You can read a bit about me here.

Welcome aboard.

Since I joined the performance team at Mozilla not long ago, I’ve been working on animation smoothness, starting with tab animation, but touching related subjects as well.

In this space, I hope to provide interesting progress notes on this and related subjects.

Stay tuned.