This past Sunday, I landed a set of changes into Review Board that improve performance, such as aggressive browser-side caching of media and pages. It’s just a start, but it has already significantly reduced page load times in all of my tests, in some cases by several seconds. We implemented these methods for Review Board, but they’re methods that can be applied to any Django project out there.
There are several key things that Review Board now does to improve performance:
- Tells browsers to cache all media for one year.
- Only sends page data if new data is available.
- Compresses all media files to reduce transfer time.
- Parallelizes media downloads.
- Loads CSS files as early as possible.
- Loads JavaScript files as late as possible.
- Progressively loads expensive data.
A lot of the performance improvements come straight from Yahoo!’s Best Practices for Speeding Up Your Site. We’re not doing everything there yet, but we’re working toward it. That’s a great resource, by the way, and I recommend that everyone who has ever made a website go and read it.
So what do the above techniques buy us, and how are we doing them? Let me go into more details…
Caching all media for a year
The average site has one or more CSS files, JavaScript files, and several images. This translates to a lot of requests to the server, which can leave the site feeling slow. On top of this, a browser only makes a few requests to a server at a time, in order to avoid swamping it, which further hinders load times. This happens every time a user visits a page on your site.
Aggressive caching makes a huge difference and can greatly reduce load times for users. Review Board now tells the browser to cache media files for a year. Once a user downloads a JavaScript or CSS file, they won’t have to download it again, meaning that, in general, the only requests the browser needs to make are for pages and AJAX requests.
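Telling the browser to do this is mostly a matter of the web server sending far-future caching headers for media. As a rough illustration only (it assumes Apache with mod_expires enabled, and the /media/ path is just a placeholder, not necessarily how Review Board’s shipped configs are written):

```
<Location "/media/">
    ExpiresActive On
    ExpiresDefault "access plus 1 year"
</Location>
```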
The big pitfall with long-term caching is that the cached resources can go stale. For example, if a new version of an image was uploaded, the browser wouldn’t even know about it, since it was told it should keep its old version for a year before checking again.
We solve this by introducing “media serials,” timestamps that are appended to all media paths. Instead of caching /js/myscript.js, the browser would cache /js/myscript.js?1273618736.
These media serials are computed on the first page request by our djblets.util.context_processors.ajaxSerial context processor. This quickly scans all media files known to the program, finding out the latest modification timestamp. It then provides a {{MEDIA_SERIAL}} variable for templates to append to media URLs as part of the query string.
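To make that concrete, here is a minimal sketch of what such a context processor could look like. This is only an illustration, not the actual djblets code; the function name, the module-level caching, and walking MEDIA_ROOT are all assumptions:

```python
import os

from django.conf import settings

_media_serial = None


def media_serial(request):
    """Expose a MEDIA_SERIAL template variable based on the newest
    modification time of any file under MEDIA_ROOT."""
    global _media_serial

    if _media_serial is None:
        latest = 0

        # Scan the media tree once and remember the most recent mtime.
        for root, dirs, files in os.walk(settings.MEDIA_ROOT):
            for name in files:
                path = os.path.join(root, name)
                latest = max(latest, int(os.stat(path).st_mtime))

        _media_serial = latest

    return {'MEDIA_SERIAL': _media_serial}
```

Templates can then refer to something like {{MEDIA_URL}}js/myscript.js?{{MEDIA_SERIAL}}, so the URL (and therefore the cache entry) changes whenever any media file does.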
The benefit of this method is that we can cache media files for a year without worrying about users having stale cached resources the next time we upgrade a copy of Review Board. The URLs requested will be different, so browsers will see that the new file is not in the cache, request it, and cache the new file for a year.
Only send page data if new data is available
Aggressive caching of media files is great and saves a lot of time, but it doesn’t help for dynamically generated content. For this, we need a new strategy.
When a browser makes a request, it can send an If-Modified-Since header to the server containing the Last-Modified value it received the last time it downloaded that page. This is a very valuable header, and there are some things we can do with it to save both the server and the browser a lot of trouble.
If the browser sends If-Modified-Since, and we know that no new data has been generated since the timestamp provided, we can send an HttpResponseNotModified (HTTP response code 304). This will tell the browser it already has the newest version of the page. The sooner we do this, the better, as it means we don’t have to waste time building templates or doing any expensive database queries.
Djblets, once again, provides some functions to help out here: djblets.util.http.set_last_modified and djblets.util.http.get_modified_since.
The general usage pattern is that we first build a timestamp representing the latest version of the page. This could be the timestamp for a particular object that the page represents. We then check if we can bail early by calling:
    if get_modified_since(request, timestamp):
        return HttpResponseNotModified()
Further down, after building the page, we must set the Last-Modified timestamp, using the same timestamp as above, like so:
    set_last_modified(response, timestamp)
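Putting the pattern together, a view might look roughly like the following. The model, field, and template names are hypothetical; the djblets calls are the ones mentioned above:

```python
from django.http import HttpResponseNotModified
from django.shortcuts import get_object_or_404, render_to_response
from django.template import RequestContext

from djblets.util.http import get_modified_since, set_last_modified

from myapp.models import ReviewRequest  # hypothetical model with a
                                         # last_updated timestamp field


def review_request_detail(request, review_request_id):
    review_request = get_object_or_404(ReviewRequest, pk=review_request_id)

    # The latest change to this object is the page's Last-Modified time.
    timestamp = review_request.last_updated

    # Bail out early with a 304 if the browser's copy is still current.
    if get_modified_since(request, timestamp):
        return HttpResponseNotModified()

    # Otherwise, do the expensive work of building the page.
    response = render_to_response('reviews/review_detail.html', {
                                      'review_request': review_request,
                                  },
                                  context_instance=RequestContext(request))

    set_last_modified(response, timestamp)
    return response
```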
We’re using this in only a few places right now, such as the review request details page, but it drastically improves load times. If the review request hasn’t changed and nobody’s commented on it since the browser last loaded the page, a reload of the page will be almost instant.
Compress all media files
Our Apache and lighttpd config files now enable compression by default. By compressing these files, we can turn a relatively large JavaScript file (such as the jquery and jquery-ui files) into a very small one before sending it over to the browser. This reduces transfer time at the expense of compression/decompression time (which is small enough not to worry about for deployments of this size, and can be offset by caching compressed files server-side).
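With Apache, for example, mod_deflate can compress the text-based types with a single directive along these lines (illustrative only; the actual shipped config files may differ):

```
AddOutputFilterByType DEFLATE text/html text/css application/x-javascript
```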
Parallelize media downloads
It’s important not to mix loads of media files of different types. The browser parallelizes downloads of media of the same type, in page load order, but if you load one CSS file, one JavaScript file, another CSS file, and then another JavaScript file, the browser will only attempt one load at a time. If you load all the CSS files before all the JavaScript files, it will parallelize the CSS downloads and then the JavaScript downloads. By enforcing this separation, we get faster page download/render times.
Load CSS files as soon as possible
Loading CSS files before the browser starts to display the page makes the page appear to load more smoothly. The browser will already know how things should look and will lay the page out accordingly, instead of laying it out once and then updating it after the CSS files have loaded.
Load JavaScript files as late as possible
JavaScript loads block the browser, as the browser must parse and interpret the JavaScript before it can continue. Sometimes it’s necessary to load a JavaScript file early, but in many cases the files can be loaded late. When possible, we load JavaScript files at the very end of the document body so that they won’t even begin downloading until the page has rendered. This provides a noticeable performance improvement for script-heavy pages.
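Pulling the last three points together (along with the media serials from earlier), the skeleton of a page template might be laid out something like this; the file names here are just examples:

```
<head>
  ...
  <!-- All stylesheets up front, so they download in parallel and the
       browser knows how to lay out the page on the first pass. -->
  <link rel="stylesheet" type="text/css"
        href="{{MEDIA_URL}}css/common.css?{{MEDIA_SERIAL}}" />
  <link rel="stylesheet" type="text/css"
        href="{{MEDIA_URL}}css/reviews.css?{{MEDIA_SERIAL}}" />
</head>
<body>
  ...
  <!-- All scripts at the very end, so they don't block rendering. -->
  <script type="text/javascript"
          src="{{MEDIA_URL}}js/jquery.js?{{MEDIA_SERIAL}}"></script>
  <script type="text/javascript"
          src="{{MEDIA_URL}}js/reviews.js?{{MEDIA_SERIAL}}"></script>
</body>
```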
Progressively load expensive data
There are types of data that are just too expensive to load along with the rest of the page. For a long time, Review Board would parse and render fragments of a diff for display in the review request page, but that meant that before the page could load, Review Board would need to do the following:
- Query the list of all comments.
- Fetch every file commented on.
- Apply the stored patch to each file.
- Diff between the original and patched files.
- Render the portion of the diff commented on into the page.
This became very time-consuming, and if a server was down, the page wasn’t available until everything timed out. The solution to this was to lazily load each of these diff fragments in order.
We now display a placeholder table for each diff fragment at roughly the same size as the rendered fragment (to avoid excessive page scrolling during loads). The table contains a spinner to show that something is happening, and we load each diff fragment one by one (to avoid dogpiling the server).
The code that renders the diff fragments, by the way, takes advantage of the If-Modified-Since header and is also cached for a year. We use an AJAX_SERIAL (same principle as the MEDIA_SERIAL above) to allow for changes in new deployments.
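As a rough sketch of what such a fragment view could look like (the model and rendering helper names are hypothetical; the cache helpers are standard Django):

```python
from django.http import HttpResponseNotModified
from django.shortcuts import get_object_or_404
from django.utils.cache import patch_response_headers

from djblets.util.http import get_modified_since, set_last_modified

from myapp.models import Comment                 # hypothetical comment model
from myapp.renderers import render_diff_fragment  # hypothetical (expensive) renderer


def comment_diff_fragment(request, comment_id):
    comment = get_object_or_404(Comment, pk=comment_id)
    timestamp = comment.timestamp

    # Skip the expensive diff rendering entirely if the browser's copy
    # of this fragment is still current.
    if get_modified_since(request, timestamp):
        return HttpResponseNotModified()

    response = render_diff_fragment(comment)
    set_last_modified(response, timestamp)

    # Let the browser keep the fragment for a year. The AJAX_SERIAL in
    # the requesting URL changes on upgrades, so stale copies are never
    # asked for again.
    patch_response_headers(response, cache_timeout=365 * 24 * 60 * 60)
    return response
```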
With these caching mechanisms in place, the review request page now loads in roughly a second in many cases (or less once cached), with diff fragments coming in lazily (and then almost immediately on future loads).
More to come…
This was a great first step, but there’s more we can do. Before we hit our 1.0 release, we’re going to batch all our CSS files and JavaScript files together into a couple of combined files and then “minify” them (compressing them in such a way as to produce smaller files and faster load times for the interpreted data).
Again, these are techniques we’re now making use of in Review Board, but they’re not in any way specific to Review Board. Anyone out there developing websites or web applications should seriously look into ways to improve performance. I hope this was a good starting point, but seriously, do read Yahoo!’s article as well.
Why doesn’t ETAGS cut it? Compared to modifying the querystring, ETAGS seems to be pretty powerful.
Have you pushed these changes to the VMware server yet?
@Tony: No. You’ll notice when we do, and there will be an announcement.
@anon: ETAGS doesn’t magically solve caching issues and doesn’t give us the ability to save time server-side. ETAGS is certainly a good thing, but not a replacement for modified times (nor are modified times a replacement for ETAGS).
ETAGS are often generated based on the content of a document, which means performing all the queries and templating needed to build that document. When an ETAG is generated, it’s sent back to the server on the next request, and the server still has to do all the work to generate the content. The ETAG saves the transfer time, but none of the server time. In fact, it actually adds to it a little, since we need to generate something based on the document (which in our case can be expensive for diffs).
By making clever use of the If-Modified-Since header, we can shortcut all of that. Given that ability, ETAGS aren’t that helpful to us, so long as we’re handling If-Modified-Since everywhere. We’re not yet, but we’re also not losing out on much, since with the aggressive caching, the page transfer time is pretty small for the remaining pages.
If you’ve got people deploying the app on secure sites, it is also worth sending “Cache-Control: public” for things you don’t mind being stored to disk such as media. Without this, browsers will only cache those resources in memory rather than on disk.
With respect to ETags, if you need to generate the entire page in order to calculate your ETag, you won’t see a huge benefit from their use (it would still cut down the bandwidth though). They help a lot when you have a way of calculating the ETag without generating the entire contents (e.g. ViewVC uses revision numbers as ETags for a number of pages, which are cheap to check). That said, if the only identifier you have is a monotonically increasing date field, then using last-modified is probably okay. Just make sure there is no way for the date to go backwards …
…and to check that altogether, install http://developer.yahoo.com/yslow/
@vinz: YSlow! is an awesome tool. It’s been invaluable to this.
@James: Both good points. We’ll probably be taking advantage of ETags for certain pages. The pages where the modified times have been important were our big performance bottlenecks, but now that those are taken care of, I’m going to look into integrating ETags in some places.