
Blog: Category ‘Performance’


Performance Research, Part 6: Less is More — Serving Files Faster by Combining Them

This article is the sixth in a series of YUIBlog articles describing experiments conducted to learn more about optimizing web page performance (Part 1, Part 2, Part 3, Part 4, Part 5).

In Performance Research Part 1, we discussed how reducing the number of HTTP requests has the biggest impact on response time and is often the easiest performance improvement to make. One way to do this without simplifying the page design is to combine multiple scripts into a single script, and similarly to combine multiple stylesheets into a single stylesheet.

Combining multiple files reduces the extra bytes from HTTP headers as well as potential transfer latency caused by TCP slow starts, packet losses, etc.

Figure 1 shows a graphical view of how time is spent loading a page with six separate scripts. Notice that for every file, the browser makes a separate HTTP request to retrieve the file. The gaps between the scripts indicate the time the browser takes to parse and render each script. Figure 2 shows how time is spent loading a page with the same six scripts combined into a single script.

Figure 1. Loading a page with six separate scripts


Figure 2. Loading a page with one combined script


Combining JavaScript and CSS files as part of the development process can be burdensome. It usually makes sense during development to organize the code into logical modules as separate files. Typically, combining those separate files before product release is either a manual process or part of a build process. Every time one of the individual files is changed, the larger file needs to be re-combined and re-pushed. The cost of this across an organization as large as Yahoo! is significant.

Serve Files Faster using Combo Handler

Combo Handler, built collaboratively by Yahoo!’s Exceptional Performance team and the groups that support our CDN, is one solution for combining multiple files into a single, larger file.

Combo Handler provides a way to allow developers to maintain the logical organization of their code in separate files, while achieving the advantages of combining those into a single file as part of the final user experience. It alleviates the need for the time-consuming re-build and re-push processes. In addition, Combo Handler integrates seamlessly into a content delivery network, taking full advantage of the benefits of a CDN while reducing the drawbacks of dynamically combining separate files.

We’ve been using this service across many Yahoo! properties for some time now to help improve end users’ response times. Thanks to the YUI team, it is now available to all of you that are using the Yahoo!-hosted YUI JavaScript files. (Note: Combo-handling of CSS files is not supported at this time.) Head over to the YUI Configurator to generate combo-ready filepaths customized for your specific YUI implementation.

Combo Handler Best Practices

When using Combo Handler to combine files, pay special attention to the order in which the files are specified. Not only can there be dependencies between files, but browsers will also only use the cached version of a file if the filename extracted from the URL is identical. For example, suppose the following smaller files (dom.js and event.js) are combined into a single larger file using Combo Handler, once in each order:


http://yui.yahooapis.com/combo?event.js&dom.js


http://yui.yahooapis.com/combo?dom.js&event.js

In the example above, the browser will download and cache the two combined files separately because, as far as the browser is concerned, the filenames are different.
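One way to keep the order stable is to build every combo URL from a single agreed list. The following is a minimal sketch, not part of Combo Handler itself: `buildComboUrl`, `COMBO_BASE`, and the `order` list are hypothetical names, and the URL shape mirrors the simplified example above.

```javascript
// Hypothetical helper: build a combo URL whose file order is always the same.
// COMBO_BASE mirrors the simplified URLs used in the example above.
var COMBO_BASE = "http://yui.yahooapis.com/combo?";

// canonicalOrder lists every combinable file once, dependencies first.
function buildComboUrl(requested, canonicalOrder) {
    var wanted = {};
    for (var i = 0; i < requested.length; i++) {
        wanted[requested[i]] = true;
    }
    // Walk the canonical list so the output order never depends on
    // the order in which a page happened to ask for its files.
    var ordered = [];
    for (var j = 0; j < canonicalOrder.length; j++) {
        if (wanted[canonicalOrder[j]] === true) {
            ordered.push(canonicalOrder[j]);
        }
    }
    return COMBO_BASE + ordered.join("&");
}

var order = ["dom.js", "event.js"]; // illustrative order only
var pageA = buildComboUrl(["event.js", "dom.js"], order);
var pageB = buildComboUrl(["dom.js", "event.js"], order);
// pageA and pageB are now the same URL, so the second page can reuse the cache.
```

Because both pages resolve to one URL, the browser stores a single combined file instead of two.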

Also, you may not always want to combine all files into one single file. Suppose you have one or more scripts that are shared across multiple pages in your site in addition to scripts that are only used on specific pages. By combining everything into one large file and using this file across your entire site, some pages will spend time downloading more than they really need. Instead, take a look at different types of combinations. You might combine the scripts that are used in every page across your site into one script. Then for each page or group of pages, combine common scripts into another separate script.

Yahoo! HotJobs Combines and Reduces Response Time by 8%!

The Exceptional Performance team ran an experiment with Yahoo! HotJobs to determine the response time savings our users would benefit from by combining multiple files into a single file. Two real user test buckets were created for this experiment. In one bucket, users visited a page with six JavaScript files left uncombined. In the second bucket, users visited the same page with the six JavaScript files combined into one single file.

Combining six JavaScript files into one single JavaScript file improved performance by almost 8% on average for Yahoo! HotJobs’ users on broadband connections and by 5% for users on a LAN. No design or feature changes required!

Keep in mind that the page we tested was already highly optimized for performance and had a YSlow “A” grade. The response time savings depend on a number of factors including number of files combined, browser caching patterns, etc. This experiment supported our previous research, which indicated that reducing HTTP requests is an effective way to improve response times for our end users.

Takeaways

Improve response times by combining multiple JavaScript and CSS files. Yahoo!’s Combo Handler Service is one solution that provides a way to make fewer HTTP requests for Yahoo!-hosted JavaScript files, and also leverages the benefits of a Content Delivery Network.

  • Combine scripts and stylesheets to reduce HTTP requests.
  • Look at different types of file combinations.
  • Avoid making users download more than they really need.
  • Pay special attention to the order in which files are combined.
By Tenni Theurer, July 21st, 2008

Combo Handler Service Available for Yahoo-hosted JS

We’ve been talking for a long time at Yahoo about the importance of minimizing HTTP requests to improve performance. One important technique for YUI users has long been to use the pre-built "rollup" files (like yahoo-dom-event.js, which combines the YUI Core in a single minified HTTP request) and to create custom rollups that aggregate all of your YUI JS content in a single file. You’ll notice that we do a lot of this on our core Yahoo properties. For example, if you go to check on the Tour de France on Yahoo! Sports, you’ll find that numerous YUI components are aggregated with custom Sports-specific JS resources in a single HTTP request (here’s the aggregate file).

Thanks to the hard work of the Yahoo Exceptional Performance team and the groups that support our CDN, we’re now able to offer ad-hoc file aggregation ("combo handling") for files served from yui.yahooapis.com. So, a request for the full YUI Rich Text Editor, which previously looked like this…

<script type="text/javascript" 
   src="http://yui.yahooapis.com/2.5.2/build/yahoo-dom-event/yahoo-dom-event.js"></script>
<script type="text/javascript" 
   src="http://yui.yahooapis.com/2.5.2/build/container/container_core-min.js"></script> 
<script type="text/javascript" 
   src="http://yui.yahooapis.com/2.5.2/build/menu/menu-min.js"></script> 
<script type="text/javascript" 
   src="http://yui.yahooapis.com/2.5.2/build/element/element-beta-min.js"></script> 
<script type="text/javascript" 
   src="http://yui.yahooapis.com/2.5.2/build/button/button-min.js"></script>
<script type="text/javascript" 
   src="http://yui.yahooapis.com/2.5.2/build/editor/editor-beta-min.js"></script> 

…can now be written this way:

<script type="text/javascript" 
src="http://yui.yahooapis.com/combo?2.5.2/build/yahoo-dom-event/yahoo-dom-event.js&
2.5.2/build/container/container_core-min.js&2.5.2/build/menu/menu-min.js&
2.5.2/build/element/element-beta-min.js&2.5.2/build/button/button-min.js&
2.5.2/build/editor/editor-beta-min.js"></script>

In one step, this eliminates five separate HTTP requests.
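The rewrite above is mechanical: drop the host from each URL and join the remaining paths with `&` after `combo?`. Here is a small sketch of that transformation, assuming every input URL starts with the yui.yahooapis.com host; `toComboUrl` and `YUI_HOST` are hypothetical names used only for illustration.

```javascript
// Sketch: collapse individual yui.yahooapis.com URLs into one combo URL.
var YUI_HOST = "http://yui.yahooapis.com/";

function toComboUrl(urls) {
    var paths = [];
    for (var i = 0; i < urls.length; i++) {
        // Only files served from the YUI host can be combined this way.
        if (urls[i].indexOf(YUI_HOST) !== 0) {
            throw new Error("Not a yui.yahooapis.com URL: " + urls[i]);
        }
        paths.push(urls[i].substring(YUI_HOST.length));
    }
    // Input order is preserved, so list dependencies first.
    return YUI_HOST + "combo?" + paths.join("&");
}

var comboUrl = toComboUrl([
    "http://yui.yahooapis.com/2.5.2/build/yahoo-dom-event/yahoo-dom-event.js",
    "http://yui.yahooapis.com/2.5.2/build/container/container_core-min.js"
]);
```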

Combo handling is built into the YUI Configurator interface. A few notes regarding combo handling on yui.yahooapis.com:

  • If you’re using the YUI Configurator, this option ("Combine All JS Files") is enabled by default as long as you’re using the default base path.
  • Combo-handling of YUI CSS files is not supported at this time.
  • In an upcoming release, we’ll provide built-in combo-handling support in YUI Loader and restructure filepaths in YUI’s CSS resources to make them combinable as well.
  • YUI Configurator will always output the current version of the library, but all YUI JS files from 2.2.0 onward are present on yui.yahooapis.com and can be combined using the same combo-handling syntax.

We hope combo handling provides an easy performance win for those of you letting Yahoo serve your YUI files. Discussion of combo handling and all YUI issues takes place in our community forum; please join us there and let us know how this works for you.

By Eric Miraglia, July 16th, 2008

Helping the YUI Compressor

Nicholas Zakas joined Yahoo! in 2006. He is the author of Professional Ajax and Professional JavaScript for Web Developers. He’s a contributor to our Yahoo! Juku. His Maintainable JavaScript presentation is available on YUI Theater.

Julien’s YUI Compressor is an incredibly useful tool for decreasing the size of your JavaScript files. Since it uses Rhino to parse your JavaScript code, it can perform all kinds of smart operations to save bytes in a completely safe way:

  • Replacement of local variable names with shorter (one, two, or three character) variable names.
  • Replacement of bracket notation with dot notation where possible (i.e. foo["bar"] becomes foo.bar).
  • Replacement of quoted literal property names where possible (i.e. { "foo":"bar" } becomes { foo:"bar" } ).
  • Replacement of escaped quotes in strings (i.e. 'aaa\'bbb' becomes "aaa'bbb").

Running your JavaScript code through YUI Compressor results in tremendous savings by default, but there are things you can do to increase the byte savings even further.

Use Constants for Repeated Values

In my talk, Maintainable JavaScript, I talk about using constants (really, just variables that you have no intention of changing) to store repeating values. The idea is that your code is more maintainable because you have a single place to change a value instead of multiple places. As it turns out, this technique also helps YUI Compressor to remove more bytes. Consider the following function:

function toggle(element){
    if (YAHOO.util.Dom.hasClass(element, "selected")){
        YAHOO.util.Dom.removeClass(element, "selected");
    } else {
        YAHOO.util.Dom.addClass(element, "selected");
    }
}

This simple function is designed to toggle the “selected” class on a given element. If the element has the class, then it’s removed; if the element doesn’t have the class, it’s added. As a result, the string “selected” appears three times in the function. The function takes 212 bytes (including white space). When compressed, the resulting code is as follows:

function toggle(A){if(YAHOO.util.Dom.hasClass(A,"selected")){YAHOO.util.Dom.removeClass(A,"selected")}else{YAHOO.util.Dom.addClass(A,"selected")}}

This code weighs in at 146 bytes (a savings of about 30%), but you can see that the string “selected” still appears three times. Moving the repeated value into a variable makes the code more maintainable and lets YUI Compressor save more bytes. Here’s the rewritten function:

function toggle(element){
    var className = "selected";
    if (YAHOO.util.Dom.hasClass(element, className)){
        YAHOO.util.Dom.removeClass(element, className);
    } else {
        YAHOO.util.Dom.addClass(element, className);
    }
}

This code is slightly larger than the original (241 bytes versus 212 bytes), but compresses down to the following:

function toggle(A){var B="selected";if(YAHOO.util.Dom.hasClass(A,B)){YAHOO.util.Dom.removeClass(A,B)}else{YAHOO.util.Dom.addClass(A,B)}}

Note that this compressed code only has one instance of “selected”, resulting in a final byte size of 136 bytes, 10 bytes fewer than the previous version. The savings grow as the instances of the string increase, so if you have 20 places where “selected” was being used, you’d see even greater savings.

Replacing repeated values in your code can lead to greater incremental savings as the number of repeated values increases, as well. It is worthwhile to consider this approach not just for strings, but also for numbers (even Boolean values, if you so desire).

Store Local References to Objects/Values

The YUI Compressor can’t perform variable replacement for either global variables or multi-level object references, so it’s better to store these in local variables. The previous example has three instances of YAHOO.util.Dom in the source code, and so the compressed version also has three instances. By storing YAHOO.util.Dom in a local variable, you can reduce the number of times that it appears in the compressed code. For example:

function toggle(element){
    var className = "selected";
    var YUD = YAHOO.util.Dom;
    if (YUD.hasClass(element, className)){
        YUD.removeClass(element, className);
    } else {
        YUD.addClass(element, className);
    }
}

This version of the function is 238 bytes, and when compressed, shows even greater savings than the previous versions of the function:


function toggle(A){var B="selected";var C=YAHOO.util.Dom;if(C.hasClass(A,B)){C.removeClass(A,B)}else{C.addClass(A,B)}}

The final weight for this version is 118 bytes, a savings of 28 bytes over the original compressed function and 120 bytes smaller than the uncompressed version. And this is just one function; imagine if you got the same savings for all functions in your script.

Keep in mind that this technique also applies to object properties, so if className were a member of an object, its value should be stored locally as well. For instance:

function toggle(element){
    var YUD = YAHOO.util.Dom;
    if (YUD.hasClass(element, Constants.className)){
        YUD.removeClass(element, Constants.className);
    } else {
        YUD.addClass(element, Constants.className);
    }
}

In this function, Constants.className contains the class to use. The variable Constants is global, so its name cannot be replaced. You could set up a reference to Constants, but that is inefficient because you’re only using one property of that object in the function, so set up a reference to Constants.className to save even more bytes:

function toggle(element){
    var className = Constants.className;
    var YUD = YAHOO.util.Dom;
    if (YUD.hasClass(element, className)){
        YUD.removeClass(element, className);
    } else {
        YUD.addClass(element, className);
    }
}

Avoid eval()

By this point, you’ve been told that eval() is evil multiple times and by multiple people. YUI Compressor agrees. The nature of eval() is such that the code executed has access to the variables that are present in the scope in which eval() was called. Because of that, YUI Compressor can’t safely do variable name changing when eval() is present. For example:

function doSomething(code){
    var msg = "hi";
    eval(code);
}

doSomething("alert(msg)");   // alerts "hi"

Even though the string that is being passed to eval() exists outside of the function in which eval() is called, it still has access to the local variables in that function. Since YUI Compressor can’t possibly know whether the variable code contains a reference to a variable in the function, it doesn’t change the variable names in the doSomething() function, resulting in less-than-optimal compression. Remember this: any time you use eval() in a function, that function’s variables cannot be renamed. The best approach is, as often said, to avoid eval() at all costs. If you absolutely must use eval() for some reason, try to isolate it from other code so that the number of variable renaming issues is minimal. For example:

function myEval(code){
    return eval(code);
}

function doSomething(code){
    var msg = "hi";
    var count = 10;

    myEval(code);
}

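In this code, the call to eval() is isolated away from the main body of the doSomething() function. Now, YUI Compressor is free to replace variables in doSomething(). The trade-off is that the evaluated code can no longer see the local variables of doSomething(), such as msg and count, so this only works when the code does not depend on them.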

Avoid with

The with statement is another construct that you are often advised to avoid in JavaScript. For YUI Compressor, the reason is the same as for eval(): the mere presence of with in a function causes variable renaming to be skipped for the entire function. There is just no way to keep track of variables versus object properties in the context of a with statement, so YUI Compressor rightly leaves the code as-is to avoid breaking the functionality. The best advice here is to avoid using with altogether. If you follow the advice of storing local copies of objects/properties, you should have no use for with.
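As a rough illustration of that last point, here is a sketch of a function that might otherwise have been written with a with block. The `highlight` function and the style values are made up for this example; the with version appears only in a comment.

```javascript
// Instead of:
//   with (element.style) { color = "red"; display = "block"; }
// keep a local reference to the object. A local variable can be renamed
// by a minifier, and there is no ambiguity about what is a property.
function highlight(element) {
    var style = element.style;
    style.color = "red";
    style.display = "block";
    return element;
}
```

The local reference gives the same brevity inside the function body while leaving every variable eligible for renaming.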

Use the Verbose Option

YUI Compressor has a “verbose” option (activated by the -v command line switch) that can help in the identification of some of these issues as well as a few others. The verbose option prints out warnings to the console indicating things that are preventing the YUI Compressor from fully doing its job. It will, for instance, tell you that a function contains eval() or the with statement, and therefore cannot be properly compressed. It also does analysis of variables, telling you if a variable was never defined (in which case it becomes global and cannot have its name replaced), if a variable was defined and never used (which just wastes space), and if a variable has been declared multiple times (also a waste of space).

Conclusion

When used alone, the YUI Compressor achieves an excellent compression rate of your JavaScript code. The greatest byte savings are achieved by taking full advantage of variable replacement. The hints presented here have the primary goal of ensuring the YUI Compressor can do variable replacement whenever possible. Using constants to represent repeated values not only aids in compression, but also aids in the maintainability of your code by limiting the number of areas that must be updated to accommodate a change in the value. Using local variables for multi-level object references allows for greater compression through variable replacement as well as providing faster runtime performance (local variable access is faster than global variable access and object property lookup). Perhaps most important is to ensure that you don’t use eval() or with when they’re not necessary, as each causes variable replacement to be turned off in the containing function. The YUI Compressor does a lot for you, but it can’t do everything. You can help it out greatly by following these tips.

By Nicholas C. Zakas, February 11th, 2008

Performance Research, Part 5: iPhone Cacheability – Making it Stick

This article, co-written by Wayne Shea, is the fifth in a series of articles describing experiments conducted to learn more about optimizing web page performance (Part 1, Part 2, Part 3, Part 4). You may be wondering why you’re reading a performance article on the YUI Blog. It turns out that most of web page performance is affected by front-end engineering, that is, the user interface design and development.

At MacWorld 2008, Steve Jobs announced that Apple had sold 4 million iPhones to date, or 20,000 iPhones every day. Net Applications reports that total web browsing on the iPhone is up at 0.12% for December 2007, topping the web browsing on all Windows Mobile devices combined. Apple’s iPhone has changed the game for many users browsing the web on a mobile device. Web developers can now create functionally rich and visually appealing applications that run within the iPhone’s version of the Safari web browser. While the iPhone presents new and exciting opportunities for mobile web application developers, it also presents a unique set of performance challenges.

Limited information is available on this device and understanding the cache properties of the browser is essential to creating a high performance web site. In earlier posts, we described how 80% or more of the end-user response time is spent on the front-end, and why the cache matters. In this research, Yahoo!’s Exceptional Performance team investigated the iPhone cache properties and looked at how the performance rules are affected. We were particularly interested in the following cache properties on the iPhone:

  • The maximum cache limit for an individual component.
  • The maximum cache limit for multiple components.
  • The effect of gzipped components on the maximum cache limits.
  • Whether cached components are persistent between power cycles.

We conducted our cache experiments with both Apple’s iPhone and iPod Touch, and came to the same conclusions.

Cache Hit or Miss?

In Part 2, we discussed the importance of differentiating between end user experiences for an empty versus a primed cache page view.

When an external component (scripts, stylesheets, and images) is referenced in an HTML page, the browser makes an HTTP request and stores the component in memory while the HTML page is rendered. Though components are stored in the browser’s memory during rendering, they may or may not be stored in the browser’s cache. A “cache miss” refers to when the browser bypasses the cache and requests the component over the network. A “cache hit” refers to when the component is found in the cache and the corresponding HTTP requests are avoided.

Components are cacheable when they include either the Expires or the Cache-Control header.

      Expires: [Expiration time in GMT Format]
      Cache-Control: max-age=[Expiration time in seconds]

Components that do not have one of the above headers will not be cached by the browser. To discover the cache capabilities on the iPhone browser and get a cache hit, we configured the server to include the following response header:

      Expires: Thu, 15 Apr 2010 20:00:00 GMT
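For readers who generate these headers in code, here is a minimal sketch of producing both forms from one expiration time. `cacheHeaders` is a hypothetical helper, not something used in the experiment, and the "now" timestamp passed in below is an arbitrary illustrative value.

```javascript
// Sketch: derive both caching headers from a single expiration timestamp.
// expiresAtMs and nowMs are milliseconds since the Unix epoch.
function cacheHeaders(expiresAtMs, nowMs) {
    // max-age is relative (seconds from now); never emit a negative value.
    var maxAge = Math.max(0, Math.floor((expiresAtMs - nowMs) / 1000));
    return {
        "Expires": new Date(expiresAtMs).toUTCString(),
        "Cache-Control": "max-age=" + maxAge
    };
}

// Months are zero-based in Date.UTC, so 3 means April.
var expiresAt = Date.UTC(2010, 3, 15, 20, 0, 0);
var headers = cacheHeaders(expiresAt, Date.UTC(2008, 1, 6, 0, 0, 0));
// headers["Expires"] is "Thu, 15 Apr 2010 20:00:00 GMT"
```

Expires is an absolute date while max-age is relative to the time of the response, which is why the helper needs the current time as well.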

Maximum Cache Limits

In our experiments, we varied the size of different types of components (images, stylesheets, and scripts) to determine the maximum cache size for an individual component. We found that if the size of a component is greater than 25 KB, the iPhone’s browser does not cache the component. Thus, web pages designed specifically for the iPhone should reduce the size of each component to 25 KB or less for optimal caching behavior.

The good news is if the browser downloads a component larger than 25 KB, components already in the browser cache are not affected. Components already in the cache are only replaced by newer cacheable components under 25 KB using the LRU (least recently used) algorithm.

Apple’s website indicates a 10 MB limit for individual components. The limit applies to the browser’s ability to store a component in memory (not on disk). However, the actual size that the iPhone can handle is much smaller, and depends on memory fragmentation and other applications that may be running concurrently. The uncached components are reclaimed by the browser when the page unloads.

To determine the maximum limit of the iPhone cache for multiple components, we incremented the number of 25 KB components embedded in our page. We tested the various component types and found that the iPhone browser was able to cache a maximum of 19 external 25 KB components. The maximum cache limit for multiple components is therefore between 475 KB and 500 KB.

Compressed Components

We also analyzed the impact of the cache characteristics on components transmitted with and without compression. We were surprised to find that the 25 KB maximum cache limit for a component is independent of whether the component was sent gzipped. The Safari browser on the iPhone decodes the component before saving it to the cache. Therefore, only the uncompressed size matters, which further emphasizes the importance of keeping the size of components small.
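The limits described in this section can be turned into a simple pre-release check. The sketch below is only an illustration under stated assumptions: it treats 1 KB as 1024 bytes, uses the lower 475 KB figure as the total budget, and `auditForIphoneCache` and its input shape are hypothetical.

```javascript
var KB = 1024;                        // assumption: 1 KB = 1024 bytes
var MAX_COMPONENT_BYTES = 25 * KB;    // per-component limit reported above
var MAX_TOTAL_BYTES = 19 * 25 * KB;   // 19 components of 25 KB = 475 KB

// components: [{ url: string, uncompressedBytes: number }]
// Sizes must be uncompressed, since gzip does not change the limit.
function auditForIphoneCache(components) {
    var total = 0;
    var tooLarge = [];
    for (var i = 0; i < components.length; i++) {
        var c = components[i];
        if (c.uncompressedBytes > MAX_COMPONENT_BYTES) {
            tooLarge.push(c.url);     // over 25 KB: would not be cached
        } else {
            total += c.uncompressedBytes;
        }
    }
    return {
        tooLarge: tooLarge,
        cacheableBytes: total,
        fitsInCache: total <= MAX_TOTAL_BYTES
    };
}

var report = auditForIphoneCache([
    { url: "big.js", uncompressedBytes: 30 * KB },
    { url: "small.css", uncompressedBytes: 10 * KB }
]);
// report.tooLarge lists "big.js"; only small.css counts toward the total.
```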

The Effect of Power Cycle

Every once in a while, iPhone and iPod Touch users will need to force a hard reset, or in other words, cut the power and reboot the device. This is achieved by holding the sleep button for five seconds, followed by a simple slide to power off. Suppose a user was browsing your site just before the reset. Will the images and stylesheets still be in the browser’s cache when the user returns, ensuring a speedy response time? We discovered that the iPhone browser cache does not persist across power cycles. This means that the Safari browser cache on the iPhone allocates memory from the system memory to create cached components but does not save the cached components in persistent storage.

Case Study: Yahoo! Front Page

Yahoo! launched a beta version of the mobile home page at the Consumer Electronics Show (CES) in January 2008. From a performance standpoint, this makes perfect sense. The iPhone has an amazing UI, but it is limited by the small cache size and slow network speed. Downloading large components over the air through the EDGE network is slower compared to DSL. According to published reports, the typical data download speed varies from 82 kbps to 150 kbps. Though the WiFi network speed is usually more acceptable, it’s better to give users the choice in which experience they’d prefer. Let’s take a closer look at the caching characteristics of the mobile and desktop versions of Yahoo!’s Home Page on the iPhone. Figure 1 below shows a comparison between the two.

Figure 1. Yahoo! Front Page Mobile and Desktop versions on the iPhone


The desktop version of Yahoo!’s home page is roughly 11 times heavier in total size than the mobile version. As a result, the response time to load the desktop version on the iPhone is over 10 times as long. Table 1 shows a summary of the total size and number of HTTP requests to load the mobile version of the Yahoo! Front Page. Loaded on the iPhone over the EDGE network, the page took on average 2.2 seconds to load with an empty cache and only 1.5 seconds with a primed cache. Table 2 shows that the desktop version took on average 25.4 seconds to load with an empty cache and 19.9 seconds with a primed cache. In other words, a primed cache makes the mobile version load 32% faster, compared with only 22% faster for the desktop version. While the mobile site is designed for optimal caching behavior, the desktop version contains many more components that are uncacheable by the iPhone.

Table 1. iPhone Mobile Experience

                 Empty Cache    Primed Cache
HTML/Text        5K (23K*)      5K (23K*)
Images           14K            5K
Total Size       19K (37K*)     10K (28K*)
HTTP Requests    23             4
Response Time    2.2 sec        1.5 sec

Table 2. iPhone Desktop Experience

                 Empty Cache    Primed Cache
HTML/Text        32K (121K*)    32K (121K*)
Images           117K           32K
JS/CSS           74K (278K*)    73K (272K*)
Total Size       223K (517K*)   137K (425K*)
HTTP Requests    30             4
Response Time    25.4 sec       19.9 sec

* Uncompressed sizes measured in kilobytes.
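The primed-cache improvements quoted for the two versions follow directly from the response times in Tables 1 and 2. As a quick check, here is the arithmetic as a small sketch; `percentFaster` is a hypothetical helper name.

```javascript
// Percentage of the empty-cache load time saved by a primed cache.
function percentFaster(emptyCacheSeconds, primedCacheSeconds) {
    return (emptyCacheSeconds - primedCacheSeconds) / emptyCacheSeconds * 100;
}

var mobile = Math.round(percentFaster(2.2, 1.5));    // 32
var desktop = Math.round(percentFaster(25.4, 19.9)); // 22
```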

Takeaways

Design sites specifically for iPhone users. In addition to improved usability, you will also reduce the overall page weight and improve performance for end users. Yahoo!’s Exceptional Performance team identified 13 rules for making web pages fast. The iPhone cache experiment suggests an additional performance rule specific to developing web sites for the iPhone:

Reduce the size of each component to 25 KB or less for optimal caching behavior.

Given that the wireless network speed on the iPhone is limited and the browser cache is cleared across power cycles, it is even more important to make fewer HTTP requests to achieve good performance than in the desktop world. To help reduce the number of HTTP requests, Safari on the iPhone supports image maps, CSS sprites, inline images and inline CSS images. Take advantage of the browser cache whenever possible. If an external component can be shared across multiple pages in the site, remember that each individual component has to be smaller than 25 KB to be cacheable. Also, the maximum cache limit of all components is 475 to 500 KB. Minify all the JavaScript, CSS and HTML. For components that aren’t shared across multiple pages, consider making them inline.

By Tenni Theurer, February 6th, 2008

Hosting YUI Files for Implementations in Mainland China

Back in February 2007, we opened up hosting of YUI files on Yahoo’s content delivery network to all users, and we maintain a page describing how you can implement YUI while drawing all of its resources from our network. What we’ve heard from the YUI community is that having this choice is a big deal, and more than a billion YUI files were served from yui.yahooapis.com last week, a number that has grown steadily since we opened up that service.

The yui.yahooapis.com domain is an edge-hosted CDN, and it automatically draws files from data centers as close as possible to the source of the request, optimizing performance. While that works well in most locations, one area where we were seeing poor response times was in China, where a growing community of YUI users is located. To help improve performance for implementers serving the China market, we’re announcing today the availability of cn.yui.yahooapis.com, a CDN specifically for that region.

As of today, the following two paths will both work for retrieving the minified Yahoo Global Object:

  • http://cn.yui.yahooapis.com/2.4.1/build/yahoo/yahoo-min.js (China region)
  • http://yui.yahooapis.com/2.4.1/build/yahoo/yahoo-min.js (standard, global usage)

For most implementations, you’ll want to continue using the standard yui.yahooapis.com, but if your project primarily serves China, the new domain will improve your response times and deliver a better experience. For users in mainland China, we’ve seen as much as a 5x improvement in response times based on initial tests.

A bit more about this (in Chinese) on YUIBlog.cn, a blog created recently by the user experience team at Yahoo! China (a company in the Alibaba group). A big thanks to Hongwei Zeng, an engineer at Yahoo! China, for helping to make this arrangement possible.

By Eric Miraglia, January 15th, 2008

Performance Research, Part 4: Maximizing Parallel Downloads in the Carpool Lane

This article, co-written by Steve Souders, is the fourth in a series of articles describing experiments conducted to learn more about optimizing web page performance (Part 1, Part 2, Part 3). You may be wondering why you’re reading a performance article on the YUI Blog. It turns out that most of web page performance is affected by front-end engineering, that is, the user interface design and development.

Parallel Downloads

The biggest impact on end-user response times is the number of components in the page. Each component requires an extra HTTP request, perhaps not when the cache is full, but definitely when the cache is empty. Knowing that the browser performs HTTP requests in parallel, you may ask why the number of HTTP requests affects response time. Can’t the browser download them all at once?

The explanation goes back to the HTTP/1.1 spec, which suggests that browsers download two components in parallel per hostname. Many web pages download all their components from a single hostname. Viewing these HTTP requests reveals a stair-step pattern, as shown in Figure 1.

Figure 1. Downloading 2 components in parallel


If a web page evenly distributed its components across two hostnames, the page would load about twice as fast. The HTTP requests would look as shown in Figure 2, with four components downloaded in parallel (two per hostname). The horizontal width of the box is the same, to give a visual cue as to how much faster this page loads.

Figure 2. Downloading 4 components in parallel


Limiting parallel downloads to two per hostname is a guideline. By default, both Internet Explorer and Firefox follow the guideline, but users can override this default behavior. Internet Explorer stores the value in the Windows Registry. (See Microsoft Help and Support.) Firefox’s value is controlled by the network.http.max-persistent-connections-per-server setting, accessible in the about:config page.

It’s interesting to note that for HTTP/1.0, Firefox’s default is to download eight components in parallel per hostname. Figure 3 shows what it would look like to download these ten images if Firefox’s HTTP/1.0 settings are used. It’s even faster than Figure 2, and we didn’t have to split the images across two hostnames.

Figure 3. Downloading 8 components in parallel
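The stair-step pattern in Figures 1 through 3 can be approximated with a very simple model. The sketch below assumes every component takes the same time to download and ignores bandwidth and CPU limits, so it only counts how many sequential "steps" the browser needs. The function name `download_rounds` is made up for illustration.

```python
import math


def download_rounds(components: int, hostnames: int, per_host_limit: int = 2) -> int:
    """Count sequential download steps, assuming all downloads take equally long."""
    parallel = hostnames * per_host_limit
    return math.ceil(components / parallel)


# Ten components, matching the ten images discussed above.
print(download_rounds(10, hostnames=1))                    # 2 in parallel -> 5 steps
print(download_rounds(10, hostnames=2))                    # 4 in parallel -> 3 steps
print(download_rounds(10, hostnames=1, per_host_limit=8))  # 8 in parallel -> 2 steps
```

Going from five steps to three is where the "about twice as fast" estimate comes from. Real pages won't follow the model exactly, because components differ in size and the connection is shared.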


Most web sites today use HTTP/1.1, but the idea of increasing parallel downloads beyond two per hostname is intriguing. Instead of relying on users to modify their browser settings, front-end engineers could simply use CNAMEs (DNS aliases) to split their components across multiple hostnames. Maximizing parallel downloads doesn’t come without a cost. Depending on your bandwidth and CPU speed, too many parallel downloads can degrade performance.
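One way to split components across hostname aliases is to pick the alias from a hash of the component's path. The Python sketch below is only an illustration: the aliases `img1.example.com` and `img2.example.com` and the function names are hypothetical, and the article does not say how the components were assigned to hostnames in practice.

```python
import zlib

# Hypothetical aliases; in practice these would be CNAMEs for the same servers.
ALIASES = ("img1.example.com", "img2.example.com")


def alias_for(path, aliases=ALIASES):
    """Pick an alias deterministically so a component always uses the same hostname."""
    index = zlib.crc32(path.encode("utf-8")) % len(aliases)
    return aliases[index]


def component_url(path):
    """Build the full URL for a component path such as '/images/logo.png'."""
    return f"http://{alias_for(path)}{path}"


for p in ["/images/logo.png", "/images/nav.png", "/images/footer.png"]:
    print(component_url(p))
```

The assignment is deterministic on purpose. If the same image were served from a different hostname on each page view, its URL would change and the browser could not reuse its cached copy.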

If browsers limit the number of parallel downloads to two (per hostname over HTTP/1.1), this raises the question:

What if we use additional aliases to increase parallel downloads in our pages?

We’ve seen a couple of great blog posts and articles written recently on the subject, most notably by Ryan Breen of Gomez and Aaron Hopkins over at Google. Here’s another spin. The performance team at Yahoo! ran an experiment to measure the impact of using various numbers of hostname aliases. The experiment measured an empty HTML document with 20 images on the page. The images were fetched from the same servers as those used by real Yahoo! pages. We ran the experiment in a controlled environment using a test harness that fetches a set of URLs repeatedly while measuring how long it takes to load the page on DSL. The results are shown in Figure 4.

Figure 4. Loading an Empty HTML Document with 20 Images Using Various Numbers of Aliases


Note: Times are for page loads on DSL (~800 kbps) with the aliases already cached and the file cache empty.

In our experiment, we varied the number of aliases: 1, 2, 4, 5, and 10. This increased the number of parallel downloads to 2, 4, 8, 10, and 20 respectively. We fetched 20 smaller-sized images (36 x 36 px) and 20 medium-sized images (116 x 61 px). To our surprise, for the medium-sized images, response times got worse once we used four or more aliases. For the smaller-sized images, going beyond two aliases made little difference to the overall response time. On average, using two aliases is best.

One possible contributor to the slower response times is CPU thrashing on the client caused by the larger number of parallel downloads. The more images that are downloaded in parallel, the greater the amount of CPU thrashing on the client. On my laptop at work, the CPU jumped from 25% usage for 2 parallel downloads to 40% usage for 20 parallel downloads. These values can vary significantly across users’ computers, but they are another factor to consider before increasing the number of aliases to maximize parallel downloads.

These results are for the case where the domains are already cached in the browser. In the case where the domains are not cached, the response times get significantly worse as the number of hostname aliases increases. For web pages that want to optimize the experience for first-time users, we recommend not increasing the number of domains. To optimize for the second page view, where the domains are most likely cached, increasing parallel downloads does improve response times. The choice depends on which scenario is most typical.

Another issue to consider is that DNS lookup times vary significantly across ISPs and geographic locations. Typically, DNS lookup times for users from non-US cities are significantly higher than those for users within the US. If a good percentage of your users are coming from outside the US, the benefit of increasing parallel downloads is offset by the time needed to make many DNS lookups.

Our rule of thumb is to increase the number of parallel downloads by using at least two, but no more than four hostnames. Once again, this underscores the number one rule for improving response times: reduce the number of components in the page.

Steve Souders is also writing a series of blog posts on Yahoo! Developer Network describing best practices he’s developed at Yahoo! for improving performance (Part 1, Part 2).

By Tenni Theurer, April 11th, 2007

Performance Research, Part 3: When the Cookie Crumbles

This article, co-written by Patty Chi, is the third in a series of articles describing experiments conducted to learn more about optimizing web page performance (Part 1, Part 2). You may be wondering why you’re reading a performance article on the YUI Blog. It turns out that most of web page performance is affected by front-end engineering — that is, the user interface design and development.

HTTP cookies are used for a variety of reasons such as authentication and personalization. Information about cookies is exchanged in the HTTP headers between web servers and browsers. This article discusses the impact of cookies on the overall user response time.

HTTP Quick Review

Cookies originate from web servers when browsers request a page. Here is a sample HTTP header sent by the web server after a request for www.yahoo.com:

  HTTP/1.1 200 OK
  Content-Type: text/html; charset=utf-8
  Set-Cookie: C=abcde; path=/; domain=.yahoo.com

The header includes information about the response such as the protocol version, status code, and content-type. The Set-Cookie header is also included in the response, and in this example the name of the cookie is “C” and the value of the cookie is “abcde”. Note: The maximum size of a cookie is 5051 bytes in IE 6.0 and 4096 bytes in Firefox 1.5.

The browser saves the “C” cookie on the user’s computer and sends it back in future requests. The “domain=.yahoo.com” specifies that the browser should include the cookie in future requests within the .yahoo.com domain and all its sub-domains. For example, if the user then visits finance.yahoo.com, the browser includes the “C” cookie in the request. Since an Expires attribute is not included in this example, the cookie expires at the end of the session.

Here is a sample HTTP header for finance.yahoo.com sent by the browser:

  GET / HTTP/1.1
  Host: finance.yahoo.com
  User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; ...
  Cookie: C=abcde;

Notice that the “C” cookie, originating from www.yahoo.com, is also included in the request for finance.yahoo.com.
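The exchange above can be mimicked in a few lines of Python. This is a simplified sketch, not a description of how any browser is implemented: it parses the sample Set-Cookie value with the standard library's `http.cookies.SimpleCookie` and applies a basic domain check, ignoring path, expiry, and cookies set without a domain attribute. The helper names `domain_matches` and `cookie_header` are hypothetical.

```python
from http.cookies import SimpleCookie


def domain_matches(host: str, cookie_domain: str) -> bool:
    """Simplified check: host equals the cookie domain or is a sub-domain of it."""
    bare = cookie_domain.lstrip(".").lower()
    host = host.lower()
    return host == bare or host.endswith("." + bare)


def cookie_header(set_cookie: str, host: str) -> str:
    """Return the Cookie header value that would be sent to `host` ('' if none)."""
    jar = SimpleCookie()
    jar.load(set_cookie)
    pairs = [
        f"{name}={morsel.value}"
        for name, morsel in jar.items()
        if domain_matches(host, morsel["domain"])
    ]
    return "; ".join(pairs)


sample = "C=abcde; path=/; domain=.yahoo.com"
print(cookie_header(sample, "finance.yahoo.com"))  # the "C" cookie is included
print(cookie_header(sample, "www.example.com"))    # nothing is sent
```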

Impact of cookies on response time

The performance team at Yahoo! ran an experiment to measure the impact of retrieving a document with various cookie sizes. The experiment measured a static HTML document with no elements in the page. The primary variable in the experiment was the cookie size. We ran the experiment using a test harness that fetches a set of URLs repeatedly while measuring how long it takes to load the page on DSL. The results are shown in Table 1.

Table 1. Response times for various cookie sizes

Cookie Size   Median Response Time (Delta)
0 bytes       78 ms (0 ms)
500 bytes     79 ms (+1 ms)
1000 bytes    94 ms (+16 ms)
1500 bytes    109 ms (+31 ms)
2000 bytes    125 ms (+47 ms)
2500 bytes    141 ms (+63 ms)
3000 bytes    156 ms (+78 ms)

Note: Times are for page loads on DSL (~800 kbps).

These results highlight the importance of keeping the size of cookies as low as possible to minimize the impact on the user’s response time. A 3000 byte cookie, or multiple cookies that total 3000 bytes, could add as much as an 80 ms delay for users on DSL bandwidth speeds. The delay is even worse for users on dial-up.
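To make the trend in Table 1 concrete, the short Python sketch below recomputes the delta column and estimates the average added delay per 1000 bytes of cookies. The numbers are copied from the table. The straight-line estimate over the 500 to 3000 byte range is our own simplification, not part of the original experiment.

```python
# (cookie size in bytes, median response time in ms), copied from Table 1.
TABLE_1 = [(0, 78), (500, 79), (1000, 94), (1500, 109),
           (2000, 125), (2500, 141), (3000, 156)]

baseline_ms = TABLE_1[0][1]

# Delay relative to the cookie-free request, as in the table's "(Delta)" column.
deltas = {size: ms - baseline_ms for size, ms in TABLE_1}

# Average added delay per extra 1000 bytes between 500 and 3000 bytes.
(size_lo, ms_lo), (size_hi, ms_hi) = TABLE_1[1], TABLE_1[-1]
ms_per_1000_bytes = (ms_hi - ms_lo) / (size_hi - size_lo) * 1000

print(deltas)                       # {0: 0, 500: 1, 1000: 16, ..., 3000: 78}
print(round(ms_per_1000_bytes, 1))  # 30.8
```

In other words, beyond the first 500 bytes, each additional 1000 bytes of cookies cost roughly 31 ms in this experiment.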

How big are users’ cookies set at the .yahoo.com domain?

Cookies set at the .yahoo.com domain affect the overall response time for users visiting web pages across the Yahoo! network. Figure 1 shows the percentages of Yahoo!’s total page views with various cookie sizes set at the .yahoo.com domain.

Figure 1. Percentage of Page Views with Various Cookie Sizes


About 80% of page views have fewer than 1000 bytes of cookies, which correlates to about a 5 to 15 ms delay for users on DSL bandwidth speeds. While the data shows that the majority of page views aren’t impacted by a significant delay, it also shows that about 2% of page views have over 1500 bytes of cookies set at the .yahoo.com domain. Although 2% sounds insignificant, at Yahoo! this translates to millions of page views per day, a compelling motivation for us to investigate this 2% and eliminate unnecessary cookies, reduce cookie sizes, and set cookies at more granular domain levels.

In an earlier post about browser cache usage, one user made a comment about the side-effects of different browsers. Since Internet Explorer and Firefox have different implementations for the maximum size and number of cookies supported, we also analyzed the data by browser type and found no significant difference between the cookie sizes. It would be interesting to further investigate whether there is a difference in performance across browsers.

Analysis of Cookie Sizes across the Web

To show how Yahoo!’s cookie usage relates to those of other companies, we analyzed the cookies set by other popular web sites. For this experiment, we cleared all our cookies and visited only the home pages of these web sites. Table 2 shows between 60 and 500 bytes of cookie information included in the HTTP headers.

Table 2. Total Cookie Sizes

Web Site   Total Cookie Size
Amazon     60 bytes
Google     72 bytes
Yahoo      122 bytes
CNN        184 bytes
YouTube    218 bytes
MSN        268 bytes
eBay       331 bytes
MySpace    500 bytes

Note: We only requested the home page.

The data in Table 2 reflects only cookies set at the top domain levels to eliminate any cookies that may have been set by ads. The total cookie size for Yahoo! (122 bytes) in Table 2 differs from the cookie sizes indicated in Figure 1 because in this experiment we visited only the home page of each web site. The data in Figure 1 reflects real users, many of whom visit multiple Yahoo! web pages. To illustrate, if tv.yahoo.com and movies.yahoo.com wanted to share information within a cookie, the cookie must be set at the .yahoo.com domain. The total cookie size set at the .yahoo.com domain for a user who visits multiple Yahoo! sub-domains is typically higher than the total cookie size set for a user who only visits www.yahoo.com.

Setting cookies at the appropriate path and domain is just as important as the size of the cookie, if not more. A cookie set at the .yahoo.com domain impacts the response time for every Yahoo! page in the .yahoo.com domain that a user visits.
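The effect of the domain level can be sketched by counting how many cookie bytes each request would carry. The cookie names, values, and sizes below are invented for illustration, and the domain check is a simplification that ignores path and expiry.

```python
def domain_matches(host, cookie_domain):
    """Simplified check: host equals the cookie domain or is a sub-domain of it."""
    bare = cookie_domain.lstrip(".").lower()
    host = host.lower()
    return host == bare or host.endswith("." + bare)


# Hypothetical cookies: (name, value, domain the cookie was set at).
COOKIES = [
    ("C", "abcde", ".yahoo.com"),            # shared across all sub-domains
    ("tvprefs", "x" * 200, "tv.yahoo.com"),  # scoped to a single sub-domain
]


def cookie_bytes(host, cookies=COOKIES):
    """Size in bytes of the Cookie header value a request to `host` would carry."""
    header = "; ".join(f"{n}={v}" for n, v, d in cookies if domain_matches(host, d))
    return len(header.encode("utf-8"))


for host in ["www.yahoo.com", "movies.yahoo.com", "tv.yahoo.com"]:
    print(host, cookie_bytes(host))
```

Only requests to tv.yahoo.com pay for the larger cookie here. Had it been set at .yahoo.com instead, every host in the list would carry it.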

Takeaways

  • Eliminate unnecessary cookies.
  • Keep cookie sizes as low as possible to minimize the impact on the user response time.
  • Be mindful of setting cookies at the appropriate domain level so other sub-domains are not affected.
  • Set an Expires date appropriately. An earlier Expires date or none removes the cookie sooner, improving the user response time.

By Tenni Theurer, March 1st, 2007
© 2013 YUI Blog