Yahoo! Hack Day: Exceptional Performance

I’m at Yahoo’s Hack Day, listening to the presentation on efficiency and performance.
Steve Souders and Tenni Theurer are presenting.

Where is the efficiency coming from?
80/20 rule: only 10 to 20% of the response time is spent in Apache retrieving and returning the HTML document. The rest, 80 to 90%, is spent somewhere else.

Browsers spend that other 80% of the time getting external files: CSS, JS, images…
What about cache on the user’s side?
First page view: 41 components, far-future expires header on most items
Subsequent page view: 4 elements.
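As a rough illustration of what a far-future expires header looks like, here is a minimal Python sketch. The function name far_future_headers and the ten-year lifetime are assumptions made for this example; the talk did not specify a value or show any code.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def far_future_headers(now, years=10):
    """Build caching headers that expire `years` years after `now`.

    `now` must be a timezone-aware UTC datetime. The 10-year default is an
    assumption for illustration, not a figure from the talk.
    """
    max_age = years * 365 * 24 * 60 * 60  # seconds, ignoring leap days
    expires = now + timedelta(seconds=max_age)
    return {
        # HTTP dates are written in GMT, e.g. "Thu, 01 Jan 2030 00:00:00 GMT"
        "Expires": format_datetime(expires, usegmt=True),
        "Cache-Control": f"max-age={max_age}",
    }


headers = far_future_headers(datetime(2020, 1, 1, tzinfo=timezone.utc))
for name, value in headers.items():
    print(f"{name}: {value}")
```

A server would attach these headers to images, CSS and JavaScript so that a browser with a primed cache does not need to request them again.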

Does this benefit the user? How many users come in with empty cache? How many page views happen with empty cache?

Experiment: Serve an image with an expiry date in the past, so the browser has to check with the server on every page view. The status code of the response tells you who has the image in their cache (200 means not in cache, 304 means in cache).
Results: 40-60% of users come in with an empty cache!
Keep in mind: Empty Cache User Experience! There is no easy way out. Huge difference between empty and full cache. It is important to optimize both experiences.
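The counting step of the experiment could be sketched like this in Python. The function name empty_cache_share and the sample list of status codes are made up for illustration; they are not the data from the talk.

```python
from collections import Counter


def empty_cache_share(status_codes):
    """Return the fraction of image requests answered with 200.

    200 means the full image was sent (empty cache); 304 means the browser
    already had it (primed cache). Other status codes are ignored.
    """
    counts = Counter(status_codes)
    empty = counts[200]
    primed = counts[304]
    total = empty + primed
    if total == 0:
        return 0.0
    return empty / total


# Hypothetical status codes pulled from an access log for the test image.
sample_log = [200, 304, 200, 304, 304, 200, 200, 304, 200, 304]
share = empty_cache_share(sample_log)
print(f"Empty-cache share: {share:.0%}")
```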

How can you apply this to build a lightning-fast hack?
Since hacks have almost no back-end and are all front-end, what can you do?
Sure, optimization usually happens later on and is not crucial to a hack, but the hack still needs to run as well as possible.

Profiling Tetris Game:

Using the IBM Page Detailer packet sniffer, we analyze the page.
Before optimization: Nothing had an expires header and nothing was gzipped! 112 Kbytes, 9 HTTP requests.
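To get a feel for how much gzipping saves on text files such as JavaScript, here is a small Python sketch. The helper gzip_savings and the repeated one-line script are invented for this example; real files will compress less dramatically than such repetitive input.

```python
import gzip


def gzip_savings(body: bytes):
    """Return (original size, gzipped size) in bytes for a response body."""
    compressed = gzip.compress(body)
    return len(body), len(compressed)


# Made-up stand-in for a JavaScript file; not the Tetris code from the talk.
script = b"function dropPiece(board, piece) { return board.concat(piece); }\n" * 200
original, compressed = gzip_savings(script)
print(f"{original} bytes before, {compressed} bytes after gzip")
```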

After optimization: 6 items, 47 Kbytes. Far-future expires header on images, CSS, JavaScript. The catch is that once a file is in the cache, changing it on the server won’t cause it to be reloaded. To get around that, use file names that include a version number or a date, and change the name whenever the file changes.
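The file-naming trick might look something like this in Python. The function versioned_name, the underscore separator and the example path are all assumptions for illustration; any scheme works as long as the name changes when the file does.

```python
import posixpath


def versioned_name(path, version):
    """Insert a version string (a release number or a date) before the extension."""
    root, ext = posixpath.splitext(path)
    return f"{root}_{version}{ext}"


# Hypothetical asset path; bump the version whenever the file changes.
print(versioned_name("js/tetris.js", "1.2"))
print(versioned_name("css/tetris.css", "20060930"))
```

Because the page then references a new URL, browsers fetch the new file even though the old one is still cached with a far-future expiry.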

Result: an 18 ms response time with a full cache and far-future expires headers, versus 427 ms for the original with an empty cache.


ETags let the server attach an arbitrary string to a file as a version identifier, which the browser sends back when it checks whether its cached copy is still good. The way Apache implements them is not a good idea, though: part of the ETag is server-specific (the inode), so the ETags from two different servers won’t match, and this defeats the purpose of caching! Turn ETags off if you have more than one server. If you rely on ETags in that setup, you’re basically disabling the 304 response: the file is downloaded again even though it hasn’t changed (only the ETag is different).
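The multi-server ETag problem can be modelled in a few lines of Python. This is a simplified sketch, not Apache’s actual code: the functions make_etag and revalidate, the hex formatting, and the inode, size and timestamp values are all assumptions made for the example.

```python
def make_etag(size, mtime, inode=None):
    """Build an ETag from file attributes, optionally including the inode."""
    parts = [size, mtime] if inode is None else [inode, size, mtime]
    return '"' + "-".join(format(part, "x") for part in parts) + '"'


def revalidate(browser_etag, server_etag):
    """Return 304 if the browser's cached ETag matches the server's, else 200."""
    return 304 if browser_etag == server_etag else 200


# The same file (same size and modification time) deployed on two servers
# usually lands on different inodes. All numbers here are made up.
size, mtime = 4821, 1159574400
server_a = make_etag(size, mtime, inode=1001)
server_b = make_etag(size, mtime, inode=2002)

# First request hits server A, revalidation hits server B.
print(revalidate(server_a, server_b))  # full download although nothing changed

# Without the inode, both servers produce the same ETag.
print(revalidate(make_etag(size, mtime), make_etag(size, mtime)))
```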

We’re hiring!
souders at …

They need technology evangelists who spread the code-well word!
