
SDCH: Shared Dictionary Compression over HTTP

Here's something new in HTTP land to play with: Shared Dictionary Compression over HTTP (SDCH, apparently pronounced "sandwich") is a new HTTP 1.1 extension announced by Wei-Hsin Lee of Google last September. Lee explains that with it "a user agent obtains a site-specific dictionary that then allows pages on the site that have many common elements to be transmitted much more quickly." SDCH is applied before gzip or deflate compression, and Lee notes 40% better compression than gzip alone in their tests. Access to the dictionaries stored in the client is scoped by site and path just as cookies are.
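To see whether a given server will negotiate SDCH, it's enough to advertise the encoding and watch the response headers. Here's a rough Python sketch of such a probe; the Get-Dictionary response header name comes from the SDCH proposal, and whether a particular URL actually triggers it is not guaranteed:

    # Minimal SDCH probe: advertise sdch in Accept-Encoding and look for the
    # dictionary-related response headers described in the SDCH proposal.
    # Header names (Get-Dictionary) are taken from the draft; treat this as a
    # sketch, not a spec-complete client.
    import http.client

    def probe_sdch(host, path="/"):
        conn = http.client.HTTPConnection(host)
        conn.request("GET", path, headers={
            "Accept-Encoding": "sdch,gzip",    # offer SDCH alongside gzip
            "User-Agent": "sdch-probe/0.1",
        })
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection closes cleanly
        print("Content-Encoding:", resp.getheader("Content-Encoding"))
        print("Get-Dictionary:  ", resp.getheader("Get-Dictionary"))
        conn.close()

    if __name__ == "__main__":
        # Google's servers are the only ones known to support SDCH so far.
        probe_sdch("www.google.com", "/search?q=sdch")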

The first client support was in the Google Toolbar for Internet Explorer, but it is now going to be much more widely used because it is supported in the Google Chrome browser for Windows. (It's still not in the latest Chrome developer build for Linux, or at any rate not enabled by default if the code is there.)

Only Google's web servers support it to date, as far as I know. Someone intended to start a mod_sdch project for Apache, but there's no code at all yet and no activity since September 2008.

It is interesting to consider the challenge this will pose for HTTP proxies that filter content, since the entire content would not be available to the proxy to scan during a single HTTP conversation. Sneakily split malicious payloads would then be reassembled by the browser or other client, without requiring JavaScript or other active reassembly methods. This forum thread discusses the threat and gives an example of stripping the Accept-Encoding: sdch request header to prevent SDCH from being used at all. Though the threat is real, it's hard to escape the obvious analogy with TCP filtering, which had to grow from stateless inspection to more difficult stateful packet inspection. New features bring not just new benefits but also new complexity, but that's no reason to reflexively reject them.
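For anyone curious what that header stripping looks like in practice, here is a minimal sketch of the Accept-Encoding rewrite a filtering proxy might apply (just the rewrite function, not a full proxy):

    # Sketch of the Accept-Encoding rewrite a filtering proxy could apply to
    # keep SDCH from being negotiated: drop the "sdch" token and pass every
    # other token through untouched.
    def strip_sdch(accept_encoding):
        """Return the Accept-Encoding value with any sdch token removed."""
        tokens = [t.strip() for t in accept_encoding.split(",")]
        kept = [t for t in tokens if t.split(";")[0].strip().lower() != "sdch"]
        return ", ".join(kept)

    print(strip_sdch("sdch, gzip, deflate"))     # -> gzip, deflate
    print(strip_sdch("gzip;q=1.0, sdch;q=0.5"))  # -> gzip;q=1.0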


2 comments:

Anonymous said...

Stripping headers won't have any effect. Web servers will know what user agent is being used, and can choose to serve sdch content (which will work) regardless...

Payloads which can get interpreted as either sdch or not can even be used to probe reception regardless of any headers at all. I've been doing this for years with gzip encoding (I crafted a "hacked" gzip page which (if gzip works) sets one cookie, but if gzip fails, sets a different one - so the rest of my site knows what the browser understands regardless of MitM interference.)

Not to mention - proxies can't screw with headers over SSL anyhow...
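[Editor's note: here is one way an encoding probe like the one described above could be put together, a hypothetical sketch rather than the commenter's actual page: the response headers set a fallback cookie, and the gzip-compressed body overwrites it with JavaScript only if the browser actually decodes the body.]

    # Hypothetical gzip encoding probe (not the commenter's actual code).
    # The Set-Cookie header marks gzip as broken; the gzip-compressed body,
    # if decoded and executed, overwrites that cookie to mark gzip as working.
    import gzip
    from http.server import BaseHTTPRequestHandler, HTTPServer

    PAGE = b"""<html><body><script>
    document.cookie = "enc=gzip-ok; path=/";  // only runs if the body decoded
    </script>gzip probe</body></html>"""

    class ProbeHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = gzip.compress(PAGE)
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Encoding", "gzip")
            self.send_header("Set-Cookie", "enc=gzip-broken; path=/")  # fallback
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("", 8080), ProbeHandler).serve_forever()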

Anonymous said...

SSL inspection / cert spoofing allows proxies to mess with https headers just fine.

Also, a proxy that can strip sdch from the Accept-Encoding header can also alter the User-Agent, or reject content that comes back marked as sdch-encoded and present an error to the client instead. Not useful, but it can be done.