Lord Yupa

Danger Mouse

Mozilla oddity? [Updated]

I finally got around to setting up mod_gzip again (I broke it quite a while back, and just hadn't gotten around to fixing it).

One thing I noticed is that Mozilla doesn't always "work" correctly with it.

For example, when I try to get http://topher.zyp.org (a test page I use for playing with HTML, XHTML, CSS, etc.), Mozilla somehow manages to request the page incorrectly. Here's the Apache log:

tconl171104.tconl.com - - [14/Nov/2001:02:17:29 -0600] "GET / HTTP/1.1" 304 - "-" "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.5) Gecko/20011011" mod_gzip: SEND_AS_IS:NO_200 In:0 Out:0 Ratio:0pct.

Here's what it should look like (taken from lynx and w3m):

topher.zyp.org - - [14/Nov/2001:02:16:03 -0600] "GET / HTTP/1.0" 200 852 "-" "Lynx/2.8.4rel.1 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.6b" mod_gzip: OK In:1794 Out:852 Ratio:53pct.

topher.zyp.org - - [14/Nov/2001:02:20:21 -0600] "GET / HTTP/1.0" 200 852 "-" "w3m/0.2.1-inu-1.5" mod_gzip: OK In:1794 Out:852 Ratio:53pct.

So, why isn't Mozilla getting it? What's really odd, too, is that if I manually gzip index.html (to index.html.gz) and request index.html (as opposed to "/"), then Mozilla gets the gzipped file just fine.

And I know Mozilla can handle gzipped files (I've played with manually compressed files in it before). Then I remembered that in the Preferences, under Debug, there's a list of acceptable encoding types. So, I glanced at that:

gzip, deflate, compress;q=0.9
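
To make that preference string a bit more concrete, here is a rough Python sketch of how a server might read it, assuming the value is sent as-is in the Accept-Encoding request header. The function name `parse_accept_encoding` is made up for illustration, and the parsing is deliberately simplified (it ignores wildcards and malformed values).

```python
def parse_accept_encoding(header: str) -> list[tuple[str, float]]:
    """Split an Accept-Encoding value into (coding, q) pairs, highest q first."""
    codings = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        q = 1.0  # a coding listed without a q parameter counts as q=1
        for param in params.split(";"):
            key, _, value = param.strip().partition("=")
            if key.strip().lower() == "q" and value:
                q = float(value)
        codings.append((name.strip().lower(), q))
    # sorted() is stable, so codings with equal q keep their original order
    return sorted(codings, key=lambda pair: pair[1], reverse=True)


prefs = parse_accept_encoding("gzip, deflate, compress;q=0.9")
print(prefs)  # [('gzip', 1.0), ('deflate', 1.0), ('compress', 0.9)]
```

Read this way, gzip is among the most preferred codings, so the preference list itself doesn't look like the reason for an uncompressed response.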

Well, I'm confoozled. I can't think of any reason why Mozilla wouldn't be getting/requesting the file correctly. Oh, and what's really strange is that every once in a while Mozilla will get the file properly, compressed. And it works fine when I request non-static files with Mozilla:

tconl171104.tconl.com - - [14/Nov/2001:02:25:04 -0600] "GET /test/check.php HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.5) Gecko/20011011" mod_gzip: DECHUNK:OK In:607 Out:396 Ratio:35pct.
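
To compare entries like these side by side, a small Python sketch along the following lines can pull out the status code and the mod_gzip fields. The regex is written only against the log lines quoted in this post; the names `LOG_RE` and `parse_line` are hypothetical, and a different LogFormat would need a different pattern.

```python
import re

# Pattern for the log lines quoted in this post: a combined-style access log
# entry followed by a trailing "mod_gzip: ..." field.
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)" '
    r'mod_gzip: (?P<result>\S+) In:(?P<bytes_in>\d+) Out:(?P<bytes_out>\d+) '
    r'Ratio:(?P<ratio>\d+)pct\.?$'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it doesn't match."""
    m = LOG_RE.match(line.strip())
    if m is None:
        return None
    entry = m.groupdict()
    # Convert the numeric fields; "size" stays a string because it can be "-".
    for key in ("status", "bytes_in", "bytes_out", "ratio"):
        entry[key] = int(entry[key])
    return entry


mozilla = parse_line(
    'tconl171104.tconl.com - - [14/Nov/2001:02:17:29 -0600] "GET / HTTP/1.1" '
    '304 - "-" "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.5) '
    'Gecko/20011011" mod_gzip: SEND_AS_IS:NO_200 In:0 Out:0 Ratio:0pct.'
)
print(mozilla["status"], mozilla["result"])  # 304 SEND_AS_IS:NO_200
```

The failing Mozilla entry is the only one with a 304 status rather than 200, and its mod_gzip result (NO_200) presumably reflects exactly that: there was no 200 response body to compress.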

I love mod_gzip, I just wish things were a little less finicky with regards to it.

[Update]: I'm not so sure this is Mozilla, after all. In fact, I'm now pretty sure it isn't. I just did a little playing with IE, and one thing I noticed is that it seems to correctly get the page, compressed, the first time it requests it. After that, it doesn't. I'm wondering if this is down to caching: Mozilla/IE are checking whether the page has been updated since it was last viewed, and if not, they're simply redisplaying the cached copy.

[Update II]: Okay, I'm almost positive my update theory is right. After touching the index.html file and then doing another page refresh, Mozilla/IE get the page correctly.

Chalk this one up to me not thinking things through entirely before posting about it. (Although, I did finally get my thoughts where they needed to be, and only a few minutes after my original post. ;-)


What's happening:
14.25 If-Modified-Since

   The If-Modified-Since request-header field is used with a method to
   make it conditional: if the requested variant has not been modified
   since the time specified in this field, an entity will not be
   returned from the server; instead, a 304 (not modified) response will
   be returned without any message-body.

       If-Modified-Since = "If-Modified-Since" ":" HTTP-date

   An example of the field is:

       If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT

   A GET method with an If-Modified-Since header and no Range header
   requests that the identified entity be transferred only if it has
   been modified since the date given by the If-Modified-Since header.
   The algorithm for determining this includes the following cases:

      a) If the request would normally result in anything other than a
         200 (OK) status, or if the passed If-Modified-Since date is
         invalid, the response is exactly the same as for a normal GET.
         A date which is later than the server's current time is
         invalid.

      b) If the variant has been modified since the If-Modified-Since
         date, the response is exactly the same as for a normal GET.

      c) If the variant has not been modified since a valid If-
         Modified-Since date, the server SHOULD return a 304 (Not
         Modified) response.

   The purpose of this feature is to allow efficient updates of cached
   information with a minimum amount of transaction overhead.

      Note: The Range request-header field modifies the meaning of If-
      Modified-Since; see section 14.35 for full details.

      Note: If-Modified-Since times are interpreted by the server, whose
      clock might not be synchronized with the client.

      Note: When handling an If-Modified-Since header field, some
      servers will use an exact date comparison function, rather than a
      less-than function, for deciding whether to send a 304 (Not
      Modified) response. To get best results when sending an If-
      Modified-Since header field for cache validation, clients are
      advised to use the exact date string received in a previous Last-
      Modified header field whenever possible.

      Note: If a client uses an arbitrary date in the If-Modified-Since
      header instead of a date taken from the Last-Modified header for
      the same request, the client should be aware of the fact that this
      date is interpreted in the server's understanding of time. The
      client should consider unsynchronized clocks and rounding problems
      due to the different encodings of time between the client and
      server. This includes the possibility of race conditions if the
      document has changed between the time it was first requested and
      the If-Modified-Since date of a subsequent request, and the
      possibility of clock-skew-related problems if the If-Modified-
      Since date is derived from the client's clock without correction
      to the server's clock. Corrections for different time bases
      between client and server are at best approximate due to network
      latency.

   The result of a request having both an If-Modified-Since header field
   and either an If-Match or an If-Unmodified-Since header fields is
   undefined by this specification.
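
Cases (a) through (c) quoted above can be sketched in Python roughly as follows. This is only an illustration of the decision, not how Apache actually implements it; the function name `conditional_get_status` and its parameters are made up for the example, and both `last_modified` and `now` are assumed to be timezone-aware datetimes.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get_status(if_modified_since, last_modified, now, normal_status=200):
    """Return the status a server might send for a GET with If-Modified-Since."""
    # Case (a): the request would not normally give 200, so respond as usual.
    if normal_status != 200:
        return normal_status
    # Case (a): an unparseable date is invalid, so respond as for a normal GET.
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200
    if since.tzinfo is None:
        # Treat a date without a usable zone as UTC (an assumption of this sketch).
        since = since.replace(tzinfo=timezone.utc)
    # Case (a): a date later than the server's current time is also invalid.
    if since > now:
        return 200
    # HTTP dates have one-second resolution, so drop sub-second precision.
    modified = last_modified.replace(microsecond=0)
    # Case (b): modified since the given date, so send the full response.
    if modified > since:
        return 200
    # Case (c): not modified, so answer 304 with no message body.
    return 304


header = "Sat, 29 Oct 1994 19:43:31 GMT"
unchanged = datetime(1994, 10, 29, 19, 43, 31, tzinfo=timezone.utc)
touched = datetime(2001, 11, 14, 8, 30, 0, tzinfo=timezone.utc)
now = datetime(2001, 11, 14, 9, 0, 0, tzinfo=timezone.utc)

print(conditional_get_status(header, unchanged, now))  # 304
print(conditional_get_status(header, touched, now))    # 200
```

The second call mirrors what touching index.html does: the file's modification time moves past the date the browser sends, so the server goes back to a full 200 response, which mod_gzip can then compress.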

Yeah. . .

I figured it was the browser just checking with the webserver to ensure that the copy it had cached was still the most recent copy.

Thanks for the details, though. It's always good to have a theory verified, and to learn the low-level details that explain your high-level theory. ;-)