Browser caches are private caches (caches for a single user) that reduce latency and network traffic and thus shorten the time needed to display resource representations. Because HTTP is a client-server protocol, servers can't contact caches and clients when a resource changes; they have to communicate caching instructions via HTTP response headers.
Primary cache key
Browsers cache the HTTP response to an HTTP GET request (depending on the caching headers). The browser's primary cache key consists of the request method and target URI (often only the URI is used, because only GET requests are caching targets). The relevant excerpt from the HTTP/1.1 spec:
The primary cache key consists of the request method and target URI. However, since HTTP caches in common use today are typically limited to caching responses to GET, many caches simply decline other methods and use only the URI as the primary cache key.
The URI consists of the scheme, authority (host and port), path, and query parameters, but not the fragment, as the fragment is not part of the HTTP request.
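The key construction described above can be sketched as follows. This is a minimal illustration, not how any particular browser implements its cache; the function name primary_cache_key and the example URLs (with their query strings and fragments) are assumptions made for the example.

```python
from typing import Tuple
from urllib.parse import urldefrag


def primary_cache_key(method: str, url: str) -> Tuple[str, str]:
    """Build an illustrative primary cache key: (method, URI without fragment)."""
    # The fragment never reaches the server, so it is dropped from the key.
    uri_without_fragment, _fragment = urldefrag(url)
    return (method.upper(), uri_without_fragment)


# Hypothetical URLs: the first two differ only in their fragment.
key_a = primary_cache_key("GET", "https://clarifyforme.com/post?id=1#comments")
key_b = primary_cache_key("GET", "https://clarifyforme.com/post?id=1#summary")
key_c = primary_cache_key("GET", "https://clarifyforme.com/post?id=2")

print(key_a == key_b)  # True: same method, same URI once the fragment is removed
print(key_a == key_c)  # False: the query parameters differ
```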
Note that a primary cache key may have multiple stored responses, differentiated by a secondary key. This is accomplished using the Vary response header. Only origin servers know which representations are available, so it is the origin server's responsibility to tell a cache which request headers cause it to generate a different representation. To do so, the origin server must add a Vary header listing the names of those request headers. When a cache sees a response from an origin server with, for instance, the header Vary: Accept-Language, it examines the value of the request's Accept-Language header, such as en-US, and uses this value to construct a more specific cache key, perhaps like https://clarifyforme.com/post_en_US.
When a cache (including the browser cache) receives a request, it must not use a cached response unless all header fields named in the Vary header (Vary can name more than one header) match in both the original (cached) request and the new request. E.g. Vary: Accept-Encoding ensures that a separate version of a resource is cached for each value of the Accept-Encoding header. The Vary header can also be used to serve different content to desktop and mobile users by specifying Vary: User-Agent.
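Caches may have to normalize the values of these request headers so that equivalent values (for example, values that differ only in whitespace or in the order of listed items) are grouped under the same stored representation.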
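The Vary matching rule can be sketched roughly as below. The function names (vary_header_names, can_reuse) and the plain-dict representation of headers are assumptions for illustration; header values are compared after trimming surrounding whitespace, which is only the simplest possible form of normalization.

```python
from typing import Dict, List


def vary_header_names(vary_value: str) -> List[str]:
    """Split a Vary value such as 'Accept-Language, Accept-Encoding' into names."""
    return [name.strip().lower() for name in vary_value.split(",") if name.strip()]


def can_reuse(stored_request: Dict[str, str], stored_vary: str,
              new_request: Dict[str, str]) -> bool:
    """Return True if every header named in Vary matches in both requests."""
    names = vary_header_names(stored_vary)
    if "*" in names:
        return False  # "Vary: *" never matches, so the stored response is not reused
    # Header names are case-insensitive, so compare on lowercased names.
    stored = {k.lower(): v.strip() for k, v in stored_request.items()}
    new = {k.lower(): v.strip() for k, v in new_request.items()}
    return all(stored.get(name) == new.get(name) for name in names)


cached_request = {"Accept-Language": "en-US", "Accept-Encoding": "gzip"}
incoming_request = {"accept-language": "en-US", "Accept-Encoding": "br"}

print(can_reuse(cached_request, "Accept-Language", incoming_request))
print(can_reuse(cached_request, "Accept-Language, Accept-Encoding", incoming_request))
```

With only Accept-Language in Vary the two requests match; once Accept-Encoding is also listed they no longer do, so the cache would need a separate stored response for the second request.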
Controlling caching behaviour with HTTP response headers
A server can control browser caching behaviour via HTTP response headers. Most importantly, response headers control the expiration time of a resource. The expiration can be set via the following headers:
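- The Cache-Control header's max-age directive gives the number of seconds for which the response is considered fresh, e.g. Cache-Control: max-age=31536000 (one year).
- The Expires HTTP header contains the date/time after which the response is considered expired. Note that if there is a Cache-Control header with the max-age or s-maxage directive in the response, the Expires header is ignored.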
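As a rough sketch of how a private cache might turn these two headers into a lifetime, consider the code below. The function names (freshness_lifetime, is_fresh) are hypothetical, headers are assumed to be a plain dict with canonical capitalization, only max-age is handled (s-maxage targets shared caches), and the age of the response is simplified to the time elapsed since it was received.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Dict, Optional


def freshness_lifetime(headers: Dict[str, str],
                       response_time: datetime) -> Optional[float]:
    """Return the freshness lifetime in seconds, or None if none is given."""
    cache_control = headers.get("Cache-Control", "")
    match = re.search(r"(?:^|[\s,])max-age=(\d+)", cache_control, re.IGNORECASE)
    if match:
        # max-age takes precedence: Expires is ignored when it is present.
        return float(match.group(1))
    expires = headers.get("Expires")
    if expires:
        return (parsedate_to_datetime(expires) - response_time).total_seconds()
    return None


def is_fresh(headers: Dict[str, str], response_time: datetime,
             now: datetime) -> bool:
    """Simplified check: the response is fresh while its age is below the lifetime."""
    lifetime = freshness_lifetime(headers, response_time)
    if lifetime is None:
        return False
    return (now - response_time).total_seconds() < lifetime


received = datetime(2015, 10, 21, 7, 28, 0, tzinfo=timezone.utc)
both_headers = {
    "Cache-Control": "max-age=60",
    "Expires": "Wed, 21 Oct 2015 08:28:00 GMT",
}

print(freshness_lifetime(both_headers, received))  # 60.0, Expires is ignored
print(is_fresh(both_headers, received, received + timedelta(seconds=90)))  # False
```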
Before this expiration time the resource is fresh; after it, the resource is stale. When an HTTP GET request is made for a resource and the cached copy is fresh, the cached copy is used. If it is stale, it is either
- Validated
  - Validation is triggered when
    - the user presses the Reload button.
    - during normal browsing, if the cached response includes the "Cache-Control: must-revalidate" header.
  - Validation can only occur if the server provided either a strong validator or a weak validator
    - Strong validator: ETag is a strong validator (unless its value is prefixed with W/, which marks it as weak). If the ETag header was part of the response for a resource, the client can send an If-None-Match header in future requests to validate the cached resource.
    - Weak validator: The Last-Modified response header can be used as a weak validator. It is considered weak because it only has 1-second resolution. If the Last-Modified header is present in a response, the client can send an If-Modified-Since request header to validate the cached document.
  - If the resource is not modified, the server returns a 304 (Not Modified) response without sending the body of the requested resource.
- Fetched
  - If neither an ETag nor a Last-Modified validator header is present, the resource is refetched.
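The choice between validating and refetching a stale resource can be sketched as below. The function names (conditional_headers, next_step, body_after_validation) and the dict-based header representation are assumptions for illustration; no real network request is made.

```python
from typing import Dict, Optional


def conditional_headers(cached_headers: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Build validation request headers, or return None if no validator was stored."""
    request_headers: Dict[str, str] = {}
    if "ETag" in cached_headers:
        request_headers["If-None-Match"] = cached_headers["ETag"]
    if "Last-Modified" in cached_headers:
        request_headers["If-Modified-Since"] = cached_headers["Last-Modified"]
    return request_headers or None


def next_step(cached_headers: Dict[str, str]) -> str:
    """A stale response is validated if it has a validator, otherwise refetched."""
    return "validate" if conditional_headers(cached_headers) else "fetch"


def body_after_validation(status: int, cached_body: bytes,
                          response_body: bytes) -> bytes:
    """On 304 the cached body is reused; otherwise the new body replaces it."""
    return cached_body if status == 304 else response_body


stale_with_etag = {"ETag": '"abc123"'}
stale_without_validator = {"Content-Type": "text/css"}

print(conditional_headers(stale_with_etag))  # {'If-None-Match': '"abc123"'}
print(next_step(stale_without_validator))    # fetch
```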
Revving
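Choosing an expiration time for resources that are updated very infrequently, such as CSS and JS files, can be problematic: a long lifetime means clients keep using the old copy after an update, while a short one gives up most of the benefit of caching. Hence the file paths of such resources typically include a revision/version number, so that each update is served from a new URL. This technique is also called revving.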
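One common variant of revving derives the revision from the file's content rather than from a hand-maintained version number. The sketch below illustrates that variant under stated assumptions: the function name revved_path, the /static/app.css path, and the 8-character digest length are all hypothetical choices.

```python
import hashlib
from pathlib import PurePosixPath


def revved_path(path: str, content: bytes, digest_length: int = 8) -> str:
    """Insert a short content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_length]
    original = PurePosixPath(path)
    # e.g. /static/app.css becomes /static/app.<digest>.css
    return str(original.with_name(f"{original.stem}.{digest}{original.suffix}"))


old_url = revved_path("/static/app.css", b"body { color: black; }")
new_url = revved_path("/static/app.css", b"body { color: navy; }")

print(old_url)
print(new_url)
```

Because the URL changes whenever the content changes, the server can attach a very long max-age to each revved file: an updated file is simply a different primary cache key.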