Strong cache and negotiation cache in HTTP

_ island, 2022-06-23 18:08:17

This article is part of the writing activity of the "low-key and pragmatic excellent Chinese good youth" front-end community.

Browser caching mechanism

As we all know, when we open a page, the browser uses the URL we entered to request the needed resources from the corresponding server. During this process the page may have to wait for some time (the white-screen time) before it is rendered.

To improve the user experience, we have to talk about caching techniques such as the DNS cache, CDN cache, browser cache, and page-local cache. A good caching strategy reduces the number of requests for duplicate resources, lowers server overhead, and speeds up page loading for users.

This article covers HTTP strong caching and negotiation caching.

The basic principle

When the browser loads a resource, it first checks the Expires and Cache-Control headers of the previously stored response to determine whether the strong cache is hit. That decides whether it requests the resource from the remote server or reads the cached copy locally.

Strong cache

In the browser, the strong cache is controlled by two headers: Expires (HTTP/1.0) and Cache-Control (HTTP/1.1).

Expires

Expires belongs to the HTTP/1.0 specification. It is a response header field that gives the expiration time of the resource. The value is an absolute time and is returned by the server.

When the browser first requests a resource, the server's response carries the Expires field. The next time the browser requests this resource, it uses the stored Expires value to decide whether to use the cached copy (if the request time is earlier than the expiration time returned by the server, the cached data is used directly).

Expires is evaluated against the client's local time, so if the client and server clocks differ, cache hits can be judged incorrectly.
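
The clock problem can be illustrated with a minimal Python sketch. The function name `is_fresh_by_expires` and the sample timestamps are assumptions made for this example; this is not how any particular browser implements the check.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_fresh_by_expires(expires_header: str, client_now: datetime) -> bool:
    """Return True if the client's clock is still before the Expires time."""
    expires_at = parsedate_to_datetime(expires_header)
    return client_now < expires_at


# Hypothetical values for illustration only.
expires = "Thu, 23 Jun 2022 11:00:00 GMT"
real_now = datetime(2022, 6, 23, 10, 30, tzinfo=timezone.utc)

# Correct client clock: the cached copy is still considered fresh.
fresh_with_correct_clock = is_fresh_by_expires(expires, real_now)

# Client clock running one hour fast: the same copy now looks expired.
fresh_with_fast_clock = is_fresh_by_expires(expires, real_now + timedelta(hours=1))
```

The same stored response gives different answers depending only on the client's clock, which is the weakness that a relative lifetime avoids.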

Cache-Control

As mentioned above, Expires has a drawback: if the client's local time is changed, the browser may misjudge the cache, for example by requesting the resource from the server again even though it has not expired. To solve this problem, the HTTP/1.1 specification introduced the Cache-Control field. It has a higher priority than Expires, and its value is a relative time.

Cache-Control has several common response directives:

  • max-age: for example, max-age=3600 means that for 3600 seconds from the time of the response the browser uses the cached copy and does not request new data from the server
  • s-maxage: the same as max-age, but it sets the cache time for proxy servers
  • private: the content is cached only in the private cache (only the client can cache it, proxy servers cannot)
  • public: the content can be cached by everyone (both the client and proxy servers)
  • no-store: do not cache any data
  • no-cache: the response may be stored in the local cache, but the cache cannot serve it to the client until its freshness has been revalidated with the origin server
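
The directives above can be sketched as a small Python parser plus a reuse check. This is a simplified illustration under stated assumptions: `parse_cache_control` and `browser_can_reuse` are hypothetical names, only the browser (private cache) view is modelled, and s-maxage, private, and public are parsed but not acted on.

```python
def parse_cache_control(header: str) -> dict:
    """Parse a Cache-Control header into a {directive: value} dict.

    Directives without a value (for example no-store) map to True.
    """
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or True
    return directives


def browser_can_reuse(header: str, age_seconds: int) -> bool:
    """Decide whether a browser cache may reuse a stored response without
    contacting the server, given the age of the response in seconds."""
    directives = parse_cache_control(header)
    if "no-store" in directives or "no-cache" in directives:
        return False  # never stored, or must be revalidated before use
    max_age = directives.get("max-age")
    if max_age is None or max_age is True:
        return False  # no usable lifetime in this simplified sketch
    return age_seconds < int(max_age)
```

For example, `browser_can_reuse("public, max-age=3600", 120)` allows reuse, while the same header with an age of 4000 seconds does not.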

Negotiation cache

With the strong cache described above, the browser decides locally whether to use the cache. When the strong cache is not hit, the browser sends a request to the server to check whether the negotiation cache is hit. If it is, the server returns status code 304; otherwise it returns the new resource data.

With the negotiation cache (also called the comparison cache), the server decides whether the cached resource is still usable. This involves two pairs of fields. The response to the browser's first request carries a field (Last-Modified or ETag), and subsequent requests carry the corresponding request field (If-Modified-Since or If-None-Match). If the response has no Last-Modified or ETag, the request will not carry the corresponding field.

  • Last-Modified is the time the resource was last modified on the server. It is returned by the server
  • If-Modified-Since is sent by the browser when requesting the resource. Its value is the Last-Modified the server returned last time
  • ETag is a unique identifier for a file. When the resource changes, its ETag changes. It fixes a weakness of Last-Modified: the file contents may be unchanged while Last-Modified has changed, which makes the browser request the resource from the server again. This value is also returned by the server
  • If-None-Match is sent by the browser when requesting the resource. Its value is the ETag the server returned last time
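
The pairing of the four fields can be sketched in Python as follows. The helper name `build_conditional_headers` and the plain-dict representation of headers are assumptions made for this illustration.

```python
def build_conditional_headers(cached_response_headers: dict) -> dict:
    """Build the validator request headers from a previously stored response.

    ETag is echoed back as If-None-Match, and Last-Modified is echoed back
    as If-Modified-Since. A missing response field yields no request field.
    """
    request_headers = {}
    etag = cached_response_headers.get("ETag")
    last_modified = cached_response_headers.get("Last-Modified")
    if etag:
        request_headers["If-None-Match"] = etag
    if last_modified:
        request_headers["If-Modified-Since"] = last_modified
    return request_headers
```

A stored response with both validators produces both request fields, and a stored response with neither produces an empty dict, which matches the rule that the request only carries a field the response provided.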

If that is still not clear, I drew a request flow chart that should make the negotiation cache easy to understand:

[Figure: request flow chart, Untitled Diagram (1).png]

The full request process, combined with the strong cache

  1. When the browser initiates a resource request, it first checks whether there is a local cache record. If not, it requests the resource from the server and records the Last-Modified returned by the server.
  2. If there is a cache record, it first checks the strong cache (Cache-Control takes priority over Expires, as discussed later). If the strong cache has not expired, the local cached resource is returned (status code 200).
  3. If the strong cache has expired, the client sends a request to negotiate with the server. The server first checks the ETag identifier: if the identifier sent by the client matches the current identifier on the server, it returns status code 304 Not Modified (no resource content is returned).
  4. If there is no ETag field, the server compares the If-Modified-Since sent by the client with the resource's last modification time. If the two values match, the 304 response carries no resource content and Last-Modified does not need to be updated (the resource has not changed, so its value stays the same). The client reads its local cache after receiving the 304 status code. If Last-Modified has changed, the server returns the new resource.
  5. If the ETag does not match the one on the server, the new resource is fetched again, and the returned data is cached for future negotiation.
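
Steps 3 to 5 can be sketched from the server's point of view in Python. This is an illustration under stated assumptions, not a real server's implementation: `negotiate` is a hypothetical name, headers are plain dicts, and ETags are compared as exact strings (real servers use a weak comparison for If-None-Match).

```python
from email.utils import parsedate_to_datetime


def negotiate(request_headers: dict, current_etag: str, current_last_modified: str):
    """Return (status_code, send_body) for a conditional GET.

    ETag is checked first. If-Modified-Since is only consulted when the
    request carries no If-None-Match.
    """
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if if_none_match.strip() == "*" or current_etag in candidates:
            return 304, False  # identifier matches: client may use its cache
        return 200, True       # identifier differs: send the new resource

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        modified = parsedate_to_datetime(current_last_modified)
        if modified <= since:
            return 304, False  # not modified since the client's copy

    return 200, True           # no validators, or the resource has changed
```

Note that when both validators are present and the ETag differs, the sketch returns 200 even if the timestamps match, which reflects the ETag-first ordering described above.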

Why ETag

ETag was introduced mainly to solve several difficult problems with Last-Modified:

  1. The last modification time of a file can change even when its contents have not, which makes the client think the file has changed and request it again.
  2. Some files are modified frequently, several times within one second. If-Modified-Since can only detect changes at one-second granularity, whereas with ETag the client can pick up multiple refreshes within one second.
  3. Some servers cannot accurately obtain the last modification time of a file.
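
The first two problems can be illustrated with a short Python sketch. Deriving the ETag from a SHA-256 hash of the body is only one possible scheme, chosen here as an assumption; real servers generate ETags in different ways. The function name and sample bodies are hypothetical.

```python
import hashlib


def content_etag(body: bytes) -> str:
    """Derive a quoted ETag from the response body (one possible scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


# Two versions written within the same second share one Last-Modified value,
# because the header only has one-second resolution.
last_modified_v1 = "Thu, 23 Jun 2022 10:00:00 GMT"
last_modified_v2 = "Thu, 23 Jun 2022 10:00:00 GMT"

# Their content-based ETags differ, so the change is still detected.
etag_v1 = content_etag(b"body { color: red; }")
etag_v2 = content_etag(b"body { color: blue; }")

# Rewriting identical content may change the file's modification time,
# but a content-based ETag stays the same.
etag_v1_rewritten = content_etag(b"body { color: red; }")
```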

Status code differences

  • 200: the request succeeded and the server returned new data
  • 200 (from memory cache / from disk cache): the local strong cache is still valid, so the local cache is used directly
  • 304: the request succeeded and went through the negotiation cache. The server determined (via ETag and Last-Modified) that the resource has not expired and tells the browser to use its cache

"from memory cache" means the resource is read from memory, which happens when the page is refreshed. "from disk cache" means it is read from disk, which happens after the page tab has been closed.

Cache priority

If Expires and Cache-Control (with max-age) are both present, Cache-Control overrides Expires, and Expires is ignored whether or not it has expired. That is, Cache-Control > Expires.
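
This precedence can be sketched in Python as a freshness-lifetime calculation. The function name `freshness_lifetime` and the dict-based headers are assumptions for illustration. The Expires branch measures against the response's Date header, which is a simplification of what real caches do.

```python
import re
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers: dict):
    """Return the freshness lifetime in seconds, or None if not specified.

    max-age in Cache-Control wins. Expires is only consulted without it.
    """
    cache_control = headers.get("Cache-Control", "")
    match = re.search(r"(?:^|[,\s])max-age=(\d+)", cache_control, re.IGNORECASE)
    if match:
        return int(match.group(1))
    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        return max(0, int((expires - date).total_seconds()))
    return None
```

With only Date and Expires one hour apart the result is 3600, and adding `Cache-Control: max-age=60` to the same headers changes the result to 60.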

If the strong cache and the negotiation cache exist at the same time, the strong cache is checked first. Only when the strong cache is no longer valid is the negotiation cache compared. That is, strong cache > negotiation cache.

If ETag and Last-Modified both exist for the negotiation cache, ETag is compared first and Last-Modified is ignored. That is, ETag > Last-Modified.

In addition:

The HTTP/1.0 specification also has a Pragma cache policy, from the time before Cache-Control (HTTP/1.1) existed. Its effect is the same as Cache-Control: no-cache: it forces the cache server to submit the request to the origin server for validation before returning the cached version.

Pragma -> Cache-Control -> Expires -> ETag -> Last-Modified

Heuristic cache

This cache policy is the browser's default. If a response has neither Expires nor Cache-Control but does have a Last-Modified field, the browser applies a default caching policy with a lifetime of (currentTime - Last-Modified) * 0.1.

Only when the server does not return an explicit cache policy will the browser's heuristic cache policy be activated
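
The formula above can be written as a few lines of Python. This sketch assumes that currentTime is taken from the response's Date header; the function name and sample dates are hypothetical, and actual browsers may differ in details.

```python
from email.utils import parsedate_to_datetime


def heuristic_lifetime(date_header: str, last_modified_header: str) -> float:
    """Heuristic freshness lifetime in seconds: 10% of (Date - Last-Modified)."""
    date = parsedate_to_datetime(date_header)
    last_modified = parsedate_to_datetime(last_modified_header)
    return max(0.0, (date - last_modified).total_seconds() * 0.1)


# A resource last modified 10 days before the response was generated
# is treated as fresh for about 1 day (10 days * 0.1).
lifetime = heuristic_lifetime(
    "Thu, 23 Jun 2022 10:00:00 GMT",
    "Mon, 13 Jun 2022 10:00:00 GMT",
)
```

The older the resource, the longer the browser assumes it will stay unchanged, which is why rarely modified files can be cached for a long time even without explicit headers.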

Further reading: HTTP Heuristic Caching (Missing Cache-Control and Expires Headers) Explained

Other notes

  • The negotiation cache should be used together with the strong cache. If the strong cache is not enabled, the negotiation cache makes little sense
  • Most web servers enable the negotiation cache by default, with both Last-Modified and ETag turned on

Scenarios to watch out for

  1. In a distributed system, Last-Modified must be consistent across machines. Otherwise a request load-balanced to a different machine fails the comparison, and the full resource is returned again
  2. In a distributed system, try to turn off ETag, because the ETag generated by each server can be different
