Cloudflare cache warming to improve landing page speed (auto-pinger)

Task Goal

We want to force Cloudflare to cache freshly cleared pages using a Node.js server that reads the POWR.io sitemap, or is fed a list of URLs, and pings those URLs quickly and asynchronously.

What we have done

We wrote a Node.js script that runs on an AWS EC2 instance and checks every 6 hours whether 2 Plugin Landing Pages and 2 Tutorial Landing Pages have been saved to the Cloudflare (CF) cache. If they have, the script does nothing and checks again in another 6 hours. If not, the script follows the pinging process outlined below and dumps any URLs that have a REVALIDATED, DYNAMIC, or other unexpected cache status to the console. See the Notes section for more information about what these cache statuses mean.
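The check-then-warm decision described above can be sketched as follows. This is a minimal illustration, not the code from the repository: `needsWarming`, `runCycle`, `checkSamples`, and `warmAll` are hypothetical names, and the two callbacks stand in for whatever probes the four sample pages and runs the full ping.

```javascript
const SIX_HOURS_MS = 6 * 60 * 60 * 1000;

// True when at least one sample page was not served from the Cloudflare cache.
function needsWarming(statuses) {
  return statuses.some((s) => String(s).toUpperCase() !== 'HIT');
}

// checkSamples: hypothetical async function returning the cf-cache-status of
//               each sample page (2 plugin + 2 tutorial landing pages).
// warmAll:      hypothetical async function that runs the full pinging process.
async function runCycle(checkSamples, warmAll) {
  const statuses = await checkSamples();
  if (!needsWarming(statuses)) {
    return 'skipped'; // everything is cached, try again at the next interval
  }
  await warmAll();
  return 'warmed';
}

// Assumed scheduling, shown as a comment so the sketch has no side effects:
// setInterval(() => runCycle(checkSamples, warmAll).catch(console.error), SIX_HOURS_MS);
```

Keeping the decision in a pure function makes it easy to change the rule later, for example to tolerate one non-HIT sample page.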

The script currently pings the background-jobs channel on Slack when it has finished running, or when it has checked the statuses and does not need to run.
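A notification like this could be sent through a Slack incoming webhook, which accepts a JSON body with a `text` field. The sketch below assumes that mechanism and Node 18+ (for the global `fetch`); the page does not say how the real script talks to Slack, and `buildSlackText`, `notifySlack`, and the webhook URL are hypothetical.

```javascript
// Build the message for the two cases the script reports on.
function buildSlackText(outcome, urlCount) {
  if (outcome === 'skipped') {
    return 'auto-pinger: sample pages already cached, nothing to do';
  }
  return `auto-pinger: finished pinging ${urlCount} URLs`;
}

// Post a message to a Slack incoming webhook.
// webhookUrl is a placeholder supplied by the caller (assumption).
async function notifySlack(webhookUrl, text) {
  const res = await fetch(webhookUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text }),
  });
  if (!res.ok) {
    throw new Error(`Slack webhook returned ${res.status}`);
  }
}
```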

Pinging Process

  1. In the POWR.io sitemap index, find the sitemaps for all English pages (those ending in -en.xml.gz).
  2. Get all URLs from all English sitemaps (about 36k).
  3. Ping these URLs.
  4. If the CF cache status of a URL is not HIT, MISS, or EXPIRED, dump the URL to the console; the output is saved to the nohup.out file on the EC2 instance.
  5. If any errors occur, log them to the error_log.txt file.
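The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's implementation: it assumes Node 18+ (global `fetch`), that sitemaps list their entries in `<loc>` elements, and that Cloudflare reports the status in the `cf-cache-status` response header. All function names, the sitemap index URL parameter, and the concurrency values are hypothetical.

```javascript
const zlib = require('node:zlib');

// Statuses that are expected during a warm-up run (step 4).
const EXPECTED_STATUSES = new Set(['HIT', 'MISS', 'EXPIRED']);

// Step 1: English sitemaps are identified by the -en.xml.gz suffix.
function isEnglishSitemap(url) {
  return url.endsWith('-en.xml.gz');
}

// Steps 1-2: pull every <loc> value out of a sitemap or sitemap index.
// XML entities inside <loc> are not decoded in this sketch.
function extractLocs(xml) {
  return [...xml.matchAll(/<loc>\s*([^<\s]+)\s*<\/loc>/g)].map((m) => m[1]);
}

// Step 4: true when a status should be dumped to the console.
function shouldReport(cfStatus) {
  return !EXPECTED_STATUSES.has(String(cfStatus).toUpperCase());
}

// Step 3 helper: run worker over items with at most `limit` calls in flight.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;
  async function run() {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i]);
    }
  }
  const runners = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runners }, run));
  return results;
}

// Download a sitemap; gunzip it only if the body starts with the gzip magic bytes.
async function fetchXml(url) {
  const res = await fetch(url);
  const buf = Buffer.from(await res.arrayBuffer());
  const isGzip = buf[0] === 0x1f && buf[1] === 0x8b;
  return (isGzip ? zlib.gunzipSync(buf) : buf).toString('utf8');
}

// Steps 3-5: request one page, report unexpected statuses, log errors to stderr.
async function pingUrl(url) {
  try {
    const res = await fetch(url);
    await res.arrayBuffer(); // drain the body so the connection can be reused
    const status = res.headers.get('cf-cache-status') || 'NONE';
    if (shouldReport(status)) console.log(`${status} ${url}`);
    return status;
  } catch (err) {
    console.error(`${url}: ${err.message}`);
    return 'ERROR';
  }
}

// indexUrl is supplied by the caller; the concurrency values are guesses.
async function warmFromSitemapIndex(indexUrl, concurrency = 20) {
  const sitemaps = extractLocs(await fetchXml(indexUrl)).filter(isEnglishSitemap);
  const lists = await mapWithLimit(sitemaps, 4, async (s) => extractLocs(await fetchXml(s)));
  return mapWithLimit(lists.flat(), concurrency, pingUrl);
}
```

Capping the number of in-flight requests keeps roughly 36k pings asynchronous without opening thousands of sockets at once. With the script's stdout and stderr redirected as in the run command, `console.log` output lands in nohup.out and `console.error` output in error_log.txt.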

Code Repository

The auto-pinger code repository is saved here: https://gitlab.com/powr/auto-pinger
The branch used on the EC2 instance is 'lauren_branch'.

Connecting to AWS EC2 instance

  1. The private RSA key is required. Please ask Ben, Ivan or Lauren.
  2. ssh -i "powr_ec2_ca.pem" [email protected]

AWS commands to run on EC2 instance

To run the script in the background:

  nohup node auto_ping.js > nohup.out 2> error_log.txt &

To check the job status:

  ps -aef | grep -v grep | grep auto_ping.js

Notes

Cloudflare cache statuses we have seen in the process so far include:

HIT - the page was already saved in the cache and was served from it.
MISS - the page was not saved in the cache, so it was fetched from the origin. It will be loaded from the cache the next time the page is loaded.
EXPIRED - the page was loaded into the cache but has since expired, so it was fetched from the origin again. It will be loaded from the cache the next time the page loads.
DYNAMIC - the page is not considered cacheable under the current Cloudflare settings, so it is not saved to the cache and is always fetched from the origin.
REVALIDATED - the page in the cache had gone stale, but the origin confirmed through an If-Modified-Since or If-None-Match request that it has not changed, so the cached copy was served.
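As a small aid when reading nohup.out, the statuses above can be encoded in a lookup table. The sketch below is illustrative only; `STATUS_NOTES` and `explainStatus` are hypothetical names, and the notes simply restate the descriptions in this section.

```javascript
// cf-cache-status values seen so far, with whether the response came from the cache.
const STATUS_NOTES = new Map([
  ['HIT', { fromCache: true, note: 'already cached' }],
  ['MISS', { fromCache: false, note: 'fetched from origin, cached for next time' }],
  ['EXPIRED', { fromCache: false, note: 'cached copy expired, refetched from origin' }],
  ['DYNAMIC', { fromCache: false, note: 'not cacheable under current settings' }],
  ['REVALIDATED', { fromCache: true, note: 'stale copy confirmed unchanged by origin' }],
]);

// Return a one-line explanation for a status, case-insensitively.
function explainStatus(status) {
  const key = String(status).toUpperCase();
  const entry = STATUS_NOTES.get(key);
  if (!entry) return `${key}: unrecognized status, see the Cloudflare documentation`;
  return `${key}: ${entry.note} (${entry.fromCache ? 'served from cache' : 'served from origin'})`;
}
```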

For more information on CF status: https://support.cloudflare.com/hc/en-us/articles/200172516-Understanding-Cloudflare-s-CDN

To understand more about nohup and how to run Node.js scripts in the background: https://stackoverflow.com/questions/4797050/how-to-run-node-js-as-a-background-process-and-never-die

#cache-warming #landing-page-speed