Cloudflare cache warming to improve landing page speed (auto-pinger)
Task Goal
We want to force Cloudflare to cache freshly cleared pages using a Node.js server that reads the powr.io sitemap, or is fed a list of URLs, and pings those URLs quickly and asynchronously.
What we have done
We wrote a Node.js script, running on an AWS EC2 instance, that checks every 6 hours whether 2 Plugin Landing pages and 2 Tutorial Landing pages have been saved to the CF cache. If they have, the script does nothing and checks again in another 6 hours. If not, the script follows the pinging process outlined below and dumps to the console any URL with a REVALIDATED, DYNAMIC, or other cache status besides HIT, MISS, or EXPIRED. See the Notes section for what these cache statuses mean.
The script currently pings the background-jobs channel on Slack when it has finished running, or when it has checked the statuses and does not need to run.
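The 6-hour check described above could look roughly like the following. This is a minimal sketch, not the code in the repository: the sentinel URLs, the function names, and the rule "warm unless every sentinel page is a HIT" are assumptions made for illustration. It relies on the cf-cache-status response header and on the global fetch available in Node.js 18 and later.

```javascript
// Hypothetical sentinel pages; substitute the real 2 plugin and 2 tutorial landing URLs.
const SENTINEL_URLS = [
  "https://www.example.com/plugins/first-plugin",
  "https://www.example.com/plugins/second-plugin",
  "https://www.example.com/tutorials/first-tutorial",
  "https://www.example.com/tutorials/second-tutorial",
];

// Interval between checks, as described in the text (6 hours).
const SIX_HOURS_MS = 6 * 60 * 60 * 1000;

// Requests a URL and returns its Cloudflare cache status in upper case,
// or "UNKNOWN" when the cf-cache-status header is absent.
// fetchImpl is injectable so the function can be exercised without network access.
async function getCacheStatus(url, fetchImpl = fetch) {
  const response = await fetchImpl(url);
  const header = response.headers.get("cf-cache-status");
  return header ? header.toUpperCase() : "UNKNOWN";
}

// Assumption: warming is needed as soon as any sentinel page is not a HIT.
async function needsWarming(urls, fetchImpl = fetch) {
  const statuses = await Promise.all(
    urls.map((url) => getCacheStatus(url, fetchImpl))
  );
  return statuses.some((status) => status !== "HIT");
}

// Possible scheduling (not started here):
// setInterval(async () => {
//   if (await needsWarming(SENTINEL_URLS)) { /* run the pinging process */ }
// }, SIX_HOURS_MS);
```

Passing the fetch function as a parameter keeps the decision logic testable with a stub instead of live requests.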
Pinging Process
- In the POWR.io sitemap, find the sitemaps for all English pages (file names ending in -en.xml.gz)
- Get all URLs from all English sitemaps (about 36k)
- Ping these URLs
- If the CF cache status of a URL is not HIT, MISS, or EXPIRED, dump it to the console; console output is saved to the nohup.out file on the EC2 instance
- If any errors occur, log them to the error_log.txt file
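The steps above can be sketched as a few small functions. This is an illustration under stated assumptions, not the repository's implementation: all function names, the regex-based extraction of loc entries, and the default concurrency of 20 are hypothetical. Downloading and decompressing the .xml.gz files (for example with zlib.gunzipSync from Node's zlib module) is left out, so the functions take XML text that has already been decompressed.

```javascript
// Statuses that need no report, per the list above.
const EXPECTED_STATUSES = new Set(["HIT", "MISS", "EXPIRED"]);

// Pulls every <loc> value out of sitemap XML (works for index and URL sitemaps).
// A regex is a simplification; a real XML parser would be more robust.
function extractLocs(xml) {
  const pattern = /<loc>\s*([\s\S]*?)\s*<\/loc>/g;
  return [...xml.matchAll(pattern)].map((m) => m[1].replace(/&amp;/g, "&"));
}

// Keeps only the English sitemaps, identified by the -en.xml.gz suffix.
function englishSitemaps(indexXml) {
  return extractLocs(indexXml).filter((loc) => loc.endsWith("-en.xml.gz"));
}

// True when a status should be dumped to the console.
function shouldReport(status) {
  return !EXPECTED_STATUSES.has(String(status).toUpperCase());
}

// Pings all URLs with at most `concurrency` requests in flight.
// pingOne(url) is assumed to resolve to the cf-cache-status value of the response.
async function pingAll(urls, pingOne, concurrency = 20) {
  const unexpected = []; // entries to dump to the console (ends up in nohup.out)
  const errors = []; // entries to write to error_log.txt
  let next = 0;

  // Each worker pulls the next unclaimed URL until the list is exhausted.
  async function worker() {
    while (next < urls.length) {
      const url = urls[next++];
      try {
        const status = await pingOne(url);
        if (shouldReport(status)) unexpected.push({ url, status });
      } catch (err) {
        errors.push({ url, message: String((err && err.message) || err) });
      }
    }
  }

  const workerCount = Math.min(concurrency, urls.length);
  await Promise.all(Array.from({ length: workerCount }, worker));
  return { unexpected, errors };
}
```

A fixed pool of workers keeps the pings asynchronous while capping the number of simultaneous requests, which matters with roughly 36k URLs.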
Code Repository
The auto-pinger code is stored here: https://gitlab.com/powr/auto-pinger
The branch used on the EC2 instance is 'lauren_branch'.
Connecting to AWS EC2 instance
- The private RSA key is required. Please ask Ben, Ivan, or Lauren for it.
ssh -i "powr_ec2_ca.pem" [email protected]
AWS commands to run on EC2 instance
To run the script in the background: nohup node auto_ping.js > nohup.out 2> error_log.txt &
To check the job status: ps -aef | grep -v grep | grep auto_ping.js
Notes
Cloudflare cache statuses we have seen in the process so far:
- HIT - the page was already saved in the cache and was served from it
- MISS - the page was not in the cache; it was fetched from the origin and will be served from the cache the next time it is requested
- EXPIRED - the page was in the cache but had expired; it was fetched from the origin again and will be served from the cache the next time it is requested
- DYNAMIC - Cloudflare does not consider the page eligible for caching under the current settings, so it is served from the origin and pinging it will not warm the cache
- REVALIDATED - the cached copy was stale, but Cloudflare revalidated it with the origin using an If-Modified-Since or If-None-Match header and served it from the cache
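To see at a glance how a run went, the statuses can be tallied once the pinging is finished. The helper below is a hypothetical sketch: the name summarizeRun, the shape of the results array, and the "warmed" count are assumptions for illustration. It treats MISS and EXPIRED as the statuses where the ping itself caused the page to be cached, following the descriptions above.

```javascript
// Statuses where, per the notes above, the ping causes the page to be cached
// for the next request.
const WARMED_BY_PING = new Set(["MISS", "EXPIRED"]);

// Hypothetical helper: counts results per cache status and reports how many
// pages were newly warmed. `results` is assumed to be [{ url, status }, ...].
function summarizeRun(results) {
  const counts = {};
  let warmed = 0;
  for (const { status } of results) {
    const key = String(status).toUpperCase();
    counts[key] = (counts[key] || 0) + 1;
    if (WARMED_BY_PING.has(key)) warmed += 1;
  }
  return { counts, warmed };
}
```

A summary like this could be included in the Slack message sent when the script finishes, alongside the list of URLs with unexpected statuses.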
For more information on CF status: https://support.cloudflare.com/hc/en-us/articles/200172516-Understanding-Cloudflare-s-CDN
To learn more about nohup and running Node.js scripts in the background: https://stackoverflow.com/questions/4797050/how-to-run-node-js-as-a-background-process-and-never-die
#cache-warming #landing-page-speed