Monitor the evolution of a webpage with wkhtmltoimage, bash and cron

After seeing the 37signals.com homepage evolution blog post published last week, I thought it would be interesting to record the evolution of some of my own sites. For this project I used wkhtmltoimage, which is part of the wkhtmltopdf project.

wkhtmltoimage runs from the command line and renders a page to either a JPG or a PNG image. I have limited each capture to the top 1000 pixels of the page in order to keep the file sizes down.

Below is the bash script I created. It reads a list of sites from a file (one hostname per line, without the http:// prefix) and saves the screenshots into a directory named after the site, with each image named after the date and time it was taken.


#!/bin/sh

FILENAME="sites.txt"

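while read -r URL; do
    # skip blank lines in the sites file
    [ -n "$URL" ] || continue

    # generate the current date + time, e.g. 25_12_13_0930
    DATE=$(date +%d_%m_%y_%H%M)
    OPTIONS="--height 1000 --crop-h 1000 --quality 80"

    # create the site's directory if it does not exist yet
    mkdir -p "$URL"

    # $OPTIONS is left unquoted on purpose so it splits into separate flags
    ./wkhtmltoimage-amd64 $OPTIONS "http://$URL" "./$URL/$DATE.png"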

done < "$FILENAME"
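
To take the screenshots on a schedule, the script needs a cron entry. Here is a minimal sketch of how that line could be built. The directory ($HOME/snapshots), the script name (snapshot.sh), the log file (cron.log) and the daily 09:00 schedule are all assumptions for illustration, so adjust them to your own setup. Cron typically starts jobs in the home directory, and the script above uses relative paths, which is why the entry changes directory first.

```shell
#!/bin/sh
# Hypothetical locations: change these to wherever the script and sites.txt live.
SNAPSHOT_DIR="$HOME/snapshots"
SCRIPT="snapshot.sh"

# "0 9 * * *" means minute 0, hour 9, every day.
# The cd is needed because the script refers to sites.txt and
# wkhtmltoimage-amd64 by relative path.
CRON_LINE="0 9 * * * cd $SNAPSHOT_DIR && ./$SCRIPT >> cron.log 2>&1"

# Print the entry so it can be pasted into the editor opened by `crontab -e`.
echo "$CRON_LINE"
```

Running this only prints the entry; nothing is installed until you add the line with crontab -e. Redirecting the output to a log file makes it easier to see why a capture failed.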