Add fault tolerance to reduce cron noise

Not all cron jobs are created equal, and some of them can afford to fail sporadically before we need to worry about them. Maybe they rely on a third-party server, and we don’t want the occasional failure to pollute our inbox.

Here is a little cron job wrapper I created that suppresses stderr but keeps track of the job’s exit codes. Once the job has exited abnormally a set number of times in a row (five in the script below), the wrapper stops suppressing stderr on subsequent runs, so cron mails the error output as usual.


#!/bin/sh

counter_file=/tmp/counter_ri7g3
threshold=5

# if the counter file doesn't already exist we create/initialize it
if [ ! -f "$counter_file" ]
then
    echo 0 > "$counter_file"
fi

# we pull the current counter
counter=$(cat "$counter_file")

# if the counter is still small, we send stderr to /dev/null
# ($1 is left unquoted on purpose so the command string splits into words)
if [ "$counter" -lt "$threshold" ]
then
    $1 > /dev/null 2>&1
    status=$?
# otherwise stderr will follow its normal path and find its way to email
else
    $1 > /dev/null
    status=$?
fi

# lastly, if running $1 resulted in an abnormal exit, the counter is incremented
if [ "$status" -ne 0 ]
then
    echo $((counter + 1)) > "$counter_file"
# and if $1 exited normally, we reset the counter
else
    echo 0 > "$counter_file"
fi
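
A quick way to check the threshold behavior without waiting for cron is to replay the same counter logic in a throwaway function. The sketch below is not part of the wrapper: the names run_wrapped and demo_counter and the use of mktemp are illustrative assumptions, and false stands in for a job that always fails.

```shell
#!/bin/sh
# Illustrative dry run of the wrapper's counter logic (all names are hypothetical).

threshold=5
demo_counter=$(mktemp)      # throwaway counter file, so /tmp/counter_ri7g3 is untouched
echo 0 > "$demo_counter"

# run_wrapped CMD: run CMD once, applying the same rules as the wrapper
run_wrapped() {
    counter=$(cat "$demo_counter")
    status=0
    if [ "$counter" -lt "$threshold" ]
    then
        $1 > /dev/null 2>&1 || status=$?    # below the threshold: stderr is discarded
    else
        $1 > /dev/null || status=$?         # at or above it: stderr passes through
    fi
    if [ "$status" -ne 0 ]
    then
        echo $((counter + 1)) > "$demo_counter"
    else
        echo 0 > "$demo_counter"
    fi
}

# six consecutive failures: the sixth run starts with the counter at the threshold
for i in 1 2 3 4 5 6
do
    run_wrapped false
done
echo "after 6 failures: $(cat "$demo_counter")"

# a single success resets the counter
run_wrapped true
echo "after a success: $(cat "$demo_counter")"
```

Swapping false for a command that writes to stderr makes the switch visible: the first five runs stay silent and the sixth prints the error.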

A cron entry calling the wrapper looks like this (system crontab format, which includes a user field):


30 * * * *      root      /usr/local/bin/cron_wrapper "/path/to/script arg_1 arg_2"
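
Because the counter path is hard-coded, every job run through the wrapper shares one counter, so a success in one job resets the failure count of another. One possible way around this, sketched here as an assumption rather than as part of the original script, is to derive the file name from the command string. The counter_file_for helper and the /tmp/cron_wrapper_ prefix are hypothetical names; cksum is used only to turn the command into a short, filename-safe number.

```shell
#!/bin/sh
# Hypothetical helper: map a command string to its own counter file.

counter_file_for() {
    # with no file operand, cksum prints "<checksum> <byte count>"; keep the checksum
    sum=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    echo "/tmp/cron_wrapper_$sum"
}

# two different commands get two different counter files
a=$(counter_file_for "/path/to/script arg_1 arg_2")
b=$(counter_file_for "/path/to/other_script")
echo "$a"
echo "$b"
```

Inside the wrapper, the value returned for "$1" would take the place of the fixed /tmp/counter_ri7g3 path, so each cron entry keeps its own count of consecutive failures.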