A client called us on a Friday afternoon — the kind of call that ruins weekends. Their sysadmin had been running manual tar backups “every night” for two years. Except “every night” meant “most nights, unless he forgot, was on vacation, or left early.” When ransomware hit their file server, the most recent backup was eleven days old. Eleven days of invoices, contracts, and engineering drawings, gone. Setting up automated Linux backups with cron and tar would have taken thirty minutes. The recovery took three weeks and cost them far more than anyone wants to admit.
Manual Backups Are a Promise Nobody Keeps
Human nature is the weakest link in any backup strategy. I have watched it play out across dozens of client sites. Someone writes a perfectly good tar command, runs it faithfully for a few weeks, then gets busy. Gaps appear. Nobody notices because nobody checks. The backup exists in theory but not in practice.
Automation removes the human variable entirely. A cron job does not get distracted, does not take sick days, and does not decide to “do it tomorrow.” This is not optional. The NIST Cybersecurity Framework identifies automated backup processes as a core recovery function for good reason.
Understanding cron: The Scheduler That Never Sleeps
The cron daemon runs on virtually every Linux system. It reads scheduling instructions from crontab files and executes commands at the specified times without any manual intervention.
The system-wide crontab lives at /etc/crontab. Each line follows a seven-column format (per-user crontabs, edited with crontab -e, use the same format minus the user column):
minutes hours day-of-month month weekday user command
The fields accept these ranges:
- Minutes: 0–59
- Hours: 0–23
- Day of month: 1–31
- Month: 1–12
- Weekday: 0–7 (both 0 and 7 represent Sunday)
An asterisk means “every possible value.” The forward slash sets step intervals. Commas separate specific values. Hyphens define ranges.
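These operators combine. The patterns below are hypothetical /etc/crontab lines illustrating ranges and steps used together:

```shell
# Every 2 hours from 9 AM through 5 PM, Monday through Friday
0 9-17/2 * * 1-5 root /etc/scripts/sync.sh

# On the hour and the half hour, every day
0,30 * * * * root /etc/scripts/poll.sh
```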
Common Scheduling Patterns That Actually Get Used
Here are patterns I deploy across client environments regularly:
# Every night at 3:00 AM
0 3 * * * root /etc/scripts/backup.sh
# Every 15 minutes
*/15 * * * * root /etc/scripts/health-check.sh
# Every Saturday at midnight
0 0 * * 6 root /etc/scripts/weekly-full.sh
# First and fifteenth of each month at 6:25 AM
25 6 1,15 * * root /etc/scripts/monthly-report.sh
If you want to skip the syntax entirely, drop your script into one of these directories instead: /etc/cron.daily, /etc/cron.weekly, or /etc/cron.monthly. Mark the script executable with chmod +x and cron handles the rest. The tradeoff is you lose control over the exact execution time, which matters when your RPO demands precision.
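As a sketch, installing a drop-in daily job looks like this (the staging path is arbitrary; only the final location matters):

```shell
# Stage a small wrapper script for the daily backup
job=$(mktemp -d)/backup
cat > "$job" <<'EOF'
#!/bin/bash
exec /etc/scripts/backup.sh
EOF
chmod +x "$job"

# Install it as root. Avoid a .sh extension: on Debian-based systems,
# run-parts skips filenames containing dots, so the job would never run.
# sudo mv "$job" /etc/cron.daily/backup
```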
Building a Backup Script Worth Running
A bare tar command is a starting point, not a solution. Here is a script pattern I deploy across managed client environments that handles file backups with daily rotation:
#!/bin/bash
# backup.sh - Daily rotating backup with tar
# Stop at the first failed command so a broken archive never ships offsite
set -euo pipefail
BACKUPDIR=/localbackup
WEBDIR=/var/www/html/wordpress
ETCDIR=/etc
# Make sure the destination exists
mkdir -p "$BACKUPDIR"
# Use weekday number for 7-day rotation (1=Mon, 7=Sun)
weekday=$(date +%u)
# Archive the web application
htmlfile=$BACKUPDIR/web-$weekday.tar.gz
tar czf "$htmlfile" -C "$WEBDIR" .
# Archive system configuration
etcfile=$BACKUPDIR/etc-$weekday.tar.gz
tar czf "$etcfile" -C "$ETCDIR" .
# Copy to remote host for offsite storage
scp "$htmlfile" "$etcfile" user@remotehost:/backup/
The $(date +%u) trick gives you a number from 1 (Monday) to 7 (Sunday). After a week, new backups overwrite the oldest ones. Seven days of recovery points without consuming unlimited storage.
For clients who need database backups alongside file archives, I add mysqldump with the --single-transaction flag to avoid locking tables during the dump:
DB=appdb
DBUSER=backupuser
dbfile=$BACKUPDIR/db-$weekday.sql.gz
# Credentials should come from a .my.cnf file, never the command line,
# where they would be visible in the process list
mysqldump -u "$DBUSER" --single-transaction "$DB" | gzip -c > "$dbfile"
That --single-transaction flag matters. Without it, mysqldump locks your tables and your application hangs until the dump finishes. (The flag gives a consistent snapshot only for InnoDB tables; MyISAM tables are still locked during the dump.) I have seen this cause outages on client production databases during business hours because someone scheduled the backup at 9 AM instead of 3 AM.
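Before trusting a dump, I sanity-check it. gzip -t catches truncation, and a finished mysqldump ends with a "-- Dump completed" comment, so a dump that died mid-stream is easy to spot. Here is a small helper (check_dump is my name for it, not a standard tool):

```shell
# check_dump FILE: verify a gzipped mysqldump is intact and complete
check_dump() {
    local f=$1
    # fail on a truncated or corrupt gzip stream
    gzip -t "$f" || return 1
    # a successful dump ends with a "Dump completed" comment line
    gunzip -c "$f" | tail -n 1 | grep -q 'Dump completed'
}

# Usage in the backup script:
# check_dump "$dbfile" || echo "database backup FAILED" >&2
```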
One caveat: the seven-day rotation means you can only recover data from the past week. If someone deletes a file and nobody notices for ten days, it is already overwritten. For clients with longer retention requirements, switch to $(date +%d) for day-of-month rotation, which gives you 31 recovery points. Adjust the strategy to match your actual RPO.
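The switch is a one-line change in the script. Note that %d zero-pads to 01–31, so the archive names still sort cleanly:

```shell
BACKUPDIR=${BACKUPDIR:-/localbackup}   # same destination as backup.sh

# 31-slot rotation keyed on day of month (01..31) instead of weekday
dom=$(date +%d)
htmlfile=$BACKUPDIR/web-$dom.tar.gz
etcfile=$BACKUPDIR/etc-$dom.tar.gz
```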
If you are writing bash scripts for backup automation and want to structure them properly with reusable functions and return values, the guide on bash functions covers the fundamentals you will need.
The PATH Trap That Breaks Cron Jobs Silently
This catches people constantly. Your script runs perfectly from the terminal. You schedule it in cron. It fails silently. No errors in your inbox, no indication anything went wrong.
The culprit is almost always PATH. Cron jobs run with a minimal environment that does not load your .bashrc or .profile. Commands that work interactively — because they rely on custom PATH entries — fail under cron because cron cannot find the binary.
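You can see exactly what cron provides by dumping its environment with a temporary one-minute job:

```shell
# Temporary /etc/crontab entry: dump cron's environment once a minute
* * * * * root env > /tmp/cron-env.txt
```

After a minute, running `diff <(sort /tmp/cron-env.txt) <(env | sort)` from your terminal shows every variable your interactive shell has that cron does not. Remove the entry once you have looked.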
Two fixes. First, use absolute paths for every command in your backup script:
# Will break in cron if tar is not in the default PATH
tar czf "$htmlfile" -C "$WEBDIR" .
# Explicit path — works everywhere
/usr/bin/tar czf "$htmlfile" -C "$WEBDIR" .
Second, set PATH explicitly at the top of /etc/crontab:
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
MAILTO=admin@example.com
One of our managed customers ran a backup script that called a custom-installed binary in /opt/tools. Worked fine for months when the admin ran it manually. The cron job had been failing every single night for five months before anyone noticed. Five months of zero backups. Set MAILTO in your crontab so at least someone gets notified when errors occur. Watch your cron jobs for the first few days after scheduling — do not assume success just because you saw no errors.
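Alongside MAILTO, I log the job's output to disk so failures leave a trail even when mail delivery is broken:

```shell
# /etc/crontab: append everything the backup script prints to a log
0 3 * * * root /etc/scripts/backup.sh >> /var/log/backup.log 2>&1
```

Note the tradeoff: once both streams are redirected, cron has no output left to mail, so MAILTO stays silent for this job. Redirecting only stdout (dropping the `2>&1`) keeps errors flowing to mail while normal output goes to the log.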
Test Your Restores or You Do Not Have Backups
I will say this plainly: an untested backup is not a backup. It is a hope. And hope is not a disaster recovery strategy.
After you schedule your cron job, verify three things:
- Did the job actually run? Check /var/log/syslog or /var/log/cron for execution entries.
- Did it produce files? List your backup directory and confirm file sizes are reasonable — a 0-byte tar.gz is not a backup.
- Can you actually restore from them?
That third point is where most organizations fail. Both the NIST Cybersecurity Framework and ISO 27001 require documented and tested recovery procedures. Here is how I verify tar backups at client sites:
# List archive contents without extracting
tar tzf /localbackup/web-1.tar.gz | head -20
# Test extraction to a temporary directory
mkdir /tmp/restore-test
tar xzf /localbackup/web-1.tar.gz -C /tmp/restore-test
# Compare against the original
diff -r /tmp/restore-test /var/www/html/wordpress
# Clean up
rm -rf /tmp/restore-test
Run a full restore on a separate machine or virtual environment. We make this a quarterly requirement for every client engagement. If the restore fails on a test machine, it will fail when you need it most.
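The manual steps above can be wrapped into a small function that exits nonzero on any divergence, which makes the check itself cron-schedulable (verify_restore is my name for it, not a standard tool):

```shell
# verify_restore ARCHIVE SRCDIR: extract ARCHIVE into a temp dir and
# compare it against SRCDIR; returns nonzero if anything diverges
verify_restore() {
    local archive=$1 srcdir=$2 workdir status
    workdir=$(mktemp -d) || return 1
    tar xzf "$archive" -C "$workdir" || { rm -rf "$workdir"; return 1; }
    if diff -r "$workdir" "$srcdir"; then status=0; else status=1; fi
    rm -rf "$workdir"
    return $status
}

# Example, using the naming scheme from the backup script earlier:
# verify_restore "/localbackup/web-$(date +%u).tar.gz" /var/www/html/wordpress
```

Scheduled under cron with MAILTO set, a failing check notifies you the day the backups go bad instead of the day you need them.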
Local Backups Are Not Enough
Here is my position on this, and I will not budge: if your backups and your production data live on the same machine, you do not have a backup strategy. You have a slightly more organized version of the same single point of failure.
The 3-2-1 rule says keep three copies on two different media types with one copy offsite. I push clients toward 3-2-1-1-0: add one immutable or air-gapped copy, and zero errors in your restore verification. The scp command in the script above handles the offsite copy, but for serious protection you want an offsite backup solution that provides immutable storage and versioning beyond what a simple file copy gives you.
We helped a logistics company move from local-only tar backups to a proper offsite strategy after they lost both their production server and their “backup” server to the same power surge. Same rack, same UPS, same failure domain. Their RPO went from total loss to four hours with proper offsite replication. For environments where you need backup infrastructure you fully control, running your own backup repository on a cloud VPS gives you geographic separation without handing your data to a third party.
The same trust-but-verify discipline applies to anything sensitive stored alongside your backups — the article on encryption at rest auditing for Kubernetes secrets walks through a similar verification approach for sensitive data at rest.
Start With Thirty Minutes, Verify for Years
Write the script. Schedule it in /etc/crontab for 3 AM. Watch it run for three days. Then restore from it on a different machine. If the restore works, you have a backup. If it does not, you found the problem before it found you.
The script and the cron entry take thirty minutes to set up. The restore test takes another hour. That ninety-minute investment is the difference between a recoverable incident and a career-defining disaster.
For organizations that need help designing a backup strategy that actually survives contact with reality, reach out to our team. We have built automated backup pipelines for everything from single-server WordPress sites to multi-datacenter deployments. And we test every single one of them.