This tutorial shows you how to set up and test the rclone_jobber backup script. The final product will perform local and remote backups automatically.
The tutorial is written for Linux, and most of the content is also useful for macOS and Windows 10 WSL.
Both rclone and rclone_jobber.sh are command line tools. This tutorial assumes that the reader has basic command-line skills.
The example job scripts make backups for a home PC. You can adapt rclone_jobber.sh and the example job scripts to suit your own backup needs.
Backup trivia: the verb “back up” is two words, whereas the noun “backup” is one word.
Disclaimer: Some of the scripts used in this tutorial delete or overwrite data. Unless you know exactly what you are doing, please work this tutorial on a spare computer that doesn’t contain data you don’t want to lose. This tutorial and associated scripts are distributed without any warranty. I am not responsible for any lost data.
- About this rclone_jobber backup script tutorial
- Install rclone
- Install rclone_jobber
- Set up environment path variables
- Set up a test_data directory
- Take rclone_jobber.sh for a test drive
- Backup job and rclone_jobber.sh parameters
- Filter rules
- Select a cloud storage provider
- Configure a remote
- Configure a crypt
- Schedule backup jobs to run automatically
- Logging options
- Back up and restore
- Data recovery plan
- License
Install the latest rclone from https://rclone.org/downloads/
Download or clone the rclone_jobber repository.
The “examples” directory contains all the scripts used by this tutorial.
Most tutorials use <value> notation to indicate substitution. This tutorial uses environment path variables to automate the substitutions. Thus, all the examples in this tutorial can run on your system without you having to edit the scripts.
The example scripts use the following environment path variables:
- HOME
- rclone_jobber
- usb
- remote (explained in “Configure a remote” section)
The above “rclone_jobber” path is the location of the rclone_jobber repository you downloaded.
Tutorial examples use the “usb” path variable to demonstrate backup to local mass storage. The “usb” path doesn’t actually have to be a USB drive.
If you’re on Linux, add these lines to your ~/.profile (or ~/.bash_profile) file, but with your system paths:
export rclone_jobber="/home/wolfv/rclone_jobber"
export usb="/media/wolfv/USB_device_name"
Reload your .profile:
$ source ~/.profile
Now the shell will substitute $rclone_jobber and $usb.
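To verify that the variables are set, echo them (output shown for this tutorial's example paths):
$ echo $rclone_jobber $usb
/home/wolfv/rclone_jobber /media/wolfv/USB_device_name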
This tutorial’s example scripts back up a small test directory.
The setup_test_data_directory.sh script sets up the small source directory. It recursively deletes the ~/test_rclone_data directory and rebuilds a fresh copy. To set up the test directory from the command line:
$ $rclone_jobber/examples/setup_test_data_directory.sh
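To inspect the test data the script created, list it recursively (the exact contents depend on the repository version):
$ find ~/test_rclone_data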
Once you have the path variables and test_data directory set up, you can take rclone_jobber for a test drive.
Here is a minimal backup-job script for rclone_jobber:
#!/usr/bin/env sh
source="$HOME/test_rclone_data"
dest="$usb/test_rclone_backup"
$rclone_jobber/rclone_jobber.sh "$source" "$dest"
The last line calls rclone_jobber.sh with the source and dest arguments.
Open examples/job_backup_to_USB_minimal.sh in your favorite text editor.
Set options to --dry-run:
options="--dry-run"
Run the backup job:
$ $rclone_jobber/examples/job_backup_to_USB_minimal.sh
The backup, $dest, was not created because of --dry-run.
Important: A bad backup job can cause data loss.
First test with the --dry-run flag to see exactly what would be copied and deleted.
Here are some more things you can try with rclone_jobber:
- Open rclone_jobber.log (rclone_jobber.log is in the same directory as rclone_jobber.sh).
- Run the backup job again, this time without --dry-run.
- Inspect changes in the destination files.
- Change some files in source:
- delete a file
- edit a file
- add a file
- move a file
And run the backup job again.
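For example, a short hypothetical sequence (direc1/f1 is taken from the directory trees shown later; your test data may differ):
$ echo "new text" >> ~/test_rclone_data/direc1/f1
$ touch ~/test_rclone_data/direc1/new_file
$ $rclone_jobber/examples/job_backup_to_USB_minimal.sh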
Each backup job contains arguments and a call to rclone_jobber.sh. Here is an example backup job with all the rclone_jobber.sh arguments defined:
#!/usr/bin/env sh
source="$HOME/test_rclone_data"
dest="$usb/test_rclone_backup"
move_old_files_to="dated_files"
options="--filter-from=$rclone_jobber/examples/filter_rules --checksum --dry-run"
monitoring_URL="https://monitor.io/12345678-1234-1234-1234-1234567890ab"
$rclone_jobber/rclone_jobber.sh "$source" "$dest" "$move_old_files_to" "$options" "$(basename $0)" "$monitoring_URL"
The last line calls rclone_jobber.sh.
source and dest are required; the remaining arguments can be an empty string "" or undefined.
The next sections describe rclone_jobber.sh parameters:
- source
- dest
- move_old_files_to
- options
- job_name
- monitoring_URL
source is the directory to back up.
Example source argument:
source="/home/wolfv"
dest is the directory to back up to.
Data is backed up to destination=$dest/last_snapshot.
Example dest argument for local file-system data storage:
dest="/media/wolfv/USB/wolfv_backup"
Example dest for remote data storage:
dest="onedrive_wolfv_backup_crypt:"
If your path contains a space, then you must use extra quotes.
For a Linux/macOS source argument:
source="'/home/wolf v'"
For a Windows source argument:
source='"/home/wolf v"'
Details at https://rclone.org/docs/#quoting-and-the-shell.
When a file is changed or deleted, the old version already in the backup is either moved or removed. The move_old_files_to parameter specifies what happens to old files.
Argument to move deleted or changed files to a dated directory:
move_old_files_to="dated_directory"
Old files are moved to the dated directory in their original hierarchy. This makes it easy to restore a deleted sub-directory, and convenient to manually delete directories from a previous year.
backup
├── archive                        <<<<<<<< archive contains dated directories
│   └── 2018
│       ├── 2018-02-22_14:00:14
│       │   └── direc1
│       │       └── f1
│       └── 2018-02-22_15:00:14    <<<<<<<< old files were moved here on dated_directory's date
│           └── direc1
│               └── f1             <<<<<<<< old version of file f1
└── last_snapshot                  <<<<<<<< last_snapshot directory contains the most recent backup
    └── direc1
        └── f1
Argument to move old files to old_files directory, and append move date to file names:
move_old_files_to="dated_files"
Old files are moved to the old_files directory in their original hierarchy. This makes it easy to browse a file’s history, and restore a particular version of a file.
backup
├── last_snapshot                     <<<<<<<< last_snapshot directory contains the most recent backup
│   └── direc1
│       └── f1
└── old_files                         <<<<<<<< old_files directory contains old dated files
    └── direc1
        ├── f1_2018-02-22_14:00:14
        └── f1_2018-02-22_15:00:14    <<<<<<<< old version of file f1 moved here on appended date
Argument to remove old files from backup:
move_old_files_to=""
Only the most recent version of each file remains in the backup. This can save a little storage space, and is useful for making an extra backup before an OS upgrade or clean install.
backup
└── last_snapshot    <<<<<<<< last_snapshot directory contains the most recent backup
    └── direc1
        └── f1       <<<<<<<< old versions of file f1 were overwritten or removed
The options argument can contain any number of rclone options, except for these three:
--backup-dir
--suffix
--log-file
Those options are set in rclone_jobber.sh.
Example options argument containing four rclone options:
options="--filter-from=filter_rules --checksum --log-level=INFO --dry-run"
Rclone options used in this tutorial are:
- --filter-from (discussed in the “Filter rules” section)
- --checksum
- --log-level
- --dry-run
The job_name argument specifies the job’s file name:
job_name="$(basename $0)"
The shell command “$(basename $0)” will fill in the job’s file name for you.
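For example, if the job script's path is /home/wolfv/rclone_jobber/examples/job_backup_to_USB.sh:
$ basename /home/wolfv/rclone_jobber/examples/job_backup_to_USB.sh
job_backup_to_USB.sh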
rclone_jobber.sh guards against a job_name running again before the previous run has finished.
rclone_jobber.sh prints job_name in warnings and log entries.
The monitoring_URL argument specifies a ping URL for a cron-monitoring service; it is optional.
This is redundant if the remote data-storage provider offers an integrated monitoring service.
Example monitoring_URL:
monitoring_URL="https://monitor.io/12345678-1234-1234-1234-1234567890ab"
Every time rclone_jobber.sh completes a job without error, it pings the monitoring_URL. If the cron monitoring service hasn’t been pinged within a set amount of time, then it sends you an email alert. Many cron monitoring services offer free plans.
No two jobs should share the same monitoring_URL.
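A monitoring ping is typically just an HTTP request to the URL, so you can test a monitoring_URL by hand with curl (using the placeholder URL from the example above):
$ curl -fsS https://monitor.io/12345678-1234-1234-1234-1234567890ab
rclone_jobber.sh sends the ping for you on every successful run; the manual request only checks that the URL works.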
Rclone has a sophisticated set of filter rules. Filter rules tell rclone which files to include or exclude.
Open the examples/filter_rules_excld file. Each rule starts with a “+ ” or “- “, followed by a pattern.
- a leading “+” means include if the pattern matches
- a leading “-” means exclude if the pattern matches
For each file in source, filter rules are processed in the order that they are defined. If the matcher fails to find a match after testing all the filter rules, then the path is included. Read the examples/filter_rules_excld file to see how this works.
Lines starting with ‘#’ are comments. Comments at the end of a rule are not supported because file names can contain a ‘#’.
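For illustration, a small hypothetical filter-rules file (not the tutorial's filter_rules_excld) could look like this:
# rules are processed top to bottom; the first match wins
# exclude editor backup files anywhere in source
- *~
# exclude the contents of any .cache directory
- .cache/**
# paths that match no rule are included by default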
The rclone_jobber options argument specifies the filter_rules_excld file like this:
options="--filter-from filter_rules_excld"
To see the example filter_rules_excld file in action, run:
$ $rclone_jobber/examples/clear_USB_test_backup.sh
$ $rclone_jobber/examples/job_backup_to_USB_excld.sh
Rclone uses cloud-storage providers to back up data to an off-site storage system. Off-site storage systems are safe from local disaster.
All rclone cloud-storage providers are listed on https://rclone.org/. Some of the cloud-storage-providers’ features are listed in two tables on https://rclone.org/overview/. Most cloud-storage providers offer small storage capacities for free. Pick one. You can always try other cloud-storage providers after you finish this tutorial.
Once you have an account with your chosen cloud-storage provider, the next step is to configure its remote.
There is one page of configuration instructions for each cloud-storage provider. Links to the configuration instructions are at https://rclone.org/docs/#configure and https://rclone.org/. Follow the instructions to configure your remote now.
$ rclone config
Rclone stores all the configuration information you entered. The default location is ~/.config/rclone/rclone.conf. The remote’s password is stored in the rclone.conf file, so be careful about giving people access to it.
To list all your rclone remotes:
$ rclone listremotes
Here is how to run the tutorial’s example remote backup job on Linux (for tutorial scripts only, don’t do this for production). Add this line to your ~/.profile file, but with your remote path:
export remote="onedrive_test_rclone_backup"
and reload .profile:
$ source ~/.profile
To use a tutorial example script as a template for production backups, edit the tutorial scripts: replace occurrences of “${remote}” with your remote path.
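For example, a hypothetical GNU sed one-liner to substitute a concrete remote name into a copied script (adjust the remote name and file name to your setup):
$ sed -i 's/\${remote}/onedrive_wolfv_backup_crypt/g' my_job_backup_to_remote.sh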
To test your remote, run:
$ $rclone_jobber/examples/job_backup_to_remote.sh
“crypt” is a kind of remote that:
- encrypts and decrypts the data stream for an underlying remote
- performs encryption and decryption on the client side
- uses the same command interface as other kinds of remotes
Instructions for configuring a crypt remote are at https://rclone.org/crypt/ and https://rclone.org/docs/#configuration-encryption.
When configuring a crypt remote, rclone will ask you to give it a name. Put some thought into naming your remotes.
name> myremote_myfolder_crypt
And then rclone will ask for the underlying remote. This example will encrypt myfolder in myremote:
remote> myremote:myfolder
You can always rename a remote later via rclone config.
To list all your rclone remotes:
$ rclone listremotes
Most remote cloud-storage providers allow you to view your directory names and file names in a web browser. But that’s not very useful if rclone encrypted the directory and file names. Use rclone to browse encrypted directory and file names.
To list directories in remote:
$ rclone lsd remote:
$ rclone lsd remote:path
To list objects and directories of path (requires rclone-v1.40 or later):
$ rclone lsf remote:path
To list top-level files in path:
$ rclone ls remote:path --max-depth 1
To list all files in path recursively:
$ rclone ls remote:path
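For example, assuming the crypt remote named earlier and that a backup job has already run, this lists the decrypted names under last_snapshot:
$ rclone lsf myremote_myfolder_crypt:last_snapshot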
examples/job_backup_to_remote.sh uses a remote, which can be of type crypt.
To test your crypt remote, set the path variable as described in the “Configure a remote” section, and then run:
$ $rclone_jobber/examples/job_backup_to_remote.sh
Most cloud-storage providers have a 254-character path-length limit. Crypt limits encrypted paths to 151 characters with some cloud-storage providers (this is a known crypt issue). If a path is too long, rclone returns this error:
Failed to copy: invalidRequest: pathIsTooLong: Path exceeds maximum length
There are three workarounds:
- turn off “encrypt directory names” in rclone config (file content can still be encrypted)
- shorten your paths
- Long Path Tool (I have not tried this)
rclone crypt’s file-name and directory-name encryption doesn’t work with Backblaze b2 lifecycle. This is because:
- b2 lifecycle appends a date to the end of file names
- b2 doesn’t strip off the appended date before passing the file name back to rclone
So rclone can’t decrypt the file names.
There are three workarounds:
- turn off “encrypt file names” and “encrypt directory names” in rclone config (file content can still be encrypted)
- turn off b2 lifecycle, and:
  - set move_old_files_to="dated_directory" in the backup job
  - manually delete old files at end of life
- use a different remote data-storage provider
After you schedule backup jobs, you will have an automated backup system with this workflow:
- a job scheduler calls a backup job script
- the job script calls rclone_jobber.sh
- rclone_jobber.sh calls rclone
- rclone consults your filter rules, connects to a backup storage, and uploads modified files
Schedule your backup jobs in your favorite job scheduler.
The following example schedules jobs with cron (cron is a popular job scheduler installed on Linux). The first line runs a local job every hour on the hour. The second line runs a remote job every hour, 30 minutes past the hour. The third line runs at 3:18 and 15:18 every day.
$ crontab -e
00 * * * * /home/wolfv/rclone_jobber/job_backup_to_USB.sh
30 * * * * /home/wolfv/rclone_jobber/job_backup_to_remote.sh
18 3,15 * * * /home/wolfv/rclone_jobber/job_backup_recovery_plan_to_remote.sh
The initial backup will take a long time (subsequent backups are much shorter). If your computer goes to sleep while a backup is in progress, the backup will not finish. Consider disabling sleep on your computer for the initial backup. On Linux Gnome desktop:
right click > Settings > Power > Automatic suspend: Off
rclone_jobber.sh default behavior places rclone_jobber.log in the same directory as rclone_jobber.sh. Read this section if you want the log in a different location.
Logging options are set in rclone_jobber.sh, headed by “# set log” comments. To change logging behavior, search for “# set log” and change the default values.
Logging options are described in the next 5 sections.
To send more information to the log, use the send_to_log function in rclone_jobber.sh:
# set logging to verbose
send_to_log "$timestamp $job_name"
send_to_log "$cmd"
Additionally, you can set --log-level in the job’s “options” parameter.
In rclone_jobber.sh, the log_file variable contains the log file’s path. The default behavior places rclone_jobber.log in the same directory as rclone_jobber.sh:
# set log_file path
path="$(realpath "$0")"      #path of this script
log_file="${path%.*}.log"    #replace path extension with "log"
You can change log_file to any path you like.
To set the rclone_jobber log location to /var/log, create the log file and give it the user’s ownership and read-write permission. In this example, rclone_jobber.log ownership is given to wolfv:
$ sudo touch /var/log/rclone_jobber.log
$ sudo chown wolfv /var/log/rclone_jobber.log
$ sudo chmod 0666 /var/log/rclone_jobber.log
$ sudo ls -l /var/log/rclone_jobber.log
-rw-rw-rw-. 1 wolfv root 19 Mar 21 13:58 /var/log/rclone_jobber.log
In rclone_jobber.sh, set the new log_file path:
# set log_file path
log_file="/var/log/rclone_jobber.log"
Over time a log file can grow to unwieldy size. The logrotate utility can automatically archive the current log, start a fresh log, and delete older logs.
To set up logrotate, set the log_file path to /var/log/rclone_jobber.log (described in the previous section). Then create a logrotate configuration file in /etc/logrotate.d:
$ sudo vi /etc/logrotate.d/rclone_jobber
And paste this text into the logrotate configuration file:
/var/log/rclone_jobber.log {
    monthly
    rotate 2
    size 1M
    compress
    delaycompress
}
More options are listed in the man page:
$ man logrotate
Execute a dry-run to see what logrotate would do:
$ logrotate -d /etc/logrotate.d/rclone_jobber
Linux systems with systemd can send all log output to the systemd journal. To do so, make these two changes to the rclone_jobber.sh script:
- change log_option to --syslog
# set log_option for rclone
log_option="--syslog"
- send msg to the systemd journal (sending msg to log_file is optional, and is commented out in this example)
# set log - send msg to log
#echo "$msg" >> "$log_file"                            #send msg to log_file
printf "$msg" | systemd-cat -t RCLONE_JOBBER -p info   #send msg to systemd journal
The following system uses two backup jobs with complementary attributes (this is how I back up my home PC). The latest snapshot can easily be restored from either backup.
examples/job_backup_to_USB.sh has attributes that make it convenient to browse file history:
- local storage (for fast navigation)
- move_old_files_to="dated_files" (will group old versions of a file together)
- not encrypted (easy to browse files in a file manager; unencrypted local storage is OK if the storage is safe from theft, and useful if the remote storage password is lost)
- schedule hourly, on the hour (this assumes the USB drive is always plugged in and mounted)
examples/job_backup_to_remote.sh has attributes that make it secure, and easy to restore a deleted sub-directory:
- remote storage (off-site is safe from on-site disaster)
- move_old_files_to="dated_directory" (easy to restore a deleted sub-directory e.g. Documents)
- encrypted (please keep your password in a safe place)
- schedule hourly, 30 min past the hour (for a backup every 30 minutes when combined with job_backup_to_USB.sh)
In addition, job_backup_recovery_plan_to_remote.sh stores recovery-plan files off-site unencrypted. Recovery-plan files are listed in the “Data recovery plan” section.
Back up to both local and remote locations in case disaster destroys one. If the Internet connection fails, the local backup is still made.
To restore data, copy files from backup to destination.
You can use cp (shell command) to restore data from local unencrypted backup.
To copy a single file from local backup:
$ cp -p local_backup_path dest_path
To copy the last_snapshot directory from local backup:
$ cp -a local_backup/last_snapshot dest_path
Use rclone to restore data from remote or encrypted backup.
To copy a single file from remote backup:
$ rclone copy remote:source_path dest:dest_path
To copy the last_snapshot directory from remote backup, use a restore script such as examples/job_restore_last_snapshot.sh (demonstrated in the next section).
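Under the hood, a full restore is an rclone copy from the backup. For example, with an illustrative destination path:
$ rclone copy remote:last_snapshot $HOME/last_snapshot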
The following commands test the example backup and restore jobs. They test your entire data recovery system end to end, testing both the data backup and data recovery together. Don’t worry, the tutorial’s environment is set up to make testing painless.
Clear and setup test directories in preparation for a new test run:
$ $rclone_jobber/examples/clear_USB_test_backup.sh
$ $rclone_jobber/examples/clear_remote_test_backup.sh
$ $rclone_jobber/examples/setup_test_data_directory.sh
Back up data:
$ $rclone_jobber/examples/job_backup_to_USB.sh
$ $rclone_jobber/examples/job_backup_to_remote.sh
In job_restore_last_snapshot.sh, uncomment the source variable you want to restore data from. Then restore the data:
$ $rclone_jobber/examples/job_restore_last_snapshot.sh
Verify that the files were faithfully restored:
$ diff -r $HOME/test_rclone_data/direc0 $HOME/last_snapshot/direc0
Notice that rclone does not back up empty directories.
Follow a similar test procedure when practicing your recovery plan, but with real data.
Monitor your backups to make sure that data is actually being backed up. Do not rely solely on warning messages or rclone_jobber.log; they do not prove that data was saved to the destination.
Manually run a checklist script once per month, similar to this monitor_backups.sh:
#!/bin/bash

echo ""
echo ">>>> Check recently changed file time in local backup:"
ls -l /run/media/wolfv/big_stick/wolfv_backup/last_snapshot/DATA/Documents/tasks/tasks.org

echo ""
echo ">>>> Check recently changed file time in remote backup:"
rclone lsl onedrive_wolfv_backup_crypt:last_snapshot/DATA/Documents/tasks --max-depth 1

echo ""
echo ">>>> Check last log time:"
tail -5 /var/log/rclone_jobber.log

echo ""
echo ">>>> Check my Monthly Report emailed from my monitoring service."

echo ""
echo ">>>> Check space usage and available space."
A data recovery plan is a documented process to recover and protect data in the event of a disaster. The data recovery plan presented here also includes re-installing the operating system.
Example data recovery plan:
- Retrieve recovery-plan files from an on-site or off-site location
- notes for installing OS
- recovery plan (this file)
- job_restore_last_snapshot.sh
- ~/.config/rclone/rclone.conf
- Install operating system
- Install rclone
- Restore ~/.config/rclone/rclone.conf
- Edit source variable in job_restore_last_snapshot.sh, and then run job_restore_last_snapshot.sh
The rclone.conf configuration file contains the encryption key for backup. Keep it in a secure location. I keep my backup rclone.conf in a password manager (LastPass). The other recovery-plan files (listed in item 1.) are not encrypted so that they can be accessed before rclone is installed. With this setup, all I need to bootstrap the recovery process is a web browser and my LastPass master password.
Schedule the backup of your backup recovery plan. This ensures that your backup recovery-plan files are always up-to-date. Do not encrypt the recovery-plan files so that they can be accessed before installing rclone. For each backup location, place the recovery-plan files in a directory to be backed up.
- If a backup is not encrypted, then the recovery-plan files will be accessible in the backup.
- If a backup is encrypted, create an unencrypted backup job to the same underlying remote.
Like this example:
- job_backup_recovery_plan_to_remote.sh
- filter_rules_recovery_plan
- and schedule job_backup_recovery_plan_to_remote.sh to run automatically
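A hypothetical sketch of such a job, following the pattern of the earlier backup jobs (the real script ships in the examples directory; the paths and filter rules below are illustrative):
#!/usr/bin/env sh
# back up recovery-plan files, unencrypted, to the underlying remote
source="$HOME"
dest="${remote}:recovery_plan"
options="--filter-from=$rclone_jobber/examples/filter_rules_recovery_plan"
$rclone_jobber/rclone_jobber.sh "$source" "$dest" "" "$options" "$(basename $0)"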
Practice the recovery plan. Start from scratch with a blank environment (or use a different location on the current machine). You’ll run into snags, and that is the point. Work out the snags BEFORE data is lost. If you have enough disk space, restore all your data to a different directory, and then use diff to verify the accuracy of the restored data.
Example annual recovery-plan checklist:
- review your recovery plan
- make sure the recovery-plan files are still accessible and up-to-date (listed in previous section)
- on-site copy
- off-site copy
- practice restoring data with the small test directory, using these scripts from $rclone_jobber/examples:
- setup_test_data_directory.sh
- job_backup_to_USB.sh
- job_backup_to_remote.sh
- delete the ~/test_rclone_data directory
- job_restore_last_snapshot.sh
rclone_jobber_tutorial.org by Wolfram Volpi is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Based on a work at https://github.com/wolfv6/rclone_jobber.
Permissions beyond the scope of this license may be available at https://github.com/wolfv6/rclone_jobber/issues.
rclone_jobber is not affiliated with rclone.