As part of a disaster recovery plan, you can protect production data on your GitHub Enterprise instance by configuring automated backups.
In this guide:
- About GitHub Enterprise Backup Utilities
- Prerequisites
- Installing GitHub Enterprise Backup Utilities
- Scheduling a backup
- Restoring a backup
About GitHub Enterprise Backup Utilities
GitHub Enterprise Backup Utilities is a backup system you install on a separate host, which takes backup snapshots of your GitHub Enterprise instance at regular intervals over a secure SSH network connection. You can use a snapshot to restore an existing GitHub Enterprise instance to a previous state from the backup host.
Only data added since the last snapshot will transfer over the network and occupy additional physical storage space. To minimize performance impact, backups are performed online under the lowest CPU/IO priority. You do not need to schedule a maintenance window to perform a backup.
For more detailed information on features, requirements, and advanced usage, see the GitHub Enterprise Backup Utilities README.md file.
Prerequisites
To use GitHub Enterprise Backup Utilities, you must have a Linux or Unix host system separate from your GitHub Enterprise instance.
You can also integrate GitHub Enterprise Backup Utilities into an existing environment for long-term permanent storage of critical data.
We recommend that the backup host and your GitHub Enterprise instance be geographically distant from each other. This ensures that backups are available for recovery in the event of a major disaster or network outage at the primary site.
Physical storage requirements will vary based on Git repository disk usage and expected growth patterns:
Hardware | Recommendation |
---|---|
vCPUs | 2 |
Memory | 2 GB |
Storage | Five times the primary instance's allocated storage |
More resources may be required depending on your usage, such as user activity and selected integrations.
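As a rough illustration of the storage recommendation in the table, the sketch below multiplies the primary instance's allocated storage by five. The function name and the 200 GB figure are hypothetical, chosen only for the example; actual needs depend on your usage.

```shell
# Hypothetical helper: estimate backup host storage (in GB) from the
# primary instance's allocated storage, using the five-times guideline.
recommended_backup_storage_gb() {
  primary_gb=$1
  echo $(( primary_gb * 5 ))
}

# Example: a primary instance with 200 GB of allocated storage.
recommended_backup_storage_gb 200
```

Treat the result as a starting point rather than a hard limit, since growth patterns vary between installations.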
Installing GitHub Enterprise Backup Utilities
- Download the latest GitHub Enterprise Backup Utilities release and extract the file with the tar command.

      tar -xzvf /path/to/github-backup-utils-vMAJOR.MINOR.PATCH.tar.gz

- Copy the included backup.config-example file to backup.config and open it in an editor.
- Set the GHE_HOSTNAME value to your primary GitHub Enterprise instance's hostname or IP address.
- Set the GHE_DATA_DIR value to the filesystem location where you want to store backup snapshots.
- Open your primary instance's settings page at https://[hostname]/setup/settings and add the backup host's SSH key to the list of authorized SSH keys. For more information, see Accessing the administrative shell (SSH).
- Verify SSH connectivity with your GitHub Enterprise instance with the ghe-host-check command.

      bin/ghe-host-check

- To create an initial full backup, run the ghe-backup command.

      bin/ghe-backup
For more detailed information on advanced usage, see the GitHub Enterprise Backup Utilities README.md file.
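The two configuration values named in the installation steps might look like the following in backup.config. This is a minimal sketch: the hostname and the data directory path are placeholders invented for the example, not values from this guide.

```shell
# backup.config (sketch): both values below are placeholders.

# Hostname or IP address of the primary GitHub Enterprise instance.
GHE_HOSTNAME="github.example.com"

# Filesystem location on the backup host where snapshots are stored.
GHE_DATA_DIR="/data/ghe-backups"
```

The backup.config-example file shipped with the release is the better reference for the full set of settings.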
Scheduling a backup
You can schedule regular backups on the backup host using the cron(8) command or a similar command scheduling service. The configured backup frequency dictates the worst-case recovery point objective (RPO) in your recovery plan. For example, if you have scheduled the backup to run every day at midnight, you could lose up to 24 hours of data in a disaster scenario. We recommend starting with an hourly backup schedule, which limits the worst case to one hour of data loss if the primary site data is destroyed.
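An hourly schedule could be expressed as a crontab entry like the one printed by the sketch below. The install directory /opt/backup-utils, the log file location, and the helper function name are assumptions made for the example; adjust them to wherever you extracted the release.

```shell
# Hypothetical helper: print a crontab line that runs ghe-backup at the
# top of every hour and appends its output to a log file.
# The install directory is passed in because its location is an assumption.
hourly_backup_cron_line() {
  install_dir=$1
  printf '0 * * * * %s/bin/ghe-backup -v >> %s/backup.log 2>&1\n' \
    "$install_dir" "$install_dir"
}

# Print the entry for an assumed install location.
hourly_backup_cron_line /opt/backup-utils
```

You would add the printed line to the backup user's crontab, for example with crontab -e. Changing the first two fields changes the interval and therefore the worst-case RPO.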
If backup attempts overlap, the ghe-backup command will abort with an error message indicating that another backup is already running. If this occurs, we recommend decreasing the frequency of your scheduled backups. For more information, see the "Scheduling backups" section of the GitHub Enterprise Backup Utilities README.md file.
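Because an aborted run exits with an error, it can help to record each run's outcome so that overlapping or failed backups are noticed. The wrapper below is a sketch of that idea, not part of the Backup Utilities; the function name and log format are hypothetical.

```shell
# Hypothetical wrapper: run a backup command, append its output to a log,
# and record whether it succeeded or failed.
# $1 = path to the backup command (for example bin/ghe-backup)
# $2 = log file to append to
run_backup() {
  backup_cmd=$1
  log_file=$2
  if "$backup_cmd" >>"$log_file" 2>&1; then
    echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) backup succeeded" >>"$log_file"
  else
    status=$?
    echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) backup failed with exit status $status" >>"$log_file"
    return "$status"
  fi
}
```

A scheduled job could call run_backup bin/ghe-backup backup.log, and repeated failure entries in the log would suggest that the schedule is too frequent.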
Restoring a backup
In the event of a prolonged outage or catastrophic event at the primary site, you can restore your GitHub Enterprise instance by provisioning another GitHub Enterprise appliance and performing a restore from the backup host. You must add the backup host's SSH key to the target GitHub Enterprise appliance as an authorized SSH key before restoring an appliance.
If you are restoring to a GitHub Enterprise 2.11 versioned appliance from a 2.9 or 2.10 versioned backup snapshot, you may need to run a migration script on the original appliance first. For more information, see "Migrating audit logs to GitHub Enterprise 2.11."
To restore your GitHub Enterprise instance from the last successful snapshot, use the ghe-restore command. You should see output similar to this:

    ghe-restore -c 169.154.1.1
    Checking for leaked keys in the backup snapshot that is being restored ...
    * No leaked keys found
    Connect 169.154.1.1:122 OK (v2.9.0)

    WARNING: All data on GitHub Enterprise appliance 169.154.1.1 (v2.9.0)
             will be overwritten with data from snapshot 20170329T150710.
    Please verify that this is the correct restore host before continuing.
    Type 'yes' to continue: yes

    Starting restore of 169.154.1.1:122 from snapshot 20170329T150710
    # ...output truncated
    Completed restore of 169.154.1.1:122 from snapshot 20170329T150710
    Visit https://169.154.1.1/setup/settings to review appliance configuration.
Note: The network settings are excluded from the backup snapshot. You must manually configure the network on the target GitHub Enterprise appliance as required for your environment.
You can use these additional options with the ghe-restore command:
- The -c flag overwrites the settings, certificate, and license data on the target host even if it is already configured. Omit this flag if you are setting up a staging instance for testing purposes and you wish to retain the existing configuration on the target. For more information, see the "Using backup and restore commands" section of the GitHub Enterprise Backup Utilities README.md file.
- The -s flag allows you to select a different backup snapshot.
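To choose a snapshot for the -s flag, you first need its identifier. The sketch below assumes that snapshots are stored as timestamp-named directories (such as 20170329T150710, the form shown in the restore output) directly under the GHE_DATA_DIR location; that layout and the function name are assumptions, so check your own data directory before relying on it.

```shell
# Hypothetical helper: list snapshot identifiers found in a data directory,
# oldest first. Assumes each snapshot is a directory named like
# 20170329T150710; other entries (for example symlinks with other names)
# are ignored because they do not match the pattern.
list_snapshots() {
  data_dir=$1
  for entry in "$data_dir"/[0-9]*T[0-9]*; do
    [ -d "$entry" ] && basename "$entry"
  done
  return 0
}

# Example restore of a chosen snapshot (not run here):
#   ghe-restore -s 20170329T150710 169.154.1.1
```

Because the names are timestamps, the shell's sorted glob expansion lists them in chronological order, so the last line of output is the most recent snapshot.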