System Backup and Recovery SOP
Purpose
Establish a systematic approach to data backup and recovery to prevent loss, enable rapid restoration and maintain data integrity for all Neosofia IT systems.
Scope
This SOP applies to any IT system that manages Neosofia client or corporate data.
Assets in Scope
Each of the assets below will have an entry in this SOP that outlines the backup and recovery procedures Neosofia employs to protect client and corporate data.
| Data/Support Asset | Recovery Point Objective (RPO) | Retention Period (RP) | Recovery Time Objective (RTO) | OC |
|---|---|---|---|---|
| Hardware | N/A | N/A | 2 hours | N/A |
| Operating Systems | 1 day | 7 days | 1 hour | 1 full + 6 incr. |
| Virtual Machines | 1 day | 28 days | 1 hour | 1 full + 27 incr. |
| Public DNS Records | N/A | N/A | 1 hour | N/A |
| Source Code | 1 week | 25 years | 1 hour | N/A |
| System Logs | N/A | 30 days | N/A | N/A |
| Credentials | 1 hour | 1 year | 1 hour | 1 full |
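The objectives in the table can be checked mechanically. The following is a minimal Python sketch, assuming the RPO is read as the maximum tolerated age of the most recent backup. The `Objective` class, the `OBJECTIVES` mapping, and the `rpo_violated` helper are illustrative names and are not part of Neosofia's tooling; only three rows of the table are encoded.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class Objective:
    """Hypothetical record for one row of the assets table."""
    asset: str
    rpo: Optional[timedelta]  # None where the table lists N/A
    rto: Optional[timedelta]


# Values copied from the table above; only three rows shown for brevity.
OBJECTIVES = {
    "Operating Systems": Objective("Operating Systems", timedelta(days=1), timedelta(hours=1)),
    "Source Code": Objective("Source Code", timedelta(weeks=1), timedelta(hours=1)),
    "Hardware": Objective("Hardware", None, timedelta(hours=2)),
}


def rpo_violated(asset: str, last_backup: datetime, now: datetime) -> bool:
    """Return True when the newest backup is older than the asset's RPO."""
    rpo = OBJECTIVES[asset].rpo
    if rpo is None:  # N/A in the table: nothing to check
        return False
    return now - last_backup > rpo


# Example: an OS backup taken 30 hours ago exceeds the 1-day RPO.
now = datetime(2024, 1, 10, 2, 0, tzinfo=timezone.utc)
stale = rpo_violated("Operating Systems", now - timedelta(hours=30), now)
```

A check like this could run alongside system monitoring so that a missed backup is noticed before a restore is needed.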
Responsibilities
IT System Administrators will be responsible for:
- L4 Architecture, design, implementation, and execution of the procedures outlined in this document
- L3 System monitoring to determine whether restoration procedures need to be executed
- L2 Documentation of backup and restoration procedure execution as evidence for auditors
- L1 Feedback on this document
IT Managers will be responsible for:
- L4 Review of this document no less than once per year
- L4 Response to, and integration of, feedback on this document
- L3 Review of this document when new IT systems are procured or retired, to determine which backup and restoration procedures require an update
- L4 Advising and mentoring IT System Administrators in their responsibilities
Procedures
Hardware Procedures
Neosofia will maintain a 2% hardware inventory reserve to recover from hardware losses or will define procedures below to enable cloud resources to be used as a temporary replacement for system restoration.
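As a rough illustration of the 2% reserve, the sketch below assumes the reserve is measured as spare units relative to deployed units of the same hardware class, rounded up to a whole unit. The SOP does not define the denominator or the rounding rule, so both are assumptions, and the function names are hypothetical.

```python
RESERVE_PERCENT = 2  # reserve level stated in this SOP


def required_spares(deployed_units: int, percent: int = RESERVE_PERCENT) -> int:
    """Smallest whole number of spares that meets the reserve percentage."""
    if deployed_units < 0:
        raise ValueError("deployed_units must be non-negative")
    # Integer ceiling division avoids floating-point rounding surprises.
    return (deployed_units * percent + 99) // 100


def needs_procurement(deployed_units: int, spare_units: int) -> bool:
    """True when the spare stock has fallen below the reserve."""
    return spare_units < required_spares(deployed_units)
```

Under these assumptions, a fleet of 150 deployed devices needs 3 spares, and holding only 2 would trigger procurement.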
Operating System Procedures
Backup
When provisioning a new piece of hardware, the IT System Administrator runs the OS setup script, which creates an OS-level backup cron job that runs automatically at 2 AM UTC every day. The automated backup script will:
- Create a full OS-level snapshot and the on-device (USB stick) rescue media needed to restore the system in the event of a hardware failure
- Reboot the device into the rescue media's automated restoration program
- Send the restoration logs to the central log server once the system has been restored and rebooted
- If the daily OS backup procedure completes without errors, a status report is automatically sent to the central log server. If any errors occur, an email is sent to all IT System Administrators with details of the error to be remediated.
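The success and failure reporting described in the last step could be structured along the following lines. This is a minimal sketch, not the actual backup script: `run_backup`, `send_status`, and `email_admins` are hypothetical callables standing in for the snapshot tool, the central log server client, and the mail system.

```python
from typing import Callable


def run_daily_backup(
    run_backup: Callable[[], None],
    send_status: Callable[[str], None],
    email_admins: Callable[[str], None],
) -> bool:
    """Run the backup once and report the outcome.

    Returns True on success and False on failure.
    """
    try:
        run_backup()
    except Exception as exc:  # any failure must reach the administrators
        email_admins(f"OS backup failed: {exc}")
        return False
    send_status("OS backup completed without errors")
    return True
```

Passing the three actions in as callables keeps the reporting logic testable without touching real hardware. Assuming the host's cron daemon evaluates schedules in UTC, the daily 2 AM run corresponds to the cron expression `0 2 * * *`.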
The automated backup and system restoration should complete within 15 minutes in 99% of runs.
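One way to verify this target against recorded run times is to count the share of runs that finished within the limit. The sketch assumes run durations, in minutes, can be extracted from the status reports on the central log server; `meets_duration_target` is a hypothetical helper.

```python
from typing import Sequence


def meets_duration_target(
    durations_minutes: Sequence[float],
    target_minutes: float = 15.0,
    required_percent: int = 99,
) -> bool:
    """True when at least required_percent of runs finished within the target."""
    if not durations_minutes:
        raise ValueError("no runs recorded")
    within = sum(1 for d in durations_minutes if d <= target_minutes)
    # Integer comparison avoids floating-point rounding at the boundary.
    return within * 100 >= required_percent * len(durations_minutes)
```

With 100 recorded runs, one run over 15 minutes still meets the target, while two do not.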
Recovery
Upon notification of a system failure, the IT System Administrators will:
- Identify and replace defective hardware
- Boot the machine from the restoration media (USB stick)
- Allow the restoration procedure to begin automatically. If the restoration procedure requests input due to hardware changes, contact an L3 IT System Administrator or higher for guidance on appropriate inputs.
- If the restoration succeeds, confirm that the automated restoration logs were sent to the central log server. If the automated restoration process fails, contact an L4+ System Administrator to troubleshoot the error.
- Update the inventory management system and procure replacement hardware if the stock level falls below 2%.
VM Procedures
Networking Procedures
Backup
All network configurations are automatically backed up on a weekly basis by the networking equipment vendor. When a new piece of networking equipment is acquired, follow the procedures below to ensure the device is being backed up.
- Log into the networking management interface and navigate to the system backup setting
- Ensure the system backup checkbox is checked and click the back up now button
- Observe that no errors are reported upon backup and verify that the device is writing to the central log server
Restore
In the event of a networking equipment failure, follow the steps below:
- Replace the failed device with the same model
- Log into the networking management interface and navigate to the settings panel of the new device
- From the existing device configuration menu, select the failed device profile and apply it to the new device
Source Code
Backup
Whenever a pull request is merged into a protected branch, an automated script pushes the changes to a secondary SCS vendor.
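A push of this kind might be scripted as follows. The sketch assumes the secondary vendor is configured as a git remote named `secondary` (a hypothetical name) and that the script runs in an up-to-date local clone after the merge. How the merge event triggers the script is vendor-specific and is not shown.

```python
import subprocess
from typing import List


def push_command(branch: str, remote: str = "secondary") -> List[str]:
    """Build the git command that pushes one protected branch to the secondary remote."""
    return ["git", "push", remote, f"refs/heads/{branch}:refs/heads/{branch}"]


def push_to_secondary(repo_dir: str, branch: str, remote: str = "secondary") -> None:
    """Run the push; check=True raises CalledProcessError if git fails."""
    subprocess.run(push_command(branch, remote), cwd=repo_dir, check=True)
```

Building the command separately from running it makes the exact refspec easy to review and test.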
Restoration
Should the primary SCS vendor be compromised in such a way that the source files cannot be restored to their original state, initiate the restoration procedure below.
- Check out the git repository from the secondary SCS vendor
- Create a new (blank) repository in the primary SCS vendor
- Push the git repository from the secondary vendor into the newly created repo on the primary vendor
- Make a test pull request against the primary repository and merge
- Observe that the changes to the primary repo are synced to the secondary repo
- If the test succeeds, notify all members of the repository that it has been restored and that all changes should be submitted to the primary repository. If the test fails, notify your manager and troubleshoot the failures with them until they are resolved.
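The checkout and push steps map onto standard git commands. The sketch below only builds the command lists so they can be reviewed before being run. The URLs and the `restore.git` working directory are placeholders, and creating the blank repository on the primary vendor is done through that vendor's own interface and is not shown.

```python
from typing import List


def restore_commands(
    secondary_url: str,
    primary_url: str,
    workdir: str = "restore.git",
) -> List[List[str]]:
    """Commands to copy all refs from the secondary vendor to the primary vendor."""
    return [
        # Bare mirror clone: fetches all refs from the secondary vendor.
        ["git", "clone", "--mirror", secondary_url, workdir],
        # Push all refs into the new, blank repository on the primary vendor.
        ["git", "-C", workdir, "push", "--mirror", primary_url],
    ]
```

If the secondary vendor exposes vendor-specific refs (for example pull request refs), `--mirror` will try to push those as well and the primary vendor may reject them. Pushing with `--all` and then `--tags` is an alternative in that case.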
Log File Procedures
All log files are pushed to an immutable central log server and retained for 30 days.