# Staffeln

## Project Description

This solution provides scheduled, volume-level backups as a non-intrusive, automatic backup for OpenStack VMs.

All volumes attached to the specified VMs are backed up periodically.

File-level backup is not provided. Instead, a backed-up volume can be restored and attached to the target VM to recover any needed files. Users can perform restores themselves through Horizon or the CLI.

## Functions

### Function Overview

The solution backs up all volumes attached to VMs that carry a predefined metadata entry, for
example, `backup=yes`.
First, it gets the list of VMs in the given project that have the backup metadata, along with the
volumes attached to them, by calling the OpenStack APIs (nova-api and cinder-api). Once the
volume list is prepared, it calls the cinder-backup API to perform the backup.
After a backup succeeds, the backup time is recorded in the VM's `last-backup-time` metadata.

* *Filtering volumes:* A volume is skipped if its metadata includes the
`skip-volume-backup` flag.
* *Limitation:* The number of volumes a user can back up is limited. Once the backup
count exceeds the per-project quota, the backup job fails.
* *Naming convention:* Each backup volume is named
`{VOLUME_NAME}-{BACKUP_DATE}`.
* *Compression:* All backup volumes are compressed at the Ceph level. The compression
mode, compression algorithm and required parameters are configured by the user.
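
The selection rules above can be sketched in plain Python. This is only an illustration under stated assumptions: servers and volumes are represented as dictionaries rather than real nova or cinder objects, the field name `attached_volume_ids` is hypothetical, and the date format used in the backup name is a guess, since the text does not specify one.

```python
from datetime import datetime, timezone

BACKUP_METADATA_KEY = "backup"          # from the example `backup=yes`
BACKUP_METADATA_VALUE = "yes"
SKIP_VOLUME_KEY = "skip-volume-backup"  # skip flag named in the text


def select_volumes(servers, volumes_by_id):
    """Return volumes attached to backup-enabled servers, minus skipped ones."""
    selected = []
    seen = set()
    for server in servers:
        metadata = server.get("metadata", {})
        if metadata.get(BACKUP_METADATA_KEY) != BACKUP_METADATA_VALUE:
            continue  # VM is not marked for backup
        # `attached_volume_ids` is a hypothetical field for this sketch.
        for volume_id in server.get("attached_volume_ids", []):
            volume = volumes_by_id.get(volume_id)
            if volume is None or volume_id in seen:
                continue
            if SKIP_VOLUME_KEY in volume.get("metadata", {}):
                continue  # volume explicitly opted out
            seen.add(volume_id)
            selected.append(volume)
    return selected


def backup_name(volume_name, when):
    """Build `{VOLUME_NAME}-{BACKUP_DATE}`; the date format is an assumption."""
    return f"{volume_name}-{when.strftime('%Y-%m-%d')}"
```

In a real deployment the two input collections would come from the nova and cinder APIs, and each selected volume would then be passed to the cinder-backup API.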

### Retention

Backup volumes are removed according to the configured retention policy.
OpenStack API access policies are customized so that only the retention service, and not users,
can delete the backups.
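
As a rough sketch of an age-based retention check, the function below picks the backups older than a retention window. It assumes that the policy is a simple maximum age and that each backup is a dictionary with a timezone-aware `created_at` field; both are illustrative assumptions, not the project's actual data model.

```python
from datetime import datetime, timedelta, timezone


def expired_backups(backups, retention, now=None):
    """Return the backups whose age exceeds the retention window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - retention
    # `created_at` is a hypothetical field name for this sketch.
    return [b for b in backups if b["created_at"] < cutoff]
```

The retention service would then delete only the returned backups, using credentials that the customized access policy permits to do so.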

### Scaling

The Cinder backup service runs on a dedicated backup host and can be scaled across multiple
backup hosts.

### Notification

Once the backup is finished, the results are sent to the specified users by email, regardless of
whether it succeeded or not (the email is a single digest of all backups).

Backup result HTML template:
- Backup time
- Current quota usage (quota/used number/percentage), color-coded:
  - Quota usage <= 50%: Green
  - 50% < Quota usage < 80%: Yellow
  - Quota usage > 80%: Red
- Volume list
- Success/fail: true/false, color-coded:
  - Fail: Red
  - Success: Green
- Fail reason
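
The quota coloring can be expressed as a small helper. This sketch reads the thresholds as green up to 50% usage, yellow above 50%, and red above 80%. The list does not say how exactly 80% is colored, so treating it as yellow here is an assumption.

```python
def quota_color(used, quota):
    """Map quota usage to the color used in the notification email."""
    if quota <= 0:
        raise ValueError("quota must be positive")
    percent = 100.0 * used / quota
    if percent <= 50:
        return "green"
    if percent <= 80:  # assumption: exactly 80% still counts as yellow
        return "yellow"
    return "red"
```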

### Settings

Users can configure the following settings to control the backup process:
- Backup period
- Volume filtering tag
- Volume skip filter metadata tag
- Volume limit number
- Retention time
- Archival rules
- Compression mode, algorithm and parameters
- Notification receiver list
- Notification email HTML template
- OpenStack credentials
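
To make the shape of these settings concrete, here is a minimal sketch that groups a few of them in a Python dataclass. The field names and default values are placeholders invented for illustration. The project itself lists oslo.config as its configuration dependency, and this sketch does not attempt to reproduce its option definitions.

```python
from dataclasses import dataclass, field
from datetime import timedelta


@dataclass
class BackupSettings:
    """Hypothetical container for a subset of the settings listed above."""

    backup_period: timedelta = timedelta(days=1)    # placeholder default
    backup_metadata_key: str = "backup"             # volume filtering tag
    skip_metadata_key: str = "skip-volume-backup"   # volume skip filter tag
    volume_limit: int = 10                          # placeholder default
    retention_time: timedelta = timedelta(days=14)  # placeholder default
    notification_receivers: list = field(default_factory=list)

    def validate(self):
        """Reject values that would make the backup cycle meaningless."""
        if self.backup_period <= timedelta(0):
            raise ValueError("backup_period must be positive")
        if self.retention_time <= timedelta(0):
            raise ValueError("retention_time must be positive")
        if self.volume_limit < 1:
            raise ValueError("volume_limit must be at least 1")
```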

### User Interface

- Users can get the list of backup volumes on the Horizon cinder-backup panel. This panel
has filtering and pagination functions that are not part of the default Horizon panel.
- Users cannot delete backup volumes in the UI. The Delete Volume Backup button is disabled on
the cinder-backup panel.

## Dependencies

* openstacksdk (API calls)
* Flask (HTTP API)
* oslo.service (long-running daemon)
* pbr (using setup.cfg for build tooling)
* oslo.db (database connections)
* oslo.config (configuration files)

## Architecture

### HTTP API (staffeln-api)

This project needs a basic HTTP API. The primary reason is that when a user attempts to delete a backup, we use [oslo.policy via HTTP](https://docs.openstack.org/oslo.policy/victoria/user/plugins.html) to make sure that the backup they are attempting to delete is not an automated backup.

This API will be unauthenticated and stateless, since it simply returns the plain-text string `True` or fails with `401 Unauthorized`. Because the API is so simple, [Flask](https://flask.palletsprojects.com/en/1.1.x/) is a good fit for building it.

The flow of the HTTP call will look like the following:

1. An HTTP request is received through oslo.policy, carrying the ID of the backup being deleted.
2. The server looks up the backup by ID using the OpenStack API.
3. If the backup metadata contains `__automated_backup=True`, deny; otherwise, allow.

With that flow, we'll be able to protect automated backups from being deleted by users. To keep the architecture clean, this application will be delivered as a WSGI application so it can be hosted via something like uWSGI later.
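
The allow/deny flow can be sketched as a tiny WSGI application. To stay self-contained, this example uses only the standard library instead of Flask; in a Flask app the same decision would sit inside a route handler. Several details are assumptions: the backup ID is read from a `backup_id` query parameter (how oslo.policy actually passes it depends on the policy rule configuration), `get_backup_metadata` is a hypothetical lookup standing in for the OpenStack API call, and unknown or missing backups are denied.

```python
from urllib.parse import parse_qs

AUTOMATED_KEY = "__automated_backup"  # metadata key named in the flow


def is_deletable(backup_metadata):
    """Allow deletion only when the backup is not marked as automated."""
    return str(backup_metadata.get(AUTOMATED_KEY, "")).lower() != "true"


def make_app(get_backup_metadata):
    """Build a WSGI app around a hypothetical `get_backup_metadata(id)` lookup."""

    def app(environ, start_response):
        query = parse_qs(environ.get("QUERY_STRING", ""))
        backup_id = query.get("backup_id", [""])[0]  # assumed parameter name
        metadata = get_backup_metadata(backup_id) if backup_id else None
        if metadata is not None and is_deletable(metadata):
            start_response("200 OK", [("Content-Type", "text/plain")])
            return [b"True"]
        # Assumption: fail closed for automated, unknown, or missing backups.
        start_response("401 Unauthorized", [("Content-Type", "text/plain")])
        return [b"Deny"]

    return app
```

Because the result is a plain WSGI callable, it could be served by uWSGI or any other WSGI server without changes.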

### Daemon (staffeln-conductor)

The conductor will be an independent daemon that scans all the virtual machines (grouped by project) that are marked for automatic backups and then queues up backups for them to be executed by Cinder.

Once backups for a project are done, it should run the configured rotation policy against all the existing backup volumes and then send a notification email to the user.

The daemon should be stateful, keeping its own state in a database.
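
One way to picture the conductor's persistent state is a small queue table that records which volume backups are pending per project. The sketch below uses the standard library `sqlite3` module as a stand-in for oslo.db so that it runs on its own. The table name, column names, and status values are all hypothetical.

```python
import sqlite3


def open_state(path=":memory:"):
    """Open the state database and create the queue table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS backup_queue ("
        " volume_id TEXT PRIMARY KEY,"
        " project_id TEXT NOT NULL,"
        " status TEXT NOT NULL)"
    )
    return conn


def enqueue(conn, project_id, volume_ids):
    """Queue volumes for backup; already-queued volumes are left untouched."""
    with conn:
        conn.executemany(
            "INSERT OR IGNORE INTO backup_queue VALUES (?, ?, 'queued')",
            [(volume_id, project_id) for volume_id in volume_ids],
        )


def mark(conn, volume_id, status):
    """Record the outcome of a backup, for example 'done' or 'failed'."""
    with conn:
        conn.execute(
            "UPDATE backup_queue SET status = ? WHERE volume_id = ?",
            (status, volume_id),
        )


def project_done(conn, project_id):
    """Return True once no backups for the project are still queued."""
    row = conn.execute(
        "SELECT COUNT(*) FROM backup_queue"
        " WHERE project_id = ? AND status = 'queued'",
        (project_id,),
    ).fetchone()
    return row[0] == 0
```

When `project_done` returns true for a project, the conductor could move on to the rotation policy and the notification email for that project.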