What else is better than a photo of water being backed up - Birchville Dam

AWS Backup

AWS Backup is a service introduced a few years ago to centrally manage backups for the AWS services that deal with data persistence. These include RDS, DynamoDB, EBS, EFS, EC2, S3, and Amazon FSx. What was earlier governed, at best, by processes, architecture reviews, and audits across different services with different strategies can now be consolidated into a single service with a single Console. For example, what used to be an RDS-specific backup design can be captured in an AWS Backup Plan that aligns that RDS instance with the rest of the data persistence in a solution, without having to manually compare the backup approaches of different services.

In addition to reducing the operational overhead of implementing and managing backups for a given solution, AWS Backup also makes it easier to architect specific backup strategies on AWS. Security and audit reviews become much easier and can be conducted with more confidence because of the centralised nature of the service.

Concepts

To bring disparate services and concepts together, AWS Backup introduces a few new concepts itself.

AWS Backup Concepts

Backup Vaults

A Backup Vault is a primary concept in AWS Backup. It refers to the central data storage of the backup recovery points being taken from the storage instances of interest. Access to Backup Vaults can be managed separately from the resources that store their recovery points in the vaults, or even the principals performing the backups themselves. Both adding recovery points and restoring from them can be controlled at the vault level with vault policies.
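As an illustration of vault-level access control, the sketch below builds a vault access policy document that denies one principal the ability to delete recovery points. The role ARN and the function name are hypothetical; the resulting JSON string is the kind of value that would be passed as the `Policy` argument of boto3's `put_backup_vault_access_policy`, which is not called here.

```python
import json


def deny_recovery_point_deletion_policy(principal_arn: str) -> str:
    """Return a vault access policy (as JSON) denying recovery point deletion.

    `principal_arn` is a hypothetical IAM role or user ARN.
    """
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyRecoveryPointDeletion",
                "Effect": "Deny",
                "Principal": {"AWS": principal_arn},
                "Action": ["backup:DeleteRecoveryPoint"],
                "Resource": "*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


# Hypothetical account ID and role name, for illustration only.
policy_json = deny_recovery_point_deletion_policy(
    "arn:aws:iam::111122223333:role/ExampleOperatorRole"
)
```

Because the policy is attached to the vault rather than to the principal, it applies no matter which identity policies the role otherwise carries.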

Backup Vaults also have a locking feature that can be leveraged for compliance. It has two modes. In governance mode, a locked vault can be managed only by select principals with sufficient IAM permissions. In compliance mode, the vault is locked for a specific period of time, during which nobody can modify the vault or the lock. Additionally, Legal Holds can be applied to AWS Backup Recovery Points at a granular level so that they cannot be deleted until all the applicable Legal Holds have been released. This is useful in situations where backups need to be maintained beyond the usual retention period for external reasons.
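The difference between the two lock modes can be sketched as the request parameters for boto3's `put_backup_vault_lock_configuration`. The helper name and the vault name are hypothetical, and the sketch only builds the parameters without calling AWS. It assumes that omitting `ChangeableForDays` yields a governance-mode lock, while supplying it (at least 3 days) starts a grace period after which the lock becomes a compliance-mode lock.

```python
from typing import Optional


def vault_lock_request(
    vault_name: str,
    min_retention_days: int,
    max_retention_days: int,
    changeable_for_days: Optional[int] = None,
) -> dict:
    """Build parameters for put_backup_vault_lock_configuration.

    Without changeable_for_days the lock is a governance-mode lock.
    With it, the lock becomes immutable once the grace period ends.
    """
    if min_retention_days > max_retention_days:
        raise ValueError("min_retention_days must not exceed max_retention_days")
    request = {
        "BackupVaultName": vault_name,
        "MinRetentionDays": min_retention_days,
        "MaxRetentionDays": max_retention_days,
    }
    if changeable_for_days is not None:
        if changeable_for_days < 3:
            raise ValueError("compliance mode needs a grace period of at least 3 days")
        request["ChangeableForDays"] = changeable_for_days
    return request


# Hypothetical vault name and retention values.
governance = vault_lock_request("example-vault", 7, 365)
compliance = vault_lock_request("example-vault", 7, 365, changeable_for_days=3)
```

Either dictionary could then be unpacked into the client call, for example `client.put_backup_vault_lock_configuration(**compliance)`.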

Vaults are encrypted with a user-specified KMS key. While this can be the AWS managed, region-specific aws/backup key, it is advisable to use a KMS customer managed key (CMK) for vault encryption. This is a more secure approach to encryption in AWS in general, and some features, as described below, depend on the vault being encrypted with a CMK.

A Backup Vault is the target for Backup operations performed on the various supported resources. Depending on the resource type, the recovery point is encrypted with either the Backup Vault key or the resource's own encryption key.

Backup Plans

A Backup Plan is the actual implementation of the backup strategy for the solution. It is a policy that expresses which resources should be backed up, how frequently, and for how long the backups are retained. A Backup Plan consists of one or more Backup Rules.

A rule holds the details of a backup operation: the frequency of the operation, the backup window, the target vault for the resulting recovery point, and the retention period for that recovery point. Backups can run as often as hourly. A rule can target a single vault or several, and the secondary vaults can be in the same account, in a different account, or in a different region. In addition to the retention period of the recovery point, lifecycle details such as moving the recovery point into low-cost cold storage can also be defined here.

Backup Plans are serialised as JSON definitions.
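A minimal sketch of such a definition, written as the Python dictionary that boto3's `create_backup_plan` accepts as its `BackupPlan` argument. The plan, rule, and vault names, the destination vault ARN, and the `backup_rule` helper are all hypothetical. The check on cold storage reflects the AWS Backup requirement that the retention period be at least 90 days longer than the transition to cold storage.

```python
import json


def backup_rule(name, vault, schedule, delete_after_days,
                cold_after_days=None, copy_to_vault_arns=()):
    """Build one Backup Rule in the shape used inside a Backup Plan."""
    lifecycle = {"DeleteAfterDays": delete_after_days}
    if cold_after_days is not None:
        # Recovery points must stay in cold storage for at least 90 days.
        if delete_after_days < cold_after_days + 90:
            raise ValueError("retention must exceed cold storage transition by 90 days")
        lifecycle["MoveToColdStorageAfterDays"] = cold_after_days
    rule = {
        "RuleName": name,
        "TargetBackupVaultName": vault,
        "ScheduleExpression": schedule,   # AWS cron syntax, evaluated in UTC
        "StartWindowMinutes": 60,         # the backup window
        "CompletionWindowMinutes": 180,
        "Lifecycle": lifecycle,
    }
    if copy_to_vault_arns:
        # Secondary vaults, possibly in another account or region.
        rule["CopyActions"] = [
            {"DestinationBackupVaultArn": arn} for arn in copy_to_vault_arns
        ]
    return rule


plan = {
    "BackupPlanName": "example-daily-plan",
    "Rules": [
        backup_rule(
            "daily-0500-utc",
            "example-vault",
            "cron(0 5 ? * * *)",
            delete_after_days=365,
            cold_after_days=30,
            copy_to_vault_arns=[
                "arn:aws:backup:eu-west-1:444455556666:backup-vault:example-central-vault"
            ],
        )
    ],
}

plan_json = json.dumps(plan, indent=2)
```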

Resource Associations

After the backup operation details are specified, the resources that should be backed up can be included in the Backup Plan. This is done using a Resource Association. A Resource Association is defined either as a list of resource ARNs or as a more general, adaptable tag-based selection. By decoupling Rules and Resource Associations, AWS Backup detaches the data protection strategy from the data sources while keeping the relationship between the two simple to build.

When defining a Resource Association, a role with permissions to create backups of the target resources should also be specified. For this, a role with the AWSBackupServiceRolePolicyForBackup and AWSBackupServiceRolePolicyForRestores policies attached can be created. This role will be assumed by AWS Backup when executing the Backup Plan against the specified resources.
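The two styles of Resource Association can be sketched as the `BackupSelection` structure that boto3's `create_backup_selection` takes alongside a `BackupPlanId`. The helper names, role ARN, tag, and table ARN below are hypothetical, and nothing here calls AWS.

```python
def tag_selection(name: str, role_arn: str, tag_key: str, tag_value: str) -> dict:
    """Select every supported resource carrying the given tag."""
    return {
        "SelectionName": name,
        "IamRoleArn": role_arn,
        "ListOfTags": [
            {
                "ConditionType": "STRINGEQUALS",
                "ConditionKey": tag_key,
                "ConditionValue": tag_value,
            }
        ],
    }


def arn_selection(name: str, role_arn: str, resource_arns: list) -> dict:
    """Select resources by listing their ARNs directly."""
    return {
        "SelectionName": name,
        "IamRoleArn": role_arn,
        "Resources": list(resource_arns),
    }


# Hypothetical role that has the two AWS Backup managed policies attached.
ROLE_ARN = "arn:aws:iam::111122223333:role/ExampleBackupRole"

by_tag = tag_selection("tagged-data", ROLE_ARN, "backup", "daily")
by_arn = arn_selection(
    "orders-table",
    ROLE_ARN,
    ["arn:aws:dynamodb:eu-west-1:111122223333:table/example-orders"],
)
```

The tag-based form adapts as resources are created or retired, since anything tagged `backup=daily` is picked up without editing the plan.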

Although it is advisable to keep things simple, the same resource can be associated with multiple Backup Plans. When this is done, recovery points created by overlapping plans are retained in line with the plan with the longest retention period.

Recovery Points

Recovery Points are the result of Backup Plans being executed. They are the backups (full or incremental) of the source data. Recovery Points can be copied to multiple Vaults in the same Account or in a different Account (and, depending on the resource type, in a different Account in a different region as well).

Recovery Points that are in other accounts cannot be used to restore data on the source Account. They have to be copied over to the source account for any restoration to work, so you should be mindful of the RTO goals of your solution when architecting cross-account backups.
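The RTO impact of the copy-back step is simple arithmetic, sketched below. All durations are made-up illustrative numbers, not measurements, and the function names are hypothetical.

```python
def cross_account_restore_minutes(copy_back_minutes: float, restore_minutes: float) -> float:
    """Total time to restore from a recovery point held in another account.

    The recovery point is first copied back to the source account,
    and only then restored.
    """
    return copy_back_minutes + restore_minutes


def meets_rto(copy_back_minutes: float, restore_minutes: float, rto_minutes: float) -> bool:
    """True when the copy-back plus the restore fits within the RTO."""
    return cross_account_restore_minutes(copy_back_minutes, restore_minutes) <= rto_minutes


# Illustrative values only: a 90 minute copy and a 45 minute restore
# take 135 minutes in total, which misses a 120 minute RTO.
total = cross_account_restore_minutes(90, 45)
within_rto = meets_rto(90, 45, rto_minutes=120)
```

A restore that would fit the RTO on its own can miss it once the copy is included, which is an argument for also keeping recovery points in a vault in the source account.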

Backup Policies

The above mentioned Backup Plans are a great way to automate the backup operations within a single AWS Account. Coupled with good design and periodic architecture reviews, the same strategy can be expanded to other Accounts in the AWS Organization.

However, depending on the use case, it might be easier to define these Backup Plans at a higher level than inside each individual AWS Account's Backup service, such as against the Account itself from the AWS Organization, or at the Organizational Unit level. This is where Backup Policies come into play.

For deployments that are more unified and mostly oriented around a single project, the former Backup Plan approach can work best. The implementation of the requirements is clearly defined and visible in the Backup Console/API. Depending on the requirements, separate Backup Plans can be defined for separate goals.

For AWS Organizations that are more diverse and have different players working under the same root Node, the latter approach is better. It can enforce best practices across the AWS Accounts in the Organization without having to manage Backup Plans individually. Different Organizational Units can be used to enforce different backup strategies as needed, without relying on the Account owners to build data protection into their own architectures.

Backup Policies are a way to apply Backup Plans as Account Policies, in the same manner that Service Control Policies (SCPs) are applied to different Accounts and Organizational Units in an AWS Organization. While SCPs are a guardrail concept, Backup Policies are more an application of Backup Plans to a wider scope.

The effective Backup Plan for a given Account is the collection of Backup Plans defined in AWS Backup inside the particular Account, Backup Policies defined for the Account, and the list of Backup Policies defined in each Organizational Unit up to and including the Root Node.

This provides a way to apply a Backup Plan across an Organizational Unit according to how broad the backup strategy requirements are. For example, for an OU that is defined to contain critical business data in RDS instances, a Backup Policy can be defined so that any RDS instance that matches a given tag is backed up to the Account specific Vault and copied to a different vault in a central backup Account. By doing so, the responsibility of writing Backup Plans in those individual Accounts can be taken out of the Account owning parties while making sure the appropriate data protection strategy is implemented.
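The RDS example above could look roughly like the following Backup Policy document. This is a sketch based on the AWS Organizations backup policy syntax as I understand it (`plans`, `rules`, `selections`, and the `@@assign` operator); the plan, rule, vault, role, and tag names, the region, and the central account ID are hypothetical, and the field names should be checked against the Organizations documentation before use.

```python
import json

# Hypothetical vault in a central backup account.
CENTRAL_VAULT_ARN = "arn:aws:backup:eu-west-1:444455556666:backup-vault:central-vault"


def assign(value):
    """Wrap a value in the @@assign inheritance operator."""
    return {"@@assign": value}


backup_policy = {
    "plans": {
        "CriticalRdsPlan": {
            "regions": assign(["eu-west-1"]),
            "rules": {
                "DailyRds": {
                    "schedule_expression": assign("cron(0 5 ? * * *)"),
                    # Vault that is expected to exist in each member account.
                    "target_backup_vault_name": assign("account-local-vault"),
                    "lifecycle": {"delete_after_days": assign("35")},
                    "copy_actions": {
                        CENTRAL_VAULT_ARN: {
                            "target_backup_vault_arn": assign(CENTRAL_VAULT_ARN),
                            "lifecycle": {"delete_after_days": assign("35")},
                        }
                    },
                }
            },
            "selections": {
                "tags": {
                    "CriticalRdsData": {
                        # $account resolves to the member account the policy applies to.
                        "iam_role_arn": assign("arn:aws:iam::$account:role/ExampleBackupRole"),
                        "tag_key": assign("data-classification"),
                        "tag_value": assign(["critical"]),
                    }
                }
            },
        }
    }
}

policy_document = json.dumps(backup_policy, indent=2)
```

The selection matches on the tag alone, so limiting the policy to RDS instances depends on the tag being applied only to those instances. The resulting string is the kind of content that would be passed to the Organizations `create_policy` call with the `BACKUP_POLICY` type and then attached to the OU.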

Because Backup Policies are inherited in a similar way to SCPs, they can be partially defined at each level of specificity. Policies defined at different levels of a hierarchy with the same name are merged together. For example, a Backup Policy can be partially defined at OU1, where Rules are defined without a backup operation interval, while the child OUs, OU2 and OU3, define their own intervals to match their specific needs. Another example is adding or modifying Resource Associations in the Backup Plans, where tag selectors can be overridden to match each Account’s usage.

However, partially defining Backup Policies could easily break the effective Backup Plan for a specific Account if the collection of all the Backup Policies does not add up to a functional Backup Plan. This can fail silently, with the intended data not being backed up properly. Therefore, a better approach is to define the Backup Policies fully at the top and override only specific fields of the policy at each lower level. This provides a functional starting point for the Backup Plan.
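The risk with partial definitions can be illustrated with a deliberately simplified model of inheritance. The `merge` function below is not the algorithm AWS Organizations uses (it ignores operators other than `@@assign`); it only merges nested dictionaries with the child overriding the parent. The set of required fields is also an assumption made for the sake of the example.

```python
# Assumed minimum for a rule to be functional in this sketch.
REQUIRED_RULE_FIELDS = ("schedule_expression", "target_backup_vault_name")


def merge(parent: dict, child: dict) -> dict:
    """Simplified inheritance: nested dicts merge key by key, child values win."""
    merged = dict(parent)
    for key, value in child.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def missing_rule_fields(effective: dict) -> list:
    """List the required rule fields that no level of the hierarchy assigned."""
    problems = []
    for plan_name, plan in effective.get("plans", {}).items():
        for rule_name, rule in plan.get("rules", {}).items():
            for field in REQUIRED_RULE_FIELDS:
                if "@@assign" not in rule.get(field, {}):
                    problems.append(f"{plan_name}/{rule_name}: {field}")
    return problems


# OU1 defines the rule without an interval, as in the example above.
ou1 = {"plans": {"DailyPlan": {"rules": {"Daily": {
    "target_backup_vault_name": {"@@assign": "example-vault"},
}}}}}

# OU2 supplies its own interval; OU3 forgets to.
ou2 = {"plans": {"DailyPlan": {"rules": {"Daily": {
    "schedule_expression": {"@@assign": "cron(0 5 ? * * *)"},
}}}}}
ou3 = {}

ou2_problems = missing_rule_fields(merge(ou1, ou2))
ou3_problems = missing_rule_fields(merge(ou1, ou3))
```

In this model the Accounts under OU2 end up with a complete rule, while those under OU3 inherit a rule with no schedule. Running a check of this kind against the effective policy of each Account is one way to catch the gap before it shows up as missing backups.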

Backup Policies could also end up as fairly complex JSON definitions that are hard to debug. Ideally, Backup Policies should be divided along resource and strategy lines, which makes them much easier to handle. In general, keeping things simple applies really well to Backup Policies, since a mistake that leaves data unprotected could cost the project or business dearly.

The next article extends this topic to cross-account backups in AWS.