NYU AWS EC2 Snapshot Overview
There is an operational requirement to provide AWS-hosted workloads with point-in-time, snapshot-level backup protection in virtual environments. This standard provides a basic level of protection against corruption, failure, or inadvertent deletion of a running instance. Implementing it will enhance our operational capabilities and build confidence in the delivery of EC2 cloud services to our customers.
This document covers EBS volume backups only. Some instances may require additional backup methods. Database instances in particular may still need mechanisms such as RMAN to achieve optimum RPO/RTO and to ensure database consistency.
EC2 Recovery Objectives
The default recovery objectives outlined below are dictated by budgetary constraints. If an individual application has additional requirements, deviations from the standard may be requested and are subject to review based on NYU business criticality.
- Recovery Point Objective (RPO) – maximum of 24 hours
- Recovery Time Objective (RTO) – maximum of 2 hours
The RTO covers restoration of the EC2 instance only and does not necessarily include the time needed to restore the service. A successful restore depends on a consistent snapshot.
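As an illustration of the RPO above, the following minimal sketch checks whether the most recent snapshot of a volume is still within the 24-hour window. The function name `meets_rpo` and the way the snapshot timestamp is obtained are assumptions made for this example, not part of the standard.

```python
from datetime import datetime, timedelta, timezone

# Default RPO from this standard: at most 24 hours of data loss.
RPO = timedelta(hours=24)


def meets_rpo(latest_snapshot_time, now=None):
    """Return True if the newest snapshot is no older than the RPO.

    latest_snapshot_time: timezone-aware datetime of the newest snapshot
    (hypothetical input; how it is looked up is out of scope here).
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - latest_snapshot_time <= RPO
```

A check like this could run after the daily schedule to flag volumes whose newest snapshot has aged past the objective.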
Lifecycle Policy Details
Note: This lifecycle policy currently exists as a beta pilot in both Dev and Prod.
Description
- Naming convention – [aws account name]-backup-[period]-retention
- Example – dev-backup-2-day-retention
- Syntax
- All characters are strictly lower case
- Remove whitespace from values
- Separate values with dashes
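The naming rules above can be sketched as a small helper. This is only an illustration: the function names are hypothetical, and rendering the period as "[n]-day" is an assumption drawn from the single example given.

```python
import re


def normalize(value):
    """Lower-case a value and remove all whitespace, per the syntax rules."""
    return re.sub(r"\s+", "", str(value).lower())


def policy_description(account_name, retention_days):
    """Build [aws account name]-backup-[period]-retention.

    Assumption: the period is written as '<n>-day', as in the
    example 'dev-backup-2-day-retention'.
    """
    period = f"{int(retention_days)}-day"
    return "-".join([normalize(account_name), "backup", period, "retention"])
```

For example, `policy_description("Dev", 2)` yields `dev-backup-2-day-retention`.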
Schedule Name
- Naming convention – [frequency]-[time]-utc
- Example – every24hour-5am-utc (i.e. daily at 05:00 UTC)
- Syntax
- All characters are strictly lower case
- Remove whitespace from values
- Separate values with dashes
- All times in UTC
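A schedule name following this convention could be generated as in the sketch below. The function name is hypothetical, and the 12-hour "am/pm" rendering for hours other than the documented 5am example is an assumption.

```python
def schedule_name(interval_hours, hour_utc):
    """Build [frequency]-[time]-utc, e.g. every24hour-5am-utc.

    interval_hours: hours between snapshots (24 for daily).
    hour_utc: start hour in UTC, 0-23.
    """
    if not 0 <= hour_utc <= 23:
        raise ValueError("hour_utc must be between 0 and 23")
    suffix = "am" if hour_utc < 12 else "pm"
    hour12 = hour_utc % 12 or 12  # 0 -> 12am, 12 -> 12pm
    return f"every{int(interval_hours)}hour-{hour12}{suffix}-utc"
```

Calling `schedule_name(24, 5)` reproduces the documented example, `every24hour-5am-utc`.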
Retention Rule
- Minimum of 2 days
- Maximum of 30 days
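The retention bounds can be enforced before a policy is created. The sketch below validates the retention period and assembles a dictionary shaped like the `PolicyDetails` argument of Amazon Data Lifecycle Manager's `create_lifecycle_policy` call. That this standard is implemented with Data Lifecycle Manager is an assumption, as are the function names; the daily 05:00 UTC schedule and the backup=yes target tag are taken from this document.

```python
MIN_RETENTION_DAYS = 2   # minimum from this standard
MAX_RETENTION_DAYS = 30  # maximum from this standard


def is_valid_retention(days):
    """Return True if the retention period falls within the allowed range."""
    return MIN_RETENTION_DAYS <= days <= MAX_RETENTION_DAYS


def policy_details(retention_days, name="every24hour-5am-utc"):
    """Build a PolicyDetails-style dict (assumed DLM shape) for EBS volumes."""
    if not is_valid_retention(retention_days):
        raise ValueError(
            f"retention must be {MIN_RETENTION_DAYS}-{MAX_RETENTION_DAYS} days"
        )
    return {
        "ResourceTypes": ["VOLUME"],
        # Only volumes tagged backup=yes are snapshotted.
        "TargetTags": [{"Key": "backup", "Value": "yes"}],
        "Schedules": [
            {
                "Name": name,
                # Daily at 05:00 UTC.
                "CreateRule": {
                    "Interval": 24,
                    "IntervalUnit": "HOURS",
                    "Times": ["05:00"],
                },
                # Age-based retention in days.
                "RetainRule": {"Interval": retention_days, "IntervalUnit": "DAYS"},
                "CopyTags": True,
            }
        ],
    }
```

Rejecting out-of-range values up front keeps a mistyped retention period from silently producing a policy that violates the standard.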
Tagging Requirements
EBS volumes must be tagged in order to be backed up.
- Create new tag key “backup”
- Set backup tag value to “yes”
- Apply backup tag to all volumes requiring daily snapshots
Please see NYU Cloud Tagging Strategy for full details on tagging syntax.
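The tagging steps above could be scripted roughly as follows. This is a sketch, not a prescribed tool: the function names are hypothetical, and `ec2_client` is assumed to be an object such as `boto3.client("ec2")` that exposes `create_tags`. Note that AWS tag keys and values are case-sensitive, so "Backup" or "Yes" would not match.

```python
BACKUP_TAG = {"Key": "backup", "Value": "yes"}


def is_backup_enabled(volume):
    """Return True if a volume description carries the backup=yes tag.

    volume: dict shaped like an entry of describe_volumes()["Volumes"].
    """
    return any(
        tag.get("Key") == BACKUP_TAG["Key"] and tag.get("Value") == BACKUP_TAG["Value"]
        for tag in volume.get("Tags", [])
    )


def tag_for_backup(ec2_client, volume_ids):
    """Apply backup=yes to the given EBS volume IDs."""
    volume_ids = list(volume_ids)
    if volume_ids:  # skip the API call when there is nothing to tag
        ec2_client.create_tags(Resources=volume_ids, Tags=[dict(BACKUP_TAG)])
```

Typical use would be to list volumes, keep those for which `is_backup_enabled` is false and that require daily snapshots, and pass their IDs to `tag_for_backup`.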
Note: The current iteration is binary (i.e. backup=yes). Future iterations may allow the backup tag to specify a more specific backup policy (e.g. backup = 30 day or backup = 15 day), since policies may evolve to meet the needs of different applications, services and clients.