
Data Backups


Problem to Be Solved

More than anything else, it is important that your data be safe, which means that you must back it up. For example, you may use tape to back up data to a separate location. However, tape backup carries the cost of changing and storing tapes, and these operations are difficult to automate. You can buy expensive equipment to provide some degree of automation, but even then tapes inherently have limited capacity, and complete automation remains difficult.

Explanation of the Cloud Solution/Pattern

The AWS Cloud lets you use the limitless capacity of "Internet storage" (also known as "Web Storage"), doing so safely and with relatively little expense.

In the AWS Cloud, we often talk about a "snapshot," which is a backup copy of your data at a given point in time. The AWS Cloud lets you easily copy the data of a virtual server (including the operating system), along with other data, to Internet storage, so taking snapshots at regular intervals is simple. You can take a snapshot with one click in the management console, or you can take one through an API, which means a program can do it for you. Because Internet storage frees you from worrying about capacity, you can automate the whole backup process by having a program take snapshots on a regular schedule.
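As a minimal sketch of taking a snapshot through the API, the function below starts a snapshot of one EBS volume. It assumes `ec2` is a boto3 EC2 client (for example, `boto3.client("ec2")`) with credentials already configured; the function name, the description text, and the `Backup` tag key are illustrative choices, not part of the pattern.

```python
from datetime import datetime, timezone


def take_snapshot(ec2, volume_id, now=None):
    """Start a point-in-time snapshot of one EBS volume and return its ID.

    `ec2` is assumed to be a boto3 EC2 client, or any object that exposes
    the same create_snapshot method.
    """
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    response = ec2.create_snapshot(
        VolumeId=volume_id,
        Description=f"Scheduled backup of {volume_id} at {stamp}",
        TagSpecifications=[
            {
                "ResourceType": "snapshot",
                # "Backup" is a hypothetical tag key, used only so that
                # scheduled snapshots are easy to find later.
                "Tags": [{"Key": "Backup", "Value": "scheduled"}],
            }
        ],
    )
    return response["SnapshotId"]
```

Calling this function from a scheduler (cron, for instance) would give the regular, unattended backups described above.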

When verifying a program update or creating a temporary test environment, you need to build the environment from the data as it existed at a specific point in time. In this case, you need to copy not just the data, but the OS as well. A snapshot is a good fit here too, because it copies the whole disk, operating system included.


Implementation

Amazon Elastic Block Store (EBS), the virtual block storage in AWS, has a snapshot function, and this pattern uses it to make backups. When you take a snapshot, it is stored in the Amazon Simple Storage Service (S3) object storage, which is designed for 99.999999999% durability.

When the EBS snapshot function is used, all of the data on the EBS volume is copied to S3. A snapshot that has been stored in S3 can be restored as a new EBS volume. Even if data on the original volume is lost or corrupted, you can recover the data as it was when the snapshot was taken.
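Restoring from a snapshot can be sketched as follows. This again assumes `ec2` is a boto3 EC2 client; the function names and the default device name `/dev/sdf` are illustrative assumptions.

```python
def restore_volume(ec2, snapshot_id, availability_zone):
    """Create a new EBS volume from a snapshot and return the volume ID.

    The volume has to be created in the same Availability Zone as the
    instance that will use it.
    """
    response = ec2.create_volume(
        SnapshotId=snapshot_id,
        AvailabilityZone=availability_zone,
    )
    return response["VolumeId"]


def attach_restored_volume(ec2, volume_id, instance_id, device="/dev/sdf"):
    """Wait until the new volume is available, then attach it to an instance.

    The default device name is an assumption; use one that is free on the
    target instance.
    """
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])
    ec2.attach_volume(VolumeId=volume_id, InstanceId=instance_id, Device=device)
```

After attaching, the volume still has to be mounted inside the operating system before applications can read the recovered data.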

When you use an EBS volume as a data disk, you can back up your data at any time by taking a snapshot. You can take as many new backups as necessary, whenever necessary, without having to worry about storage capacity.

When you use an EBS volume as a boot disk, you can copy the disk together with its operating system and store the copy as an Amazon Machine Image (AMI). You can then launch a new EC2 instance from that image.
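A rough sketch of the boot-disk case, assuming `ec2` is a boto3 EC2 client and the instance boots from EBS. The function names and the default instance type are hypothetical and chosen only for illustration.

```python
def create_machine_image(ec2, instance_id, name):
    """Create an AMI from an EBS-backed instance and return the image ID.

    With NoReboot=False the instance is rebooted while the image is taken,
    which helps the file system on the boot disk reach a consistent state.
    """
    response = ec2.create_image(InstanceId=instance_id, Name=name, NoReboot=False)
    return response["ImageId"]


def launch_from_image(ec2, image_id, instance_type="t3.micro"):
    """Launch one new EC2 instance from an AMI and return its instance ID.

    The default instance type is an assumption made for this sketch.
    """
    response = ec2.run_instances(
        ImageId=image_id,
        InstanceType=instance_type,
        MinCount=1,
        MaxCount=1,
    )
    return response["Instances"][0]["InstanceId"]
```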




Benefits

  • Backups can be taken under program control. That is, you can automate the process rather than making backups manually.
  • You can use S3, which has high durability, as the backup destination.
  • You can back up the entire contents of an EBS volume and use the backup immediately as a new EBS volume. This makes recovery easy in the event of a failure.
  • You can back up not just user data but the operating system as well. Because a backup that includes the OS is stored as an AMI, you can launch new EC2 instances from it.
  • You can take snapshots at specific moments (for example, after replacing an application or after updating data) and keep multiple generations without having to worry about storage capacity. This lets you rebuild the environment easily after a failure or a problem and return to the environment as it was at any of those points in time.
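To illustrate returning to "any given point in time" when several generations exist, the sketch below picks the newest completed snapshot taken at or before a chosen moment. It assumes `ec2` is a boto3 EC2 client and relies on the `SnapshotId`, `StartTime`, and `State` fields that `describe_snapshots` returns; the function names are hypothetical.

```python
from datetime import datetime


def latest_snapshot_before(snapshots, moment: datetime):
    """Return the newest completed snapshot started at or before `moment`.

    `snapshots` is a list of dicts shaped like the entries returned by
    describe_snapshots. Returns None when no generation qualifies.
    """
    candidates = [
        s for s in snapshots
        if s["State"] == "completed" and s["StartTime"] <= moment
    ]
    return max(candidates, key=lambda s: s["StartTime"], default=None)


def find_generation(ec2, volume_id, moment: datetime):
    """Look up the snapshots of one volume and pick the generation to restore.

    This reads a single page of results; an account with many snapshots
    would need to paginate.
    """
    response = ec2.describe_snapshots(
        OwnerIds=["self"],
        Filters=[{"Name": "volume-id", "Values": [volume_id]}],
    )
    return latest_snapshot_before(response["Snapshots"], moment)
```

boto3 returns `StartTime` as a timezone-aware value, so `moment` should be timezone-aware as well when this is used against the real service.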


Cautions

  • You must maintain data consistency when taking snapshots. When you take a snapshot of a mounted EBS volume, make sure the volume is in a logically consistent state first, for example by flushing the file system cache (EXT or NTFS) and letting application transactions complete.
  • Typically, the smaller the boot disk, the faster the virtual server starts. Periodic disk checks (fsck in Linux) also take longer on a larger disk.
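One possible way to reach a consistent state on Linux is to freeze the file system for the instant the snapshot is started. The sketch below assumes the util-linux `fsfreeze` command is available, that the script runs with root privileges on the instance, that the mount point is not the root file system, and that `ec2` is a boto3 EC2 client. It is only one technique among several, it does not cover NTFS, and application-level transactions still have to be quiesced separately.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def frozen_filesystem(mount_point, run=subprocess.run):
    """Flush and suspend writes to a file system while the block executes.

    `run` is injectable so the command runner can be replaced in tests.
    """
    run(["fsfreeze", "--freeze", mount_point], check=True)
    try:
        yield
    finally:
        # Always thaw, even if starting the snapshot failed.
        run(["fsfreeze", "--unfreeze", mount_point], check=True)


def consistent_snapshot(ec2, volume_id, mount_point, run=subprocess.run):
    """Start a snapshot of `volume_id` while its file system is frozen."""
    with frozen_filesystem(mount_point, run=run):
        response = ec2.create_snapshot(
            VolumeId=volume_id,
            Description=f"Consistent snapshot of {mount_point}",
        )
    return response["SnapshotId"]
```

The file system is thawed as soon as the snapshot request returns, because the snapshot captures the volume as of the moment it was started; the copy to S3 continues in the background.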


Other

  • You may want to put the boot part and the data part on separate EBS volumes, because you will probably want to back up the data part more often than the boot part.

Backup of data


One of IT's primary goals is to ensure that data is never lost. To protect against data loss, a number of backup approaches are used to "secure" user data. Prior to the cloud, organizations would typically use backup software to copy data to multiple tiers of storage, including disk arrays in alternate data centers, and would duplicate backups to expensive robotic tape libraries. While individual tape drives are relatively inexpensive, they fail frequently and provide much slower access than disk-based backups. Organizations typically also send tapes to a third-party vaulting service for long-term archival. These services are expensive, and transporting physical media increases the odds that tapes will be lost in transit, exposing the organization to "data spillage" that could result in fines and upset customers.

Description of the pattern in the Cloud

In the cloud, Internet storage (also known as Web storage) offers virtually unlimited capacity that is safe and relatively inexpensive. With a single click in the management screen, you can save the data of a virtual server, along with other data, to Internet storage, and it is easy to do so at regular intervals. This backup method is called a snapshot. Snapshots can be taken through an API as well as from the management screen, so the process can be driven by a program; that is, it can be automated. Because Internet storage frees you from worrying about capacity, a program that takes snapshots on a regular basis can fully automate backups. There is also a strong need to create environments from the data as it stood at a particular point in time, for example when building a temporary environment to verify updates or to test a program. This requires copying the OS as well as the data, and because a snapshot duplicates the OS too, the cloud solves this problem easily as well.


Amazon EBS, the virtual storage in AWS, provides a snapshot feature. When you create a snapshot of an EBS volume, the data on the volume at that moment is captured as a whole and stored in S3. S3 is object storage designed for 99.999999999% durability. A snapshot stored in S3 can be restored as a new EBS volume, so even if data on the volume is lost or corrupted, you can restore it as of the time the snapshot was taken. Taking a snapshot of a data disk saves its contents to S3, which makes for a very robust backup, and you can create a new EBS volume from any snapshot whenever you need one. A snapshot of an EBS volume that contains the OS boot area can be registered as an Amazon Machine Image (AMI), from which you can start a new EC2 instance.



  • Backups, including the OS area, can be automated.
  • Amazon S3, which has high durability, can be used as the backup destination.
  • An entire data disk can be backed up easily. If the disk is damaged, you can quickly create a new EBS volume from a snapshot, so recovery from an EBS failure is easy.
  • A volume can be registered as an AMI, so a backup can include the OS boot area as well as the data.
  • You can take a snapshot of a particular state (for example, after replacing an application or after updating data) and keep as many snapshots as necessary. This makes it easy to reproduce the environment after a defect or failure and to return to the environment as it was at any of those points in time.


Note that when you take an EBS snapshot, consistency at the level of the file system on the volume (such as NTFS or EXT3) and of the applications running on it is your own responsibility. Taking a snapshot is atomic, but if you take one while the EBS volume is mounted, you must first bring the volume to a logically consistent state, for example by flushing the file system cache and completing application transactions. In general, the smaller the boot disk, the faster the virtual server starts. Periodic disk checks (fsck on Linux) also take less time on a smaller disk.


Because the data portion usually needs to be backed up more frequently than the boot portion, it is a good idea to put the boot and data portions on separate virtual disks.
