Amazon EBS
Amazon EBS provides durable, block-level storage volumes that you can attach to a running instance. An EBS volume persists independently of the running life of an instance. You can use Amazon EBS as primary storage for data that requires frequent and granular updates.
Amazon EBS volume data is replicated across multiple servers within an Availability Zone to prevent the loss of data from the failure of any single component. A volume can only be attached to instances in the same Availability Zone; it cannot be used across Availability Zones.
With Amazon EBS Elastic Volumes, you can increase the volume size, change the volume type, or adjust the performance of your EBS volumes.
Volume Types
SSD (solid state drive) volumes are optimized for transactional workloads involving frequent read/write operations with small I/O size, where the dominant performance attribute is IOPS.
HDD (hard disk drive) volumes are optimized for large streaming workloads where the dominant performance attribute is throughput. Effective throughput for HDD volumes is always limited by the smaller of the volume's throughput and the instance's throughput.
Previous generation (Magnetic) volumes are hard disk drives that can be used for workloads with small datasets where data is accessed infrequently and performance is not of primary importance. This volume type is outdated and not recommended for new workloads.
SSD volumes handle small or random I/O much more efficiently than HDD volumes.
General Purpose SSD
Provides a balance of price and performance. AWS recommends these volumes for most workloads. These volumes are ideal for use cases such as boot volumes, medium-size single-instance databases, and development and test environments.
I/O credits represent the available bandwidth that your gp2 volume can use to burst large amounts of I/O when more than the baseline performance is needed. The more credits your volume has for I/O, the more time it can burst beyond its baseline performance level.
Each volume receives an initial I/O credit balance of 5.4 million I/O credits. Volumes earn I/O credits at the baseline performance rate of 3 IOPS per GiB of volume size. When your volume uses fewer I/O credits than it earns in a second, unused I/O credits are added to the I/O credit balance. When your volume requires more than the baseline performance I/O level, it draws on I/O credits in the credit balance to burst to the required performance level, up to a maximum of 3,000 IOPS.
The maximum I/O credit balance for a volume is equal to the initial credit balance (5.4 million I/O credits).
If your gp2 volume uses up all of its I/O credit balance, the maximum IOPS performance of the volume remains at the baseline IOPS performance level, and the volume's maximum throughput is reduced to the baseline IOPS multiplied by the maximum I/O size (up to a maximum of 250 MiB/s). When I/O demand drops below the baseline level and unused credits are added to the I/O credit balance, the maximum IOPS performance of the volume again exceeds the baseline.
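To make the credit model concrete, the burst duration can be estimated from the numbers above: the 5.4 million credit balance drains at the burst rate minus the baseline rate (credits are earned back while bursting). A minimal sketch, assuming a 100 GiB volume as the example:

```python
# Estimate how long a gp2 volume can sustain a 3,000 IOPS burst.
# Assumed example: a 100 GiB volume; the constants come from the text above.
volume_size_gib = 100
baseline_iops = 3 * volume_size_gib        # 3 IOPS per GiB (ignoring gp2's 100 IOPS floor)
burst_iops = 3000
credit_balance = 5_400_000                 # initial and maximum I/O credit balance

# Credits are spent at the burst rate but earned back at the baseline rate.
net_drain_per_second = burst_iops - baseline_iops
burst_seconds = credit_balance / net_drain_per_second
print(f"Baseline: {baseline_iops} IOPS, burst lasts ~{burst_seconds / 60:.0f} minutes")
# Baseline: 300 IOPS, burst lasts ~33 minutes
```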
Assuming V = volume size in GiB, I = I/O size in KiB, R = IOPS per GiB, and T = throughput, the throughput is:
T = V × R × I
The smallest volume size that achieves the maximum throughput is therefore:
V = T / (R × I)
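As a worked example of this formula (a sketch assuming the 250 MiB/s gp2 throughput ceiling and the 256 KiB maximum I/O size discussed later in this section):

```python
import math

# Smallest gp2 volume size that can sustain the maximum throughput.
T = 250 * 1024      # target throughput in KiB/s (250 MiB/s)
I = 256             # maximum I/O size in KiB
R = 3               # baseline IOPS per GiB for gp2

V = math.ceil(T / (R * I))   # V = T / (R * I), rounded up to a whole GiB
print(f"Minimum volume size: {V} GiB")   # 334 GiB
```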
Provisioned IOPS SSD
Provides high performance for mission-critical, low-latency, I/O-intensive, or high-throughput workloads. It is particularly suitable for database workloads that are sensitive to storage performance and consistency.
Provisioned IOPS SSD volumes use a consistent IOPS rate.
The maximum ratio of provisioned IOPS to requested volume size (in GiB) is 50:1 for io1 volumes, and 500:1 for io2 volumes.
Throughput Optimized HDD
A low-cost HDD designed for frequently accessed, throughput-intensive workloads. It provides low-cost magnetic storage that defines performance in terms of throughput rather than IOPS. This volume type is a good fit for large, sequential workloads such as Amazon EMR, ETL, data warehouses, and log processing.
Like gp2, st1 uses a burst-bucket model for performance. Volume size determines the baseline throughput (40 MiB/s per TiB, with a maximum of 500 MiB/s) of your volume, which is the rate at which the volume accumulates throughput credits. Volume size also determines the burst throughput of your volume, which is the rate at which you can spend credits when they are available. Larger volumes have higher baseline and burst throughput. The more credits your volume has, the longer it can drive I/O at the burst level.
Burst throughput scales at 250 MiB/s per TiB, up to a maximum of 500 MiB/s; the bucket fills with credits at 40 MiB/s per TiB, and it can hold up to 1 TiB worth of credits.
Baseline throughput = (volume size in TiB) × (credit accumulation rate per TiB)
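For example, the baseline and burst rates of an st1 volume scale with its size. A small sketch using the per-TiB rates above (the 2 TiB size is an assumed example, and the 500 MiB/s caps are the maximums stated in the text):

```python
# Baseline and burst throughput for an st1 volume of a given size.
size_tib = 2.0
baseline = min(40 * size_tib, 500)   # 40 MiB/s per TiB, capped at 500 MiB/s
burst = min(250 * size_tib, 500)     # 250 MiB/s per TiB, capped at 500 MiB/s
print(f"{size_tib} TiB st1: baseline {baseline} MiB/s, burst {burst} MiB/s")
# 2.0 TiB st1: baseline 80.0 MiB/s, burst 500.0 MiB/s
```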
Cold HDD
The lowest-cost HDD design for large, sequential cold-data workloads.
sc1 uses a burst-bucket model similar to st1. Burst throughput scales at 80 MiB/s per TiB, the bucket fills with credits at 12 MiB/s per TiB, and it can hold up to 1 TiB worth of credits.
Due to per-instance and per-volume throughput limits, a full volume scan on a Throughput Optimized HDD volume is much faster than on a Cold HDD volume. Therefore, if you have a throughput-oriented workload that needs to complete scans quickly (up to 500 MiB/s), or requires several full volume scans a day, use st1.
Optimize EBS performance
There is a relationship between the maximum performance of your EBS volumes, the size and number of I/O operations, and the time it takes for each action to complete. Each of these factors affects the others, and different applications are more sensitive to one factor or another.
Modify I/O characteristics
IOPS - input/output operations per second. When you create an SSD-backed volume provisioned for 3,000 IOPS, you can transfer up to 3,000 I/O operations of data per second, with throughput determined by I/O size.
The operations are measured in KiB. Maximum I/O size for each operation is capped at 256 KiB for SSD volumes and 1,024 KiB for HDD volumes.
Throughput - the amount of data transferred to or from the volume per second, measured in MiB/s.
The throughput limit of gp2 volumes is between 128 MiB/s and 250 MiB/s, depending on the volume size.
When small I/O operations are physically contiguous, Amazon EBS attempts to merge them into a single I/O operation up to the maximum size. For example, 8 contiguous I/O operations at 32 KiB each will be counted as 1 operation (8×32=256). However, 8 random non-contiguous I/O operations at 32 KiB each will be counted as 8 operations. For smaller I/O operations, you may see a higher-than-provisioned IOPS value as measured from inside your instance. This is because the instance operating system merges small sequential I/O operations into a larger operation before passing them to Amazon EBS. If your workload uses small or random I/Os, you may experience a lower throughput than you expect. This is because we count each random, non-sequential I/O toward the total IOPS count, which can cause you to hit the volume's IOPS limit sooner than expected.
For SSD-backed volumes, if your I/O size is very large, you may experience a smaller number of IOPS than you provisioned because you are hitting the throughput limit of the volume. For example, a gp2 volume under 1,000 GiB with burst credits available has an IOPS limit of 3,000 and a volume throughput limit of 250 MiB/s. If you are using a 256 KiB I/O size, your volume reaches its throughput limit at 1,000 IOPS (1,000 x 256 KiB = 250 MiB). For smaller I/O sizes (such as 16 KiB), this same volume can sustain 3,000 IOPS because the throughput (3,000 x 16 KiB = 48 MiB) is well below 250 MiB/s.
Volume queue length - the number of pending I/O requests for a device.
Latency - the true end-to-end client time of an I/O operation, which refers to the time elapsed between sending an I/O to EBS and receiving an acknowledgement from EBS that the I/O read or write is complete. Consistently driving more IOPS to a volume than it has available can cause increased I/O latency.
Optimal queue length varies for each workload, depending on your particular application's sensitivity to IOPS and latency. If your workload is not delivering enough I/O requests to fully use the performance available to your EBS volume, then your volume might not deliver the IOPS or throughput that you have provisioned.
Transaction-intensive applications are sensitive to increased I/O latency and are well-suited for SSD-backed volumes. You can maintain high IOPS while keeping latency down by maintaining a low queue length and a high number of IOPS available to the volume.
Throughput-intensive applications are less sensitive to increased I/O latency, and are well-suited for HDD-backed volumes. You can maintain high throughput to HDD-backed volumes by maintaining a high queue length when performing large, sequential I/O.
Initialization of EBS
Empty EBS volumes receive their maximum performance the moment that they are created and do not require initialization (formerly known as pre-warming). For volumes that were created from snapshots, the storage blocks must be pulled down from Amazon S3 and written to the volume before you can access them. This preliminary action takes time and can cause a significant increase in the latency of I/O operations the first time each block is accessed.
You can avoid this performance hit in either of two ways:
1. Force the immediate initialization of the entire volume. Use dd or fio to read every block prior to putting the volume into production.
2. Enable fast snapshot restore on a snapshot to ensure that the EBS volumes created from it are fully initialized (see the sketch below).
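For the second option, fast snapshot restore can be enabled through the API. A minimal boto3 sketch; the Region, Availability Zone, and snapshot ID are placeholders:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Enable fast snapshot restore so that volumes created from this snapshot
# are fully initialized and deliver full performance immediately.
response = ec2.enable_fast_snapshot_restores(
    AvailabilityZones=["us-east-1a"],
    SourceSnapshotIds=["snap-0123456789abcdef0"],  # placeholder snapshot ID
)
print(response["Successful"])
```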
RAID
RAID (Redundant Array of Inexpensive Disks) is a data storage virtualization technology that combines multiple physical disk drive components into one or more logical units for the purposes of data redundancy, performance improvement, or both.
RAID 0 is used when I/O performance is more important than fault tolerance. Some instance types can drive more I/O throughput than what you can provision for a single EBS volume. You can join multiple volumes together in a RAID 0 array to use the available bandwidth for these instances.
RAID 1 is used when fault tolerance is more important than I/O performance. A RAID 1 array offers a "mirror" of your data for extra redundancy.
If you want to back up the data on the EBS volumes in a RAID array using snapshots, you must ensure that the snapshots are consistent. This is because the snapshots of these volumes are created independently. To restore EBS volumes in a RAID array from snapshots that are out of sync would degrade the integrity of the array. You can use EBS multi-volume snapshots to take point-in-time, data coordinated, and crash-consistent snapshots across your RAID array. With EBS multi-volume snapshots, you do not have to stop your instance to coordinate between volumes.
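A minimal boto3 sketch of taking a crash-consistent, multi-volume snapshot of the EBS volumes attached to an instance in a RAID array (the instance ID is a placeholder):

```python
import boto3

ec2 = boto3.client("ec2")

# Create point-in-time, crash-consistent snapshots of every EBS volume
# attached to the instance, without stopping the instance.
response = ec2.create_snapshots(
    Description="RAID array backup",
    InstanceSpecification={
        "InstanceId": "i-0123456789abcdef0",  # placeholder instance ID
        "ExcludeBootVolume": True,            # snapshot only the data volumes
    },
)
for snap in response["Snapshots"]:
    print(snap["SnapshotId"], snap["VolumeId"])
```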
Other actions to improve performance
- Use EBS-optimized instances to separate Amazon EBS I/O from other network traffic between your instance and your EBS volumes. EBS-optimized instances deliver dedicated bandwidth to Amazon EBS. You can enable optimization for an instance at launch. You can also enable or disable optimization for an existing instance by modifying its Amazon EBS-optimized instance attribute; if the instance is running, you must stop it first (see the sketch after this list).
- Increase the read-ahead of your block devices for high-throughput, read-heavy sequential workloads. If your workload consists mostly of small, random I/Os, this setting will actually degrade your performance. This is a per-block-device setting that should only be applied to your HDD volumes.
- Use a modern Linux kernel with support for indirect descriptors.
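For the first item above, the Amazon EBS-optimized attribute of an existing instance can be toggled with boto3. A minimal sketch, assuming the instance is safe to stop (the instance ID is a placeholder):

```python
import boto3

ec2 = boto3.client("ec2")
instance_id = "i-0123456789abcdef0"  # placeholder instance ID

# The instance must be stopped before the attribute can be changed.
ec2.stop_instances(InstanceIds=[instance_id])
ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

# Enable dedicated bandwidth between the instance and its EBS volumes.
ec2.modify_instance_attribute(InstanceId=instance_id, EbsOptimized={"Value": True})
ec2.start_instances(InstanceIds=[instance_id])
```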
EBS Snapshots
Snapshot contains all of the information that is needed to restore your data (from the moment when the snapshot was taken) to a new EBS volume. Snapshots are incremental backups, which means that only the blocks on the device that have changed after your most recent snapshot are saved. The new snapshots will make a reference to the previous snapshot for blocks that have not been changed. When you delete a snapshot, only the data unique to that snapshot is removed.
After you create a snapshot of an EBS volume, you can use it to create new volumes in the same Region. You can also copy snapshots across Regions, which makes it possible to use them in multiple Regions.
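A minimal boto3 sketch of copying a snapshot to another Region (the Region names and snapshot ID are placeholders); the call is made against a client in the destination Region:

```python
import boto3

# Copy a snapshot from us-east-1 into us-west-2; the client is created
# in the destination Region.
ec2_west = boto3.client("ec2", region_name="us-west-2")

copy = ec2_west.copy_snapshot(
    SourceRegion="us-east-1",
    SourceSnapshotId="snap-0123456789abcdef0",  # placeholder snapshot ID
    Description="Cross-Region copy for DR",
)
print(copy["SnapshotId"])
```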
Amazon EBS encryption feature
When you create an encrypted EBS volume and attach it to a supported instance type, the following types of data are encrypted:
- data stored at rest inside the volume
- all data IO moving between the volume and the instance
- all snapshots created from the volume
- all volumes created from those snapshots
The encryption occurs on the servers that host EC2 instances, ensuring the security of both data-at-rest and data-in-transit between an instance and its attached EBS storage.
Amazon EBS encryption uses AWS Key Management Service (AWS KMS) symmetric customer master keys (CMKs) when creating encrypted volumes and snapshots. The key must be symmetric; Amazon EBS does not support asymmetric CMKs.
You cannot change the CMK that is associated with an existing snapshot or encrypted volume. EBS encrypts your volume with a data key using the industry-standard AES-256 algorithm and encrypted with your CMK. Your data key is stored on-disk with your encrypted data, but not before EBS encrypts it with your CMK. Your data key never appears on disk in plaintext. The same data key is shared by snapshots of the volume and any subsequent volumes created from those snapshots.
When you attach an encrypted volume to an instance, Amazon EC2 sends a decrypt request to AWS KMS. AWS sends the decrypted data key to Amazon EC2. Then Amazon EC2 uses the plaintext data key in hypervisor memory to encrypt disk I/O to the volume. The plaintext data key persists in memory as long as the volume is attached to the instance.
You can expect the same IOPS performance on encrypted volumes as on unencrypted volumes, with a minimal effect on latency. You can access encrypted volumes the same way that you access unencrypted volumes.
Encryption by default is a Region-specific setting. If you enable it for a Region, you cannot disable it for individual volumes or snapshots in that Region.
When migrating servers using AWS Server Migration Service (SMS), do not turn on encryption by default.
When you have access to both an encrypted and unencrypted volume, you can freely transfer data between them. EC2 carries out the encryption and decryption operations transparently.
Encryption in different situations
If a volume is encrypted, it can only be attached to an instance that supports Amazon EBS encryption.
When you create a new, empty EBS volume, you can encrypt it by enabling encryption for the specific volume creation operation.
The snapshots that you make from the encrypted volume, and the volumes that you restore from those snapshots, are by default encrypted with the same CMK that you selected when creating the encrypted volume.
You cannot remove encryption from an encrypted volume or snapshot, which means that a volume restored from an encrypted snapshot, or a copy of an encrypted snapshot, is always encrypted.
You can create an encrypted volume from an unencrypted snapshot. When you have enabled encryption by default, encryption is mandatory for volumes restored from unencrypted snapshots.
Without encryption by default enabled, a copy of an unencrypted snapshot is unencrypted by default. However, you can choose to encrypt the resulting snapshot. When you have enabled encryption by default, encryption is mandatory for copies of unencrypted snapshots.
You can encrypt an existing EBS volume by copying an unencrypted snapshot to an encrypted snapshot and then creating a volume from the encrypted snapshot.
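A minimal boto3 sketch of that pattern: copy the unencrypted snapshot with encryption enabled, then create a volume from the encrypted copy (the snapshot ID, CMK alias, and Availability Zone are placeholders):

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Step 1: copy the unencrypted snapshot, encrypting the copy with a CMK.
encrypted_copy = ec2.copy_snapshot(
    SourceRegion="us-east-1",
    SourceSnapshotId="snap-0123456789abcdef0",   # unencrypted source snapshot
    Encrypted=True,
    KmsKeyId="alias/my-ebs-key",                 # placeholder CMK alias
)
ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[encrypted_copy["SnapshotId"]])

# Step 2: create a new (encrypted) volume from the encrypted snapshot.
volume = ec2.create_volume(
    AvailabilityZone="us-east-1a",
    SnapshotId=encrypted_copy["SnapshotId"],
    VolumeType="gp2",
)
print(volume["VolumeId"])
```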
When creating a volume from an encrypted snapshot, you have the option of re-encrypting the volume with a different CMK.
When copying an encrypted snapshot, you can re-encrypt the data with another symmetric CMK. Volumes restored from the resulting copy are only accessible using the new CMK.
By default, the copy of an encrypted snapshot that has been shared with you is encrypted with a CMK shared by the snapshot's owner. However, AWS recommends that you create a copy of the shared snapshot using a different CMK that you control. This protects your access to the volume if the original CMK is compromised.
EBS Multi-Attach
Amazon EBS Multi-Attach enables you to attach a single Provisioned IOPS SSD (io1 or io2) volume to up to 16 Linux instances that are in the same Availability Zone. You can attach multiple Multi-Attach enabled volumes to an instance or set of instances. Each instance to which the volume is attached has full read and write permission to the shared volume.
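A minimal boto3 sketch of creating a Multi-Attach enabled io2 volume and attaching it to two instances in the same Availability Zone (all IDs and sizes are placeholders):

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

volume = ec2.create_volume(
    AvailabilityZone="us-east-1a",
    Size=100,                    # GiB, example value
    VolumeType="io2",
    Iops=4000,
    MultiAttachEnabled=True,     # allow attachment to up to 16 instances in the same AZ
)
ec2.get_waiter("volume_available").wait(VolumeIds=[volume["VolumeId"]])

# Attach the same volume to two instances in the same Availability Zone.
for instance_id in ["i-0aaaaaaaaaaaaaaa0", "i-0bbbbbbbbbbbbbbb0"]:  # placeholder IDs
    ec2.attach_volume(
        VolumeId=volume["VolumeId"],
        InstanceId=instance_id,
        Device="/dev/sdf",
    )
```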
Each attached instance is able to drive its maximum IOPS performance up to the volume's maximum provisioned performance. However, the aggregate performance of all of the attached instances can't exceed the volume's maximum provisioned performance. If the attached instances' demand for IOPS is higher than the volume's Provisioned IOPS, the volume will not exceed its provisioned performance.
Multi-Attach enabled volumes that have an issue at the Amazon EBS infrastructure layer are unavailable to all attached instances. Issues at the Amazon EC2 or networking layer might impact only some attached instances.
Multi-Attach enabled volumes can be attached to one block device mapping per instance.
Multi-Attach enabled volumes can't be created as boot volumes.
Multi-Attach enabled volumes do not support I/O fencing.
Standard file systems, such as XFS and EXT4, are not designed to be accessed simultaneously by multiple servers, such as EC2 instances. Using Multi-Attach with a standard file system can result in data corruption or loss, so it is not safe for production workloads.
Amazon EFS
Amazon Elastic File System (Amazon EFS) provides a simple, highly scalable, highly available, highly durable, and fully managed elastic NFS file system for use with AWS Cloud services and on-premises resources. It is built to scale on demand to petabytes without disrupting applications. It stores data and metadata across multiple Availability Zones in an AWS Region.
EFS communicates with clients over the standard NFS port (2049).
You can access your Amazon EFS file system concurrently from multiple NFS clients. Multiple Amazon EC2 instances running in multiple Availability Zones within the same AWS Region can access an Amazon EFS file system at the same time, providing a common data source for workloads and applications running on more than one instance or server.
To access your Amazon EFS file system in a VPC, you create one or more mount targets in the VPC. A mount target provides an IP address for an NFSv4 endpoint at which you can mount an Amazon EFS file system. You mount your file system using its Domain Name Service (DNS) name, which resolves to the IP address of the EFS mount target in the same Availability Zone as your EC2 instance.
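A minimal boto3 sketch of creating a mount target for a file system in one subnet (the file system, subnet, and security group IDs are placeholders); instances in that Availability Zone can then mount the file system through its DNS name:

```python
import boto3

efs = boto3.client("efs", region_name="us-east-1")

# One mount target per Availability Zone provides a local NFSv4 endpoint.
mount_target = efs.create_mount_target(
    FileSystemId="fs-0123456789abcdef0",        # placeholder file system ID
    SubnetId="subnet-0123456789abcdef0",        # placeholder subnet ID
    SecurityGroups=["sg-0123456789abcdef0"],    # must allow inbound TCP 2049
)
print(mount_target["IpAddress"])
```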
You can mount your Amazon EFS file systems on your on-premises data center servers when connected to your Amazon VPC with AWS Direct Connect or AWS VPN. You can mount your EFS file systems on on-premises servers to migrate datasets to EFS, enable cloud bursting scenarios, or back up your on-premises data to Amazon EFS. AWS recommends mounting an Amazon EFS file system on an on-premises server using a mount target IP address instead of a DNS name.
There are two storage classes:
- The Standard storage class is used to store frequently accessed files.
- The Infrequent Access (IA) storage class is a lower-cost storage class that's designed for storing long-lived, infrequently accessed files cost-effectively.
Security
NFS client access to EFS is controlled by both AWS IAM policies and network security policies like security groups.
Access Points
An access point applies an operating system user, group, and file system path to any file system request made using the access point. Applications using the access point can only access data in that directory and below. This ensures that each application always uses the correct operating system identity and the correct directory when accessing shared file-based datasets.
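A minimal boto3 sketch of creating an access point that pins requests to a POSIX identity and a directory (the file system ID, UID/GID, and path are assumptions for illustration):

```python
import boto3

efs = boto3.client("efs")

access_point = efs.create_access_point(
    FileSystemId="fs-0123456789abcdef0",     # placeholder file system ID
    PosixUser={"Uid": 1001, "Gid": 1001},    # identity applied to all requests
    RootDirectory={
        "Path": "/app-data",                 # requests are confined to this path and below
        "CreationInfo": {
            "OwnerUid": 1001,
            "OwnerGid": 1001,
            "Permissions": "750",
        },
    },
)
print(access_point["AccessPointArn"])
```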
Encryption
- You can enable encryption at rest when creating an Amazon EFS file system. If you do, all your data and metadata is encrypted.
- You can enable encryption in transit when you mount the file system.
Performance Modes
- The default General Purpose performance mode is ideal for latency-sensitive use cases.
- File systems in the Max I/O mode can scale to higher levels of aggregate throughput and operations per second with a tradeoff of slightly higher latencies for file metadata operations.
Throughput Modes
- Using the default Bursting Throughput mode, throughput scales as your file system grows.
- Using Provisioned Throughput mode, you can specify the throughput of your file system independent of the amount of data stored (see the sketch below).
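A minimal boto3 sketch of creating a file system with these options set explicitly (the provisioned throughput value is just an example):

```python
import boto3
import uuid

efs = boto3.client("efs")

file_system = efs.create_file_system(
    CreationToken=str(uuid.uuid4()),        # idempotency token
    PerformanceMode="generalPurpose",       # or "maxIO"
    ThroughputMode="provisioned",           # or "bursting"
    ProvisionedThroughputInMibps=128.0,     # example value
    Encrypted=True,                         # encryption at rest
)
print(file_system["FileSystemId"])
```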
Amazon FSx
Amazon FSx provides fully managed third-party file systems with the native compatibility and feature sets required by common workloads.
FSx for Windows
Amazon FSx for Windows File Server provides fully managed Microsoft Windows file servers, backed by a fully native Windows file system. As a fully managed service, Amazon FSx for Windows File Server eliminates the administrative overhead of setting up and provisioning file servers and storage volumes.
Amazon FSx for Windows File Server offers file systems with two levels of availability and durability: Single-AZ and Multi-AZ.
With Single-AZ file systems, Amazon FSx automatically replicates your data within an Availability Zone (AZ) to protect it from component failure. It continuously monitors for hardware failures and automatically replaces infrastructure components in the event of a failure. Amazon FSx also uses the Windows Volume Shadow Copy Service (VSS) in Microsoft Windows to make highly durable backups of your file system daily and store them in Amazon S3.
Multi-AZ file systems support all the availability and durability features of Single-AZ file systems. In addition, Amazon FSx automatically provisions and maintains a standby file server in a different Availability Zone. Any changes written to disk in your file system are synchronously replicated across Availability Zones to the standby.
The primary resources in Amazon FSx are file systems and backups. A file system is where you store and access your files and folders. A file system is made up of one or more Windows file servers and storage volumes.
Automatic daily backups are turned on by default when you create a file system.
Your file system comes with a default Windows file share named "share".
You can access your file shares from on-premises compute instances using AWS Direct Connect or AWS VPN. You can also access your shares on compute instances that are in a different Amazon VPC, account, or Region by VPC peering or transit gateways.
It automatically encrypts data at rest (for both file systems and backups) using keys that you manage in AWS Key Management Service (AWS KMS). Data in transit is also automatically encrypted using SMB Kerberos session keys.
Amazon FSx for Windows File Server gives you the price and performance flexibility by offering both solid state drive (SSD) and hard disk drive (HDD) storage types. SSD storage is designed for the highest-performance and most latency-sensitive workloads.
FSx for Lustre
The open-source Lustre file system is designed for applications that require fast storage—where you want your storage to keep up with your compute. You use Lustre for workloads where speed matters, such as machine learning, high performance computing (HPC), video processing, and financial modeling.
If you are provisioning a file system with the HDD storage option, you might also want to consider provisioning a read-only SSD cache automatically sized to 20 percent of your HDD storage capacity. This provides sub-millisecond latencies and higher IOPS for frequently accessed files.
Amazon FSx for Lustre offers a choice of scratch and persistent file systems to accommodate different data processing needs.
Scratch file systems
Scratch file systems are ideal for temporary storage and shorter-term processing of data. Data is not replicated and does not persist if a file server fails.
Persistent file systems
Persistent file systems are ideal for longer-term storage and workloads. In persistent file systems, data is replicated, and file servers are replaced if they fail.
Amazon FSx for Lustre file systems can be linked to data repositories on Amazon S3 or to an on-premises data store. When linked to an Amazon S3 bucket, an Amazon FSx for Lustre file system transparently presents S3 objects as files. The file system also enables you to write file system data back to S3.
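A minimal boto3 sketch of creating a scratch FSx for Lustre file system linked to an S3 bucket (the bucket name, subnet ID, and storage capacity are placeholders):

```python
import boto3

fsx = boto3.client("fsx")

file_system = fsx.create_file_system(
    FileSystemType="LUSTRE",
    StorageCapacity=1200,                    # GiB, example value
    SubnetIds=["subnet-0123456789abcdef0"],  # placeholder subnet ID
    LustreConfiguration={
        "DeploymentType": "SCRATCH_2",
        "ImportPath": "s3://my-dataset-bucket",         # S3 objects appear as files
        "ExportPath": "s3://my-dataset-bucket/export",  # destination for data written back
    },
)
print(file_system["FileSystem"]["FileSystemId"])
```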
Amazon FSx for Lustre is accessible from compute workloads running on Amazon Elastic Compute Cloud (Amazon EC2) instances and containers running on Amazon Elastic Kubernetes Service (Amazon EKS). Amazon EC2 instances can access your file system from other Availability Zones within the same Amazon Virtual Private Cloud (Amazon VPC), provided your networking configuration provides for access across subnets within the VPC.
Amazon FSx automatically encrypts file system data at rest using keys managed in AWS Key Management Service (AWS KMS). Data in transit is also automatically encrypted on file systems in certain AWS Regions when accessed from supported EC2 instances.
Integration with other services
Amazon FSx for Lustre integrates with Amazon SageMaker, which is a fully managed machine learning service. It acts as an input data source and accelerates machine learning training by eliminating the initial download step from Amazon S3. Additionally, your total cost of ownership (TCO) is reduced by avoiding the repeated download of common objects for iterative jobs on the same dataset, as you save on S3 request costs.
Amazon FSx for Lustre integrates with AWS Batch using EC2 Launch Templates. AWS Batch enables you to run batch computing workloads on the AWS Cloud, including high performance computing (HPC), machine learning (ML), and other asynchronous workloads.
Amazon FSx for Lustre is used as a file system by AWS ParallelCluster, which is an AWS-supported open-source cluster management tool used to deploy and manage HPC clusters.