Amazon EC2 defenses against L1TF Reloaded

TutoSartup excerpt from this article:
The guest data of AWS customers running on the AWS Nitro System and Nitro Hypervisor is not at risk from a new attack dubbed “L1TF Reloaded… The AWS Nitro System and Nitro Hypervisor are designed to help protect against this class of attacks… While this attack can successfully leak guest data …

The guest data of AWS customers running on the AWS Nitro System and Nitro Hypervisor is not at risk from a new attack dubbed “L1TF Reloaded.” No additional action is required by AWS customers; however, AWS continues to recommend that customers isolate their workloads using instance, enclave, or function boundaries as described in AWS public documentation. The AWS Nitro System and Nitro Hypervisor are designed to help protect against this class of attacks.

A research paper titled Rain: Transiently Leaking Data from Public Clouds Using Old Vulnerabilities, and its presentation titled Spectre in the real world: Leaking your private data from the cloud with CPU vulnerabilities, demonstrate the attack L1TF Reloaded, which combines half-Spectre gadgets with L1 Terminal Fault (L1TF) to leak guest data. While this attack can successfully leak guest data from upstream Linux/Kernel-based Virtual Machine (KVM) and other cloud providers, it does not impact the guest data of AWS customers running on the AWS Nitro System and Nitro Hypervisor.

The Nitro Hypervisor’s protection against L1TF Reloaded is not the result of a specific patch or reactive mitigation, but rather due to the proactive approach to security at AWS. The fundamental security design principles of the Nitro Hypervisor—particularly the implementation of secret hiding through an extensive use of the eXclusive Page Frame Ownership (XFPO) concept (in some contexts referred to as process-local memory)—provides robust protection against this class of attacks. L1TF Reloaded represents an innovative approach to transient execution attacks, showing how threat actors can combine seemingly mitigated vulnerabilities to create new attacks that are more than the sum of their parts. The research is impressive and constructs a multilayer end-to-end exploit with real-world applicability. AWS sponsored a portion of this work and would like to thank the researchers for their collaboration and coordinated disclosure. The remainder of this post is a deeper dive into the published research.

The Nitro Hypervisor: Purpose-built for security

The Nitro Hypervisor is a foundational component of the AWS Nitro System, designed from the ground up with security as a primary consideration. Unlike traditional hypervisors that evolved from general-purpose operating systems, the Nitro Hypervisor, which is based on Linux/Kernel-based Virtual Machine (KVM), has been intentionally minimized and purpose-built with only the capabilities needed to perform its assigned functions.

The Nitro Hypervisor’s responsibilities are deliberately constrained: it receives virtual machine (VM) management requests from the Nitro Controller, partitions memory and CPU resources using hardware virtualization features, and assigns PCIe devices, including both Physical (PF) and Single Root I/O Virtualization (SR-IOV) Virtual Functions (VF) provided by Nitro hardware (such as NVMe for EBS and instance storage, and Elastic Network Adapter for networking) and third party devices (GPUs), to VMs. Critically, the Nitro Hypervisor excludes entire categories of functionality that exist in conventional hypervisors. There is no networking stack, no general-purpose file system implementations, no peripheral device-driver support, no shell, and no interactive access mode. This meticulous exclusion of non-essential features helps avoid entire classes of issues and attack vectors that can impact other hypervisors, such as remote networking attacks or driver-based privilege escalations.

Understanding transient execution vulnerabilities

To understand why the Nitro Hypervisor’s defenses are effective against L1TF Reloaded, it is important to first understand the fundamentals of transient execution vulnerabilities that emerged in 2018. Modern CPUs implement out-of-order and prediction-based speculative execution to optimize performance by executing operations before they are needed or before the CPU knows whether it should perform them at all. When predictions are wrong, or the CPU encounters execution faults, the CPU will eventually detect these errors and roll back all speculatively computed changes to the architectural state. However, traces of these “transient executions” remain detectable in the microarchitectural state, such as data that was speculatively loaded into CPU caches, creating opportunities for data leakage through side-channel attacks.

Half-Spectre gadgets: Incomplete but dangerous code patterns

While traditional Spectre attacks require complete “gadgets” that both access secret data and transmit it through side channels, researchers have identified a weaker class of gadgets called “half-Spectre gadgets.” These are incomplete Spectre-like code patterns that perform speculative out-of-bounds memory accesses, but lack the transmission component that would make them immediately exploitable.

A classic Spectre v1 gadget contains two key elements: first, a speculative access that loads secret data (such as x = A[index] where index is out of bounds), and second, a transmission mechanism that leaks the data through a side channel (such as y = B[64 * x] that creates cache patterns based on the secret value). Half-Spectre gadgets contain only the first element—the speculative access—without the transmission component.

Because half-Spectre gadgets appear harmless in isolation, they are commonly found throughout software, including hypervisors. These gadgets typically arise from array-indexing operations where bounds checking occurs, but the transient execution window allows out-of-bounds access before the bounds check resolves. The gadgets can be either absolute (directly providing the address to access) or relative (controlling an offset from a base address), with relative gadgets being more common due to typical array indexing patterns. The key insight of L1TF Reloaded is that half-Spectre gadgets, while harmless alone, become dangerous when combined with other vulnerabilities like L1TF. A threat actor can trigger a half-Spectre gadget in the hypervisor to speculatively load arbitrary data into the L1 data cache and then use L1TF to extract that cached data—effectively turning the “harmless” half-Spectre gadget into a complete gadget.

Intel L1TF: Leveraging speculative address translation

L1 Terminal Fault (L1TF), discovered in January 2018 and disclosed in August 2018, represents a significant type of transient execution vulnerability that affects Intel processors up to Coffee Lake. These processors are used in some 5th generation EC2 instance families and all older instance types. L1TF leverages faulty address translations during transient execution when accessing invalid page table entries. Under normal operation, when a CPU encounters a Page Table Entry (PTE) with the present bit cleared or reserved bits set, address translation should halt immediately. However, during transient execution, Intel processors affected by L1TF ignore these invalid page table states and utilize a partially translated address. If the target data exists in the L1 data cache, the CPU will speculatively load it and make it available to subsequent instructions, even though the access should be blocked. This behavior is particularly problematic in virtualized environments. A malicious guest operating system can deliberately clear present bits in its own page tables to trigger terminal faults. When this happens, the CPU skips the normal host address translation process and passes the guest physical address directly to the L1 data cache. This allows the threat actor to potentially read any cached physical memory on the system, regardless of ownership or privilege boundaries. For affected processors, comprehensive software mitigation requires expensive measures, like disabling Simultaneous Multi-Threading (SMT), flushing the L1 data cache on every context switch, or disabling Extended Page Tables (EPT) entirely—performance costs so significant that many systems implement only partial mitigations.

The L1TF Reloaded attack: Exploiting mitigation gaps using Spectre

The research paper demonstrates how threat actors can combine half-Spectre gadgets with L1TF to create a powerful attack vector against hypervisors that lack complete implementation of the previously outlined mitigations. The attack shows that vulnerabilities considered individually mitigated can still be leveraged if combined in novel ways. L1TF Reloaded works by leveraging the fact that while L1TF mitigations like L1 data cache flushing and core scheduling help prevent guest-to-guest attacks, they do not fully mitigate guest-to-host attacks. The attack operates across logical cores that share the L1 data cache in an SMT core. On one logical core, the threat actor triggers a half-Spectre gadget. By mistraining the branch predictor, the threat actor causes the hypervisor to speculatively access out-of-bounds memory, loading sensitive data into the shared L1 data cache. Simultaneously, on the other logical core, the threat actor uses L1TF to extract the cached data. While other research papers have demonstrated L1TF exploitation, this research paper has successfully demonstrated a multilayer end-to-end attack on upstream Linux/KVM and other cloud providers. The authors were able to use an existing half-Spectre gadget, break host Kernel Address Space Layout Randomization (KASLR), gain host address translation capability, find all the processes running on the host, identify the victim VM, break guest KASLR, gain guest address translation capability, identify the init process in the victim VM, enumerate the child processes of the init process, identify the nginx webserver process, locate the private TLS certificate in the guest process heap, and finally leak the private TLS certificate. However, when they attempted the same attack on AWS instances, they encountered a critical limitation: while they could leak some non-sensitive host data, they were unable to access guest data due to what they described as “an undocumented defense in the hypervisor that unmaps victim data from it. This “undocumented defense” is the Nitro Hypervisor’s implementation of secret hiding—a fundamental architectural decision that prevented this type of attack.

Secret hiding: Rethinking hypervisor memory architecture

Traditional hypervisor designs follow a hierarchical privilege model where each higher level of privilege is granted access to all lower level memory. In conventional systems, the hypervisor running at the highest privilege level can access all VM memory, ostensibly for legitimate management purposes. However, this design creates a vulnerability: if a threat actor can trick the hypervisor into speculatively accessing guest data, that data becomes available for extraction through side-channel attacks. The Nitro Hypervisor takes a fundamentally different approach through a technique called secret hiding. Instead of following the traditional model where the hypervisor has access to all VM memory (Figure 1), the Nitro Hypervisor makes sure that guest data is not present in the hypervisor’s virtual address space. By removing VM memory pages from the hypervisor’s virtual address space (Figure 2), we avoid the possibility of transient execution attacks accessing guest data, even if a threat actor successfully triggers gadgets within the hypervisor.

Figure 1: Memory view of the hypervisor without mitigations in the context of VM1

Figure 2: Memory view of the Nitro Hypervisor in the context of VM1. While no guest memory is mapped, only the state of the active guest can be accessed with other guest states remaining inaccessible.

This architectural decision means that when transient execution occurs in the Nitro Hypervisor—whether through L1TF, half-Spectre gadgets, or other transient execution vulnerabilities—there is simply no guest data available to be leaked, creating a barrier against this class of vulnerabilities. The Nitro Hypervisor retains access only to its own data, but guest data remain isolated and inaccessible. While we could not anticipate L1TF Reloaded exactly, we knew transient execution vulnerabilities would continue to be discovered and built defense-in-depth mechanisms which blocked extraction of guest data on AWS instances. This design decision was made proactively during the Nitro Hypervisor development, based on our threat model that explicitly includes guest-to-host attacks that exploit the hypervisor. By assuming that threat actors might find ways to trigger transient execution vulnerabilities within the Nitro Hypervisor—whether through known vulnerabilities like L1TF or future unknown attack vectors—we designed the system to limit the scope of such attacks from the outset.

Beyond memory: Protecting guest CPU context

When VMs are scheduled and context-switched, guest CPU context information such as general-purpose and floating-point register content must be saved and restored. Guest CPU context can contain highly sensitive information. Registers might contain cryptographic keys, memory addresses that could defeat Address Space Layout Randomization (ASLR), or other secrets that applications rely on for security. In traditional hypervisors, guest CPU context is often stored in memory accessible to the hypervisor, creating another potential target for transient execution attacks. The original XPFO (eXclusive Page Frame Ownership) implementation makes sure that either user space or the kernel—but not both—can access a memory page and does not protect guest CPU context since it is exclusively owned by the kernel. The Nitro Hypervisor extends the XPFO concept to guest CPU context by saving it in memory—also known as process-local memory—that is solely mapped by process-specific kernel Page Table Entries (PTEs), as is shown in Figure 2 above. This memory is specifically designed to be only accessible from the Nitro Hypervisor in the context of the process it belongs to. This makes sure that even if a threat actor successfully triggers transient execution vulnerabilities within the Nitro Hypervisor, they cannot access the guest CPU context from other guests. The researchers confirmed this protection, noting that the AWS threat model accounts for guest-to-host attacks and that secret hiding, combined with existing L1 data cache flushing and core scheduling, prevented them from leaking guest data. This comprehensive approach to secret hiding demonstrates the defense-in-depth philosophy of the Nitro System: rather than protecting only known attack vectors, AWS systematically identifies and protects potential sources of guest data leakage, including both VM memory and guest CPU context.

Applying secret hiding principles to Xen

Most AWS Xen instances are now running on the AWS Nitro System and hence enjoy the benefits of the Nitro Hypervisor thanks to Xen-on-Nitro. For our portfolio of instance families running on the AWS Xen Hypervisor, we have implemented similar secret hiding principles to provide protection against transient execution attacks.

Defense in depth: The Nitro Hypervisor’s proven security model

L1TF Reloaded represents an important advancement in our understanding of how seemingly mitigated vulnerabilities can be combined to create new attack vectors. The researchers of the Rain paper demonstrated how L1TF and half-Spectre gadgets can work together to leak guest data from hypervisors. We are pleased to support their work and collaborate with them. The Nitro Hypervisor’s protection against L1TF Reloaded is not the result of a specific patch or reactive mitigation, but rather due to AWS deeply investing in securing multi-tenant cloud environments against sophisticated adversaries. This research reinforces our confidence in the Nitro System’s security model against both known and unknown attack vectors. The proactive security approach of AWS includes designing systems with defense-in-depth principles from the ground up. The threat landscape will continue to evolve, and at the same time, the defense-in-depth mechanisms built into the Nitro Hypervisor and our other products and services will continue to help protect AWS customers from sophisticated attacks, while maintaining the performance and functionality they depend on.

If you have questions or feedback about this post, contact AWS Support.

Amazon EC2 defenses against L1TF Reloaded
Author: Ali Saidi