

# PreFence: A Fine-Grained and Scheduling-Aware Defense Against Prefetching-Based Attacks

Till Schlüter, Nils Ole Tippenhauer

IEEE European Symposium on Security and Privacy (EuroS&P) 2025

## Motivation










## Hardware Prefetching


























































## Defenses So Far



**Targeted defenses** 




Disable prefetching





## Disabling Prefetching Is Expensive





## Can we find a defense that...

- ...prevents prefetch attacks
- ...has minimal runtime overhead
- ...is easy to use for developers and end users
- ...is compatible with Simultaneous Multithreading (SMT)





## Prefetching-Based Side-Channel Attacks in Prior Work

We consider 13 attacks from 7 papers:

| #  | Attack            | Prefetcher      |
|----|-------------------|-----------------|
| 1  | Shin et al.       | Intel IP stride |
| 2  | Augury OOB        | Apple DMP       |
| 3  | Augury SLH        | Apple DMP       |
| 4  | Augury Addr.      | Apple DMP       |
| 5  | AfterImage Var. 1 | Intel IP stride |
| 6  | AfterImage Var. 2 | Intel IP stride |
| 7  | AfterImage SGX    | Intel IP stride |
| 8  | AfterImage RSA    | Intel IP stride |
| 9  | AfterImage Sync   | Intel IP stride |
| 10 | Xiao et al.       | Intel IP stride |
| 11 | FetchBench AES    | ARM SMS         |
| 12 | PrefetchX         | Intel XPT       |
| 13 | GoFetch           | Apple DMP       |







## Attack Systematization








**Finding:** Victim process trains the prefetcher





## Design Idea





Disable prefetching temporarily

## PreFence Design: Scheduling-Aware, Temporary Prefetcher Deactivation














- Signal: disable prefetching
- Security-critical computation
- Signal: enable prefetching
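The three steps above can be sketched as a small wrapper around the security-critical code. This is a minimal illustration, not the PreFence interface: the real signals are system calls whose names and arguments are not given on these slides, so `signal_prefetch_disable`, `signal_prefetch_enable`, and `protected_lookup` are hypothetical stand-ins that only record the order of events.

```python
from contextlib import contextmanager

# Records the order of signals and work, for illustration only.
events = []


def signal_prefetch_disable():
    """Hypothetical stand-in for the 'disable prefetching' signal."""
    events.append("disable prefetching")


def signal_prefetch_enable():
    """Hypothetical stand-in for the 'enable prefetching' signal."""
    events.append("enable prefetching")


@contextmanager
def prefetching_disabled():
    """Bracket a block of code with the disable and enable signals."""
    signal_prefetch_disable()
    try:
        yield
    finally:
        # Re-enable prefetching even if the critical code raises.
        signal_prefetch_enable()


def security_critical_computation(table, secret_index):
    """Toy secret-dependent table lookup (assumed example workload)."""
    events.append("security-critical computation")
    return table[secret_index]


def protected_lookup(table, secret_index):
    """Run the lookup with prefetching disabled for its duration."""
    with prefetching_disabled():
        return security_critical_computation(table, secret_index)
```

Using a context manager keeps the disabled window as short as the critical code itself and guarantees that the enable signal is sent on every exit path.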










## Evaluation Targets





## Efficacy: Prevents Prior-Work Attacks



Figure 5 (surviving panel label: Cortex-A72). Latency of accessing the prefetch location after calling the vulnerable OpenSSL function with the PREFENCE countermeasure not applied (prefetch\_disable flag cleared) and applied (flag set). Short access latency indicates unwanted leakage, which is prevented by activating our countermeasure.

[...] countermeasure against prefetching-based side channels enabled. This experiment serves as a baseline and shows that the library function actually leaks information when called with certain inputs. In the second configuration, we set the prefetch\_disable flag before calling the library function and clear it after returning from the library function. If PREFENCE is effective, we expect no more prefetching leakage.

Results. We run both configurations in both evaluation environments and present the results in Figure 5. We repeat each configuration 1,000,000 times on the Intel CPU and 10,000,000 times on the ARM CPU. When the prefetch\_disable flag is cleared on the Intel CPU, we observe a significantly lower latency when loading from the memory line right after the lookup table (median: 90 cycles). This indicates that the prefetcher loaded this memory line into the cache (i.e., unwanted leakage). In [...]

[...] speedup of 1.8%), which we attribute to the prefetcher interfering with non-ideal predictions when it is enabled.

However, these measurements only reflect the performance of PREFENCE in an artificial individual case. Thus, we conduct an in-depth efficiency evaluation based on more complex and realistic workloads in Sections 6.5 and 6.6.

#### 6.4. Efficacy: Protecting MbedTLS

Next, we show that PREFENCE successfully prevents an end-to-end attack from prior work, namely the attack on MbedTLS AES from the FetchBench paper [42]. As this attack exploits ARM's Spatial Memory Streaming (SMS) prefetcher, we can only reproduce it on our ARM-based platform.

Vulnerability. The SMS prefetcher divides memory into fixed-size regions of 1 KiB each. When a load instruction accesses multiple cache lines within the same region (e.g., in a loop), the prefetcher records this access pattern in its internal state. As the vulnerable AES-128 implementation [...]

Experiment. We run two experiments: First, as a baseline, we run the end-to-end attack on our patched kernel, but without making any PREFENCE system calls in the victim code. This configuration is expected to show leakage. We record how many secret bits can be recovered successfully. Second, we repeat the attack, but with PREFENCE applied. We set the prefetch\_disable flag in the [...] and clear it afterward. Again, we record the leakage.

Implementation. We build upon the proof-of-concept code published by Schlüter et al. [41]. Due to the complex [...]

[...] with an average success rate of 31.8 correct key bits [...]. The red line indicates the expected distribution for random guessing, more precisely, a binomial distribution [...]

Execution Time Evaluation. Finally, we also measure the temporal overhead on the vulnerable library function caused by the lack of prefetching. To this end, we call the function 10,000,000 times with and without the prefetch\_disable flag set and measure its execution time. We find that the median execution time increases by approx. 2.7% when prefetching is temporarily disabled (from 903 to 927 cycles).
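As a quick check of the reported figure, the relative overhead follows directly from the two median cycle counts stated above. The helper name `relative_overhead` is introduced here only for illustration.

```python
def relative_overhead(baseline_cycles, protected_cycles):
    """Relative slowdown of the protected run compared to the baseline."""
    return (protected_cycles - baseline_cycles) / baseline_cycles


# Median execution times from the text: 903 cycles with prefetching
# enabled, 927 cycles with prefetching temporarily disabled.
overhead = relative_overhead(903, 927)
print(f"{overhead:.1%}")  # prints "2.7%"
```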

### 6.5. Efficiency: Non-Critical Workloads (Scenarios 1 and 2)

We now investigate the efficiency of PREFENCE for [...]

[...] permanently causes significant performance overhead in benchmarks 502 to 523. The performance overhead introduced by our patched kernel is negligible for non-security-critical workloads.

[...] iterations, while the black error bars indicate the runtime of the other two iterations.

Comparing the two stock kernel configurations (orange and blue bars), we find that the prefetcher especially speeds up the benchmarks 502-523. At a maximum, the prefetcher improves performance by 494% (benchmark 505 on the Intel CPU) and 37% (benchmark 505 on the Raspberry Pi), respectively. In most other workloads, both configurations performed similarly, with one exception (555 on the Raspberry Pi) [...]. Nevertheless, we conclude that disabling the prefetcher permanently can lead to a significant performance drop on both tested systems.

When we compare the stock kernel and the patched kernel, both with prefetching enabled (blue and green bars), we observe only small differences in execution time. For most benchmarks, the absolute difference is around ±1%. We conclude that the overhead of the kernel patch is negligible.





## Efficiency: Negligible Overhead on Non-Critical Workloads



SPEC benchmarks perform similarly on stock kernel and patched kernel.

Performance difference around ±1% in most benchmarks.





## Efficiency: Bounded Overhead on Critical Workloads
















### **PreFence**



### Evaluation



### Till Schlüter





github.com/scy-phy/PreFence



#### PreFence: A Fine-Grained and Scheduling-Aware Defense Against Prefetching-Based Attacks

Till Schlüter, Nils Ole Tippenhauer (CISPA)

**Prefetcher Attacks**

Prior work uncovered side-channel vulnerabilities in hardware data prefetchers that put user data at risk. However, corresponding defenses have not been studied systematically before.

**No Practical Defense So Far**

No effective and efficient defense has been presented so far. The most effective defense is to disable prefetching permanently, which is impractical due to its high performance cost for all processes.

**Attack Systematization**

We identify three mandatory attack stages:

- Training in the victim context, transferring secrets into the prefetcher's state
- Triggering in the victim or attacker context, transferring secrets into the cache state
- Cache side channel extraction, transferring secrets into architectural state

Preventing any of these stages prevents the entire class of prefetcher attacks.

**PreFence Design**

PreFence enables processes to disable prefetching temporarily per core to prevent training. Processes send system calls to announce when they execute security-critical code. We extend the scheduler to let it manage the prefetcher activation state, preventing attacks across processes and cores.

**PreFence Is Effective**

We show that PreFence is effective by successfully preventing attacks from prior work, for example the shared library attack by Shin et al. (CCS 2018).

Efficacy: PreFence mitigates the shared library attack by Shin et al., where the prefetcher is triggered by memory accesses to shared data and leaks secret-dependent access patterns into the cache state. PreFence prevents this successfully.

**PreFence Is Efficient**

PreFence has negligible impact on non-critical code and performs better than permanent disabling for critical [...]

Efficiency: The performance of PreFence depends on how it is applied to the code of the workload. Permanent disabling (black [...]
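The scheduling-aware part of the design summarized above (processes announce security-critical code, and the scheduler manages the per-core prefetcher state) can be illustrated with a toy model. This is a sketch under stated assumptions, not the kernel patch: the `Process` and `Core` classes and both function names are hypothetical, and the model assumes that a core's prefetcher state simply follows the flag of the process currently running on it. SMT siblings are not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Process:
    name: str
    # Per-process flag, set and cleared by the process itself.
    prefetch_disable: bool = False


@dataclass(eq=False)
class Core:
    prefetcher_enabled: bool = True
    current: Optional[Process] = None


def context_switch(core, next_proc):
    """Scheduler hook: apply the incoming process's flag to the core."""
    core.current = next_proc
    core.prefetcher_enabled = not next_proc.prefetch_disable


def set_prefetch_disable(proc, value, cores):
    """Model of the signal a process sends to set or clear its flag."""
    proc.prefetch_disable = value
    # Take effect immediately on any core currently running the process.
    for core in cores:
        if core.current is proc:
            core.prefetcher_enabled = not value
```

In this model the flag travels with the process: if the victim is preempted, the next process runs with prefetching enabled again, and if the victim migrates to another core, prefetching is disabled there instead.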







## References

- [1] Boru Chen et al. "GoFetch: Breaking Constant-Time Cryptographic Implementations Using Data Memory-Dependent Prefetchers". In: USENIX Security. 2024. URL: https://www.usenix.org/conference/usenixsecurity24/presentation/chen-boru.
- [2] Yun Chen, Lingfeng Pei, and Trevor E. Carlson. "AfterImage: Leaking Control Flow Data and Tracking Load Operations via the Hardware Prefetcher". In: ASPLOS. 2023. DOI: 10.1145/3575693.3575719.
- [3] Yun Chen et al. "PREFETCHX: Cross-Core Cache-Agnostic Prefetcher-Based Side-Channel Attacks". In: HPCA. 2024. DOI: 10.1109/HPCA57654.2024.00037.
- [4] Jose Rodrigo Sanchez Vicarte et al. "Augury: Using Data Memory-Dependent Prefetchers to Leak Data at Rest". In: S&P. 2022. DOI: 10.1109/SP46214.2022.9833570.
- [5] Till Schlüter et al. "FetchBench: Systematic Identification and Characterization of Proprietary Prefetchers". In: CCS. 2023. DOI: 10.1145/3576915.3623124.
- [6] Youngjoo Shin et al. "Unveiling Hardware-Based Data Prefetcher, a Hidden Source of Information Leakage". In: CCS. 2018. DOI: 10.1145/3243734.3243736.
- [7] Chong Xiao, Ming Tang, and Sylvain Guilley. "Exploiting the Microarchitectural Leakage of Prefetching Activities for Side-Channel Attacks". In: Journal of Systems Architecture 139 (June 2023). DOI: 10.1016/j.sysarc.2023.102877.