



# Microarchitectural Leakage Templates and Their Application to Cache-Based Side Channels

CCS 2022

Ahmad Ibrahim<sup>C</sup> Hamed Nemati<sup>S,C</sup> <u>Till Schlüter</u><sup>C</sup> Nils Ole Tippenhauer<sup>C</sup> Christian Rossow<sup>C</sup> November 10, 2022

Problem Leakage Templates Plumber Framework Case Studies Matching Binaries Discussion & Conclusion





## Previously...



Security of proprietary CPUs



Discovering new side channels



<sup>&</sup>lt;sup>1</sup> Nemati et al., "Validation of Abstract Side-Channel Models for Computer Architectures".



#### Generic Description of a Side Channel

 $\mathcal{P}(A)$ : A **code** template

B: Distinct behaviors

• e.g. timing:  $\mathcal{B} = \{ \bullet \text{ fast}, \circ \text{ slow} \}$ 

 $\mathcal{R}(A,b)$ : **Relations** between inputs, leading to a certain behavior

 "When inputs X and Y are in relation, then behavior •"



Leakage Template

| Code $\mathcal{P}(A)$ | Behavior and Relations |                                                      |  |
|-----------------------|------------------------|------------------------------------------------------|--|
| ldr x0, [x1]          | ${\mathcal B}$         | $ \mathcal{R}(A,b) $                                 |  |
| ;                     | (●) fast               | $sameTag(x_1, x_2) \land sameSet(x_1, x_2)$          |  |
| ldr x0, [x2]          | (∘)slow                | $\neg sameTag(x_1, x_2) \lor \neg sameSet(x_1, x_2)$ |  |

Figure: Leakage Template: Cache-Timing Side Channel











| Code $\mathcal{P}(A)$ | Behavior a | Behavior and Relations                               |  |
|-----------------------|------------|------------------------------------------------------|--|
| ldr x0, [x1]          | ${\cal B}$ | $\mathcal{R}(A,b)$                                   |  |
| ;                     |            | $sameTag(x_1, x_2) \land sameSet(x_1, x_2)$          |  |
| ldr x0, [x2]          | (∘)slow    | $\neg sameTag(x_1, x_2) \lor \neg sameSet(x_1, x_2)$ |  |

Figure: Leakage Template: Cache-Timing Side Channel

Memory address: ... 11101010010 0010010 010010

Testcases TC0  $t_1, s_1$  $t_1, s_0$ TC1 t1. S1 t1, S1 . . . . . . . . . TC127  $t_1, s_1$  $t_1, s_{127}$ x1: x2: fixed tag. fixed tag and set iterate over all sets Classification (•) fast (o) slow x1 x2 x1 x2  $t_1, s_1$  $t_1, s_1$  $t_1, s_1$  $t_1, s_0$  $t_1, s_1$  $t_1, s_2$ . . .  $t_1, s_1$  $t_1, s_{127}$ 







#### **Case Studies**



Figure 6: Case studies' LTs with selected relations. In  $(a) e^{in}$  means b times inlining n supplies attitudes (a, b, b, n), is inlining n, supplies attitudes (a, b, n), and inlining n, supplies attitudes (a, b, n), and (a, b, n) is uniformly attitude (a, b, n), and (a, b, n) is uniformly attitude (a, b, n), and (a, b, n) is uniformly attitude (a, b, n). The probability of the (a, b, n) is the number of prefetched lines. Relations must be decked in open the first (a, b, n) and (a, b, n) is the number of (a, b, n) and (a, b, n) in the probability of (a, b, n) is the number of (a, b, n) in the number of (a, b, n) in the number of (a, b, n) is the number of (a, b, n) in the number of (a, b, n) in the number of (a, b, n) is the number of (a, b, n) in the number of (a, b, n) in the number of (a, b, n) is the number of (a, b, n) in the number of (a, b, n) in the number of (a, b, n) is the number of (a, b, n) in the number of (a, b, n) in the number of (a, b, n) is the number of (a, b, n) in the number of (a, b, n) in the number of (a, b, n) is the number of (a, b, n) in the number of (a, b, n) in the number of (a, b, n) is the number of (a, b, n) in the number of (a, b, n) in the number of (a, b, n) is the number of (a, b, n) in the number of (a, b, n) in the number of (a, b, n) is the number of (a, b, n) in the number of (a, b, n) in the number of (a, b, n) is the number of (a, b, n) in the number of (a, b, n) in the number of (a, b, n) is the number of (a, b, n) in the number of (

Table 3: Example permutation outcome. Each number represents an instruction from the initial testcase. Underlined numbers are loads from addresses that have the same tag.



(3) N<sub>3,5,8</sub> N<sub>3,6,7</sub> (2 N<sub>3,5</sub> N<sub>3,5,8</sub> (100).
For every CTS, all 10,000 generated testcases show the same behavior. Thus, the exact values of fags and sets do not matter.
Et Word Offset Behavior. In E.2 we observed that the byte offsets of loaded addresses affect previction. To broaden our understanding, in this experiment, we leveraged GTSes as shown in Table 4. They generate testcases for 5-load programs with all possible combinations of these and sets (for loads traceries un to two

(1) [Mr. s. ] 3 Mr. s. Mr. s. | 10000

(2) Mt.s. [Mt.s.] 3 Mt.s. 10000



CCS '22, November 7-11, 2022, Los Angeles, CA, USA.

#### 8.2 Previction w/o Shared Memory (PR\_PP)

Based on experiment E in § 6.2, previction may target prebaded memory address and leak information in the absence of shared memory, e.g., through Prime Probe. The sender code of our prevition-based Prime Probe primitive Ppc. Pis similar to that of PR\_FR. However, in PR\_PP, the receiver first loads two memory lines into the targeted caches set before the execution of the sender code. The receiver then probes the lines to determine the leaked bits.

#### 8.3 Prefetching Control-Flow Leakage (PRF\_CF) PRF CF allows leaking the control flow of a program based on

prefetching, Is is based on the results of \$Z n is \$4.5. Hg. is shows an example code of \$PR C, The sender code has a 4-bad prefetching sequence with a fixed stride (lines 2, 5, 8, and 15). The loads are spentred by a number of arithmetic instruction. The instruction at line 12 is conditionally executed depending on one bit of a secret that is stored in 20 Glines 9 through 12, 2-cooling 10 Tr. It must be a forestead of the string sequence affects the number of prefetched cache lines. By measuring the time required to reload a [spossibly prefetched olatine s.15/12, the time required to reload a [spossibly prefetched olatine s.15/12, the concessional licentine has been considered to the contraction of the con

#### 8.4 Prefetching on an Interrupted Seq. (PRF\_IS)

Inspired by E7, we tested the effect of intermediate memory operations on prefetching. We observed that an intermediate load from a different page leads to prefetching of additional cache lines by a 3-load stream. PRF\_IS is based on this outcome. It also allows

In the paper: 3 Leakage Templates, 4 Covert Channels

All page 16

(e) ppr os



<sup>&</sup>lt;sup>1</sup> Nemati et al., "Validation of Abstract Side-Channel Models for Computer Architectures".

Problem Leakage Templates

LUMBER Framework

Case Studies

Matching Binaries

Discussion & Conclusion





## Searching for Instances of a Leakage Template

## 1. 🗐 Static Analysis

Search for candidate code sections matching  $\mathcal{P}(A)$ 

## 2. Dynamic Analysis

For each candidate section:

Check whether different inputs fulfill relations for different behaviors (= are distinguishable based on behavior)

#### Recall:

 $\mathcal{P}(A)$ : A **code** template

B: Distinct behaviors

 $\mathcal{R}(A,b)$ : **Relations** between inputs, leading to a

certain behavior



Leakage Template



CCS '77 November 7-11 2022 Los Angeles CA USA

Ahmad Ibrahim, Hamed Nemati, Till Schlüter, Nils Ole Tippenhauer, and Christian Rossow

not tigger the prefetcher to load certain cache lines into the cache, depending on the resulting memory access pattern. Therefore, the cache state of potentially prefetched cache lines indicates the crisis trace of relations between the accessed obseque bale elements and by extension, the processed data. Soin et al. exploit these relations to precise the contract of the private key. The attack recovers the key incrementally. The same computation is a pulsel to both the target social read ac candidate scalar. By changing the candidate scalar such that the prefetching behavior assimilate, both scalars assimilate as well. Even though which we have a small contract to a resummed access the properties of the contract to the contract of the Contr

et al. [40]. Data-dependent loads from a lookup table may or may

and service to density of ever when the contract of the contr

Table 5: Confusion matrix, comparing prefetching behavior classification based on relations with the actual behavior.

prefetching behavior based on the relations  $\mathcal{R}(A,b)$  from the LT. Second, we use a Flush-Reload side channel to record a cache trace. This trace contains the cache state of the memory lines around  $SQR_{-}$  to after execution. It is captured for evaluation purposes and indicates the actual prefetching behavior of the CPU. In order to show that the LT accurately represents the prefetching

in older to allow that the L1 accurately represents the preceding behavior, we recorded traces for 100 random input values to the library function. For each input value, we determined the expected prefetching behavior using the access trace\* and compared it with the actual behavior using the corresponding eache trace. Evaluation: Table 8. Jllustraces the classification netformance Evaluation: Table 8. Jllustraces the classification netformance.

Ivaluation. Table 5 illustrates the classification performance. For all 6e cases where the load instructions satisfy the relation for  $P_0$ , the cache traces show that no perfecting occurred. In six cases, the relation of behavior  $P_0$  are satisfied. The three relevant load instructions load data from three consecutive cache lines and load instructions of data from three consecutive cache lines and  $P_0$  is within the perfection bounds. In all it cases, the cache trace shows that perfection got flure additional cache lines occurred, in the remaining 2 cases, the relations for ease; the cache trace shows that  $P_0$  is within the perfection bounds. In all it cases, the cache increase from the  $P_0$  are satisfied. The reason is that the distances  $P_0$  and  $P_0$  between the relevant load instructions are contained to contain the transmitter of contide the transmet.

In the paper: Re-identifying a known vulnerability (Shin et al.<sup>1</sup>, CCS'18): Prefetching-based side channel in Elliptic Curve Diffie-Hellman (ECDH) code in OpenSSL 1.1.0g

<sup>1</sup> Shin et al., "Unveiling Hardware-Based Data Prefetcher, a Hidden Source of Information Leakage".

#### PLUMBER Use Cases and Limitations

#### **Additional Use Cases**

- Facilitate reverse engineering of microarchitectural components
  - Examples in the paper: branch predictor, cache slice mapping

#### Limitations

- Focus on cache-based side channels
- Implemented for ARM architecture

Discussion & Conclusion





#### Leakage Template



- Code
- **Behaviors**
- Relations

#### PLUMBER Framework



#### Case Studies



## Matching Binaries



- 1. 🖹 Static
- 2. Dynamic

#### Code

ngithub.com/scy-phy/plumber

#### Till Schlüter

till.schlueter@cispa.de

(List of all resoruces tschlueter.com related to this paper)











## Generative Testcase Specification (GTS)

| Directives |                              |
|------------|------------------------------|
| Directive  | Description                  |
| M          | Memory Access                |
| Α          | Arithmetic/Logic Instruction |
| N          | NOP                          |
| В          | Branch                       |
|            |                              |

| Operators                  |                           |
|----------------------------|---------------------------|
| Operator                   | Description               |
| [·] <i>n</i>               | Power                     |
| #n                         | Wildcard                  |
| $\langle \cdot \rangle$ \$ | Cache line (set) mutation |
| $P(\cdot)$                 | Precondition              |
| •••                        | •••                       |

#### Example: Cache-Timing Side Channel

$$P(M_{t_1,s_1})$$

Precondition: Prime cache with a cache line in set  $s_1$ 



Generate one test case for each possible set index, keep the tag index constant





#### ☐ Classifier

- Classifies test cases based on the observed behavior
- For each behavior: produce a bit table
  - Bit table: List of all test cases that trigger a certain behavior

| Bit Table   |    |            |  |  |
|-------------|----|------------|--|--|
| Behavior ∘  |    |            |  |  |
| Test Case # | x1 | <b>x</b> 2 |  |  |
| 1           | 00 | 01         |  |  |
| 2           | 00 | 10         |  |  |
|             |    |            |  |  |

#### Analyzer

- For each bit table (= behavior): Identify common features
- ⇒ Extracts relations that trigger a certain behavior





#### **Prefetching on ARM Cortex-A53**

 Loads cache lines in advance that are likely to be needed soon

#### Steps to Create the Leakage Template

- 1. Number of sequential loads
- 2. Intermediate instructions
- 3. Respecting page boundary
- 4. Multiple prefetching sequences
- 5. Cache hits

| Code $\mathcal{P}(A)$                                                                      | Behavior and Relations                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
|--------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| ldr x8,[x0,x1]<br>#n1 instructions<br>ldr x8,[x0,x2]<br>#n2 instructions<br>ldr x8,[x0,x3] | $\begin{split} \operatorname{Let} d_1 &= \operatorname{set}(x0 + x2) - \operatorname{set}(x0 + x1), \\ d_2 &= \operatorname{set}(x0 + x3) - \operatorname{set}(x0 + x2), \\ a_{\min} &= x0 + x1,  a_{\max} &= x0 + x3, \\ N_3 &:= (n_1 < 3.n_2 < 3) \lor (n_1 = 0.n_2 = 3), \\ N_4 &:= (n_1 = 3 \land n_2 = 0) \lor (n_1 = 0 \land n_2 = 3), \\ N_7 &:= (n_1 = 4 \land n_2 = 0) \lor (n_1 = 0 \land n_2 = 4) \operatorname{in} \\ & \mathcal{B} & \mathcal{R}(A, b) \\ \hline P_0 & d_1 \neq d_2 \lor d_1 = 0 \lor  d_1  > \delta_{\max} \lor \\ & -\operatorname{samePage}(a_{\min}, a_{\max}) \lor \\ & -\operatorname{samePage}(a_{\min}, a_{\max}) + d_1) \\ \hline P_1 & (N_3 \lor N_1 \lor N_2) \land -\operatorname{samePage}(a_{\max}, a_{\max} + 2d_1) \\ \hline P_2 & (N_3 \lor N_4 \lor N_7) \land -\operatorname{samePage}(a_{\max}, a_{\max} + 2d_1) \\ \hline P_3 & N_3 \lor ((N_4 \lor N_7) \land -\operatorname{samePage}(a_{\max}, a_{\max} + 4d_1)) \\ \hline P_4 & N_4 \lor (N_7 \land -\operatorname{samePage}(a_{\max}, a_{\max} + 5d_1)) \\ \hline P_5 & N_7 \land -\operatorname{samePage}(a_{\max}, a_{\max}, a_{\max} + 5d_1) \\ \hline P_6 & N_7 \land -\operatorname{samePage}(a_{\max}, a_{\max}, a_{\max} + 7d_1) \\ \hline P_7 & N_7 & -\operatorname{samePage}(a_{\max}, a_{\max}, a_{\min} + 7d_1) \\ \hline P_7 & N_7 & -\operatorname{samePage}(a_{\max}, a_{\max}, a_{\min}) \\ \hline \end{pmatrix}$ |

Figure: Leakage Template: Prefetching.  $P_l$  means prefetching l lines.





## Re-Identifying a Prefetching-Based Vulnerability in OpenSSL

## Vulnerability (Shin et al.<sup>2</sup> CCS'18):

Prefetching-based attack on Elliptic Curve Diffie-Hellman (ECDH) in OpenSSL 1.1.0g

## 1. 🗐 Static Analysis

- Search for the code template from the prefetching Leakage Template
- ⇒ Identified 429 matching sequences across 18 OpenSSL modules (including the target code section)

## 2. Dynamic Analysis

- Run code with different inputs
- Evaluate register contents against relations
- ⇒ Different inputs satisfy relations for different behaviors

**Conclusion:** Different classes of inputs are distinguishable based on prefetching behavior.

<sup>&</sup>lt;sup>2</sup>Shin et al., "Unveiling Hardware-Based Data Prefetcher, a Hidden Source of Information Leakage".





## References

- [1] Hamed Nemati et al. "Validation of Abstract Side-Channel Models for Computer Architectures". In: *International Conference on Computer-Aided Verification (CAV)*. 2020. DOI: 10.1007/978-3-030-53288-8\_12.
- [2] Youngjoo Shin et al. "Unveiling Hardware-Based Data Prefetcher, a Hidden Source of Information Leakage". In: Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security (CCS '18). 2018. DOI: 10.1145/3243734.3243736.