

# Side-Channel Attacks

ISSISP 2017 – Gif-sur-Yvette 2017-07-21

Damien Couroussé, CEA – LIST / LIALP; Grenoble Université Alpes damien.courousse@cea.fr

#### COPYRIGHT NOTICE



- Unless explicitly mentioned at the bottom of the page, these slides are distributed under the Creative Commons Attribution 3.0 License
  - You are free:
    - to share—to copy, distribute and transmit the work
    - to remix—to adapt the work
  - under the following conditions:
    - Attribution: You must attribute the work (but not in any way that suggests that the author endorses you or your use of the work) as follows:

"Courtesy of Damien Couroussé, CEA France"

The complete license text can be found at http://creativecommons.org/licenses/by/3.0/legalcode



# **AES, TIME AFTER TIME (BUT SO USEFUL...)**






#### **BESTIARY OF EMBEDDED SYSTEMS**

#### ... IN NEED OF SECURITY CAPABILITIES



**Smart Card** 









... And many other things



Secure Element inside...











#### PHYSICAL ATTACKS: WHY ALL THE FUSS?

#### **Cryptography** is used to secure communications

- **Encrypted data** can be safely sent over an untrusted communication channel
- Cannot recover the encrypted information without the **key**

**Cryptanalysis** studies the mathematical properties of cryptographic algorithms, and provides a "practical" confidence in security bounds.

Security bounds are expressed in terms of attack complexity

**Physical attacks** are the only (effective) way to break cryptography nowadays.

- Sometimes considered as part of cryptanalysis
- But quite different research communities







Courtesy of Sylvain Guilley 2015, Télécom ParisTech - Secure-IC

#### PHYSICAL ATTACKS 101



## An attacker proceeds in two steps:

- 1. Global analysis of the target, looking for potential weaknesses or known vulnerabilities – this step is not considered in the literature.
- 2. Focused attack on a target

**Cryptanalysis**

Out of the scope of this talk

**Reverse engineering** 

Hardware inspection: decapsulation, physical abrasion, chemical etching, visual inspection, etc.

Software inspection: debug, memory dumps, code analysis, etc. [see past lectures this week]

**Passive attacks: side-channel attacks**

Observations: electromagnetic, electrical / power, acoustic, execution time, etc. [you are here]

**Active attacks: fault attacks** 

Laser or other lights illumination, under/over-voltage, clock glitches, electromagnetic perturbations, etc. [next lecture]

**Logical attacks** 

[see past lectures this week]

Sometimes considered as a « solved » issue in High Security products.



#### « PHYSICAL ATTACKS IS SCI-FI »

#### Physical attacks are considered (by software hackers) as not practical

- Require dedicated HW attack benches, can be quite expensive, especially for fault injection (laser benches)
- We also find low cost ones
  - E.g. *The ChipWhisperer*, starting at ~ 300€
- Require human expertise, but more than other attacks



https://newae.com/tools/chipwhisperer



#### « PHYSICAL ATTACKS IS SCI-FI » #2

#### **IoT Goes Nuclear: Creating a ZigBee Chain Reaction**


"Adjacent IoT devices will infect each other with a worm that will rapidly spread over large areas"

Eyal Ronen(⋈)\*, Colin OFlynn<sup>†</sup>, Adi Shamir\* and Achi-Or Weingarten\* PRELIMINARY DRAFT, VERSION 0.91

> \*Weizmann Institute of Science, Rehovot, Israel {eyal.ronen,adi.shamir}@weizmann.ac.il †Dalhousie University, Halifax, Canada coflynn@dal.ca

#### Philips Hue Smart lamp

ZigBee protocol

#### Uploading malicious firmware with OTA update

- Discovered the hex command code for OTA update
- Firmware is protected with a single global key! Using symmetric crypto (AES-CCM).

#### Attack path

- Get access to the key  $\rightarrow$  side-channel attack with power analysis
- Sign a malicious firmware
- Take over bulbs by: plugging a bulb, war-driving around in a car, war-flying with a drone
- Request OTA update
- The malicious firmware can request OTA update to its neighbours to spread.

Other interesting read: N. Timmers and A. Spruyt, "Bypassing Secure Boot using Fault Injection," presented at the Black Hat Europe 2016, 04-Nov-2016.



# SIDE-CHANNEL ATTACKS FOR REVERSE ENGINEERING

- Reverse-engineering from side-channel analysis
  - Even simpler on interpreters
- SCARE attacks: recovering lookup tables with side-channel analysis
- FIRE attacks: using fault attacks

T. Eisenbarth, C. Paar, and B. Weghenkel, "Building a Side Channel Based Disassembler," in Transactions on Computational Science X, Springer, Berlin, Heidelberg, 2010, pp. 78–99.

- M. S. Pedro, M. Soos, and S. Guilley, "FIRE: Fault Injection for Reverse Engineering," in Information Security Theory and Practice. Security and Privacy of Mobile Devices in Wireless Communication, 2011, pp. 280–293.
- C. Clavier, "An Improved SCARE Cryptanalysis Against a Secret A3/A8 GSM Algorithm," in Information Systems Security, 2007, pp. 143–155.



#### THE "DPA" BOOK

# The most comprehensive book about side-channel attacks

- Excellent introduction to side-channel attacks
- Published in 2007: does not cover recent attacks and countermeasures

S. Mangard, E. Oswald, and T. Popp, Power analysis attacks: Revealing the secrets of smart cards, vol. 31. Springer, 2007.





# SIMPLE POWER ANALYSIS (SPA)

Direct interpretation of power consumption measurements

Extraction of information by inspection of single side-channel traces



- Nature of the algorithm
- Structure of the algorithm
  - Number of executions
  - Number of iterations
  - Number of sub-functions
  - Nature of the instructions executed (memory accesses...)
  - Etc.

Illustration of SPA in the wild: C. O'Flynn, "A Lightbulb Worm? A teardown of the Philips Hue.," presented at the Black Hat, 2016. cf. slides ~60 to 70

P. Kocher, J. Jaffe, and B. Jun, "Differential Power Analysis," in Advances in Cryptology — CRYPTO' 99, vol. 1666, M. Wiener, Ed. Springer Berlin Heidelberg, 1999, pp. 388-397.

P. Kocher, J. Jaffe, B. Jun, and P. Rohatgi, "Introduction to differential power analysis," Journal of Cryptographic Engineering, vol. 1, no. 1, pp. 5-27, 2011.



# SIMPLE POWER ANALYSIS (SPA)

#### SPA on RSA [Kocher, 2011]

```
-- Computing c = b ^ e mod m
-- Source: https://en.wikipedia.org/wiki/Modular_exponentiation
function modular_pow(base, exponent, m)
    if m = 1 then return 0
    Assert :: (m - 1) * (m - 1) does not overflow base
    result := 1
    base := base mod m
    while exponent > 0
        if (exponent mod 2 == 1):
           result := (result * base) mod m    -- multiply: only when the exponent bit is 1
        exponent := exponent >> 1
        base := (base * base) mod m           -- square: for every exponent bit
    return result
```

#### Direct access to key contents:

- bit 0 = square
- bit 1 = square, multiply
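
As a rough illustration of why the operation sequence gives away the exponent, the sketch below logs the multiplies and squares performed by the algorithm above, then rebuilds the exponent from that log alone. The function names and the `'M'`/`'S'` log format are hypothetical, and it assumes a modulus below 2^32 so that products fit in 64 bits; a real SPA has to identify the operations from the shape of the trace first.

```c
#include <stddef.h>
#include <stdint.h>

/* Right-to-left square-and-multiply, as in the pseudocode above.
 * ops receives the operation sequence an SPA attacker would read from
 * one trace: 'M' = multiply, 'S' = square. ops must hold 129 characters.
 * Assumption: m < 2^32, so that products fit in 64 bits. */
uint64_t modular_pow_logged(uint64_t base, uint64_t exponent, uint64_t m,
                            char *ops)
{
    size_t n = 0;
    uint64_t result = 1;

    if (m == 1) {
        ops[0] = '\0';
        return 0;
    }
    base %= m;
    while (exponent > 0) {
        if (exponent & 1) {
            result = (result * base) % m;
            ops[n++] = 'M';
        }
        exponent >>= 1;
        base = (base * base) % m;
        ops[n++] = 'S';
    }
    ops[n] = '\0';
    return result;
}

/* Rebuild the exponent from the operation log alone: each 'S' closes one
 * exponent bit (least significant first); a preceding 'M' means bit = 1. */
uint64_t exponent_from_ops(const char *ops)
{
    uint64_t exponent = 0;
    unsigned bit = 0;
    int saw_multiply = 0;

    for (; *ops != '\0'; ops++) {
        if (*ops == 'M') {
            saw_multiply = 1;
        } else {
            if (saw_multiply)
                exponent |= (uint64_t)1 << bit;
            saw_multiply = 0;
            bit++;
        }
    }
    return exponent;
}
```

For example, computing 4^13 mod 497 logs the sequence `MSSMSMS`, from which the exponent 13 (binary 1101) can be read back.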





# DIFFERENTIAL AND CORRELATION POWER ANALYSIS (DPA & CPA)

# Finding a needle in a haystack...

Relationship between the different components of power consumption:

- Power signal: a static and a dynamic component.
  - Static component: power consumption of the gate states  $\rightarrow$  a \* HW(state)
  - Dynamic component: power consumption of transitions in gate states  $\rightarrow$  b \* HD(state[i], state[i-1])

- Other needles & stacks
  - Electromagnetic emissions
  - Execution time
  - Chip temperature
  - Etc.
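
The two power models above can be sketched in a few lines of C. This is only an illustration for 8-bit states: the function names are hypothetical, and the coefficients `a` and `b` are device-dependent values that an attacker does not need to know exactly.

```c
#include <stdint.h>

/* Hamming weight: number of bits set in an 8-bit state. */
unsigned hamming_weight(uint8_t x)
{
    unsigned n = 0;
    while (x) {
        n += x & 1u;
        x >>= 1;
    }
    return n;
}

/* Hamming distance: number of bits that toggle between two states. */
unsigned hamming_distance(uint8_t a, uint8_t b)
{
    return hamming_weight((uint8_t)(a ^ b));
}

/* Noise-free model of one state update:
 * a * HW(state) + b * HD(state, prev_state).
 * a and b are hypothetical, device-dependent coefficients. */
double power_model(double a, double b, uint8_t prev_state, uint8_t state)
{
    return a * hamming_weight(state)
         + b * hamming_distance(state, prev_state);
}
```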



#### CPA – MEASUREMENT SETUP

- Target: STM32 ARM Cortex-M3 @ 24MHz, 128KB flash, 8KB RAM
- The AES key is fixed
- A GPIO trigger is used to facilitate the trace measurements
- The attacker knows either the plaintext or the ciphertext (public data)
- Chosen-plaintext attack:
  - Generate D random plaintexts
  - Request the ciphertext from the target
  - Record the EM trace during encryption
- Run the computational analysis!











m: plaintext -> controlled by the attacker or observable

k: cipher key -> unknown to the attacker
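
The computational analysis can be sketched as follows, assuming one leakage sample per trace, a Hamming Weight model, and the output of the first SubBytes as the targeted intermediate value. All function names are hypothetical, and the squared Pearson correlation is used only to avoid a square root; a real attack repeats this for every sample point of the traces and every key byte.

```c
#include <stddef.h>
#include <stdint.h>

uint8_t rotl8(uint8_t x, unsigned s)
{
    return (uint8_t)((x << s) | (x >> (8 - s)));
}

/* Build the AES S-box from its definition:
 * inverse in GF(2^8) followed by the affine transformation. */
void init_sbox(uint8_t sbox[256])
{
    uint8_t p = 1, q = 1;
    do {
        /* p <- p * 3 */
        p = (uint8_t)(p ^ (p << 1) ^ ((p & 0x80) ? 0x1B : 0));
        /* q <- q / 3, so that q stays the inverse of p */
        q ^= (uint8_t)(q << 1);
        q ^= (uint8_t)(q << 2);
        q ^= (uint8_t)(q << 4);
        q ^= (q & 0x80) ? 0x09 : 0;
        sbox[p] = (uint8_t)(q ^ rotl8(q, 1) ^ rotl8(q, 2)
                              ^ rotl8(q, 3) ^ rotl8(q, 4) ^ 0x63);
    } while (p != 1);
    sbox[0] = 0x63; /* 0 has no inverse */
}

unsigned hw8(uint8_t x)
{
    unsigned n = 0;
    while (x) {
        n += x & 1u;
        x >>= 1;
    }
    return n;
}

/* CPA on one key byte, one leakage sample per trace.
 * pt[i]   : plaintext byte of trace i (known to the attacker)
 * leak[i] : measured leakage of trace i at the sample point of interest
 * Returns the key guess k whose prediction HW(sbox[pt ^ k]) has the
 * highest squared Pearson correlation with the measurements. */
uint8_t cpa_best_key(const uint8_t *pt, const double *leak, size_t n,
                     const uint8_t sbox[256])
{
    double best = -1.0;
    uint8_t best_key = 0;

    for (unsigned k = 0; k < 256; k++) {
        double sh = 0, sl = 0, shh = 0, sll = 0, shl = 0;
        for (size_t i = 0; i < n; i++) {
            double h = hw8(sbox[pt[i] ^ k]); /* hypothetical leakage */
            double l = leak[i];
            sh += h;
            sl += l;
            shh += h * h;
            sll += l * l;
            shl += h * l;
        }
        double cov = n * shl - sh * sl;
        double vh = n * shh - sh * sh;
        double vl = n * sll - sl * sl;
        double r2 = (vh > 0 && vl > 0) ? (cov * cov) / (vh * vl) : 0.0;
        if (r2 > best) {
            best = r2;
            best_key = (uint8_t)k;
        }
    }
    return best_key;
}
```

With noise-free simulated leakage, the correct key byte reaches a correlation of 1; on real measurements the peak is much lower and more traces are needed.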





#### **CPA RESULTS**





#### ESTIMATING THE SUCCESS OF AN ATTACK

# Success rate: probability that an attack recovers the correct key
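
In practice the success rate is estimated empirically by repeating the attack on independent sets of traces. A minimal sketch, assuming the rank of the correct key has already been computed for each run (the function name and the `ranks` array are hypothetical):

```c
#include <stddef.h>

/* Empirical success rate of order `order`: fraction of attack runs in
 * which the correct key is among the `order` best-ranked candidates.
 * ranks[i] is the rank of the correct key in run i (1 = best guess). */
double success_rate(const unsigned *ranks, size_t runs, unsigned order)
{
    size_t hits = 0;

    if (runs == 0)
        return 0.0;
    for (size_t i = 0; i < runs; i++) {
        if (ranks[i] <= order)
            hits++;
    }
    return (double)hits / (double)runs;
}
```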



F.-X. Standaert, T. Malkin, and M. Yung, "A Unified Framework for the Analysis of Side-Channel Key Recovery Attacks.," in Eurocrypt, 2009, vol. 5479, pp. 443–461.


#### SECURITY EVALUATION

- **CPA / DPA ... attacks do not constitute a security evaluation.**
- Playing the role of the attacker is great, but the attacker
  - is focused on a potential vulnerability
  - follows a specific attack path
- Starting from the previous attack, we could change
  - The hypothetical intermediate values: output of 1st SubBytes, output of 1st AddRoundKey, input of the 10th SubBytes...
  - The power model: Hamming Weight, Hamming Distance, no power model...
  - The distinguisher: Pearson Correlation, Mutual Information...
  - There are many other attacks!
- Our evaluation target is very "leaky" (fewer than 1000 traces are enough)
  - Unprotected components executed on more complex targets (e.g. ARM Cortex-A9) will require 100,000 to 10<sup>6</sup> traces.
  - What about attacking a counter-measure in this case?
- As a security designer, you need to cover all the possible attack paths



#### **SECURITY EVALUATION – T-TEST**

# **TVLA: Test Vector Leakage Assessment**

- Exploit Welch's t-test to assess the amount of information leakage
- Extract two populations of side-channel observations (traces)
- Test the null hypothesis: the two populations are not statistically distinguishable → no information leakage
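
A minimal sketch of the statistic for one sample point of the traces is given below. The function names are hypothetical. The squared statistic is compared against 4.5², which corresponds to the |t| > 4.5 threshold conventionally used with TVLA; this also avoids a square root.

```c
#include <float.h>
#include <stddef.h>

void mean_var(const double *x, size_t n, double *mean, double *var)
{
    double m = 0.0, s = 0.0;

    for (size_t i = 0; i < n; i++)
        m += x[i];
    m /= (double)n;
    for (size_t i = 0; i < n; i++)
        s += (x[i] - m) * (x[i] - m);
    *mean = m;
    *var = s / (double)(n - 1); /* unbiased sample variance */
}

/* Squared Welch t statistic between two populations of observations,
 * taken at the same sample point. Requires n0 >= 2 and n1 >= 2. */
double welch_t_squared(const double *q0, size_t n0,
                       const double *q1, size_t n1)
{
    double m0, v0, m1, v1;

    mean_var(q0, n0, &m0, &v0);
    mean_var(q1, n1, &m1, &v1);

    double d = m0 - m1;
    double denom = v0 / (double)n0 + v1 / (double)n1;
    if (denom == 0.0)
        return (d == 0.0) ? 0.0 : DBL_MAX;
    return (d * d) / denom;
}

/* Rejects the null hypothesis (i.e. reports leakage) when |t| > 4.5. */
int tvla_leak_detected(const double *q0, size_t n0,
                       const double *q1, size_t n1)
{
    return welch_t_squared(q0, n0, q1, n1) > 4.5 * 4.5;
}
```

An evaluation applies this test independently at every sample point of the traces.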

G. Goodwill, B. Jun, J. Jaffe, and P. Rohatgi, "A testing methodology for side-channel resistance validation," in NIST non-invasive attack testing workshop, 2011.

D. B. Roy, S. Bhasin, S. Guilley, A. Heuser, S. Patranabis, and D. Mukhopadhyay, "Leak Me If You Can: Does TVLA Reveal Success Rate?," IACR Cryptology ePrint Archive, Report 2016/1152, 2016.

T. Schneider and A. Moradi, "Leakage Assessment Methodology - a clear roadmap for side-channel evaluations," IACR Cryptology ePrint Archive, Report 2015/207, 2015.






#### **SPECIFIC T-TEST**

$$Q_0 = \{T_i \mid \text{target bit}(D_i) = 0\}, \qquad Q_1 = \{T_i \mid \text{target bit}(D_i) = 1\}.$$

$$Q_0 = \{T_i \mid \text{target byte}(D_i) = x\}, \qquad Q_1 = \{T_i \mid \text{target byte}(D_i) \neq x\}.$$

Number of measurements for a security evaluation?



#### **NON-SPECIFIC T-TEST**

Q0: fixed input plaintext

Q1: random input plaintext



# COUNTER-MEASURES AGAINST SIDE-CHANNEL ATTACKS

MASKING & HIDING



#### **MASKING**

In a masked implementation, each intermediate value v **is concealed** by a random value m that is called mask: Vm = v \* m. The mask m is generated internally, i.e. inside the cryptographic device, and varies from execution to execution. Hence, it is not known by the attacker.



[DPA book]

- **Boolean masking:** operator \* is *xor*
- **Arithmetic masking:** operator \* is the modular addition or the modular multiplication

Objective: each masked variable is statistically independent of the secret v.

A (first-order) CPA attack can recover a (first-order) masked variable, but this knowledge is not sufficient to recover the secret value.

Masking countermeasures are applied at the algorithmic level.
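
As an illustration of Boolean masking at the algorithmic level, the sketch below recomputes a masked lookup table so that a table lookup can be performed on a masked input without ever exposing the unmasked value. The function names are hypothetical; `m_in` and `m_out` stand for fresh random masks drawn inside the device at each execution.

```c
#include <stdint.h>

/* Boolean masking (and unmasking) of a value: v_m = v xor m. */
uint8_t mask_value(uint8_t v, uint8_t m)
{
    return (uint8_t)(v ^ m);
}

/* Recompute a masked table such that, for every x,
 *     masked_table[x ^ m_in] == table[x] ^ m_out.
 * A lookup with a masked input then directly yields a masked output. */
void mask_table(const uint8_t table[256], uint8_t m_in, uint8_t m_out,
                uint8_t masked_table[256])
{
    for (unsigned x = 0; x < 256; x++)
        masked_table[x ^ m_in] = (uint8_t)(table[x] ^ m_out);
}
```

With this table, `masked_table[mask_value(v, m_in)]` equals `table[v] ^ m_out`: neither `v` nor `table[v]` appears unmasked during the lookup.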



#### HIGHER-ORDER MASKING

Our following discussions will be based on the parallel implementation of a masking scheme such as described in [2]. More precisely, we will consider the simplest example where all the shares are in  $\mathsf{GF}(2)$  (generalizations to larger fields follow naturally). In this setting, we have a sensitive variable x that is split into m shares such that  $x = x_1 \oplus x_2 \oplus \ldots \oplus x_m$ , with  $\oplus$  the bitwise XOR. The first m-1 shares are picked up uniformly at random:  $(x_1, x_2, \ldots, x_{m-1}) \stackrel{\mathtt{R}}{\leftarrow} \{0, 1\}^{m-1},$ and the last one is computed as  $x_m = x \oplus x_1 \oplus x_2 \oplus \ldots \oplus x_{m-1}$ .

Denoting the vector of shares  $(x_1, x_2, \ldots, x_m)$  as  $\bar{x}$ , we will consider an adversary who observes a single leakage sample corresponding to the parallel manipulation of these shares. A simple model for this setting is to assume this sample to be a linear combination of the shares, namely:

$$\mathsf{L}_1(\bar{x}) = \left(\sum_{i=1}^m \alpha_i \cdot x_i\right) + N,$$
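
The sharing and the leakage model above can be sketched as follows for shares in GF(2). The function names are hypothetical, and the xorshift generator is only a toy stand-in for the device's random number generator; it is not suitable for real masking.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy xorshift32 generator (state must be non-zero).
 * Assumption: stands in for a proper random number generator. */
uint32_t xorshift32(uint32_t *s)
{
    uint32_t x = *s;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *s = x;
    return x;
}

/* Split the bit x into m >= 1 shares: the first m-1 shares are random,
 * the last one is x xor-ed with all the others. */
void share_bit(unsigned x, unsigned *shares, size_t m, uint32_t *rng)
{
    unsigned last = x & 1u;

    for (size_t i = 0; i + 1 < m; i++) {
        shares[i] = xorshift32(rng) >> 31;
        last ^= shares[i];
    }
    shares[m - 1] = last;
}

/* Recombine the shares: x = x_1 xor x_2 xor ... xor x_m. */
unsigned unshare_bit(const unsigned *shares, size_t m)
{
    unsigned x = 0;

    for (size_t i = 0; i < m; i++)
        x ^= shares[i];
    return x;
}

/* Leakage sample L_1: linear combination of the shares, plus noise. */
double leakage_l1(const unsigned *shares, const double *alpha, size_t m,
                  double noise)
{
    double l = noise;

    for (size_t i = 0; i < m; i++)
        l += alpha[i] * (double)shares[i];
    return l;
}
```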

F.-X. Standaert, "How (not) to Use Welch's T-test in Side-Channel Security Evaluations," IACR Cryptology ePrint Archive, Report 2017/138, 2017.



#### HIDING

The goal of hiding countermeasures is to make the **power consumption** of cryptographic devices independent of the intermediate values and independent of the operations that are performed. There are essentially two approaches to achieve this independence.



- 1. the **power consumption is random**.
- 2. consume an equal amount of power for all operations and for all data values.

[DPA book]

Hiding countermeasures aim at breaking the observable relation between the algorithm (operations and intermediate variables) and observations.



**Information leakage**: information related to secret data and secret operations "sneaks" outside of the secured component (via a side channel)

**Hiding**: "reducing the SNR", where

- Signal -> information leakage
- Noise -> everything else
- Temporal dispersion: spread leakage at different computation times
  - Shuffle independent operations
  - Insert «dummy» operations to randomly delay the secret computation
- Spatial dispersion:
  - Move the leaky computation at different places in the circuit
    - E.g. use different registers
  - Modify the "appearance" of information leakage
    - E.g. use different operations
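
Shuffling independent operations can be sketched on SubBytes, where the 16 table lookups can be executed in any order. This is a minimal illustration with hypothetical names; `rand()` is used only for brevity and is neither an unbiased nor an unpredictable source of randomness.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define STATE_SIZE 16

/* SubBytes where the 16 independent lookups are executed in a random
 * order, so that a given state byte is not processed at a fixed time. */
void sub_bytes_shuffled(uint8_t state[STATE_SIZE], const uint8_t sbox[256])
{
    uint8_t order[STATE_SIZE];

    for (size_t i = 0; i < STATE_SIZE; i++)
        order[i] = (uint8_t)i;

    /* Fisher-Yates shuffle of the processing order. */
    for (size_t i = STATE_SIZE - 1; i > 0; i--) {
        size_t j = (size_t)rand() % (i + 1);
        uint8_t tmp = order[i];
        order[i] = order[j];
        order[j] = tmp;
    }

    for (size_t i = 0; i < STATE_SIZE; i++) {
        size_t k = order[i];
        state[k] = sbox[state[k]];
    }
}
```

The result is the same as with the sequential loop; only the time at which each byte leaks changes from one execution to the next.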

In practice, a secured product combines masking and hiding countermeasures.

# STANDARD COMPILERS AND SECURITY



#### **COMPILER DUTIES & OBJECTIVES**

- Duties: assurance of functional equivalence between source code and machine code
  - "functional" / "functionality" is usually not precisely defined
    - Side effects?
    - Determinism of time behaviour? (real time execution)
    - Lazy evaluation?
  - No formal assurance
    - Except few works, such as CompCert
  - Correctness by construction?
    - The source code written by the developer is not always valid



- Objectives: optimisation of the machine code
  - Execution time
  - Resources: e.g. memory consumption
  - Energy consumption, power consumption
  - There is no complete criterion for optimality, and no convergence
    - Nature of the algorithm used
    - Relation to architecture / micro-architecture
    - Data


#### **COMPILER RIGHTS**



#### Rights

- Reorganise contents of the target program, as long as program semantics is preserved
  - Machine instructions, basic blocks
- Select the best translation for a source code operation / instruction
- Remove parts of the program, as long as the program functionality is considered to be preserved (i.e. the computation does not participate in producing the program results)

#### Some classical optimisation passes:

- dead code elimination
- global value numbering
- common-subexpression elimination
- strength reduction
- loop strength reduction, loop simplification, loop-invariant code motion

## LLVM's Analysis and Transform Passes, as of 2016/06/30

- 40 analysis passes
- 56 transformation/optimisation passes
- 10 utility passes
- ... backend

# USE OF A STANDARD COMPILER, IMPACT ON SECURITY



#### INSERTION OF DUMMY INSTRUCTIONS

Inserting a static procedure for desynchronisation

```
/* subBytes
 * Table Lookup
 */
void subBytes_f(void)
{
    int i;
    for(i = 0; i < 16; i += 4)
    {
        state[i+0] = sbox[ state[i+0] ];
        state[i+1] = sbox[ state[i+1] ];
        state[i+2] = sbox[ state[i+2] ];
        state[i+3] = sbox[ state[i+3] ];
    }
}

void noiseCoron(void)
{
    size_t i;
    if(nbIt_Coron == N) {
        genNoiseCoron();
    }

    /* random delay */
    i = 0;
    while(i < table_d[nbIt_Coron]) {
        i++;
    }

    nbIt_Coron++;
}
```

Also possible (even better) with a timer and an interrupt handler

Coron, J. S., & Kizhvatov, I. An efficient method for random delay generation in embedded software. In Cryptographic Hardware and Embedded Systems-CHES 2009 (pp. 156-170). Springer (2009).

Coron, J.S., Kizhvatov, I. Analysis and improvement of the random delay countermeasure of CHES 2009. In: CHES. pp. 95–109. Springer (2010).



```
void noiseCoron(void)
{
    size_t i;
    if(nbIt_Coron == N) {
        genNoiseCoron();
    }

    /* random delay */
    i = 0;
    while(i < table_d[nbIt_Coron]) {
        i++;
    }

    nbIt_Coron++;
}
```

#### **INSERTION OF DUMMY INSTRUCTIONS**

#### Compiled with -Os:

```
Dump of assembler code for function noiseCoron:
   0x0000859c <+0>:     push    {r4, lr}
   0x000085a0 <+4>:     ldr     r4, [pc, #28]   ; <noiseCoron+40>
   0x000085a4 <+8>:     ldr     r3, [r4]        ; r3 ← nbIt_Coron
   0x000085a8 <+12>:    cmp     r3, #160        ; nbIt_Coron ?= N
   0x000085ac <+16>:    bne     0x85b4 <noiseCoron+24>
   0x000085b0 <+20>:    bl      0x8524 <genNoiseCoron>
   0x000085b4 <+24>:    ldr     r3, [r4]
   0x000085b8 <+28>:    add     r3, r3, #1      ; nbIt_Coron++
   0x000085bc <+32>:    str     r3, [r4]
   0x000085c0 <+36>:    pop     {r4, pc}
   0x000085c4 <+40>:    andeq   r0, r1, r0, lsr r8
End of assembler dump.
```

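??? The random delay loop has disappeared: it does not contribute to the program results, so the compiler removed it.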





```
void noiseCoron(void)
{
    size_t i;
    if(nbIt_Coron == N) {
        genNoiseCoron();
    }

    /* random delay */
    i = 0;
    while(i < table_d[nbIt_Coron]) {
        i++;
        asm("nop;");
    }

    nbIt_Coron++;
}
```

#### **INSERTION OF DUMMY INSTRUCTIONS**

#### Compiled with -Os:

```
Dump of assembler code for function noiseCoron:
   0x0000859c <+0>:     push    {r4, lr}
   0x000085a0 <+4>:     ldr     r4, [pc, #60]   ; <noiseCoron+72>
   0x000085a4 <+8>:     ldr     r3, [r4]
   0x000085a8 <+12>:    cmp     r3, #160        ; nbIt_Coron ?= N
   0x000085ac <+16>:    bne     0x85b4 <noiseCoron+24>
   0x000085b0 <+20>:    bl      0x8524 <genNoiseCoron>
   0x000085b4 <+24>:    ldr     r3, [pc, #44]   ; <noiseCoron+76>
   0x000085b8 <+28>:    ldr     r2, [r4]
   0x000085bc <+32>:    ldr     r1, [r3, r2, lsl #2]
   0x000085c0 <+36>:    mov     r3, #0          ; i ← 0
   0x000085c4 <+40>:    cmp     r3, r1          ; i ?= table_d[nbIt_Coron]
   0x000085c8 <+44>:    beq     0x85d8 <noiseCoron+60>
   0x000085cc <+48>:    add     r3, r3, #1      ; i ← i+1
   0x000085d0 <+52>:    nop
   0x000085d4 <+56>:    b       0x85c4 <noiseCoron+40>
   0x000085d8 <+60>:    add     r2, r2, #1      ; nbIt_Coron++
   0x000085dc <+64>:    str     r2, [r4]
   0x000085e0 <+68>:    pop     {r4, pc}
   0x000085e4 <+72>:    andeq   r0, r1, r4, asr r8
   0x000085e8 <+76>:    andeq   r0, r1, r12, asr r8
End of assembler dump.
```



```
void noiseCoron(void)
{
    size_t i;
    if(nbIt_Coron == N) {
        genNoiseCoron();
    }

    /* random delay */
    i = 0;
    while(i < table_d[nbIt_Coron]) {
        i++;
        asm("");
    }

    nbIt_Coron++;
}
```

#### INSERTION OF DUMMY INSTRUCTIONS

#### Compiled with -Os:

```
Dump of assembler code for function noiseCoron:
   0x0000859c <+0>:     push    {r4, lr}
   0x000085a0 <+4>:     ldr     r4, [pc, #56]   ; <noiseCoron+68>
   0x000085a4 <+8>:     ldr     r3, [r4]
   0x000085a8 <+12>:    cmp     r3, #160        ; 0xa0
   0x000085ac <+16>:    bne     0x85b4 <noiseCoron+24>
   0x000085b0 <+20>:    bl      0x8524 <genNoiseCoron>
   0x000085b4 <+24>:    ldr     r3, [pc, #40]   ; <noiseCoron+72>
   0x000085b8 <+28>:    ldr     r2, [r4]
   0x000085bc <+32>:    ldr     r1, [r3, r2, lsl #2]
   0x000085c0 <+36>:    mov     r3, #0
   0x000085c4 <+40>:    cmp     r3, r1
   0x000085c8 <+44>:    beq     0x85d4 <noiseCoron+56>
   0x000085cc <+48>:    add     r3, r3, #1
   0x000085d0 <+52>:    b       0x85c4 <noiseCoron+40>
   0x000085d4 <+56>:    add     r2, r2, #1
   0x000085d8 <+60>:    str     r2, [r4]
   0x000085dc <+64>:    pop     {r4, pc}
   0x000085e0 <+68>:    andeq   r0, r1, r0, asr r8
   0x000085e4 <+72>:    andeq   r0, r1, r8, asr r8
End of assembler dump.
```
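
Another commonly used way to keep such a loop is to declare the counter `volatile`. This variant is not taken from the slides; it is a sketch with a hypothetical function name, relying on the rule that accesses to a volatile object are observable behaviour the compiler has to preserve. As with the inline assembly variants, the generated machine code should still be inspected.

```c
#include <stddef.h>

/* Random delay loop with a volatile counter: every read and write of i
 * is a volatile access, so the loop cannot be optimised away. */
size_t delay_loop(size_t iterations)
{
    volatile size_t i = 0;

    while (i < iterations) {
        i++;
    }
    return i;
}
```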





- Protection against power analysis using a Hamming Distance model
- Example: leakage when a value v is loaded into memory or into a register, after an instruction that manipulates a value k:

$$\#1:\quad \text{insn } k, \qquad \text{mem} \leftarrow v \qquad \text{Leakage: } HD(v, k)$$

$$\#2:\quad \text{insn } k, \qquad \text{reg} \leftarrow v$$

Random precharging: the variable assignment is preceded by an assignment using a mask *m*, unknown to the attacker:

```
   insn k
#1 mem <- m
   mem <- v
               Leakage:
               HD(v,m) = HW(v ⊕ m)
   insn k
#2 reg <- m
   reg <- v
```
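
The benefit of precharging can be checked numerically: when the mask m is uniformly distributed, the distribution of HD(v, m) = HW(v ⊕ m) is the same for every v, so its average carries no information about v. The sketch below (hypothetical function names, 8-bit values) sums the leakage over all 256 masks; the sum is always 1024, i.e. an average of 4 toggled bits.

```c
#include <stdint.h>

unsigned hw8(uint8_t x)
{
    unsigned n = 0;
    while (x) {
        n += x & 1u;
        x >>= 1;
    }
    return n;
}

/* Sum over all 256 possible masks m of HD(v, m) = HW(v xor m).
 * v xor m takes every 8-bit value exactly once, so the sum does not
 * depend on v. */
unsigned precharge_leakage_sum(uint8_t v)
{
    unsigned sum = 0;

    for (unsigned m = 0; m < 256; m++)
        sum += hw8((uint8_t)(v ^ m));
    return sum;
}
```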

```
#include <stddef.h>
#include <stdint.h>

#define SBOX_SIZE 256
uint8_t sbox[SBOX_SIZE];
#define STATE_SIZE 16
uint8_t state[STATE_SIZE];

/* subBytes, table lookup */
void subBytes(void)
{
    size_t i;
    for(i = 0; i < STATE_SIZE; i++) {
        state[i] = sbox[state[i]];
    }
}
```

#### Compiled with -Os:

```
0x0000 <+0>:   mov   r3, #0
0x0004 <+4>:   ldr   r2, [pc, #28]  ; 0x28 <subBytes+40>
0x0008 <+8>:   ldr   r0, [pc, #28]  ; 0x2c <subBytes+44>
0x000c <+12>:  ldrb  r1, [r3, r2]
0x0010 <+16>:  ldrb  r1, [r0, r1]
0x0014 <+20>:  strb  r1, [r3, r2]   ; leaky instruction
0x0018 <+24>:  add   r3, r3, #1
0x001c <+28>:  cmp   r3, #16
0x0020 <+32>:  bne   0xc <subBytes+12>
0x0024 <+36>:  bx    lr
0x0028 <+40>:  andeq r0, r0, r0
0x002c <+44>:  andeq r0, r0, r0
```



```
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SBOX_SIZE 256
uint8_t sbox[SBOX_SIZE];
#define STATE_SIZE 16
uint8_t state[STATE_SIZE];

/* subBytes, table lookup, with random precharging */
void subBytes(void)
{
    size_t i;
    uint8_t mask, tmp_state;
    for(i = 0; i < STATE_SIZE; i++) {
        tmp_state = state[i];
        mask = rand() & 0xFF;
        state[i] = mask;             /* precharge with a random mask */
        state[i] = sbox[tmp_state];
    }
}
```

#### Compiled with -Os:

```
0x0000 <+0>:   push  {r4, r5, r6, r7, r8, lr}
0x0004 <+4>:   mov   r4, #0
0x0008 <+8>:   ldr   r5, [pc, #36]  ; <subBytes+52>
0x000c <+12>:  ldr   r7, [pc, #36]  ; <subBytes+56>
0x0010 <+16>:  ldrb  r6, [r4, r5]
                                    ; ??? the store of the mask is missing
0x0014 <+20>:  bl    0x14 <subBytes+20>
0x0018 <+24>:  ldrb  r3, [r7, r6]
0x001c <+28>:  strb  r3, [r4, r5]
0x0020 <+32>:  add   r4, r4, #1
0x0024 <+36>:  cmp   r4, #16
0x0028 <+40>:  bne   0x10 <subBytes+16>
0x002c <+44>:  pop   {r4, r5, r6, r7, r8, lr}
0x0030 <+48>:  bx    lr
0x0034 <+52>:  andeq r0, r0, r0
0x0038 <+56>:  andeq r0, r0, r0
```





```
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SBOX_SIZE 256
uint8_t sbox[SBOX_SIZE];
#define STATE_SIZE 16
uint8_t volatile state[STATE_SIZE];

/* subBytes, table lookup, with random precharging */
void subBytes(void)
{
    size_t i;
    uint8_t mask, tmp_state;
    for(i = 0; i < STATE_SIZE; i++) {
        tmp_state = state[i];
        mask = rand() & 0xFF;
        state[i] = mask;             /* volatile: the store must be kept */
        state[i] = sbox[tmp_state];
    }
}
```

#### Compiled with -Os:

```
0x0000 <+0>:   push  {r4, r5, r6, r7, r8, lr}
0x0004 <+4>:   mov   r4, #0
0x0008 <+8>:   ldr   r5, [pc, #48]  ; <subBytes+64>
0x000c <+12>:  ldr   r7, [pc, #48]  ; <subBytes+68>
0x0010 <+16>:  ldrb  r6, [r5, r4]
0x0014 <+20>:  bl    0x14 <subBytes+20>
0x0018 <+24>:  and   r6, r6, #255   ; 0xff
0x001c <+28>:  ldrb  r3, [r7, r6]
0x0020 <+32>:  and   r0, r0, #15
0x0024 <+36>:  strb  r0, [r5, r4]
0x0028 <+40>:  strb  r3, [r5, r4]
0x002c <+44>:  add   r4, r4, #1
0x0030 <+48>:  cmp   r4, #16
0x0034 <+52>:  bne   0x10 <subBytes+16>
0x0038 <+56>:  pop   {r4, r5, r6, r7, r8, lr}
0x003c <+60>:  bx    lr
0x0040 <+64>:  andeq r0, r0, r0
0x0044 <+68>:  andeq r0, r0, r0
```





```
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SBOX_SIZE 256
uint8_t sbox[SBOX_SIZE];
#define STATE_SIZE 16
uint8_t volatile state[STATE_SIZE];

void subBytes(void)
{
    size_t i;
    uint8_t mask, tmp_state;
    for(i = 0; i < STATE_SIZE; i++) {
        tmp_state = state[i];
        mask = rand() & 0x000F;
        state[i] = mask;
        state[i] = sbox[tmp_state];
    }
}
```

#### Compiled with -O1:

```
0x0000 <+0>:   push  {r4, r5, r6, r7, r8, lr}
0x0004 <+4>:   mov   r4, #0
0x0008 <+8>:   ldr   r6, [pc, #48]  ; <subBytes+64>
0x000c <+12>:  ldr   r7, [pc, #48]  ; <subBytes+68>
0x0010 <+16>:  ldrb  r5, [r6, r4]
0x0014 <+20>:  and   r5, r5, #255   ; 0xff
0x0018 <+24>:  bl    0x18 <subBytes+24>
0x001c <+28>:  and   r0, r0, #15
0x0020 <+32>:  strb  r0, [r6, r4]
0x0024 <+36>:  ldrb  r3, [r7, r5]
0x0028 <+40>:  strb  r3, [r6, r4]
0x002c <+44>:  add   r4, r4, #1
0x0030 <+48>:  cmp   r4, #16
0x0034 <+52>:  bne   0x10 <subBytes+16>
0x0038 <+56>:  pop   {r4, r5, r6, r7, r8, lr}
0x003c <+60>:  bx    lr
0x0040 <+64>:  andeq r0, r0, r0
0x0044 <+68>:  andeq r0, r0, r0
```



Huh??

So...

Let's avoid compiler optimisations!



#### **COMPILING WITH -O0**

- All program variables are moved onto the stack before anything else
- Register spilling (> -O0): the register value is moved to the stack
  - ⇒ Information leakage!
- Bigger code size -> larger attack surface
  - ⇒ More potential vulnerabilities

```
Dump of assembler code for function subBytes:
0x84e4 <+0>:    push  {r11}           ; (str r11, [sp, #-4]!)
0x84e8 <+4>:    add   r11, sp, #0
0x84ec <+8>:    sub   sp, sp, #12
0x84f0 <+12>:   mov   r3, #0
0x84f4 <+16>:   str   r3, [r11, #-8]
0x84f8 <+20>:   b     0x8530 <subBytes+76>
0x84fc <+24>:   ldr   r2, [pc, #68]   ; <subBytes+100>
0x8500 <+28>:   ldr   r3, [r11, #-8]
0x8504 <+32>:   add   r3, r2, r3
0x8508 <+36>:   ldrb  r3, [r3]
0x850c <+40>:   ldr   r2, [pc, #56]   ; <subBytes+104>
0x8510 <+44>:   ldrb  r2, [r2, r3]
0x8514 <+48>:   ldr   r1, [pc, #44]   ; <subBytes+100>
0x8518 <+52>:   ldr   r3, [r11, #-8]
0x851c <+56>:   add   r3, r1, r3
0x8520 <+60>:   strb  r2, [r3]
0x8524 <+64>:   ldr   r3, [r11, #-8]
0x8528 <+68>:   add   r3, r3, #1
0x852c <+72>:   str   r3, [r11, #-8]
0x8530 <+76>:   ldr   r3, [r11, #-8]
0x8534 <+80>:   cmp   r3, #15
0x8538 <+84>:   bls   0x84fc <subBytes+24>
0x853c <+88>:   sub   sp, r11, #0
0x8540 <+92>:   pop   {r11}           ; (ldr r11, [sp], #4)
0x8544 <+96>:   bx    lr
0x8548 <+100>:  andeq r0, r1, r4, lsl #15
0x854c <+104>:  muleq r1, r4, r7
```








