avelardi/branchscope-en.md

## branchscope-en.md

      
    Raw
  

              branchscope-en.md
            
          
    BranchScope Article memo by herumi

Original here, just run through Google Translate and fixed formatting.

Whitepaper: BranchScope: A New Side-ChannelAttack on Directional Branch Predictor
Caution: I still do not understand the essential part


I might update it (maybe I can not dig into anymore because the condition that can be attacked is too severe to think that it is usually impossible)

Overview


BranchScope: Side channel attack that guesses the direction of arbitrary conditional branch by manipulating the direction prediction (Directional branch predictor) of the shared branch
Here the direction of branch is likely to branch (Taken) · prediction that it will not branch (Not taken)
It is different from existing attack using BTB (branch target buffer)
BTB is a buffer that records the result of the conditional branch executed last time
Executable in user space
Error rate is 1% or less
Can attack against SGX be possible?

Branch prediction unit BPU (branch prediction units)


BPU: Predicts a conditional branch
It consists largely of two kinds of elements

L1 (level-1) prediction

PHT (pattern history table) 2 bit saturation counter

Two mode prediction associated with program counter (pc)


gshare format L2 prediction

Prediction using past information other than branch address


Branch prediction is done taking both predictions into account

This attack uses only L1 prediction


Attack model

Character


victim (victim) has confidential information
spy (attacker) steals information without directly accessing the secret information

condition


victim and spy are running on the same physical core

To share BPU


Victim runs slowly

Victim's program must move slowly for high-precision branchscope attack

Do it with another attack (performance degradation attack etc.) to do it
Fiddling with the scheduler


It is possible to control timing of code execution of victim

Attack Summary


Stage 1: Initialize the PHT entry

Attacker brings PHT to a certain state
spy executes the code to be described later


Stage 2: Running victim

Execute only one conditional branch (?)


Stage 3: Exploring PHT entries

spy executes the same conditional branch as the PHT entry used by victim and observes it


Attack possibility


First of all let me do 1-level prediction only
how?
In the experiment of branch prediction, the mistake rate of about 50% drops to 10% at 3, 4 times at the first time and it becomes almost 0% at 5 to 7 times
Since L1 predictable information does not exist in the initial state, L1 prediction is used

randomize_pht:
  cmp% rcx,% rcx
  je. L 0
  nop
.L0:
  jne. L1
  nop
.L1:
  je. L2
  ...
. L99999:
  nop
`` `
* Choose je and jne randomly
* It repeats it 100000 times
* Experiment result L2 prediction is no longer used

# L1 predicted 2-bit FSM (finite state machine)

* T; taken (branching)
* N; not taken (not branching)
* H; Branch prediction success (hit)
* M; branch prediction failure (miss)
* SN: strongly not taken; it does not branch strongly
* WN: weakly not taken; Weakly not to branch
* ST: strongly taken; Strongly branching
* WT: weakly taken; Weakly branching


   T T T
SN → WN → WT → ST
   ← ← ←
   N N N


State after Prime
Prime
State after Target
Target
Probe
Observation


TTT
ST
T
ST
TT
HH


TTT
ST
T
ST
NN
MM


TTT
ST
N
WT
TT
HH


TTT
ST
N
WT
NN
MH


NNN
SN
T
WN
TT
MH


NNN
SN
T
WN
NN
HH


NNN
SN
N
SN
TT
MM


NNN
SN
N
SN
NN
HH


If Probe is TT HH, NN then MM you can predict that the state after Target was ST
State of branch prediction


There is quite a lot of noise when examining the distribution
Looking at the entry of PHT in detail, it seems that the address is 14 bits apart using the same (lower 14 bits as key?)

Implementation


victim code

int sec_data [] = {1, 0, 1, ...};
void victim () {
  if (sec_data [i])
    asm ("nop; nop");
  i ++;
}


spy code

int probe_array [] = {1, 1};
int main ()
{
  randomize_pht ();
  sleep (); // wait for victim
  spy (probe_arr);
}

void spy (int array [2]) {
  for (int i = 0; i <2; i ++) {
    int a = read_branch_mispred_counter ();
    if (array [i])
      asm ("nop; nop; nop");
    int b = read_branch_mispred_counter ();
    store_branch_mispred_data (b - a);
  }
}


spy can force victim's process to execute a conditional branch instruction only once between context switches

Since the OS can be controlled in the SGX scenario, there is an assumption


Set the virtual address of the conditional branch instruction of victim and the conditional branch instruction of spy to the same

In order to conflict PHT of victim and spy


spy executes randomize_pht () and sleep ()
victim () executes a conditional branch only once based on confidential information (stop)
spy reads conditional branch counter and branches based on array []
Check the condition branch counter again to see if the read value does not change or is incremented by 1

Experimental result


Skylake and Haswell have less than 1% error rate
3 to 5% error rate with Sandy Bridge

Time stamp counter


Privilege elevation is required to access conditional branch counter
substitute with rdtsc (p)

Conditional branch mistake takes extra 20 cycles extra, so it can be determined by that
State after Prime	Prime	State after Target	Target	Probe	Observation
TTT	ST	T	ST	TT	HH
TTT	ST	T	ST	NN	MM
TTT	ST	N	WT	TT	HH
TTT	ST	N	WT	NN	MH
NNN	SN	T	WN	TT	MH
NNN	SN	T	WN	NN	HH
NNN	SN	N	SN	TT	MM
NNN	SN	N	SN	NN	HH