I've been running "educational" live-streams on security research,
demonstrating the capabilities of one of my tools. As part of this I
ended up looking at ctags because it is a lightweight parser that is
easy to cross-compile without dependencies. We ended up compiling ctags
for 6502 because that's what I happened to have around, but I have
verified that all of these issues also show up on x86_64.

Largely this was to show the process that goes into fuzzing,
harnessing, triaging, minimizing, and reporting bugs. I recognize that
ctags is not a very critical attack surface, so I found it to be a
reasonable one to teach with.

Ultimately we put about 5 billion fuzz cases into ctags over the course
of a few streams. Hopefully this irons out the rest of the kinks,
though it's always possible that more bugs show up as fixes make new
code paths accessible.
When a single-quoted string ('') appears in a typedef, the sp local
gets reset to equal tok. This causes the ';' handling that finalizes a
typedef to skip NUL-termination of sp. tok is then passed to pfnote()
potentially without NUL-termination. This happens if the string
following typedef is >= 8 bytes long, overwriting "typedef" and its
NUL terminator in tok.

Since this bug is relatively hard to follow, I instrumented the state
machine of c_entries() to help show what is going on as the file is
parsed.
char t sp 0xb390 tok 0xb390 token 0 t_def 0 t_level -1 level 0 lineno 1
char y sp 0xb391 tok 0xb390 token 1 t_def 0 t_level -1 level 0 lineno 1
char p sp 0xb392 tok 0xb390 token 1 t_def 0 t_level -1 level 0 lineno 1
char e sp 0xb393 tok 0xb390 token 1 t_def 0 t_level -1 level 0 lineno 1
char d sp 0xb394 tok 0xb390 token 1 t_def 0 t_level -1 level 0 lineno 1
char e sp 0xb395 tok 0xb390 token 1 t_def 0 t_level -1 level 0 lineno 1
char f sp 0xb396 tok 0xb390 token 1 t_def 0 t_level -1 level 0 lineno 1
char sp 0xb397 tok 0xb390 token 1 t_def 0 t_level -1 level 0 lineno 1
char A sp 0xb390 tok 0xb390 token 0 t_def 1 t_level 0 level 0 lineno 1
char A sp 0xb391 tok 0xb390 token 1 t_def 1 t_level 0 level 0 lineno 1
char A sp 0xb392 tok 0xb390 token 1 t_def 1 t_level 0 level 0 lineno 1
char A sp 0xb393 tok 0xb390 token 1 t_def 1 t_level 0 level 0 lineno 1
char A sp 0xb394 tok 0xb390 token 1 t_def 1 t_level 0 level 0 lineno 1
char A sp 0xb395 tok 0xb390 token 1 t_def 1 t_level 0 level 0 lineno 1
char A sp 0xb396 tok 0xb390 token 1 t_def 1 t_level 0 level 0 lineno 1
char A sp 0xb397 tok 0xb390 token 1 t_def 1 t_level 0 level 0 lineno 1
char ' sp 0xb398 tok 0xb390 token 1 t_def 1 t_level 0 level 0 lineno 1
char ; sp 0xb390 tok 0xb390 token 0 t_def 1 t_level 0 level 0 lineno 1
Create a C file with the contents typedef AAAAAAAA''; and run
ctags <filename>. This may not actually cause a crash, but it will
cause a note to be created without a NUL terminator. Depending on the
stack state, this can result in uninitialized stack data being written
to the tags file.
The following lines in get_line() read into a fixed-size global buffer
(if the "-x" command line option is specified) until the file ends or a
newline is encountered. There is no bound on the number of bytes
copied, so this results in a controlled global buffer overflow.
for (cp = lbuf; GETC(!=, EOF) && c != '\n'; *cp++ = c)
continue;
(gdb) bt
#0 get_line () at print.c:60
#1 0x000055555555798b in c_entries () at C.c:159
#2 0x000055555555a3f0 in find_entries (
file=0x7fffffffe51b "0x1426_0xc040_1_WRITE_Access") at ctags.c:240
#3 0x0000555555559bee in main (argc=1, argv=0x7fffffffe258)
at ctags.c:139
(gdb) print cp == (lbuf + sizeof(lbuf))
$7 = 1
Create a file containing A(<insert >2048 A's here> and invoke
ctags -x <filename>.
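One way to avoid this class of overflow is to bound the copy by the
size of the destination buffer. The following is a minimal sketch, not
the actual ctags fix: read_line_bounded() and its parameters are
hypothetical names, and it uses plain getc() on a FILE pointer rather
than the GETC macro from ctags.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from ctags): copy one line from `fp` into
 * `buf`, storing at most `size - 1` characters plus a NUL terminator.
 * Characters that do not fit are read and discarded, so the next call
 * resumes at the start of the following line.
 * Returns the number of characters stored.
 */
size_t
read_line_bounded(FILE *fp, char *buf, size_t size)
{
	size_t n = 0;
	int c;

	if (size == 0)
		return 0;
	while ((c = getc(fp)) != EOF && c != '\n') {
		if (n + 1 < size)
			buf[n++] = (char)c;
	}
	buf[n] = '\0';
	return n;
}
```

Truncating silently is a design choice made here for brevity. A real
fix might instead grow the buffer or report the over-long line.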
In pfnote() it is possible that curfile is set from an input tags file
which was loaded in preload_entries(). In this case the filename can be
controlled by an input file. If the entry matches the main tag and the
filename is 254 bytes followed by a ".", then the access after the
strrchr() will go out of bounds. This is due to the snprintf() writing
to a (1+255+1) [257] byte buffer the leading "M", the 255 bytes from
the filename (including the trailing "."), and the NUL terminator. At
this point the buffer is completely filled. The strrchr() will find the
'.' in the last character of the buffer, just before the '\0', but the
code then accesses byte fp[2], which is one byte past the NUL
terminator. The result is an out-of-bounds read of one byte past the
nbuf buffer.

This bug can also be hit by parsing a source file with the same kind of
filename: 254 non-'.' characters followed by a '.' (empty extension).
(void)snprintf(nbuf, sizeof nbuf, "M%s", fp);
fp = strrchr(nbuf, '.');
if (fp && !fp[2])
*fp = EOS;
pleb@gamey:~/openbsd_src/usr.bin/ctags$ ./a.out -u asdf
=================================================================
==25068==ERROR: AddressSanitizer: stack-buffer-overflow on address
0x7ffe92b03d61 at pc 0x5606a636e0c0
bp 0x7ffe92b03c10 sp 0x7ffe92b03c08
READ of size 1 at 0x7ffe92b03d61 thread T0
#0 0x5606a636e0bf in pfnote
/home/pleb/openbsd_src/usr.bin/ctags/tree.c:72
#1 0x5606a636b9b9 in preload_entries
/home/pleb/openbsd_src/usr.bin/ctags/ctags.c:309
#2 0x5606a636aaae in main
/home/pleb/openbsd_src/usr.bin/ctags/ctags.c:129
#3 0x7f0b078ac09a in __libc_start_main ../csu/libc-start.c:308
#4 0x5606a6368329 in _start
(/home/pleb/openbsd_src/usr.bin/ctags/a.out+0x3329)
Create a tags file with the contents
"main\t<254 non-'.' characters>.\t/^/" and save it to "tags". Then
invoke ctags in -u mode: ctags -u tags

Alternatively, create a file with the contents "main(){" and save it
to a file named "<254 'A' characters>.", and invoke ctags <filename>
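The off-by-one can be avoided by testing the bytes after the '.' in
order, so that fp[2] is only read when fp[1] is not the terminator.
This is a minimal sketch under the assumption that the intent of the
original check is to strip a one-character extension such as ".c".
strip_short_ext() is a hypothetical name, not a function in ctags.

```c
#include <string.h>

/*
 * Hypothetical helper (not from ctags): strip a one-character
 * extension such as ".c" from `name` in place. fp[1] is tested before
 * fp[2], so fp[2] is never read when fp[1] is the NUL terminator.
 * Returns 1 if an extension was stripped, 0 otherwise.
 */
int
strip_short_ext(char *name)
{
	char *fp = strrchr(name, '.');

	if (fp != NULL && fp[1] != '\0' && fp[2] == '\0') {
		*fp = '\0';
		return 1;
	}
	return 0;
}
```

With this ordering, a name ending in a bare '.' is left untouched
instead of triggering a read past the end of the buffer.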
The put_entries() function does not check its node argument for NULL.
When malloc() fails the first time pfnote() (tree.c) is called, the
error path calls put_entries(head). At this point head is still NULL
(its initial state), and thus a NULL dereference occurs.
Relevant code in tree.c:
void
pfnote(char *name, int ln)
{
NODE *np;
char *fp;
char nbuf[1+MAXNAMLEN+1];
if (!(np = malloc(sizeof(NODE)))) {
warnx("too many entries to sort");
put_entries(head); <--- NULL deref here
Relevant code in print.c:
void
put_entries(NODE *node)
{
if (node->left) <--- crash here on dereference
put_entries(node->left);
Stack trace:
(gdb) bt
#0 0xdffb208f0bd in put_entries (node=0x0)
at /usr/src/usr.bin/ctags/print.c:95
#1 0xdffb208f275 in pfnote (name=0xd0c60 "main", ln=3)
at /usr/src/usr.bin/ctags/tree.c:60
#2 0xdffb208c663 in c_entries () at /usr/src/usr.bin/ctags/C.c:163
#3 0xdffb208d951 in main (argc=1, argv=0xd0d80)
at /usr/src/usr.bin/ctags/ctags.c:139
If an allocation failure occurs in pfnote(), put_entries() can get
invoked, causing a write to outf. Since outf has not been opened yet,
an fprintf() may end up writing to outf while it is NULL. Depending on
the implementation of fprintf(), this could cause a NULL dereference.
if (!(np = malloc(sizeof(NODE)))) {
warnx("too many entries to sort");
put_entries(head); <--- Allocation failure leading to early put_entries
free_tree(head);
if (!(head = np = malloc(sizeof(NODE))))
err(1, NULL);
}
void
put_entries(NODE *node)
{
if (node->left)
put_entries(node->left);
if (vflag)
printf("%s %s %d\n",
node->entry, node->file, (node->lno + 63) / 64);
else if (xflag)
printf("%-16s %4d %-16s %s\n",
node->entry, node->lno, node->file, node->pat);
else
fprintf(outf, "%s\t%s\t%c^%s%c\n", <--- Write to outf prior to outf being opened
node->entry, node->file, searchar, node->pat,
searchar);
if (node->right)
put_entries(node->right);
}
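The following loop in l_entries() (lisp.c) consumes non-space
characters without checking for the end of the string. Since
isspace('\0') is false, it will run past the NUL terminator written by
fgets().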
for (; !isspace((unsigned char)*lbp); ++lbp)
continue;
Program received signal SIGSEGV, Segmentation fault.
0x000055555555beba in l_entries () at lisp.c:70
70 for (; !isspace((unsigned char)*lbp); ++lbp)
(gdb) bt
#0 0x000055555555beba in l_entries () at lisp.c:70
#1 0x000055555555a1fb in find_entries (file=0x7fffffffe53b "test.l")
at ctags.c:210
#2 0x0000555555559bee in main (argc=1, argv=0x7fffffffe280)
at ctags.c:139
(gdb) print lbp
$1 = 0x555555564000
<error: Cannot access memory at address 0x555555564000>
Create a file with a name ending in ".l" so that it is treated as a
LISP file, put the contents (def<more than 2048 'A's here> in it, and
run ctags on it. This will end up reading out of bounds of the lbuf
global buffer.
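The scan can be kept inside the string by testing for the terminator
explicitly alongside isspace(). This is a minimal sketch rather than
the actual ctags fix, and skip_nonspace() is a hypothetical name.

```c
#include <ctype.h>

/*
 * Hypothetical helper (not from ctags): advance past non-whitespace
 * characters, stopping at the NUL terminator as well. isspace('\0')
 * is false, so the terminator has to be tested explicitly.
 */
const char *
skip_nonspace(const char *p)
{
	while (*p != '\0' && !isspace((unsigned char)*p))
		++p;
	return p;
}
```

A caller can then tell the two stop conditions apart by checking
whether the returned pointer refers to '\0' or to a whitespace
character.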