DjVuLibre is an open source library for DjVu, a web-centric format and software platform for distributing documents and images. According to the official site, it is used by many academic, commercial, government, and non-commercial websites around the world.
A vulnerability was found by researcher Hongxu Chen. An out-of-bounds read is possible when parsing a DjVu file, resulting in a denial-of-service condition.
In DjVmDir::decode of file DjVmDir.cpp, we have this block of code:
void
DjVmDir::decode(const GP<ByteStream> &gstr)
{
  // ... code ...
  // Line 292
  GTArray<char> strings;
  char buffer[1024];
  int length;
  while((length=bs_str.read(buffer, 1024)))
  {
    int strings_size=strings.size();
    strings.resize(strings_size+length-1);
    memcpy((char*) strings+strings_size, buffer, length);
  }
  DEBUG_MSG("size of decompressed names block=" << strings.size() << "\n");
  if (strings[strings.size()-1] != 0)
  {
    int strings_size=strings.size();
    strings.resize(strings_size+1);
    strings[strings_size] = 0;
  }
  // Copy names into the files
  const char * ptr=strings;
  for(pos=files_list;pos;++pos)
  {
    GP<File> file=files_list[pos];
    file->id=ptr;
    // ... code ...
  }
We start with a custom GTArray named strings. It stores the user-provided byte stream, which is read in chunks of up to 1024 bytes. For each chunk, the GTArray is resized before the data is copied in:
GTArray<char> strings;
char buffer[1024];
int length;
while((length=bs_str.read(buffer, 1024)))
{
  int strings_size=strings.size();
  strings.resize(strings_size+length-1);
  memcpy((char*) strings+strings_size, buffer, length);
}
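The same accumulate-by-chunks pattern can be sketched with standard containers. This is only an illustration: ChunkReader and read_all are hypothetical names, not part of DjVuLibre. Note one assumed difference: GTArray::resize appears to take the new upper index bound (the ASan trace further down shows it forwarding to GArrayBase::resize(int, int)), which would explain the -1 in the excerpt, whereas std::vector::resize takes the new element count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a byte stream: hands out at most `max` bytes
// per call and returns 0 once the input is exhausted.
struct ChunkReader {
    std::string data;
    std::size_t pos = 0;

    std::size_t read(char *out, std::size_t max) {
        assert(pos <= data.size());
        std::size_t n = data.size() - pos;
        if (n > max) n = max;
        std::memcpy(out, data.data() + pos, n);
        pos += n;
        return n;
    }
};

// Append every chunk to `strings`, growing the vector by exactly the
// number of bytes that were just read.
std::vector<char> read_all(ChunkReader &in) {
    std::vector<char> strings;
    char buffer[1024];
    std::size_t length;
    while ((length = in.read(buffer, sizeof buffer)) != 0) {
        std::size_t old_size = strings.size();
        strings.resize(old_size + length);  // element count, not upper bound
        std::memcpy(strings.data() + old_size, buffer, length);
    }
    return strings;
}
```

With a 2500-byte input, the loop runs three times (1024, 1024, and 452 bytes) and the result holds all 2500 bytes.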
If the char array does not end with a null byte, a null byte is inserted (and size readjusted):
if (strings[strings.size()-1] != 0)
{
  int strings_size=strings.size();
  strings.resize(strings_size+1);
  strings[strings_size] = 0;
}
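As a rough sketch of the same terminator guard on a std::vector (ensure_terminated is a hypothetical helper, not DjVuLibre code), the version below also checks for an empty block first. The excerpt above indexes strings.size()-1 unconditionally, and how GTArray treats that index on an empty array is not shown in the excerpt, so the extra check here is an assumption about what a defensive version might do.

```cpp
#include <cassert>
#include <vector>

// Make sure the block ends with exactly one trailing 0 byte.
// The empty case is tested first so that the last element is never
// inspected on an empty block.
void ensure_terminated(std::vector<char> &strings) {
    if (strings.empty() || strings.back() != 0)
        strings.push_back(0);
    assert(!strings.empty() && strings.back() == 0);
}
```

A block that already ends in 0 is left unchanged; an empty block becomes a single 0 byte.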
Next, a pointer to the GTArray's data is taken, and that pointer is then assigned as the file ID on this line:
file->id=ptr;
The id member is actually a custom GUTF8String. It overloads the = operator, whose implementation is shown here:
// Line 2625 in GString.cpp
GUTF8String& GUTF8String::operator= (const char *str)
{ return init(GStringRep::UTF8::create(str)); }
The implementation for create() can be found here:
// Line 156 in GString.cpp
GP<GStringRep>
GStringRep::UTF8::create(const char *s)
{
  GStringRep::UTF8 dummy;
  return dummy.strdup(s);
}
The strdup function here is not the standard strdup from C/C++; it is a custom one for UTF-8. This is where the problem finally blows up. Although DjVmDir::decode is aware that a null byte is needed at the end of the string, it appends only a single ASCII-style null terminator, and that is not enough for UTF-8. In other words, the null-terminating routine in DjVmDir::decode does not really work. As a result, an off-by-one out-of-bounds read can occur, as shown in the AddressSanitizer bug report by Hongxu Chen:
==14708==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6040000000f1 at pc 0x7fd31456a66e bp 0x7ffc59407e10 sp 0x7ffc594075b8
READ of size 1 at 0x6040000000f1 thread T0
#0 0x7fd31456a66d (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x5166d)
#1 0x7fd3141a5d5b in GStringRep::strdup(char const*) const /home/hongxu/FOT/djvulibre/djvu-djvulibre-git/libdjvu/GString.cpp:1017
#2 0x7fd31419f474 in GStringRep::UTF8::create(char const*) /home/hongxu/FOT/djvulibre/djvu-djvulibre-git/libdjvu/GString.cpp:160
#3 0x7fd3141b64fd in GUTF8String::operator=(char const*) /home/hongxu/FOT/djvulibre/djvu-djvulibre-git/libdjvu/GString.cpp:2626
#4 0x7fd314054dbb in DjVmDir::decode(GP<ByteStream> const&) /home/hongxu/FOT/djvulibre/djvu-djvulibre-git/libdjvu/DjVmDir.cpp:315
#5 0x7fd3140c0b54 in display_djvm_dirm /home/hongxu/FOT/djvulibre/djvu-djvulibre-git/libdjvu/DjVuDumpHelper.cpp:172
#6 0x7fd3140c2a64 in display_chunks /home/hongxu/FOT/djvulibre/djvu-djvulibre-git/libdjvu/DjVuDumpHelper.cpp:335
#7 0x7fd3140c2b1f in display_chunks /home/hongxu/FOT/djvulibre/djvu-djvulibre-git/libdjvu/DjVuDumpHelper.cpp:342
#8 0x7fd3140c31f0 in DjVuDumpHelper::dump(GP<ByteStream>) /home/hongxu/FOT/djvulibre/djvu-djvulibre-git/libdjvu/DjVuDumpHelper.cpp:361
#9 0x562f0317dba7 in display(GURL const&) /home/hongxu/FOT/djvulibre/djvu-djvulibre-git/tools/djvudump.cpp:128
#10 0x562f0317e35d in main /home/hongxu/FOT/djvulibre/djvu-djvulibre-git/tools/djvudump.cpp:178
#11 0x7fd3135fbb96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)
#12 0x562f0317d909 in _start (/home/hongxu/FOT/djvulibre/djvu-djvulibre-git/install/bin/djvudump+0x3909)
0x6040000000f1 is located 0 bytes to the right of 33-byte region [0x6040000000d0,0x6040000000f1)
allocated by thread T0 here:
#0 0x7fd3145f9458 in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xe0458)
#1 0x7fd31415c17c in GArrayBase::resize(int, int) /home/hongxu/FOT/djvulibre/djvu-djvulibre-git/libdjvu/GContainer.cpp:220
#2 0x7fd31405ede4 in GArrayTemplate<char>::resize(int) /home/hongxu/FOT/djvulibre/djvu-djvulibre-git/libdjvu/GContainer.h:496
#3 0x7fd314054aff in DjVmDir::decode(GP<ByteStream> const&) /home/hongxu/FOT/djvulibre/djvu-djvulibre-git/libdjvu/DjVmDir.cpp:298
#4 0x7fd3140c0b54 in display_djvm_dirm /home/hongxu/FOT/djvulibre/djvu-djvulibre-git/libdjvu/DjVuDumpHelper.cpp:172
#5 0x7fd3140c2a64 in display_chunks /home/hongxu/FOT/djvulibre/djvu-djvulibre-git/libdjvu/DjVuDumpHelper.cpp:335
#6 0x7fd3140c2b1f in display_chunks /home/hongxu/FOT/djvulibre/djvu-djvulibre-git/libdjvu/DjVuDumpHelper.cpp:342
#7 0x7fd3140c31f0 in DjVuDumpHelper::dump(GP<ByteStream>) /home/hongxu/FOT/djvulibre/djvu-djvulibre-git/libdjvu/DjVuDumpHelper.cpp:361
#8 0x562f0317dba7 in display(GURL const&) /home/hongxu/FOT/djvulibre/djvu-djvulibre-git/tools/djvudump.cpp:128
#9 0x562f0317e35d in main /home/hongxu/FOT/djvulibre/djvu-djvulibre-git/tools/djvudump.cpp:178
#10 0x7fd3135fbb96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)
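The report shows a one-byte read landing exactly at the end of the 33-byte region allocated by resize. One way to picture a parser that cannot reach that address is a bounded walk over the names block, sketched below. This is a hypothetical illustration only: split_names is not a DjVuLibre function, and this is not presented as the project's actual fix. It assumes the block is a packed sequence of NUL-terminated names and that the caller knows how many names it expects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: pull `count` NUL-terminated names out of `block`,
// refusing to read at or past the end of the block.
std::vector<std::string> split_names(const std::vector<char> &block,
                                     std::size_t count) {
    std::vector<std::string> names;
    const char *ptr = block.data();
    const char *end = ptr + block.size();
    for (std::size_t i = 0; i < count; ++i) {
        if (ptr >= end)
            throw std::runtime_error("names block is shorter than the file list");
        // Search for the terminator only inside the remaining bytes.
        const void *nul = std::memchr(ptr, 0, static_cast<std::size_t>(end - ptr));
        if (nul == nullptr)
            throw std::runtime_error("name is not NUL-terminated inside the block");
        const char *stop = static_cast<const char *>(nul);
        names.emplace_back(ptr, stop);
        ptr = stop + 1;  // step over the terminator to the next name
        assert(ptr <= end);
    }
    return names;
}
```

Asking for more names than the block contains raises an exception instead of handing a pointer at the end of the buffer to a string-copy routine.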
The vulnerability appears to fall under the local attack category, so an out-of-bounds read of this kind would not directly threaten the system. In our case specifically, it looks like the extra read would simply cause a crash somewhere in the decode() function.