@Bit00009
Last active October 31, 2022 06:48
LZMA2 In-Memory Compression/Decompression
#include <Windows.h>
#include <cstdio>   // printf
#include <iostream>
#include <fstream>
#include <iterator> // std::istreambuf_iterator
#include <string>
#include <vector>
#include "swapi.h" // Stopwatch
#include <fast-lzma2.h>
using namespace std;
using namespace timer;
const int LZMA2_COMPRESSION_LEVEL = 9;
#define FL2_CLEVEL_DEFAULT 6
#define FL2_MAX_CLEVEL 9
// Variadic macro that also works when only a format string is passed
#define print(...) do { printf(__VA_ARGS__); printf("\n"); } while (0)
void DumpBufferToDisk(const char* bufferName, void* bufferPtr, size_t bufferSize)
{
    // Write the raw buffer to "<bufferName>-dumped.bin"
    ofstream dumper;
    string bufferFileName(bufferName);
    bufferFileName += "-dumped.bin";
    dumper.open(bufferFileName.c_str(), ios::binary | ios::out);
    dumper.write((char*)bufferPtr, bufferSize);
    dumper.close();
}
Stopwatch sw_simple;
Stopwatch sw_advanced;
// simple
int simple_mt()
{
    print("Initializing...");
    // App Code: read the whole input file into memory
    std::ifstream file_reader("DemoFile.bin", std::ios::binary);
    std::vector<unsigned char> file_buffer(std::istreambuf_iterator<char>(file_reader), {});

    // Compress
    std::vector<unsigned char> compressed;
    // FL2_compressBound() is the worst-case output size, so incompressible input still fits
    compressed.resize(FL2_compressBound(file_buffer.size()));
    print("Compressing...");
    sw_simple.Start();
    size_t compressed_size =
        FL2_compressMt(compressed.data(), compressed.size(), file_buffer.data(), file_buffer.size(), LZMA2_COMPRESSION_LEVEL, 8);
    if (FL2_isError(compressed_size))
    {
        print("Simple :: Compression failed: %s", FL2_getErrorName(compressed_size));
        return 1;
    }
    print("Compressed");
    print("Simple :: Compressed at %f ms with no error.", sw_simple.ElapsedMilliseconds());
    DumpBufferToDisk("DemoFile-Simple.lzma2", compressed.data(), compressed_size);

    // Decompress
    std::vector<unsigned char> decompressed;
    // The original size is known in this demo, so use it instead of guessing from the compressed size
    decompressed.resize(file_buffer.size());
    print("Decompressing...");
    // Pass the number of compressed bytes actually produced, not the capacity of the buffer
    size_t decompressed_size =
        FL2_decompressMt(decompressed.data(), decompressed.size(), compressed.data(), compressed_size, 8);
    if (FL2_isError(decompressed_size))
    {
        print("Simple :: Decompression failed: %s", FL2_getErrorName(decompressed_size));
        return 1;
    }
    print("Decompressed");
    print("Simple :: Decompressed at %f ms with no error.", sw_simple.ElapsedMilliseconds());
    DumpBufferToDisk("DemoFile-Simple.decompressed", decompressed.data(), decompressed_size);

    print("Clean up...");
    std::vector<unsigned char>().swap(compressed);
    std::vector<unsigned char>().swap(decompressed);
    std::vector<unsigned char>().swap(file_buffer);
    print("Simple :: Executed in %f ms with no error.", sw_simple.ElapsedMilliseconds());
    return 0;
}
// advanced
int advanced_mt()
{
    print("Initializing...");
    // App Code: read the whole input file into memory
    std::ifstream file_reader("DemoFile.bin", std::ios::binary);
    std::vector<unsigned char> file_buffer(std::istreambuf_iterator<char>(file_reader), {});

    // Create Compressor Engine
    FL2_CCtx* cctx = FL2_createCCtxMt(8);
    if (cctx == NULL)
    {
        print("Advanced :: Could not create the compression context.");
        return 1;
    }
    FL2_CCtx_setParameter(cctx, FL2_p_compressionLevel, FL2_MAX_CLEVEL); // Maximum is 10
    FL2_CCtx_setParameter(cctx, FL2_p_highCompression, 9);
    FL2_CCtx_setParameter(cctx, FL2_p_dictionarySize, FL2_DICTSIZE_MAX);

    // Compress
    std::vector<unsigned char> compressed;
    // FL2_compressBound() is the worst-case output size, so incompressible input still fits
    compressed.resize(FL2_compressBound(file_buffer.size()));
    print("Compressing...");
    sw_advanced.Start();
    size_t compressed_size =
        FL2_compressCCtx(cctx, compressed.data(), compressed.size(), file_buffer.data(), file_buffer.size(), LZMA2_COMPRESSION_LEVEL);
    if (FL2_isError(compressed_size))
    {
        print("Advanced :: Compression failed: %s", FL2_getErrorName(compressed_size));
        FL2_freeCCtx(cctx);
        return 1;
    }
    print("Compressed");
    print("Advanced :: Compressed at %f ms with no error.", sw_advanced.ElapsedMilliseconds());
    DumpBufferToDisk("DemoFile-Advanced.lzma2", compressed.data(), compressed_size);

    // Decompress
    std::vector<unsigned char> decompressed;
    // The original size is known in this demo, so use it instead of guessing from the compressed size
    decompressed.resize(file_buffer.size());
    print("Decompressing...");
    // Pass the number of compressed bytes actually produced, not the capacity of the buffer
    size_t decompressed_size =
        FL2_decompressMt(decompressed.data(), decompressed.size(), compressed.data(), compressed_size, 8);
    if (FL2_isError(decompressed_size))
    {
        print("Advanced :: Decompression failed: %s", FL2_getErrorName(decompressed_size));
        FL2_freeCCtx(cctx);
        return 1;
    }
    print("Decompressed");
    print("Advanced :: Decompressed at %f ms with no error.", sw_advanced.ElapsedMilliseconds());
    DumpBufferToDisk("DemoFile-Advanced.decompressed", decompressed.data(), decompressed_size);

    print("Clean up...");
    FL2_freeCCtx(cctx);
    std::vector<unsigned char>().swap(compressed);
    std::vector<unsigned char>().swap(decompressed);
    std::vector<unsigned char>().swap(file_buffer);
    print("Advanced :: Executed in %f ms with no error.", sw_advanced.ElapsedMilliseconds());
    return 0;
}
// main
int main()
{
    simple_mt();
    advanced_mt();
    getchar(); // keep the console window open
}
@jordanvrtanoski

So, are you executing the code multiple times, or is the measurement from only one execution?
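To make the difference between a single run and repeated runs concrete, here is a minimal sketch of a timing helper built on `std::chrono`. The names `TimingSummary` and `time_repeated` are made up for illustration and are not part of the gist or of fast-lzma2; the callable passed in stands in for whichever compress or decompress call is being measured.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Summary of repeated wall-clock timings, in milliseconds.
struct TimingSummary {
    double min_ms;
    double median_ms;
    double max_ms;
};

// Runs `work` exactly `runs` times and reports min / median / max.
// `work` is any callable taking no arguments (a hypothetical stand-in
// for a compression or decompression call).
template <typename Work>
TimingSummary time_repeated(Work&& work, std::size_t runs) {
    assert(runs > 0 && "need at least one run");
    std::vector<double> samples;
    samples.reserve(runs);
    for (std::size_t i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    std::sort(samples.begin(), samples.end());
    // For an even number of runs this picks the upper of the two middle
    // samples, which is close enough for a quick comparison.
    return {samples.front(), samples[samples.size() / 2], samples.back()};
}
```

Calling it as `time_repeated([&] { /* compress here */ }, 5)` and comparing the medians of the two variants is less sensitive to cold caches and thread start-up cost than a single measurement.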

@jordanvrtanoski

This is the implementation of FL2_compressMt():

FL2LIB_API size_t FL2LIB_CALL FL2_compressMt(void* dst, size_t dstCapacity,
    const void* src, size_t srcSize,
    int compressionLevel,
    unsigned nbThreads)
{
    FL2_CCtx* const cctx = FL2_createCCtxMt(nbThreads);
    if (cctx == NULL)
        return FL2_ERROR(memory_allocation);

    size_t const cSize = FL2_compressCCtx(cctx, dst, dstCapacity, src, srcSize, compressionLevel);

    FL2_freeCCtx(cctx);

    return cSize;
}

As you can see, nothing extra is going on. It simply:

  1. Creates the context
  2. Compresses
  3. Releases the context
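When these three steps are written by hand, the release step is easy to miss on an early return. Below is a minimal sketch of tying the release to scope with `std::unique_ptr` and a custom deleter. To keep it compilable without the library, `DemoCtx`, `demo_create`, and `demo_free` are hypothetical stand-ins; in real code they would correspond to `FL2_CCtx`, `FL2_createCCtxMt`, and `FL2_freeCCtx`.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins so the sketch builds without fast-lzma2.
struct DemoCtx { unsigned threads; };
static int g_live_contexts = 0;  // contexts created but not yet released

DemoCtx* demo_create(unsigned threads) {
    ++g_live_contexts;
    return new DemoCtx{threads};
}

void demo_free(DemoCtx* ctx) {
    --g_live_contexts;
    delete ctx;
}

// Deleter that forwards to the release function.
struct CtxDeleter {
    void operator()(DemoCtx* ctx) const { demo_free(ctx); }
};
using CtxPtr = std::unique_ptr<DemoCtx, CtxDeleter>;

// Create, use, release. The release runs automatically when `ctx`
// goes out of scope, including on the early-return path.
bool compress_once(unsigned threads, bool fail_early) {
    CtxPtr ctx(demo_create(threads));    // 1. create context
    if (!ctx) return false;
    assert(ctx->threads == threads);
    if (fail_early) return false;        // context is still released
    // 2. compress: real code would pass ctx.get() to the compress call
    return true;
}                                        // 3. release context via CtxDeleter
```

The same pattern would let an error branch return immediately without a matching manual free call.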

@jordanvrtanoski

OK, after some experimenting: the impact comes from FL2_CCtx_setParameter(cctx, FL2_p_highCompression, 9);. Remove that line (line #85) and the advanced version is even faster. I have tested on Linux on ARM, but I assume it will be the same on Windows on x86.

@Bit00009
Author

Bit00009 commented Mar 9, 2021

OK, after some experimenting: the impact comes from FL2_CCtx_setParameter(cctx, FL2_p_highCompression, 9);. Remove that line (line #85) and the advanced version is even faster. I have tested on Linux on ARM, but I assume it will be the same on Windows on x86.

Oh, they became the same for me on Windows. So how do I set the dictionary size to 1024, or can I even set it to higher values? I don't mind the speed; I want the highest ratio. 7-Zip's LZMA2 slow mode still gets the file down to 1.95 MB.

@jordanvrtanoski

If you are not chasing speed, set the dictionary to FL2_DICTSIZE_MIN, try playing with FL2_p_resetInterval, set FL2_p_searchDepth to its maximum, and set FL2_p_strategy to 2.

@jordanvrtanoski

It's more a matter of trial and error, since the output size also depends on the nature of the input. The more random the input, the larger the compressed file; the more uniform the input, the smaller the compressed file.
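One rough way to see how "random" an input is before compressing it is to compute the Shannon entropy of its byte histogram. The sketch below is only an illustration; `byte_entropy` is a made-up helper, not something from fast-lzma2. It is a zero-order estimate that ignores repeated sequences, so an LZMA2 encoder can do considerably better than the number suggests on repetitive data.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Shannon entropy of the byte histogram, in bits per byte.
// 0.0 means every byte is identical; 8.0 means all 256 byte values
// occur equally often.
double byte_entropy(const std::vector<unsigned char>& data) {
    if (data.empty()) return 0.0;
    std::array<std::size_t, 256> counts{};
    for (unsigned char b : data) ++counts[b];

    const double n = static_cast<double>(data.size());
    double entropy = 0.0;
    for (std::size_t c : counts) {
        if (c == 0) continue;  // unused byte values contribute nothing
        const double p = static_cast<double>(c) / n;
        entropy -= p * std::log2(p);
    }
    assert(entropy <= 8.0 + 1e-9);  // cannot exceed 8 bits per byte
    return entropy;
}
```

Running it on `file_buffer` would give a quick hint: values close to 8 suggest the data is already compressed or encrypted, and no parameter tuning is likely to shrink it much.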
