@kellabyte
Last active December 14, 2015 20:49
// Headers and using-declarations this snippet relies on
// (Boost.Iostreams for the mapped file, Boost.Timer for auto_cpu_timer).
#include <iostream>
#include <boost/iostreams/device/mapped_file.hpp>
#include <boost/iostreams/stream.hpp>
#include <boost/timer/timer.hpp>

using namespace std;
using namespace boost::iostreams;
using boost::timer::auto_cpu_timer;

void TestReadingMMap()
{
    double bytesRead = 0;
    double elapsed = 0;
    mapped_file_source file;
    __int64 checksum = 0;
    {
        auto_cpu_timer timer;
        // Backslashes in the path must be escaped inside a string literal.
        file.open("C:\\SomeBigFile.dat", 2147483647);
        stream<mapped_file_source> input(file);
        if(file.is_open())
        {
            const int segmentSize = 4096;
            const int size = file.size();
            const int remainder = file.size() % segmentSize;
            const int segmentLoops = file.size() / segmentSize;
            char bytes[segmentSize];
            for (int x=0; x<segmentLoops; x++)
            {
                input.read(bytes, segmentSize);
                #pragma unroll 4096
                for (int i=0; i<segmentSize; i++)
                {
                    // This block reduces the IO rate from 1.6GB/s
                    // to 1GB/s, losing 600MB/s.
                    checksum += bytes[i];
                }
                bytesRead += segmentSize;
            }
            // Moving this out here rather than a condition in the loop above
            // reduced branch mispredictions.
            if (remainder > 0)
            {
                input.read(bytes, remainder);
                for (int i=0; i<remainder; i++)
                {
                    checksum += bytes[i];
                }
                bytesRead += remainder;
            }
            input.close();
            file.close(); // Unmap the file.
            if (checksum == 43089565243)
            {
                cout << "Checksum passed" << endl;
            }
        }
        else
        {
            cout << "could not map the file" << endl;
        }
        // elapsed().wall is reported in nanoseconds.
        elapsed = timer.elapsed().wall;
    }
    cout << bytesRead / 1048576 << "MB at " << bytesRead / 1048576 / (elapsed/ 1000000000) << " MB/s" << endl;
    cout << "Checksum: " << checksum << endl;
}
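The segment-plus-remainder loop above can also be sketched against any `std::istream`, which makes it easy to try without a large file or Boost. This is a minimal sketch, not the gist's method: `checksumStream` is a hypothetical helper, it relies on `gcount()` to handle the short final read instead of computing the remainder up front, and it sums bytes as `unsigned char`. Whether plain `char` is signed is implementation-defined, so for bytes of 0x80 and above this sketch can produce a different total than the gist's `char` sum.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of the gist): sums every byte of `input`
// as an unsigned value, reading the stream in fixed-size segments.
std::int64_t checksumStream(std::istream& input, std::size_t segmentSize = 4096)
{
    if (segmentSize == 0)
    {
        segmentSize = 1; // A zero-sized read would never reach end of stream.
    }

    std::string buffer(segmentSize, '\0');
    std::int64_t checksum = 0;

    while (input)
    {
        input.read(&buffer[0], static_cast<std::streamsize>(buffer.size()));

        // gcount() is the number of bytes actually read, so the short
        // final segment needs no separate remainder branch.
        const std::size_t got = static_cast<std::size_t>(input.gcount());
        for (std::size_t i = 0; i < got; ++i)
        {
            checksum += static_cast<unsigned char>(buffer[i]);
        }
    }
    return checksum;
}
```

Calling it on a `std::istringstream` holding 5000 bytes exercises one full 4096-byte segment plus a 904-byte tail.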
@talisein

@MorganPersson Presumably that's what the #pragma unroll does.
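For reference, `#pragma unroll` is a compiler-specific hint, and a compiler that does not recognize a pragma will generally ignore it. The gist does not say whether the hint was honored in this build. A hand-unrolled version of the inner loop might look like the sketch below. `sumUnrolled4` is a hypothetical name, and the unroll factor of 4 is an arbitrary choice for illustration. An optimizing compiler may already generate something similar, so any speedup would need to be measured.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: sums `count` chars with the loop unrolled by hand.
// Four independent accumulators avoid one long chain of dependent additions.
std::int64_t sumUnrolled4(const char* bytes, std::size_t count)
{
    std::int64_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    std::size_t i = 0;

    // Main body: four elements per iteration.
    for (; i + 4 <= count; i += 4)
    {
        s0 += bytes[i];
        s1 += bytes[i + 1];
        s2 += bytes[i + 2];
        s3 += bytes[i + 3];
    }

    std::int64_t sum = s0 + s1 + s2 + s3;

    // Tail: the zero to three elements left over.
    for (; i < count; ++i)
    {
        sum += bytes[i];
    }
    return sum;
}
```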

@jrwren

jrwren commented Mar 13, 2013

I achieved a great speedup by using an array of 64-bit ints and doing 64-bit additions instead of eight times as many 8-bit additions.

    if(file.is_open()) 
    {
        const int segmentSize = 4096;
        const int size = file.size();
        const int remainder = file.size() % segmentSize;
        const int segmentLoops = file.size() / segmentSize;
        __int64 value = 0;

        __int64 bytes[segmentSize/sizeof(__int64)];
        for (int x=0; x<segmentLoops; x++)
        {
            input.read((char*)bytes, segmentSize);
            #pragma unroll 4096
            for (int i=0; i<segmentSize/sizeof(__int64); i++)
            {
                // This block reduces the IO rate from 1.6GB/s
                // to 1GB/s, losing 600MB/s.
                // JRW: the original went from 220MB/s to 80MB/s on my old system.

                checksum += bytes[i];
                // Instead of adding lots of 8-bit numbers we add 64-bit numbers, and I keep my 220MB/s.
            }
            bytesRead += segmentSize;
        }

        // Moving this out here rather than a condition in the loop above 
        // reduced branch mispredictions.
        if (remainder > 0)
        {
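            // Zero the word that will hold the trailing bytes so that a
            // partial final word is included in the sum instead of dropped.
            bytes[remainder/sizeof(__int64)] = 0;
            input.read((char*)bytes, remainder);
            for (int i=0; i<(remainder+sizeof(__int64)-1)/sizeof(__int64); i++)
            {
                checksum += bytes[i];
            }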
            bytesRead += remainder;
        }

        input.close();
        file.close(); // Unmap the file.

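        // Note: 43089565243 is the byte-wise sum from the original gist;
        // summing 64-bit words generally produces a different value.
        if (checksum == 43089565243)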
        {
            cout << "Checksum passed" << endl;
        }
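A portable way to try the word-wise idea outside the gist is sketched below. It is a sketch under stated assumptions, not jrwren's exact code: `sumWords` and `sumBytes` are hypothetical helpers, `std::uint64_t` stands in for `__int64` so that overflow wraps in a well-defined way, and `std::memcpy` loads each word so the buffer need not be aligned. A trailing partial word is zero-padded rather than skipped. The two functions compute different checksums, so a constant recorded with one cannot be checked against the other.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: sums the buffer as 64-bit words. A trailing partial
// word is zero-padded so that no bytes are left out of the sum.
std::uint64_t sumWords(const char* data, std::size_t size)
{
    const std::size_t wordSize = sizeof(std::uint64_t);
    std::uint64_t sum = 0;
    std::size_t offset = 0;

    for (; offset + wordSize <= size; offset += wordSize)
    {
        std::uint64_t word;
        std::memcpy(&word, data + offset, wordSize); // Safe for unaligned data.
        sum += word;
    }

    if (offset < size)
    {
        std::uint64_t word = 0; // Zero padding for the missing bytes.
        std::memcpy(&word, data + offset, size - offset);
        sum += word;
    }
    return sum;
}

// Hypothetical helper: the byte-wise sum, for comparison.
std::uint64_t sumBytes(const char* data, std::size_t size)
{
    std::uint64_t sum = 0;
    for (std::size_t i = 0; i < size; ++i)
    {
        sum += static_cast<unsigned char>(data[i]);
    }
    return sum;
}
```

The word-wise result also depends on the byte order of the machine whenever the bytes within a word differ, which is another reason to treat it as a separate checksum from the byte-wise one.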
