@vendruscolo
Created October 28, 2012 15:13
100MB read test

Using this file http://code.google.com/p/jquery-speedtest/downloads/detail?name=100MB.txt

This is one line full of 0s, so there are 104857600 0s.

wget http://jquery-speedtest.googlecode.com/files/100MB.txt
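In case the download above is not reachable, a file with the same shape can be generated locally. The following is a minimal sketch in plain C, not part of the original test: the function name `write_filled_file` is hypothetical, and it assumes the file has no trailing newline (inferred from the count of 104857600 matching 100 * 1024 * 1024 bytes exactly).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write `total` bytes of the character `fill` to `path`, with no trailing
   newline. Returns 0 on success, -1 on any I/O error.
   (Hypothetical helper, not part of the original gist.) */
static int write_filled_file(const char *path, char fill, size_t total)
{
    assert(path != NULL);

    FILE *out = fopen(path, "wb");
    if (out == NULL) {
        return -1;
    }

    /* Reuse one pre-filled 64 KB block instead of writing byte by byte. */
    char block[65536];
    memset(block, fill, sizeof block);

    size_t remaining = total;
    while (remaining > 0) {
        size_t chunk = remaining < sizeof block ? remaining : sizeof block;
        if (fwrite(block, 1, chunk, out) != chunk) {
            fclose(out);
            return -1;
        }
        remaining -= chunk;
    }

    /* fclose flushes buffered data, so its result matters too. */
    return fclose(out) == 0 ? 0 : -1;
}
```

Calling `write_filled_file("100MB.txt", '0', (size_t)100 * 1024 * 1024)` would produce 104857600 zeros on a single line.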

main.m reads the whole file at once. Doing so, it occupies more than 200MB of memory (100MB for the NSString, another 100MB for the char array returned by UTF8String), as you can see here http://d.pr/i/jopC

main_memory_mapped.m memory-maps the file and reads it incrementally, 128 KB at a time; it's therefore far less memory-hungry, as you can see here http://d.pr/i/YYzi

//
//  main.m
//  MBtest
//
//  Created by Alessandro Vendruscolo on 28/10/12.
//  Copyright (c) 2012 Alessandro Vendruscolo. All rights reserved.
//

#import <Foundation/Foundation.h>

int main(int argc, const char * argv[])
{
    @autoreleasepool {
        // a string representing the file to read
        NSString *filePath = @"/Users/MJ/Desktop/100MB.txt";
        // an error object
        NSError *error = nil;
        // our file
        NSString *file = [NSString stringWithContentsOfFile:filePath encoding:NSUTF8StringEncoding error:&error];
        // check the return value: the error object is only meaningful on failure
        if (!file) {
            NSLog(@"WTF!?: %@", [error userInfo]);
            return 1;
        }
        // the char we have to find
        const char charToFind = '0';
        // keep track of how many chars we found
        NSUInteger charsFound = 0;
        // get a C array of chars (the UTF-8 bytes of the string)
        const char *characters = [file UTF8String];
        // cache the length in bytes (-length counts UTF-16 units, which only matches for ASCII)
        NSUInteger count = [file lengthOfBytesUsingEncoding:NSUTF8StringEncoding];
        // how many of them are == to the char we're looking for?
        for (NSUInteger i = 0; i < count; ++i) {
            if (characters[i] == charToFind) {
                ++charsFound;
            }
        }
        // DONE
        NSLog(@"Found %lu characters", (unsigned long)charsFound);
    }
    return 0;
}
//
//  main_memory_mapped.m
//  MBtest
//
//  Created by Alessandro Vendruscolo on 28/10/12.
//  Copyright (c) 2012 Alessandro Vendruscolo. All rights reserved.
//

#import <Foundation/Foundation.h>

int main(int argc, const char * argv[])
{
    @autoreleasepool {
        // a string representing the file to read
        NSString *filePath = @"/Users/MJ/Desktop/100MB.txt";
        // an error object
        NSError *error = nil;
        // our file
        NSData *data = [NSData dataWithContentsOfFile:filePath
                                              options:NSDataReadingMappedAlways
                                                error:&error];
        // check the return value: the error object is only meaningful on failure
        if (!data) {
            NSLog(@"WTF!?: %@", [error userInfo]);
            return 1;
        }
        // the char we have to find
        const char charToFind = '0';
        // keep track of how many chars we found
        NSUInteger charsFound = 0;
        // we'll incrementally read the data, starting obviously from 0
        NSUInteger readPointer = 0;
        // cache some results
        NSUInteger dataLength = [data length];
        const void *dataBytes = [data bytes];
        // read as long as there's data to read
        while (readPointer < dataLength) {
            // drain per-chunk temporaries at the end of every iteration
            @autoreleasepool {
                // calculate how far we're from EOF
                NSUInteger distanceToEndOfData = dataLength - readPointer;
                // pointer arithmetic is magic (-> start reading from this)
                const void *bytes = (const uint8_t *)dataBytes + readPointer;
                // in every loop we want to read up to 128 kb of data (-> up to here)
                NSUInteger _128kb = distanceToEndOfData > 131072 ? 131072 : distanceToEndOfData;
                // get a string from said data (start reading from bytes and read up to _128kb bytes)
                NSString *shortString = [[NSString alloc] initWithBytes:bytes length:_128kb encoding:NSUTF8StringEncoding];
                // nil means the chunk is not valid UTF-8 on its own (e.g. a multi-byte
                // character split at the chunk boundary); bail out instead of looping forever
                if (!shortString) {
                    NSLog(@"Could not decode the chunk at offset %lu", (unsigned long)readPointer);
                    return 1;
                }
                // get a C array of chars
                const char *characters = [shortString UTF8String];
                // and cache its length in bytes
                NSUInteger count = [shortString lengthOfBytesUsingEncoding:NSUTF8StringEncoding];
                // how many of them are == to the char we're looking for?
                for (NSUInteger i = 0; i < count; ++i) {
                    if (characters[i] == charToFind) {
                        ++charsFound;
                    }
                }
                // advance our read pointer by the number of bytes consumed from the mapped data
                readPointer += _128kb;
            }
        }
        // DONE!
        NSLog(@"Found %lu characters", (unsigned long)charsFound);
    }
    return 0;
}
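
For comparison, the memory-mapped loop above can also be written without building an intermediate NSString per chunk, by scanning the mapped bytes directly. This is only a sketch in plain C (which also compiles inside an Objective-C file), assuming a POSIX system with `mmap`; the names `count_byte` and `count_byte_in_file` are hypothetical, and this variant is not the one timed in the results below.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Count how many bytes in buf[0..len) are equal to `needle`. */
static size_t count_byte(const unsigned char *buf, size_t len, unsigned char needle)
{
    assert(buf != NULL || len == 0);

    size_t found = 0;
    for (size_t i = 0; i < len; ++i) {
        if (buf[i] == needle) {
            ++found;
        }
    }
    return found;
}

/* Map `path` read-only and count occurrences of `needle` in it.
   Stores the result in *out_count. Returns 0 on success, -1 on error. */
static int count_byte_in_file(const char *path, unsigned char needle, size_t *out_count)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }

    size_t len = (size_t)st.st_size;
    if (len == 0) {
        /* mmap rejects zero-length mappings; an empty file has no matches. */
        close(fd);
        *out_count = 0;
        return 0;
    }

    void *map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (map == MAP_FAILED) {
        return -1;
    }

    /* Pages are faulted in on demand while the loop walks the mapping. */
    *out_count = count_byte(map, len, needle);
    munmap(map, len);
    return 0;
}
```

Because the comparison happens on raw bytes, no chunk has to be valid UTF-8 on its own, so chunk boundaries stop being a concern.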

Using ack (-c counts matching lines, hence the 1 for this single-line file):

Desktop $ purge && time ack -c '0' 100MB.txt 
1

real        0m2.146s
user        0m0.327s
sys         0m0.615s

Using Objective-C (read the whole string at once):

Release $ purge && time ./MBtest
2012-10-28 14:57:56.755 MBtest[3730:707] Found 104857600 characters

real        0m2.425s
user        0m0.898s
sys         0m0.441s

Using Objective-C (memory-mapped file):

Release $ purge && time ./MBtest
2012-10-28 16:00:25.430 MBtest[4187:707] Found 104857600 characters

real        0m2.285s
user        0m0.583s
sys         0m0.437s