@0xced
Created March 23, 2012 15:02
Experiment ->isa vs object_getClass()
OPTIONS = metaclass.m -o metaclass -std=c99 -framework Foundation -arch x86_64 -Os

isa:
	clang -DUSE_ISA=1 $(OPTIONS)
	./metaclass

object_getClass:
	clang -DUSE_ISA=0 $(OPTIONS)
	./metaclass
#import <stdio.h>
#import <stdlib.h>
#import <objc/runtime.h>
#import <mach/mach_time.h>

int main(int argc, char *argv[])
{
    uint64_t totalTime = 0;
    uint64_t start;

    // Copy every class registered with the Objective-C runtime.
    unsigned int classCount;
    Class *classes = objc_copyClassList(&classCount);
    printf("number of classes: %u\n", classCount);

    const unsigned int experimentCount = 100000;
    for (unsigned int i = 0; i < experimentCount; i++)
    {
        for (unsigned int j = 0; j < classCount; j++)
        {
            Class clazz = classes[j];
            Class metaclass;
            start = mach_absolute_time();
#if USE_ISA
            metaclass = clazz->isa;             // direct ivar access
#else
            metaclass = object_getClass(clazz); // runtime function call
#endif
            totalTime += mach_absolute_time() - start;
        }
    }

    free(classes); // objc_copyClassList() transfers ownership to the caller

#if USE_ISA
    const char *method = "isa";
#else
    const char *method = "object_getClass";
#endif
    printf("time (%s) = %llu\n", method,
           (unsigned long long)(totalTime / (experimentCount * classCount)));
}
$ make object_getClass
clang -DUSE_ISA=0 metaclass.m -o metaclass -std=c99 -framework Foundation -arch x86_64 -Os
./metaclass
number of classes: 671
time (object_getClass) = 22
$ make isa
clang -DUSE_ISA=1 metaclass.m -o metaclass -std=c99 -framework Foundation -arch x86_64 -Os
./metaclass
number of classes: 671
time (isa) = 19
cfsi commented Aug 22, 2014

This is a very nice experiment, but the difference between the two sample runs does not indicate what one might think; in short, I don't think object_getClass is just 15% slower than isa.

For reference, running the exact code above on my machine yields (showing as time : number of classes):

  • 22 : 834 for isa
  • 24 : 834 for object_getClass

If I remove the timed code that assigns metaclass (lines 21-25), leaving:

start = mach_absolute_time();
totalTime += mach_absolute_time() - start;

and run the timing, I get:

  • 22 : 834

That is, it appears to take 22 "absolute time units" to do absolutely nothing (the same as the previous timing I got for isa). What is really taking so long?

mach_absolute_time() is actually a system call (a call into the kernel) with nonzero overhead; those calls alone take substantially more time than the code we are actually trying to measure, so the timed interval really includes the exit and entry costs of that system call. That implies a single call to mach_absolute_time() itself takes about 22 units. We can check by modifying the code again:

start = mach_absolute_time();
mach_absolute_time();
totalTime += mach_absolute_time() - start;

which yields:

  • 44 : 834

That is, 44 minus our baseline of 22 units leaves 22 units, as expected.

Naively subtracting out that 22 units we don't want from the original measurements, we're left with 2 units for object_getClass and 0 units for isa. It's not really quite so simple, but this indicates the difference in performance between the two ways of getting the object class is much larger than 15%, which is expected, since ->isa is just a memory access while object_getClass() is doing a fair amount more.

I leave determining how to measure the performance ratio more accurately as an exercise for the reader. :-)

However, it is worth noting:

  • When using timing calls to measure intervals, it's important that the overhead of those timing calls be significantly less than the thing being measured.
  • The significance of whatever difference in performance exists needs to be measured in context. In my case, I was led here by concerns over changes to JSONKit to remove the deprecated isa, so the real way to test this is by encoding/decoding a large number of objects into/out of JSON. So far I have been unable to find any measurable difference between the two in that context, which indicates (assuming I don't have errors in my experimental setup) that the difference between the two methods is insignificant on the scale of all the work being done, regardless of whether one call takes 10 or 100 times longer than the other.
