@CodingFabian
Last active August 29, 2015 14:00
Charset benchmark
package charset;

import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;
import java.util.Random;

import org.openjdk.jmh.annotations.GenerateMicroBenchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class CharsetBenchmark {

    String ref = "Hello Charset Benchmark";

    // Seed both generators identically so both benchmarks see the same
    // sequence of charset choices.
    Random r1 = new Random(123456789L);
    Random r2 = new Random(123456789L);

    // Pre-Java 7 style: the charset is named by a String and looked up on each call.
    @GenerateMicroBenchmark
    public byte[] _a_preJava7CharsetLookup() {
        try {
            switch (r1.nextInt(4)) {
            case 0:
                return ref.getBytes("UTF-8");
            case 1:
                return ref.getBytes("ISO-8859-1");
            case 2:
                return ref.getBytes("US-ASCII");
            case 3:
                return ref.getBytes("UTF-16");
            }
        } catch (UnsupportedEncodingException e) {
            // Cannot happen: every implementation of the Java platform is
            // required to support these charsets (see java.nio.charset.Charset).
        }
        return null;
    }

    // Java 7+ style: the charset is passed as a StandardCharsets constant.
    @GenerateMicroBenchmark
    public byte[] _b_postJava7CharsetLookup() {
        switch (r2.nextInt(4)) {
        case 0:
            return ref.getBytes(StandardCharsets.UTF_8);
        case 1:
            return ref.getBytes(StandardCharsets.ISO_8859_1);
        case 2:
            return ref.getBytes(StandardCharsets.US_ASCII);
        case 3:
            return ref.getBytes(StandardCharsets.UTF_16);
        }
        // No checked exception to handle here! YAY!
        return null;
    }
}
/**
 * <pre>
 * Benchmark                                      Mode  Samples      Mean  Mean error   Units
 * c.CharsetBenchmark._a_preJava7CharsetLookup    thrpt       5  3956.537     144.562  ops/ms
 * c.CharsetBenchmark._b_postJava7CharsetLookup   thrpt       5  7138.064     179.101  ops/ms
 * </pre>
 */
@CodingFabian
Author

@shipilev correctly points out that this benchmark does not scale with concurrent threads. It is a toy benchmark, not least because in reality you do not choose charsets at random. I chose the randomness to cheat: it maximizes the cost of looking up charsets by name, because the most recently used charset is cached and is therefore faster to access.
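
For anyone who wants to check the two call styles outside JMH, here is a minimal sketch. It does not measure anything; it only shows that encoding by name and encoding by `StandardCharsets` constant produce the same bytes, so the benchmark above is comparing lookup cost rather than different output. The class name `CharsetLookupSketch` and the helpers `encodeByName` and `encodeByCharset` are made up for illustration and are not part of the gist.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CharsetLookupSketch {

    // Hypothetical helper: encodes via a charset *name*. The JDK has to
    // resolve the name to a Charset (or find it in a cache) on every call,
    // and the caller has to deal with a checked exception.
    static byte[] encodeByName(String text, String charsetName) {
        try {
            return text.getBytes(charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName, e);
        }
    }

    // Hypothetical helper: encodes via an already resolved Charset object,
    // so there is no name lookup and no checked exception.
    static byte[] encodeByCharset(String text, Charset charset) {
        return text.getBytes(charset);
    }

    public static void main(String[] args) {
        String ref = "Hello Charset Benchmark";
        String[] names = {"UTF-8", "ISO-8859-1", "US-ASCII", "UTF-16"};
        Charset[] constants = {
            StandardCharsets.UTF_8,
            StandardCharsets.ISO_8859_1,
            StandardCharsets.US_ASCII,
            StandardCharsets.UTF_16
        };
        for (int i = 0; i < names.length; i++) {
            boolean same = Arrays.equals(
                encodeByName(ref, names[i]),
                encodeByCharset(ref, constants[i]));
            System.out.println(names[i] + " same bytes: " + same);
        }
    }
}
```

If the charset only arrives as a string at runtime (a config value, say), resolving it once with `Charset.forName(name)` and reusing the resulting `Charset` should avoid the repeated lookup in the same way, though I have not benchmarked that variant here.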
