Skip to content

Instantly share code, notes, and snippets.

@rahulmutt
Last active January 23, 2017 03:37
Show Gist options
  • Save rahulmutt/1edddd53330082dbb5c69564c51c8b2b to your computer and use it in GitHub Desktop.
Save rahulmutt/1edddd53330082dbb5c69564c51c8b2b to your computer and use it in GitHub Desktop.

Replacing imaginary/queens with the implemenation below in rahulmutt/nofib:

{-# LANGUAGE BangPatterns #-}

-- solution by Oystein Kolsrud
-- https://www.youtube.com/watch?v=I2tMmsZC1ZU
okToAdd :: Int -> [Int] -> Bool
okToAdd q qs = all (okToAddDirection q qs) [succ, pred, id]
    where
        okToAddDirection q qs f = and $ zipWith (/=) (tail (iterate f q)) qs

extendSolution n qs = map (\q -> q:qs) $ filter (\q -> okToAdd q qs) [1..n]

allSolutions !n 0 = [[]]
allSolutions !n k = concatMap (extendSolution n) (allSolutions n (k-1))


main = do
     putStr "12 Queens, "
     putStr (show $ length $ allSolutions 12 12)
     putStr " Solutions.\n"

and running this command:

nofib-runner imaginary/queens --run --jmh="-wi 10 -i 10" --way="-O2"

We get this result:

# JMH 1.15 (released 115 days ago)
# VM version: JDK 1.8.0_102, VM 25.102-b14
# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_102.jdk/Contents/Home/jre/bin/java
# VM options: <none>
# Warmup: 10 iterations, single-shot each
# Measurement: 10 iterations, single-shot each
# Timeout: 10 min per iteration
# Threads: 1 thread
# Benchmark mode: Single shot invocation time
# Benchmark: com.typelead.TestBenchmark.benchmark
# Parameters: (args = 10 +RTS)

# Run progress: 0.00% complete, ETA 00:00:00
# Fork: 1 of 1
# Warmup Iteration   1: 280582.188 ms/op
# Warmup Iteration   2: 1.463 ms/op
# Warmup Iteration   3: 1.613 ms/op
# Warmup Iteration   4: 1.647 ms/op
# Warmup Iteration   5: 1.448 ms/op
# Warmup Iteration   6: 1.539 ms/op
# Warmup Iteration   7: 1.291 ms/op
# Warmup Iteration   8: 1.251 ms/op
# Warmup Iteration   9: 1.435 ms/op
# Warmup Iteration  10: 1.014 ms/op
Iteration   1: 1.344 ms/op
Iteration   2: 1.537 ms/op
Iteration   3: 1.419 ms/op
Iteration   4: 1.478 ms/op
Iteration   5: 1.233 ms/op
Iteration   6: 1.456 ms/op
Iteration   7: 1.124 ms/op
Iteration   8: 1.273 ms/op
Iteration   9: 0.994 ms/op
Iteration  10: 1.116 ms/op


Result "benchmark":
  N = 10
  mean =      1.297 ±(99.9%) 0.272 ms/op

  Histogram, ms/op:
    [0.900, 0.950) = 0 
    [0.950, 1.000) = 1 
    [1.000, 1.050) = 0 
    [1.050, 1.100) = 0 
    [1.100, 1.150) = 2 
    [1.150, 1.200) = 0 
    [1.200, 1.250) = 1 
    [1.250, 1.300) = 1 
    [1.300, 1.350) = 1 
    [1.350, 1.400) = 0 
    [1.400, 1.450) = 1 
    [1.450, 1.500) = 2 
    [1.500, 1.550) = 1 

  Percentiles, ms/op:
      p(0.0000) =      0.994 ms/op
     p(50.0000) =      1.309 ms/op
     p(90.0000) =      1.531 ms/op
     p(95.0000) =      1.537 ms/op
     p(99.0000) =      1.537 ms/op
     p(99.9000) =      1.537 ms/op
     p(99.9900) =      1.537 ms/op
     p(99.9990) =      1.537 ms/op
     p(99.9999) =      1.537 ms/op
    p(100.0000) =      1.537 ms/op


# Run complete. Total time: 00:04:41

Benchmark                 (args)  Mode  Cnt  Score   Error  Units
TestBenchmark.benchmark  10 +RTS    ss   10  1.297 ± 0.272  ms/op

To investigate why it takes so long on the first iteration, we try:

nofib-runner imaginary/queens --run --jmh="-prof cl -wi 0 -i 10" --way="-O2"

which gives:

# JMH 1.15 (released 115 days ago)
# VM version: JDK 1.8.0_102, VM 25.102-b14
# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_102.jdk/Contents/Home/jre/bin/java
# VM options: <none>
# Warmup: <none>
# Measurement: 10 iterations, single-shot each
# Timeout: 10 min per iteration
# Threads: 1 thread
# Benchmark mode: Single shot invocation time
# Benchmark: com.typelead.TestBenchmark.benchmark
# Parameters: (args = 10 +RTS)

# Run progress: 0.00% complete, ETA 00:00:00
# Fork: 1 of 1
Iteration   1: 277928.321 ms/op
                 ·class.load:        1311033.096 classes/sec
                 ·class.load.norm:   4717.000 classes/op
                 ·class.unload:      ≈ 0 classes/sec
                 ·class.unload.norm: ≈ 0 classes/op

Iteration   2: 1.674 ms/op
                 ·class.load:        ≈ 0 classes/sec
                 ·class.load.norm:   ≈ 0 classes/op
                 ·class.unload:      ≈ 0 classes/sec
                 ·class.unload.norm: ≈ 0 classes/op

Iteration   3: 1.611 ms/op
                 ·class.load:        ≈ 0 classes/sec
                 ·class.load.norm:   ≈ 0 classes/op
                 ·class.unload:      ≈ 0 classes/sec
                 ·class.unload.norm: ≈ 0 classes/op

Iteration   4: 1.685 ms/op
                 ·class.load:        ≈ 0 classes/sec
                 ·class.load.norm:   ≈ 0 classes/op
                 ·class.unload:      ≈ 0 classes/sec
                 ·class.unload.norm: ≈ 0 classes/op

Iteration   5: 1.406 ms/op
                 ·class.load:        ≈ 0 classes/sec
                 ·class.load.norm:   ≈ 0 classes/op
                 ·class.unload:      ≈ 0 classes/sec
                 ·class.unload.norm: ≈ 0 classes/op

Iteration   6: 1.679 ms/op
                 ·class.load:        ≈ 0 classes/sec
                 ·class.load.norm:   ≈ 0 classes/op
                 ·class.unload:      ≈ 0 classes/sec
                 ·class.unload.norm: ≈ 0 classes/op

Iteration   7: 1.341 ms/op
                 ·class.load:        ≈ 0 classes/sec
                 ·class.load.norm:   ≈ 0 classes/op
                 ·class.unload:      ≈ 0 classes/sec
                 ·class.unload.norm: ≈ 0 classes/op

Iteration   8: 1.250 ms/op
                 ·class.load:        ≈ 0 classes/sec
                 ·class.load.norm:   ≈ 0 classes/op
                 ·class.unload:      ≈ 0 classes/sec
                 ·class.unload.norm: ≈ 0 classes/op

Iteration   9: 1.436 ms/op
                 ·class.load:        ≈ 0 classes/sec
                 ·class.load.norm:   ≈ 0 classes/op
                 ·class.unload:      ≈ 0 classes/sec
                 ·class.unload.norm: ≈ 0 classes/op

Iteration  10: 1.538 ms/op
                 ·class.load:        ≈ 0 classes/sec
                 ·class.load.norm:   ≈ 0 classes/op
                 ·class.unload:      ≈ 0 classes/sec
                 ·class.unload.norm: ≈ 0 classes/op



Result "benchmark":
  N = 10
  mean =  27794.194 ±(99.9%) 132874.377 ms/op

  Histogram, ms/op:
    [     0.000,  25000.000) = 9 
    [ 25000.000,  50000.000) = 0 
    [ 50000.000,  75000.000) = 0 
    [ 75000.000, 100000.000) = 0 
    [100000.000, 125000.000) = 0 
    [125000.000, 150000.000) = 0 
    [150000.000, 175000.000) = 0 
    [175000.000, 200000.000) = 0 
    [200000.000, 225000.000) = 0 
    [225000.000, 250000.000) = 0 
    [250000.000, 275000.000) = 0 

  Percentiles, ms/op:
      p(0.0000) =      1.250 ms/op
     p(50.0000) =      1.574 ms/op
     p(90.0000) = 250135.657 ms/op
     p(95.0000) = 277928.321 ms/op
     p(99.0000) = 277928.321 ms/op
     p(99.9000) = 277928.321 ms/op
     p(99.9900) = 277928.321 ms/op
     p(99.9990) = 277928.321 ms/op
     p(99.9999) = 277928.321 ms/op
    p(100.0000) = 277928.321 ms/op

Secondary result "·class.load":
  131103.310 ±(99.9%) 626793.463 classes/sec [Average]
  (min, avg, max) = (≈ 0, 131103.310, 1311033.096), stdev = 414585.067
  CI (99.9%): [≈ 0, 757896.772] (assumes normal distribution)

Secondary result "·class.load.norm":
  471.700 ±(99.9%) 2255.156 classes/op [Average]
  (min, avg, max) = (≈ 0, 471.700, 4717.000), stdev = 1491.646
  CI (99.9%): [≈ 0, 2726.856] (assumes normal distribution)

Secondary result "·class.unload":
  ≈ 0 classes/sec

Secondary result "·class.unload.norm":
  ≈ 0 classes/op


# Run complete. Total time: 00:04:38

Benchmark                                    (args)  Mode  Cnt       Score        Error        Units
TestBenchmark.benchmark                     10 +RTS    ss   10   27794.194 ± 132874.377        ms/op
TestBenchmark.benchmark:·class.load         10 +RTS    ss   10  131103.310 ± 626793.463  classes/sec
TestBenchmark.benchmark:·class.load.norm    10 +RTS    ss   10     471.700 ±   2255.156   classes/op
TestBenchmark.benchmark:·class.unload       10 +RTS    ss   10         ≈ 0               classes/sec
TestBenchmark.benchmark:·class.unload.norm  10 +RTS    ss   10         ≈ 0                classes/op

It spends 131 seconds in class loading time. And 27 seconds in the actual benchmark.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment