Skip to content

Instantly share code, notes, and snippets.

@tanmaykm
Last active August 29, 2015 14:13
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save tanmaykm/856bc6007c702c985a12 to your computer and use it in GitHub Desktop.
Save tanmaykm/856bc6007c702c985a12 to your computer and use it in GitHub Desktop.
sparse get methods
S = nnz rows per column of the sparse matrix
I = num rows being indexed
J = num cols being indexed
sorted = indexing time with sorted I and J
unsorted = index time with unsorted I and J
S | I | J | sorted | unsorted
1559 | 16384 | 16384 | 3.21e-01 | 2.16e+00
1559 | 16384 | 1638 | 3.29e-02 | 2.27e-01
1559 | 16384 | 164 | 2.96e-03 | 2.26e-02
1559 | 16384 | 16 | 7.42e-04 | 2.52e-03
1559 | 16384 | 2 | 2.29e-04 | 5.69e-04
1559 | 16384 | 0 | 2.26e-04 | 3.25e-04
1559 | 1638 | 16384 | 2.91e-01 | 4.54e-01
1559 | 1638 | 1638 | 2.84e-02 | 4.56e-02
1559 | 1638 | 164 | 1.87e-03 | 4.11e-03
1559 | 1638 | 16 | 2.15e-04 | 3.57e-04
1559 | 1638 | 2 | 7.00e-05 | 1.16e-04
1559 | 1638 | 0 | 4.40e-05 | 6.00e-05
1559 | 164 | 16384 | 2.41e-01 | 2.50e-01
1559 | 164 | 1638 | 1.77e-02 | 2.27e-02
1559 | 164 | 164 | 1.27e-03 | 1.54e-03
1559 | 164 | 16 | 1.56e-04 | 1.97e-04
1559 | 164 | 2 | 4.40e-05 | 5.40e-05
1559 | 164 | 0 | 2.50e-05 | 3.30e-05
1559 | 16 | 16384 | 2.18e-01 | 2.28e-01
1559 | 16 | 1638 | 1.38e-02 | 1.92e-02
1559 | 16 | 164 | 1.28e-03 | 1.48e-03
1559 | 16 | 16 | 2.01e-04 | 1.56e-04
1559 | 16 | 2 | 4.00e-05 | 4.80e-05
1559 | 16 | 0 | 2.40e-05 | 3.00e-05
1559 | 2 | 16384 | 2.09e-01 | 2.27e-01
1559 | 2 | 1638 | 1.41e-02 | 1.98e-02
1559 | 2 | 164 | 1.27e-03 | 1.42e-03
1559 | 2 | 16 | 1.47e-04 | 1.53e-04
1559 | 2 | 2 | 3.90e-05 | 4.60e-05
1559 | 2 | 0 | 2.30e-05 | 2.90e-05
1559 | 0 | 16384 | 1.82e-01 | 1.92e-01
1559 | 0 | 1638 | 1.28e-02 | 1.67e-02
1559 | 0 | 164 | 1.27e-03 | 1.37e-03
1559 | 0 | 16 | 1.47e-04 | 1.55e-04
1559 | 0 | 2 | 3.90e-05 | 4.50e-05
1559 | 0 | 0 | 2.30e-05 | 2.80e-05
163 | 16384 | 16384 | 4.23e-02 | 2.73e-01
163 | 16384 | 1638 | 2.68e-03 | 1.89e-02
163 | 16384 | 164 | 4.55e-04 | 1.94e-03
163 | 16384 | 16 | 2.24e-04 | 4.69e-04
163 | 16384 | 2 | 2.07e-04 | 3.41e-04
163 | 16384 | 0 | 2.07e-04 | 3.22e-04
163 | 1638 | 16384 | 2.52e-02 | 4.72e-02
163 | 1638 | 1638 | 2.07e-03 | 4.02e-03
163 | 1638 | 164 | 2.78e-04 | 3.50e-04
163 | 1638 | 16 | 6.80e-05 | 9.10e-05
163 | 1638 | 2 | 4.80e-05 | 6.40e-05
163 | 1638 | 0 | 4.30e-05 | 5.90e-05
163 | 164 | 16384 | 1.51e-02 | 2.45e-02
163 | 164 | 1638 | 1.46e-03 | 1.93e-03
163 | 164 | 164 | 1.76e-04 | 1.89e-04
163 | 164 | 16 | 4.10e-05 | 4.70e-05
163 | 164 | 2 | 2.70e-05 | 3.50e-05
163 | 164 | 0 | 2.50e-05 | 3.30e-05
163 | 16 | 16384 | 1.26e-02 | 1.75e-02
163 | 16 | 1638 | 1.38e-03 | 1.52e-03
163 | 16 | 164 | 1.67e-04 | 1.77e-04
163 | 16 | 16 | 3.70e-05 | 4.20e-05
163 | 16 | 2 | 2.60e-05 | 5.70e-05
163 | 16 | 0 | 2.40e-05 | 2.90e-05
163 | 2 | 16384 | 1.25e-02 | 1.47e-02
163 | 2 | 1638 | 1.39e-03 | 1.47e-03
163 | 2 | 164 | 1.67e-04 | 1.74e-04
163 | 2 | 16 | 3.70e-05 | 4.30e-05
163 | 2 | 2 | 2.50e-05 | 2.90e-05
163 | 2 | 0 | 2.30e-05 | 2.80e-05
163 | 0 | 16384 | 1.25e-02 | 1.44e-02
163 | 0 | 1638 | 1.47e-03 | 1.47e-03
163 | 0 | 164 | 1.67e-04 | 1.73e-04
163 | 0 | 16 | 3.70e-05 | 4.20e-05
163 | 0 | 2 | 2.50e-05 | 3.00e-05
163 | 0 | 0 | 2.20e-05 | 2.80e-05
16 | 16384 | 16384 | 2.41e-03 | 1.13e-02
16 | 16384 | 1638 | 4.69e-04 | 1.39e-03
16 | 16384 | 164 | 2.58e-04 | 4.37e-04
16 | 16384 | 16 | 2.41e-04 | 3.37e-04
16 | 16384 | 2 | 2.37e-04 | 3.26e-04
16 | 16384 | 0 | 2.34e-04 | 3.21e-04
16 | 1638 | 16384 | 2.65e-03 | 4.22e-03
16 | 1638 | 1638 | 3.76e-04 | 4.75e-04
16 | 1638 | 164 | 7.80e-05 | 9.80e-05
16 | 1638 | 16 | 4.80e-05 | 6.50e-05
16 | 1638 | 2 | 4.40e-05 | 5.90e-05
16 | 1638 | 0 | 4.40e-05 | 5.90e-05
16 | 164 | 16384 | 1.71e-03 | 2.96e-03
16 | 164 | 1638 | 2.70e-04 | 3.56e-04
16 | 164 | 164 | 5.10e-05 | 6.00e-05
16 | 164 | 16 | 2.90e-05 | 3.50e-05
16 | 164 | 2 | 2.60e-05 | 3.30e-05
16 | 164 | 0 | 2.60e-05 | 3.10e-05
16 | 16 | 16384 | 1.63e-03 | 3.62e-03
16 | 16 | 1638 | 2.61e-04 | 3.09e-04
16 | 16 | 164 | 4.70e-05 | 5.50e-05
16 | 16 | 16 | 2.70e-05 | 3.20e-05
16 | 16 | 2 | 2.40e-05 | 2.90e-05
16 | 16 | 0 | 2.30e-05 | 2.90e-05
16 | 2 | 16384 | 1.62e-03 | 2.77e-03
16 | 2 | 1638 | 2.57e-04 | 3.04e-04
16 | 2 | 164 | 4.60e-05 | 5.40e-05
16 | 2 | 16 | 2.60e-05 | 3.10e-05
16 | 2 | 2 | 2.30e-05 | 2.80e-05
16 | 2 | 0 | 2.30e-05 | 2.80e-05
16 | 0 | 16384 | 1.75e-03 | 2.76e-03
16 | 0 | 1638 | 2.61e-04 | 3.03e-04
16 | 0 | 164 | 4.70e-05 | 5.50e-05
16 | 0 | 16 | 2.60e-05 | 3.10e-05
16 | 0 | 2 | 2.30e-05 | 2.80e-05
16 | 0 | 0 | 2.20e-05 | 2.80e-05
2 | 16384 | 16384 | 7.25e-04 | 1.74e-03
2 | 16384 | 1638 | 2.64e-04 | 4.56e-04
2 | 16384 | 164 | 2.10e-04 | 3.37e-04
2 | 16384 | 16 | 2.04e-04 | 3.26e-04
2 | 16384 | 2 | 2.07e-04 | 3.22e-04
2 | 16384 | 0 | 2.01e-04 | 3.22e-04
2 | 1638 | 16384 | 1.56e-03 | 2.32e-03
2 | 1638 | 1638 | 1.43e-04 | 1.74e-04
2 | 1638 | 164 | 5.60e-05 | 7.30e-05
2 | 1638 | 16 | 4.70e-05 | 6.00e-05
2 | 1638 | 2 | 4.50e-05 | 5.90e-05
2 | 1638 | 0 | 4.40e-05 | 5.90e-05
2 | 164 | 16384 | 7.15e-04 | 1.06e-03
2 | 164 | 1638 | 1.08e-04 | 1.35e-04
2 | 164 | 164 | 3.50e-05 | 4.40e-05
2 | 164 | 16 | 2.70e-05 | 3.40e-05
2 | 164 | 2 | 2.60e-05 | 3.30e-05
2 | 164 | 0 | 2.60e-05 | 3.20e-05
2 | 16 | 16384 | 7.02e-04 | 1.03e-03
2 | 16 | 1638 | 1.05e-04 | 1.30e-04
2 | 16 | 164 | 3.30e-05 | 4.00e-05
2 | 16 | 16 | 2.50e-05 | 3.00e-05
2 | 16 | 2 | 2.40e-05 | 3.00e-05
2 | 16 | 0 | 2.40e-05 | 2.90e-05
2 | 2 | 16384 | 7.02e-04 | 9.68e-04
2 | 2 | 1638 | 1.04e-04 | 1.23e-04
2 | 2 | 164 | 3.20e-05 | 3.90e-05
2 | 2 | 16 | 2.40e-05 | 3.00e-05
2 | 2 | 2 | 2.40e-05 | 2.80e-05
2 | 2 | 0 | 2.30e-05 | 2.80e-05
2 | 0 | 16384 | 7.00e-04 | 9.77e-04
2 | 0 | 1638 | 1.04e-04 | 1.23e-04
2 | 0 | 164 | 3.30e-05 | 3.80e-05
2 | 0 | 16 | 2.40e-05 | 3.00e-05
2 | 0 | 2 | 2.30e-05 | 2.80e-05
2 | 0 | 0 | 2.20e-05 | 2.80e-05
0 | 16384 | 16384 | 4.66e-04 | 7.39e-04
0 | 16384 | 1638 | 2.67e-04 | 3.59e-04
0 | 16384 | 164 | 2.37e-04 | 3.26e-04
0 | 16384 | 16 | 2.34e-04 | 3.24e-04
0 | 16384 | 2 | 2.34e-04 | 3.22e-04
0 | 16384 | 0 | 3.41e-04 | 3.22e-04
0 | 1638 | 16384 | 1.62e-01 | 1.62e-01
0 | 1638 | 1638 | 1.62e-02 | 1.63e-02
0 | 1638 | 164 | 1.67e-03 | 1.77e-03
0 | 1638 | 16 | 2.07e-04 | 3.02e-04
0 | 1638 | 2 | 6.90e-05 | 1.63e-04
0 | 1638 | 0 | 4.90e-05 | 1.63e-04
0 | 164 | 16384 | 1.65e-02 | 1.67e-02
0 | 164 | 1638 | 1.66e-03 | 1.70e-03
0 | 164 | 164 | 1.76e-04 | 1.86e-04
0 | 164 | 16 | 2.60e-05 | 3.70e-05
0 | 164 | 2 | 1.20e-05 | 2.20e-05
0 | 164 | 0 | 9.00e-06 | 2.00e-05
0 | 16 | 16384 | 1.80e-03 | 1.93e-03
0 | 16 | 1638 | 1.89e-04 | 2.02e-04
0 | 16 | 164 | 2.40e-05 | 3.00e-05
0 | 16 | 16 | 7.00e-06 | 1.10e-05
0 | 16 | 2 | 6.00e-06 | 1.00e-05
0 | 16 | 0 | 5.00e-06 | 9.00e-06
0 | 2 | 16384 | 4.57e-04 | 6.15e-04
0 | 2 | 1638 | 5.70e-05 | 7.20e-05
0 | 2 | 164 | 1.00e-05 | 1.60e-05
0 | 2 | 16 | 6.00e-06 | 1.00e-05
0 | 2 | 2 | 4.00e-06 | 8.00e-06
0 | 2 | 0 | 4.00e-06 | 9.00e-06
0 | 0 | 16384 | 3.25e-04 | 4.58e-04
0 | 0 | 1638 | 4.20e-05 | 5.50e-05
0 | 0 | 164 | 8.00e-06 | 1.40e-05
0 | 0 | 16 | 5.00e-06 | 8.00e-06
0 | 0 | 2 | 4.00e-06 | 7.00e-06
0 | 0 | 0 | 4.00e-06 | 7.00e-06
S = nnz rows per column of the sparse matrix
I = num rows being indexed
J = num cols being indexed
sorted = indexing time with sorted I and J
unsorted = indexing time with unsorted I and J in different algos
unsorted, lin/bin = linear and binary search with a transpose
unsorted, sorted = sorted getindex followed by permute_rows! (proposed)
unsorted, linear = linear search with a transpose (existing)
S | I | J | sorted | unsorted | diff
| | | | lin/bin | sorted | linear | fastest | existing/proposed
16384 | 16384 | 16384 | 5.51e+00 | 4.26e+01 | 3.63e+01 | 3.78e+01 | sorted | 1.04e+00
16384 | 16384 | 1638 | 6.06e-01 | 4.44e+00 | 4.32e+00 | 4.09e+00 | linear | 9.48e-01
16384 | 16384 | 164 | 3.87e-02 | 4.60e-01 | 4.29e-01 | 4.96e-01 | sorted | 1.16e+00
16384 | 16384 | 16 | 3.70e-03 | 3.95e-02 | 4.18e-02 | 4.46e-02 | lin/bin | 1.07e+00
16384 | 16384 | 2 | 5.54e-04 | 6.11e-03 | 7.27e-03 | 5.54e-03 | linear | 7.61e-01
16384 | 16384 | 0 | 7.32e-05 | 2.46e-03 | 2.21e-03 | 2.43e-03 | sorted | 1.10e+00
16384 | 1638 | 16384 | 4.41e+00 | 6.65e+00 | 6.95e+00 | 5.41e+00 | linear | 7.78e-01
16384 | 1638 | 1638 | 4.20e-01 | 6.72e-01 | 6.56e-01 | 5.81e-01 | linear | 8.87e-01
16384 | 1638 | 164 | 4.16e-02 | 5.59e-02 | 6.50e-02 | 4.91e-02 | linear | 7.56e-01
16384 | 1638 | 16 | 3.80e-03 | 5.65e-03 | 6.22e-03 | 5.03e-03 | linear | 8.09e-01
16384 | 1638 | 2 | 5.08e-04 | 6.93e-04 | 1.06e-03 | 5.91e-04 | linear | 5.56e-01
16384 | 1638 | 0 | 1.71e-05 | 2.24e-04 | 1.63e-04 | 1.89e-04 | sorted | 1.16e+00
16384 | 164 | 16384 | 1.37e+00 | 2.68e+00 | 1.59e+00 | 2.39e+00 | sorted | 1.50e+00
16384 | 164 | 1638 | 1.38e-01 | 2.52e-01 | 1.59e-01 | 2.25e-01 | sorted | 1.41e+00
16384 | 164 | 164 | 1.22e-02 | 2.32e-02 | 1.44e-02 | 1.91e-02 | sorted | 1.33e+00
16384 | 164 | 16 | 1.11e-03 | 2.12e-03 | 1.35e-03 | 1.62e-03 | sorted | 1.20e+00
16384 | 164 | 2 | 1.74e-04 | 3.40e-04 | 2.10e-04 | 2.67e-04 | sorted | 1.28e+00
16384 | 164 | 0 | 1.23e-05 | 7.90e-05 | 2.80e-05 | 3.76e-05 | sorted | 1.34e+00
16384 | 16 | 16384 | 3.38e-01 | 1.95e+00 | 3.84e-01 | 1.59e+00 | sorted | 4.14e+00
16384 | 16 | 1638 | 3.40e-02 | 1.88e-01 | 4.13e-02 | 1.62e-01 | sorted | 3.93e+00
16384 | 16 | 164 | 2.38e-03 | 1.67e-02 | 2.84e-03 | 1.14e-02 | sorted | 4.02e+00
16384 | 16 | 16 | 2.40e-04 | 1.62e-03 | 3.12e-04 | 1.39e-03 | sorted | 4.44e+00
16384 | 16 | 2 | 4.06e-05 | 2.54e-04 | 5.68e-05 | 1.65e-04 | sorted | 2.91e+00
16384 | 16 | 0 | 1.01e-05 | 6.64e-05 | 1.89e-05 | 2.52e-05 | sorted | 1.33e+00
16384 | 2 | 16384 | 1.86e-02 | 1.60e+00 | 5.00e-02 | 8.98e-01 | sorted | 1.80e+01
16384 | 2 | 1638 | 1.90e-03 | 1.57e-01 | 5.00e-03 | 9.32e-02 | sorted | 1.86e+01
16384 | 2 | 164 | 2.57e-04 | 1.39e-02 | 5.93e-04 | 7.87e-03 | sorted | 1.33e+01
16384 | 2 | 16 | 3.34e-05 | 1.43e-03 | 7.68e-05 | 6.21e-04 | sorted | 8.08e+00
16384 | 2 | 2 | 1.54e-05 | 2.16e-04 | 3.02e-05 | 1.21e-04 | sorted | 4.01e+00
16384 | 2 | 0 | 1.29e-05 | 6.71e-05 | 1.98e-05 | 2.95e-05 | sorted | 1.49e+00
16384 | 0 | 16384 | 3.23e-04 | 9.85e-01 | 5.17e-04 | 2.00e-04 | linear | 3.86e-01
16384 | 0 | 1638 | 8.80e-05 | 1.01e-01 | 1.20e-04 | 4.72e-05 | linear | 3.92e-01
16384 | 0 | 164 | 2.02e-05 | 9.29e-03 | 2.96e-05 | 2.76e-05 | linear | 9.33e-01
16384 | 0 | 16 | 1.37e-05 | 9.68e-04 | 1.90e-05 | 2.57e-05 | sorted | 1.35e+00
16384 | 0 | 2 | 1.03e-05 | 1.74e-04 | 1.70e-05 | 2.33e-05 | sorted | 1.37e+00
16384 | 0 | 0 | 1.05e-05 | 6.31e-05 | 1.72e-05 | 2.42e-05 | sorted | 1.40e+00
1639 | 16384 | 16384 | 6.31e-01 | 4.32e+00 | 3.57e+00 | 6.49e+00 | sorted | 1.82e+00
1639 | 16384 | 1638 | 4.37e-02 | 3.87e-01 | 2.85e-01 | 6.05e-01 | sorted | 2.12e+00
1639 | 16384 | 164 | 4.83e-03 | 3.34e-02 | 3.01e-02 | 5.22e-02 | sorted | 1.73e+00
1639 | 16384 | 16 | 1.05e-03 | 6.53e-03 | 5.10e-03 | 7.68e-03 | sorted | 1.51e+00
1639 | 16384 | 2 | 4.22e-04 | 2.73e-03 | 2.49e-03 | 2.63e-03 | sorted | 1.06e+00
1639 | 16384 | 0 | 1.57e-04 | 2.12e-03 | 1.90e-03 | 2.02e-03 | sorted | 1.06e+00
1639 | 1638 | 16384 | 7.37e-01 | 1.02e+00 | 9.69e-01 | 1.13e+00 | sorted | 1.17e+00
1639 | 1638 | 1638 | 7.49e-02 | 9.45e-02 | 9.57e-02 | 1.11e-01 | lin/bin | 1.16e+00
1639 | 1638 | 164 | 7.39e-03 | 8.88e-03 | 9.43e-03 | 1.14e-02 | lin/bin | 1.21e+00
1639 | 1638 | 16 | 7.50e-04 | 1.04e-03 | 1.13e-03 | 1.15e-03 | lin/bin | 1.02e+00
1639 | 1638 | 2 | 1.21e-04 | 3.33e-04 | 3.06e-04 | 3.23e-04 | sorted | 1.06e+00
1639 | 1638 | 0 | 1.74e-05 | 2.26e-04 | 1.66e-04 | 1.88e-04 | sorted | 1.14e+00
1639 | 164 | 16384 | 3.88e-01 | 3.20e-01 | 4.49e-01 | 2.84e-01 | linear | 6.33e-01
1639 | 164 | 1638 | 3.86e-02 | 2.93e-02 | 4.30e-02 | 2.74e-02 | linear | 6.37e-01
1639 | 164 | 164 | 3.66e-03 | 2.63e-03 | 4.10e-03 | 2.36e-03 | linear | 5.75e-01
1639 | 164 | 16 | 3.73e-04 | 3.61e-04 | 4.44e-04 | 2.76e-04 | linear | 6.23e-01
1639 | 164 | 2 | 5.81e-05 | 1.06e-04 | 8.21e-05 | 6.71e-05 | linear | 8.17e-01
1639 | 164 | 0 | 1.20e-05 | 7.79e-05 | 2.91e-05 | 3.74e-05 | sorted | 1.29e+00
1639 | 16 | 16384 | 1.22e-01 | 2.59e-01 | 1.50e-01 | 1.97e-01 | sorted | 1.31e+00
1639 | 16 | 1638 | 9.57e-03 | 2.24e-02 | 1.23e-02 | 1.60e-02 | sorted | 1.31e+00
1639 | 16 | 164 | 9.06e-04 | 2.12e-03 | 1.22e-03 | 1.40e-03 | sorted | 1.15e+00
1639 | 16 | 16 | 1.01e-04 | 2.47e-04 | 1.39e-04 | 1.75e-04 | sorted | 1.26e+00
1639 | 16 | 2 | 2.21e-05 | 8.78e-05 | 3.51e-05 | 3.94e-05 | sorted | 1.12e+00
1639 | 16 | 0 | 1.01e-05 | 6.60e-05 | 1.76e-05 | 2.54e-05 | sorted | 1.44e+00
1639 | 2 | 16384 | 2.35e-02 | 1.87e-01 | 2.98e-02 | 1.04e-01 | sorted | 3.48e+00
1639 | 2 | 1638 | 1.76e-03 | 1.67e-02 | 2.56e-03 | 7.73e-03 | sorted | 3.02e+00
1639 | 2 | 164 | 2.00e-04 | 1.58e-03 | 2.61e-04 | 7.17e-04 | sorted | 2.74e+00
1639 | 2 | 16 | 3.19e-05 | 2.48e-04 | 4.68e-05 | 1.22e-04 | sorted | 2.61e+00
1639 | 2 | 2 | 1.43e-05 | 7.90e-05 | 2.01e-05 | 3.25e-05 | sorted | 1.61e+00
1639 | 2 | 0 | 1.05e-05 | 6.10e-05 | 1.76e-05 | 2.46e-05 | sorted | 1.39e+00
1639 | 0 | 16384 | 2.75e-04 | 1.31e-01 | 5.05e-04 | 1.96e-04 | linear | 3.88e-01
1639 | 0 | 1638 | 8.13e-05 | 1.35e-02 | 1.17e-04 | 4.73e-05 | linear | 4.04e-01
1639 | 0 | 164 | 1.99e-05 | 1.31e-03 | 2.69e-05 | 2.66e-05 | linear | 9.90e-01
1639 | 0 | 16 | 1.06e-05 | 1.92e-04 | 1.75e-05 | 2.40e-05 | sorted | 1.37e+00
1639 | 0 | 2 | 9.54e-06 | 7.86e-05 | 1.75e-05 | 2.51e-05 | sorted | 1.43e+00
1639 | 0 | 0 | 9.95e-06 | 6.21e-05 | 1.66e-05 | 2.30e-05 | sorted | 1.39e+00
164 | 16384 | 16384 | 6.35e-02 | 4.81e-01 | 3.07e-01 | 2.03e+00 | sorted | 6.61e+00
164 | 16384 | 1638 | 8.11e-03 | 4.02e-02 | 3.04e-02 | 1.93e-01 | sorted | 6.34e+00
164 | 16384 | 164 | 1.47e-03 | 7.18e-03 | 5.14e-03 | 2.10e-02 | sorted | 4.09e+00
164 | 16384 | 16 | 3.95e-04 | 2.66e-03 | 2.38e-03 | 3.84e-03 | sorted | 1.61e+00
164 | 16384 | 2 | 1.96e-04 | 2.20e-03 | 2.02e-03 | 2.26e-03 | sorted | 1.12e+00
164 | 16384 | 0 | 1.50e-04 | 2.11e-03 | 1.96e-03 | 2.03e-03 | sorted | 1.03e+00
164 | 1638 | 16384 | 3.76e-02 | 8.91e-02 | 9.88e-02 | 2.91e-01 | lin/bin | 2.94e+00
164 | 1638 | 1638 | 5.48e-03 | 9.26e-03 | 1.03e-02 | 2.95e-02 | lin/bin | 2.87e+00
164 | 1638 | 164 | 1.03e-03 | 1.44e-03 | 1.60e-03 | 3.15e-03 | lin/bin | 1.97e+00
164 | 1638 | 16 | 2.71e-04 | 4.51e-04 | 4.53e-04 | 4.56e-04 | lin/bin | 1.01e+00
164 | 1638 | 2 | 1.27e-04 | 2.95e-04 | 2.76e-04 | 2.21e-04 | linear | 8.02e-01
164 | 1638 | 0 | 1.00e-04 | 2.68e-04 | 2.43e-04 | 1.88e-04 | linear | 7.75e-01
164 | 164 | 16384 | 7.15e-02 | 6.26e-02 | 1.04e-01 | 8.97e-02 | lin/bin | 8.63e-01
164 | 164 | 1638 | 7.44e-03 | 6.37e-03 | 1.07e-02 | 9.27e-03 | lin/bin | 8.62e-01
164 | 164 | 164 | 8.01e-04 | 7.31e-04 | 1.13e-03 | 9.78e-04 | lin/bin | 8.69e-01
164 | 164 | 16 | 9.04e-05 | 1.38e-04 | 1.32e-04 | 1.28e-04 | linear | 9.68e-01
164 | 164 | 2 | 6.00e-05 | 8.18e-05 | 4.39e-05 | 4.80e-05 | sorted | 1.09e+00
164 | 164 | 0 | 1.10e-05 | 7.70e-05 | 3.14e-05 | 3.83e-05 | sorted | 1.22e+00
164 | 16 | 16384 | 1.77e-02 | 3.21e-02 | 3.19e-02 | 3.01e-02 | linear | 9.44e-01
164 | 16 | 1638 | 2.11e-03 | 3.37e-03 | 3.34e-03 | 2.92e-03 | linear | 8.75e-01
164 | 16 | 164 | 2.61e-04 | 4.11e-04 | 3.53e-04 | 3.19e-04 | linear | 9.03e-01
164 | 16 | 16 | 3.71e-05 | 9.39e-05 | 4.54e-05 | 5.40e-05 | sorted | 1.19e+00
164 | 16 | 2 | 1.37e-05 | 6.71e-05 | 2.13e-05 | 2.87e-05 | sorted | 1.35e+00
164 | 16 | 0 | 1.00e-05 | 6.39e-05 | 1.70e-05 | 2.58e-05 | sorted | 1.52e+00
164 | 2 | 16384 | 9.81e-03 | 2.40e-02 | 1.39e-02 | 1.36e-02 | linear | 9.81e-01
164 | 2 | 1638 | 1.12e-03 | 2.47e-03 | 1.51e-03 | 1.48e-03 | linear | 9.77e-01
164 | 2 | 164 | 1.54e-04 | 3.14e-04 | 1.86e-04 | 1.77e-04 | linear | 9.50e-01
164 | 2 | 16 | 2.44e-05 | 8.77e-05 | 3.28e-05 | 4.01e-05 | sorted | 1.22e+00
164 | 2 | 2 | 1.20e-05 | 6.49e-05 | 1.66e-05 | 3.06e-05 | sorted | 1.84e+00
164 | 2 | 0 | 1.01e-05 | 6.33e-05 | 1.65e-05 | 2.37e-05 | sorted | 1.44e+00
164 | 0 | 16384 | 1.31e-04 | 2.12e-02 | 2.44e-04 | 1.96e-04 | linear | 8.06e-01
164 | 0 | 1638 | 2.72e-05 | 2.92e-03 | 4.15e-05 | 4.82e-05 | sorted | 1.16e+00
164 | 0 | 164 | 1.35e-05 | 3.41e-04 | 1.90e-05 | 2.90e-05 | sorted | 1.53e+00
164 | 0 | 16 | 9.88e-06 | 9.10e-05 | 1.67e-05 | 2.82e-05 | sorted | 1.69e+00
164 | 0 | 2 | 1.07e-05 | 6.42e-05 | 1.77e-05 | 2.49e-05 | sorted | 1.41e+00
164 | 0 | 0 | 9.68e-06 | 6.16e-05 | 1.80e-05 | 2.40e-05 | sorted | 1.33e+00
16 | 16384 | 16384 | 7.30e-03 | 4.71e-02 | 5.30e-02 | 1.35e+00 | lin/bin | 2.55e+01
16 | 16384 | 1638 | 1.96e-03 | 7.24e-03 | 7.89e-03 | 1.37e-01 | lin/bin | 1.73e+01
16 | 16384 | 164 | 4.59e-04 | 2.72e-03 | 2.65e-03 | 1.55e-02 | sorted | 5.83e+00
16 | 16384 | 16 | 2.31e-04 | 2.20e-03 | 2.00e-03 | 3.38e-03 | sorted | 1.69e+00
16 | 16384 | 2 | 1.60e-04 | 2.14e-03 | 1.94e-03 | 2.18e-03 | sorted | 1.12e+00
16 | 16384 | 0 | 1.55e-04 | 2.12e-03 | 1.90e-03 | 2.07e-03 | sorted | 1.09e+00
16 | 1638 | 16384 | 4.19e-03 | 1.15e-02 | 3.27e-02 | 1.49e-01 | lin/bin | 4.56e+00
16 | 1638 | 1638 | 1.29e-03 | 1.81e-03 | 4.01e-03 | 1.54e-02 | lin/bin | 3.84e+00
16 | 1638 | 164 | 3.25e-04 | 5.13e-04 | 7.05e-04 | 1.74e-03 | lin/bin | 2.47e+00
16 | 1638 | 16 | 1.32e-04 | 2.99e-04 | 2.96e-04 | 3.35e-04 | sorted | 1.13e+00
16 | 1638 | 2 | 1.04e-04 | 2.73e-04 | 2.51e-04 | 2.05e-04 | linear | 8.18e-01
16 | 1638 | 0 | 1.01e-04 | 2.65e-04 | 2.44e-04 | 1.88e-04 | linear | 7.71e-01
16 | 164 | 16384 | 2.02e-02 | 1.53e-02 | 2.74e-02 | 2.61e-02 | lin/bin | 9.53e-01
16 | 164 | 1638 | 2.35e-03 | 1.84e-03 | 3.07e-03 | 2.88e-03 | lin/bin | 9.40e-01
16 | 164 | 164 | 2.67e-04 | 2.58e-04 | 3.26e-04 | 3.24e-04 | lin/bin | 9.94e-01
16 | 164 | 16 | 3.60e-05 | 1.27e-04 | 6.00e-05 | 6.67e-05 | sorted | 1.11e+00
16 | 164 | 2 | 1.43e-05 | 8.00e-05 | 3.68e-05 | 4.18e-05 | sorted | 1.14e+00
16 | 164 | 0 | 1.12e-05 | 7.75e-05 | 2.75e-05 | 3.71e-05 | sorted | 1.35e+00
16 | 16 | 16384 | 6.43e-03 | 7.22e-03 | 9.56e-03 | 1.01e-02 | lin/bin | 1.05e+00
16 | 16 | 1638 | 9.42e-04 | 9.86e-04 | 1.18e-03 | 1.37e-03 | lin/bin | 1.16e+00
16 | 16 | 164 | 1.24e-04 | 1.63e-04 | 1.40e-04 | 1.49e-04 | sorted | 1.06e+00
16 | 16 | 16 | 2.23e-05 | 7.88e-05 | 3.35e-05 | 3.73e-05 | sorted | 1.11e+00
16 | 16 | 2 | 1.11e-05 | 6.43e-05 | 2.28e-05 | 2.94e-05 | sorted | 1.29e+00
16 | 16 | 0 | 9.57e-06 | 6.38e-05 | 1.72e-05 | 2.57e-05 | sorted | 1.50e+00
16 | 2 | 16384 | 1.80e-03 | 3.94e-03 | 3.42e-03 | 3.54e-03 | sorted | 1.04e+00
16 | 2 | 1638 | 3.48e-04 | 5.51e-04 | 4.53e-04 | 4.68e-04 | sorted | 1.03e+00
16 | 2 | 164 | 5.44e-05 | 1.27e-04 | 6.80e-05 | 7.35e-05 | sorted | 1.08e+00
16 | 2 | 16 | 1.44e-05 | 7.02e-05 | 2.78e-05 | 2.92e-05 | sorted | 1.05e+00
16 | 2 | 2 | 1.27e-05 | 6.43e-05 | 2.04e-05 | 2.71e-05 | sorted | 1.33e+00
16 | 2 | 0 | 9.99e-06 | 6.15e-05 | 1.69e-05 | 2.35e-05 | sorted | 1.39e+00
16 | 0 | 16384 | 1.31e-04 | 3.63e-03 | 1.94e-04 | 2.05e-04 | sorted | 1.05e+00
16 | 0 | 1638 | 3.03e-05 | 6.35e-04 | 4.17e-05 | 4.69e-05 | sorted | 1.12e+00
16 | 0 | 164 | 1.34e-05 | 1.31e-04 | 1.89e-05 | 2.61e-05 | sorted | 1.38e+00
16 | 0 | 16 | 9.84e-06 | 6.97e-05 | 1.68e-05 | 2.46e-05 | sorted | 1.47e+00
16 | 0 | 2 | 9.68e-06 | 5.97e-05 | 1.58e-05 | 2.27e-05 | sorted | 1.44e+00
16 | 0 | 0 | 1.20e-05 | 6.62e-05 | 1.87e-05 | 2.52e-05 | sorted | 1.34e+00
2 | 16384 | 16384 | 2.36e-03 | 7.74e-03 | 2.88e-02 | 7.14e-01 | lin/bin | 2.48e+01
2 | 16384 | 1638 | 5.92e-04 | 3.00e-03 | 4.95e-03 | 7.35e-02 | lin/bin | 1.49e+01
2 | 16384 | 164 | 2.23e-04 | 2.21e-03 | 2.35e-03 | 8.95e-03 | lin/bin | 3.81e+00
2 | 16384 | 16 | 1.59e-04 | 2.15e-03 | 1.94e-03 | 2.66e-03 | sorted | 1.37e+00
2 | 16384 | 2 | 1.55e-04 | 2.13e-03 | 1.94e-03 | 2.13e-03 | sorted | 1.10e+00
2 | 16384 | 0 | 1.49e-04 | 2.11e-03 | 1.95e-03 | 2.04e-03 | sorted | 1.04e+00
2 | 1638 | 16384 | 1.31e-03 | 2.42e-03 | 6.50e-03 | 7.19e-02 | lin/bin | 1.11e+01
2 | 1638 | 1638 | 4.23e-04 | 7.01e-04 | 1.11e-03 | 7.60e-03 | lin/bin | 6.86e+00
2 | 1638 | 164 | 1.54e-04 | 3.23e-04 | 3.46e-04 | 9.16e-04 | lin/bin | 2.64e+00
2 | 1638 | 16 | 1.09e-04 | 2.90e-04 | 2.54e-04 | 2.58e-04 | sorted | 1.02e+00
2 | 1638 | 2 | 1.01e-04 | 2.67e-04 | 2.42e-04 | 2.02e-04 | linear | 8.37e-01
2 | 1638 | 0 | 1.02e-04 | 2.67e-04 | 2.44e-04 | 1.94e-04 | linear | 7.94e-01
2 | 164 | 16384 | 6.87e-03 | 5.24e-03 | 8.15e-03 | 9.12e-03 | lin/bin | 1.12e+00
2 | 164 | 1638 | 7.54e-04 | 7.93e-04 | 1.06e-03 | 1.14e-03 | lin/bin | 1.08e+00
2 | 164 | 164 | 1.13e-04 | 1.61e-04 | 1.39e-04 | 1.54e-04 | sorted | 1.11e+00
2 | 164 | 16 | 2.19e-05 | 8.51e-05 | 3.97e-05 | 4.89e-05 | sorted | 1.23e+00
2 | 164 | 2 | 1.18e-05 | 7.78e-05 | 3.10e-05 | 3.86e-05 | sorted | 1.25e+00
2 | 164 | 0 | 1.11e-05 | 7.55e-05 | 2.75e-05 | 3.71e-05 | sorted | 1.35e+00
2 | 16 | 16384 | 1.84e-03 | 2.12e-03 | 2.73e-03 | 2.91e-03 | lin/bin | 1.07e+00
2 | 16 | 1638 | 3.21e-04 | 4.52e-04 | 4.76e-04 | 4.98e-04 | lin/bin | 1.05e+00
2 | 16 | 164 | 5.70e-05 | 1.12e-04 | 7.39e-05 | 7.84e-05 | sorted | 1.06e+00
2 | 16 | 16 | 1.57e-05 | 7.10e-05 | 2.39e-05 | 3.01e-05 | sorted | 1.26e+00
2 | 16 | 2 | 1.11e-05 | 6.47e-05 | 1.81e-05 | 3.08e-05 | sorted | 1.70e+00
2 | 16 | 0 | 9.17e-06 | 6.54e-05 | 1.73e-05 | 2.55e-05 | sorted | 1.48e+00
2 | 2 | 16384 | 8.28e-04 | 1.34e-03 | 1.35e-03 | 1.50e-03 | lin/bin | 1.11e+00
2 | 2 | 1638 | 1.69e-04 | 2.69e-04 | 2.26e-04 | 2.53e-04 | sorted | 1.12e+00
2 | 2 | 164 | 3.18e-05 | 8.68e-05 | 4.00e-05 | 4.86e-05 | sorted | 1.22e+00
2 | 2 | 16 | 1.29e-05 | 6.81e-05 | 1.98e-05 | 2.66e-05 | sorted | 1.34e+00
2 | 2 | 2 | 1.04e-05 | 6.29e-05 | 1.65e-05 | 2.39e-05 | sorted | 1.44e+00
2 | 2 | 0 | 1.01e-05 | 6.22e-05 | 1.56e-05 | 2.43e-05 | sorted | 1.56e+00
2 | 0 | 16384 | 1.27e-04 | 8.61e-04 | 1.94e-04 | 1.98e-04 | sorted | 1.02e+00
2 | 0 | 1638 | 3.41e-05 | 2.27e-04 | 4.39e-05 | 4.84e-05 | sorted | 1.10e+00
2 | 0 | 164 | 1.67e-05 | 8.60e-05 | 2.06e-05 | 2.76e-05 | sorted | 1.34e+00
2 | 0 | 16 | 9.96e-06 | 6.67e-05 | 1.76e-05 | 2.52e-05 | sorted | 1.43e+00
2 | 0 | 2 | 1.02e-05 | 6.31e-05 | 1.65e-05 | 2.33e-05 | sorted | 1.41e+00
2 | 0 | 0 | 1.06e-05 | 6.28e-05 | 1.70e-05 | 2.39e-05 | sorted | 1.41e+00
0 | 16384 | 16384 | 3.35e-04 | 2.48e-03 | 2.30e-03 | 2.42e-03 | sorted | 1.05e+00
0 | 16384 | 1638 | 2.09e-04 | 2.17e-03 | 1.99e-03 | 2.09e-03 | sorted | 1.05e+00
0 | 16384 | 164 | 1.60e-04 | 2.13e-03 | 1.92e-03 | 2.03e-03 | sorted | 1.06e+00
0 | 16384 | 16 | 1.49e-04 | 2.11e-03 | 1.90e-03 | 2.04e-03 | sorted | 1.07e+00
0 | 16384 | 2 | 1.50e-04 | 2.13e-03 | 1.91e-03 | 2.03e-03 | sorted | 1.07e+00
0 | 16384 | 0 | 1.52e-04 | 2.10e-03 | 1.92e-03 | 2.02e-03 | sorted | 1.05e+00
0 | 1638 | 16384 | 2.90e-04 | 5.74e-04 | 6.00e-04 | 5.26e-04 | linear | 8.76e-01
0 | 1638 | 1638 | 1.59e-04 | 3.25e-04 | 3.14e-04 | 2.66e-04 | linear | 8.48e-01
0 | 1638 | 164 | 1.12e-04 | 2.86e-04 | 2.51e-04 | 1.95e-04 | linear | 7.76e-01
0 | 1638 | 16 | 1.02e-04 | 2.69e-04 | 2.40e-04 | 1.88e-04 | linear | 7.84e-01
0 | 1638 | 2 | 1.02e-04 | 2.66e-04 | 2.42e-04 | 1.89e-04 | linear | 7.83e-01
0 | 1638 | 0 | 1.00e-04 | 2.66e-04 | 2.43e-04 | 1.89e-04 | linear | 7.75e-01
0 | 164 | 16384 | 2.01e-04 | 4.10e-04 | 3.43e-04 | 3.76e-04 | sorted | 1.10e+00
0 | 164 | 1638 | 5.91e-05 | 1.37e-04 | 8.70e-05 | 1.09e-04 | sorted | 1.25e+00
0 | 164 | 164 | 1.77e-05 | 8.51e-05 | 3.52e-05 | 4.74e-05 | sorted | 1.35e+00
0 | 164 | 16 | 1.19e-05 | 7.78e-05 | 2.84e-05 | 3.82e-05 | sorted | 1.35e+00
0 | 164 | 2 | 1.15e-05 | 1.12e-04 | 2.97e-05 | 3.75e-05 | sorted | 1.26e+00
0 | 164 | 0 | 1.09e-05 | 7.61e-05 | 2.68e-05 | 3.71e-05 | sorted | 1.38e+00
0 | 16 | 16384 | 1.98e-04 | 3.88e-04 | 3.30e-04 | 3.61e-04 | sorted | 1.10e+00
0 | 16 | 1638 | 5.91e-05 | 1.42e-04 | 7.47e-05 | 9.32e-05 | sorted | 1.25e+00
0 | 16 | 164 | 1.54e-05 | 7.35e-05 | 2.46e-05 | 3.25e-05 | sorted | 1.32e+00
0 | 16 | 16 | 1.09e-05 | 6.66e-05 | 1.85e-05 | 2.59e-05 | sorted | 1.40e+00
0 | 16 | 2 | 1.10e-05 | 6.51e-05 | 1.70e-05 | 2.36e-05 | sorted | 1.39e+00
0 | 16 | 0 | 9.47e-06 | 6.60e-05 | 1.82e-05 | 2.61e-05 | sorted | 1.44e+00
0 | 2 | 16384 | 2.02e-04 | 3.94e-04 | 3.31e-04 | 3.63e-04 | sorted | 1.10e+00
0 | 2 | 1638 | 5.83e-05 | 1.27e-04 | 7.32e-05 | 9.48e-05 | sorted | 1.29e+00
0 | 2 | 164 | 1.56e-05 | 7.13e-05 | 2.33e-05 | 3.19e-05 | sorted | 1.37e+00
0 | 2 | 16 | 1.13e-05 | 6.58e-05 | 2.15e-05 | 2.81e-05 | sorted | 1.31e+00
0 | 2 | 2 | 1.06e-05 | 6.23e-05 | 1.61e-05 | 2.47e-05 | sorted | 1.53e+00
0 | 2 | 0 | 9.71e-06 | 6.30e-05 | 1.71e-05 | 2.35e-05 | sorted | 1.38e+00
0 | 0 | 16384 | 1.26e-04 | 3.52e-04 | 1.92e-04 | 2.14e-04 | sorted | 1.11e+00
0 | 0 | 1638 | 3.63e-05 | 1.27e-04 | 4.43e-05 | 5.05e-05 | sorted | 1.14e+00
0 | 0 | 164 | 1.18e-05 | 7.21e-05 | 1.88e-05 | 2.68e-05 | sorted | 1.42e+00
0 | 0 | 16 | 1.04e-05 | 7.90e-05 | 1.58e-05 | 2.42e-05 | sorted | 1.53e+00
0 | 0 | 2 | 9.88e-06 | 6.13e-05 | 1.61e-05 | 2.35e-05 | sorted | 1.46e+00
0 | 0 | 0 | 8.68e-06 | 6.28e-05 | 1.65e-05 | 2.36e-05 | sorted | 1.43e+00
function spget_unsorted1{Tv,Ti}(A::SparseMatrixCSC{Tv,Ti}, I::AbstractVector, J::AbstractVector)
# Anything for indexing rows.
# This sorts I first then does some trick with constructing the transpose.
(m, n) = size(A)
nI = length(I)
nJ = length(J)
colptrA = A.colptr; rowvalA = A.rowval; nzvalA = A.nzval
nnzS = 0
pI = sortperm(I)
@inbounds I = I[pI]
fI = find(I)
W = zeros(Ti, nI + 1) # Keep row counts
W[1] = 1 # For cumsum later
cacheI = zeros(Int, m)
avgM = div(nnz(A),n)
# heuristics based on experiments
alg = ((nI - avgM) > 2^8) ? 1 :
((avgM - nI) > 2^10) ? 0 : 2
ptrI::Int = 1
if alg == 1 # binary search on I
# count rows
@inbounds for j = 1:nJ
col = J[j]
for ptrA in colptrA[col]:(colptrA[col+1]-1)
cacheI[rowvalA[ptrA]] += 1
end
end
ptrI = 1
# fill cache and count nnz
@inbounds for j = 1:m
cval = cacheI[j]
(cval == 0) && continue
ptrI = searchsortedfirst(I, j, ptrI, nI, Base.Order.Forward)
cacheI[j] = ptrI
while ptrI <= nI && I[ptrI] == j
W[fI[pI[ptrI]]+1] += cval
nnzS += cval
ptrI += 1
end
if ptrI > nI
@simd for i=(j+1):m; @inbounds cacheI[i]=ptrI; end
break
end
end
else # linear search on both rowvalA and I
# Form the structure of the result and compute space
@inbounds for j = 1:nJ
col = J[j]
ptrI = 1
ptrA::Int = colptrA[col]
stopA::Int = colptrA[col+1]
while ptrI <= nI && ptrA < stopA
rowA = rowvalA[ptrA]
rowI = I[ptrI]
if rowI > rowA
ptrA += 1
elseif rowI < rowA
ptrI += 1
else
(cacheI[rowA] == 0) && (cacheI[rowA] = ptrI)
W[fI[pI[ptrI]]+1] += 1
nnzS += 1
ptrI += 1
end
end
end
end
colptrS_T = cumsum(W)
# Populate the values in the result, but transposed
rowvalS_T = Array(Ti, nnzS)
nzvalS_T = Array(Tv, nnzS)
@simd for i=1:nI; @inbounds W[i] = 0; end # Zero out W to store row positions
@inbounds for j = 1:nJ
col = J[j]
ptrA::Int = colptrA[col]
stopA::Int = colptrA[col+1]
while ptrA < stopA
rowA = rowvalA[ptrA]
ptrI = cacheI[rowA]
if ptrI > 0
while ptrI <= nI && I[ptrI] == rowA
rowS = fI[pI[ptrI]]
k = colptrS_T[rowS] + W[rowS]
rowvalS_T[k] = j
nzvalS_T[k] = nzvalA[ptrA]
W[rowS] += 1
ptrI += 1
end
end
ptrA += 1
end
end
# Transpose so that rows are in sorted order and return
S_T = SparseMatrixCSC(nJ, nI, colptrS_T, rowvalS_T, nzvalS_T)
return S_T.'
end
# for every value in B, find the first occurrence in A,
# store it indexed with the value in R.
# R must be initialized to 0 by caller
function match_linear{Ti}(A::Vector{Ti}, startA::Int, endA::Int, B::Vector{Ti}, startB::Int, endB::Int, R::Vector{Int})
alg = 0
nmatch = 0
if alg == 0 #linear
while startA <= endA && startB <= endB
vA = A[startA]
vB = B[startB]
if vA > vB
startB += 1
elseif vA < vB
startA += 1
else
(R[vA] == 0) && (R[vA] = startA)
startB += 1
nmatch += 1
end
end
end
nmatch
end
function permute_rows!{Tv,Ti}(S::SparseMatrixCSC{Tv,Ti}, pI::Vector{Int})
(m, n) = size(S)
colptrS = S.colptr; rowvalS = S.rowval; nzvalS = S.nzval
# preallocate temporary sort space
nr = min(nnz(S), m)
rowperm = Array(Int, nr)
rowvalTemp = Array(Ti, nr)
nzvalTemp = Array(Tv, nr)
@inbounds for j in 1:n
rowrange = colptrS[j]:(colptrS[j+1]-1)
nr = length(rowrange)
(nr > 0) || continue
k = 1
for i in rowrange
rowA = rowvalS[i]
rowvalTemp[k] = pI[rowA]
nzvalTemp[k] = nzvalS[i]
k += 1
end
sortperm!(pointer_to_array(pointer(rowperm), nr), pointer_to_array(pointer(rowvalTemp), nr))
k = 1
for i in rowrange
kperm = rowperm[k]
rowvalS[i] = rowvalTemp[kperm]
nzvalS[i] = nzvalTemp[kperm]
k += 1
end
end
S
end
function spget_unsorted2{Tv,Ti}(A::SparseMatrixCSC{Tv,Ti}, I::AbstractVector, J::AbstractVector)
pI = sortperm(I)
@inbounds I = I[pI]
permute_rows!(Base.SparseMatrix.getindex_I_sorted(A, I, J), pI)
end
function spget_unsorted3{Tv,Ti}(A::SparseMatrixCSC{Tv,Ti}, I::AbstractVector, J::AbstractVector)
# Anything for indexing rows.
# This sorts I first then does some trick with constructing the transpose.
(m, n) = size(A)
nI = length(I)
nJ = length(J)
colptrA = A.colptr; rowvalA = A.rowval; nzvalA = A.nzval
nnzS = 0
pI = sortperm(I)
@inbounds I = I[pI]
fI = find(I)
W = zeros(Ti, nI + 1) # Keep row counts
W[1] = 1 # For cumsum later
# Form the structure of the result and compute space
for j = 1:nJ
@inbounds col = J[j]
ptrI::Int = 1
ptrA::Int = colptrA[col]
stopA::Int = colptrA[col+1]
@inbounds while ptrI <= nI && ptrA < stopA
rowA = rowvalA[ptrA]
rowI = I[ptrI]
if rowI > rowA
ptrA += 1
elseif rowI < rowA
ptrI += 1
else
W[fI[pI[ptrI]]+1] += 1
nnzS += 1
ptrI += 1
end
end
end
colptrS_T = cumsum(W)
# Populate the values in the result, but transposed
rowvalS_T = Array(Ti, nnzS)
nzvalS_T = Array(Tv, nnzS)
@simd for i=1:nI; @inbounds W[i] = 0; end # Zero out W to store row positions
@inbounds for j = 1:nJ
col = J[j]
ptrI::Int = 1
ptrA::Int = colptrA[col]
stopA::Int = colptrA[col+1]
while ptrI <= nI && ptrA < stopA
rowA = rowvalA[ptrA]
rowI = I[ptrI]
if rowI > rowA
ptrA += 1
elseif rowI < rowA
ptrI += 1
else
rowS = fI[pI[ptrI]]
k = colptrS_T[rowS] + W[rowS]
rowvalS_T[k] = j
nzvalS_T[k] = nzvalA[ptrA]
W[rowS] += 1
ptrI += 1
end
end
end
# Transpose so that rows are in sorted order and return
S_T = SparseMatrixCSC(nJ, nI, colptrS_T, rowvalS_T, nzvalS_T)
return S_T.'
end
function compare_methods()
let M = 2^14, N=2^14
Irand = randperm(M)
Jrand = randperm(N)
SA = [1., 0.1, 0.01, 0.001, 0.0001, 0.];
IA = [M, M*0.1, M*0.01, M*0.001, M*0.0001, 0.];
JA = [N, N*0.1, N*0.01, N*0.001, N*0.0001, 0.];
debug = true
if debug
@printf(" S | I | J | sorted | unsorted | diff\n")
@printf(" | | | | lin/bin | sorted | linear | fastest | linear/sorted\n")
end
for d in SA
S = sprand(M, N, d)
for In in IA
I = Irand[1:int(In)]
Isorted = sort(I)
for Jn in JA
J = Jrand[1:int(Jn)]
Jsorted = sort(J)
gc()
ru1 = @timed spget_unsorted1(S, I, J)
gc()
ru2 = @timed spget_unsorted2(S, I, J)
gc()
ru3 = @timed spget_unsorted3(S, I, J)
gc()
rs = @timed S[Isorted, Jsorted]
fastest = ""
diff = 0
if ru1[2] < ru2[2] && ru1[2] < ru3[2]
fastest = "lin/bin"
elseif ru2[2] < ru1[2] && ru2[2] < ru3[2]
fastest = " sorted"
else
fastest = " linear"
end
diff = ru3[2] / ru2[2]
if debug
@printf(" %7d | %7d | %7d | %4.2e | %4.2e | %4.2e | %4.2e | %s | %4.2e\n", int(nnz(S)/S.n), length(I), length(J), rs[2], ru1[2], ru2[2], ru3[2], fastest, diff)
end
end
end
end
end
end
compare_methods()
function unsorted()
M = 2^14;
N = 2^14;
%SA = [1.0, 0.1, 0.01, 0.001, 0.0001, 0.0];
SA = [0.1, 0.01, 0.001, 0.0001, 0.0];
IA = [M, M*0.1, M*0.01, M*0.001, M*0.0001, 0.0];
JA = [N, N*0.1, N*0.01, N*0.001, N*0.0001, 0.0];
%fprintf(1, 'making Irand\n');
Irand = randperm(M);
%fprintf(1, 'making Jrand\n');
Jrand = randperm(N);
fprintf(1, ' S | I | J | sorted | unsorted\n')
for d = SA
%fprintf(1, 'making S\n');
S = sprand(M, N, d);
for In = IA
%fprintf(1, 'making I\n');
I = Irand(1:int64(In));
%fprintf(1, 'sorting I\n');
Isorted = sort(I);
for Jn = JA
%fprintf(1, 'making J\n');
J = Jrand(1:int64(Jn));
%fprintf(1, 'sorting J\n');
Jsorted = sort(J);
%fprintf(1, 'calling with unsorted\n');
tic; S(I, J); ru = toc;
%fprintf(1, 'calling with sorted\n');
tic; S(Isorted, Jsorted); rs = toc;
fprintf(1, ' %7d | %7d | %7d | %4.2e | %4.2e\n', int64(nnz(S)/N), length(I), length(J), rs, ru);
end
end
end
end
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment