@carnaval
Created September 11, 2015 11:43
julia> function f(x::Float64)
           @iaca for i = 1:1000
               x = x*x - 2*x
           end
           x
       end
f (generic function with 2 methods)
julia> println(analyze(f, Tuple{Float64}))
Intel(R) Architecture Code Analyzer Version - 2.1
Analyzed File - /tmp/tmpoiDgjt
Binary Format - 64Bit
Architecture - HSW
Analysis Type - Throughput
Throughput Analysis Report
--------------------------
Block Throughput: 8.50 Cycles Throughput Bottleneck: InterIteration
Port Binding In Cycles Per Iteration:
---------------------------------------------------------------------------------------
| Port | 0 - DV | 1 | 2 - D | 3 - D | 4 | 5 | 6 | 7 |
---------------------------------------------------------------------------------------
| Cycles | 1.5 0.0 | 1.5 | 0.5 0.5 | 0.5 0.5 | 0.0 | 1.5 | 1.5 | 0.0 |
---------------------------------------------------------------------------------------
N - port number or number of cycles resource conflict caused delay, DV - Divider pipe (on port 0)
D - Data fetch pipe (on ports 2 and 3), CP - on a critical path
F - Macro Fusion with the previous instruction occurred
* - instruction micro-ops not bound to a port
^ - Micro Fusion happened
# - ESP Tracking sync uop was issued
@ - SSE instruction followed an AVX256 instruction, dozens of cycles penalty is expected
! - instruction not supported, was not accounted in Analysis
| Num Of |                    Ports pressure in cycles                     |    |
|  Uops  |  0  - DV  |  1  |  2  -  D  |  3  -  D  |  4  |  5  |  6  |  7  |    |
---------------------------------------------------------------------------------
|   1    |           |     |           |           |     | 1.0 |     |     |    | mov eax, 0x3e8
|   1    |           |     |           |           |     | 0.5 | 0.5 |     |    | mov rcx, 0x7ffde31fe000
|   1    |           |     | 0.5   0.5 | 0.5   0.5 |     |     |     |     |    | vmovsd xmm1, qword ptr [rcx]
|   0*   |           |     |           |           |     |     |     |     |    | nop dword ptr [rax], eax
|   1    | 0.9       | 0.1 |           |           |     |     |     |     | CP | vmulsd xmm2, xmm0, xmm0
|   1    | 0.6       | 0.4 |           |           |     |     |     |     | CP | vmulsd xmm0, xmm0, xmm1
|   1    |           | 1.0 |           |           |     |     |     |     | CP | vaddsd xmm0, xmm2, xmm0
|   1    |           |     |           |           |     |     | 1.0 |     |    | add rax, 0xffffffffffffffff
|   0F   |           |     |           |           |     |     |     |     |    | jnz 0xfffffffffffffff0
Total Num Of Uops: 7
julia>
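
The report attributes the 8.50-cycle block throughput to an InterIteration bottleneck, and the three instructions marked CP form the loop-carried chain through xmm0. As a rough sanity check, the sketch below adds up that chain. It assumes Haswell latencies of 5 cycles for vmulsd and 3 cycles for vaddsd; these figures are assumptions brought in for illustration and are not part of the IACA output. Both multiplies read the xmm0 value from the previous iteration, so they can overlap, and only the slower of the two counts toward the chain.

```julia
# Assumed Haswell instruction latencies in cycles (illustrative values,
# not taken from the IACA report).
const LATENCY = Dict("vmulsd" => 5, "vaddsd" => 3)

# Both vmulsd instructions depend only on the previous iteration's xmm0,
# so they run in parallel; vaddsd must wait for both results.
mul_stage = max(LATENCY["vmulsd"], LATENCY["vmulsd"])
chain_latency = mul_stage + LATENCY["vaddsd"]

println("estimated loop-carried latency: ", chain_latency, " cycles")
```

Under these assumptions the estimate is 8 cycles per iteration, close to the 8.50 cycles IACA reports and well above the 1.5 cycles of pressure on the busiest ports, which is consistent with a latency-bound rather than port-bound loop.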