View cachemiss.txt
output from
mcode dump: p1
7f939af17000 48C7C100E1F505 mov rcx, 0x05f5e100
7f939af17007 48BE2020F8410000. mov rsi, 0x0000000041f82020
7f939af17011 49C7C000000000 mov r8, 0x0
7f939af17018 4E8B04C6 mov r8, [rsi+r8*8]
7f939af1701c 4130C8 xor r8b, cl
7f939af1701f 4883E901 sub rcx, +0x01
View log.html
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<script type="text/javascript" src="jquery.min.js"></script><script type="text/javascript" src="jquery-ui.min.js"></script><script type="text/javascript" src="treebits.js"></script><link rel="stylesheet" href="logfile.css" type="text/css">
<title>Log File</title>
<h1>VM build log</h1>
<p><a href="javascript:" class="logTreeExpandAll">Expand all</a> |
View readme.txt
$ sudo ./snabb snsh -jdump=+rs,x -jp=vi1 counter2.lua
100% TRACE 3 ->loop counter2.lua:5
$ sudo ./snabb snsh -jdump=+rs,x -jtprof counter2.lua
traceprof report (recorded 10567/10567 samples):
62% TRACE 3 counter2.lua:5
26% TRACE 4 (3/5) counter2.lua:9
10% TRACE 3:LOOP counter2.lua:5
View luajit-profiler-separate-loops.patch
diff --git a/lib/luajit/src/jit/p.lua b/lib/luajit/src/jit/p.lua
index d894bb7..b619517 100644
--- a/lib/luajit/src/jit/p.lua
+++ b/lib/luajit/src/jit/p.lua
@@ -71,7 +71,8 @@ local map_vmmode = {
-- Profiler callback.
-local function prof_cb(th, samples, vmmode)
+local function prof_cb(th, samples, vmmode, ip)
if [ "$1" == "1" ]; then
elif [ "$1" == "2" ]; then
View 0checksum.lua
local ffi = require("ffi")
local C = ffi.C
local pmu = require("lib.pmu")
-- Use /etc/passwd contents as a dummy packet
local data = core.lib.readfile("/etc/passwd", "*all")
local ptr = ffi.cast("unsigned char *", data)
local len = #data
local loops = 100000
View readme.txt
CPU PMU (Performance Monitoring Unit) support
I have geeked out on a new piece of hardware :-)
This time it is the Performance Monitoring Unit built into the CPU. This is a hardware capability to track fine-grained events inside the processor and give visibility into things like cache misses, branch mispredictions, utilization of internal CPU resources,
It turns out that you only need two special CPU instructions to drive this - WRMSR to set up a counter, RDPMC to read it - and a simple but interesting benchmarking tool is only 500 lines of code. This was also a good opportunity to use our new ability to write Lua code that generates machine code at runtime.
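
To make the "set up a counter" step a bit more concrete, here is a rough sketch (in Python, for illustration only) of how the value written with WRMSR to an event-select register can be assembled. The bit layout and the LLC-miss event code follow the architectural performance monitoring description in the Intel SDM; the function name `perfevtsel` and its parameters are made up for this example and are not taken from pmu.lua. Actually writing the MSR requires ring 0 (or something like the Linux msr driver), which this sketch does not attempt.

```python
# Bit positions in IA32_PERFEVTSELx, per the Intel SDM
# (architectural performance monitoring).
USR = 1 << 16   # count while running in user mode (ring 3)
OS = 1 << 17    # count while running in kernel mode (ring 0)
EN = 1 << 22    # enable the counter

IA32_PERFEVTSEL0 = 0x186  # MSR address of the first event-select register


def perfevtsel(event, umask, user=True, kernel=False):
    """Build an event-select value (hypothetical helper, not from pmu.lua).

    event: 8-bit event number, umask: 8-bit unit mask.
    """
    if not 0 <= event <= 0xFF:
        raise ValueError("event must fit in 8 bits")
    if not 0 <= umask <= 0xFF:
        raise ValueError("umask must fit in 8 bits")
    value = event | (umask << 8) | EN
    if user:
        value |= USR
    if kernel:
        value |= OS
    return value


# Architectural "LLC misses" event: event 0x2E, umask 0x41.
llc_misses = perfevtsel(0x2E, 0x41)
print(hex(llc_misses))  # 0x41412e
```

Once a value like this has been written to `IA32_PERFEVTSEL0`, the matching counter can be sampled with RDPMC (counter index 0 in ECX) before and after the code under test, and the difference is the event count.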
View pmu.lua
-- This module counts and reports on CPU events such as cache misses,
-- branch mispredictions, utilization of internal CPU resources such
-- as execution units, and so on.
-- Hundreds of low-level counters are available. The exact list
-- depends on CPU model. See pmu_cpu.lua for our definitions.
-- API:
View 0readme
This is a quick prototype for generating a Lua module that describes Intel's Performance Monitoring Counters.
First download the raw data:
wget -r --no-parent
Then run the script (attached) to get the output (attached).
Looks like this compiles with "luajit -b" to 211KB of object code. Shrinks to 16KB with gzip.