checkglobals patch

Detailed Description: checkglobals module + patch

The checkglobals module + patch (from LuaPowerPatches; the patch targets Lua 5.1.3) is a hybrid of the compile-time and run-time approaches to detecting undefined variables. Consider the following trivial Lua module:

-- multiplybyx.lua
local function multiplybyx(y)
  return y * X                -- is X defined???
end
return multiplybyx

Is this code valid? Did we mistype x as X? We can detect at compile time that X is a global variable, but whether X is a defined global variable cannot, in general, be known until run-time:

-- main.lua
local multiplybyx = dofile 'multiplybyx.lua'
X = 2
print(multiplybyx(5))      -- multiplybyx is valid
X = nil
print(multiplybyx(5))      -- multiplybyx is now not valid

So, we'll define a function checkglobals that determines whether all the globals "directly" referenced lexically inside the code of a given function (e.g. multiplybyx) are defined at the time checkglobals is called:

-- main.lua
local checkglobals = require 'checkglobals'
local multiplybyx = dofile 'multiplybyx.lua'
X = 2
checkglobals(multiplybyx)  -- ok: multiplybyx is valid
print(multiplybyx(5))
X = nil
checkglobals(multiplybyx)  -- fails: multiplybyx is not valid
print(multiplybyx(5))


$ lua main.lua
10
lua: main.lua:8: accessed undefined variable "X" at line 3
stack traceback:
        [C]: in function 'error'
        etc/checkglobals.lua:77: in function 'checkglobals'
        main.lua:8: in main chunk
        [C]: ?

The function checkglobals(f) operates by retrieving the environment table (env) of function f (known at run-time) and the list of all global get and set bytecodes (GETGLOBAL and SETGLOBAL) lexically inside f (known at compile-time). For each such get or set of a global named varname, checkglobals verifies that env[varname] ~= nil. If this check fails, checkglobals raises an error. Unless the code was stripped (i.e. luac -s), the error also contains the line number at which the global variable was accessed.

The checkglobals function accepts some additional parameters that make it more flexible; these are documented in the comments at the top of the source. The implementation of this module (on the Lua side) is basically this:

local function checkglobals(f, env)
  local fp = f or 1
  if type(fp) == 'number' then fp = fp + 1 end
  env = env or getfenv(2)
  local gref = getinfo(fp, 'g').globals
  for i=1,#gref,gref.ncols do
    local op,name,linenum = unpack(gref, i,i+2)
    if env[name] == nil then
      error('accessed undefined variable "' .. name .. '"' ..
            (linenum and ' at line ' .. linenum or ''), 2)
    end
  end
  return f
end

This code makes use of a patched debug.getinfo that supports a new "g" ("globals") option that returns the list of all globals accessed lexically inside the given function (including functions lexically nested inside that function). gref = getinfo(fp, 'g').globals is an array. For each global accessed, the following values are appended to the array: the access type ("GETGLOBAL" or "SETGLOBAL"), the variable name (as a string), and the line number (if source was not stripped). There is also a field gref.ncols equal to the number of columns (2 or 3) represented in the flat array.
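
The flat layout described above can be walked with a stride of ncols. The following is a minimal sketch that runs on an unpatched Lua: the gref table is written by hand to stand in for what the patched debug.getinfo would return, and find_undefined and the sample env are hypothetical names used only for illustration.

```lua
-- Hand-built stand-in for getinfo(f, 'g').globals (assumed layout:
-- op, name, line number repeated, plus an ncols field).
local gref = {
  'GETGLOBAL', 'print', 2,
  'GETGLOBAL', 'X',     3,
  'SETGLOBAL', 'count', 4,
  ncols = 3,
}

-- Hypothetical helper: collect every referenced global missing from env,
-- instead of raising an error on the first one as checkglobals does.
local function find_undefined(gref, env)
  local missing = {}
  for i = 1, #gref, gref.ncols do
    local op, name = gref[i], gref[i + 1]
    -- the line number column exists only when ncols is 3
    local linenum = gref.ncols == 3 and gref[i + 2] or nil
    if env[name] == nil then
      missing[#missing + 1] = { op = op, name = name, line = linenum }
    end
  end
  return missing
end

local env = { print = print, count = 0 }
local missing = find_undefined(gref, env)
for _, m in ipairs(missing) do
  print(m.op, m.name, m.line)   -- only X is reported here
end
```

Collecting all misses rather than stopping at the first is a design choice of this sketch, not of the module itself.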

Below are some examples of possible ways to use the module:

Example:

-- factorial.lua
function factorial(k)
  if k == 1 then
    return K            -- oops!
  else
    return k * factorial(k-1)
  end
end

function main()
  print(factorial(10))
end

require 'checkglobals' ()  -- fails since K is undefined

main()

Example:

-- factorial.lua
require 'checkglobals' ()  -- fails since K is undefined

-- note: no new globals can be "directly" defined beyond this point
-- (though defining them via _G or getfenv() is ok).

local function factorial(k)
  if k == 1 then
    return K            -- oops!
  else
    return k * factorial(k-1)
  end
end

local function main()
  print(factorial(10))
end

main()

Example:

-- factorial.lua
local M = {}

local function factorial(k)
  if k == 1 then
    return K            -- oops!
  else
    return k * factorial(k-1)
  end
end
M.factorial = factorial

require 'checkglobals' ()

return M

Note the patch made to Lua's debug facilities (ldebug.c and ldblib.c). The patch is rather simple and quite isolated. It only makes additions (no deletions) to lua_getinfo and debug.getinfo to support the new "g" ("globals") option.

The new "g" option may have uses elsewhere, so this might be a useful addition to Lua's debug module. The list of globals that a function accesses can be considered part of the function's interface, which is a fundamental aspect of what the function is. Reflection/introspection is largely about accessing information on interfaces.

This "g" option may alternatively be defined in terms of lhf's bytecode inspector library, lbci (http://www.tecgraf.puc-rio.br/~lhf/ftp/lua/#lbci). See checkglobals-lbci.lua.

What are the advantages/disadvantages/caveats to checkglobals? Here are some qualities of it:

  • This code is intended to be simple, robust and suitable for general use, with semantics that are fairly easy to understand and free of corner cases.
  • It detects global accesses in code that is never executed (similar to static analysis approaches).
  • It does not mess with environment metatables (like strict.lua) that can potentially cause obscure conflicts.
  • It makes weaker assumptions about global variable defined-ness than the static analysis approaches, though it makes stronger assumptions than the "strict" approach. Mainly, it assumes that globals aren't created or destroyed between the time checkglobals is called and the time the validated function is called. Note that you may call checkglobals more than once (e.g. after creating new globals).
  • The checks are normally applied just after code loading (not off-line as with luac -p -l, nor on each variable access as with strict.lua), though they may be done later or more frequently.
  • The checkglobals approach may be combined with the strict approach for the strongest validation.
  • It requires a patch to lua_getinfo and debug.getinfo to support the new "g" ("globals") option used by checkglobals.lua. This patch is entirely backwards compatible and rather isolated and it might be useful for other purposes as well.
  • checkglobals is written entirely in Lua and can be customized.
  • checkglobals (like the static analysis approaches) assumes that a function has a single, non-changing environment. It also assumes that lexically nested functions have the same environment as the parent function, although this restriction might be relaxed with an additional parameter that causes checkglobals to ignore lexically nested functions: checkglobals(f,env,'norecurse'); that will also require an extension to the debug.getinfo patch. See LuaList:2008-03/msg00598 for details.
  • As this approach as written is new (I believe), its design qualities may still need to be verified in practice.
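
To make the contrast with the "strict" approach concrete, here is a minimal sketch of a strict-style environment. It is not the actual strict.lua: make_strict_env is a hypothetical helper, and it is applied to a plain table rather than to _G so that it runs on any Lua version. The point it illustrates is that the undefined access is caught only when the offending line actually executes, whereas checkglobals would report it just after loading.

```lua
-- Hypothetical helper: a table that raises an error when an undefined key is read.
local function make_strict_env(defined)
  return setmetatable(defined, {
    __index = function(_, name)
      error('accessed undefined variable "' .. tostring(name) .. '"', 2)
    end,
  })
end

local env = make_strict_env({ X = 2 })

-- The typo (K instead of X) sits on a branch that is rarely taken.
local function scale(y, rare)
  if rare then return y * env.K end
  return y * env.X
end

local ok_common = pcall(scale, 5, false)    -- common path: no error raised
local ok_rare, err = pcall(scale, 5, true)  -- rare path: error surfaces only now
print(ok_common, ok_rare, err)
```

Because the metatable lives on the environment table, any other code that also sets a metatable on the same table can conflict with it, which is the kind of interaction checkglobals avoids.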

See also mail list discussion: LuaList:2008-03/msg00440.html .

--DavidManura

(P.S. I no longer use this but rather prefer semantically aware Lua text editors.)

diff -urN lua-5.1.3/etc/checkglobals.lua lua-5.1.3-checkglobals/etc/checkglobals.lua
--- lua-5.1.3/etc/checkglobals.lua 1969-12-31 19:00:00.000000000 -0500
+++ lua-5.1.3-checkglobals/etc/checkglobals.lua 2008-03-22 22:46:19.227500000 -0400
@@ -0,0 +1,87 @@
+-- checkglobals.lua
+-- Undeclared global variable detection for Lua.
+--
+-- This module consists of and returns a single function:
+--
+-- f = checkglobals(f, env)
+--
+-- In short, checkglobals validates that the function f uses only
+-- global variables defined in the table env.
+--
+-- Often, checkglobals() is called without arguments. If f is
+-- unspecified (nil), the calling function is used. If f is a number,
+-- the function at stack level f is used (1 is the calling function).
+-- If env is unspecified (nil), the environment of the calling
+-- function is used.
+--
+-- The test passes only if all global variables "directly" read from
+-- or written to lexically inside the function f (including functions
+-- lexically nested in f) exist in the table env. That is,
+-- env[varname] ~= nil for variable with name varname. Accesses to
+-- globals "indirectly" via _G or getfenv() don't count.
+--
+-- On success, returns f. On failure, raises error. The error
+-- contains a line number unless the source was stripped (luac -s).
+--
+-- This module requires a patched version of Lua that makes minor
+-- additions to ldebug.c (lua_getinfo 'g' option) and ldblib.c
+-- (debug.getinfo 'globals' field). Internally, it retrieves
+-- GETGLOBAL and SETGLOBAL bytecodes.
+--
+-- This module can be used in various ways including...
+--
+-- Usage mode #1: Define globals, then check.
+-- -- foo.lua
+-- x = 1
+-- function foo() x = x + 1; print(x) end
+-- function bar() X = X + 1; print(X) end -- oops!
+-- foo()
+-- require 'checkglobals' ()
+--
+-- Usage mode #2: Check, then define only locals.
+-- -- foo.lua
+-- require 'checkglobals' ()
+-- local x = 1
+-- local function foo() x = x + 1; print(x) end
+-- local function bar() X = X + 1; print(X) end -- oops!
+-- foo()
+--
+-- Usage mode #3: Check specified function.
+-- -- foo.lua
+-- local checkglobals = require 'checkglobals'
+-- function foo()
+-- print(mAtH.pi) -- oops!
+-- end
+-- checkglobals(foo)
+-- foo()
+--
+-- David Manura, 2008. Licensed under the same terms as Lua itself
+-- (MIT License).
+
+
+-- copy in case a sandbox removes these
+local getinfo = debug.getinfo
+local unpack = unpack
+local type = type
+local getfenv = getfenv
+local error = error
+
+local function checkglobals(f, env)
+ local fp = f or 1
+ if type(fp) == 'number' then fp = fp + 1 end
+ env = env or getfenv(2)
+ local gref = getinfo(fp, 'g').globals
+ for i=1,#gref,gref.ncols do
+ local op,name,linenum = unpack(gref, i,i+2)
+ if env[name] == nil then
+ error('accessed undefined variable "' .. name .. '"' ..
+ (linenum and ' at line ' .. linenum or ''),
+ 2)
+ end
+ end
+ return f
+end
+
+checkglobals() -- check oneself :)
+
+return checkglobals
diff -urN lua-5.1.3/src/ldblib.c lua-5.1.3-checkglobals/src/ldblib.c
--- lua-5.1.3/src/ldblib.c 2008-01-21 08:11:21.000000000 -0500
+++ lua-5.1.3-checkglobals/src/ldblib.c 2008-03-22 21:56:56.227500000 -0400
@@ -136,6 +136,11 @@
treatstackoption(L, L1, "activelines");
if (strchr(options, 'f'))
treatstackoption(L, L1, "func");
+
+ /* PATCH - checkglobals */
+ if (strchr(options, 'g'))
+ treatstackoption(L, L1, "globals");
+
return 1; /* return table */
}
diff -urN lua-5.1.3/src/ldebug.c lua-5.1.3-checkglobals/src/ldebug.c
--- lua-5.1.3/src/ldebug.c 2007-12-28 10:32:23.000000000 -0500
+++ lua-5.1.3-checkglobals/src/ldebug.c 2008-03-22 21:59:35.399375000 -0400
@@ -219,6 +219,10 @@
}
break;
}
+
+ /* PATCH - checkglobals */
+ case 'g':
+
case 'L':
case 'f': /* handled by lua_getinfo */
break;
@@ -229,6 +233,31 @@
}
+/* PATCH - checkglobals */
+static void auxgetinfoglobals(lua_State *L, Proto *p, Table *t, int *c) {
+ TValue *k = p->k;
+ int j;
+ for (j = 0; j < p->sizecode; j++) {
+ const Instruction i = p->code[j];
+ OpCode op = GET_OPCODE(i);
+ const TValue *ts;
+ if (op != OP_GETGLOBAL && op != OP_SETGLOBAL)
+ continue;
+ ts = k+GETARG_Bx(i);
+ lua_assert(ttisstring(ts));
+ setobj2t(L, luaH_setnum(L, t, (*c)++),
+ (L->top - (op == OP_GETGLOBAL ? 2 : 1)));
+ setobj2t(L, luaH_setnum(L, t, (*c)++), ts);
+ if(p->lineinfo) {
+ setnvalue(luaH_setnum(L, t, (*c)++), p->lineinfo[j]);
+ }
+ }
+ for (j = 0; j < p->sizep; j++) { /* lexically nested functions */
+ auxgetinfoglobals(L, p->p[j], t, c);
+ }
+}
+
+
LUA_API int lua_getinfo (lua_State *L, const char *what, lua_Debug *ar) {
int status;
Closure *f = NULL;
@@ -254,6 +283,22 @@
}
if (strchr(what, 'L'))
collectvalidlines(L, f);
+
+ /* PATCH - checkglobals */
+ if (strchr(what, 'g')) {
+ lua_newtable(L);
+ if (f != NULL && !f->c.isC) {
+ Table *t = hvalue(L->top-1);
+ int c = 1;
+ lua_pushnumber(L, f->l.p->lineinfo ? 3 : 2);
+ lua_setfield(L, -2, "ncols");
+ lua_pushliteral(L, "GETGLOBAL");
+ lua_pushliteral(L, "SETGLOBAL");
+ auxgetinfoglobals(L, f->l.p, t, &c);
+ lua_pop(L, 2);
+ }
+ }
+
lua_unlock(L);
return status;
}
diff -urN lua-5.1.3/test/checkglobalstest.lua lua-5.1.3-checkglobals/test/checkglobalstest.lua
--- lua-5.1.3/test/checkglobalstest.lua 1969-12-31 19:00:00.000000000 -0500
+++ lua-5.1.3-checkglobals/test/checkglobalstest.lua 2008-03-22 22:46:56.821250000 -0400
@@ -0,0 +1,55 @@
+_=[[--filtered
+
+-- Test suite for checkglobals.lua.
+-- D.Manura, 2008.
+
+package.path = package.path .. ';etc/?.lua'
+
+local checkglobals = require 'checkglobals'
+
+-- some utility functions for the test suite.
+local function fail(f)
+ local ok,msg = pcall(f)
+ if ok then error('fail expected', 2) end
+end
+local function pass(f) f() end
+local function env(f)
+ local newenv = setmetatable({}, {__index=_G})
+ return setfenv(f, newenv)
+end
+
+-- empty function
+checkglobals(function() end)
+
+-- basic tests, check outside
+checkglobals(function() print() end)
+checkglobals(function() foo() end, {foo=print})
+fail<< checkglobals(function() print() end, {}) >>
+pass<<
+ local G = _G
+ G.setfenv(1, {print = print})
+ checkglobals(function() print() end)
+ fail<< checkglobals(function() print(math.pi) end) >>
+>>
+pass<< checkglobals(function() print = 1 end) >>
+fail<< checkglobals(function() prinT = 1 end) >>
+
+-- basic tests, check inside
+pass(env<< checkglobals() >>)
+pass(env<< checkglobals(nil, 1) >>)
+pass(env<< x = 1; y=math.sqrt(x); checkglobals() >>)
+fail(env<< checkglobals(); x = 1 >>)
+fail(env<< checkglobals(nil, 1); x = 1 >>)
+fail(env<< if false then x = 1 end; checkglobals() >>)
+
+-- nested functions
+pass<< checkglobals(function() return function() print() end end) >>
+fail<< checkglobals(function() return function() print() end end, {}) >>
+
+print 'DONE'
+
+]]
+_=_:gsub('<<', ' (function() '):gsub('>>', ' end) ')
+assert(loadstring(_))()
+-- The above source filters itself to allow << ... >> as shorthand
+-- for (function() ... end).
-- Alternative implementation of checkglobals using
-- lhf's bytecode inspector library (lbci) [http://www.tecgraf.puc-rio.br/~lhf/ftp/lua/#lbci]

local getinstruction = inspector.getinstruction
local getfunction = inspector.getfunction
local getconstant = inspector.getconstant
local inf = math.huge
local type = type

local function auxgetglobals(f, gref)
  for i=1,inf do
    local linenum,op,_,idx = getinstruction(f, i)
    if not op then break end
    if op == 'GETGLOBAL' or op == 'SETGLOBAL' then
      local name = getconstant(f, -idx)
      gref[#gref+1] = op
      gref[#gref+1] = name
      gref[#gref+1] = linenum -- may be nil
    end
  end
  for i=1,inf do
    local f2 = getfunction(f,i)
    if not f2 then break end
    auxgetglobals(f2, gref)
  end
end

local function getglobals(f)
  local gref = {}
  auxgetglobals(f, gref)
  local haslines = type(gref[3]) == 'number'
  gref.ncols = haslines and 3 or 2
  return gref
end

local orig_getinfo = debug.getinfo
function debug.getinfo(a,b,c)
  local thread,f,what
  if type(a) == 'thread' then
    thread,f,what = a,b,c
  else
    f,what = a,b
  end
  if type(f) == 'number' then f = f + 1 end
  local globals
  if what and what:find 'g' then
    what = what:gsub('g', '')
    local fp = type(f) == 'number' and orig_getinfo(f, 'f').func or f
    globals = getglobals(fp)
  end
  local t = thread and orig_getinfo(thread, f, what) or
                       orig_getinfo(        f, what)
  t.globals = globals
  return t
end