Skip to content

Instantly share code, notes, and snippets.

@whilei
Forked from tbrunz/Lua-Closures.lua
Created December 25, 2022 21:08
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save whilei/c00b4f8f9b7b0841cd62a248f92b337e to your computer and use it in GitHub Desktop.
Save whilei/c00b4f8f9b7b0841cd62a248f92b337e to your computer and use it in GitHub Desktop.
A simple Lua app to demonstrate the principles of lexical closures
#! /usr/bin/env lua
--
-- A simple Lua app to demonstrate the principles of lexical closures.
--
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
--
-- An iterator that sorts a table by its keys, in alphabetical order.
--
function pairsByKeys ( alphaTable, sortFunction )
local alphaArray = { }
--[[
The provided table, 'alphaTable', is a Lua table, which means it's an
associative array. AA's store their key-value pairs using hash codes
to improve look-up efficiency. One trade-off in using this method is
that key-value retrieval (using the 'pairs' iterator) will appear in
more or less a "random" order, not alphabetical order.
For this application, we want a loop iterator that *will* successively
return the key-value pairs in alphabetic order by key. To do this, we
will recast 'alphaTable' to make a new Lua 'sequence' (i.e., an ordered
array), where the values of the array, 'alphaArray', are the keys of
the 'alphaTable' table we were given.
Why do this? Because the lua function 'table.sort' sorts arrays, but
only by their values; it can't sort keys. Ergo, we need an array whose
values are the (unsorted) keys of 'alphaTable'.
]]--
for key, _ in pairs( alphaTable ) do
alphaArray[ #alphaArray + 1 ] = key
end
--[[
Now we can sort the table by its keys, using this auxiliary array.
lua's 'table.sort' function sorts arrays using a provided comparison
function. But note that the default sorting function is '<', which
works for alphabetical sorting. So if our interator was called with
no second argument, we'll get an alpha sort, as expected. And if
it's called with a function as a second argument, the iterator will
return key-value pairs using that alternative sorting scheme instead.
]]--
table.sort( alphaArray, sortFunction )
--[[
Now that the keys are sorted, we can provide a function that will
iterate over the *sorted* order of the keys, returning them & their
associated values. The iterator is a closure, a function that
"closes over" three variables that are in the lexical scope of the
returned function defined here (but are not actually defined in the
function itself):
1) The table argument that this function was called with,
2) The sorted array used to control the order of retrieval,
3) An index to keep track of what to return on successive calls.
On each call of the iterator we will pre-increment the index to
make the end compare easy; therefore we must start it at zero.
(Recall that arrays in Lua are 1-based, not 0-based.)
]]--
local index = 0
return function ( )
-- When we've exhausted the table/array, a pre-incremented index
-- will cause the table/array access to return 'nil', which also
-- signals the client using this iterator that its loop is complete.
index = index + 1
-- The keys are in alphabetic order in 'alphaArray', while the
-- values are in a hashed order in 'alphaTable'. Note that Lua
-- is register-based, so the following construct is byte-compiled
-- into an efficient executable.
return alphaArray[index], alphaTable[ alphaArray[index] ]
end
end
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
--
-- An iterator that traverses an input file, returning its words, one by one.
--
function allWords ( filename )
-- Open a file of the given name for reading, so we can count its words.
-- Note that 'io.input' will raise an error for us, should one occur.
io.input( filename )
--[[
Prime the while loop in the returned function below by reading the first
line from the file, and have it start searching for words at position 1
(since strings/arrays are 1-based). Note that the default argument for
'io.read' is "*l", or "read line". The default file that 'io.read' reads
from is set with an earlier call to 'io.input'.
]]--
local line = io.read()
local pos = 1
--[[
What Lua (and other languages, such as Java) call an "iterator" are
really "generators". That is, a "generator" function, such as this,
creates and returns an "iterator" function. Clients will call the
iterator function, typically in a 'for' loop.
Iterators are typically closures; in this case, our iterator closes
over the variables 'line' & 'pos', defined above in *this* function
(which is called only once, to obtain the iterator); these variables
are not defined in the closure being returned, although they can be,
are are, used in the iterator function. This is possible because
these variables exist in the closure's lexical scope at define-time
(even though they will have already gone out of scope at run-time).
These two variables are not global, but are not local to the closure;
they are "non-local variables". They are captured by (and hidden
within) the closure returned. (All functions in Lua are closures.)
]]--
return function ( )
-- Enable looping as long as a line is read from the file. Once
-- the end of file is reached, 'io.read' will return 'nil', which
-- will prevent the 'while' statement from executing.
while line do
-- Locals 'first' & 'last' delimit the position of each word
-- found in the current line of the file. By 'word', we mean
-- a contiguous string of alphanumeric characters, a pattern
-- matched by "%w+". ("%W+" would match strings of non-word
-- characters.) The closure variable 'pos' tells Lua where
-- in the string to start matching word characters.
-- Note that Lua supports multiple return values (very nice)!
-- Also note that we treat 'line' as an object, applying the
-- 'find' method to it by using the ':' operator.
local first, last = line:find( "%w+", pos )
-- Was a word found in this 'line' at or after position 'pos'?
-- If so, 'first' and 'last' will be returned as integers.
-- If not, they will be 'nil' (which equates to boolean false).
if first then
-- Advance the position pointer to the next character after
-- the word, then extract the word string and return it.
pos = last + 1
-- The string function 'sub' returns a sub-string.
return line:sub( first, last )
else
-- A word was not found in (the remainder of) the line; so
-- read another line & reset the position to the beginning
-- of this new line. Then loop to try again. If there are
-- no more lines in the file, 'line' will be 'nil', and
-- this 'while' loop will terminate. By using a 'while'
-- loop, we will skip over multiple blank lines correctly.
line = io.read()
pos = 1
end
end
-- On reaching this point, there is nothing left in the file; so
-- return 'nil' to signal that the loop has completed. Note that
-- this line isn't strictly needed, as a 'return' in Lua without
-- an argument will return 'nil' anyway.
return nil
end
end
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
--
-- Test the 'allWords' iterator (generator) with a simple routine to
-- count all occurrences of each word in a text file.
--
--[[
We start by getting a file to scan -- let's use ourselves! To do that,
we need the Lua equivalent of '${0}' in bash or '__file__' in Python.
Lua doesn't have a global variable for this, however, so we need to call a
function, 'debug.getinfo' and extract a field that holds the filename.
Our call to 'debug.getinfo' below returns the name of the source file ('S')
of the function at the first level ('1') of the call stack. Note that the
filename will begin with an '@' symbol, indicating it is a filename. By
passing '1', we are asking for information about the current function. If
we were to pass in '0', it would return "=[C]", since it would be returning
information about the 'getinfo' function itself (which is defined in an
external C library). More information on this function can be found in the
Debug Library chapter in the "Programming in Lua" book...
]]--
local thisModule = debug.getinfo( 1, 'S' )
-- Don't forget to snip off the '@' sigil from the obtained filename!
local targetFile = thisModule.source:sub( 2 )
-- Collect all the words in the target file as a 'bag of words'.
-- Creating a 'bag' requires us to count the number of occurrences of each
-- key, where the keys are simply the words we find in the file.
WordBag = { }
-- Use the Lua iterator we defined above to retrieve the words from the
-- target file, one by one, dropping each into our bag of words as we go.
for word in allWords( targetFile ) do
-- We don't care about capitalization, so make the word all lowercase.
word = word:lower()
-- If the word hasn't been seen before, create a key with a value of 1;
-- otherwise, increment the value (count) for keys we've seen before:
if not WordBag[ word ] then
WordBag[ word ] = 1
else
WordBag[ word ] = WordBag[ word ] + 1
end
end
-- Now that we've read through the entire file, our bag of words is full.
-- Create a report, sent to 'stdout', formatted as "<word> <count>". The
-- report format is controlled by 'string.format', where the format codes
-- are exactly the same as the 'printf' format codes in the C language.
outputFormat = "%20s %s"
-- And here is the report generation code:
print()
print( string.format( outputFormat, "Word", "Frequency" ) )
print( string.format( outputFormat,
" ----------------", "----------------" ) )
--[[
Now we want to extract each word (key) and its count (value) from our bag of
words. However, the order of the words in the bag (i.e., the order we would
pull them back out in, if we use the 'pairs' iterator) is essentially random.
We need to sort the bag by its keys, in alphabetical order, and print the
results. To do this, we use the custom iterator we defined at the top of this
module to return the bag contents in the desired order.
]]--
for word, count in pairsByKeys( WordBag ) do
print( string.format( outputFormat, word, count ) )
end
print()
-- Note that this report can easily be modified to display the words in the
-- order of the frequency with which they appear in the target file.
-- (This is left as an exercise for the student... ;^)
-- Finally, cLose the file. (Not strictly needed, as the OS will do this.)
-- Note that 'io.input' when called with no arguments, will return the handle
-- of the current input file.
io.input():close()
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment