Last active
February 11, 2024 20:33
-
-
Save burtonsamograd/f08f561264ff94391300 to your computer and use it in GitHub Desktop.
A quick and dirty tree shaker for SBCL, giving about a 40% reduction in dump size.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
;; -*- mode: lisp -*- | |
;; | |
;; A quick and dirty tree shaker for SBCL. Basically, it destroys the | |
;; package system and does a gc before saving the lisp image. Gives | |
;; about a 40% reduction in image size on a basic hello world test. | |
;; Would like to hear how it works on larger projects. | |
;; | |
;; Original idea from: https://groups.google.com/d/msg/comp.lang.lisp/6zpZsWFFW18/WMy4PyA9B4kJ | |
;; | |
;; Burton Samograd | |
;; burton.samograd AT gmail.com | |
;; License: BSD | |
;; 2015 | |
(defun save-lisp-tree-shake-and-die (file toplevel-fn) | |
"A dumb tree shaker for SBCL written with lots of advice from nyef." | |
(let (packages) | |
(maphash (lambda (package-name package) | |
(setf (sb-impl::package-%use-list package) nil) | |
(setf (sb-impl::package-%used-by-list package) nil) | |
(setf (sb-impl::package-%shadowing-symbols package) nil) | |
(setf (sb-impl::package-internal-symbols package) | |
(sb-impl::make-package-hashtable 0)) | |
(setf (sb-impl::package-external-symbols package) | |
(sb-impl::make-package-hashtable 0)) | |
(setf (sb-impl::package-tables package) #()) | |
(setf (sb-impl::package-%implementation-packages package) nil) | |
(setf (sb-impl::package-%local-nicknames package) nil) | |
(setf (sb-impl::package-%locally-nicknamed-by package) nil) | |
(push package-name packages) | |
(do-symbols (symbol package-name) | |
(unless (keywordp symbol) | |
(sb-impl::%set-symbol-package symbol nil) | |
(unintern symbol) | |
))) | |
sb-impl::*package-names*) | |
(dolist (package packages) | |
(unless (string= "KEYWORD" package) | |
(remhash package sb-impl::*package-names*)))) | |
(gc :full t) | |
(save-lisp-and-die file :executable t :toplevel toplevel-fn)) | |
(defun main () (print "hello world")) | |
(save-lisp-tree-shake-and-die "x" #'main) | |
#| | |
21:07 *** burtons JOIN | |
21:07 *** TOPIC SBCL developer hangout, more dev than help -- but | |
if #lisp can't solve it and it's SBCL specific, | |
we'll try | |
(nikodemus!~nikodemus@cs181056239.pp.htv.fi on Fri | |
Oct 21 08:26:56 2011) | |
21:07 *** NAMES @Krystof _8hzp Adlai akkad ams antoszka Bike Blkt | |
brucem burtons carvite ccl-logbot christop1 chu | |
cojy csziacobus Cymew DeadTrickster dekhoda dougk_ | |
drmeister echo-area fe[nl]ix ferada fikusz fitzsim | |
flip214 foom Hydan Intensity irsol ivan4th ivan`` | |
jdz joshe jsnell kanru karswell les loke luis | |
milosn minion mood_ nicdev nyef oleo__ p_l pchrist | |
Perlboy pkhuong Posterdati psilord PuercoPop | |
Quadrescence reb redline6561 scymtym snafuchs sobel | |
soggybre1d specbot tmh_ wasabiz__ weissschloss | |
White_Flame yauz |3b| | |
21:08 <burtons> after reading this little bit on tree shaking i | |
decided to try it: | |
21:08 <burtons> What our "tree-shaker" did was simply destroy the | |
package system, and then do a | |
21:08 <burtons> GC. Any symbols remaining were put back in the | |
package they were originally | |
21:08 <burtons> in, allowing debugging. | |
21:09 <burtons> after looking around a bit, I found that theres | |
sb-impl::*package-names* that holds the list of | |
packages | |
21:09 *** echo-are` JOIN | |
21:10 <burtons> so, to 'destroy to package system' I tried (clrhash | |
sb-impl::*package-names*) and then (gc t) follwed | |
by a save-lisp-and-die | |
21:10 *** echo-area QUIT Read error: Connection reset by peer | |
21:10 <burtons> i don't think anything was gc'd and the size of the | |
image didn't decrease | |
21:10 <burtons> any other ideas on how one could 'destroy the | |
package system' to start off with a dumb tree | |
shaker? | |
21:10 <nyef> And nothing disappeared because all of the packages | |
were retained by the SYMBOL-PACKAGE slots and the | |
inter-package references? | |
21:11 <burtons> you tell me | |
21:11 <burtons> :) | |
21:11 <burtons> this is pretty much my first time looking into sbcl | |
so it was just a shot in the dark | |
21:12 *** psy_ JOIN | |
21:12 <nyef> If you're going to be messing with the package system, | |
I'd start by looking at SYS:SRC;CODE;PACKAGE.LISP and | |
SYS:SRC;CODE;TARGET-PACKAGE.LISP. | |
21:12 <burtons> already there | |
21:12 <burtons> that's where I found *package-names* | |
21:13 <burtons> i'll look into the symbol-package slots | |
21:13 <nyef> Yes, but note the relations between packages. You'll | |
need to destroy those. And then there's | |
SYMBOL-PACKAGE. And you'll need to NOT destroy the | |
KEYWORD package... | |
21:14 <nyef> Umm... And I don't know if anything would be retained | |
in the infodb. | |
21:14 <nyef> That might only track symbols, not packages. | |
21:14 <burtons> but does the idea seem sound? | |
21:14 <nyef> It's an interesting approach, at least. | |
21:15 <burtons> https://groups.google.com/d/msg/comp.lang.lisp/6zpZsWFFW18/WMy4PyA9B4kJ | |
21:15 <nyef> I'm never quite sure what I think about tree shakers | |
generally, though. | |
21:15 <burtons> sounds like it's what they used to do in CMUCL | |
21:15 <burtons> i just thought it would be a interesting little | |
project to learn a bit about sbcl internals | |
21:15 <nyef> Oh! I have my bandwidth back. | |
21:16 <nyef> Wow, from Pi day twenty years ago? | |
21:16 <burtons> yeah, it's an old post | |
21:17 <burtons> ok, now i'm confused. i'm looking into the | |
symbol-package stuff and the definition of | |
%set-symbol-package is calling | |
%symbol-package... same thing for symbol-package | |
21:18 <burtons> where are symbol-packages stored? in the plist? | |
21:18 <nyef> That gets into compiler magic. | |
21:18 <burtons> nope, not in the plist | |
21:18 <nyef> The compiler knows how to "open code" things like | |
SYMBOL-PACKAGE to produce an (optional) type-check | |
followed by a memory reference. | |
21:19 <burtons> ok | |
21:19 <burtons> not quite sure what that means yet :) | |
21:19 <nyef> If you have a look at | |
SYS:SRC;COMPILER;GENERIC;OBJDEF.LISP you'll find a | |
!DEFINE-PRIMITIVE-OBJECT for SYMBOL. | |
21:20 <nyef> It describes the layout of a symbol in memory, | |
associates it to type tags, adds some magic for | |
allocating symbols, and so on. | |
21:21 <burtons> so that low level | |
21:21 <burtons> would there be a way to destroy the references to | |
symbol packages, or would the compiler have to be | |
modified? | |
21:21 <nyef> The :REF-TRANS and :SET-TRANS slot options tell the | |
compiler that a slot should have read and write | |
accessors generated. | |
21:21 <nyef> Given a symbol, use %SET-SYMBOL-PACKAGE directly if | |
necessary. | |
21:21 <burtons> yup, just found those | |
21:22 <Bike> could you just loop through all symbols uninterning | |
them and then deleting the packages? | |
21:22 <burtons> so if you were to iterate over all the symbols and | |
%set-symbol-package to nil, that would be a start? | |
21:23 <nyef> Bike: Unintern has quite a bit of overhead... | |
21:23 <nyef> Bike: Plus it can prompt name conflicts. | |
21:23 <nyef> (If it's a shadowing symbol, for example.) | |
21:25 <nyef> The last two paragraphs in Rob's post rather sum up my | |
current feeling about smaller lisp executables. | |
21:25 <nyef> Don't try to shake parts of the system loose. Instead, | |
arrange not to link them in in the first place. | |
21:26 <nyef> But that takes quite a bit of prep work to make | |
happen. | |
21:26 <burtons> i think a dumb tree shaker would be better than | |
none for some deployments | |
21:26 <nyef> That's plausible as well. | |
21:26 <burtons> how would one get a list of symbols in a package? | |
21:27 <nyef> Portably or unportably? | |
21:27 <burtons> i don't see anything obvious in the package- | |
21:27 <burtons> unportably | |
21:28 <nyef> Ah. You don't, easily. You basically have to grub them | |
out of the package hashtable. At which point you may | |
as well use one of the standard iterators. | |
21:28 <burtons> i think do-symbols would work | |
21:28 <nyef> Yeah, that'd be the one. | |
21:30 <burtons> slime sure doesn't like my attempt to clear the | |
symbol packages :) | |
21:30 <burtons> doesn't like clearing *package-names* either, but | |
whatever | |
21:32 <nyef> Well, you ARE messing with a critical chunk of the | |
programmer (user) interface. | |
21:32 <burtons> play hard or die trying | |
21:33 <burtons> hmm, clearing the symbol package kills the system | |
21:33 <burtons> ends me up in the debugger right after | |
21:33 <burtons> (maphash (lambda (package-name package) | |
21:33 <burtons> (do-symbols (symbol package-name) | |
21:33 <burtons> (sb-impl::%set-symbol-package symbol | |
nil))) sb-impl::*package-names*) | |
21:33 <nyef> Yeah, not unexpected. You also need to not clear the | |
symbol-package if the symbol is a keyword. | |
21:34 <burtons> ok | |
21:35 <nyef> And you're going to run into a number of issues if you | |
try to do this interactively, since the interactive | |
system relies on packages basically working. | |
21:35 <burtons> so run the code with --eval | |
21:35 <burtons> ? | |
21:35 <nyef> Or --load, or cat | sbcl, or... | |
21:36 <nyef> Or just be prepared to have to re-load your code | |
often. | |
21:36 <nyef> Essentially once per test run, really. | |
21:36 <burtons> ok, not killing keywords doesn't dump me in the | |
debugger, so that's a bit better | |
21:36 <burtons> all symbols are printed as #: now | |
21:37 <burtons> still trying interactively for now | |
21:37 <burtons> i'm sure calling remhash during a maphash is a no | |
no | |
21:38 <nyef> Yeah, I think that might be against the rules. | |
21:38 <nyef> What are you trying to remhash? | |
21:38 *** dougk_ QUIT Ping timeout: 244 seconds | |
21:38 <burtons> the packages from *package-names*, except for the | |
keywords package | |
21:38 *** echo-are` NICK echo-area | |
21:39 <burtons> is it called "keywords"? | |
21:39 <nyef> Don't bother. Create a new hashtable with just keyword | |
and maybe common-lisp... | |
21:39 <nyef> Have a poke around, there should be a symbol | |
*keyword-package* or similar with a direct reference. | |
21:39 <nyef> It's required to be called "KEYWORD", but may have | |
aliases. | |
21:41 *** pacon JOIN | |
21:42 *** pacon QUIT Read error: Connection reset by peer | |
21:42 <burtons> all i see is a "keyword" package | |
21:42 <nyef> There may be some case-mangling going on. | |
21:43 <burtons> i'm just printing the package name strings | |
21:43 *** dougk_ JOIN | |
21:43 <burtons> ok, so i've set all symbol-packages to nil | |
21:43 <burtons> cleared *package-names* | |
21:43 <nyef> Ah, sb-int:*keyword-package*. | |
21:43 <burtons> did a gc, save still nothing | |
21:43 <burtons> no reduction in size i mean | |
21:43 <burtons> there must be other references | |
21:44 <burtons> gc doesn't seem to do anything | |
21:44 <nyef> Yeah, that's the other bit. The entire system sort of | |
hangs together even without the package system. | |
21:45 <burtons> too bad, i was hoping for a good result to a friday | |
night quick hack :) | |
21:45 <burtons> what if all the symbols were uninterened as well | |
21:45 <burtons> wouldn't the compiler already have numeric | |
references to everything? | |
21:46 <nyef> Well, what you are doing IS basically uninterning all | |
of the symbols. | |
21:46 <burtons> by setting the symbol-package to nil? | |
21:46 <nyef> Umm... Oh, right, you're not clearing the package | |
internal-symbol and external-symbol hashtables yet, | |
are you? | |
21:47 <burtons> no | |
21:47 <burtons> i don't think so, just using %set-symbol-package to | |
nil | |
21:47 <nyef> Or breaking the %use-list or %used-by-list in the | |
packages... | |
21:47 <nyef> ... And it looks like the TABLES slot would need | |
resetting. | |
21:48 <burtons> i would think that ripping the packages out of | |
*package-names* would leave them to be collected | |
unless there's other references | |
21:48 <nyef> And even with all of this damage to the package | |
structure, I'm not expecting much. | |
21:48 <burtons> well, the original idea was to destroy the package | |
system :) | |
21:48 *** pacon JOIN | |
21:49 <nyef> IIRC, there's a direct reference to at least the | |
COMMON-LISP package somewhere around, and almost all | |
packages :USE that. | |
21:49 <nyef> And there are forward and backward links between the | |
package objects themselves. | |
21:49 <burtons> i don't think links between packages would matter | |
if the top level references to them is in | |
*package-names* | |
21:50 <nyef> Yes, but they matter as soon as the top-level | |
references are gone. | |
21:50 <burtons> why is that? i would think they would all be | |
dangling after that and the gc could pick them up | |
21:51 <nyef> There are other references to specific packages, such | |
as to the keyword and common-lisp packages, and almost | |
every package will have a reference to and from the | |
common-lisp package. | |
21:51 <burtons> ah, yeah | |
21:52 <burtons> ok, so clear the use-list and used-by-lists | |
21:55 <burtons> hmm, can't setf package-use-list | |
21:55 <nyef> Because it's package-%use-list. | |
21:56 <burtons> thanks. gc still does nothing after doing that as | |
well... | |
21:56 <burtons> time to reset the tables slot | |
21:56 <nyef> Are you calling GC directly? | |
21:57 <burtons> yesa | |
21:57 <nyef> With :FULL T ? | |
21:57 <burtons> no, just (gc t) | |
21:58 <burtons> didn't realize that was a keyword | |
21:59 <nyef> I'm going to have to wish you luck, because I have | |
about a minute left before my DSL modem turns into a | |
pumpkin. | |
21:59 <burtons> ok, hopefully you gave me all the advice i need | |
21:59 <burtons> thanks | |
21:59 <burtons> if it works i'll report back | |
22:00 <nyef> And if it doesn't, you'll be back with more questions, | |
I'm sure. (-: | |
22:00 <nyef> Good night, and good luck. | |
22:00 *** nyef QUIT Quit: G'night all | |
|# |
Version updated to work with SBCL 2.2.2:
(defun save-lisp-tree-shake-and-die (file toplevel-fn)
"A dumb tree shaker for SBCL written with lots of advice from nyef."
(let (packages)
(mapc (lambda (package)
(let ((package-name (package-name package)))
(setf (sb-impl::package-%use-list package) nil)
(setf (sb-impl::package-%used-by-list package) nil)
(setf (sb-impl::package-%shadowing-symbols package) nil)
(setf (sb-impl::package-internal-symbols package)
(sb-impl::make-package-hashtable 0))
(setf (sb-impl::package-external-symbols package)
(sb-impl::make-package-hashtable 0))
(setf (sb-impl::package-tables package) #())
(setf (sb-impl::package-%implementation-packages package) nil)
(setf (sb-impl::package-%local-nicknames package) nil)
(setf (sb-impl::package-%nicknames package) nil)
(push package-name packages)
(do-symbols (symbol package-name)
(unless (keywordp symbol)
(sb-impl::%set-symbol-package symbol nil)
(unintern symbol)))))
(list-all-packages))
(dolist (package-name packages)
(unless (string= "KEYWORD" package-name)
(setf (sb-int:info-gethash package-name
sb-impl::*package-names*)
nil)))
(sb-ext:gc :full t))
(save-lisp-and-die file :executable t :toplevel toplevel-fn :purify t :compression t))
With the last version on an 2.3.1 I get
fatal error encountered in SBCL pid 2060368 tid 2060368:
Unhandled SIGILL at 0x52ac3940.
Welcome to LDB, a low-level debugger for the Lisp runtime environment.
ldb>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Hi,
First of all, if I tried to use it, it's because I'm very much interested :-)
I tried this on https://github.com/Ralt/lxc-wrapper
2 things:
This is using sbcl 1.2.8. I guess I've done something wrong?
Note that I use compression, because it isn't very useful otherwise :-)
I've named a file
save-image.lisp
, and called it usingsbcl --load save-image.lisp
. I slightly modified thesave-lisp-and-die
call and removed thegc
call, since:purify t
does it. Here's what in this file: