Skip to content

Instantly share code, notes, and snippets.

@burtonsamograd
Last active February 11, 2024 20:33
Show Gist options
  • Save burtonsamograd/f08f561264ff94391300 to your computer and use it in GitHub Desktop.
Save burtonsamograd/f08f561264ff94391300 to your computer and use it in GitHub Desktop.
A quick and dirty tree shaker for SBCL, giving about a 40% reduction in dump size.
;; -*- mode: lisp -*-
;;
;; A quick and dirty tree shaker for SBCL. Basically, it destroys the
;; package system and does a gc before saving the lisp image. Gives
;; about a 40% reduction in image size on a basic hello world test.
;; Would like to hear how it works on larger projects.
;;
;; Original idea from: https://groups.google.com/d/msg/comp.lang.lisp/6zpZsWFFW18/WMy4PyA9B4kJ
;;
;; Burton Samograd
;; burton.samograd AT gmail.com
;; License: BSD
;; 2015
(defun save-lisp-tree-shake-and-die (file toplevel-fn)
"A dumb tree shaker for SBCL written with lots of advice from nyef."
(let (packages)
(maphash (lambda (package-name package)
(setf (sb-impl::package-%use-list package) nil)
(setf (sb-impl::package-%used-by-list package) nil)
(setf (sb-impl::package-%shadowing-symbols package) nil)
(setf (sb-impl::package-internal-symbols package)
(sb-impl::make-package-hashtable 0))
(setf (sb-impl::package-external-symbols package)
(sb-impl::make-package-hashtable 0))
(setf (sb-impl::package-tables package) #())
(setf (sb-impl::package-%implementation-packages package) nil)
(setf (sb-impl::package-%local-nicknames package) nil)
(setf (sb-impl::package-%locally-nicknamed-by package) nil)
(push package-name packages)
(do-symbols (symbol package-name)
(unless (keywordp symbol)
(sb-impl::%set-symbol-package symbol nil)
(unintern symbol)
)))
sb-impl::*package-names*)
(dolist (package packages)
(unless (string= "KEYWORD" package)
(remhash package sb-impl::*package-names*))))
(gc :full t)
(save-lisp-and-die file :executable t :toplevel toplevel-fn))
(defun main () (print "hello world"))
(save-lisp-tree-shake-and-die "x" #'main)
#|
21:07 *** burtons JOIN
21:07 *** TOPIC SBCL developer hangout, more dev than help -- but
if #lisp can't solve it and it's SBCL specific,
we'll try
(nikodemus!~nikodemus@cs181056239.pp.htv.fi on Fri
Oct 21 08:26:56 2011)
21:07 *** NAMES @Krystof _8hzp Adlai akkad ams antoszka Bike Blkt
brucem burtons carvite ccl-logbot christop1 chu
cojy csziacobus Cymew DeadTrickster dekhoda dougk_
drmeister echo-area fe[nl]ix ferada fikusz fitzsim
flip214 foom Hydan Intensity irsol ivan4th ivan``
jdz joshe jsnell kanru karswell les loke luis
milosn minion mood_ nicdev nyef oleo__ p_l pchrist
Perlboy pkhuong Posterdati psilord PuercoPop
Quadrescence reb redline6561 scymtym snafuchs sobel
soggybre1d specbot tmh_ wasabiz__ weissschloss
White_Flame yauz |3b|
21:08 <burtons> after reading this little bit on tree shaking i
decided to try it:
21:08 <burtons> What our "tree-shaker" did was simply destroy the
package system, and then do a
21:08 <burtons> GC. Any symbols remaining were put back in the
package they were originally
21:08 <burtons> in, allowing debugging.
21:09 <burtons> after looking around a bit, I found that theres
sb-impl::*package-names* that holds the list of
packages
21:09 *** echo-are` JOIN
21:10 <burtons> so, to 'destroy to package system' I tried (clrhash
sb-impl::*package-names*) and then (gc t) follwed
by a save-lisp-and-die
21:10 *** echo-area QUIT Read error: Connection reset by peer
21:10 <burtons> i don't think anything was gc'd and the size of the
image didn't decrease
21:10 <burtons> any other ideas on how one could 'destroy the
package system' to start off with a dumb tree
shaker?
21:10 <nyef> And nothing disappeared because all of the packages
were retained by the SYMBOL-PACKAGE slots and the
inter-package references?
21:11 <burtons> you tell me
21:11 <burtons> :)
21:11 <burtons> this is pretty much my first time looking into sbcl
so it was just a shot in the dark
21:12 *** psy_ JOIN
21:12 <nyef> If you're going to be messing with the package system,
I'd start by looking at SYS:SRC;CODE;PACKAGE.LISP and
SYS:SRC;CODE;TARGET-PACKAGE.LISP.
21:12 <burtons> already there
21:12 <burtons> that's where I found *package-names*
21:13 <burtons> i'll look into the symbol-package slots
21:13 <nyef> Yes, but note the relations between packages. You'll
need to destroy those. And then there's
SYMBOL-PACKAGE. And you'll need to NOT destroy the
KEYWORD package...
21:14 <nyef> Umm... And I don't know if anything would be retained
in the infodb.
21:14 <nyef> That might only track symbols, not packages.
21:14 <burtons> but does the idea seem sound?
21:14 <nyef> It's an interesting approach, at least.
21:15 <burtons> https://groups.google.com/d/msg/comp.lang.lisp/6zpZsWFFW18/WMy4PyA9B4kJ
21:15 <nyef> I'm never quite sure what I think about tree shakers
generally, though.
21:15 <burtons> sounds like it's what they used to do in CMUCL
21:15 <burtons> i just thought it would be a interesting little
project to learn a bit about sbcl internals
21:15 <nyef> Oh! I have my bandwidth back.
21:16 <nyef> Wow, from Pi day twenty years ago?
21:16 <burtons> yeah, it's an old post
21:17 <burtons> ok, now i'm confused. i'm looking into the
symbol-package stuff and the definition of
%set-symbol-package is calling
%symbol-package... same thing for symbol-package
21:18 <burtons> where are symbol-packages stored? in the plist?
21:18 <nyef> That gets into compiler magic.
21:18 <burtons> nope, not in the plist
21:18 <nyef> The compiler knows how to "open code" things like
SYMBOL-PACKAGE to produce an (optional) type-check
followed by a memory reference.
21:19 <burtons> ok
21:19 <burtons> not quite sure what that means yet :)
21:19 <nyef> If you have a look at
SYS:SRC;COMPILER;GENERIC;OBJDEF.LISP you'll find a
!DEFINE-PRIMITIVE-OBJECT for SYMBOL.
21:20 <nyef> It describes the layout of a symbol in memory,
associates it to type tags, adds some magic for
allocating symbols, and so on.
21:21 <burtons> so that low level
21:21 <burtons> would there be a way to destroy the references to
symbol packages, or would the compiler have to be
modified?
21:21 <nyef> The :REF-TRANS and :SET-TRANS slot options tell the
compiler that a slot should have read and write
accessors generated.
21:21 <nyef> Given a symbol, use %SET-SYMBOL-PACKAGE directly if
necessary.
21:21 <burtons> yup, just found those
21:22 <Bike> could you just loop through all symbols uninterning
them and then deleting the packages?
21:22 <burtons> so if you were to iterate over all the symbols and
%set-symbol-package to nil, that would be a start?
21:23 <nyef> Bike: Unintern has quite a bit of overhead...
21:23 <nyef> Bike: Plus it can prompt name conflicts.
21:23 <nyef> (If it's a shadowing symbol, for example.)
21:25 <nyef> The last two paragraphs in Rob's post rather sum up my
current feeling about smaller lisp executables.
21:25 <nyef> Don't try to shake parts of the system loose. Instead,
arrange not to link them in in the first place.
21:26 <nyef> But that takes quite a bit of prep work to make
happen.
21:26 <burtons> i think a dumb tree shaker would be better than
none for some deployments
21:26 <nyef> That's plausible as well.
21:26 <burtons> how would one get a list of symbols in a package?
21:27 <nyef> Portably or unportably?
21:27 <burtons> i don't see anything obvious in the package-
21:27 <burtons> unportably
21:28 <nyef> Ah. You don't, easily. You basically have to grub them
out of the package hashtable. At which point you may
as well use one of the standard iterators.
21:28 <burtons> i think do-symbols would work
21:28 <nyef> Yeah, that'd be the one.
21:30 <burtons> slime sure doesn't like my attempt to clear the
symbol packages :)
21:30 <burtons> doesn't like clearing *package-names* either, but
whatever
21:32 <nyef> Well, you ARE messing with a critical chunk of the
programmer (user) interface.
21:32 <burtons> play hard or die trying
21:33 <burtons> hmm, clearing the symbol package kills the system
21:33 <burtons> ends me up in the debugger right after
21:33 <burtons> (maphash (lambda (package-name package)
21:33 <burtons> (do-symbols (symbol package-name)
21:33 <burtons> (sb-impl::%set-symbol-package symbol
nil))) sb-impl::*package-names*)
21:33 <nyef> Yeah, not unexpected. You also need to not clear the
symbol-package if the symbol is a keyword.
21:34 <burtons> ok
21:35 <nyef> And you're going to run into a number of issues if you
try to do this interactively, since the interactive
system relies on packages basically working.
21:35 <burtons> so run the code with --eval
21:35 <burtons> ?
21:35 <nyef> Or --load, or cat | sbcl, or...
21:36 <nyef> Or just be prepared to have to re-load your code
often.
21:36 <nyef> Essentially once per test run, really.
21:36 <burtons> ok, not killing keywords doesn't dump me in the
debugger, so that's a bit better
21:36 <burtons> all symbols are printed as #: now
21:37 <burtons> still trying interactively for now
21:37 <burtons> i'm sure calling remhash during a maphash is a no
no
21:38 <nyef> Yeah, I think that might be against the rules.
21:38 <nyef> What are you trying to remhash?
21:38 *** dougk_ QUIT Ping timeout: 244 seconds
21:38 <burtons> the packages from *package-names*, except for the
keywords package
21:38 *** echo-are` NICK echo-area
21:39 <burtons> is it called "keywords"?
21:39 <nyef> Don't bother. Create a new hashtable with just keyword
and maybe common-lisp...
21:39 <nyef> Have a poke around, there should be a symbol
*keyword-package* or similar with a direct reference.
21:39 <nyef> It's required to be called "KEYWORD", but may have
aliases.
21:41 *** pacon JOIN
21:42 *** pacon QUIT Read error: Connection reset by peer
21:42 <burtons> all i see is a "keyword" package
21:42 <nyef> There may be some case-mangling going on.
21:43 <burtons> i'm just printing the package name strings
21:43 *** dougk_ JOIN
21:43 <burtons> ok, so i've set all symbol-packages to nil
21:43 <burtons> cleared *package-names*
21:43 <nyef> Ah, sb-int:*keyword-package*.
21:43 <burtons> did a gc, save still nothing
21:43 <burtons> no reduction in size i mean
21:43 <burtons> there must be other references
21:44 <burtons> gc doesn't seem to do anything
21:44 <nyef> Yeah, that's the other bit. The entire system sort of
hangs together even without the package system.
21:45 <burtons> too bad, i was hoping for a good result to a friday
night quick hack :)
21:45 <burtons> what if all the symbols were uninterened as well
21:45 <burtons> wouldn't the compiler already have numeric
references to everything?
21:46 <nyef> Well, what you are doing IS basically uninterning all
of the symbols.
21:46 <burtons> by setting the symbol-package to nil?
21:46 <nyef> Umm... Oh, right, you're not clearing the package
internal-symbol and external-symbol hashtables yet,
are you?
21:47 <burtons> no
21:47 <burtons> i don't think so, just using %set-symbol-package to
nil
21:47 <nyef> Or breaking the %use-list or %used-by-list in the
packages...
21:47 <nyef> ... And it looks like the TABLES slot would need
resetting.
21:48 <burtons> i would think that ripping the packages out of
*package-names* would leave them to be collected
unless there's other references
21:48 <nyef> And even with all of this damage to the package
structure, I'm not expecting much.
21:48 <burtons> well, the original idea was to destroy the package
system :)
21:48 *** pacon JOIN
21:49 <nyef> IIRC, there's a direct reference to at least the
COMMON-LISP package somewhere around, and almost all
packages :USE that.
21:49 <nyef> And there are forward and backward links between the
package objects themselves.
21:49 <burtons> i don't think links between packages would matter
if the top level references to them is in
*package-names*
21:50 <nyef> Yes, but they matter as soon as the top-level
references are gone.
21:50 <burtons> why is that? i would think they would all be
dangling after that and the gc could pick them up
21:51 <nyef> There are other references to specific packages, such
as to the keyword and common-lisp packages, and almost
every package will have a reference to and from the
common-lisp package.
21:51 <burtons> ah, yeah
21:52 <burtons> ok, so clear the use-list and used-by-lists
21:55 <burtons> hmm, can't setf package-use-list
21:55 <nyef> Because it's package-%use-list.
21:56 <burtons> thanks. gc still does nothing after doing that as
well...
21:56 <burtons> time to reset the tables slot
21:56 <nyef> Are you calling GC directly?
21:57 <burtons> yesa
21:57 <nyef> With :FULL T ?
21:57 <burtons> no, just (gc t)
21:58 <burtons> didn't realize that was a keyword
21:59 <nyef> I'm going to have to wish you luck, because I have
about a minute left before my DSL modem turns into a
pumpkin.
21:59 <burtons> ok, hopefully you gave me all the advice i need
21:59 <burtons> thanks
21:59 <burtons> if it works i'll report back
22:00 <nyef> And if it doesn't, you'll be back with more questions,
I'm sure. (-:
22:00 <nyef> Good night, and good luck.
22:00 *** nyef QUIT Quit: G'night all
|#
@ralt
Copy link

ralt commented Feb 7, 2015

Hi,

First of all, if I tried to use it, it's because I'm very much interested :-)

I tried this on https://github.com/Ralt/lxc-wrapper

2 things:

  1. The generated image isn't smaller.
  2. The image drops in LDB right away when I execute it.

This is using sbcl 1.2.8. I guess I've done something wrong?

Note that I use compression, because it isn't very useful otherwise :-)

I've named a file save-image.lisp, and called it using sbcl --load save-image.lisp. I slightly modified the save-lisp-and-die call and removed the gc call, since :purify t does it. Here's what in this file:

(in-package :sb-impl)

(defun save-lisp-tree-shake-and-die (file toplevel-fn)
  "A dumb tree shaker for SBCL written with lots of advice from nyef."
  (let (packages)
    (maphash (lambda (package-name package)
           (setf (sb-impl::package-%use-list package) nil)
           (setf (sb-impl::package-%used-by-list package) nil)
           (setf (sb-impl::package-%shadowing-symbols package) nil)
           (setf (sb-impl::package-internal-symbols package)
             (sb-impl::make-package-hashtable 0))
           (setf (sb-impl::package-external-symbols package)
             (sb-impl::make-package-hashtable 0))
           (setf (sb-impl::package-tables package) #())
           (setf (sb-impl::package-%implementation-packages package) nil)
           (setf (sb-impl::package-%local-nicknames package) nil)
           (setf (sb-impl::package-%locally-nicknamed-by package) nil)
           (push package-name packages)
           (do-symbols (symbol package-name)
         (unless (keywordp symbol)
           (sb-impl::%set-symbol-package symbol nil)
           (unintern symbol)
           )))
         sb-impl::*package-names*)

    (dolist (package packages)
      (unless (string= "KEYWORD" package)
    (remhash package sb-impl::*package-names*))))
  (save-lisp-and-die file :executable t :toplevel toplevel-fn :purify t :compression t))

(asdf:operate 'asdf:load-op 'lxc-wrapper)

(save-lisp-tree-shake-and-die #P"dist/lxc-wrapper" #'lxc-wrapper:main)

@svetlyak40wt
Copy link

Version updated to work with SBCL 2.2.2:

(defun save-lisp-tree-shake-and-die (file toplevel-fn)
  "A dumb tree shaker for SBCL written with lots of advice from nyef."
  (let (packages)
    (mapc (lambda (package)
            (let ((package-name (package-name package)))
              (setf (sb-impl::package-%use-list package) nil)
              (setf (sb-impl::package-%used-by-list package) nil)
              (setf (sb-impl::package-%shadowing-symbols package) nil)
              (setf (sb-impl::package-internal-symbols package)
                    (sb-impl::make-package-hashtable 0))
              (setf (sb-impl::package-external-symbols package)
                    (sb-impl::make-package-hashtable 0))
              (setf (sb-impl::package-tables package) #())
              (setf (sb-impl::package-%implementation-packages package) nil)
              (setf (sb-impl::package-%local-nicknames package) nil)
              (setf (sb-impl::package-%nicknames package) nil)
              (push package-name packages)
              (do-symbols (symbol package-name)
                (unless (keywordp symbol)
                  (sb-impl::%set-symbol-package symbol nil)
                  (unintern symbol)))))
          (list-all-packages))

    (dolist (package-name packages)
      (unless (string= "KEYWORD" package-name)
        (setf (sb-int:info-gethash package-name
                                   sb-impl::*package-names*)
              nil)))

    (sb-ext:gc :full t))

  (save-lisp-and-die file :executable t :toplevel toplevel-fn :purify t :compression t))

@phmarek
Copy link

phmarek commented Mar 31, 2023

With the last version on an 2.3.1 I get

fatal error encountered in SBCL pid 2060368 tid 2060368:
Unhandled SIGILL at 0x52ac3940.

Welcome to LDB, a low-level debugger for the Lisp runtime environment.
ldb> 

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment