| ;; -*- mode: lisp -*- | |
| ;; | |
| ;; A quick and dirty tree shaker for SBCL. Basically, it destroys the | |
| ;; package system and does a gc before saving the lisp image. Gives | |
| ;; about a 40% reduction in image size on a basic hello world test. | |
| ;; Would like to hear how it works on larger projects. | |
| ;; | |
| ;; Original idea from: https://groups.google.com/d/msg/comp.lang.lisp/6zpZsWFFW18/WMy4PyA9B4kJ | |
| ;; | |
| ;; Burton Samograd | |
| ;; burton.samograd AT gmail.com | |
| ;; License: BSD | |
| ;; 2015 | |
| (defun save-lisp-tree-shake-and-die (file toplevel-fn) | |
| "A dumb tree shaker for SBCL written with lots of advice from nyef." | |
| (let (packages) | |
| (maphash (lambda (package-name package) | |
| (setf (sb-impl::package-%use-list package) nil) | |
| (setf (sb-impl::package-%used-by-list package) nil) | |
| (setf (sb-impl::package-%shadowing-symbols package) nil) | |
| (setf (sb-impl::package-internal-symbols package) | |
| (sb-impl::make-package-hashtable 0)) | |
| (setf (sb-impl::package-external-symbols package) | |
| (sb-impl::make-package-hashtable 0)) | |
| (setf (sb-impl::package-tables package) #()) | |
| (setf (sb-impl::package-%implementation-packages package) nil) | |
| (setf (sb-impl::package-%local-nicknames package) nil) | |
| (setf (sb-impl::package-%locally-nicknamed-by package) nil) | |
| (push package-name packages) | |
| (do-symbols (symbol package-name) | |
| (unless (keywordp symbol) | |
| (sb-impl::%set-symbol-package symbol nil) | |
| (unintern symbol) | |
| ))) | |
| sb-impl::*package-names*) | |
| (dolist (package packages) | |
| (unless (string= "KEYWORD" package) | |
| (remhash package sb-impl::*package-names*)))) | |
| (gc :full t) | |
| (save-lisp-and-die file :executable t :toplevel toplevel-fn)) | |
| (defun main () (print "hello world")) | |
| (save-lisp-tree-shake-and-die "x" #'main) | |
| #| | |
| 21:07 *** burtons JOIN | |
| 21:07 *** TOPIC SBCL developer hangout, more dev than help -- but | |
| if #lisp can't solve it and it's SBCL specific, | |
| we'll try | |
| (nikodemus!~nikodemus@cs181056239.pp.htv.fi on Fri | |
| Oct 21 08:26:56 2011) | |
| 21:07 *** NAMES @Krystof _8hzp Adlai akkad ams antoszka Bike Blkt | |
| brucem burtons carvite ccl-logbot christop1 chu | |
| cojy csziacobus Cymew DeadTrickster dekhoda dougk_ | |
| drmeister echo-area fe[nl]ix ferada fikusz fitzsim | |
| flip214 foom Hydan Intensity irsol ivan4th ivan`` | |
| jdz joshe jsnell kanru karswell les loke luis | |
| milosn minion mood_ nicdev nyef oleo__ p_l pchrist | |
| Perlboy pkhuong Posterdati psilord PuercoPop | |
| Quadrescence reb redline6561 scymtym snafuchs sobel | |
| soggybre1d specbot tmh_ wasabiz__ weissschloss | |
| White_Flame yauz |3b| | |
| 21:08 <burtons> after reading this little bit on tree shaking i | |
| decided to try it: | |
| 21:08 <burtons> What our "tree-shaker" did was simply destroy the | |
| package system, and then do a | |
| 21:08 <burtons> GC. Any symbols remaining were put back in the | |
| package they were originally | |
| 21:08 <burtons> in, allowing debugging. | |
| 21:09 <burtons> after looking around a bit, I found that theres | |
| sb-impl::*package-names* that holds the list of | |
| packages | |
| 21:09 *** echo-are` JOIN | |
| 21:10 <burtons> so, to 'destroy to package system' I tried (clrhash | |
| sb-impl::*package-names*) and then (gc t) follwed | |
| by a save-lisp-and-die | |
| 21:10 *** echo-area QUIT Read error: Connection reset by peer | |
| 21:10 <burtons> i don't think anything was gc'd and the size of the | |
| image didn't decrease | |
| 21:10 <burtons> any other ideas on how one could 'destroy the | |
| package system' to start off with a dumb tree | |
| shaker? | |
| 21:10 <nyef> And nothing disappeared because all of the packages | |
| were retained by the SYMBOL-PACKAGE slots and the | |
| inter-package references? | |
| 21:11 <burtons> you tell me | |
| 21:11 <burtons> :) | |
| 21:11 <burtons> this is pretty much my first time looking into sbcl | |
| so it was just a shot in the dark | |
| 21:12 *** psy_ JOIN | |
| 21:12 <nyef> If you're going to be messing with the package system, | |
| I'd start by looking at SYS:SRC;CODE;PACKAGE.LISP and | |
| SYS:SRC;CODE;TARGET-PACKAGE.LISP. | |
| 21:12 <burtons> already there | |
| 21:12 <burtons> that's where I found *package-names* | |
| 21:13 <burtons> i'll look into the symbol-package slots | |
| 21:13 <nyef> Yes, but note the relations between packages. You'll | |
| need to destroy those. And then there's | |
| SYMBOL-PACKAGE. And you'll need to NOT destroy the | |
| KEYWORD package... | |
| 21:14 <nyef> Umm... And I don't know if anything would be retained | |
| in the infodb. | |
| 21:14 <nyef> That might only track symbols, not packages. | |
| 21:14 <burtons> but does the idea seem sound? | |
| 21:14 <nyef> It's an interesting approach, at least. | |
| 21:15 <burtons> https://groups.google.com/d/msg/comp.lang.lisp/6zpZsWFFW18/WMy4PyA9B4kJ | |
| 21:15 <nyef> I'm never quite sure what I think about tree shakers | |
| generally, though. | |
| 21:15 <burtons> sounds like it's what they used to do in CMUCL | |
| 21:15 <burtons> i just thought it would be a interesting little | |
| project to learn a bit about sbcl internals | |
| 21:15 <nyef> Oh! I have my bandwidth back. | |
| 21:16 <nyef> Wow, from Pi day twenty years ago? | |
| 21:16 <burtons> yeah, it's an old post | |
| 21:17 <burtons> ok, now i'm confused. i'm looking into the | |
| symbol-package stuff and the definition of | |
| %set-symbol-package is calling | |
| %symbol-package... same thing for symbol-package | |
| 21:18 <burtons> where are symbol-packages stored? in the plist? | |
| 21:18 <nyef> That gets into compiler magic. | |
| 21:18 <burtons> nope, not in the plist | |
| 21:18 <nyef> The compiler knows how to "open code" things like | |
| SYMBOL-PACKAGE to produce an (optional) type-check | |
| followed by a memory reference. | |
| 21:19 <burtons> ok | |
| 21:19 <burtons> not quite sure what that means yet :) | |
| 21:19 <nyef> If you have a look at | |
| SYS:SRC;COMPILER;GENERIC;OBJDEF.LISP you'll find a | |
| !DEFINE-PRIMITIVE-OBJECT for SYMBOL. | |
| 21:20 <nyef> It describes the layout of a symbol in memory, | |
| associates it to type tags, adds some magic for | |
| allocating symbols, and so on. | |
| 21:21 <burtons> so that low level | |
| 21:21 <burtons> would there be a way to destroy the references to | |
| symbol packages, or would the compiler have to be | |
| modified? | |
| 21:21 <nyef> The :REF-TRANS and :SET-TRANS slot options tell the | |
| compiler that a slot should have read and write | |
| accessors generated. | |
| 21:21 <nyef> Given a symbol, use %SET-SYMBOL-PACKAGE directly if | |
| necessary. | |
| 21:21 <burtons> yup, just found those | |
| 21:22 <Bike> could you just loop through all symbols uninterning | |
| them and then deleting the packages? | |
| 21:22 <burtons> so if you were to iterate over all the symbols and | |
| %set-symbol-package to nil, that would be a start? | |
| 21:23 <nyef> Bike: Unintern has quite a bit of overhead... | |
| 21:23 <nyef> Bike: Plus it can prompt name conflicts. | |
| 21:23 <nyef> (If it's a shadowing symbol, for example.) | |
| 21:25 <nyef> The last two paragraphs in Rob's post rather sum up my | |
| current feeling about smaller lisp executables. | |
| 21:25 <nyef> Don't try to shake parts of the system loose. Instead, | |
| arrange not to link them in in the first place. | |
| 21:26 <nyef> But that takes quite a bit of prep work to make | |
| happen. | |
| 21:26 <burtons> i think a dumb tree shaker would be better than | |
| none for some deployments | |
| 21:26 <nyef> That's plausible as well. | |
| 21:26 <burtons> how would one get a list of symbols in a package? | |
| 21:27 <nyef> Portably or unportably? | |
| 21:27 <burtons> i don't see anything obvious in the package- | |
| 21:27 <burtons> unportably | |
| 21:28 <nyef> Ah. You don't, easily. You basically have to grub them | |
| out of the package hashtable. At which point you may | |
| as well use one of the standard iterators. | |
| 21:28 <burtons> i think do-symbols would work | |
| 21:28 <nyef> Yeah, that'd be the one. | |
| 21:30 <burtons> slime sure doesn't like my attempt to clear the | |
| symbol packages :) | |
| 21:30 <burtons> doesn't like clearing *package-names* either, but | |
| whatever | |
| 21:32 <nyef> Well, you ARE messing with a critical chunk of the | |
| programmer (user) interface. | |
| 21:32 <burtons> play hard or die trying | |
| 21:33 <burtons> hmm, clearing the symbol package kills the system | |
| 21:33 <burtons> ends me up in the debugger right after | |
| 21:33 <burtons> (maphash (lambda (package-name package) | |
| 21:33 <burtons> (do-symbols (symbol package-name) | |
| 21:33 <burtons> (sb-impl::%set-symbol-package symbol | |
| nil))) sb-impl::*package-names*) | |
| 21:33 <nyef> Yeah, not unexpected. You also need to not clear the | |
| symbol-package if the symbol is a keyword. | |
| 21:34 <burtons> ok | |
| 21:35 <nyef> And you're going to run into a number of issues if you | |
| try to do this interactively, since the interactive | |
| system relies on packages basically working. | |
| 21:35 <burtons> so run the code with --eval | |
| 21:35 <burtons> ? | |
| 21:35 <nyef> Or --load, or cat | sbcl, or... | |
| 21:36 <nyef> Or just be prepared to have to re-load your code | |
| often. | |
| 21:36 <nyef> Essentially once per test run, really. | |
| 21:36 <burtons> ok, not killing keywords doesn't dump me in the | |
| debugger, so that's a bit better | |
| 21:36 <burtons> all symbols are printed as #: now | |
| 21:37 <burtons> still trying interactively for now | |
| 21:37 <burtons> i'm sure calling remhash during a maphash is a no | |
| no | |
| 21:38 <nyef> Yeah, I think that might be against the rules. | |
| 21:38 <nyef> What are you trying to remhash? | |
| 21:38 *** dougk_ QUIT Ping timeout: 244 seconds | |
| 21:38 <burtons> the packages from *package-names*, except for the | |
| keywords package | |
| 21:38 *** echo-are` NICK echo-area | |
| 21:39 <burtons> is it called "keywords"? | |
| 21:39 <nyef> Don't bother. Create a new hashtable with just keyword | |
| and maybe common-lisp... | |
| 21:39 <nyef> Have a poke around, there should be a symbol | |
| *keyword-package* or similar with a direct reference. | |
| 21:39 <nyef> It's required to be called "KEYWORD", but may have | |
| aliases. | |
| 21:41 *** pacon JOIN | |
| 21:42 *** pacon QUIT Read error: Connection reset by peer | |
| 21:42 <burtons> all i see is a "keyword" package | |
| 21:42 <nyef> There may be some case-mangling going on. | |
| 21:43 <burtons> i'm just printing the package name strings | |
| 21:43 *** dougk_ JOIN | |
| 21:43 <burtons> ok, so i've set all symbol-packages to nil | |
| 21:43 <burtons> cleared *package-names* | |
| 21:43 <nyef> Ah, sb-int:*keyword-package*. | |
| 21:43 <burtons> did a gc, save still nothing | |
| 21:43 <burtons> no reduction in size i mean | |
| 21:43 <burtons> there must be other references | |
| 21:44 <burtons> gc doesn't seem to do anything | |
| 21:44 <nyef> Yeah, that's the other bit. The entire system sort of | |
| hangs together even without the package system. | |
| 21:45 <burtons> too bad, i was hoping for a good result to a friday | |
| night quick hack :) | |
| 21:45 <burtons> what if all the symbols were uninterened as well | |
| 21:45 <burtons> wouldn't the compiler already have numeric | |
| references to everything? | |
| 21:46 <nyef> Well, what you are doing IS basically uninterning all | |
| of the symbols. | |
| 21:46 <burtons> by setting the symbol-package to nil? | |
| 21:46 <nyef> Umm... Oh, right, you're not clearing the package | |
| internal-symbol and external-symbol hashtables yet, | |
| are you? | |
| 21:47 <burtons> no | |
| 21:47 <burtons> i don't think so, just using %set-symbol-package to | |
| nil | |
| 21:47 <nyef> Or breaking the %use-list or %used-by-list in the | |
| packages... | |
| 21:47 <nyef> ... And it looks like the TABLES slot would need | |
| resetting. | |
| 21:48 <burtons> i would think that ripping the packages out of | |
| *package-names* would leave them to be collected | |
| unless there's other references | |
| 21:48 <nyef> And even with all of this damage to the package | |
| structure, I'm not expecting much. | |
| 21:48 <burtons> well, the original idea was to destroy the package | |
| system :) | |
| 21:48 *** pacon JOIN | |
| 21:49 <nyef> IIRC, there's a direct reference to at least the | |
| COMMON-LISP package somewhere around, and almost all | |
| packages :USE that. | |
| 21:49 <nyef> And there are forward and backward links between the | |
| package objects themselves. | |
| 21:49 <burtons> i don't think links between packages would matter | |
| if the top level references to them is in | |
| *package-names* | |
| 21:50 <nyef> Yes, but they matter as soon as the top-level | |
| references are gone. | |
| 21:50 <burtons> why is that? i would think they would all be | |
| dangling after that and the gc could pick them up | |
| 21:51 <nyef> There are other references to specific packages, such | |
| as to the keyword and common-lisp packages, and almost | |
| every package will have a reference to and from the | |
| common-lisp package. | |
| 21:51 <burtons> ah, yeah | |
| 21:52 <burtons> ok, so clear the use-list and used-by-lists | |
| 21:55 <burtons> hmm, can't setf package-use-list | |
| 21:55 <nyef> Because it's package-%use-list. | |
| 21:56 <burtons> thanks. gc still does nothing after doing that as | |
| well... | |
| 21:56 <burtons> time to reset the tables slot | |
| 21:56 <nyef> Are you calling GC directly? | |
| 21:57 <burtons> yesa | |
| 21:57 <nyef> With :FULL T ? | |
| 21:57 <burtons> no, just (gc t) | |
| 21:58 <burtons> didn't realize that was a keyword | |
| 21:59 <nyef> I'm going to have to wish you luck, because I have | |
| about a minute left before my DSL modem turns into a | |
| pumpkin. | |
| 21:59 <burtons> ok, hopefully you gave me all the advice i need | |
| 21:59 <burtons> thanks | |
| 21:59 <burtons> if it works i'll report back | |
| 22:00 <nyef> And if it doesn't, you'll be back with more questions, | |
| I'm sure. (-: | |
| 22:00 <nyef> Good night, and good luck. | |
| 22:00 *** nyef QUIT Quit: G'night all | |
| |# |
ralt
commented
Feb 7, 2015
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Hi,
First of all, if I tried to use it, it's because I'm very much interested :-)
I tried this on https://github.com/Ralt/lxc-wrapper
2 things:
This is using sbcl 1.2.8. I guess I've done something wrong?
Note that I use compression, because it isn't very useful otherwise :-)
I've named a file
save-image.lisp, and called it usingsbcl --load save-image.lisp. I slightly modified thesave-lisp-and-diecall and removed thegccall, since:purify tdoes it. Here's what in this file: