CPython dev sprint 2017: Startup speed: idea, lazy creation of module definitions, global values

See:

    https://public.etherpad-mozilla.org/p/cpython-dev-sprint-2017
    https://github.com/warsaw/lazyimport/blob/master/lazy_compile.py
    https://github.com/warsaw/lazyimport/blob/master/lazy_helper.py

This idea is based on a comment from Larry Hastings. PHP got a good
speedup by not creating all functions defined in the source. Python could
do something similar for classes and functions, perhaps without too many
backwards-compatibility problems. This would be a huge win for the startup
time and memory usage of command-line tools that import large libraries
but use only a small subset of them on each invocation. Lazy loading per
module helps, but doing it per function and per class would be much more
powerful.

Could be prototyped using an AST transformer. Make each function global
variable a property; when it is accessed, actually create the function
(from a marshal byte string stored in memory). Should be safe to do as
long as the AST analysis finds code with potential side effects and does
not make it lazy.
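
A minimal sketch of the "create the function on first access" idea, written as a plain module subclass rather than an AST transformer. The names LazyModule, add_lazy_function and _lazy_defs are made up for illustration and do not come from the lazyimport prototype. The sketch ignores default arguments, decorators, annotations and closures.

```python
import marshal
import types


class LazyModule(types.ModuleType):
    """Hypothetical module type that builds functions on first access."""

    def __init__(self, name):
        super().__init__(name)
        # function name -> marshalled code object (illustrative layout)
        self._lazy_defs = {}

    def add_lazy_function(self, source):
        # Compile the source, but keep only the marshalled code objects of
        # the functions it defines; no function object is created yet.
        module_code = compile(source, self.__name__, "exec")
        for const in module_code.co_consts:
            if isinstance(const, types.CodeType):
                self._lazy_defs[const.co_name] = marshal.dumps(const)

    def __getattr__(self, name):
        # Only called when the normal attribute lookup has failed.
        lazy = self.__dict__.get("_lazy_defs", {})
        if name not in lazy:
            raise AttributeError(name)
        code = marshal.loads(lazy.pop(name))
        func = types.FunctionType(code, self.__dict__, name)
        self.__dict__[name] = func  # cache, so later lookups are direct
        return func


mod = LazyModule("demo")
mod.add_lazy_function("def greet(who):\n    return 'hello ' + who\n")
print("greet" in mod.__dict__)  # the function has not been created yet
print(mod.greet("world"))       # first access creates and caches it
```

Note that this only covers attribute access from outside the module; name lookups from code running inside the module do not go through __getattr__.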

Analyzer prototype: https://github.com/warsaw/lazyimport (lazy_analyze.py)

Same issues as the lazy module load safety check:

- from .. import ... could raise an error
- class A(B): metaclass side effects
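
To make the kind of check concrete, here is a rough sketch of a conservative analyzer. The rules are assumptions chosen for illustration and are not taken from lazy_analyze.py: a top-level def is treated as lazy-safe only if it has no decorators and no default-argument expressions, and a class only if it has no bases, keywords (e.g. metaclass=) or decorators. It does not look inside class bodies or at annotations, which a real analyzer would also have to consider.

```python
import ast


def lazy_safe_names(source):
    """Return the top-level def/class names that look safe to create lazily."""
    safe = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = node.args
            # Decorators and default values are evaluated at definition time.
            has_defaults = bool(args.defaults) or any(
                d is not None for d in args.kw_defaults
            )
            if not node.decorator_list and not has_defaults:
                safe.append(node.name)
        elif isinstance(node, ast.ClassDef):
            # Bases, metaclass keywords and decorators can all run arbitrary code.
            if not (node.bases or node.keywords or node.decorator_list):
                safe.append(node.name)
    return safe
```

Everything not reported as safe (imports, assignments, decorated defs, subclasses) would be left to execute eagerly, as it does today.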

Safe with a command-line flag to turn it on per application? Otherwise,
mark each module as safe using a compiler directive, e.g.

    from __future__ import __lazy__

If it will be the future behavior (I think it should be), provide
mechanisms to do non-lazy side effects. If I understand Guido's position,
that should be done by calling a module function to get the side effects
you want. I.e. the person using a library should determine when side
effects happen; they should not happen just as a side effect of import.
Idea from Barry Warsaw: allow an __init__ function, which marks the whole
module as lazy safe and gets called on import.

Prototype of an alternative compiler that produces modules that are loaded
this way:

    https://github.com/warsaw/lazyimport/blob/master/lazy_compile.py

Doesn't quite work for a few reasons:

- Inspecting the module.__dict__ directly will not show the lazy global
  defs. Could break inspection tools, IDEs. Maybe not a fatal problem.
  dir() still works if done as a property on the module.

- LOAD_NAME is a serious problem. A load of the global from within the
  module is done with the LOAD_NAME opcode (or LOAD_GLOBAL inside a
  function body). That does a direct PyDict_GetItem. No place to put a
  property hook and wake up the object. The fact that properties don't
  work for LOAD_NAME means that PEP 549 has a similar issue (I think).
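
The opcodes involved can be checked with the dis module. This is a small illustration only; the helper name opnames is made up, and the exact instruction listing varies between CPython versions.

```python
import dis


def opnames(code):
    """Return the opcode names used by a code object."""
    return [ins.opname for ins in dis.get_instructions(code)]


# Code compiled at module level looks up 'print' and 'x' by name.
module_level = compile("print(x)", "<module a>", "exec")


def inside_function():
    return x  # 'x' is a global from the function's point of view


print(opnames(module_level))              # contains LOAD_NAME
print(opnames(inside_function.__code__))  # contains LOAD_GLOBAL
```

Neither opcode goes through attribute lookup on the module object, which is why a property on the module type is never consulted.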

How to fix the LOAD_NAME problem? In retrospect, LOAD_NAME should not
exist and there should be just LOAD_ATTR. LOAD_NAME is there for
historical reasons. It is faster than LOAD_ATTR, but the bigger problem is
how 'globals' gets passed around. E.g. exec(code, module) does not work;
you have to pass exec() the module __dict__. Then, inside ceval, you don't
have access to the module (could look it up or keep a circular ref, ugly).
So, no way to look for a property and no way to override __getattr__.

This problem also exists when assigning __class__ on modules. You can do

    import a
    a.b  # b is a property

but this does not work:

    # implementation of a
    import sys

    class MyClass(type(sys)):
        @property
        def x(self):
            ...

    # swap in the subclass so that the module gains the property
    sys.modules[__name__].__class__ = MyClass

    # try to use x
    print(x)  # does not exist in the __dict__, LOAD_NAME fails
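
The same failure can be reproduced in a single file without writing a real module to disk. In this sketch PropModule and the module name "a" are placeholders; exec() on the module's __dict__ stands in for the import system running the module body.

```python
import types


class PropModule(types.ModuleType):
    @property
    def x(self):
        return 42


mod = PropModule("a")

# Attribute access goes through the type, so the property is found.
attr_value = mod.x

# Code run "inside" the module only sees the module __dict__ (and builtins).
try:
    exec("print(x)", mod.__dict__)
    name_lookup_failed = False
except NameError:
    name_lookup_failed = True

# exec() also refuses the module object itself as the globals argument.
try:
    exec("pass", mod)
    exec_accepts_module = True
except TypeError:
    exec_accepts_module = False

print(attr_value, name_lookup_failed, exec_accepts_module)
```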

Some ideas on how to fix this. Allow exec() to take a module, and make the
module __dict__ a subtype that has a weakref to the module. Then LOAD_NAME
can get the module and do a LOAD_ATTR. Optimization idea: don't set the
dict's ref to the module unless properties are defined (i.e. use the plain
LOAD_NAME path unless properties are defined).
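
The weakref idea can be approximated in pure Python today, without changing ceval, by passing a custom mapping as the locals argument of exec(). This is only an emulation under stated assumptions, not the proposed interpreter change: ModuleNamespace and exec_in_module are hypothetical names, and the trick relies on LOAD_NAME consulting a non-dict locals mapping through its __getitem__.

```python
import types
import weakref


class ModuleNamespace:
    """Hypothetical namespace that holds a weak reference to its module."""

    def __init__(self, module):
        self._module = weakref.ref(module)

    def __getitem__(self, name):
        try:
            # Full attribute lookup, so properties on the module type work.
            return getattr(self._module(), name)
        except AttributeError:
            # KeyError lets the lookup fall through to globals and builtins.
            raise KeyError(name) from None

    def __setitem__(self, name, value):
        self._module().__dict__[name] = value

    def __delitem__(self, name):
        del self._module().__dict__[name]


def exec_in_module(source, module):
    """Run source as if exec() accepted a module instead of a dict."""
    code = compile(source, module.__name__, "exec")
    exec(code, module.__dict__, ModuleNamespace(module))


class PropModule(types.ModuleType):
    @property
    def x(self):
        return 42


mod = PropModule("a")
exec_in_module("y = x + 1", mod)
print(mod.y)
```

The emulation only helps module-level code. Functions defined by that code still use LOAD_GLOBAL against the plain module __dict__, so they would not see the property; that part needs the interpreter-level fix described above.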