dict, 521597608 to 71847848, diff: -449749760, 86.2254260951%
unicode, 204969204 to 8713716, diff: -196255488, 95.7487681906%
ObjectId, 22841784 to 217000, diff: -22624784, 99.0499866385%
str, 47456524 to 27408451, diff: -20048073, 42.2451357794%
BlockKey, 19437480 to 184320, diff: -19253160, 99.0517289278%
list, 25739104 to 9034048, diff: -16705056, 64.9014666556%
BlockData, 8778560 to 82688, diff: -8695872, 99.0580687493%
EditInfo, 8778560 to 82688, diff: -8695872, 99.0580687493%
datetime.datetime, 7403424 to 425184, diff: -6978240, 94.2569276054%
int, 8518392 to 5261424, diff: -3256968, 38.23454004%
DatastoreNode, 1192960 to 6560, diff: -1186400, 99.4501072961%
float, 1221912 to 96624, diff: -1125288, 92.0923929055%
Markup, 936488 to 0, diff: -936488, 100.0%
long, 498824 to 30720, diff: -468104, 93.8415152439%
function, 7503120 to 7059360, diff: -443760, 5.91433963471%
LibraryLocator, 353808 to 0, diff: -353808, 100.0%
instancemethod, 561600 to 327440, diff: -234160, 41.6951566952%
CourseEnvelope, 210744 to 144, diff: -210600, 99.9316706525%
DatabaseNode, 213800 to 6000, diff: -207800, 97.1936389149%
Mixologist, 187520 to 320, diff: -187200, 99.8293515358%
SplitMongoIdManager, 187328 to 128, diff: -187200, 99.9316706525%
CachingDescriptorSystem, 187328 to 128, diff: -187200, 99.9316706525%
OSFS, 187328 to 128, diff: -187200, 99.9316706525%
_RLock, 192192 to 4992, diff: -187200, 97.4025974026%
CourseLocator, 125520 to 360, diff: -125160, 99.7131931166%
collections.deque, 102832 to 8736, diff: -94096, 91.5045900109%
thread.lock, 100064 to 6464, diff: -93600, 93.540134314%
tuple, 4934808 to 4843000, diff: -91808, 1.86041685918%
cell, 761992 to 693000, diff: -68992, 9.0541632983%
CourseAccessRole, 46528 to 0, diff: -46528, 100.0%
ModelState, 46848 to 704, diff: -46144, 98.4972677596%
set, 1329744 to 1298840, diff: -30904, 2.32405635972%
code, 6295680 to 6265600, diff: -30080, 0.477787943479%
type, 7381160 to 7358560, diff: -22600, 0.306184935701%
MultiValueDict, 22752 to 592, diff: -22160, 97.3980309423%
FixedOffset, 22016 to 192, diff: -21824, 99.1279069767%
_sre.SRE_Pattern, 369232 to 347416, diff: -21816, 5.90848030507%
datetime.timedelta, 29000 to 15360, diff: -13640, 47.0344827586%
_HashedSeq, 6712 to 136, diff: -6576, 97.9737783075%
Random, 20256 to 15192, diff: -5064, 25.0%
OrderedDict, 237744 to 233192, diff: -4552, 1.91466451309%
StaticTab, 4288 to 0, diff: -4288, 100.0%
KeyedRef, 7488 to 5088, diff: -2400, 32.0512820513%
getset_descriptor, 404136 to 401832, diff: -2304, 0.57010511313%
frozenset, 139184 to 136904, diff: -2280, 1.63811932406%
weakref, 1052216 to 1050016, diff: -2200, 0.209082545789%
module, 207065 to 205161, diff: -1904, 0.919518025741%
ChildrenModelMetaclass, 114808 to 113000, diff: -1808, 1.57480314961%
property, 264176 to 262768, diff: -1408, 0.532978014657%
functools.partial, 11704 to 10296, diff: -1408, 12.030075188%
ImmutableList, 43192 to 42104, diff: -1088, 2.51898499722%
Interesting non-default classes: BlockKey, BlockData, EditInfo, DatastoreNode, LibraryLocator (ugh), CourseEnvelope, DatabaseNode, SplitMongoIdManager, CachingDescriptorSystem, CourseLocator.
collections.deque and thread.lock are also types I'd expect to be fairly uncommon, so they could prove useful as leads.
https://github.com/edx/edx-platform/blob/master/openedx/core/lib/graph_traversals.py is apparently the only place we use collections.deque.
DatabaseNode and DatastoreNode are from newrelic monitoring
https://github.com/edmorley/newrelic-python-agent/blob/5f4aaf1834f01df0ecc604de621a255f087262df/newrelic/newrelic/core/datastore_node.py#L15
https://github.com/edmorley/newrelic-python-agent/blob/5f4aaf1834f01df0ecc604de621a255f087262df/newrelic/newrelic/core/database_node.py#L42
Current thread I'm tugging: CachingDescriptorSystem, CourseEnvelope, and SplitMongoIdManager all have exactly 2927 instances, which is reasonably close to the "every library on prod" number from my previous investigations (~2750 by my memory). There are also 3159 LibraryLocators. If the problem we're seeing is anything like the previous one, understanding the relationships between these classes will be key.
Another fun fact: there are only 13 deque instances, despite collections.deque being reported as a very large diff between processes. I'd bet one of them grew out of control.
Also, there are 3127 thread.lock instances, which seems interesting.
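A minimal sketch of how per-type instance counts like these can be pulled out of a dump, assuming the dump is JSON-lines with a `type` and a `size` field on every record (that layout is my assumption here, and `summarize_types` is a hypothetical helper, not something from the scripts repo). Tracking the largest single object per type is what would show whether one of the 13 deques is holding most of the bytes.

```python
import json
from collections import defaultdict


def summarize_types(lines):
    """Return {type_name: {"count", "total", "largest"}} for JSON-lines records.

    Assumes each non-empty line is a JSON object with "type" and "size" keys.
    """
    stats = defaultdict(lambda: {"count": 0, "total": 0, "largest": 0})
    for line in lines:
        line = line.strip()
        if not line:
            continue
        obj = json.loads(line)
        entry = stats[obj["type"]]
        entry["count"] += 1
        entry["total"] += obj["size"]
        entry["largest"] = max(entry["largest"], obj["size"])
    return dict(stats)


if __name__ == "__main__":
    # Made-up records purely for illustration; real dumps are much larger.
    sample = [
        '{"address": 1, "type": "collections.deque", "size": 90000}',
        '{"address": 2, "type": "collections.deque", "size": 624}',
        '{"address": 3, "type": "thread.lock", "size": 32}',
    ]
    for name, entry in sorted(summarize_types(sample).items()):
        print(name, entry)
```

A type with a small `count` but a `largest` close to its `total` points at one runaway object rather than many small ones.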
Current working theory: we are constructing too many CachingDescriptorSystem objects. As seen here, that would explain why that class, CourseEnvelope, and SplitMongoIdManager have so many instances. So who is creating 3k CachingDescriptorSystems?
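One way to chase the "who is holding these" question is to invert the reference graph and tally referrers by type. This is only a sketch under assumptions: it presumes JSON-lines records with `address`, `type`, and `refs` (a list of referenced addresses), and `referrer_types` is a hypothetical name. It shows who references the objects in the dump, which is a hint about who created them, not proof.

```python
import json
from collections import Counter


def referrer_types(lines, target_type):
    """Count, by referrer type, the references pointing at target_type objects.

    Assumes each record has "address", "type", and optionally "refs".
    """
    objs = [json.loads(line) for line in lines if line.strip()]
    type_by_addr = {obj["address"]: obj["type"] for obj in objs}
    counts = Counter()
    for obj in objs:
        for ref in obj.get("refs", []):
            if type_by_addr.get(ref) == target_type:
                counts[obj["type"]] += 1
    return counts


if __name__ == "__main__":
    # Made-up records purely for illustration.
    sample = [
        '{"address": 10, "type": "CachingDescriptorSystem", "size": 64, "refs": []}',
        '{"address": 11, "type": "CachingDescriptorSystem", "size": 64, "refs": []}',
        '{"address": 20, "type": "dict", "size": 280, "refs": [10, 11]}',
        '{"address": 21, "type": "CourseEnvelope", "size": 72, "refs": [10]}',
    ]
    print(referrer_types(sample, "CachingDescriptorSystem").most_common())
```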
PR to fix here: https://github.com/edx/edx-platform/pull/15673
Basically, https://github.com/edx/edx-platform/pull/15501 had an implementation bug, so global staff users were still loading every library that exists into memory when opening the studio homepage.
Tools used are now up on GitHub instead of hiding on my hard drive: https://github.com/efischer19/meliae_scripts
For reference, here's the same process applied to the dumps feanil provided before, when one cms process was actively leaking and the other wasn't.
Note that in both lists, I truncated all types where [(large size) - (small size)] < 1000.
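For anyone wanting to reproduce lists in this shape, here is a rough sketch of the comparison and the truncation rule. It assumes you already have two `{type_name: total_bytes}` dicts, one per process; `diff_lines` is a hypothetical helper and not necessarily how the linked scripts do it. The percentage is the shrink relative to the large process, which matches the numbers above (for example, Markup going from 936488 to 0 is 100.0%).

```python
def diff_lines(large, small, threshold=1000):
    """Format per-type size differences, largest shrink first.

    large/small map type name -> total bytes in the big and small process.
    Types where (large size) - (small size) < threshold are dropped.
    """
    rows = []
    for name, big in large.items():
        little = small.get(name, 0)  # type absent from the small process
        shrink = big - little
        if shrink < threshold or big <= 0:
            continue
        pct = 100.0 * shrink / big
        text = "%s, %d to %d, diff: %d, %s%%" % (name, big, little, -shrink, pct)
        rows.append((shrink, text))
    rows.sort(key=lambda row: row[0], reverse=True)
    return [text for _, text in rows]


if __name__ == "__main__":
    # A few totals copied from the list above, plus one made-up small type.
    large = {"OSFS": 187328, "Markup": 936488, "tiny_example": 500}
    small = {"OSFS": 128, "tiny_example": 100}
    for line in diff_lines(large, small):
        print(line)
```

Types that only exist in the large process are treated as size 0 in the small one, which is why they show up as a 100.0% reduction.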