@simonw
Last active January 16, 2024 08:13
How to recover lost Python source code if it's still resident in-memory

I screwed up using git ("git checkout --" on the wrong file) and threw away the uncommitted code I had just written... but that code was still running in a process inside a Docker container. Here's how I got it back, using https://pypi.python.org/pypi/pyrasite/ and https://pypi.python.org/pypi/uncompyle6

Attach a shell to the Docker container (for example with docker exec -it <container> bash, if the image ships bash)

Install GDB (needed by pyrasite)

apt-get update && apt-get install gdb

Install pyrasite - this will let you attach a Python shell to the still-running process

pip install pyrasite

Install uncompyle6, which will let you get Python source code back from in-memory code objects

pip install uncompyle6

Find the PID of the process that is still running

ps aux | grep python
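If ps is not available in a minimal container image, the same information can usually be read from /proc on Linux. The following is only a sketch under that assumption; the helper name python_processes and its proc_root parameter are made up for illustration and are not part of pyrasite or any other tool mentioned here.

```python
import os


def python_processes(proc_root="/proc"):
    """Return sorted (pid, command line) pairs whose command line mentions python."""
    found = []
    if not os.path.isdir(proc_root):
        return found  # not a Linux-style /proc, nothing to scan
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directory names are PIDs
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as handle:
                raw = handle.read()
        except (IOError, OSError):
            continue  # the process exited or is not readable by this user
        # Arguments in /proc/<pid>/cmdline are separated by NUL bytes.
        args = [part.decode("utf-8", "replace") for part in raw.split(b"\0") if part]
        cmdline = " ".join(args)
        if "python" in cmdline:
            found.append((int(entry), cmdline))
    return sorted(found)
```

Calling python_processes() and printing each pair gives roughly what the grep above shows, minus the extra ps columns.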

Attach an interactive prompt using pyrasite

pyrasite-shell <PID>

Now you're in an interactive prompt! Import the code you need to recover

>>> from my_package import my_module

Figure out which functions and classes you need to recover

>>> dir(my_module)
['MyClass', 'my_function']

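Decompile the function into source code. The first argument is the bytecode version (2.7 here); func_code is the Python 2 attribute that holds a function's code object (Python 3 calls it __code__)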

>>> import uncompyle6
>>> import sys
>>> uncompyle6.main.uncompyle(
    2.7, my_module.my_function.func_code, sys.stdout
)
# uncompyle6 version 2.9.10
# Python bytecode 2.7
# Decompiled from: Python 2.7.12 (default, Nov 19 2016, 06:48:10) 
# [GCC 5.4.0 20160609]
# Embedded file name: /srv/my_package/my_module.py
function_body = "appears here"

For the class, you'll need to decompile each method in turn (im_func unwraps the Python 2 method object to get at the underlying function)

>>> uncompyle6.main.uncompyle(
    2.7, my_module.MyClass.my_method.im_func.func_code, sys.stdout
)
# uncompyle6 version 2.9.10
# Python bytecode 2.7
# Decompiled from: Python 2.7.12 (default, Nov 19 2016, 06:48:10) 
# [GCC 5.4.0 20160609]
# Embedded file name: /srv/my_package/my_module.py
class_method_body = "appears here"
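
Doing this by hand gets tedious for a module with many functions and methods. A minimal sketch of collecting every code object first, assuming the module imports cleanly inside the attached shell: collect_code_objects is a hypothetical helper name, and it uses __code__, which exists on Python 2.6+ as well as Python 3. Properties and other descriptors are not unwrapped here.

```python
import inspect


def collect_code_objects(module):
    """Map names to code objects for functions and methods defined in `module`."""
    found = {}
    for name, obj in vars(module).items():
        # Skip anything that was merely imported into the module.
        if getattr(obj, "__module__", None) != module.__name__:
            continue
        if inspect.isfunction(obj):
            found[name] = obj.__code__
        elif inspect.isclass(obj):
            for attr, member in vars(obj).items():
                # staticmethod/classmethod wrappers expose the function as __func__.
                func = getattr(member, "__func__", member)
                if inspect.isfunction(func):
                    found["%s.%s" % (name, attr)] = func.__code__
    return found
```

Each value returned by collect_code_objects(my_module) can then be passed as the second argument to the uncompyle6.main.uncompyle call shown above.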
@ancat

ancat commented Mar 12, 2017

@i336 what you're likely seeing is the buffer of the interactive shell history. Testing python <file> on a file that gets deleted yields "random" code fragments (or the entirety, for very tiny programs) here and there but not the entire source. I used gdb to search across the entirety of memory space and couldn't recover the source code for any programs larger than a few lines.

@tleeuwenburg

For what it's worth, I did something similar recently with git and went a different path to recovery based on 'git fsck' and retrieving the files from hashed objects stored in git. Kudos to your fantastic recovery strategy though!
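
A rough sketch of that git-based route, for comparison. It assumes the lost content was staged with git add at some point (otherwise git never stored it as an object) and that git is on the PATH. The helper names are made up for illustration; the git subcommands used are git fsck --lost-found and git cat-file -p.

```python
import subprocess


def parse_dangling_blobs(fsck_output):
    """Return the hashes from 'dangling blob <hash>' lines of git fsck output."""
    hashes = []
    for line in fsck_output.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[:2] == ["dangling", "blob"]:
            hashes.append(parts[2])
    return hashes


def dump_dangling_blobs(repo_dir="."):
    """Print every dangling blob so the lost file can be spotted by eye."""
    output = subprocess.check_output(["git", "fsck", "--lost-found"], cwd=repo_dir)
    for sha in parse_dangling_blobs(output.decode("utf-8", "replace")):
        print("==== %s ====" % sha)
        blob = subprocess.check_output(["git", "cat-file", "-p", sha], cwd=repo_dir)
        print(blob.decode("utf-8", "replace"))
```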

@odino

odino commented Mar 24, 2017

What about docker cp? :)

@seralf

seralf commented Mar 25, 2017

well done! :-)

@prem-narain

Awesome !! Thanks !!

@davidtgq

davidtgq commented May 2, 2017

Stuck on this step: pyrasite-shell <PID>. I just get a blank line. If I type any command and press enter, nothing happens.

@smiddela

smiddela commented Mar 8, 2018

Me too, same problem: stuck on the pyrasite-shell step, where I just get a blank line.

@HolyShitMan

@davidtgq @smiddela:
Did you install gdb? It's needed to run pyrasite.

@richard-scott

I saw this in this issue; it said to try running the following before pyrasite-shell:

echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope

It helped me.
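
Before changing the setting it can help to see what it currently is. A small sketch, assuming a Linux kernel with the Yama security module (the file does not exist otherwise); read_ptrace_scope and PTRACE_SCOPE_MEANING are illustrative names, and the descriptions paraphrase the kernel's Yama documentation.

```python
PTRACE_SCOPE_MEANING = {
    0: "classic: a process may attach to other processes running under the same uid",
    1: "restricted: only descendants may be attached to, unless CAP_SYS_PTRACE is held",
    2: "admin-only: attaching requires CAP_SYS_PTRACE",
    3: "no attach: attaching is disabled and the value cannot be changed back",
}


def read_ptrace_scope(path="/proc/sys/kernel/yama/ptrace_scope"):
    """Return the current ptrace_scope as an int, or None if it cannot be read."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (IOError, OSError, ValueError):
        return None
```

A value of 1 is a likely reason gdb (and therefore pyrasite) cannot attach to a process it did not start; the echo 0 command above switches back to the classic behaviour until the next reboot.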

@govcert-ch

I also have the freezing problem, but changing ptrace_scope did not help (this is on Ubuntu 18.04). The debug output (with verbose=True added to the inject call) says:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007fc582b13ff7 in __GI___select (nfds=0, readfds=0x0, writefds=0x0, exceptfds=0x0, timeout=0x7ffc162924a0) at ../sysdeps/unix/sysv/linux/select.c:41

41      ../sysdeps/unix/sysv/linux/select.c: No such file or directory.
'PyGILState_Ensure' has unknown return type; cast the call to its declared return type
'PyRun_SimpleString' has unknown return type; cast the call to its declared return type
History has not yet reached $1.

Any ideas what that means?

@user202729

Replying to @govcert-ch's freezing problem and debug output above: known bug. See lmacken/pyrasite#75 (comment).

@iPurya

iPurya commented May 5, 2021

I tried this with CPython: I can get shell access, but I can't read the code. Do you have any idea what to do in this situation?

@rodmur

rodmur commented Aug 4, 2022

Hi, this all appears to have changed for Python 3: the uncompyle6.main.uncompyle() function seems to be gone in favor of uncompyle6.main.decompile().

Also, what would "my_package" be called if you're just trying to recover a simple Python script with no package or module? __main__ doesn't appear to work.
