How to recover lost Python source code if it's still resident in-memory

I screwed up using git ("git checkout --" on the wrong file) and managed to delete the code I had just written... but it was still running in a process in a docker container. Here's how I got it back, using https://pypi.python.org/pypi/pyrasite/ and https://pypi.python.org/pypi/uncompyle6

Attach a shell to the docker container

Install GDB (needed by pyrasite)

apt-get update && apt-get install gdb

Install pyrasite - this will let you attach a Python shell to the still-running process

pip install pyrasite

Install uncompyle6, which will let you get Python source code back from in-memory code objects

pip install uncompyle6

Find the PID of the process that is still running

ps aux | grep python
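
If the grep output is noisy (it usually lists the grep process itself), the same filtering can be sketched in Python. This is only an illustration: the helper name python_pids is hypothetical, it assumes the standard eleven-column `ps aux` layout, and skipping any command that mentions "grep" is a rough heuristic rather than a rule.

```python
def python_pids(ps_output):
    """Return (pid, command) pairs for `ps aux` lines whose command mentions python."""
    matches = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        # COMMAND is the 11th column and may itself contain spaces.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        pid, command = int(fields[1]), fields[10]
        # Heuristic: drop the grep process that a `ps aux | grep python` would list.
        if "python" in command and "grep" not in command:
            matches.append((pid, command))
    return matches
```

The input could come from something like subprocess.check_output(["ps", "aux"]).decode(), run inside the container.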

Attach an interactive prompt using pyrasite

pyrasite-shell <PID>

Now you're in an interactive prompt! Import the code you need to recover

>>> from my_package import my_module

Figure out which functions and classes you need to recover

>>> dir(my_module)
['MyClass', 'my_function']

Decompile the function into source code

>>> import uncompyle6
>>> import sys
>>> uncompyle6.main.uncompyle(
    2.7, my_module.my_function.func_code, sys.stdout
)
# uncompyle6 version 2.9.10
# Python bytecode 2.7
# Decompiled from: Python 2.7.12 (default, Nov 19 2016, 06:48:10) 
# [GCC 5.4.0 20160609]
# Embedded file name: /srv/my_package/my_module.py
function_body = "appears here"

For the class, you'll need to decompile each method in turn

>>> uncompyle6.main.uncompyle(
    2.7, my_module.MyClass.my_method.im_func.func_code, sys.stdout
)
# uncompyle6 version 2.9.10
# Python bytecode 2.7
# Decompiled from: Python 2.7.12 (default, Nov 19 2016, 06:48:10) 
# [GCC 5.4.0 20160609]
# Embedded file name: /srv/my_package/my_module.py
class_method_body = "appears here"
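
Decompiling every function and method by hand gets tedious for a larger module. A minimal sketch of how the code objects could be collected in one pass follows; the helper name iter_code_objects is hypothetical, and it uses `__code__` (available on Python 2.6+ and 3) rather than the 2.7-only func_code / im_func spellings above. Each code object it yields could then be passed to the uncompyle6 call shown earlier.

```python
import inspect


def iter_code_objects(module):
    """Yield (qualified_name, code_object) for functions and methods defined in module."""
    for name, obj in sorted(vars(module).items()):
        if inspect.isfunction(obj) and obj.__module__ == module.__name__:
            yield name, obj.__code__
        elif inspect.isclass(obj) and obj.__module__ == module.__name__:
            for attr_name, attr in sorted(vars(obj).items()):
                # staticmethod/classmethod wrappers expose the plain function as __func__.
                func = getattr(attr, "__func__", attr)
                if inspect.isfunction(func):
                    yield "%s.%s" % (name, attr_name), func.__code__
```

Filtering on `__module__` skips names that were merely imported into the module, which dir() would also list. Nested functions, lambdas stored in data structures, and module-level statements are not covered by this sketch.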

steven-cutting Mar 11, 2017

I'll have to try this out just for the heck of it.

Pretty. Darn. Cool.

sksq96 Mar 11, 2017

Woah ! Awesome.

raplin Mar 11, 2017

Wow sweet had no idea you could attach a py shell to a running python process. Super handy! thx

JohannesBertens Mar 11, 2017

Nice tinkering!

Was it not possible to open a bash shell inside the running docker container and just access the Python script there? Or had it been overwritten?

NickSB2000 Mar 11, 2017

Excellent, this has the potential to eliminate a swear and/or impress a colleague.. :-)

Neko-Design Mar 12, 2017

Awesome! Given the stupid number of times I've done exactly this, I'm sure I'll get a chance to try it in anger soon enough.

i336 Mar 12, 2017

FYI, this feels incredibly complicated. Here's a much simpler method that universally applies to any process and will probably recover the original source, or very close to it - for example I used this approach to recover some text from a textbox in Chrome when an undo operation went awry recently. Using Python as an example:

$ python
>>> x = "QqWwEeRrTtYy"

(Leave that running, then...)

$ gdb -p $(pidof python)
...
0xb7414b08 in ___newselect_nocancel () from /lib/libc.so.6
(gdb) generate-core-file pythontest.dump
Saved corefile pythontest.dump
(gdb) quit
A debugging session is active.

        Inferior 1 [process 14970] will be detached.

Quit anyway? (y or n) y
Detaching from program: /usr/bin/python2.7, process 14970
$ grep -o QqWw pythontest.dump 
Binary file pythontest.dump matches
$ grep -ao QqWw pythontest.dump 
QqWw
QqWw
QqWw
QqWw
QqWw
QqWw
bash-4.3$ grep -a QqWw pythontest.dump 
...libxml2.ph....   >>>  = "QqWwEeRrTtYy   >> x = "QqWwEeRrTtYy"ntel        st-0x = "QqWwEeRrTtYy"
(...)
 = "QqWwEeRrTtYy" ··¸$.·xtermi336ÀÛr·åÿÿÿÿ Return a wrapped version of file which provides transparent
ÀÛr·ÿÿÿÿencodings.latin_1ÀÛr·3AÄencodings.latin_1É*·þÿÿÿ`är·\·\·L·þÿÿÿ`är·¬Ì(··H·è·ýÿÿÿ`är·dÙ·,з ÷r·1· ·ýÿÿÿ parse_and_bindacheÀÛr·WIoDread_history_fileÀÛr·ÓcVûwrite_history_fileÀÛrÉ3Ïget_completerÀÛr·73>get_completion_typeÀÛr·vÁÄremove_history_itemÀÛr·0Q¦set_startup_hookÀÛr·
.Öclear_historyÀÛr·Åù_READLINE_VERSION@·ÀÛr·ÿÿÿÿeRrTtYy"ÀÛr·
                                                            @Q£QqWwEeRrTtYyTtYy"
òlS·àSw·x = "QqWwEeRrTtYy"
$ 64;1;2;6;9;15;18;21;22c^C
$ ^C

Left in some of the binary asplosion for fun; this is a Unicode world now after all, it shouldn't cause any issues. As you can see, some of the data (a ridiculously small amount here) is mangled, but I see at least three intact copies of my original text. YMMV depending on what malloc implementation your app is using and how much fragmentation happened.
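
The grep -a step can also be sketched in Python (3.5+), which makes it easier to pull out whole printable runs instead of eyeballing binary noise. This is a rough illustration, not part of the method above: the function name find_text_fragments and the min_length parameter are made up for the example, and it only looks for ASCII text.

```python
import re


def find_text_fragments(dump_bytes, marker, min_length=8):
    """Return printable-ASCII runs in dump_bytes that contain marker (a bytes value)."""
    # A run is min_length or more consecutive tab/space/printable ASCII bytes.
    pattern = re.compile(rb"[\t\x20-\x7e]{%d,}" % min_length)
    return [
        match.group().decode("ascii")
        for match in pattern.finditer(dump_bytes)
        if marker in match.group()
    ]
```

For the dump above this might be called as find_text_fragments(open("pythontest.dump", "rb").read(), b"QqWw"). Reading the whole file into memory is fine for small dumps; a large core file would call for mmap or chunked reads instead.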

Here's one to file away if you frequently use Linux:

Configure enough swap space on your system, then in an absolute emergency open a terminal, run sync, and then run echo disk > /sys/power/state or pm-hibernate to trigger system hibernation. Of course, this process requires a full copy of memory to be written to the disk... :) Reboot your system off a flash drive for best results analysing the disk. WARNING: It feels horribly unintuitive, but you must sync your disk before hibernating unless you know you'll be able to successfully resume from the hibernated memory image, because hibernating means that whatever the filesystem was doing is immediately abandoned in-flight, with the idea that it will be finished when the system wakes back up! If you never resume, that in-memory filesystem data never makes it to disk. Ideally you'd copy the memory image somewhere and then resume from the hibernated image; it might be worth figuring out how to do that on your system.

And of course this is all because Linux doesn't provide arbitrary access to memory. Kinda crazy that it's not generally possible, but it's understandable.

ancat Mar 12, 2017

@i336 what you're likely seeing is the buffer of the interactive shell history. Testing python <file> on a file that gets deleted yields "random" code fragments (or the entirety, for very tiny programs) here and there but not the entire source. I used gdb to search across the entirety of memory space and couldn't recover the source code for any programs larger than a few lines.

tleeuwenburg Mar 15, 2017

For what it's worth, I did something similar recently with git and went a different path to recovery based on 'git fsck' and retrieving the files from hashed objects stored in git. Kudos to your fantastic recovery strategy though!

odino Mar 24, 2017

What about docker cp? :)

seralf Mar 25, 2017

well done! :-)

prem-narain May 1, 2017

Awesome !! Thanks !!

davidtgq May 2, 2017

Stuck on this step: pyrasite-shell <PID>. I just get a blank line. If I type any command and press enter, nothing happens.

smiddela Mar 8, 2018

Me too, same problem.

Stuck on this step: pyrasite-shell <PID>. I just get a blank line.

HolyShitMan May 11, 2018

@davidtgq @smiddela:
Did you install gdb? It's needed to run pyrasite.
