csabahenk/00-debugging-detached-python.md

## 00-debugging-detached-python.md

      
# Debugging detached Python

Contents


- Introduction
- Remote Pdb
- Stack printing
- Debugging Python with Gdb


## Introduction

What we'd like to do: debug Python software the customary way, i.e. break
into pdb, the Python debugger, and investigate the live code.

Why we can't do that: the way we run the code, standard input is not a tty,
and Pdb assumes interaction via a terminal.
(Note: the other option would be to force the code to run in a terminal.
That is worth exploring, but here we take a different route.)

What we will do: find alternative ways of debugging and introspection
that do not rely on stdin.
(Note: in our case stdin is problematic, but stdout is easy to see.
If that does not hold for you and you still want to apply these techniques,
replace the print statements with whatever logging mechanism is available
to you.)
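
To check whether a given process is affected, you can ask the interpreter whether its stdin is a terminal. The following is a minimal sketch; the helper name `can_use_plain_pdb` is made up for this illustration and is not part of pdb.

```python
import sys


def can_use_plain_pdb(stream=None):
    """Return True if `stream` (default: sys.stdin) is attached to a terminal."""
    stream = sys.stdin if stream is None else stream
    # Replaced or missing streams may lack isatty(); treat them as non-interactive.
    isatty = getattr(stream, "isatty", None)
    try:
        return bool(isatty and isatty())
    except ValueError:
        # isatty() raises ValueError on a closed file object.
        return False
```

If this returns False for your program, stock `pdb.set_trace()` will not give you a usable prompt, and the techniques below apply.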

We use Python 2.7.

## Remote Pdb
A hack to get at a networked Pdb session, useful when stdin is not a tty.

Place the attached `rdb.py` (adapted from Celery's `celery.contrib.rdb`,
with some adjustments) somewhere on your `$PYTHONPATH`. You can then do
`import rdb; rdb.set_trace()` just like with stock pdb. It prints the port
on which the debug session is spawned, like `PDB listening on 6902` (if you
don't see stdout, you can try to find the port with lsof(8) and similar
tools). Then you can just `telnet localhost 6902`.

Issues:

- no readline support (you can add it externally with rlwrap)
- no permanent session: if you set a breakpoint and press `c`, the connection
  drops and the follow-up break spawns on stdin, not on the network

However, it allegedly supports multiple sessions, i.e. if the program hits
`set_trace` multiple times, a new rdb server spawns for each (I haven't
tried this).

Another take on remoting Pdb is Rpdb (thanks to Prasanth Pai for the hint).
I found that it is not perfect either: it has similar but slightly different
issues. You can give it a try.
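
The core idea behind such tools is small: `pdb.Pdb` accepts `stdin` and `stdout` arguments, so they can be pointed at a TCP connection. The sketch below illustrates this under simplifying assumptions; the function names `open_debug_listener` and `pdb_for_next_client` are invented here, and the attached `rdb.py` does more (session teardown, restoring `sys.stdin` and `sys.stdout`).

```python
import pdb
import socket


def open_debug_listener(host="127.0.0.1", base_port=6899, search_limit=100):
    """Listen on the first free TCP port at or above base_port."""
    for offset in range(search_limit):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, base_port + offset))
        except socket.error:
            # Port taken (or otherwise unusable): try the next one.
            sock.close()
            continue
        sock.listen(1)
        return sock, base_port + offset
    raise RuntimeError("no free port in range")


def pdb_for_next_client(listener):
    """Block until somebody connects, then return (debugger, connection)."""
    connection, _address = listener.accept()
    # One file object serves as both stdin and stdout of the debugger.
    handle = connection.makefile("rw")
    return pdb.Pdb(stdin=handle, stdout=handle), connection
```

Calling `set_trace()` on the returned debugger then behaves like stock pdb, except that the prompt appears in the telnet session instead of on the program's own terminal.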

## Stack printing
Put the following snippet into your code:

    import sys
    import threading
    import traceback

    def dumpstacks(signum=None, frame=None):
        """Print the current stack of every thread to stdout."""
        # Map thread ids to names so the dump is easier to read.
        id2name = dict([(th.ident, th.name) for th in threading.enumerate()])
        code = []
        for threadId, stack in sys._current_frames().items():
            code.append("\n# Thread: %s(%d)" % (id2name.get(threadId, ""), threadId))
            for filename, lineno, name, line in traceback.extract_stack(stack):
                code.append('File: "%s", line %d, in %s' % (filename, lineno, name))
                if line:
                    code.append("  %s" % (line.strip()))
        # A single parenthesized argument also works with the 2.7 print statement.
        print("\n".join(code))

Then you can just call `dumpstacks()` to get a stack trace printed to stdout.
Additionally, if you set

    import signal
    signal.signal(signal.SIGUSR1, dumpstacks)

somewhere in the main code path (i.e. in code that gets called on program
startup), you can get a stack trace at any point by sending SIGUSR1 to your
program.
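
If stdout is not visible in your setup, a variant that returns the dump as a string and writes it to a stream of your choice may be handier. This is only a sketch: the names `format_stacks` and `install_dump_handler` are made up for the illustration, and it assumes a POSIX system where `signal.SIGUSR1` exists.

```python
import signal
import sys
import threading
import traceback


def format_stacks():
    """Return the stacks of all live threads as one string."""
    names = dict((th.ident, th.name) for th in threading.enumerate())
    out = []
    for thread_id, frame in sys._current_frames().items():
        out.append("# Thread: %s(%d)" % (names.get(thread_id, ""), thread_id))
        # format_stack() yields ready-made 'File "...", line N, in f' entries.
        out.extend(entry.rstrip() for entry in traceback.format_stack(frame))
    return "\n".join(out)


def install_dump_handler(stream=None):
    """Dump all stacks to `stream` (default: stderr) whenever SIGUSR1 arrives.

    Must be called from the main thread; signal.signal() enforces that.
    """
    def handler(signum, frame):
        target = sys.stderr if stream is None else stream
        target.write(format_stacks() + "\n")
        target.flush()
    signal.signal(signal.SIGUSR1, handler)
```

Once `install_dump_handler()` has run, `kill -USR1 <pid>` from a shell triggers a dump without touching the program's stdin.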
The most convenient way to set up both is the sitecustomize/usercustomize
feature of Python, which lets you specify code that is loaded in every
Python program (unless you explicitly disable it with the interpreter's `-S`
option), i.e. it is always in the main code path.

Just create a `sitecustomize.py` file with the above content in your Python
site dir (installation and version dependent, something like
`/usr/lib/python2.7/site-packages/`). Then the SIGUSR1 stack printing is
always enabled, while in code you can get a stack dump with
`from sitecustomize import dumpstacks; dumpstacks()`.
(Courtesy of.)
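
Since the site dir differs between installations, it can help to ask the interpreter where it looks. The sketch below uses the standard `site` module; the helper name `customize_candidates` is invented for this example, and `site.getsitepackages()` may be missing inside environments created by older virtualenv releases.

```python
import os
import site


def customize_candidates():
    """List paths where sitecustomize.py / usercustomize.py would be picked up."""
    paths = []
    # Global site-packages directories: the place for sitecustomize.py.
    for directory in site.getsitepackages():
        paths.append(os.path.join(directory, "sitecustomize.py"))
    # Per-user site-packages directory: the place for usercustomize.py.
    paths.append(os.path.join(site.getusersitepackages(), "usercustomize.py"))
    return paths
```

Run it with the same interpreter that runs your program, since each installation has its own set of directories.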

## Debugging Python with Gdb
I'll provide the instructions in two flavors:

- Fedora (tested with 19)
- general instructions

On Fedora, support for this feature is nicely built in. In general, you
have to compile a suitable Python yourself and make some additional
adjustments.
### Getting at a debug-enabled Python

- on Fedora:

      # yum install yum-utils
      # debuginfo-install python

- in general: Python follows the standard autotools build procedure of
  `./configure && make && make install`. Perform the build with one change:
  replace the plain `make` invocation with `make OPT="-ggdb -O0"`. If you are
  building through a package/build manager, make sure it does not strip the
  binaries (e.g. on Arch Linux, if you build using the python2 PKGBUILD,
  add `'!strip'` to the `options` array).

Note: on RHEL/CentOS, similarly to Fedora, a debuginfo package is available.
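
To see which flags a given interpreter was actually built with, you can query the standard `sysconfig` module. This is a rough heuristic sketch, with helper names (`build_flags`, `looks_gdb_friendly`) made up for the example. It inspects compiler flags only, so it cannot tell whether separate debuginfo packages are installed or whether the binary was stripped afterwards.

```python
import sysconfig


def build_flags():
    """Return the compiler flags recorded when this interpreter was built."""
    # get_config_var() returns None for variables the build did not record.
    return " ".join(
        sysconfig.get_config_var(name) or "" for name in ("OPT", "CFLAGS")
    )


def looks_gdb_friendly(flags):
    """Heuristic: debug info requested and optimization switched off."""
    words = flags.split()
    has_debug_info = any(w.startswith("-g") for w in words)
    # With gcc-style compilers the last -O option on the command line wins.
    opt_levels = [w for w in words if w.startswith("-O")]
    unoptimized = not opt_levels or opt_levels[-1] == "-O0"
    return has_debug_info and unoptimized
```

For example, `looks_gdb_friendly(build_flags())` should be true for a Python built with `make OPT="-ggdb -O0"` as described above.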
### Debugging Python: the basics

- on Fedora: it just works as is. Run the Python script under Gdb (either
  `gdb --args python <script>` or `gdb -p <pid-of-running-script>`) and
  you'll have access to the `py-*` commands, like `py-bt` to show a Python
  backtrace.
- in general:

  1. Make a note of the location of the Python source tree.
  2. Add the following to your `~/.gdbinit`, with the path pointing into
     that source tree:

         define py-load
         python import sys; sys.path.insert(0, "<path-to-python-source>/Tools/gdb/"); import libpython
         end

  3. Run your Python script under Gdb, as discussed above. When you drop to
     the Gdb prompt for the first time, type `py-load`, which loads the
     Python support routines. (Note: I tried to have them loaded
     automatically from `~/.gdbinit`, but then they did not work properly.
     Most likely they presuppose that the Python debug symbols are already
     available. If you load them only from the prompt, this condition is
     fulfilled by that time.)

Note: on RHEL/CentOS it seems that the `py-*` commands are not integrated into
the build, so you have to follow the general instructions. You can get the
Python source by fetching the SRPM (cf. yumdownloader(1)), or you can get
`libpython.py` right from the Python source repository, where it lives under
`Tools/gdb/`.
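
Because Gdb can run commands non-interactively, a Python backtrace of a detached process can also be fetched from a script. The sketch below builds such a command line using Gdb's `-batch`, `-p` and `-ex` options; the helper names are invented for this example, and it assumes the `py-bt` command is available (on a self-built Python, pass `("py-load", "py-bt")` as the commands).

```python
import subprocess


def gdb_batch_argv(pid, commands=("py-bt",)):
    """Build a gdb command line that attaches to `pid`, runs `commands`, and exits."""
    argv = ["gdb", "-batch", "-p", str(pid)]
    for command in commands:
        # Each -ex argument is executed as if typed at the gdb prompt.
        argv.extend(["-ex", command])
    return argv


def python_backtrace(pid):
    """Return gdb's output for the given process (needs permission to attach)."""
    return subprocess.check_output(gdb_batch_argv(pid), stderr=subprocess.STDOUT)
```

Keeping the argument construction separate from the call makes it easy to print the command and run it by hand first.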
### The Gdb Python routines

This is an older mechanism that predates Python scripting support in Gdb: a
collection of routines written directly in Gdb's command language to extract
information from the Python VM's internal data structures. They provide the
`py*` commands (i.e. prefixed with "py" but without a hyphen, like `pystack`).
They are considered deprecated, but are of interest to us for two reasons:

- Their output contains less information. That is an advantage if we
  want terse output that is easy on the eye.
- If we want to add some convenience commands of our own, they serve as a
  good reference.

The routines are included in the Python source repo as `Misc/gdbinit`. To use
them, download the file and either add its contents to `~/.gdbinit` or keep
it separate and pull it in with

    source <path-to-downloaded-file>

### Debugging Python: beef it up

At this point we have basic introspection capabilities for the Python runtime,
but we still can't do things that are considered basic for a debugger, most
notably breaking and stepping. That's what we want to achieve.
#### Breaking

Playing around, one can see that the C function that carries out the invocation
of Python functions is called `PyEval_EvalFrameEx`. Looking into the Gdb Python
routines, we can see how to extract the function name and file from the
parameters of `PyEval_EvalFrameEx`. Thus we can put together the following
command:

    define pybr
      if $argc == 1
        break PyEval_EvalFrameEx if strcmp((char *)(*(PyStringObject*)f.f_code.co_name).ob_sval, $arg0) == 0
      end
      if $argc == 2
        break PyEval_EvalFrameEx if strcmp((char *)(*(PyStringObject*)f.f_code.co_name).ob_sval, $arg0) == 0 && \
                                    strcmp((char *)(*(PyStringObject*)f.f_code.co_filename).ob_sval, $arg1) == 0
      end
    end
    document pybr
      Python break
    end

(This is, needless to say, suggested for inclusion in `~/.gdbinit` or some
other Gdb command file you would source.)

So the first argument of `pybr` is the function to break at; the second,
optional argument is the name of the file that contains the function. Note
that the arguments must be passed as strings and not as identifiers, for
example `pybr "GET"` or `pybr "GET" "monkeyserver.py"`. Another caveat is
whether to use absolute or relative filenames: that may depend on how the
program was invoked. You can discover the actual file naming convention by
checking the output of `py-bt` or `pystack`.
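
If you drive Gdb from a script rather than from `~/.gdbinit`, the same breakpoint can be generated as a plain command string. The helper below is a hypothetical convenience, not part of Gdb or Python. It mirrors the condition used in `pybr`, so it shares its assumption of a Python 2 target, where names are stored as `PyStringObject`.

```python
def pybr_command(func_name, filename=None):
    """Return a gdb command equivalent to the pybr macro, for Python 2.7 targets."""
    for text in (func_name, filename or ""):
        if '"' in text or "\\" in text:
            raise ValueError("quotes and backslashes are not supported: %r" % text)

    def name_equals(field, text):
        # Python 2 str objects keep their bytes in PyStringObject.ob_sval.
        return ('strcmp((char *)(*(PyStringObject*)f.f_code.%s).ob_sval, "%s") == 0'
                % (field, text))

    condition = name_equals("co_name", func_name)
    if filename is not None:
        condition += " && " + name_equals("co_filename", filename)
    return "break PyEval_EvalFrameEx if " + condition
```

For instance, `pybr_command("GET", "monkeyserver.py")` yields a single-line `break` command that can be passed to Gdb with `-ex`.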
#### Stepping

Given that hitting a Python function means hitting `PyEval_EvalFrameEx` in the
C runtime, I suggest the following practice for stepping in Python code:

- when you want to start stepping, do `break PyEval_EvalFrameEx`
  (make a note of the index of this breakpoint)
- just hit `c` (continue) to step forward
- if you want to let the Python code run on, disable this breakpoint with
  `dis <index-of-breakpoint>` and then hit `c`
- if you want to step in Python again, enable the breakpoint with
  `en <index-of-breakpoint>`

Note that each stop corresponds to entering a Python frame, so this steps from
call to call rather than from line to line.

In practice (if no other automatic breakpoint setting interferes) you can add

    break PyEval_EvalFrameEx
    disable 1

to your `~/.gdbinit`, so that the `PyEval_EvalFrameEx` breakpoint has index 1
and is disabled on start; then you can enable Python stepping with `en 1` and
disable it with `dis 1`.
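
To practice breaking and stepping without risking a real service, a throwaway target program is handy. The script below is a made-up example (the names `tick`, `run` and `attach_hint` are arbitrary): it calls a trivially named function over and over, so there is always a Python frame to break on.

```python
import os
import time


def tick(counter):
    """Trivial function to break on, e.g. with pybr "tick"."""
    return counter + 1


def run(iterations, delay=1.0):
    """Call tick() `iterations` times, sleeping `delay` seconds between calls."""
    counter = 0
    for _ in range(iterations):
        counter = tick(counter)
        time.sleep(delay)
    return counter


def attach_hint():
    """Return the gdb invocation that attaches to this very process."""
    return "gdb -p %d" % os.getpid()
```

Print `attach_hint()` and then call something like `run(600)` at the end of the script; that leaves about ten minutes to attach from another terminal.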

  
## rdb.py
# -*- coding: utf-8 -*-
"""
celery.contrib.rdb
==================

Remote debugger for Celery tasks running in multiprocessing pool workers.
Inspired by http://snippets.dzone.com/posts/show/7248

**Usage**

.. code-block:: python

    from celery.contrib import rdb
    from celery import task

    @task()
    def add(x, y):
        result = x + y
        rdb.set_trace()
        return result


**Environment Variables**

.. envvar:: CELERY_RDB_HOST

    Hostname to bind to.  Default is '127.0.0.1', which means the socket
    will only be accessible from the local host.

.. envvar:: CELERY_RDB_PORT

    Base port to bind to.  Default is 6899.
    The debugger will try to find an available port starting from the
    base port.  The selected port will be logged by the worker.

"""
from __future__ import absolute_import, print_function

import errno
import os
import socket
import sys

from pdb import Pdb

####
from contextlib import contextmanager

def get_errno_name(n):
    """Get errno for string, e.g. ``ENOENT``."""
    if isinstance(n, basestring):
        return getattr(errno, n)
    return n


@contextmanager
def ignore_errno(*errnos, **kwargs):
    """Context manager to ignore specific POSIX error codes.

    Takes a list of error codes to ignore, which can be either
    the name of the code, or the code integer itself::

        >>> with ignore_errno('ENOENT'):
        ...     with open('foo', 'r') as f:
        ...         return f.read()

        >>> with ignore_errno(errno.ENOENT, errno.EPERM):
        ...    pass

    :keyword types: A tuple of exceptions to ignore (when the errno matches),
                    defaults to :exc:`Exception`.
    """
    types = kwargs.get('types') or (Exception, )
    errnos = [get_errno_name(errno) for errno in errnos]
    try:
        yield
    except types as exc:
        if not hasattr(exc, 'errno'):
            raise
        if exc.errno not in errnos:
            raise
####


default_port = 6899

CELERY_RDB_HOST = os.environ.get('CELERY_RDB_HOST') or '127.0.0.1'
CELERY_RDB_PORT = int(os.environ.get('CELERY_RDB_PORT') or default_port)

#: Holds the currently active debugger.
_current = [None]

_frame = getattr(sys, '_getframe')

NO_AVAILABLE_PORT = """\
{self.ident}: Couldn't find an available port.

Please specify one using the CELERY_RDB_PORT environment variable.
"""

BANNER = """\
{self.ident}: Please telnet into {self.host} {self.port}.

Type `exit` in session to continue.

{self.ident}: Waiting for client...
"""

SESSION_STARTED = '{self.ident}: Now in session with {self.remote_addr}.'
SESSION_ENDED = '{self.ident}: Session with {self.remote_addr} ended.'


class Rdb(Pdb):
    me = 'Remote Debugger'
    _prev_outs = None
    _sock = None

    def __init__(self, host=CELERY_RDB_HOST, port=CELERY_RDB_PORT,
                 port_search_limit=100, port_skew=+0, out=sys.stdout):
        self.active = True
        self.out = out

        self._prev_handles = sys.stdin, sys.stdout

        self._sock, this_port = self.get_avail_port(
            host, port, port_search_limit, port_skew,
        )
        self._sock.setblocking(1)
        self._sock.listen(1)
        self.ident = '{0}:{1}'.format(self.me, this_port)
        self.host = host
        self.port = this_port
        self.say(BANNER.format(self=self))

        self._client, address = self._sock.accept()
        self._client.setblocking(1)
        self.remote_addr = ':'.join(str(v) for v in address)
        self.say(SESSION_STARTED.format(self=self))
        self._handle = sys.stdin = sys.stdout = self._client.makefile('rw')
        Pdb.__init__(self, completekey='tab',
                     stdin=self._handle, stdout=self._handle)

    def get_avail_port(self, host, port, search_limit=100, skew=+0):
        this_port = None
        for i in range(search_limit):
            _sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            this_port = port + skew + i
            try:
                _sock.bind((host, this_port))
            except socket.error as exc:
                if exc.errno in [errno.EADDRINUSE, errno.EINVAL]:
                    continue
                raise
            else:
                print('PDB listening on %d' % this_port)
                return _sock, this_port
        else:
            raise Exception(NO_AVAILABLE_PORT.format(self=self))

    def say(self, m):
        print(m, file=self.out)

    def _close_session(self):
        self.stdin, self.stdout = sys.stdin, sys.stdout = self._prev_handles
        self._handle.close()
        self._client.close()
        self._sock.close()
        self.active = False
        self.say(SESSION_ENDED.format(self=self))

    def do_continue(self, arg):
        self._close_session()
        self.set_continue()
        return 1
    do_c = do_cont = do_continue

    def do_quit(self, arg):
        self._close_session()
        self.set_quit()
        return 1
    do_q = do_exit = do_quit

    def set_trace(self, frame=None):
        if frame is None:
            frame = _frame().f_back
        with ignore_errno(errno.ECONNRESET):
            Pdb.set_trace(self, frame)

    def set_quit(self):
        # this raises a BdbQuit exception that we are unable to catch.
        sys.settrace(None)


def debugger():
    """Returns the current debugger instance (if any),
    or creates a new one."""
    rdb = _current[0]
    if rdb is None or not rdb.active:
        rdb = _current[0] = Rdb()
    return rdb


def set_trace(frame=None):
    """Set breakpoint at current location, or a specified frame"""
    if frame is None:
        frame = _frame().f_back
    return debugger().set_trace(frame)