IPEP 1: Cleanup and extension of the Magic system in IPython

This document reviews the status of the magic command system in IPython and proposes an extension of magics to work in multiline contexts, at a 'cell' level. The most obvious use of this proposed extension will be the notebook, but the extension will similarly work in the Qt console and even at the terminal.

In the spirit of Python PEPs, this document is marked as IPEP 1, the first 'IPython Enhancement Proposal'.

Background

Since early in its history, IPython has had a system of 'magic' commands, which in its current incarnation uses an optional % prefix to indicate special commands that live in a separate namespace. These commands have single-line scope: when IPython encounters the line

%foo --flags  arguments

it checks whether foo is registered as a magic command and, if so, calls it with the entire rest of the line, as a single string, as the only argument to the magic.
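
As a rough illustration of this lookup-and-call step, here is a minimal sketch. The `line_magics` registry and the `dispatch_line` helper are hypothetical names invented for this example; they are not IPython's actual internals.

```python
# Hypothetical registry mapping magic names to callables (illustration only).
line_magics = {}


def dispatch_line(line):
    """Call the magic named at the start of `line` with the rest of the line."""
    # The % prefix is optional, so strip at most one leading sigil.
    if line.startswith('%'):
        line = line[1:]
    # Everything after the first space is handed over as one unparsed string.
    name, _, rest = line.partition(' ')
    magic = line_magics.get(name)
    if magic is None:
        raise KeyError('not a registered magic: ' + name)
    return magic(rest)


# A toy magic that simply echoes the string it receives.
line_magics['foo'] = lambda parameter_s: parameter_s
echoed = dispatch_line('%foo --flags  arguments')
```

Note that the magic receives the flags and arguments unparsed; any option handling is up to the magic itself.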

For historical reasons, implementation-wise the magic system is a fairly nasty hack: by default, a magic %foo is actually the method magic_foo of the main IPython InteractiveShell object, with signature magic_foo(self, parameter_s).

We offer a mechanism for users to register new magics defined from standalone functions, by using the define_magic method of the main IPython object, as follows:

# Define your function here
def foo_impl(self, parameter_s=''):
    'My very own magic! (Use docstrings, IPython reads them.)'

# Register it as a magic
get_ipython().define_magic('foo', foo_impl)

Over the years we have reduced to a minimum the intertwining of the various magic methods with the main object itself, hoping one day to completely separate the magics into standalone objects, thereby reducing significantly the footprint and complexity of the main object.

Specific goals

This proposal seeks to accomplish two main goals:

  1. Finish up the aforementioned separation of the magics away from the main IPython object. This will allow, amongst other good things, users to define their own magics by subclassing a lightweight base object. This is not possible today since the main magic object is enormous and contains every default magic method in its implementation.

  2. Extend the concept of magics to operate on multi-line blocks of text, introducing the concept of cell magics.

These two will be discussed separately, starting with the conceptually more interesting cell magics (goal #1 is mostly just an implementation cleanup).

Cell-level magics

We propose to introduce the concept of a cell-level magic, akin to how Sage uses the % syntax at the cell level. Sage uses the line-magic syntax from IPython in its notebook with a cell-wide meaning; here we propose to keep separate line- and cell-level magics, and our implementation will have a number of details different from how Sage does it. But the user-facing behavior will be very similar.

The idea is most easily illustrated with an example. Consider a cell (in the notebook or Qt console; we'll discuss later how this can work in terminal clients) that contains:

#!foo --flags  args
text - line 1
text - line 2
...
text - line N

In this case, if foo is a cell magic, it will be a function or method called with two arguments as:

foo('--flags  args', 'text - line 1\ntext - line 2\n...\ntext - line N')

That is, a cell magic will be passed as its first argument the (possibly empty) rest of the line on which it was called, and as its second argument the body of the cell, from the line after the first through the end.
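
The split described above can be sketched as follows. This is only an illustration under stated assumptions: the `cell_magics` registry and the `dispatch_cell` helper are hypothetical names, and `#!` is the sigil used in this section, whose final choice is discussed below.

```python
SIGIL = '#!'  # sigil used in this section; the final choice is still open

# Hypothetical registry: name -> callable(line, cell). Illustration only.
cell_magics = {}


def dispatch_cell(text):
    """Split a cell into its first line and body, then call the cell magic."""
    first, _, body = text.partition('\n')
    if not first.startswith(SIGIL):
        raise ValueError('cell does not start with a cell magic')
    # The magic name ends at the first space; the rest of the line is passed on.
    name, _, line = first[len(SIGIL):].partition(' ')
    return cell_magics[name](line, body)


# A toy cell magic that returns its two arguments unchanged.
cell_magics['foo'] = lambda line, cell: (line, cell)
args = dispatch_cell('#!foo --flags  args\ntext - line 1\ntext - line 2')
```

A cell that contains only the magic line yields an empty body, which is the case the terminal discussion below has to deal with.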

Execution semantics

In practice, cell magics (just as line magics) will be methods of an object that always has a self.shell attribute pointing to the main IPython InteractiveShell instance. The execution logic will be the following: IPython will return to the user, as the output of the cell, the result of the call above to foo(...), with the only caveat being that it will trap any unhandled exceptions.

This means that if a user implements a magic meant to only do some rewriting of the input (for example to support an alternate syntax), this magic will still be responsible for calling IPython's execution machinery with the transformed text.

This choice of execution semantics is the only option if we want these magics to have complete freedom, implementation-wise, in what they do with their input text. While many magics will likely do simple transformations of their input and then hand it over for regular execution, others may, for example, dispatch their input to external programs. Therefore there is no generic output API we can impose on them.
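
To make these semantics concrete, here is a small sketch of a rewriting magic that runs its own transformed text, together with a caller that traps unhandled exceptions. Everything in it is an assumption made for illustration: `FakeShell` and its `run_code` method stand in for the real InteractiveShell, and the `<-` assignment syntax is an invented example of an alternate syntax.

```python
import traceback


class FakeShell:
    """Stand-in for InteractiveShell; only what this sketch needs."""

    def __init__(self):
        self.user_ns = {}

    def run_code(self, source):  # hypothetical method name
        exec(source, self.user_ns)


class ArrowMagics:
    def __init__(self, shell):
        self.shell = shell

    def cmagic_arrow(self, line, cell):
        """Rewrite 'x <- 3' as 'x = 3', then execute the result."""
        transformed = cell.replace('<-', '=')
        # The magic itself must hand the transformed text to the shell.
        self.shell.run_code(transformed)


def call_cell_magic(magic, line, cell):
    """Return whatever the magic returns, trapping unhandled exceptions."""
    try:
        return magic(line, cell)
    except Exception:
        traceback.print_exc()
        return None


shell = FakeShell()
magics = ArrowMagics(shell)
call_cell_magic(magics.cmagic_arrow, '', 'x <- 3\ny <- x + 1')
```

The caller imposes nothing on the return value, which is why a magic that only rewrites text cannot rely on IPython to execute the result for it.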

Choice of sigil

The sigil proposed above, #!, follows from the common pattern of Unix scripts whose first line may start with this same sigil (the 'shebang') to indicate what program is meant to execute the rest of the file. In that regard, cell magics behave very similarly and therefore it seemed appropriate to rely on familiarity to make the concept easier to understand for new users.

The major downside of this sigil is that it requires two different characters, and hence is more annoying to type in cases of repeated use of the same magic.

Some other possible sigils we can consider: %%, //, >, &, $.

These are either binary operators or invalid syntax, hence they are all meaningless at the start of a cell. I haven't listed every possible binary operator, just the ones I felt could provide good readability and ease of typing. Other possible alternatives can obviously be discussed.

Of these, I find the following particularly good candidates:

  • %% dovetails nicely with the current % for line magics.

  • $ is fully invalid Python syntax, easy to type and common in programming languages.

Implementation-wise, a single-character sigil is a bit more convenient.

Possibilities

If we adopt this proposal, a number of interesting possibilities can be implemented, such as (ignoring the sigil choice here):

  • timeit, prun: extending these timing/profiling utilities to work on whole cells instead of requiring the user to cram everything in one line.

  • cython: allow the user to type cython code and load it automatically (this is extremely useful in Sage). Similar things can be done with cython.inline and f2py for inlining C/C++ and Fortran.

  • R: a magic could keep a connection to an R interpreter, and allow the user to type in blocks of R code, optionally pulling back results to the user's Python namespace automatically.

  • sh: pass everything to the system shell for execution, without having to prepend each line with ! separately.

These are just a few simple examples to motivate the utility of the feature; ultimately it will be up to the users to develop useful cell magics.

Terminal use

While the terminal client doesn't have the concept of a cell, we can still accommodate cell-level magics in this environment, as follows. When a cell-level magic is detected, the code path in the main IPython object that calls it will first check whether the cell has any content (terminal clients will only send the first line, so their cells will have none). If the cell is empty, it can use raw_input() to ask the user for the content of the cell before making the call.

Since this behavior is not desirable in the notebook or Qt console, it will be off by default, and turned on only by the in-process terminal client or by out-of-process console clients that initialize their own kernel. In all other cases it will be off, which simply means that a console client that connects to an existing kernel started by a notebook or Qt console will not be able to type cell magics. This is a very small restriction and a reasonable compromise to keep the overall execution model simple and predictable.
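
The terminal fallback could look roughly like the sketch below. It makes several assumptions that the proposal does not specify: the `read_cell_body` name, the `prompt_for_body` switch, and the rule that a blank line (or end of input) terminates the cell are all invented for illustration. It uses Python 3's `input()`, the equivalent of the `raw_input()` mentioned above, and accepts a replacement reader so the logic can be exercised without a terminal.

```python
def read_cell_body(cell, prompt_for_body=False, read_line=input):
    """Return the cell body, asking the user for it when the client sent none."""
    # Off by default: notebook and Qt console clients never prompt.
    if cell or not prompt_for_body:
        return cell
    lines = []
    while True:
        try:
            text = read_line('... ')
        except EOFError:
            break
        if text == '':  # assumption: a blank line ends the cell
            break
        lines.append(text)
    return '\n'.join(lines)


# Simulate a user typing two lines followed by a blank line.
typed = iter(['a = 1', 'b = 2', ''])
body = read_cell_body('', prompt_for_body=True,
                      read_line=lambda prompt: next(typed))
```

Only clients that own their kernel would pass `prompt_for_body=True`; any other client gets the empty body back unchanged.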

Stacking cell magics

We consider the possibility of 'stacking' multiple cell magics akin to how stacked decorators work in Python, e.g.:

#!magic1 args...
#!magic2 args...
#!magic3 args...
...
cell body
...

Semantically, these would be applied bottommost-first to match how stacked decorators work in Python.

However, we must note an important difference that complicates this idea. The API of decorators is very simple: they take a function as input and return a function. In contrast, we've said that the input to a magic would be the body of the cell, but the magic can return any kind of output it wants. This means that, after one cell magic is applied, the result is not necessarily text anymore; it can be anything the magic returns.
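
The difficulty can be seen in a short sketch. The `apply_stack` helper and the two toy magics are hypothetical; the explicit type check is one possible way to surface the problem, not a proposed design.

```python
def apply_stack(magics, cell):
    """Apply (magic, line) pairs bottommost-first, like stacked decorators."""
    result = cell
    for magic, line in reversed(magics):
        if not isinstance(result, str):
            # A previous magic returned something that is no longer cell text.
            raise TypeError('cannot pass %r to the next magic' % (result,))
        result = magic(line, result)
    return result


def strip_magic(line, cell):
    return cell.strip()          # text in, text out


def count_magic(line, cell):
    return len(cell.split())     # text in, integer out


# Listed top to bottom, as they would appear in the cell.
words = apply_stack([(count_magic, ''), (strip_magic, '')], '  a b c  ')
```

With `count_magic` at the bottom of the stack instead, the magic above it would receive an integer rather than text, and `apply_stack` raises `TypeError`.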

For this reason, we will most likely defer the idea of stacked magics until we have more experience with the basic system to better inform the decision.

Implementation details and separation from main IPython object

We propose to stop having the current Magic class be a mixin used in InteractiveShell, and instead we will refactor the basic Magic to be a simple class with all the machinery for magic functions, but none implemented. Then, classes that wish to implement new magics can subclass this base class and provide their own methods.

A single class can provide more than one line magic and more than one cell magic if desired; this eliminates the need to create many unnecessary objects when common functionality can be shared, as well as allowing stateful magics (such as a hypothetical R one that would keep a live R interpreter) to expose multiple user-facing entry points with a single copy of the state.

To register line and cell magics, the class will declare two attributes: line_magics and cell_magics. Each of these will be a list of names that must correspond to methods with the actual implementation, using the convention that line magic methods are named magic_$name and cell magic methods are named cmagic_$name. A simple example should make it clear:

class MyMagics(Magic):
    line_magics = ['foo', 'bar']
    cell_magics = ['foo', 'baz']  # the same name can be used in both 
                                  # line and cell magics

    def magic_foo(self, line):
        "The line magic %foo"

    def magic_bar(self, line):
        "The line magic %bar"

    def cmagic_foo(self, line, cell):
        "The cell magic #!foo"

    def cmagic_baz(self, line, cell):
        "The cell magic #!baz"

The justification for having these lists is to avoid having to manually scan the entire namespace of these objects at registration time. A small amount of duplication of information at object creation time lets us do the registration in a more efficient manner. We keep the implementation methods organized with the magic_ and cmagic_ prefixes to ensure there will never be any name collisions between the functionality of the base class (which may evolve over time) and any methods users may choose to implement in their own magics.

The signature of the constructor will be such that by default, when a Magic object is initialized, all of its magics get registered, but this behavior can be overridden so that the registration method is invoked manually later on.
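
One way the list-driven registration and the constructor default could fit together is sketched below. The `Shell` stand-in, the `register` keyword, and the `register_magics` method name are assumptions made for this example; the proposal does not fix any of them.

```python
class Shell:
    """Stand-in for InteractiveShell; it only holds the two magic registries."""

    def __init__(self):
        self.line_magics = {}
        self.cell_magics = {}


class Magic:
    """Hypothetical base class: registration machinery, no magics of its own."""
    line_magics = []
    cell_magics = []

    def __init__(self, shell, register=True):
        self.shell = shell
        if register:
            self.register_magics()

    def register_magics(self):
        # Look up only the declared names instead of scanning the namespace.
        for name in self.line_magics:
            self.shell.line_magics[name] = getattr(self, 'magic_' + name)
        for name in self.cell_magics:
            self.shell.cell_magics[name] = getattr(self, 'cmagic_' + name)


class MyMagics(Magic):
    line_magics = ['foo']
    cell_magics = ['foo']

    def magic_foo(self, line):
        return 'line magic got %r' % line

    def cmagic_foo(self, line, cell):
        return 'cell magic got %r' % cell


shell = Shell()
MyMagics(shell)                                # registered on construction
deferred = MyMagics(Shell(), register=False)   # nothing registered yet
nothing_yet = dict(deferred.shell.line_magics)
deferred.register_magics()                     # explicit, later registration
```

Because the prefixed method names are derived from the declared lists, a misspelled entry fails with `AttributeError` at registration time rather than when the magic is first used.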

Furthermore, new magics can be added to an existing instance at runtime; these will need to be registered manually. We will update our implementation of the define_magic method to do this with the same signature (so user code will not need to be modified in this transition). We will also add a partner define_cell_magic to do the same thing with cell magics. These two methods will operate on an instance of the Magic class that will carry no other manually defined magics, and hence can be used to store all user-added magics that call these functional entry points.
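
A possible shape for these two functional entry points is sketched below. The `UserMagics` holder class and the `Shell` stand-in are hypothetical, and binding each function to the holder instance (so that its `self` has a `shell` attribute) is an assumption about how the unchanged `define_magic` signature would be honored.

```python
import types


class UserMagics:
    """Hypothetical holder for magics added through the functional entry points."""

    def __init__(self, shell):
        self.shell = shell


class Shell:
    """Stand-in for InteractiveShell with the two proposed entry points."""

    def __init__(self):
        self.line_magics = {}
        self.cell_magics = {}
        self.user_magics = UserMagics(self)

    def define_magic(self, name, func):
        # Bind the standalone function so it receives the holder as `self`.
        bound = types.MethodType(func, self.user_magics)
        setattr(self.user_magics, 'magic_' + name, bound)
        self.line_magics[name] = bound

    def define_cell_magic(self, name, func):
        bound = types.MethodType(func, self.user_magics)
        setattr(self.user_magics, 'cmagic_' + name, bound)
        self.cell_magics[name] = bound


def foo_impl(self, parameter_s=''):
    'My very own magic!'
    return parameter_s.split()


def shout_impl(self, line, cell):
    'A toy cell magic that upper-cases the cell body.'
    return cell.upper()


shell = Shell()
shell.define_magic('foo', foo_impl)
shell.define_cell_magic('shout', shout_impl)
```

Existing user code that calls `define_magic('foo', foo_impl)` keeps working unchanged under this sketch, since the function signature it expects is the same.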

As an alternative to explicit lists of names, we could instead use decorators to tag specific methods as line/cell magics:

class MyMagics(Magic):

    @magic
    def foo(self, line):
        "The line magic %foo"

    @cell_magic
    def foo(self, line, cell):
        "The cell magic #!foo"

Conversion of the current codebase

By now, our Magic objects only manipulate the main IPython object via their self.shell attribute, so converting the current codebase to this architecture should be fairly straightforward. We will break up the large Magic object into the base class and a few (probably no more than 3 or 4) objects carrying all our current builtin magics. Since we are preserving the magic_ naming convention we already use, this conversion should also be very low-risk.
