@judy2k
Last active April 16, 2017 18:43
Refactoring Talk Outline

This talk is aimed at intermediate Python developers who are interested in developing & maintaining Python libraries. The audience should understand core Python reasonably well, and be aware of things like the property decorator and what "dunder-methods" are.

The audience will learn how to structure their project and their code in ways that will allow them to make changes later without breaking their users' code. Hopefully they'll come away thinking about the structure of their library code in a clearer way.

So you've released a library! Now you need to maintain it. You want to add features, restructure the code, fix bugs, and maybe improve the library's usability. Your users just want their code to carry on working. That's okay!

This talk will cover techniques in both code and project management to allow you to keep your code moving forwards without breaking your users' code. It is aimed at developers with a little experience of writing libraries in Python, and will cover some intermediate subjects like function decorators and magic methods.

Refactoring in Python is a mixed bag - on the one hand you have powerful tools like the @property decorator, __dunder__ methods, and even metaclasses. On the other hand, because Python code has no concept of private or protected like some other languages, it can be difficult to know what your public interface even is.

I'll talk about how to identify your public interface, and make that clear to your users (cough documentation cough). I'll cover how to structure your tests so you know when you've broken your public interface. I'll discuss how to use some of Python's language features to trick your users into thinking your code hasn't changed at all (except for those brilliant new features you've just added!). And finally, I'll cover how you know it's time to break backwards compatibility and how to break it to your users.

Maintaining Compatibility

So you're going to release a library. Now you need to maintain it! You're going to add features, restructure the code, fix bugs, and maybe improve the library's usability.

Introduction

  • What is an interface?
    • Modules, functions, classes, sequences, parameter types, exceptions raised.
  • What is refactoring?
    • Improving, reorganising code.
    • Not changing functionality.

First Steps

  • What is your public interface?
    • It's the bits your clients see.
    • In Python it can be nebulous, because there is no private, protected, etc.
  • How can your user recognise your interface?
    • Underscore prefixes
    • Don't use dunders!
  • Separating your code into layers.
    • How many layers, and what can these designs look like?
  • Tests for your public interface.
    • Lots of tests for your public interface.
  • Documentation
    • Documentation defines your public interface. Do it.

Under The Hood

  • Replacing fields with properties
  • Replacing objects with singletons
    • Metaclasses for the win?
    • Borg? I hate borg
  • Changing the types of parameters, returned objects
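
As a sketch of replacing a field with a property, assume a hypothetical `Rectangle` class whose first release stored `area` as a plain attribute set in `__init__`. In the version below the value is computed on demand, but callers still read it as `rect.area`.

```python
class Rectangle:
    """Version 2 of a hypothetical class.

    Version 1 stored `area` as a plain attribute assigned in __init__.
    """

    def __init__(self, width, height):
        self.width = width
        self.height = height

    @property
    def area(self):
        # Computed on demand, but still read as `rect.area` (no parentheses).
        return self.width * self.height


rect = Rectangle(3, 4)
rect.width = 10
print(rect.area)  # 40
```

One caveat: code that used to assign to `rect.area` will now raise `AttributeError` unless the property also defines a setter.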

Difficult Decisions

  • Maintaining old and new code (deprecation)
    • Do you add new functionality to the old codebase?
    • Forking or single distribution?
  • Breaking compatibility
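
One possible shape for a deprecation, using the standard `warnings` module. The decorator name `deprecated` and the functions `get_items` and `fetch_items` are hypothetical; this is a sketch, not a recommendation of a specific policy.

```python
import functools
import warnings


def deprecated(replacement):
    """Mark a function as deprecated in favour of `replacement` (a name)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__}() is deprecated; use {replacement}() instead",
                DeprecationWarning,
                stacklevel=2,  # attribute the warning to the caller's line
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


def fetch_items(limit=10):
    """New name for the public function."""
    return list(range(limit))


@deprecated("fetch_items")
def get_items(limit=10):
    """Old name, kept so existing callers continue to work."""
    return fetch_items(limit)
```

Existing callers of `get_items` keep working and see a warning that tells them what to switch to before the old name is removed.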

Miscellanea

  • Looking at your clients' code. (If you can!)
  • Real-World tests.
  • Your users will do things you don't expect. There's only so much you can do.
  • Frameworks are harder to change because integration is deeper.

Notes:

It would be nice to have compelling examples showing how tests can catch edge-case issues that are otherwise hard to detect. (from @chantr4)

So you've released a library. Now you need to maintain it! You're going to add features, restructure code, fix bugs, and improve usability. Your users want their code to keep on working. That's okay!

Learn Pythonic techniques for safely changing your library's implementation, while protecting your users from unexpected change. I'll cover approaches to ensure you know when your library will break compatibility, and I'll discuss techniques for hiding changes using simple Pythonic magic.

Overview

  • Introduction [4 mins]
  • What Is Your Interface? [6 mins]
  • Ensuring Your Interface Is Stable [5 mins]
  • Coding Techniques - Change Code, Keep The Interface [6 mins]
  • Conclusion [5 mins]
  • Q&A [4 mins]

Introduction [4 mins]

This part of the talk will discuss my background, and the premise for the talk. I've maintained many (mostly internal) Python libraries for the past 16 years. These days I'm lucky enough to maintain a commercially-backed open source library.

What Is Your Interface? [6 mins]

Before you can ensure your public interface is stable, you need to know what your public interface is. This is harder in Python than in languages like, say, Java, because Python has no concept of private, protected, etc. The client code can see and use all of your code.

Your public interface consists of: Modules, functions, classes, sequences, parameter types & exceptions raised.

How does your library's user know which bits are public?

  • Convention (use underscore-prefixes to denote internals - don't use "dunder-prefixes", that's not what they're for!)
  • Layering (put the internals deeper in the package, so they are harder to reach)
  • Documentation! If you have reasonable documentation, then most users won't read your code. They will only use the features you document!
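
To illustrate why double-underscore prefixes are the wrong tool, here is a small sketch, assuming "dunder-prefixes" means attribute names with two leading underscores. Inside a class body Python rewrites such names (name mangling), a feature intended to avoid name clashes in subclasses rather than to hide anything. The `Connection` class is hypothetical.

```python
class Connection:
    def __init__(self):
        self._socket = "internal by convention"  # single underscore: a hint
        self.__token = "mangled"  # double underscore: the name is rewritten


conn = Connection()
print(conn._socket)               # still reachable; the underscore is only a signal
print(conn._Connection__token)    # the mangled name is reachable too
print(hasattr(conn, "__token"))   # False: nothing is stored under this name
```

The attribute is no more private than before; it is just stored under a more awkward name.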

Ensuring Your Interface Is Stable [5 mins]

There are two steps to ensuring your interface is stable. The first was just covered - you have to know what your interface is. The second is testing. You should have a suite of automated tests that cover every part of your public interface. These tests should be segregated from the rest of your test codebase because if these tests fail, your response will be different. When a unit test fails, you usually change the implementation. When a public interface test fails, your response will be one of two things:

  • Bump the version number of the next release (I'll briefly cover semantic versioning) and change the test so that it passes again.
  • Change the implementation (possibly reverting!), so that the public interface does not change.
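
A public interface test might look something like the sketch below. It uses `unittest` and `inspect.signature` from the standard library; `connect` is a stand-in for a public function of a hypothetical library, and in a real project this test class would live in its own module or directory, apart from the unit tests.

```python
import inspect
import unittest


def connect(host, port=8080, timeout=None):
    """Stand-in for a public function of a hypothetical library."""
    return (host, port, timeout)


class PublicInterfaceTest(unittest.TestCase):
    """A failure here signals a compatibility decision, not just a bug."""

    def test_connect_signature(self):
        # Parameter names, order and defaults are all part of the interface.
        params = inspect.signature(connect).parameters
        self.assertEqual(list(params), ["host", "port", "timeout"])
        self.assertEqual(params["port"].default, 8080)

    def test_connect_requires_host(self):
        # The exception raised on misuse is part of the interface too.
        with self.assertRaises(TypeError):
            connect()
```

If someone renames `timeout` or changes the default port, this test fails and forces the choice described above: bump the version, or revert the change.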

Coding Techniques - Change Code, Keep The Interface [6 mins]

This section will cover some coding techniques for maintaining compatibility while changing implementation.

  • Replacing fields with properties: One of Python's great features is that the descriptor protocol allows you to replace attributes with methods. This allows you to replace static values with values that are generated on-demand.
  • Replacing objects with singletons: Because Python has no differentiation between constructors and other functions (they're just callables), you can swap one for the other. There are some subtle problems with this which can be addressed using the abc module and/or metaclasses. I'll briefly cover the borg pattern (as a clever trick).
  • Changing the types of parameters, returned objects: This example will cover how to replace one type with another, with maximal compatibility, using subclassing and/or protocol definitions.
  • Forking / Adding another layer: Sometimes your code is okay, it's just the wrong level of abstraction. In this case you can fork or add a layer above. I'll demonstrate this approach, and the benefits & dangers of it, and how it can allow your existing users to continue to use your library with the option of migrating to a more suitable abstraction later.
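
As one sketch of changing a returned type through subclassing, assume a hypothetical `parse_version` function that originally returned a bare `(major, minor)` tuple. Returning a `namedtuple` (which is a `tuple` subclass) adds named fields while keeping the old tuple behaviour.

```python
from collections import namedtuple

# Version 1 of the library returned a bare tuple: (major, minor).
# Version 2 returns a tuple subclass with named fields.
Version = namedtuple("Version", ["major", "minor"])


def parse_version(text):
    """Parse a string like "3.6" into a Version (hypothetical public function)."""
    major, minor = text.split(".")
    return Version(int(major), int(minor))


v = parse_version("3.6")
major, minor = v        # old callers: unpacking still works
assert v == (3, 6)      # old callers: comparison with a plain tuple still works
assert v[0] == 3        # old callers: indexing still works
assert v.minor == 6     # new callers: named access
```

This is not perfectly invisible: code that checks `type(v) is tuple` or depends on the `repr` of the result will notice the change.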

Conclusion [5 mins]

I'll wrap up the talk, briefly covering the major themes above and touching on a few other issues.

The keys to code agility when maintaining a library are:

  • Knowing your interface
  • Documentation
  • Testing
  • A clear versioning/compatibility system
  • A deprecation system

Other things to mention are:

  • Sometimes it's okay to break with a previous version. Make it clear why you've done it, and whether users can continue to use the old library. Make it clear why/when they should update.
  • If you break your library too often, your users will abandon it. If you don't make changes to your library, you may not be able to make necessary progress. It's a balance.
  • There are clever tricks to maintain compatibility, but they all have downsides. Documentation and testing are the most important things.