@tomchristie
Last active November 7, 2017 12:58
DEP XXX: Simplified routing syntax

  • DEP: XXX
  • Author: Tom Christie
  • Implementation Team: Tom Christie
  • Shepherd: Tim Graham
  • Status: Draft
  • Type: Enhancement
  • Created: 2016-10-03
  • Last-Modified: 2016-10-06

Abstract

This DEP aims to introduce a simpler and more readable routing syntax to Django. Additionally the new syntax would support type coercion of URL parameters.

We would plan for this to become the new convention by default, but would do so in a backwards compatible manner, leaving the existing regex based syntax as an option.

Background and Motivation

Here's a section directly taken from Django's documentation on URL configuration...

from django.conf.urls import url

from . import views

urlpatterns = [
    url(r'^articles/2003/$', views.special_case_2003),
    url(r'^articles/(?P<year>[0-9]{4})/$', views.year_archive),
    url(r'^articles/(?P<year>[0-9]{4})/(?P<month>[0-9]{2})/$', views.month_archive),
    url(r'^articles/(?P<year>[0-9]{4})/(?P<month>[0-9]{2})/(?P<day>[0-9]{2})/$', views.article_detail),
]

There are two aspects to this that we'd like to improve on:

  • The regex-based URL syntax is unnecessarily verbose and complex for the vast majority of use cases.
  • The existing URL resolver system does not handle typecasting, meaning that all URL parameters are passed to views as strings.
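
The second point can be seen without Django at all. The following is a minimal sketch using only the standard library re module and the year-archive pattern from the example above; it is an illustration of the behaviour, not Django's resolver code.

```python
import re

# The year-archive regex from the documentation example, compiled directly.
pattern = re.compile(r'^articles/(?P<year>[0-9]{4})/$')

match = pattern.match('articles/2003/')
kwargs = match.groupdict()

# The captured value is text, even though the pattern only admits digits.
assert kwargs == {'year': '2003'}
assert isinstance(kwargs['year'], str)

# Any view that needs a number has to cast the value by hand.
year = int(kwargs['year'])
```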

In order to do so we propose to implement a new URL routing option, based on the Flask URL syntax. This is a simpler and more readable format. Unlike the regex approach, this syntax also includes type information, allowing us to provide for typecasting of the URL parameters that are passed to views.

The existing syntax would remain available, and there would be no plans to place it on a deprecation path. Indeed, the underlying implementation for the typed URL syntax would be to expand the typed URLs out into the existing regex style, although this would largely remain an implementation detail rather than an exposed bit of API.
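
As a rough sketch of what expanding a typed URL into the regex style could look like, the helper below translates a route string into an anchored regular expression. The names expand_path and CONVERTER_REGEXES, and the regex fragment chosen for each convertor, are assumptions made for illustration only and are not part of the proposal.

```python
import re

# Hypothetical table: convertor name -> regex fragment (illustrative only).
CONVERTER_REGEXES = {
    'string': r'[^/]+',
    'int': r'[0-9]+',
    'path': r'.+',
}

# Matches "<name>" or "<convertor:name>".
_PARAM = re.compile(r'<(?:(?P<converter>[^>:]+):)?(?P<name>[^>]+)>')


def expand_path(route):
    """Translate a typed route into an anchored regex plus convertor names."""
    parts = ['^']
    converters = {}
    position = 0
    for found in _PARAM.finditer(route):
        # Literal text between parameters is escaped and copied through.
        parts.append(re.escape(route[position:found.start()]))
        name = found.group('name')
        converter = found.group('converter') or 'string'
        parts.append('(?P<%s>%s)' % (name, CONVERTER_REGEXES[converter]))
        converters[name] = converter
        position = found.end()
    parts.append(re.escape(route[position:]))
    parts.append('$')
    return ''.join(parts), converters


regex, converters = expand_path('articles/<int:year>/<int:month>/')
match = re.match(regex, 'articles/2003/03/')
```

Here match.groupdict() still holds strings; the converters mapping records which type each captured value should be coerced to afterwards.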

The end result is that we would like to be able to present the following interface to our developers...

urlpatterns = [     
    path('articles/2003/', views.special_case_2003),
    path('articles/<int:year>/', views.year_archive),
    path('articles/<int:year>/<int:month>/', views.month_archive),
    path('articles/<int:year>/<int:month>/<int:day>/', views.article_detail),
]

The path() function would also accept parameters without a convertor prefix, in which case the convertor would default to "string", accepting any text except a '/'.

For example:

urlpatterns = [
    path('users/', views.user_list),
    path('users/<id>/', views.user_detail),
]

For further background, please see the "Challenge teaching Django to beginners: urls.py" discussion group thread.

Core vs Third-Party

In our consideration this feature should be included in core Django rather than as a third-party app, because it adds significant value and readability.

It is far more valuable when presented to the community as the new standard, rather than as an alternative style that can be bolted on. If presented as a third-party add-on then the expense of a codebase going against the standard URL convention will likely always prevent widespread uptake.

Specification

Imports

The naming for the import needs to be decided on. The existing URL configuration uses:

from django.conf.urls import url

The naming questions would be:

  • What should the new style be called? Would we keep url, or would we need to introduce a different name to avoid confusion?
  • Where should the new style be imported from?

Our constraints here are that the existing naming makes sense, but we also need to ensure that we don't break backwards compatibility.

Our proposal is that we should use a different name and that the new style should be imported as...

from django.urls import path

A consistently named regex specific import would also be introduced...

from django.urls import path_regex

The name path makes semantic sense here, because it actually does represent a URL path component, rather than a complete URL.

The existing import of from django.conf.urls import url would become a shim for the more explicit from django.urls import path_regex.

Given that it is currently used in 100% of Django projects, the smooth path for users would be to not deprecate its usage immediately, but to consider placing it on the deprecation path at a later date.

Converters

Flask supports the following converters.

  • string - accepts any text without a slash (the default)
  • int - accepts integers
  • float - like int but for floating point values
  • path - like the default but also accepts slashes
  • any - matches one of the items provided
  • uuid - accepts UUID strings

We might also consider including a regex converter.

Furthermore, an interface for implementing custom convertors should exist. We could use the same API as Flask's BaseConverter for this purpose. The registration of custom convertors could be handled as a Django setting, CUSTOM_URL_CONVERTORS. The default set of convertors should probably always be included.
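
A minimal sketch of how this might fit together is shown below. The BaseConverter class here is a stand-in that mirrors the shape of Flask's BaseConverter (a regex attribute plus to_python and to_url methods) so the example runs without Flask installed. YearConverter, DEFAULT_CONVERTORS, and the dict shape used for CUSTOM_URL_CONVERTORS are assumptions; the proposal does not specify the format of the setting.

```python
class BaseConverter:
    """Stand-in mirroring the shape of Flask's BaseConverter."""
    regex = '[^/]+'

    def to_python(self, value):
        return value

    def to_url(self, value):
        return str(value)


class IntConverter(BaseConverter):
    regex = '[0-9]+'

    def to_python(self, value):
        return int(value)


class YearConverter(BaseConverter):
    """Hypothetical custom convertor: a four-digit year."""
    regex = '[0-9]{4}'

    def to_python(self, value):
        return int(value)

    def to_url(self, value):
        return '%04d' % value


# Assumed shape for the proposed setting: convertor name -> class.
CUSTOM_URL_CONVERTORS = {'year': YearConverter}

# The default convertors are always present; custom ones are layered on top.
DEFAULT_CONVERTORS = {'string': BaseConverter, 'int': IntConverter}
converters = dict(DEFAULT_CONVERTORS, **CUSTOM_URL_CONVERTORS)
```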

Failure to perform a type conversion against a captured string should result in an Http404 exception being raised.
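
One way to picture this rule is a small wrapper that turns a failed conversion into a 404. In this sketch the Http404 class is a local stand-in for django.http.Http404 so that the example runs without Django, and convert_or_404 is a hypothetical helper name rather than proposed API.

```python
import uuid


class Http404(Exception):
    """Stand-in for django.http.Http404 so the sketch runs without Django."""


def convert_or_404(converter, raw):
    """Apply a converter callable, turning a conversion failure into a 404."""
    try:
        return converter(raw)
    except ValueError:
        raise Http404('could not convert %r' % raw)


# A well-formed value converts normally.
article_id = convert_or_404(int, '42')

# A captured string that cannot be converted surfaces as Http404.
try:
    convert_or_404(uuid.UUID, 'not-a-uuid')
except Http404:
    not_found = True
else:
    not_found = False
```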

Adding type conversion to the existing system

Adding a new URL syntax is easy enough, as it can be mapped onto the existing regex syntax. The more involved piece of work would be providing for type conversion with the existing regex system. The type conversion functionality would need to support both named and unnamed capture groups.

One option could be:

  • Add a new convertors argument to the url function.
  • The value can either be a list/tuple, in which case its elements are mapped onto the capture groups by position, or a dict, in which case its elements are mapped onto the capture groups by name. (The former case is more general as it supports using the positional style to correspond with either named or unnamed groups.)
  • The items in the convertors argument would each be instances of BaseConverter.

(An alternative might be to add separate convertor_args and convertor_kwargs arguments.)
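
The list/dict mapping described above can be sketched as follows. For brevity the convertors here are plain callables such as int rather than BaseConverter instances, and convert_captures is a hypothetical helper name, not proposed API.

```python
import re


def convert_captures(match, convertors):
    """Apply convertors to a regex match, by name (dict) or position (list/tuple)."""
    if isinstance(convertors, dict):
        # By name: only named groups with a matching key are converted.
        return {
            name: convertors[name](value) if name in convertors else value
            for name, value in match.groupdict().items()
        }
    # By position: the i-th convertor applies to the i-th group, named or not.
    return tuple(
        convertors[index](value) if index < len(convertors) else value
        for index, value in enumerate(match.groups())
    )


named = re.match(
    r'^articles/(?P<year>[0-9]{4})/(?P<month>[0-9]{2})/$',
    'articles/2003/07/',
)
unnamed = re.match(r'^articles/([0-9]{4})/$', 'articles/2003/')

by_name = convert_captures(named, {'year': int, 'month': int})
by_position = convert_captures(named, [int, int])
unnamed_by_position = convert_captures(unnamed, (int,))
```

The positional form works against both matches, which is the sense in which the list/tuple style is the more general of the two.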

We would also need to support the reverse side of type conversion, ensuring that reverse can be called with typed arguments as well as string literals.
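
To illustrate the reverse direction, the sketch below fills a typed route from keyword arguments and accepts either typed values or strings. reverse_path is a hypothetical helper written for this example; it is not Django's reverse function and ignores URL names, namespaces, and validation.

```python
import re

# Matches "<name>" or "<convertor:name>" and captures the parameter name.
_PARAM = re.compile(r'<(?:[^>:]+:)?(?P<name>[^>]+)>')


def reverse_path(route, **kwargs):
    """Fill a typed route from keyword arguments, accepting ints or strings."""
    return _PARAM.sub(lambda found: str(kwargs[found.group('name')]), route)


from_int = reverse_path('articles/<int:year>/', year=2003)
from_str = reverse_path('articles/<int:year>/', year='2003')
```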

Preventing unintended errors

The following behaviour is not necessary, and we might not choose to add this. However it is worth consideration as a way to guard against user error...

Even with differently named functions there remains some potential for user error. For example:

  • A developer using Django's new URL system accidentally uses from django.conf.urls import url, and fails to notice the error. They are unaware that they are using regex URLs, not typed URLs, and cannot determine why the project is not working as expected.
  • A developer who is continuing to use regex URLs incorrectly uses the import from django.urls import path, and fails to notice the error. They are unaware that they are using typed URLs, not regex URLs, and cannot determine why the project is not working as expected.

One way to guard against this would be to:

  • Enforce that new style path() arguments must not start with a leading '^'.
  • Enforce that old style url() arguments must start with a leading '^'.

This behaviour would ensure that the two different cases could not be used incorrectly.

This would introduce one decidedly edge-case incompatibility: existing projects that intentionally include an unanchored URL regex would raise a ConfigurationError when upgraded. However, this is a loud and documentable error with a simple resolution. (Change the import to from django.urls import path_regex.)
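
The two checks could be as small as the sketch below. ConfigurationError is defined locally as a stand-in for the error named above, and check_path_route and check_url_regex are hypothetical helper names used only for illustration.

```python
class ConfigurationError(Exception):
    """Local stand-in for the error named in the text."""


def check_path_route(route):
    """New-style path() routes must not start with a leading '^'."""
    if route.startswith('^'):
        raise ConfigurationError(
            "path() route %r looks like a regex; did you mean url()?" % route
        )
    return route


def check_url_regex(regex):
    """Old-style url() patterns must start with a leading '^'."""
    if not regex.startswith('^'):
        raise ConfigurationError(
            "url() pattern %r is not anchored; did you mean path()?" % regex
        )
    return regex


# Correct usage passes through unchanged.
check_path_route('articles/<int:year>/')
check_url_regex(r'^articles/(?P<year>[0-9]{4})/$')

# A regex handed to path() by mistake fails loudly.
try:
    check_path_route(r'^articles/(?P<year>[0-9]{4})/$')
except ConfigurationError:
    mixed_up_detected = True
else:
    mixed_up_detected = False
```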

Internal RegexURLPattern API

New style URLs should make the original string available to introspection using a .path attribute on the pattern instance.

They should be implemented as a TypedURLPattern that subclasses RegexURLPattern.
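
A minimal sketch of that relationship follows. The RegexURLPattern class here is a tiny stand-in, not Django's real internal class, and the _to_regex helper handles only the int and default string convertors; both are assumptions made so that the example is self-contained.

```python
import re


class RegexURLPattern:
    """Minimal stand-in for Django's internal class (not the real implementation)."""

    def __init__(self, regex, callback):
        self.regex = re.compile(regex)
        self.callback = callback


class TypedURLPattern(RegexURLPattern):
    """Hypothetical typed pattern that remembers the route it was built from."""

    def __init__(self, path, callback):
        # Keep the original route string available for introspection.
        self.path = path
        super().__init__(self._to_regex(path), callback)

    @staticmethod
    def _to_regex(path):
        # Deliberately tiny: only <int:name> and <name> are handled, and
        # literal text is assumed to contain no regex metacharacters.
        def replace(found):
            fragment = '[0-9]+' if found.group(1) else '[^/]+'
            return '(?P<%s>%s)' % (found.group(2), fragment)

        return '^' + re.sub(r'<(?:(int):)?(\w+)>', replace, path) + '$'


def year_archive(request, year):
    return year


pattern = TypedURLPattern('articles/<int:year>/', year_archive)
```

Because the typed pattern is still a RegexURLPattern, code that only inspects pattern.regex keeps working, while newer code can read pattern.path instead.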

These are aspects of the internal API, and would not be documented behaviour.

Documentation

The new style syntax would present a cleaner interface to developers. It would be beneficial for us to introduce the newer syntax as the primary style, with the existing regex style as a secondary option.

It is suggested that we should update all URL examples across the documentation to use the new style.

Implementation tasks

The following independent tasks can be identified:

  • Implement the convertors argument. This adds the low-level API support for type coercion. Ensure that lookups perform type coercion, and correspondingly, that calls to reverse work correctly with typed arguments.
  • Add support for the new style path function, with an underlying implementation based on the regex urls.
  • Add path_regex, with from django.conf.urls import url becoming a shim for it.
  • Add support for registering custom convertors, as defined in the Django settings.
  • Document the new style URL configuration.
  • Update existing URL cases in the documentation throughout.
  • Update the tests throughout, updating to the new style wherever possible.
@frankwiles
frankwiles commented Oct 6, 2016

First off I REALLY like this idea. Adding in an easy way without removing full regex ability is great.

This probably isn't the proper place for discussing changes to this DEP, but I would highly recommend adding slug as an out of the box converter and pk and id being aliases to int. This would drastically cut down on new user questions, bugs and their overall understanding.

@tomchristie
Author

tomchristie commented Oct 6, 2016

I would highly recommend adding slug as an out of the box converter

Yup, that might be a good idea. We'd need to consider which of slug or unicode_slug or both would make most sense.

pk and id being aliases to int

That's a bit more fiddly, as the primary key may not necessarily be an AutoField.

In generic cases I think we still have to use the uncoerced standard text type.

@tomchristie
Author

Putting this here for later reference. Tim Graham mentions "the simplify_regex() function in admindocs should probably be a method of the URLPattern class". That sounded like a good idea to me on first pass, but on looking through eg. how we use it in REST framework's schema generation, I think it'd end up making some things more awkward. (Plus it's a bit of internal API changing that could feasibly break a handful of third party packages out there)

@sjoerdjob

Regarding the simplify_regex: Yes, it should definitely be a method of the RegexURLPattern/RegexURLResolver class, probably named something like get_simple_path. Just leave the old API in place for a while as a pass-through.

For new-syntax routes, the get_simple_path should just return the original input, maybe without the type annotation. So as long as everything is still regex-based the current approach will work fine. But in my opinion, it is good to someday open up for non-regex based resolvers. Or resolvers that check different aspects of the request than the path.

@tomchristie
Author

Note that we don't actually need to go from regex -> path, so I don't think a get_simple_path method is necessarily needed.

@hmleal

hmleal commented Oct 17, 2016

👍

@sjoerdjob

During the DUtH sprints, I talked to somebody who mentioned that a particular use case for URL routing is also reversing IPC URLs (and other API endpoints). We should be careful not to break anything w.r.t. that (especially when dealing with reversing and "preventing unintended errors").

@EmilStenstrom

This document is now here: django/deps#27
