Skip to content

Instantly share code, notes, and snippets.

@ansharlubis
Last active October 3, 2022 11:08
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save ansharlubis/222594dce21f2e4d71a4f7e077509e2b to your computer and use it in GitHub Desktop.
Save ansharlubis/222594dce21f2e4d71a4f7e077509e2b to your computer and use it in GitHub Desktop.
Implementation of generic function support for LPython

Overview

Project: Generic function support for LPython

Organization: Python Software Foundation (LPython)

Mentors: Ondřej Čertík, Gagandeep Singh, Rohit Goswami

Contributor: Luthfan Anshar Lubis

Project Summary

Generics is a common functionality found in statically-typed programming languages to allow easier maintenance of programs with identical implementations but different type signatures. LPython, a statically-typed Python compiler does not yet support this. In this project, we introduce generic functions as functions with type variables; these variables are statically type-checked and made concrete when lower-level code is generated. To allow type checks for generic function calls, restrictions (similar to traits in Rust) are provided to ensure safety of type parameters.

Background

Programs in LPython are statically-typed, requiring type annotations by users. They are first compiled into Abstract Semantic Representation (ASR), an intermediate representation that carries all the syntactic and semantic (including typing) information of the programs. From this IR, the programs are further compiled to lower level languages such as C and LLVM.

Another important feature of LPython is ensuring that programs for LPython can also be run in CPython.

Implementation

The project will be explained through several phases:

1. Introducing type variables

The first phase of the implementation was adding syntax support for type variable declarations. On the Python level, we rely on TypeVar provided by Python 3. On the ASR level, I added a new kind of type for expression called TypeParameter. This allows the declaration of the following function:

T = TypeVar('T')

def add(x: T, y: T) -> T:
  return x + y

The arguments and return value of the function add are typed with a TypeParameter on the ASR level.

Simple checks on the type variable declarations are also implemented. First, ensuring that a type variable declaration is unique. Second, ensuring that a function can only be typed with an already declared type variable.

Related PR:

Related blog posts:

2. Implementing generic function instantiation

A big leap from the first phase, I implemented a working generic function instantiation that can be compiled to LLVM and can be run, but still without type checks.

Using the example from the previous phase, it is now possible to have the following code:

T = TypeVar('T')

def add(x: T, y: T) -> T:
  return x + y

print(add(1,2))
print(add('a','b'))

The two function calls add(1,2) and add('a','b') instantiate the function add into two different functions on the ASR level, each with different type signatures. If we were to express it with Python syntax, we would have:

def __lpython_generic_add_0(x: i32, y: i32) -> i32: ...
def __lpython_generic_add_1(x: str, y: str) -> str: ...

Here, i32 represents 32-bit integer and str represents string.

At the same time, the function calls themselves are correspondingly renamed into:

print(__lpython_generic_add_0(1,2))
print(__lpython_generic_add_1('a','b'))

At the function call site, the consistence of the type substitution between the arguments types (i32 or str) with the type variables (T) is also checked. The function call add(1,'a') is rejected.

To allow compilation to LLVM, the generic function (add) is ignored while the instantiated functions (__lpython_generic_add_0, __lpython_generic_add_1) and renamed functions calls are compiled normally with the existing compilation.

This step particularly took a long time due to several reasons. First, the lack of familiarity with the already existing tools within LPython that can be utilized for function instantiations. Originally I started with a function instantiator that I personally defined, although later on finding out from one of the mentors about an ASR duplicator. Second, we did not start with a concrete design for the ASR grammar. This resulted in a couple iterations on the grammar for generic functions.

Related PR:

Related blog posts:

3. Adding necessary functionalities and handling ASR changes

I added support for Array and List types, along with corresponding examples where these types are used.

I also added support for generic Subfunction (functions without return values). However, during the implementation, the main branch of LPython made a change and combined both Function and Subfunction, requiring modifications on the generic functions support.

Related PR:

Related blog posts:

4. Adding hard-coded restrictions for type variables

The next step is adding a mechanism to type check generic function calls and operations involving values with type variables. Consider again the add program, there was no check on the operation x+y against the arguments with type T. There was also no type check on add(1,2) to make sure that the arguments 1 and 2 can handle the operation inside add.

As the first approach to this, I added restrictions on the type variable declarations. The restrictions I added were SupportsPlus, SupportsZero, and Divisible. These restrictions are hard-coded in the way that they are hard-coded into the compiler to be associated with specific computations. With restrictions, type variables declarations become:

T = TypeVar('T', bound=SupportsPlus)

The restriction SupportsPlus is associated with addition between integer, real, and string. However, it only supports addition between same typed arguments.

With these restrictions, it is possible to check whether the generic operation x+y is safe by checking the restrictions on the type variables associated with both arguments. It is also possible to check whether add(1,2) is safe by checking that 1 and 2 are typed as integers which does support addition.

Related PR:

Related blog posts:

5. Designing and implementing user-defined restrictions

The above approach to restrictions does handle type check on operations, but limit them to only those hard-coded into the compilers. We require a mechanism that can allow users to freely define restrictions according to need.

To illustrate this, let us consider a different generic function meanthat takes a list of T as arguments and calculate the mean value of the elements in the list:

def mean(x: list[T]) -> f64:
    k: i32 = len(x)
    if k == 0:
        return 0.0
    res: T
    res = 0.0
    i: i32
    for i in range(k):
        res = res + x[i]
    return res/k

Several computations are made on the variable res with type T. A zero assignment, an addition, and a division. To type check the body of mean, we can abstract the computations as restrictions, expressed as body-less functions.

@restriction
def zero(x: T) -> T:
    pass

@restriction
def add(x: T, y: T) -> T:
    pass

@restriction
def div(x: T, k: i32) -> f64:
    pass

def mean(x: list[T], **kwargs) -> f64:
    k: i32 = len(x)
    if k == 0:
        return 0.0
    res: T
    res = zero(x[0])
    i: i32
    for i in range(k):
        res = plus(res, x[i])
    return div(res/k)

Then, the function calls can assign specific functions that will take over the place of these restrictions (add, zero, div).

print(mean([1,2], zero=int.__zero__, add=int.__add__, div=int.__div__))
print(mean([1.0,2.0], zero=real.__zero__, add=real.__add__, div=real.__div__))

The assignment between a restriction (eg. zero) and a function (eg. int.__zero__) is also checked by making sure if the type substitution remains consistent or not.

Related PR:

Related blog posts:

Result

I have completed a prototype implementation for LPython. The implementation is strongly type-checked (strong concept) by using restrictions to define computations on type variables. Type-checked generic function calls are compiled into ASR then LLVM.

The latest running examples can be found in the latest pull request (lcompilers/lpython#1048).

Future Work

  • More complicated generic functions, such as recursive generic functions and nested generic function calls will require some modifications on both type checks and function instantiations
  • Designs can be further simplified to avoid defining similar restrictions
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment