Skip to content

Instantly share code, notes, and snippets.

@anutosh491
Last active August 24, 2023 12:53
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save anutosh491/9cdea54f937705ed17a1dd64c892948d to your computer and use it in GitHub Desktop.
Save anutosh491/9cdea54f937705ed17a1dd64c892948d to your computer and use it in GitHub Desktop.
GSoC_2023_Final_Report

Google Summer of Code 2023 Final report

Student: Anutosh Bhat
Project: Implementing Symbolic Algorithms as a part of ASR
Organisation: Python Software Foundation
Mentor: Ondřej Čertík
Proposal: GSoC 23 Proposal
Weekly Blogs: anutosh491.github.io

Overview of Work Done

Through my GSoC project, I was able to implement symbolic operations & algorithms as a part of ASR (Abstract Semantic Representation). The following things have been achieved

  • Symbolic elements like Symbols, Constants like pi, Numerics like Integers , Unary operators/functions like Sin, Cos and Binary operators/functions like Add, Differentiation etc were introduced as intrinsic functions in intrinsic_registry.h
  • 3 backends: cpython_sym , c_sym and lvm_sym were introduced to test out symbolic operations against various backends.
  • A set of symbolic tests were introduced as integration tests and support for print & assert statements was added to test out symbolic operations.
  • The replace_symbolic pass was introduced to to lift the symbolic support from the C backend and extend its functionality to all other backends like LLVM.

Hence now we can compile and execute the following program through multiple backends.

from sympy import Symbol, pi, S
from lpython import S

def test_chained_operations():
    x: S = Symbol('x')
    y: S = Symbol('y')
    z: S = Symbol('z')
    a: S = Symbol('a')
    b: S = Symbol('b')
    
    # Chained Operations
    w: S = (x + y) * ((a - b) / (pi + z))
    result: S = (w ** S(2) - pi) + S(3)
    
    # Print Statements
    assert(result == S(3) + (a -b)**S(2)*(x + y)**S(2)/(z + pi)**S(2) - pi)
    print(result)
    
    # Additional Variables
    c: S = Symbol('c')
    d: S = Symbol('d')
    e: S = Symbol('e')
    f: S = Symbol('f')
    
    # Chained Operations with Additional Variables
    x = (c * d + e) / f
    y = (x - S(10)) * (pi + S(5))
    z = y ** (S(2) / (f + d))
    result = (z + e) * (a - b)
    
    # Print Statements
    assert(result == (a - b)*(e + ((S(5) + pi)*(S(-10) + (e + c*d)/f))**(S(2)/(d + f))))
    print(result)
    assert(x == (e + c*d)/f)
    print(x)
    assert(y == (S(5) + pi)*(S(-10) + (e + c*d)/f))
    print(y)
    assert(z == ((S(5) + pi)*(S(-10) + (e + c*d)/f))**(S(2)/(d + f)))
    print(z)

test_chained_operations()

(lf) anutosh491@spbhat68:~/lpython/lpython$ lpython --enable-symengine --backend=llvm symbolics.py
3 + (a - b)**2*(x + y)**2/(z + pi)**2 - pi
(a - b)*(e + ((5 + pi)*(-10 + (e + c*d)/f))**(2/(d + f)))
(e + c*d)/f
(5 + pi)*(-10 + (e + c*d)/f)
((5 + pi)*(-10 + (e + c*d)/f))**(2/(d + f))

(lf) anutosh491@spbhat68:~/lpython/lpython$ lpython --enable-symengine --backend=c symbolics.py
3 + (a - b)**2*(x + y)**2/(z + pi)**2 - pi
(a - b)*(e + ((5 + pi)*(-10 + (e + c*d)/f))**(2/(d + f)))
(e + c*d)/f
(5 + pi)*(-10 + (e + c*d)/f)
((5 + pi)*(-10 + (e + c*d)/f))**(2/(d + f))

Detailed Description

As described in the proposal, the initial approach while working on such a project had to be experimental. There was no clear roadmap and much of the first few weeks was spent in trial & error. This was important as we did lot of iterations over the best approach for implementing symbolic algorithms.

  • Week 1: I started introducing the backbone of any symbolic program in this week

    • Introduced the SymbolicExpression ttype
    • Introduced SymbolicSymbol as an intrinsic function
    • Supported imports from SymPy for CPython
  • Week 2 & 3: By this time we could replicate a symbol in the ASR but we wanted to move to the next step which is supporting assignment & print statements. We discussed a good amount of ways to do this ranging from framing passes to adding support through the LLVM backend . As we were finally going to use SymEngine's C interface, we thought the best approach here would be to add the symbolic support in the C backend itself. We also had to throughly understand how to deal with SymEngine's Basic type. This is because initialization, assignment and other operations for this type were quite different that the S type in SymPy. Hence we had to understand working of functions like basic_new_stack, basic_new_heap & basic_free_stack.

  • Week 4: By this week we were able to generate C code for simple assignment & print statements where we were able to assign and print symbols and constants like pi. I started adding support for binary operators like Add, Sub, Mul etc.

  • Week 5 & 6: It took me around 5 weeks to get my first Pull Request merged but once we had a concrete structure for our project down, the next 2 weeks went pretty smoothly and I was able to get 6 more Pull Requests merged in this time frame. Through those 6 PR's I was able to achieve the following through the C backend.

    • Introduced all unary/binary operators & a casting function from i32 to S.
    • Introduced support for directly executing Print statements without Assignment statements.
    • Added support for chaining symbolic operators & brackets (both left and right associative in nature) together.
    • Added support for comparing two symbolic expressions through assert statements using the SymbolicCompare node.
    • Added support for freeing basic variables.
  • Week 7: By the end of week 7, I was able to finish half of my project and we could now support almost all symbolic expressions/functions/operations through the C backend. In this week I achieved the following

    • Added symbolic binary operations like Differentiation and Expansion.
    • Added symbolic elementary functions like Sin, Cos, Log etc.
  • Week 8: During week 8, we started thinking in terms of lifting the symbolic support added to the C backend to all other backends. We thought of making use of the bindC functionality available in Lpython and whenever we would encounter a symbolic function like basic_add or basic_new_stack we would end up making C calls to symengine’s C wrapper header file where all these functions have been defined. Which means that although we are linking the symengine library, we are essentially trying to replicate the above symbolic operation with whatever is already implemented in LPython.For eg

    • Using the CPtr type which in the backend would represent a void pointer which can be treated equivalently to a SymEngine basic type.
    • And using the @ccall decorator with the symengine cwrapper header file passed into it.
  • Week 9: During this week, I worked on introducing symbolic tests based on the ccall decorator. I also made sure that these tests would pass through the newly introduced c_sym and llvm_sym backends

  • Week 10 & 11: I implemented the first half of the replace_symbolic pass during this week . Our approach here was to frame visitors with the following functionalities

    • Visit_Variable - This was responsible for replacing every variable with type S to a CPtr type and also create a relevant placeholder variable of i64 type.
    • Visit_Function - Update the function body with FunctionCall's & SubRoutineCall's as per the symbolic program and also update the functional dependencies.
    • Visit_Print, Visit_Assignment etc
  • Week 12: During this week I finished all remaining functionalities around the replace_symbolic pass.

    • Extended functionality of the pass and added features like chaining of operators.
    • Added freeing of basic variables through the pass.
    • Refactored the pass to cut down on code duplications and repetetions.

List of Issues created / closed: lpython/issues
List of PRs opened / merged / closed: lpython/pulls

Steps to support any Symbolic Operation

Now that we have a proper concrete structure for supporting symbolic stuff, I think the users/contributors should find it easy to add support for any functions or operations as per their use case. For eg Let's say a contributor wants to use the zeta function. The steps necessary for that are as follows

  1. Introduce an IntrinsicFunction node in the intrinsic_registry.h file
enum class IntrinsicScalarFunctions : int64_t {
    ....
    SymbolicExp,
    SymbolicAbs,
    SymbolicZeta
    // ...
};
  1. As the zeta function is yet another unary function like the sin function and we support macros for constants, unary or binary functions, we just need to register the zeta function through the macro.
#define create_symbolic_unary_macro(X)                                                    \
namespace X {                                                                             \
    static inline void verify_args(const ASR::IntrinsicScalarFunction_t& x,               \
            diag::Diagnostics& diagnostics) {                                             \
            // Implementation                                                             \
    }                                                                                     \
                                                                                          \
    static inline ASR::expr_t* eval_##X(Allocator &/*al*/, const Location &/*loc*/,       \
            ASR::ttype_t *, Vec<ASR::expr_t*> &/*args*/) {                                \
            // Implementation                                                             \
    }                                                                                     \
                                                                                          \
    static inline ASR::asr_t* create_##X(Allocator& al, const Location& loc,              \
            Vec<ASR::expr_t*>& args,                                                      \
            const std::function<void (const std::string &, const Location &)> err) {      \
            // Implementation                                                             \
    }                                                                                     \
} // namespace X

create_symbolic_unary_macro(SymbolicAbs)
create_symbolic_unary_macro(SymbolicExpand)
create_symbolic_unary_macro(SymbolicZeta)
  1. Register it through the correct SymEngine function in process_intrinsic_function in the replace_symbolic pass.
    void process_intrinsic_function(Allocator &al,  const Location &loc, ASR::IntrinsicScalarFunction_t* x, SymbolTable* module_scope,
        ASR::expr_t* target){
        int64_t intrinsic_id = x->m_intrinsic_id;
        switch (static_cast<LCompilers::ASRUtils::IntrinsicScalarFunctions>(intrinsic_id)) {
            ...........
            ...........
            case LCompilers::ASRUtils::IntrinsicScalarFunctions::SymbolicAbs: {
                process_unary_operator(al, loc, x, module_scope, "basic_abs", target);
                break;
            }
            case LCompilers::ASRUtils::IntrinsicScalarFunctions::SymbolicZeta: {
                process_unary_operator(al, loc, x, module_scope, "basic_zeta", target);
                break;
            }
            default: {
                throw LCompilersException("IntrinsicFunction: `"
                    + ASRUtils::get_intrinsic_name(intrinsic_id)
                    + "` is not implemented");
            }
        }
    }

Hence in 3 simple steps, one can now register any symbolic operations or functions in the ASR and use it through the LPython Compiler.

Future Work

As for future work, a short term goal could be to reduce the code duplication even further through some macros based approach. A long term goal could be to investigate how to make the symbolic tests work with the FAST flag enabled for the llvm_sym backend.

Learnings

I learnt a lot of new things throughout the journey:

  • I was an amateur in the field of compilers and after working with LPython, I think I understand the working of a compiler much better. Technical aspects of a Compiler like Symbol Table, Visitors and other things are now much more clear to me.
  • I pursued an internship simultaneously, hence time management was the key quality I learnt here.
  • I now understand the general workflow for debugging any issue related to LPython/LFortran.
  • I implemented 3 backends for testing out symbolic functionalties and CI support for them respectively , hence now I understand how to frame yml files for github actions/workflows.
  • git version control, conda, mamba, etc.

Acknowledgments

I would also like to take this opportunity to express my gratitude to those who have provided me with tremendous support and guidance during the past few months. In particular, I would like to thank Ondřej Čertík, Thirumalai Shaktivel, Smit Lunagariya, Ubaid Sheikh and Gagandeep Singh for their invaluable contributions in reviewing my PRs and providing assistance whenever I encountered difficulties.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment