Student: Anutosh Bhat
Project: Implementing Symbolic Algorithms as a part of ASR
Organisation: Python Software Foundation
Mentor: Ondřej Čertík
Proposal: GSoC 23 Proposal
Weekly Blogs: anutosh491.github.io
Through my GSoC project, I was able to implement symbolic operations & algorithms as a part of ASR (Abstract Semantic Representation). The following things have been achieved
- Symbolic elements like Symbols, Constants like pi, Numerics like Integers , Unary operators/functions like Sin, Cos and Binary operators/functions like Add, Differentiation etc were introduced as intrinsic functions in intrinsic_registry.h
- 3 backends: cpython_sym , c_sym and lvm_sym were introduced to test out symbolic operations against various backends.
- A set of symbolic tests were introduced as integration tests and support for print & assert statements was added to test out symbolic operations.
- The replace_symbolic pass was introduced to to lift the symbolic support from the C backend and extend its functionality to all other backends like LLVM.
Hence now we can compile and execute the following program through multiple backends.
from sympy import Symbol, pi, S
from lpython import S
def test_chained_operations():
x: S = Symbol('x')
y: S = Symbol('y')
z: S = Symbol('z')
a: S = Symbol('a')
b: S = Symbol('b')
# Chained Operations
w: S = (x + y) * ((a - b) / (pi + z))
result: S = (w ** S(2) - pi) + S(3)
# Print Statements
assert(result == S(3) + (a -b)**S(2)*(x + y)**S(2)/(z + pi)**S(2) - pi)
print(result)
# Additional Variables
c: S = Symbol('c')
d: S = Symbol('d')
e: S = Symbol('e')
f: S = Symbol('f')
# Chained Operations with Additional Variables
x = (c * d + e) / f
y = (x - S(10)) * (pi + S(5))
z = y ** (S(2) / (f + d))
result = (z + e) * (a - b)
# Print Statements
assert(result == (a - b)*(e + ((S(5) + pi)*(S(-10) + (e + c*d)/f))**(S(2)/(d + f))))
print(result)
assert(x == (e + c*d)/f)
print(x)
assert(y == (S(5) + pi)*(S(-10) + (e + c*d)/f))
print(y)
assert(z == ((S(5) + pi)*(S(-10) + (e + c*d)/f))**(S(2)/(d + f)))
print(z)
test_chained_operations()
(lf) anutosh491@spbhat68:~/lpython/lpython$ lpython --enable-symengine --backend=llvm symbolics.py
3 + (a - b)**2*(x + y)**2/(z + pi)**2 - pi
(a - b)*(e + ((5 + pi)*(-10 + (e + c*d)/f))**(2/(d + f)))
(e + c*d)/f
(5 + pi)*(-10 + (e + c*d)/f)
((5 + pi)*(-10 + (e + c*d)/f))**(2/(d + f))
(lf) anutosh491@spbhat68:~/lpython/lpython$ lpython --enable-symengine --backend=c symbolics.py
3 + (a - b)**2*(x + y)**2/(z + pi)**2 - pi
(a - b)*(e + ((5 + pi)*(-10 + (e + c*d)/f))**(2/(d + f)))
(e + c*d)/f
(5 + pi)*(-10 + (e + c*d)/f)
((5 + pi)*(-10 + (e + c*d)/f))**(2/(d + f))
As described in the proposal, the initial approach while working on such a project had to be experimental. There was no clear roadmap and much of the first few weeks was spent in trial & error. This was important as we did lot of iterations over the best approach for implementing symbolic algorithms.
-
Week 1: I started introducing the backbone of any symbolic program in this week
- Introduced the
SymbolicExpression
ttype - Introduced
SymbolicSymbol
as an intrinsic function - Supported imports from
SymPy
forCPython
- Introduced the
-
Week 2 & 3: By this time we could replicate a
symbol
in the ASR but we wanted to move to the next step which is supportingassignment
&print
statements. We discussed a good amount of ways to do this ranging from framing passes to adding support through theLLVM
backend . As we were finally going to useSymEngine's C interface
, we thought the best approach here would be to add the symbolic support in the C backend itself. We also had to throughly understand how to deal with SymEngine'sBasic
type. This is because initialization, assignment and other operations for this type were quite different that theS
type inSymPy
. Hence we had to understand working of functions likebasic_new_stack
,basic_new_heap
&basic_free_stack
. -
Week 4: By this week we were able to generate C code for simple assignment & print statements where we were able to assign and print
symbols
and constants likepi
. I started adding support for binary operators likeAdd
,Sub
,Mul
etc. -
Week 5 & 6: It took me around 5 weeks to get my first Pull Request merged but once we had a concrete structure for our project down, the next 2 weeks went pretty smoothly and I was able to get 6 more Pull Requests merged in this time frame. Through those 6 PR's I was able to achieve the following through the
C
backend.- Introduced all unary/binary operators & a casting function from
i32
toS
. - Introduced support for directly executing
Print
statements withoutAssignment
statements. - Added support for chaining symbolic operators & brackets (both left and right associative in nature) together.
- Added support for comparing two symbolic expressions through
assert
statements using theSymbolicCompare
node. - Added support for freeing
basic
variables.
- Introduced all unary/binary operators & a casting function from
-
Week 7: By the end of week 7, I was able to finish half of my project and we could now support almost all symbolic expressions/functions/operations through the C backend. In this week I achieved the following
- Added symbolic binary operations like
Differentiation
andExpansion
. - Added symbolic elementary functions like
Sin
,Cos
,Log
etc.
- Added symbolic binary operations like
-
Week 8: During week 8, we started thinking in terms of lifting the symbolic support added to the
C
backend to all other backends. We thought of making use of the bindC functionality available in Lpython and whenever we would encounter a symbolic function likebasic_add
orbasic_new_stack
we would end up making C calls to symengine’s C wrapper header file where all these functions have been defined. Which means that although we are linking the symengine library, we are essentially trying to replicate the above symbolic operation with whatever is already implemented in LPython.For eg- Using the
CPtr
type which in the backend would represent avoid pointer
which can be treated equivalently to aSymEngine basic
type. - And using the
@ccall
decorator with the symengine cwrapper header file passed into it.
- Using the
-
Week 9: During this week, I worked on introducing symbolic tests based on the
ccall
decorator. I also made sure that these tests would pass through the newly introducedc_sym
andllvm_sym
backends -
Week 10 & 11: I implemented the first half of the
replace_symbolic
pass during this week . Our approach here was to frame visitors with the following functionalities- Visit_Variable - This was responsible for replacing every variable with type
S
to aCPtr
type and also create a relevant placeholder variable ofi64
type. - Visit_Function - Update the function body with
FunctionCall
's &SubRoutineCall
's as per the symbolic program and also update the functional dependencies. - Visit_Print, Visit_Assignment etc
- Visit_Variable - This was responsible for replacing every variable with type
-
Week 12: During this week I finished all remaining functionalities around the
replace_symbolic
pass.- Extended functionality of the pass and added features like chaining of operators.
- Added freeing of basic variables through the pass.
- Refactored the pass to cut down on code duplications and repetetions.
List of Issues created / closed: lpython/issues
List of PRs opened / merged / closed: lpython/pulls
Now that we have a proper concrete structure for supporting symbolic stuff, I think the users/contributors should find it easy to add support for any functions or operations as per their use case. For eg Let's say a contributor wants to use the zeta
function. The steps necessary for that are as follows
- Introduce an IntrinsicFunction node in the intrinsic_registry.h file
enum class IntrinsicScalarFunctions : int64_t {
....
SymbolicExp,
SymbolicAbs,
SymbolicZeta
// ...
};
- As the zeta function is yet another unary function like the sin function and we support macros for constants, unary or binary functions, we just need to register the zeta function through the macro.
#define create_symbolic_unary_macro(X) \
namespace X { \
static inline void verify_args(const ASR::IntrinsicScalarFunction_t& x, \
diag::Diagnostics& diagnostics) { \
// Implementation \
} \
\
static inline ASR::expr_t* eval_##X(Allocator &/*al*/, const Location &/*loc*/, \
ASR::ttype_t *, Vec<ASR::expr_t*> &/*args*/) { \
// Implementation \
} \
\
static inline ASR::asr_t* create_##X(Allocator& al, const Location& loc, \
Vec<ASR::expr_t*>& args, \
const std::function<void (const std::string &, const Location &)> err) { \
// Implementation \
} \
} // namespace X
create_symbolic_unary_macro(SymbolicAbs)
create_symbolic_unary_macro(SymbolicExpand)
create_symbolic_unary_macro(SymbolicZeta)
- Register it through the correct
SymEngine
function inprocess_intrinsic_function
in thereplace_symbolic
pass.
void process_intrinsic_function(Allocator &al, const Location &loc, ASR::IntrinsicScalarFunction_t* x, SymbolTable* module_scope,
ASR::expr_t* target){
int64_t intrinsic_id = x->m_intrinsic_id;
switch (static_cast<LCompilers::ASRUtils::IntrinsicScalarFunctions>(intrinsic_id)) {
...........
...........
case LCompilers::ASRUtils::IntrinsicScalarFunctions::SymbolicAbs: {
process_unary_operator(al, loc, x, module_scope, "basic_abs", target);
break;
}
case LCompilers::ASRUtils::IntrinsicScalarFunctions::SymbolicZeta: {
process_unary_operator(al, loc, x, module_scope, "basic_zeta", target);
break;
}
default: {
throw LCompilersException("IntrinsicFunction: `"
+ ASRUtils::get_intrinsic_name(intrinsic_id)
+ "` is not implemented");
}
}
}
Hence in 3 simple steps, one can now register any symbolic operations or functions in the ASR and use it through the LPython Compiler.
As for future work, a short term goal could be to reduce the code duplication even further through some macros based approach. A long term goal could be to investigate how to make the symbolic tests work with the FAST
flag enabled for the llvm_sym
backend.
I learnt a lot of new things throughout the journey:
- I was an amateur in the field of compilers and after working with LPython, I think I understand the working of a compiler much better. Technical aspects of a Compiler like
Symbol Table
,Visitors
and other things are now much more clear to me. - I pursued an internship simultaneously, hence time management was the key quality I learnt here.
- I now understand the general workflow for debugging any issue related to LPython/LFortran.
- I implemented 3 backends for testing out symbolic functionalties and CI support for them respectively , hence now I understand how to frame
yml
files for github actions/workflows. git
version control,conda
,mamba
, etc.
I would also like to take this opportunity to express my gratitude to those who have provided me with tremendous support and guidance during the past few months. In particular, I would like to thank Ondřej Čertík, Thirumalai Shaktivel, Smit Lunagariya, Ubaid Sheikh and Gagandeep Singh for their invaluable contributions in reviewing my PRs and providing assistance whenever I encountered difficulties.