@Madhav2310
Last active August 21, 2023 16:35

GSoC'22 - Final Project Report

The following report summarizes the work done during Google Summer of Code 2022 along with the results, scope for improvements and future work. This also serves as the final project report with all the contributions.

Basic Info

About the Project

LPython is an ahead-of-time compiler for Python, built using the Abstract Semantic Representation (ASR) technology. It is currently in the pre-alpha stage and under heavy development. LPython is written in C++, and it has multiple backends to generate code, including LLVM and C++. It is designed as a library with separate building blocks – the parser, Abstract Syntax Tree [AST], Abstract Semantic Representation [ASR], semantic phase, codegen – that are all exposed to the user or developer in a natural way to make it easy to contribute back. It works on Windows, Linux, and macOS.

The speed of LPython comes from the high-level optimizations done at the ASR level, as well as the low-level optimizations that LLVM can perform. My project, in particular, involved discussing which modules LPython would need (initially from a scientific computing perspective), creating a priority list, and then implementing each module properly. The long-term aim of this work is to make LPython work for any Python code.

Goals:

  • Implementing priority modules needed for LPython.
  • Creating extensive integration tests for respective functions in modules.
  • Zero bugs - Fix the currently identified bugs.
  • The best possible performance for numerical array-oriented code.
  • Compile a subset of Python while remaining compatible with Python 3+, and run on all platforms.
  • Fast compilation.
  • Make the documentation user- and developer-friendly.

Communication and Work Management

  • Weekly video conferences over Google Meet were the primary mode of communication with the mentors.
  • The LPython Zulip workspace was used to resolve doubts and to discuss suggestions and comments, along with GitHub Issues for faster coordination.
  • Weekly blogs and updates were shared with the community to keep everyone up to date with the project.

Pull Requests and Issues

Phase 1: Math, String and Decimal libraries

  • #377: With this PR, I implemented the trunc() function of the Math library.
  • #395: Raised an issue about passing a list iterable to a function.
  • #397: Initial implementation of the Capitalize() function of the String library.
  • #564: With this PR, I implemented the cbrt() (cube root) and Exp2 functions of the Math library.
  • #567: Raised an issue to clarify the handling of different function arguments and their corresponding return types.
  • #581: With this PR, I attempted to overload the pow() and Fabs functions of the Math library.
  • #599: Bug: Overload based on return type.
  • #669: With this PR, I expanded the String library with the Upper() and Lower() functions.
  • #679: With this PR, I initiated the implementation of the Decimal library with the dataclass.
  • #696: Raised an issue about the need for class functionality to implement libraries such as decimal, fraction, numbers, etc.
  • #722: Raised an issue about a bug in annotation assignment in dataclass.
  • #731: With this PR, I overloaded the pow() and Fabs functions of the Math library for several datatypes.
  • #732: Raised an issue about the missing feature of iterating over strings and other iterables.
  • #733: Raised an issue about a bug that causes string-to-list casting to fail.
  • #747: With this PR, I added integration and reference tests for the Pow and Fabs functions of the Math library.

Phase 2: Statistics and Numpy_intrinsic libraries

  • #769: With this PR, I initiated the implementation of the Statistics Library with the Mean, Geometric Mean & Harmonic Mean functions.
  • #770: Raised an issue about a bug where DoLoops work with an i32 increment but fail with every other datatype.
  • #771: With this PR, I resolved issue #770 by generalizing the increment to i8, i16, i32, and i64 values.
  • #876: With this PR, I overloaded Pow to accept a modulo as the third argument, along with integration tests.
  • #877: With this PR, I implemented the Fmean function and overloaded Mean for various datatypes.
  • #901: With this PR, I expanded the Statistics library with Variance and Stdev functions.
  • #956: With this PR, I overloaded the Gmean and Hmean functions for other datatypes.
  • #958: With this PR, I expanded the Statistics library with the Mode function.
  • #961: With this PR, I expanded the Statistics library with Covariance and Correlation functions.
  • #1002: With this PR, I expanded the Numpy_intrinsic library with Sinh and Cosh functions along with extensive tests.
  • #1003: With this PR, I overloaded Covariance and Correlation functions for various datatypes.
  • #1004: With this PR, I expanded the Statistics library with Pvariance and Pstdev functions.
  • #1005: With this PR, I expanded the Statistics library with the Linear Regression function.
  • #1064: With this PR, I expanded the Numpy_intrinsic library with Tanh and Exp functions along with extensive tests.
  • #1075: With this PR, I expanded the Numpy_intrinsic library with Arcsinh and Arccosh functions along with extensive tests.
  • #1081: With this PR, I expanded the Numpy_intrinsic library with the Arctanh function along with extensive tests.
  • #1083: With this PR, I expanded the Numpy_intrinsic library with Floor and Ceil functions along with extensive tests.
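
To give a sense of what the list-based Statistics functions above involve, here is a minimal sketch in plain Python. It is not the code from the PRs: the function bodies, the error handling, and the log-based geometric mean are assumptions made for illustration, and builtin `int`/`float` annotations are used so that the example runs under CPython.

```python
import math


def mean(x: list[int]) -> float:
    """Arithmetic mean of a non-empty list of integers."""
    total: float = 0.0
    for value in x:
        total += float(value)  # explicit int -> float cast before adding
    return total / float(len(x))


def geometric_mean(x: list[int]) -> float:
    """Geometric mean of positive integers, computed through logarithms."""
    log_sum: float = 0.0
    for value in x:
        if value <= 0:
            raise ValueError("geometric_mean requires positive values")
        log_sum += math.log(float(value))
    return math.exp(log_sum / float(len(x)))


def harmonic_mean(x: list[int]) -> float:
    """Harmonic mean of positive integers."""
    reciprocal_sum: float = 0.0
    for value in x:
        if value <= 0:
            raise ValueError("harmonic_mean requires positive values")
        reciprocal_sum += 1.0 / float(value)
    return float(len(x)) / reciprocal_sum


def variance(x: list[int]) -> float:
    """Sample variance (divides by n - 1); needs at least two values."""
    n: int = len(x)
    if n < 2:
        raise ValueError("variance requires at least two values")
    m: float = mean(x)
    squared_diff_sum: float = 0.0
    for value in x:
        diff: float = float(value) - m
        squared_diff_sum += diff * diff
    return squared_diff_sum / float(n - 1)
```

Each function takes a single concrete argument type, which is why supporting other datatypes meant adding separate overloads rather than relying on dynamic typing.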

Future Scope

  • The Numpy_intrinsic library, a high-priority library, still has several functions left to implement, and I will continue working on them in the coming weeks.
  • Several other important libraries are yet to be implemented, such as String, Number, Fraction, Decimal, OS, Itertools, Functools, Time, etc. However, most of these require classes to be implemented in the backend, so this work can proceed in parallel.
  • Most functions are implemented with lists; as we move forward, we should overload them for tuples (recently implemented), sets (not implemented), and dictionaries (not implemented).

My Learnings

Throughout the project, I have grown a lot. I implemented, in the backend, functions that I had simply been using all these years. I learned about algorithms, testing, and the level of detail that goes into building them. I learned about constructing compilers and their various stages. Writing code for corner cases that we usually tend to ignore was very interesting. For example, in LPython it is not possible to add an integer to a floating-point number directly; an explicit cast must be performed before the addition. Participating in GSoC has definitely improved my interpersonal and technical skills. Here are some of the prominent ones:

  • Collaborating with different teams across different time zones.
  • Understanding and implementing the opinions and suggestions of different stakeholders in the project.
  • Studying the workflow of projects and possible customizations that can be made.
  • Analysing different approaches to solve a problem along with their pros and cons.
  • Making a simple and easy-to-follow process that can easily be understood and followed by the community members.
  • Writing clean and well-commented code.
  • Understanding and reviewing code written by someone else and modifying it.
  • Implementing checks and error handling to alert the user in case of any malfunction.
  • Learning about and working with various tech stacks and technologies like C++, Python, LLVM, libasr, ASTs, etc.
  • Effectively communicating the project details by writing weekly blogs and sending updates to the whole community.

Overall, this project has made me a better programmer, problem solver, debugger, tester, and team player!
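
As a small illustration of the integer and floating-point corner case mentioned in this section, the sketch below writes every conversion explicitly. The function names are hypothetical, and builtin `int` and `float` stand in for LPython's sized types (such as the i32 mentioned earlier) so that the code runs under CPython, where the casts are optional.

```python
def add_int_and_float(i: int, x: float) -> float:
    # Convert the integer explicitly instead of relying on implicit promotion.
    return float(i) + x


def scale_down(x: float, n: int) -> int:
    # Cast the divisor to float for the division, then cast the result back
    # to an integer; int() truncates toward zero.
    return int(x / float(n))
```

For example, `add_int_and_float(3, 0.5)` evaluates to `3.5`, and `scale_down(7.5, 2)` evaluates to `3`.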

Note of Thanks

The project would not have been possible without the ongoing support of many people. I would especially like to thank:

  • Gagandeep Singh, for being the most supportive and patient mentor. If it weren't for your guidance, sir, I wouldn't have made it this far.
  • Ondřej Čertík, Smit Lunagariya, and Naman Gera, for discussing, suggesting improvements, and assisting me throughout the project.
  • Ubaid Shaikh, for being a great partner. I am really grateful for his insights and discussions.
  • The LPython community, for discussing the project during community calls and being so appreciative of it.
  • Friends and family, for their encouragement and moral support throughout this project.
