"Flang" - a frontend for the Fortran programming language, a 2013 Google summer of code(GSoC) proposal for the LLVM project.
Name: Aleksei (Alex) Lorenz
University: National University of Ireland, Galway
Course/Program: Bachelors in Computer Science and Information Technology
Contact email: email@example.com
Contact phone number: (+353) 870544409
Fortran is a "general-purpose, imperative programming language that is especially suited to numeric computation and scientific computing". Flang is a Fortran frontend that uses LLVM as its backend. It was started by Bill Wendling, who wrote much of the core and lexer code and some of the parsing and semantics code. For this GSoC, I plan to turn flang into a fully featured frontend that supports a useful subset of Fortran 77, 90 and perhaps 95, and that is able to compile and run programs and libraries such as the Basic Linear Algebra Subprograms (BLAS, http://www.netlib.org/blas/) and the Linear Algebra Package (LAPACK, http://www.netlib.org/lapack/).
I think that this project will be useful for LLVM for a number of reasons. First of all, it will expand the array of native LLVM frontends and reduce the dependency on GCC for Fortran. It will also provide further proof of how mature and capable LLVM is. Lastly, it might encourage some companies and organisations that use Fortran to adopt flang and support the LLVM project. It might also bring wider recognition to LLVM by reaching a new audience, and possibly bring new developers into the project.
I also think that this project will be useful for Fortran. Right now you can compile Fortran with LLVM by using GCC and DragonEgg, but with flang you will be able to compile Fortran with LLVM without the need for DragonEgg. This could lead to faster compilation times, better errors and warnings, more efficient code generation, improved support for more and newer architectures, and easier integration with tools such as IDEs, static analysers and documentation generators. It should also make Fortran development easier on non-Unix systems such as Windows.
The Fortran programming language is covered by many standards, such as Fortran 77, 90, 95, 2003 and 2008. For this GSoC I will work on providing a high quality implementation of the important and most widely used features from the Fortran 77, 90 and perhaps 95 standards. I believe that this approach is better than focusing on a single full standard such as Fortran 77, because the standard is very broad and has a lot of details that would take a lot of time and hard work to cover. On top of that, some features of the old standard are now considered obsolete and some are rarely used in practice. Therefore I will concentrate on providing a good compiler with insightful errors and warnings, which will be able to compile most Fortran programs and libraries. I think this is a realistic goal, and I am determined enough to achieve it.
I plan to cover the subset of the Fortran standards in two steps. First, I will focus on implementing parsing, AST, semantics and LLVM IR generation for a subset of the Fortran 77 standard which will enable flang to compile and run BLAS. Then I will implement the features and the IR generation for a subset of the Fortran 90 standard which will enable flang to compile and run LAPACK. While working on these steps I will test the implemented features to make sure that they conform to the standard specification.
The Fortran 77 subset which I plan to cover consists of all logical, arithmetic, character and relational expressions from the Fortran 77 standard, the IF construct and statement, the DO statement, the GOTO statement, and the PAUSE, CONTINUE and STOP statements. The set of supported specification statements will include at least the DIMENSION, IMPLICIT, PARAMETER, EXTERNAL, INTRINSIC and SAVE statements and, of course, the core type statements. The DATA, ASSIGN and INCLUDE statements will also be supported. Functions, subroutines and the RETURN, CALL and ENTRY statements will be fully supported. The intrinsic function support will cover at least all the intrinsic functions from the Fortran 77 standard. Both BLAS and LAPACK require the WRITE and FORMAT statements, and I will provide basic support for them. Flang's type system will fully support the core types, strings, and multidimensional and asterisk length arrays. This is quite a large set of features, and it will be enough to compile and run BLAS.
The Fortran 90 subset which I plan to cover includes the SELECT CASE statement, the EXIT and CYCLE control flow statements, RECURSIVE procedures, the WHERE statement, operations on array sections, derived data types and possibly some new intrinsic functions which might be required for LAPACK. This is also quite a large set of features, but the Fortran 77 subset should provide a solid foundation which will enable me to implement them in the required time. These features should be enough to compile and run LAPACK.
If I cover these two subsets before their rough deadline (Week 11), I might work on several other features, including wider IO support, BOZ literal constants, modules, pointers with dynamic memory allocation and the FORALL statement.
As support for various language features is added to flang, testing will be required to verify that the initial and future implementations work as intended. During the first week of development I will expand and improve the testing framework so that it will be able to verify various errors and warnings. The tests will aim to cover the combinations of the implemented features and check for the relevant warnings and errors. I also plan to write some tests which will check the generated LLVM IR.
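One way to verify errors and warnings is to annotate each test source with the diagnostics it should produce, in the spirit of clang's -verify mode. The sketch below is only a minimal illustration in Python, not flang's actual test harness: the `! expected-error {{...}}` comment syntax, the function names and the (line, severity, message) triples are all assumptions made for this example.

```python
import re

# Hypothetical annotation syntax, modelled on clang's -verify comments:
#   INTEGER I I   ! expected-error {{expected end of statement}}
EXPECTED = re.compile(r"!\s*expected-(error|warning)\s*\{\{(.*?)\}\}")


def expected_diagnostics(source):
    """Collect (line, severity, message) triples from annotation comments."""
    found = []
    for lineno, text in enumerate(source.splitlines(), start=1):
        for severity, message in EXPECTED.findall(text):
            found.append((lineno, severity, message))
    return found


def check(source, actual):
    """Compare annotated expectations with the diagnostics a compiler run reported.

    `actual` is a list of (line, severity, message) triples. An expectation
    matches when line and severity agree and the expected text is a substring
    of the reported message. Returns (missing, unexpected).
    """
    remaining = list(actual)
    missing = []
    for exp_line, exp_sev, exp_msg in expected_diagnostics(source):
        for diag in remaining:
            line, sev, msg = diag
            if line == exp_line and sev == exp_sev and exp_msg in msg:
                remaining.remove(diag)  # each reported diagnostic is consumed once
                break
        else:
            missing.append((exp_line, exp_sev, exp_msg))
    return missing, remaining
```

A test passes when both returned lists are empty: every annotated diagnostic was reported, and nothing was reported that the test did not expect.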
I plan to continue the work that Bill Wendling has put into flang. I will follow the existing code style to keep the code consistent. Most of the architectural and implementation ideas for flang were borrowed from clang, which is a good thing, because clang is designed using sound OOP principles which relate well to parsing, ASTs and semantics, and I plan to continue this trend. I also plan to base my code on clang, or reuse some of its code, for things like code generation and the driver. All of this should make flang easily accessible to clang developers and even to developers who are not familiar with clang.
The code generation part of flang will use the LLVM C++ API. The standard functions, such as stop and various string and array operations, will be written in C and will utilize the C standard library. They will then be compiled to LLVM IR using clang and added to flang as an LLVM module and, where necessary (because of ABI or calling convention differences), I will implement wrappers around these functions using the IR builder. I plan to use a simple ABI at the start for ease of development. The Fortran 90 subset, with its array operations, will have an array ABI which will conform to the ftp://ftp.nag.co.uk/sc22wg5/N1901-N1950/N1942.pdf specification. Compatibility with the ABIs of other compilers and languages will be added at a later stage. I will also take a look at the IR produced by DragonEgg, and I hope to take away some ideas from it that will help me later during development. Although I will provide support only for the WRITE and FORMAT statements, I might use an existing library such as http://www.netlib.org/f2c/, because Fortran has a rich set of IO and formatting statements built into the core language, and these require an advanced runtime to support their various functions and parameters.
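As background for the array part of the ABI: Fortran stores arrays in column-major order, so the first subscript varies fastest, and each dimension may have its own lower bound (1 by default). The Python sketch below illustrates the element-offset arithmetic that generated IR has to perform for an explicit-shape array. The function name and the (lower, upper) bounds representation are hypothetical, and the sketch models neither the descriptor layout from the N1942 document nor asterisk length arrays, whose last upper bound is unknown.

```python
def element_offset(subscripts, bounds):
    """Return the zero-based element offset of A(subscripts) in column-major order.

    `bounds` holds one (lower, upper) pair per dimension, e.g. [(1, 3), (0, 4)]
    for `REAL A(3, 0:4)`.
    """
    if len(subscripts) != len(bounds):
        raise ValueError("rank mismatch")
    offset = 0
    stride = 1  # elements between consecutive values of the current subscript
    for index, (lower, upper) in zip(subscripts, bounds):
        if not lower <= index <= upper:
            raise IndexError("subscript out of bounds")
        offset += (index - lower) * stride
        stride *= upper - lower + 1  # extent of this dimension
    return offset
```

For `REAL A(3, 0:4)`, `element_offset((2, 0), [(1, 3), (0, 4)])` gives 1, while `element_offset((1, 1), [(1, 3), (0, 4)])` gives 3, because a whole column of three elements precedes `A(1, 1)`.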
I've already started working on flang, and so far it has been a good experience - I've fixed some bugs and added some new features. I have familiarized myself with most of the code and gained insight into the architectural and design choices. I have also studied several Fortran standards and got to know some of the intricacies that may make this project difficult. I must say that I genuinely enjoy working on flang, and I believe that this kind of project will suit me well for this GSoC. My repository can be viewed at https://github.com/hyp/flang.
I will be available for work throughout the whole summer. I don't have any internships, nor do I plan on going abroad during this summer, so I will work on flang as a full time job. My next college year starts in the first week of September, so during the last 2 weeks I will be working part time on flang, but I will still be committed to the project.
Supporting a compiler is a serious and challenging task, and it can't be stopped after this GSoC is finished. Therefore I plan to continue working on flang even after GSoC. I can't promise full dedication, but I definitely will be fixing bugs and adding new features from time to time.
This is a rough, high-level plan of the work that will be done.
Community bonding period:
Work on improving my Fortran knowledge, write a couple of Fortran programs, read and study the Fortran 77/90/95/2003/2008 standards, study clang for architectural/code ideas, look at the LLVM IR produced by DragonEgg, learn some of the intricacies of Fortran.
Week 1 - Testing framework + Statements + Expressions:
Expand the testing framework.
Fix any lexer bugs/add any missing features.
Implement parsing, semantics and AST for the action, control flow, INCLUDE, SAVE and ASSIGN statements and the expressions from the chosen Fortran 77 subset.
Week 2 - Types + Arrays + Strings + Specifications + IO:
Implement parsing, semantics, AST and expand the type system to allow for core types and arrays and most of the specification statements from the chosen subset of Fortran 77.
Implement parsing, semantics and AST for the WRITE and FORMAT statement.
Week 3 - Functions + Subroutines + IO:
Implement parsing, semantics and AST for statements and expressions for the function/procedure declarations from the chosen subset of Fortran 77. Implement the EXTERNAL, INTRINSIC and PARAMETER specification statements. Implement the RETURN, CALL and ENTRY statements.
Make sure that the parsing, semantics and AST features from the previous weeks are done.
Week 4 - LLVM IR generation 1:
Start work on the code generation framework.
Implement LLVM type generation for the core non-array types.
Implement LLVM IR generation for various action statements and expressions, such as program, variables, operators, assignment, IF, GOTO, loops.
Add LLVM IR/bitcode output.
Week 5 - LLVM IR generation 2:
Implement LLVM types for Arrays and Strings.
Implement LLVM IR generation for intrinsic functions.
Implement LLVM IR generation for array and string indexing, slicing and any other operations.
Week 6 - LLVM IR generation 3:
Implement LLVM IR generation for functions, procedures, subprograms, and call, return and entry statements.
Implement LLVM IR generation for the WRITE and FORMAT statements.
Add optimization passes.
Add object files output.
Add any missing generation features.
Week 7 - BLAS:
Compile and run BLAS using flang. Fix any bugs / add any missing features for its support.
Week 8 - Fortran 90.1:
Implement parsing, semantics and AST for SELECT CASE statement, EXIT and CYCLE statements, WHERE statement and array operations.
Week 9 - Fortran 90.2:
Implement parsing, semantics, AST and extend the type system to support derived types.
Implement RECURSIVE procedures and any additional intrinsic functions.
Week 10 - LLVM IR generation 4:
Implement LLVM types for derived types.
Implement LLVM IR generation for the added features from the Fortran 90 subset.
Week 11 - LAPACK:
Compile and run LAPACK using flang. Fix any bugs / add any missing features for its support.
Week 12 - Driver + polishing:
Work on the flang driver - command line options (similar to clang), help, automatic linking/executable generation, and anything else that is needed.
Implement a command line switch for selecting the standard version, and file extension recognition (e.g. file.f95 -> uses the Fortran 95 standard). Add support for version recognition and control while parsing - give errors/warnings for missing/deprecated features.
Check the tests and make sure that they cover as much as possible and that all the tests for the errors and warnings work as they should.
Polish/Refactor code if required.
Write user manual.
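The extension-based standard selection planned for week 12 could look roughly like the Python sketch below. It is an illustration under stated assumptions rather than a description of the flang driver: only the file.f95 example comes from the plan above, while the other table entries, the fallback standard, the short standard names and the `explicit_standard` parameter (standing in for the command line switch) are hypothetical.

```python
import os

# Illustrative table only: apart from ".f95", these associations are assumptions.
EXTENSION_TO_STANDARD = {
    ".f": "f77",
    ".for": "f77",
    ".f77": "f77",
    ".f90": "f90",
    ".f95": "f95",
}
DEFAULT_STANDARD = "f77"  # hypothetical fallback for unknown extensions


def select_standard(filename, explicit_standard=None):
    """Pick the language standard for one input file.

    An explicitly requested standard (for example from a command line
    switch) wins; otherwise the file extension decides, case-insensitively.
    """
    if explicit_standard is not None:
        return explicit_standard
    extension = os.path.splitext(filename)[1].lower()
    return EXTENSION_TO_STANDARD.get(extension, DEFAULT_STANDARD)
```

Letting the explicit switch override the extension keeps the behaviour predictable when a file's extension does not reflect the standard it was written against.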