ldionne/GSoC2015.md

## GSoC2015.md

      
    Raw
  

              GSoC2015.md
            
          
    Abstract

The goal of this project is to improve the compile-time associative data
structures present in the Hana C++14 metaprogramming library. I aim to
provide a richer interface than what is currently available and also to
improve the compile-time performance of those data structures, while
keeping the runtime performance on-par with non-naive hand-written code.
Background information

I am an undergraduate in mathematics with a passion for programming.
My educational and programming backgrounds are summarized on my resume 1.
I could write those here, but that would be redundant.
My main programming interests are metaprogramming and functional programming
in C++ or other languages (Haskell, Ruby). I usually love it when there
are some abstract mathematics hidden somewhere, so I search for that. I am
interested in contributing to Boost because of the quality of what gets in
there. One of my primary goals at this early stage of my life is to get
better. Hence, having to satisfy very high standards and to justify my
decisions in front of very smart people is a good thing.
The project I am proposing is a continuation of the GSoC project I worked
on last year, which is a library for metaprogramming in C++14 called Hana.
All in all, I have been working on this project for almost two years, and
I am currently receiving a stipend from Boost to continue working on the
project as an extension to last year's GSoC. The library is also about to
be formally reviewed for inclusion in Boost, which shows the serious of the
initiative.
If I am accepted for GSoC this year, Joel Falcou, who mentored me last year,
agreed to mentor me again this year.
Project proposal

Hana 2 is a library providing data structures and algorithms working on
heterogeneous objects. It is basically a standard library where you would
replace std::vector by std::tuple (which makes it quite different from the
standard library).
There is still room for improvement in many parts of the library. Some of
the most lacking parts are the associative data structures, Set and Map,
which provide a limited interface and are implemented using tuples and
linear searching internally, which is inefficient. My plans for this
summer are to research and write a new implementation for these data
structures. I will also research how the interface of these structures
can be improved.
More specifically, here is a list of tasks that need to be done:


Find what functionality is missing in the associative data structures.


Find a way to incorporate the new functionality in the current concepts
provided by the library, or modify those concepts to improve their
generality and consistency.


Explore possible implementation techniques for compile-time maps and sets.


Benchmark the compile-time and runtime performance of the structures.


Re-implement those data structures, with unit tests,
examples and documentation.


Proposed milestones and schedule

April 27: Start working on my presentation for C++Now 2015 3. C++Now is a
Boost-centric conference held every year in Aspen; I will be
presenting Hana this year and the presentation will include
a presentation of the proposed work for the summer.
May 16: End of the C++Now conference. Start gathering a list of missing
functionality from the associative data structures.
June 8: Classify the functionality that we need for the associative
data structures into abstract concepts. Compare with what's done
elsewhere, especially in functional programming languages. Speaking
from experience, this is usually difficult to do properly and I
must take my time if I want to come up with a killer interface.
June 26: Submit midterm evaluations to Google.
July 1: Write compile-time and runtime benchmarks for the associative data
structures. For this, I can use the compile-time benchmark framework
I set up during last year's GSoC. I will use those benchmarks to
drive my research of the most efficient implementation.
July 10: Explore possible implementation techniques. I foresee a lot of
optimization possibilities for very specific (but very useful)
cases, like a set containing types only. Making this all fit
together should be a quite a challenge. Each implementation
technique should be thoroughly benchmarked and it should be
possible to justify the final implementation choice with raw data.
August 1: Complete the documentation and polish the work that was done.
Prepare to officially incorporate the changes into the project.
August 17: Pencils down date.
Personal details


Name: Louis Dionne


University: Laval University (Quebec, Canada)


Major: Mathematics


Degree program: B.Sc.


Email: ldionne.2@gmail.com


Homepage: http://ldionne.com


Availability:
I plan on working fulltime on the project during the whole summer,
which means early May to late August. I am currently having an
extension of GSoC 2014, so the exact start date is not meaningful:
I've been working on the project non-stop since the beginning of
GSoC in 2014.


Knowledge of (from 0 to 5):

C++ 98/03: 4
C++ 11/14: 5
C++ Standard Library: 4
Boost: 4
Git: 4
Doxygen: 4
CMake: 4

I use the SublimeText editor on OS X and compile on the command line.
I am deeply familiar with Clang and GCC, and I know their bug report
system a bit too well for my liking. :-)