@goncalotomas
Last active August 29, 2017 14:25
GSoC 2017 Final Project Report

Project details

Organisation: Beam Community
Team: Lasp-Lang
Title: Implementing a Real World Application in the Lasp Programming Language
Goal: Provide a way to evaluate Lasp's performance by implementing real world applications with it
Mentors: Christopher Meiklejohn and Vitor Enes Duarte

Foreword/Disclaimer

Whilst participating in GSoC I was also working on my Master's thesis, which implements FMKe, a new benchmark tailored for distributed key-value storage systems. Each Lasp server instance participates in a distributed key-value store, so although the project initially aimed to evaluate Lasp's general performance, this common interest shifted the focus towards evaluating only Lasp's distributed key-value store.

Project Goals

As previously mentioned, this project is related to my Master's thesis.
The project goals, as defined in the initial project submission, were to implement the benchmark I am designing for my thesis on top of Lasp's key-value storage. Since the benchmark is already available for other databases, the main goal was to evaluate the performance of Lasp's distributed storage system by comparing benchmark results across all supported key-value stores.

Contributions

These are the contributions that I made or was involved in, not only to the Lasp programming language but to the BEAM community in general:

  1. Erlang/OTP 20 on Homebrew
  2. New benchmarking tool
  3. Protocol buffer interface
  4. Lasp Erlang client

Erlang/OTP 20 on Homebrew

A new major version of Erlang/OTP is released roughly every year, and the previous two releases (versions 18 and 19) introduced some breaking changes. It therefore felt important to work with version 20, which came out on June 21st, from the start, so that little or no time would later be spent adapting code to it.

Creating a pull request to support the latest version of Erlang/OTP in Homebrew required me to support previous versions as well, and in the context of this contribution the following issues were created:

In terms of pull requests, the following were opened (and successfully merged):

New benchmarking tool

This project's goals revolved around evaluating the performance of the Lasp programming system, so a benchmarking tool was clearly useful in this context. One good open-source tool already existed, Basho Bench, but Basho, the recently defunct company behind it, had not maintained it in a long time: the project was stale and did not compile with versions of Erlang/OTP newer than 16.

I suggested that we bring this tool back to life by taking the original code and fixing its incompatibilities with the latest Erlang versions, which took a considerable amount of time, as well as adding CI tests and migrating to rebar3, the latest build tool for Erlang/OTP. An independent (detached) fork of the Basho repository was created, and the following changes were made:

Note that the pull request with the biggest changes contains commits from other authors (specifically from here) that remain credited, since some work had already been done by several people to try and use this tool with recent Erlang versions and build tools.

Once the changes were complete, a package was published on Hex, the standard package manager for Erlang and Elixir projects, to make the tool available to other Beam Community users.

Protocol Buffer Interface

In the FMKe benchmark, the application server communicates with the storage layer via an arbitrary interface module. The databases for which FMKe is already implemented all use protocol buffers to communicate. After initially considering and planning an RPC interface between FMKe and the Lasp KV storage, we opted to provide a protocol buffer interface instead, not only to make results as comparable as possible but also because it would be a good addition to Lasp.

Since there was no protocol buffer interface for Lasp, I wrote a library from scratch that contained the protocol buffer messages and some utility functions written in Erlang to easily encode and decode operations. Since this library was to be later used as a dependency by Lasp itself, it was also made available as a Hex package. Several revisions of the protocol buffer messages had to be made to ensure that encoding all operations and state required for Lasp was possible.
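To illustrate the kind of encode and decode helpers described above, here is a minimal sketch, written in Python rather than Erlang. It does not reproduce the library's actual messages or API: the message codes `MSG_GET` and `MSG_PUT`, the function names, and the framing (a 4-byte big-endian length followed by a 1-byte message code) are all assumptions made for illustration, and the payload is treated as opaque bytes where a serialized protocol buffer message would go.

```python
import struct
from typing import Tuple

# Hypothetical message codes; the real library defines its own messages.
MSG_GET = 1
MSG_PUT = 2

HEADER_SIZE = 5  # 4-byte length + 1-byte message code


def encode_frame(msg_code: int, payload: bytes) -> bytes:
    """Wrap a serialized message in an assumed length-prefixed frame.

    The length field counts the message code byte plus the payload.
    """
    return struct.pack(">IB", len(payload) + 1, msg_code) + payload


def decode_frame(data: bytes) -> Tuple[int, bytes, bytes]:
    """Split one frame off the front of `data`.

    Returns (message code, payload, remaining bytes).
    """
    if len(data) < HEADER_SIZE:
        raise ValueError("incomplete frame header")
    length, msg_code = struct.unpack(">IB", data[:HEADER_SIZE])
    end = 4 + length  # the length field itself is not counted in `length`
    if len(data) < end:
        raise ValueError("incomplete frame body")
    return msg_code, data[HEADER_SIZE:end], data[end:]
```

Returning the leftover bytes lets a caller decode several frames from one buffer, which is convenient when requests arrive back to back on the same connection.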

Some modifications were made to the Lasp repository, including a server that listens on a configurable port for protocol buffer requests. These changes are here and can only be merged once this pull request is merged.

Lasp Erlang Client

I hope that the protocol buffer interface and library will prove useful on their own, but a client library that abstracts away this type of communication makes it easier to request operations on a remote Lasp node and also eases the implementation of the Lasp driver for the FMKe benchmark. I wrote a minimal client library that provides a simple put and get interface to update and fetch keys from Lasp; it manages the encoding of requests and the decoding of server responses transparently.

I implemented this library in a new repository, but decided not to release it as a Hex package since some Lasp functionality is not available in the client (further details below).
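The put and get abstraction described in this section can be sketched roughly as follows. This is an illustrative Python sketch, not the Erlang client's real API: the `KVClient` class, the `in_memory_server` helper, and the request fields are hypothetical, JSON stands in for the protocol buffer encoding, and plain values stand in for Lasp's CRDT state.

```python
import json
from typing import Any, Callable, Dict


class KVClient:
    """Hypothetical client that hides request encoding and response decoding."""

    def __init__(self, send: Callable[[bytes], bytes]) -> None:
        # `send` is the transport: it takes request bytes and returns response bytes.
        self._send = send

    def _call(self, request: Dict[str, Any]) -> Dict[str, Any]:
        raw = self._send(json.dumps(request).encode("utf-8"))
        return json.loads(raw.decode("utf-8"))

    def put(self, key: str, value: Any) -> bool:
        return self._call({"op": "put", "key": key, "value": value})["ok"]

    def get(self, key: str) -> Any:
        # Returns None when the key is unknown to the server.
        return self._call({"op": "get", "key": key}).get("value")


def in_memory_server() -> Callable[[bytes], bytes]:
    """Stand-in for a remote node, so the sketch runs without a network."""
    store: Dict[str, Any] = {}

    def handle(raw: bytes) -> bytes:
        request = json.loads(raw.decode("utf-8"))
        if request["op"] == "put":
            store[request["key"]] = request["value"]
            response = {"ok": True}
        else:
            key = request["key"]
            response = {"ok": key in store, "value": store.get(key)}
        return json.dumps(response).encode("utf-8")

    return handle
```

A caller would only ever touch `put` and `get`, for example `client = KVClient(in_memory_server())` followed by `client.put("patient_1", {"name": "Ana"})`. Passing the transport in as a function keeps the encoding logic independent of how bytes reach the server, so a socket-based transport could replace the in-memory one.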

Contrast with Project Goals

In brief, consider the project goals to be the following set of tasks:

  • Design and implement protocol buffer messages to encode Lasp operations and CRDT states ✔️
  • Implement a client library that abstracts the use of protocol buffer messages ✔️
  • Adapt Lasp to respond to requests via protocol buffer interface ✔️
  • Write FMKe driver for Lasp ❌
  • Measure Lasp performance versus competition (Redis, Riak, AntidoteDB) ❌

(✔️ - done, ❌ - not finished)

Detailed status report

The comparative evaluation of the Lasp distributed key-value storage depends on the successful implementation of its components: the protocol buffer messages, the changes to Lasp and the Erlang client, and finally the FMKe driver. Currently, the FMKe driver is still under development, so we can't use FMKe to compare Lasp KV with other databases yet. However, since this is valuable work for my thesis, I will continue working on it regardless of the GSoC timeline.

Lasp is more than its distributed key-value storage sub-system and thus allows for much more than fetching and updating keys, so one feature that would make the Lasp Erlang client more flexible is support for operations not related to key-value storage.

The changes to Lasp that allow operations to be processed via the protocol buffer messages have not yet been merged, because some changes need to be made in the lasp-lang/types repository to allow for a different representation of map values. These required changes are here and here and still need to be reviewed.

Final thoughts

The biggest constraint throughout this Summer of Code was time. Maybe I should have thought twice before taking on an internship with final exams and thesis deliverables to hand in, but I'm glad I didn't. I learned a lot from both my mentors, and I am very grateful to them for putting up with me this long.
