Skip to content

Instantly share code, notes, and snippets.

@emilaxelsson
Last active December 15, 2015 00:08
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save emilaxelsson/5170284 to your computer and use it in GitHub Desktop.
Save emilaxelsson/5170284 to your computer and use it in GitHub Desktop.
Virtual tuples in Feldspar

Proposal for feldspar-language and feldspar-compiler.

Background

Tuples are currently compiled to structs whenever they are forced by a Syntax-overloaded function, such as condition or forLoop. In most cases, we would like to avoid the struct and just have the members as separate variables.

Consider the example

\a b -> uncurry (+) $ condition a (b,2) (3,b) :: Data Index

which generates the following code:

void test(uint32_t v0, uint32_t v1, uint32_t * out)
{
  struct s_unsignedS32_unsignedS32 v2;
  
  if (v0)
  {
    (v2).member1 = v1;
    (v2).member2 = 2;
  }
  else
  {
    (v2).member1 = 3;
    (v2).member2 = v1;
  }
  *out = ((v2).member1 + (v2).member2);
}

Instead we would like to get:

void test(uint32_t v0, uint32_t v1, uint32_t * out)
{
  uint32_t v2_1;
  uint32_t v2_2;
  
  if (v0)
  {
    v2_1 = v1;
    v2_2 = 2;
  }
  else
  {
    v2_1 = 3;
    v2_2 = v1;
  }
  *out = (v2_1 + v2_2);
}

Problems

For the above example it looks like it should be very easy to change the code generator to get the desired behavior, but there are a some corner cases to be aware of. To understand the problem we need to note that v2 still exists as a variable in the AST. The virtual variables v2_1 and v2_2 are represented as sel1 v2 and sel2 v2 respectively. This raises two questions:

  1. What happens if v2 is used somewhere else in the program? (In a deeply nested tuple, any sub-tuple could potentially be used elsewhere.)
  2. What happens if v2 is aliased to an non-variable expression in the code generator?

Here is an example of case (2):

\a -> parallel 10 (\i -> desugar (condition a (i,2) (3,i) :: (Data Index, Data Index)))
void test(uint32_t v0, struct array * * out)
{
  *out = initArray(*out, sizeof(struct s_unsignedS32_unsignedS32), 10);
  for (uint32_t v1 = 0; v1 < 10; v1 += 1)
  {
    if (v0)
    {
      (at(struct s_unsignedS32_unsignedS32,*out,v1)).member1 = v1;
      (at(struct s_unsignedS32_unsignedS32,*out,v1)).member2 = 2;
    }
    else
    {
      (at(struct s_unsignedS32_unsignedS32,*out,v1)).member1 = 3;
      (at(struct s_unsignedS32_unsignedS32,*out,v1)).member2 = v1;
    }
  }
}

History

In the earliest versions of Feldspar, we actually had virtual tuples. The aim has always been for tuples to be virtual by default. The user interface made sure that the above corner cases did not occur, but this invariant was not type-checked. However, some people were asking for support for structs, since virtual tuples could not be stored in arrays (I think this had to do with the fact that Vectors were desugared to pairs).

In one later version (not sure if it was ever released) the type system was used to statically enforce the absence of the above corner cases. The internal representation was tremendously complicated as a result.

Flexible tuples

Then, we switched to the current solution where all forced tuples become structs. The idea since then has been to implement a hybrid solution "flexible tuples" where tuples are virtual by default, and structs are only created if necessary (i.e. when a tuple needs to be stored). The hope was that this solution would be able to handle the above corner cases so that we would avoid unnecessary restrictions on the AST.

But it's not very clear how to solve the aliasing problem (2). Even if something could be worked out, it seems that this whole approach would make compilation unpredictable, which goes against the philosophy of Feldspar.

Proposal

TODO Make more concrete...

Note that problem (2), when a tuple is aliased to a non-variable expression, only uccurrs when a tuple is stored in a larger structure (array or tuple). By not allowing tuples to be stored, this case would not occur. This could be done by placing suitable class constraints on functions that create/manipulate data structures (parallel, setIx, etc.).

Probably case (1) can also be handled with such restrictions. We also need to make sure that code motion doesn't lift out parts of a tuple.

The type of setIx could be

setIx :: Storable a => Data [a] -> Data Index -> Data a -> Data [a]

where Storable includes scalar and arrays, but not tuples. In the cases where we actually want to store tuples, we could add constructs for "packing":

pack   :: Syntax a => a -> Data (Packed (Internal a))
unpack :: Syntax a => Data (Packed (Internal a)) -> a

The Packed type should be an instance of Storable.

The packing concept could probably be used for other things as well, such as deciding how to layout data in memory.

It would still be interesting to see if it is possible to statically guarantee that tuples are used correctly, but that's outside the scope of this proposal.

@pjonsson
Copy link

We have complete freedom to change the data layout and ordering of elements inside the compiler as long as that does not escape to the outside world. I think it would be quite interesting to just consider the (common) case where the return type is not a tuple (or some other compound data type that contains a tuple), and optimize for the common case.

At the feldspar-language level the type information that we already have available should be sufficient to make a reasonably tight approximation of whether tuples are escaping from the function or not.

I'm not sure where pack and unpack would be exposed, but I'm not thrilled by the idea to add more noise in the form of explicit packs and unpacks to the surface language.

@emilaxelsson
Copy link
Author

So it seems that you would prefer the "flexible tuples" option. I think this can be obtained from the above proposal by just making packing/unpacking implicit (and not introducing the Storable class for e.g. setIx). If pack/unpack are implicit, we don't even need symbols for them. I think I agree. My point that flexible tuples would make compilation unpredictable might not be valid, since it will be easy to predict when packing/unpacking will happen.

The next step should probably be to implement virtual tuples without pack/unpack. Then we can add implicit packing when tuples are stored. If this fails, we might go for explicit packing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment