Skip to content

Instantly share code, notes, and snippets.

@dhollman
Last active August 8, 2018 00:54
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save dhollman/0ff509a3a96a99f556d164c155e6b915 to your computer and use it in GitHub Desktop.
Save dhollman/0ff509a3a96a99f556d164c155e6b915 to your computer and use it in GitHub Desktop.
Drawing yet another line between P0443 and P1054

Replacing the Execution Function Arguments with a Continuation Concept

Here are some possible continuation concepts to replace the NullaryCallable argument currently given to execution functions in P0443, ordered from smallest to biggest change. All names are up for discussion (including the term "continuation" itself), with the caveat that I think there is general agreement that the call operator shouldn't be used directly.

Continuation Concept A: simplest, but least generic version

Design decisions:

  • no value type
  • error type always exception_ptr
  • return types must be void
struct ContinuationA {
  void value() &&;
  void error(std::exception_ptr) &&;
};

At this level of abstraction, pretty much everything is passed through side channels. Unless an executor implementer specializes the layer above the executor interface, pretty much any connection between data and work done on that data is lost in the programming model.

Continuation Concept B: adding return types

Design decisions:

  • same as Concept A, but allow return values and enforce the restriction that value() and error() return the same type.
template <typename R>
struct ContinuationB {
  R value() &&;
  R error(std::exception_ptr) &&;
};

(Note that, as with many of these concepts, the template parameter R need only be explicit in the concept itself and any type-erased wrappers; it can be auto in most concrete types.)

This design allows the flow of values out of a continuation to occur without side channels (i.e., it is explicit in the programming model), but the flow of data in to the continuation still has to use side channels. Errant flow that cannot sufficiently recover to produce an R must use exceptions or other side channels. Note that R can be void.

Another advantage of this design is that because output type information is explicit, a wrapper around an executor that gets the result value (the analog of future::get) from a execution graph does not need to be explicitly typed by the user (as in most push models).

The important part of this design is the type specificity; a design where value() and error() take something that is the moral equivalent of an output parameter (such as another continuation) would also fall into this category (I have some concerns about the loss of RVO and other optimizations in such models, but I could easily be convinced that the advantages outweigh the costs or that my concerns are not real problems.)

Continuation Concept C: add value parameterization

Design decisions:

  • same as Concept B, except that value() takes an argument
template <typename R>
struct ContinuationC {
  template <typename V>  
  R value(V&&) &&;
  R error(std::exception_ptr) &&;
};

This model allows values to flow both in and out of the continuation explicitly, drawing a clear connection between data and the work on that data (particularly when, for instance, the user is required by convention to avoid global variables and other side channels).

Note that this change and all of the ones below it lead to a corresponding parameterization of the executor concept. I view this as a positive, since it explicitly connects data with work on that data in the programming model. Data-agnostic layers of the software stack can still ignore this information by simply forwarding it on to other layers of the stack, but the opposite is much more difficult without de facto standard side channels (which cannot always be easily constructed on top of data-oblivious semantics).

Continuation Concept D: add error type parameterization

Design decisions:

  • same as Concept C, except that the continuation is parameterized on the argument type that error() accepts.
template <typename R>
struct ContinuationD {
  template <typename V>  
  R value(V&&) &&;
  template <typename E>  
  R error(E&&) &&;
};

I don't have as strong of an understanding of how error propagation is treated in these sorts of models, so I will defer the discussion of the merits of this model to those who do.

Continuation Concept E: allow different return type for error()

Design decisions:

  • same as Concept D, except that error() can return a different type from value()
template <typename R, typename S>
struct ContinuationE {
  template <typename V>  
  R value(V&&) &&;
  template <typename E>  
  S error(E&&) &&;
};

As with Concept D, I will defer this discussion to those who understand modern error handling usage patterns better than I do.

Overview

Herein is yet another attempt at combining P0443 and P1055. To the extent possible, we attempt to find a superset of the semantics in P0443+P1054 and P1055 with a syntax that aims to create the greatest consensus.

Approach

Starting from P0443 executors, the OneWayExecutor concept is extended and renamed Executor. Instead of taking a simple function object, the execute method of the new Executor concept takes as a parameter a type meeting the requirements of Continuation (which will be renamed once someone comes up with a better name), which attempts to combine the P1055-style requirements of Receiver with a generalization of the executor rebinding in P1054:

template <typename R, typename S>
struct MyContinuation {
  template <typename T, typename VSubEx>  
  R value(T&& v, VSubEx&& ex) &&;
  template <typename E, typename ESubEx>  
  S error(E&& e, ESubEx&& ex) &&;
};

The class MyContinuation<R, S> meets the requirements of Continuation<T, E, R, S, VSubEx, ESubEx> for all T and E and all subexecutors that support the error type E. While the concept itself mandates that both methods be present, we propose providing standard library functions for constructing the most common continuation patterns:

namespace _impl {
template<typename Callable>
struct _on_value_continuation {
  remove_cvref_t<Callable> c_;
  template <typename T, typename VSubEx, typename=/*...constraints here...*/>
  auto value(T&& v, VSubEx&& ex) && {
    return std::move(c_)(forward<T>(v), forward<VSubEx>(ex));
  }
  template <typename E, typename ESubEx, typename=enable_if_t<
    execution::can_query_v<remove_cvref_t<ESubEx>, execution::error_handler_t<remove_cvref_t<E>>>
  >>
  auto error(E&& e, ESubEx&& ex) && {
    /* query the executor's generic error handler for errors of type E */
    return execution::query(forward<ESubEx>(ex), execution::error_handler<remove_cvref_t<E>>)(forward<E>(e));    
  }
};
} // end namespace _impl

template <typename Callable>
auto on_value(Callable&& c) {
  return _impl::_on_value_continuation<remove_cvref_t<Callable>>{forward<Callable>(c)};
}

Given this definition of a Continuation, the Executor concept now looks quite simple:

struct MyExecutor {
  template <typename Continuation>
  void execute(Continuation&&) &&;
};

which, for instance, could meet the requirements of Executor<T, E, SubEx> for all T and E and all subexecutors SubEx such that E is convertible to the error type of SubEx and MyExecutor is convertible to SubEx. (The most common use of the subexecutor parameter is expected to be for an executor to pass itself (or another executor that wraps the same context and properties, in the move-only case), but when an executor needs to do otherwise, there is no other reasonable channel for conveying that information).

Customization via ADL

One possibility (that is not essential to this approach) is to put the customization point for execute at namespace scope rather than as a method. This could still always default to the method version, if available, but namespace scope customization points have several advantages.

Reusable Executors

Many programming models do, of course, reuse executors, which gives us the ReusableExecutor concept (subject to renaming), which extends the Executor concept:

struct MyReusableExecutor {
  template <typename Continuation>
  void execute(Continuation&&);
};

Note that this means that if a type Ex meets the requirements of OneWayExecutor in P0443, it is approximately equivalent to say Ex meets the requirements of ReusableExecutor<void,std::exception_ptr, Ex>, with a few small syntax changes.

Fundamentally, though, to avoid using side-channels for the expression of dependent execution, we also need a way to create an executor that executes work dependent on some other work. The updated version of ThenExecutor provides this functionality, and is analogous to the P1055-like Adapter<S> concept, but with more specific semantics. For instance, a simple ThenExecutor might look like:

template <typename T, typename E>
struct MyThenExecutor {
  template <typename Continuation>
  auto then_execute(Continuation&& ec) &&;
  /* ... also has an execute() method ... */
};

where the parameter ec is constrained such that ec.value() can be called with an rvalue of type MyExecutor<void,E> and an rvalue of type T, and ec.error() can be called with an rvalue of type MyExecutor<void,E> and an rvalue of type E. If the return type of ec.value(...) is R and the return type of ec.error(...) is S, then MyThenExecutor<T, E> meets the requirements of ThenExecutor<T,E,MyThenExecutor<void,E>,MyThenExecutor<R,S,MyThenExecutor<void,S>>> (which seems unnecessarily complex, but has requirements that can be met by a much simpler class or class template; note that this is comparable in complexity to concepts like MergeMovable or Sortable in the RangesTS, N4569). The concept ReusableThenExecutor extends ThenExecutor in a way analogous to ReusableExecutor and Executor.

Bulk

In this model, as with the p1055 model, bulk execution can be expressed using a concept that extends Continuation, with a concept called BulkContinuation. Basically speaking, we have

template <typename R, typename S>
struct MyBulkContinuation {
  template <typename T, typename VSubEx typename Index, typename SharedValue>   
  void value_element(T const&, VSubEx const&, Index const&);
  template <typename E, typename ESubEx, typename Index, typename SharedError>   
  void error_element(E const&, ESubEx const&, Index const&);
  auto const& shape();
  template <typename VSubEx>
  auto value_result_factory(VSubEx&&);  
  template <typename ESubEx>
  auto error_result_factory(ESubEx&&);
  template <typename T, typename VSubEx>  
  R value(T&& v, VSubEx&& ex) &&;
  template <typename E, typename ESubEx>  
  S error(E&& e, ESubEx&& ex) &&;
};

where all methods except shape() are optional, but if shared_value_factory() is present, value_element() must be present, and if shared_error_factory() is present, error_element() overload must be present, and at least on of these two must be present if shape() is present (when either of the four-argument versions are missing, the executor should call the two-argument value() and error() methods, which must be present).

Properties

Properties should continue to operate essentially unchanged from P0443, with the added benefit that properties can be more readily added, removed, or intercepted for some other reason by adaptors, since all Continuation propagate a subexecutor that may not be the exact executor that the Continuation was submitted to. Transparent adaptation of an Executor into a ThenExecutor using properties should be reasonable (and should interact favorably with the existing P0443 polymorphic wrapper), but more investigation is needed to say for certain. One-shot to reusable and back adaptation should also be able to use this mechanism, if desired, but more investigation is needed. Importantly, since (just as in P0443) executors are lightweight handles to execution contexts, propagation of properties from *this to the return value in ThenExecutors should be as simple as propagation of unrelated properties through the require mechanism.

Reasoning: Subexecutors

TODO - basically because it's critical and there's no other reasonable way to do it (and there are quite a few surprising benefits)

Examples/Sketches (Partially Complete)

template <typename Value, typename NestedContinuation>
struct ContinuationWithValue {
  Value v_;
  NestedContinuation c_;
  template <typename SubEx>
  auto value(SubEx&& subex) && {
    return std::move(c_).value(std::move(v_), forward<SubEx>(subex));
  }
  template <typename Error, typename SubEx>
  auto error(Error&& err, SubEx&& subex) && {
    return std::move(c_).error(forward<Error>(err), forward<SubEx>(subex));
  }
};
template <typename Value, typename NestedContinuation>
ContinuationWithValue(Value&&, NestedContinuation&&)
  -> ContinuationWithValue<remove_cvref_t<Value>, remove_cvref_t<NestedContinuation>>;

template <class Value, class Executor>
struct ValueExecutor {
  Value v_;
  Executor ex_;
  template <typename Continuation>
  void execute(Continuation&& c) && {
    std::move(ex_).execute(ContinuationWithValue{std::move(v_), forward<Continuation>(c)});
  }
  template <typename Continuation>
  auto then_execute(Continuation&& c) && {
    return std::move(ex_).then_execute(ContinuationWithValue{std::move(v_), forward<Continuation>(c)});
  }
};
template <typename Value, typename Executor>
ValueExecutor(Value&&, Executor&&)
  -> ValueExecutor<remove_cvref_t<Value>, remove_cvref_t<Executor>>;

struct OpenMPBulkExecutor {
  template <typename BulkContinuation, typename=/*...constrain to bulk continuations only...*/>
  void execute(BulkContinuation&& bc) && {
    using shape_t = remove_cvref_t<decltype(bc.shape())>;
    shape_t shape = bc.shape();
    auto& subex = bc.shared_value_factory();
    const auto chunk = shape / omp_num_threads();
    const auto extra = shape % (chunk * omp_num_threads())
    using result_t = remove_cvref_t<decltype(bc.value_result_factory())>;
    auto results = std::unique_ptr<T[]>(omp_num_threads()); 
    #pragma omp parallel 
    {
      auto my_chunk = chunk + (omp_thread_num() < extra);
      auto my_off = chunk * omp_thread_num() + min(omp_thread_num(), extra);
      results[omp_thread_num()] = bc.value_result_factory(*this);
      for(shape_t i = my_off; i < my_off + my_chunk; ++i) {
        bc.value_element(*this, i, results[omp_thread_num()]);
      }
    } 
    bc.finalize(std::move(results));
  }
  template <typename Continuation, typename=/*...constrain to non-bulk continuations only...*/>
  void execute(Continuation&& bc) && {
    std::forward<Continuation>(bc).value(*this);
  }
};

template <typename NewExecutor, typename Continuation>
struct ViaContinuation {
  NewExecutor ex_;
  Continuation c_;
  template <typename T, typename VSubEx>
  void value(T&& val, VSubEx&&) && {
    std::move(ex_).execute(execution::on_void(
      [v=forward<T>(val), cc=std::move(c_)](auto&& subex) mutable {
        c_.value(std::move(v), forward<decltype(subex)>(subex));
      }
    ));
  }
  template <typename E, typename ESubEx>
  void error(E&& err, ESubEx&&) && {
    std::move(ex_).execute(execution::on_void(
      [cc=std::move(c_),e=forward<E>(err)](auto&& subex) mutable {
        std::move(cc).error(std::move(err), forward<decltype(subex)>(subex));
      }
    ));
  }
};

template <typename NewExecutor>
struct ViaExecutor {
  OldExecutor old_ex_;
  NewExecutor ex_;
  template <typename Continuation>
  void execute(Continuation&& c) && {
    std::move(old_ex_).execute(make_continuation_with_args(
      [](Continuation&& cc, NewExecutor&& ex, auto&& val, auto&& /* old_subex */) {
        std::move(ex).execute(execution::on_void(
          [cc_=std::move(cc),val=forward<decltype(val)>(val)](auto&& subex) mutable {
            std::move(cc_).value(std::move(val), forward<decltype(subex)>(subex));
          }
        ));
      },
      [](Continuation&& cc, NewExecutor&& ex, auto&& err, auto&& /* old_subex */) {
        std::move(ex).execute(execution::on_void(
          [cc_=std::move(cc),err=forward<decltype(err)>(err)](auto&& subex) mutable {
            std::move(cc_).error(std::move(err), forward<decltype(subex)>(subex));
          }
        ));
      },
      forward<Continuation>(c),
      std::move(ex_)
    );
  }
};

template <typename PrevContinuation, typename Continuation>
struct ThenContinuation {
  PrevContinuation pc;
  Continuation cc;
  template <class Value, class SubEx>
  auto value(Value&& val, SubEx subex) {
    optional<decay_t<decltype(val)>> nextval;
    try {
      nextval = forward<decltype(pc)>(pc).value(forward<delctype(val)>(val), subex);
    }
    catch(...) {
      return forward<decltype(cc)>(cc).error(std::current_exception(), std::move(subex));
    }
    return forward<decltype(cc)>(cc).value(std::move(*nextval), std::move(subex));
  }
  template <class Error, class SubEx>
  auto error(Error&& err, SubEx subex) {
    // Is this the expected behavior, or should be be returning to the value chain if the error is handled?
    return forward<decltype(cc)>(cc).error(
      forward<decltype(pc)>(pc).error(forward<decltype(err)>(err), subex),
      std::move(subex)
    );
  }
};

template <typename Executor, typename PrevContinuation>
struct ThenExecutorAdaptor {
  Executor ex_;
  PrevContinuation cont_;
  template <typename Continuation>
  auto then_execute(Continuation&& c) && {
    return ThenExecutorAdaptor{
      std::move(ex_),
      ThenContinuation{
        std::move(cont_),
        forward<Continuation>(c)
      }
    };
  }
  template <typename Continuation>
  void execute(Continuation&& c) {
    std::move(ex_).execute(
      ThenContinuation{
        std::move(cont_),
        forward<Continuation>(c)
      }
    );
  }
};


Wording

Continuation Requirements

  1. An executor continuation is an object that expresses a set of effects to be evaluated in the event of normal and exceptional execution.

  2. An Continuation type for value type T, error type E, subexecutor type SubEx<E>, value return type R, and error return type S shall meet the MoveConstructible requirements and the requirements described in the tables below. Requirements with well-formed defaults provided are optional.

  3. Given the variables in the table below, an executor continuation ef is said to be terminally evaluated with rt, re, and rsubex when:

    • If T is (possibly cv-qualified) void, exactly one of the following expressions is evaluated
      • std::move(ef).value(rsubex) or
      • std::move(ef).error(rsubex, re).
    • Otherwise, exactly one of the following expressions is evaluated.
      • std::move(ef).value(rsubex, rt) or
      • std::move(ef).error(rsubex, re).

Descriptive Variable Definitions

Variable Definition

T, R, S

Either:

  • Any (possibly cv-qualified) object type that is not an array and meets the MoveConstructible requirements, or
  • (possibly cv-qualified) void.

E

Any (possibly cv-qualified) object type that is not an array and meets the MoveConstructible requirements.

SubEx<E>

An `Executor` type for value type `void` and error type `E`.

EF<T, E, R, S, SubEx<E>>

A type that meets the `Continuation>` requirements.

ref

A non-const rvalue of type `EF>`

rt

If `T` is not (possibly cv-qualified) `void`, a non-const rvalue of type `T`.

re

A non-const rvalue of type `E`.

rsubex

A non-const rvalue of type `SubEx`.

Continuation Requirements

Expression Required when Return Type

ref.value(rsubex, rt)

T is not (possibly cv-qualified) void.

R

ref.value(rsubex)

T is (possibly cv-qualified) void.

R

ref.error(rsubex, re)

Always

S

RepeatableContinuation Requirements

  1. A repeatable executor continuation is an [=executor continuation=] that can be [=evaluated=] more than once.

  2. A RepeatableContinuation type for value type T, error type E, subexecutor type SubEx<E>, value return type R, and error return type S shall meet the Continuation requirements and the requirements described in the tables below.

  3. Given the variables in the table below, an executor continuation ef is said to be repeatably evaluated with t, e, and subex when:

    • If T is (possibly cv-qualified) void, exactly one of the following expressions is evaluated
      • repef.value(rsubex) or
      • repef.error(rsubex, re).
    • Otherwise, exactly one of the following expressions is evaluated.
      • repef.value(rsubex, rt) or
      • repef.error(rsubex, re).

Descriptive Variable Definitions

In addition to the descriptive variable definitions from the requirements for Continuation, the requirements for RepeatableContinuation use the following variables:

Variable Definition

RepEF<T, E, R, S, SubEx<E>>

A type that meets the `RepeatableContinuation>` requirements.

repef

A value of type `EF>`

t

If `T` is not (possibly cv-qualified) `void`, an lvalue of type `T`.

e

An lvalue of type `E`.

subex

An lvalue of type `SubEx`.

RepetableContinuation Requirements

Expression Required when Return Type

repef.value(subex, rt)

T is not (possibly cv-qualified) void.

R

repef.value(subex)

T is (possibly cv-qualified) void.

R

repef.error(subex, re)

Always

S

BulkContinuation Requirements

TODO

Executor Requirements

  1. An executor is an object that can execute an [=executor continuation=].

  2. An executor executes executor continuations given as arguments to its execution functions

  3. An Executor provides the [=execution function=] execute.

  4. An Executor type for value type T, error type E, and subexecutor type SubEx<E> shall meet the MoveConstructible requirements, the MoveAssignable requirements, and the requirements described in the Tables below.

Descriptive Variable Definitions

In addition to the descriptive variable definitions from the requirements for Continuation, the requirements for Executor use the following variables:

Variable Definition

Ex<T, E, SubEx<E>>

An `Executor` type for value type `T`, error type `E`, and subexecutor type `SubEx`.

rex

A non-const rvalue of type `Ex>`.

VEF<T, E, SubEx<E>>

An `Continuation>` type.

vef

A value of type `VEF>`

Executor Requirements

Expression Return Type Operational Semantics

execution::executor_value_t<Ex<T, E, SubEx<E>>>

T

execution::executor_error_t<Ex<T, E, SubEx<E>>>

E

execution::subexecutor_t<Ex<T, E, SubEx<E>>>

SubEx<E>

rex.execute(vef)

(not used)
  • Effects: Evaluates DECAY_COPY(forward<VEF<T,E>>(vef)) on the calling thread to create cf that will be [=terminally evaluated=] by an execution agent. The evaluation of rex.execute(ef) synchronizes with [intro.multithread] the evaluation of cf.

ThenExecutor Requirements

  1. A then executor is an object that can create an [=executor=] with execution predicated on the result of an [=executor continuation=].

  2. A ThenExecutor provides the [=execution function=] then_execute.

  3. A ThenExecutor type for value type T, error type E, subexecutor type SubEx<E>, and return executor type REx<R,S,RSubEx<S>> shall meet the MoveConstructible requirements, the MoveAssignable requirements, and the requirements described in the tables below.

Descriptive Variable Definitions

In addition to the descriptive variable definitions from the requirements for Executor, the requirements for ThenExecutor use the following variables:

Variable Definition

RSubEx<S>

An `Executor` type for value type `void` and error type `S`.

REx<R,S,RSubEx<S>>

An `Executor` type for value type `void` and error type `S`.

TEx<T, E, SubEx<E>, R, S, REx<R,S,RSubEx<S>>>

A `ThenExecutor` type for value type `T`, error type `E`, subexecutor type `SubEx`, value return type `R`, and error return type `S`.

rtex

A non-const rvalue of type `TEx, R, S, RSubEx>`.

ThenExecutor Requirements

Expression Return Type Operational Semantics

rtex.then_execute(ef)

A type that meets the requirements of `Executor>`
  • Effects: Evaluates DECAY_COPY(forward<EF<T,E,SubEx<E>,R,S>>(ef)) on the calling thread to create cf that will be [=terminally evaluated=] by an execution agent. The evaluation of rtex.then_execute(ef) synchronizes with [intro.multithread] the evaluation of cf.
  • Returns: An executor that ensures the exit from the evaluation of cf synchronizes with the evaluation of any [=executor continuation=] it [=executes=].

Open Questions, Possible Extensions, and Adaptations

  • Should we specify concepts in terms of methods or in terms of namespace scope functions (with ADL)?
  • Do subexecutors need to support the same error type as the parent executor?
  • I'm not entirely sure whether the error method needs to take a subexecutor.
  • There needs to be a way (other than the destructor or some sentinel value) to say that all submission is done on a reusable executor.
  • Alternative naming: Shared instead of Reusable
  • Allowing an extension of ThenExecutor that has a method which returns a ThenExecutor?
  • Talk about/explore whether this model for bulk facilitates coarse dependency refinement models

Summary of Proposed Changes to P0443

Roughly speaking, the major differences between the current state of 443 and what I'm proposing here are:

  • Continuations (name up for debate) replace function objects as the arguments to execution functions.
  • All of the functionality of bulk_execute and friends moves to BulkContinuation, but should otherwise be essentially unchanged. Importantly, the then portion of bulk_then_execute is now an orthogonal piece.
  • Two-way is gone. We could add a generic concept of a ReadyThenExecutor or something, but it seems unnecessary, since Executors not derived from other, existing executors are rarely constructed in a generic context. (Probably needs more discussion, though.)
  • Copy constructibility no longer required, and separate concepts exist for rvalue reference qualified execution functions. (This isn't essential, but it feels to me like a small thing that bridges the gap between the 443 version of the world and the 1055 version of the world).
  • The Continuation functions now take subexecutors as the first argument, intended for expression of nested execution (e.g., a ThreadExecutor may get passed to an executor continuation scheduled to a ThreadPoolExecutor). A lot of talk on this topic on the 2018-07-25 (Wednesday) call convinced me that side-channels are inadequate for expressing these sort of relationships. I'm working on a more complete justification of this. The ability for an executor to switch the executor of nested work seems critical to maintaining locality in many scenarios, and I think it solves a lot of our problems with, e.g., parallel_reduce scratch space. It seems heavy-handed to pass an executor into every invocation of an Continuation, but to me this also seems to be the most general, and the less general case where nested execution happens on the same executor can be a simple adaptor on top of this. I know I need more justification for something this pervasive, so more to come.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment