Skip to content

Instantly share code, notes, and snippets.

@lefticus
Last active August 16, 2023 18:48
Show Gist options
  • Star 2 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save lefticus/6ca859489c04bb1efdb6cbf1e53d352b to your computer and use it in GitHub Desktop.
Save lefticus/6ca859489c04bb1efdb6cbf1e53d352b to your computer and use it in GitHub Desktop.

A View to a Thing

Jason Turner

  • Host of C++ Weekly
  • Co-host of CppCast
  • Speaker / Contractor / Trainer

Background

C++17's std::string_view

We like std::string_view. It gives us type erasure and efficiency at the same time.

#include <string>
void observe_string(std::string_view sv);

const std::string &get_string();

int main()
{
  observe_string(get_string());  // efficient
  observe_string("Hello World"); // efficient
}

https://godbolt.org/z/rnonzs

C++20's std::span

C++20 brings us std::span, which is similar, for any contiguous data.

  • std::span can have compile-time fixed size
  • it can be const or not, so it's not a "view" per-se
#include <span>
#include <vector>
#include <string>

void use_contiguous_data(std::span<const char>);

const std::vector<char> &get_vec();
const std::string &get_string();

int main()
{
  use_contiguous_data(get_vec());
  use_contiguous_data(get_string());
  use_contiguous_data("Hello World");
}

https://godbolt.org/z/f8cd7P


The Plan

What can we do for non-contiguous containers, like std::map?

// we want to be able to do something similar
// with maps
void use_map(Map_View<int, int>);

const std::map<int, int> &get_map();
const std::unordered_map<int, int> &get_unordered_map();
const boost::flat_map<int, int> &get_flat_map();

int main()
{
  use_map(get_map());
  use_map(get_unordered_map());
  use_map(get_flat_map());
}

The Technique

A key technique we will use is the fact that a non-capturing lambda can be converted into a raw function pointer easily.

int main()
{
  int (*func)(int, int) = [](int x, int y) { return x * y; };
  return func(2, 3);
}

https://godbolt.org/z/vrdKoo


A function_view

We can build on this to build a function_view object. Type erasure for callable things with none of the overhead of std::function.

Note: Vittorio Romeo and Jonathan Müller have made similar implementations. It is well known that these implementations have lifetime issues. Just like std::span or std::string_view if the view outlives the data you will have an invalid pointer.

#include <type_traits>
#include <utility>

// we forward declare that it's a template
template <typename Signature>
struct function_view;

// then partially specialize on a function signature
template <typename Ret, typename... Param>
struct function_view<Ret(Param...)>
{
  // the function to call, stored as a void *
  const void *function;

  // the function that will call `function`
  Ret (*caller)(const void *, Param...);

  // store a pointer to the passed in function
  // and create a new function that knows how to cast
  // it and call it
  template <typename Function>
  function_view(const Function &func)
      : function{&func},
        caller{
            // this is the magic
            [](const void *func, Param... param) -> decltype(auto) {
              return (*static_cast<const Function *>(func))(std::move(param)...);
            }}
  {
  }

  // the one thing it needs to know how to do is act like a function
  template <typename... PassedParam>
  Ret operator()(PassedParam... param)
  {
    return caller(function, std::forward<PassedParam>(param)...);
  }
};

auto use_func(function_view<int(int, int)> f);

int main()
{
  return use_func([i = 5](int x, int y) { return i + x + y; });
}

https://godbolt.org/z/6MWx7b

That worked with a lambda, what happens if we try to call it with a free function pointer? ☝☝☝☝

int mod(const int &x, const int &y)
{
  return x % y;
}

int main()
{
  // unfortunately this cannot work because
  // we cannot convert a function * to a void *. It is allowed
  // - if the platform allows it. I think all non-harvard-architecture
  // CPUs would allow this, so let's make it happen
  return use_func(mod);
}

So we need to add a new overload for the function_view constructor to make this work.

// a little help from concepts.
// This could be done as an overload (see the function_view template
// specialization for ideas) or with SFINAE
template <typename Function>
function_view(const Function &func) requires std::is_function_v<Function>
    : function{reinterpret_cast<const void *>(func)},
      caller{
          // getting a const free function pointer doesn't work
          // so we have to cast away const-ness. This is (probably?) safe
          [](const void *func, Param... param) -> decltype(auto) {
            return (*reinterpret_cast<Function *>(const_cast<void *>(func)))
              (std::move(param)...);
          }}
{
}

For bonus points we might consider deleting the r-value reference overload for the constructor to limit the cases where we would have a pointer to a temporary function object. (Exercise left to the reader.)

What we start to see is that this technique is effectively the same as a virtual function call.

https://godbolt.org/z/xrKc1n


You Said Something About a map_view?

So what would this look like?

template <typename Key, typename Value>
struct Map_View
{
  const Value &at(const Key &key) const;

  auto begin() const;
  auto end() const;
};

Should it be const if it's a view? or should it be more like span? Both are possible, actually.

template <typename Key, typename Value>
struct Map_View
{
  const Value &at(const Key &key) const;
  Value &operator[](const Key &key); // ?

  auto begin() const;
  auto end() const;
  auto begin(); //?
  auto end();   //?
};

Read-only For Us Today

template <typename Key, typename Value>
struct Map_View
{
  const Value &at(const Key &key) const;

  auto begin() const;
  auto end() const;
};

Let's look at the most simple version, one that only has the at() member function.

template <typename Key, typename Value>
struct Map_View
{
  const void *data;

  // some `using` declarations to make things easier to read
  using const_at_function = const Value &(*)(const void *, const Key &);

  // function pointer to invoke
  const_at_function const_at;
 
  // fixed function type for at
  const Value &at(const Key &key) const
  {
    // invoke the function pointer created by the lambda
    return const_at(data, key);
  }

  template <typename MapType>
  Map_View(const MapType &map)
      : data{&map},
        // initialize our const_at pointer with the custom lambda
        const_at{[](const auto *ptr, const auto &key) -> decltype(auto) {
          return static_cast<const MapType *>(ptr)->at(key);
        }}
  {
  }
};

https://godbolt.org/z/PnvEWE

This was straightforward for the "at()" member, let's go ahead and add the iterators


But Wait, What Should The Iteration Functions Return?

We cannot return the actual map's iterator, that would defeat the purpose. So, I guess it will be an Interator_Proxy?

template <typename Key, typename Value>
struct Map_View
{
  const void *map;

  // some `using` declarations to make things easier to read
  using const_at_function = const Value &(*)(const void *, const Key &);
  using const_iterator_function =
      Iterator_Proxy<const std::pair<const Key, Value> &> (*)(const void *);

  const_at_function const_at;
  const_iterator_function const_begin;
  const_iterator_function const_end;

  const_at_function const_at;

  const Value &at(const Key &key) const { return const_at(map, key); }
  auto begin() const { return const_begin(data); }
  auto end() const { return const_end(data); }

  template <typename MapType>
  Map_View(const MapType &map)
      : data{&map},
        const_at{[](const auto *ptr, const auto &key) -> decltype(auto) {
          return static_cast<const MapType *>(ptr)->at(key);
        }},
        // lambdas for calling `begin` and `end`
        const_begin{[](const auto *ptr) -> Iterator_Proxy<const std::pair<const Key, Value>&> {
          return static_cast<const MapType *>(ptr)->begin();
        }},
        const_end{[](const auto *ptr) -> Iterator_Proxy<const std::pair<const Key, Value>&> {
          return static_cast<const MapType *>(ptr)->end();
        }}
  {
  }
};

Now We Just Have to Build The Iterator_Proxy

#include <any>
#include <map>
#include <unordered_map>
#include <iostream>

template <typename ResultType>
struct Iterator_Proxy
{
  // needs to own the iterator, not just view it
  // if we tried to just take a pointer
  // to the iterator it will be popped from the stack
  // and destroyed out from under us
  std::any iterator;

  using increment_function_t = void (*)(std::any &obj);
  using dereference_function_t = ResultType (*)(std::any &obj);
  using not_equal_function_t = bool (*)(const std::any &obj,
                                        const Iterator_Proxy &);

  increment_function_t increment_function;
  dereference_function_t dereference_function;
  not_equal_function_t not_equal_function;

  template <typename T>
  Iterator_Proxy(T iterator)
      : iterator{std::move(iterator)},
        increment_function{
            [](std::any &obj) { std::any_cast<T>(&obj)->operator++();
        }},
        dereference_function{
            [](std::any &obj) -> decltype(auto) { return std::any_cast<T>(&obj)->operator*();
        }},
        not_equal_function{[](const std::any &obj, const Iterator_Proxy &rhs) -> bool {
          return *std::any_cast<const T>(&obj) != *std::any_cast<const T>(&rhs.iterator);
        }} {}

  Iterator_Proxy &operator++()
  {
    increment_function(iterator);
    return *this;
  }

  ResultType operator*() { return dereference_function(iterator); }

  bool operator!=(const Iterator_Proxy &rhs)
  {
    return not_equal_function(iterator, rhs);
  }
};

template <typename Key, typename Value>
struct Map_View
{
  const void *data;

  // some `using` declarations to make things easier to read
  using const_at_function = const Value &(*)(const void *, const Key &);
  using const_iterator_function =
      Iterator_Proxy<const std::pair<const Key, Value> &> (*)(const void *);

  const_at_function const_at;
  const_iterator_function const_begin;
  const_iterator_function const_end;

  const Value &at(const Key &key) const
  {
    return const_at(data, key);
  }

  auto begin() const
  {
    return const_begin(data);
  }
  auto end() const
  {
    return const_end(data);
  }

  template <typename MapType>
  Map_View(const MapType &map)
      : data{&map},
        const_at{[](const auto *ptr, const auto &key) -> decltype(auto) {
          return static_cast<const MapType *>(ptr)->at(key);
        }},
        const_begin{[](const auto *ptr) -> Iterator_Proxy<const std::pair<const Key, Value>&> {
          return static_cast<const MapType *>(ptr)->begin();
        }},
        const_end{[](const auto *ptr) -> Iterator_Proxy<const std::pair<const Key, Value>&> {
          return static_cast<const MapType *>(ptr)->end();
        }}
  {
  }
};

auto get_value(Map_View<int, int> mv) { return mv.at(3); }

auto iterate(Map_View<int, int> mv)
{
  for (const auto &[key, value] : mv)
  {
    std::cout << key << ':' << value << '\n';
  }
}

int main()
{
  std::map<int, int> map{{1, 3}, {3, 5}};
  iterate(map);

  return get_value(map);
}

https://godbolt.org/z/4M7Tsb


Problems

Besides the lifetime issues presented previously, we're using std::any. Which might allocate, but we need something to own this. There is an alternative:

template <typename ResultType>
struct Iterator_Proxy
{
  const void *container;
  std::ptrdiff_t offset;

  using dereference_function_t = ResultType (*)(const void *obj, std::ptrdiff_t offset);

  dereference_function_t dereference_function;

  template <typename Container, typename Iterator>
  Iterator_Proxy(const Container &container, const Iterator &itr)
      : container{&container},
        offset{std::distance(container.begin(), itr)},
        dereference_function{
            [](const void *obj, std::ptrdiff_t offset) -> decltype(auto) {
              // this is potentially very expensive if incrementing
              // an iterator is expensive
              return *std::next(static_cast<const Container *>(obj)->begin(), offset);
            }}
  {
  }

  Iterator_Proxy &operator++()
  {
    ++offset;
    return *this;
  }

  ResultType operator*() { return dereference_function(container, offset); }

  bool operator!=(const Iterator_Proxy &rhs)
  {
    return container != rhs.container || offset != rhs.offset;
  }
};

https://godbolt.org/z/198zqn

  • We can probably make a specialization that uses std::any when that would be cheaper than using std::next
  • There's some Boost iterator libraries that may help here

Conclusions

  • Aside from proving these small test cases work, none of it has been tested
  • Performance implications? Who knows! (Someone should probably measure that.)
  • I have prototyped a regex_view that works between both boost::regex and CTRE.
  • What other possibilities exist? Maybe this can solve some architecture problem you are having.
  • Lots of DRY violations with the casts, but I couldn't find an alternative.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment