
@markdewing
Last active June 24, 2019 17:58

Development of a mini C++ MPI3 wrapper

Step 2 - Communicator

Add a communicator class. For now it contains only a rank function to get the MPI rank; eventually it will contain most of the communication functions.

The raw MPI_Comm value is stored internally to the class.

Add a method to the environment to get a world communicator object. (Todo: explain the use of static get_world_instance.)

mini_mpi3.hpp

#ifndef MINI_MPI3_HPP
#define MINI_MPI3_HPP

#include <stdexcept>

#include <mpi.h>

class communicator
{
public:
  communicator(MPI_Comm impl) noexcept : impl_(impl) {}

  int rank() const {
    int rank = -1;
    int s = MPI_Comm_rank(impl_, &rank);
    if (s != MPI_SUCCESS) throw std::runtime_error("MPI_Comm_rank failed");
    return rank;
  }

private:
  MPI_Comm impl_;
};



inline void finalize()
{
  int s = MPI_Finalize();
  if (s != MPI_SUCCESS) throw std::runtime_error("cannot finalize MPI");
}

inline void initialize(){
  int s = MPI_Init(nullptr, nullptr);
  if (s != MPI_SUCCESS) throw std::runtime_error("cannot initialize MPI");
}


class environment
{
public:
  environment() { initialize(); }
  ~environment() { finalize(); }

  static inline communicator& get_world_instance()
  {
    static communicator instance{MPI_COMM_WORLD};
    return instance;
  }

  communicator world() const {
    communicator ret{get_world_instance()};
    return ret;
  }

};



#endif /* MINI_MPI3_HPP */

Test code:

#include <mini_mpi3.hpp>
#include <iostream>


int main()
{
  environment env;
  communicator world = env.world();
  std::cout << "Hello from " << world.rank() << std::endl;
  return 0;
}

Run command:


mpirun -np 4 ./a.out

Output (lines from different ranks are unordered and can interleave mid-line):

Hello from 2
Hello from 0
Hello from Hello from 3
1

Next step: First communication routine

Development of a mini C++ MPI3 wrapper

Step 1 - Initialization

The first part is to call MPI_Init and MPI_Finalize. Create an 'environment' class that performs these calls over the lifetime of the object.

For simplicity, MPI_Init is called with null pointers.

#ifndef MINI_MPI3_HPP
#define MINI_MPI3_HPP

#include <stdexcept>

#include <mpi.h>


inline void finalize()
{
  int s = MPI_Finalize();
  if (s != MPI_SUCCESS) throw std::runtime_error("cannot finalize MPI");
}

inline void initialize(){
  int s = MPI_Init(nullptr, nullptr);
  if (s != MPI_SUCCESS) throw std::runtime_error("cannot initialize MPI");
}


class environment
{
public:
  environment() { initialize(); }
  ~environment() { finalize(); }
};



#endif /* MINI_MPI3_HPP */

Test code:

#include <mini_mpi3.hpp>
#include <iostream>


int main()
{
  environment env;
  std::cout << "Hello" << std::endl;
  return 0;
}

Run command:


mpirun -np 4 ./a.out

Output:

Hello
Hello
Hello
Hello

Next step: Communicator

Development of a mini C++ MPI3 wrapper

Step 3 - First communication routine

The next step is to add a communication routine to the communicator class.

For simplicity, we will use broadcast. For the C++ interface, we will add broadcast of a scalar value.

For the call to MPI_Bcast, most of the parameters are straightforward:

  • the address of the value passed in,
  • the count, which is one for a scalar,
  • the datatype, which is more involved and is discussed below,
  • the root value (an optional parameter of the broadcast_value function),
  • the internal MPI_Comm value.

The biggest issue is that the C++ type needs to be converted to the MPI type enumeration.

Use a template class ('datatype') whose specializations for each C++ type contain a conversion operator to MPI_Datatype. (There are other ways of representing this mapping in the class, such as a static 'value' member set to the MPI data type.)

For this mini mpi3, conversions for int and double are implemented. In the full library, a macro generates a more extensive list.

mini_mpi3.hpp

#ifndef MINI_MPI3_HPP
#define MINI_MPI3_HPP

#include <memory>     // for std::addressof
#include <stdexcept>

#include <mpi.h>

// Convert C++ type to MPI type enumeration
// int -> MPI_INT
// double -> MPI_DOUBLE
template <class T> struct datatype;

template <> struct datatype<int> {
  operator MPI_Datatype() const { return MPI_INT; }
};

template <> struct datatype<double> {
  operator MPI_Datatype() const { return MPI_DOUBLE; }
};

class communicator
{
public:
  communicator(MPI_Comm impl) noexcept : impl_(impl) {}

  int rank() const {
    int rank = -1;
    int s = MPI_Comm_rank(impl_, &rank);
    if (s != MPI_SUCCESS) throw std::runtime_error("MPI_Comm_rank failed");
    return rank;
  }

  template<class T>
  void broadcast_value(T &t, int root = 0) {
    int count = 1;
    MPI_Datatype dt = datatype<T>{};
    int s = MPI_Bcast(std::addressof(t), count, dt, root, impl_);
    if (s != MPI_SUCCESS) throw std::runtime_error("MPI_Bcast failed");
  }


private:
  MPI_Comm impl_;
};



inline void finalize()
{
  int s = MPI_Finalize();
  if (s != MPI_SUCCESS) throw std::runtime_error("cannot finalize MPI");
}

inline void initialize(){
  int s = MPI_Init(nullptr, nullptr);
  if (s != MPI_SUCCESS) throw std::runtime_error("cannot initialize MPI");
}


class environment
{
public:
  environment() { initialize(); }
  ~environment() { finalize(); }

  static inline communicator& get_world_instance()
  {
    static communicator instance{MPI_COMM_WORLD};
    return instance;
  }

  communicator world() const {
    communicator ret{get_world_instance()};
    return ret;
  }

};



#endif /* MINI_MPI3_HPP */

For the test, set a scalar value on rank 0 to a different value than the other ranks. Perform the broadcast and verify the new value on all ranks.

Test code:

#include <mini_mpi3.hpp>
#include <iostream>

int main()
{
  environment env;
  communicator world = env.world();

  int a = 1;
  if (world.rank() == 0) {a = 3;}
  world.broadcast_value(a);
  std::cout << "From rank " << world.rank()  << " a = " << a << std::endl;
  
  return 0;
}

Run command:


mpirun -np 4 ./a.out

Output:

From rank 0 a = 3
From rank 1 a = 3
From rank 2 a = 3
From rank 3 a = 3

Next step: Broadcast array

Development of a mini C++ MPI3 wrapper

Step 4 - Broadcast array

The next improvement is to broadcast an array, given a start iterator and a length, and to make it work with a variety of array types (raw C++ arrays, STL containers).

Here the issue is obtaining the address corresponding to the start iterator. We are assuming the container storage is contiguous.

One solution is to declare a templated get_pointer function that is overloaded on the different argument types: C++ pointer, STL iterator, etc.

mini_mpi3.hpp

#ifndef MINI_MPI3_HPP
#define MINI_MPI3_HPP

#include <iterator>   // for std::iterator_traits
#include <memory>     // for std::addressof
#include <stdexcept>

#include <mpi.h>

// Convert C++ type to MPI type enumeration
// int -> MPI_INT
// double -> MPI_DOUBLE
template <class T> struct datatype;

template <> struct datatype<int> {
  operator MPI_Datatype() const { return MPI_INT; }
};

template <> struct datatype<double> {
  operator MPI_Datatype() const { return MPI_DOUBLE; }
};


// Get the address for different kinds of iterators
template <class T> auto get_pointer(T *t) { return t; }

// Relies on the iterator exposing a base() member that returns a pointer,
// as the contiguous-container iterators in libstdc++ and libc++ do.
template <class It> auto get_pointer(It it) { return it.base(); }




class communicator
{
public:
  communicator(MPI_Comm impl) noexcept : impl_(impl) {}

  int rank() const {
    int rank = -1;
    int s = MPI_Comm_rank(impl_, &rank);
    if (s != MPI_SUCCESS) throw std::runtime_error("MPI_Comm_rank failed");
    return rank;
  }

  template<class T>
  void broadcast_value(T &t, int root = 0) {
    int count = 1;
    MPI_Datatype dt = datatype<T>{};
    int s = MPI_Bcast(std::addressof(t), count, dt, root, impl_);
    if (s != MPI_SUCCESS) throw std::runtime_error("MPI_Bcast failed");
  }

  template <class It, typename Size>
  void broadcast_n(It first, Size count, int root = 0) {
    MPI_Datatype dt = datatype<typename std::iterator_traits<It>::value_type>{};
    int s = MPI_Bcast(get_pointer(first), count, dt, root, impl_);
    if (s != MPI_SUCCESS) throw std::runtime_error("MPI_Bcast failed");
  }



private:
  MPI_Comm impl_;
};



inline void finalize()
{
  int s = MPI_Finalize();
  if (s != MPI_SUCCESS) throw std::runtime_error("cannot finalize MPI");
}

inline void initialize(){
  int s = MPI_Init(nullptr, nullptr);
  if (s != MPI_SUCCESS) throw std::runtime_error("cannot initialize MPI");
}


class environment
{
public:
  environment() { initialize(); }
  ~environment() { finalize(); }

  static inline communicator& get_world_instance()
  {
    static communicator instance{MPI_COMM_WORLD};
    return instance;
  }

  communicator world() const {
    communicator ret{get_world_instance()};
    return ret;
  }

};



#endif /* MINI_MPI3_HPP */

This test uses both a raw C++ array and a std::vector.

Test code:

#include <mini_mpi3.hpp>
#include <iostream>
#include <vector>

int main()
{
  environment env;
  communicator world = env.world();

  // Broadcast_n with raw C++ array
  double va[3]; 
  if (world.rank() == 0) {va[0] = 1.1;}
  world.broadcast_n(va, 3);
  std::cout << "From rank " << world.rank()  << " va[0] = " << va[0] << std::endl;

  // Broadcast_n with std::vector
  std::vector<double> vv{1.0, 2.0, 3.0};
  if (world.rank() == 0) {vv[0] = 4.0;}
  world.broadcast_n(vv.begin(), vv.size());
  std::cout << "From rank " << world.rank()  << " vv[0] = " << vv[0] << std::endl;
  
  return 0;
}

Run command:


mpirun -np 4 ./a.out

Output:

From rank 0 va[0] = 1.1
From rank 0 vv[0] = 4
From rank 1 va[0] = 1.1
From rank 1 vv[0] = 4
From rank 2 va[0] = 1.1
From rank 2 vv[0] = 4
From rank 3 va[0] = 1.1
From rank 3 vv[0] = 4

Development of a mini C++ MPI3 wrapper

This guide uses a series of steps to explain how a C++ MPI3 wrapper library is developed.

Location of the full-size MPI3 wrapper library https://gitlab.com/correaa/boost-mpi3

Steps:

  1. Initialization
  2. Communicator
  3. First communication routine
  4. Broadcast array

Meta comments

Explaining a code through a series of small incremental steps seems very useful. Part of this example is exploring whether a structured format would make such guides easier to write. The main ingredients at each step seem to be:

  • a description of the step
  • the source for the step (this version shows the full file - eventually snippets or highlights would be needed to keep the displayed size manageable)
  • an optional description of the test code
  • the test code
  • the output of the test code.

Depending on the step, the compile command and compiler output might be useful (especially if demonstrating a compile failure at some step).

Is this a useful format for understanding a code?
