Yang Yang(Tony) tonyyang-svail

## clion_in_docker.md

      
Last active June 25, 2018 18:56

Steps to set up CLion in Docker on Linux

Steps

1. Build the PaddlePaddle Docker image: cd ${PADDLE_SOURCE_DIR} && docker build -t paddle:dev .
2. Allow Docker containers to connect to the host's X11 server so the GUI can be displayed:

    xhost +local:docker
    xhost +local:nvidia-docker (only if you need to use the GPU)

3. Go to ${PADDLE_SOURCE_DIR}, start Docker, and mount Paddle's source code at /paddle:

    sudo nvidia-docker run -v $PWD:/paddle -it -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix paddle:dev bash


## restnet50.proto
name: "ResNet-50"
input: "data"
input_dim: 1
input_dim: 3
input_dim: 224
input_dim: 224

layer {
	bottom: "data"
	top: "conv1"

## error_check.cu
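// Canonical CUDA runtime error check, from
// https://stackoverflow.com/questions/14038589/what-is-the-canonical-way-to-check-for-errors-using-the-cuda-runtime-api
#include <cstdio>   // fprintf
#include <cstdlib>  // exit
#include <cuda_runtime.h>

// Wrap any CUDA runtime call, e.g. gpuErrchk(cudaMalloc(&ptr, bytes));
#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }

// Prints the error string with file and line, then exits unless abort is false.
inline void gpuAssert(cudaError_t code, const char *file, int line, bool abort=true)
{
   if (code != cudaSuccess)
   {
      fprintf(stderr,"GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
      if (abort) exit(code);
   }
}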

## shell+pipeline.sh
# Find "backward" in every CMakeLists.txt under paddle/fluid/
find paddle/fluid/ -name "CMakeLists.txt" | xargs grep backward

# Replace "dynamic" with "tape" in every file that contains it
ack -l dynamic | xargs sed -i 's/dynamic/tape/g'

## variadic+templates.cpp
#include <iostream>
#include <tuple>

template <size_t I, bool at_end, typename... ARGS>
struct IterOverTypesImpl;

template <size_t I, typename... ARGS>
struct IterOverTypesImpl<I, false, ARGS...> {
    void operator()() {
        using T = typename std::tuple_element<I, std::tuple<ARGS...>>::type;
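
The preview above is cut off, so here is a minimal self-contained sketch of the same idea: walking a type list by index with an `at_end` flag that selects the base case. The visitor interface (`Visit<T>()`), `IterOverTypes`, `SizeCollector`, and `SizesOf` are assumptions added for illustration, not part of the original file, whose `operator()` takes no arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

// Primary template: I is the current index; at_end is true once I has
// reached sizeof...(ARGS).
template <std::size_t I, bool at_end, typename... ARGS>
struct IterOverTypesImpl;

// Recursive case: visit the I-th type, then continue with index I + 1.
template <std::size_t I, typename... ARGS>
struct IterOverTypesImpl<I, false, ARGS...> {
  template <typename Visitor>
  void operator()(Visitor& visitor) const {
    using T = typename std::tuple_element<I, std::tuple<ARGS...>>::type;
    visitor.template Visit<T>();
    IterOverTypesImpl<I + 1, (I + 1 == sizeof...(ARGS)), ARGS...>()(visitor);
  }
};

// Base case: every type has been visited, so there is nothing left to do.
template <std::size_t I, typename... ARGS>
struct IterOverTypesImpl<I, true, ARGS...> {
  template <typename Visitor>
  void operator()(Visitor&) const {}
};

// Entry point (hypothetical name): starts at index 0 and handles an empty list.
template <typename... ARGS>
struct IterOverTypes {
  template <typename Visitor>
  void operator()(Visitor& visitor) const {
    IterOverTypesImpl<0, (sizeof...(ARGS) == 0), ARGS...>()(visitor);
  }
};

// Hypothetical visitor: records sizeof(T) for each visited type, in order.
struct SizeCollector {
  std::vector<std::size_t> sizes;
  template <typename T>
  void Visit() { sizes.push_back(sizeof(T)); }
};

// Returns the sizes of ARGS... in declaration order.
template <typename... ARGS>
std::vector<std::size_t> SizesOf() {
  SizeCollector collector;
  IterOverTypes<ARGS...>()(collector);
  assert(collector.sizes.size() == sizeof...(ARGS));  // one entry per type
  return collector.sizes;
}
```

Passing the end condition as a `bool` template parameter keeps `std::tuple_element` from ever being instantiated with an out-of-range index, because only the `false` specialization names it.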

## question+on+fluid.txt
Since most of the code is divided into separate operators,
and every operator is stored inside a map, how would the C++ compiler
take advantage of a global view to optimize?
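
To make the question concrete, here is a small sketch contrasting the two shapes of code. It is a simplified stand-in, not Fluid's actual operator registry: `OpRegistry`, `RunByName`, `RunFused`, and the two toy operators are all hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

using Op = std::function<float(float)>;

// Hypothetical registry: operators are looked up by name at run time.
inline const std::map<std::string, Op>& OpRegistry() {
  static const std::map<std::string, Op> registry = {
      {"scale", [](float x) { return 2.0f * x; }},
      {"add_one", [](float x) { return x + 1.0f; }},
  };
  return registry;
}

// Runs a program given as a list of operator names. Each step is a map lookup
// followed by an indirect call, so the compiler generally cannot see which
// operator bodies will run next to each other.
inline float RunByName(const std::vector<std::string>& program, float x) {
  for (const std::string& name : program) {
    auto it = OpRegistry().find(name);
    assert(it != OpRegistry().end() && "unknown operator");
    x = it->second(x);
  }
  return x;
}

// The same computation written directly. Both steps are visible in one
// function body, so the compiler is free to inline and combine them.
inline float RunFused(float x) { return 2.0f * x + 1.0f; }
```

Both functions compute the same value; the difference is only in how much of the computation is visible to the compiler at once, which is the concern the question raises.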

## SoftwareEngineering.md

      
Last active April 27, 2018 23:15

Function Default Argument

It seems that we need this fix because the last parameter of TensorCopy has a default value:
https://github.com/PaddlePaddle/Paddle/blob/c816121d11f7aed2939c5b859423883ce8bab050/paddle/fluid/framework/tensor_util.h#L26-L28
The Google C++ style guide cautions about the use of default arguments: https://google.github.io/styleguide/cppguide.html#Default_Arguments
In short, the default argument here can hide the fact that TensorCopy has two modes, which can lead to misuse.
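
A minimal sketch of the pitfall, using hypothetical functions rather than Paddle's real TensorCopy signature. The note only says there are two modes; calling them synchronous and asynchronous is an assumption made for illustration, as are `CopyMode`, `CopyWithDefault`, and `CopyExplicit`.

```cpp
#include <cassert>
#include <vector>

// Assumed for illustration: the two modes are synchronous and asynchronous.
enum class CopyMode { kAsync, kSync };

// Hypothetical copy with a defaulted flag. A call site that omits the last
// argument silently selects a mode, and nothing at the call site shows it.
inline CopyMode CopyWithDefault(const std::vector<int>& src,
                                std::vector<int>* dst, bool sync = false) {
  assert(dst != nullptr);
  *dst = src;
  return sync ? CopyMode::kSync : CopyMode::kAsync;
}

// Alternative without a default: every caller has to name the mode it wants.
inline CopyMode CopyExplicit(const std::vector<int>& src,
                             std::vector<int>* dst, CopyMode mode) {
  assert(dst != nullptr);
  *dst = src;
  return mode;
}
```

With `CopyWithDefault(src, &dst)` the reader cannot tell that a second mode exists, while `CopyExplicit(src, &dst, CopyMode::kSync)` makes the choice visible in review.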

  
## add_ci.md

      
Last active May 15, 2018 23:35

Prepare machine

Install the NVIDIA driver:

    sudo add-apt-repository ppa:graphics-drivers/ppa
    sudo apt update
    sudo apt install nvidia-390

Install Docker CE:

https://docs.docker.com/install/linux/docker-ce/ubuntu/#install-using-the-repository

  
## gdb+in+docker.md

      
Created July 7, 2018 05:29

Main reference: https://www.cprogramming.com/debugging/segfaults.html

Steps:

1. Run Docker in privileged mode: sudo nvidia-docker run -it --rm  --privileged --net=host -v $PWD:/paddle -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY=$DISPLAY paddle:dev bash
2. Remount the root file system read-write: mount -o remount,rw /
3. Set the core dump path to the current directory: bash -c 'echo core.%e.%p > /proc/sys/kernel/core_pattern'
4. Install gdb: apt-get install gdb
5. Run the executable that dumps core: ./op_registry_test
6. Look for the core dump file; in my case, core.op_registry_tes.72 (%e truncates the executable name to 15 characters).
7. Examine it with gdb: gdb op_registry_test core.op_registry_tes.72


## type_example.cpp
#include <iostream>
#include <typeinfo>  //for 'typeid' to work
#include <typeindex>

using namespace std;

class Base {};

int main() {
  cout << typeid(Base).name() << endl; // 4Base
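
The preview above is cut off after the first `typeid` call. As a sketch of where `<typeindex>` typically comes in, the block below uses `std::type_index` as a map key, since the string from `name()` is implementation-defined (the `4Base` shown above is a mangled name). `Derived`, `Label`, and the virtual destructor added to `Base` are assumptions for illustration; the original `Base` is an empty class.

```cpp
#include <cassert>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

// Base is made polymorphic here (unlike the original empty class) so that
// typeid applied to a reference reports the dynamic type.
class Base {
 public:
  virtual ~Base() = default;
};
class Derived : public Base {};

// Hypothetical helper: maps an object's dynamic type to a readable label.
// std::type_index wraps std::type_info so it can be hashed and compared.
inline std::string Label(const Base& obj) {
  static const std::unordered_map<std::type_index, std::string> labels = {
      {std::type_index(typeid(Base)), "Base"},
      {std::type_index(typeid(Derived)), "Derived"},
  };
  auto it = labels.find(std::type_index(typeid(obj)));
  assert(it != labels.end() && "type not registered");
  return it->second;
}
```

Without the virtual destructor, `typeid(obj)` would report the static type `Base` even when `obj` refers to a `Derived`.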