Code snippet collection

A collection of code examples

sunrise2575 commented Aug 4, 2021

Goroutine (waiting on a channel, non-blocking)

package main

import "fmt"

func main() {
  c := make(chan bool, 1) // buffered channel used as a done signal
  go func() {
    fmt.Println("child goroutine")
    c <- true // signal completion
  }()
  fmt.Println("parent goroutine")
  <-c // wait for the child goroutine to finish
}

Goroutine (waiting with sync.WaitGroup, non-blocking)

package main

import (
  "fmt"
  "sync"
)

func main() {
  var wg sync.WaitGroup
  wg.Add(1) // one goroutine to wait for
  go func() {
    defer wg.Done() // mark this goroutine as finished
    fmt.Println("child goroutine")
  }()
  fmt.Println("parent goroutine")
  wg.Wait() // block until the counter reaches zero
}

sunrise2575 commented Aug 4, 2021

C++ thread (non-blocking)

#include <thread>
#include <stdio.h>

int main() {
  auto t = std::thread([]{
    printf("child thread\n");
  });
  printf("parent thread\n");
  t.join(); // wait for the child thread to finish
  return 0;
}

sunrise2575 commented Aug 4, 2021

Python coroutine (non-blocking)

import asyncio

async def coroutine():
    print("child coroutine")

async def main():
    c = asyncio.create_task(coroutine())  # schedule the child coroutine
    print("parent coroutine")
    await c  # wait for the child coroutine to finish

if __name__=="__main__":
    asyncio.run(main())

sunrise2575 commented Aug 4, 2021

Python threading (non-blocking; note: because of the GIL, effectively only one thread runs Python bytecode at a time)

import threading

def work():
    print("child thread")

t = threading.Thread(target=work)
t.start()
print("parent thread")
t.join()  # wait for the child thread to finish

GIL (Global Interpreter Lock)

In Python, a single global lock per process guards the interpreter, so only one thread at a time can execute Python bytecode. This is why CPU-bound work split across threads takes about as long as running it on a single thread. The GIL only matters while a thread is doing CPU work; while one thread is blocked on I/O, another thread can run on the CPU.
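A minimal sketch of this effect (the function name and iteration count below are arbitrary assumptions): two CPU-bound runs on two threads take roughly as long as the same two runs done back to back, because only one thread can hold the GIL at a time.

import threading
import time

def cpu_bound(n=10_000_000):
    # pure-Python CPU work; holds the GIL the whole time
    total = 0
    for i in range(n):
        total += i
    return total

# two runs back to back on one thread
t0 = time.perf_counter()
cpu_bound()
cpu_bound()
print("sequential:", time.perf_counter() - t0)

# the same two runs on two threads
t0 = time.perf_counter()
threads = [threading.Thread(target=cpu_bound) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("2 threads: ", time.perf_counter() - t0)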

sunrise2575 commented Aug 4, 2021

Python multiprocessing (non-blocking)

import multiprocessing

def work():
    print("child process")

if __name__ == "__main__":
    # the __main__ guard is required on spawn-based platforms (Windows, macOS)
    p = multiprocessing.Process(target=work)
    p.start()
    print("parent process")
    p.join()  # wait for the child process to finish

JavaScript coroutine (non-blocking)

async function main() {
  const c = () => new Promise(res => {
    console.log("child coroutine");
    res();
  });
  console.log("parent coroutine");
  await c();  // calling c() runs the executor, then the promise resolves

  return 0;
}
main();

sunrise2575 commented Aug 4, 2021

Rust thread (non-blocking)

use std::thread;

fn main() {
    let t = thread::spawn(move || {
        println!("child thread");
    });
    println!("parent thread");
    t.join().unwrap(); // wait for the child thread; unwrap propagates a child panic
}

TensorFlow single-GPU & multi-GPU matrix multiplication

Python 3.7
TensorFlow 1.13
Numpy 1.16

To use TensorFlow 1.13, the three settings above (Python and NumPy versions) must match exactly.
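A quick way to confirm that the installed versions actually match (a minimal sketch; the expected values are just the ones listed above):

import sys
import numpy as np
import tensorflow as tf

print(sys.version_info[:2])  # expect (3, 7)
print(np.__version__)        # expect 1.16.x
print(tf.__version__)        # expect 1.13.x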

import numpy as np
import datetime
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1'
#os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

import tensorflow as tf


def singleGPU(A, B, n: int):
    print("Single GPU")
    with tf.device('/gpu:0'):
        a = tf.identity(A)
        b = tf.identity(B)
        for _ in range(n):
            a = tf.matmul(a, a)
            b = tf.matmul(b, b)

    with tf.device('/cpu:0'):
        sum = tf.add(a, b)

    t_start = datetime.datetime.now()
    with tf.Session() as sess:
        sess.run(sum)
    t_end = datetime.datetime.now()

    print(t_end - t_start)


def multiGPU(A, B, n: int):
    print("multi GPU")
    with tf.device('/gpu:0'):
        a = tf.identity(A)
        for _ in range(n):
            a = tf.matmul(a, a)

    with tf.device('/gpu:1'):
        b = tf.identity(B)
        for _ in range(n):
            b = tf.matmul(b, b)

    with tf.device('/cpu:0'):
        sum = tf.add(a, b)

    t_start = datetime.datetime.now()
    with tf.Session() as sess:
        sess.run(sum)
    t_end = datetime.datetime.now()

    print(t_end - t_start)


def main():
    with tf.device('/cpu:0'):
        A = tf.random.uniform((1 << 14, 1 << 14), dtype=tf.float32)
        B = tf.random.uniform((1 << 14, 1 << 14), dtype=tf.float32)

    n = 10

    print("A: {}, B: {}".format(A.shape, B.shape))

    singleGPU(A, B, n)
    multiGPU(A, B, n)


main()

sunrise2575 commented Aug 27, 2021

An example of training on sequential data with a 1-D convolution.
If the sequential data consists of vectors rather than scalars, it is enough to increase `in_channels` in MyModel to the vector dimension.

Normalizing the input values to the range -1 to 1 before feeding them in should improve performance considerably. A minimal sketch of both points follows the code below.

import torch


class MyModel(torch.nn.Module):
    def __init__(self, time_length, kernel_size):
        super(MyModel, self).__init__()
        self.time_length = time_length
        self.conv = torch.nn.Conv1d(
            in_channels=1, out_channels=32, kernel_size=kernel_size)
        self.fc = torch.nn.Conv1d(
            in_channels=32, out_channels=1, kernel_size=1)
        self.relu = torch.nn.ReLU()

    def forward(self, x):
        h = self.relu(self.conv(x[:, :, :self.time_length - 1]))

        y_hat = self.fc(h)
        return y_hat


def main_():
    batch_size = 100
    time_length = 32
    kernel_size = 16
    total_epoch = 350

    x = torch.Tensor([[range(x, x + time_length)] for x in range(batch_size)])
    y = x[:, :, kernel_size:]

    cut = int(0.8 * batch_size)

    x_train, y_train = x[:cut], y[:cut]
    x_test, y_test = x[cut:], y[cut:]

    model = MyModel(time_length=time_length, kernel_size=kernel_size)

    loss_fn = torch.nn.L1Loss()
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)

    for epoch in range(total_epoch):
        # train
        y_train_hat = model(x_train)
        loss_train = loss_fn(y_train_hat, y_train)

        optimizer.zero_grad()
        loss_train.backward()
        optimizer.step()

        # test (no_grad avoids tracking gradients during evaluation)
        with torch.no_grad():
            y_test_hat = model(x_test)
            loss_test = loss_fn(y_test_hat, y_test)

        if epoch % 25 == 0 or epoch == total_epoch - 1:
            print("epoch {:4d}, train loss: {:.3f}, test loss: {:.3f}".format(
                epoch, loss_train.item(), loss_test.item()))


if __name__ == "__main__":
    main_()
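As promised above, a minimal sketch of the two notes; the scaling statistic and the feature dimension d are illustrative assumptions, not taken from the code above:

import torch

# (1) Normalize inputs to roughly [-1, 1] before training.
#     Here an affine min/max scaling is assumed; any mapping into [-1, 1] works.
def normalize(x, lo, hi):
    return 2 * (x - lo) / (hi - lo) - 1

# e.g., using the training split's range for both splits:
# lo, hi = x_train.min(), x_train.max()
# x_train = normalize(x_train, lo, hi)
# x_test = normalize(x_test, lo, hi)

# (2) For vector-valued sequences (d features per time step), the only change
#     in MyModel is in_channels=d on the first Conv1d; the input shape becomes
#     (batch, d, time_length).
d = 4  # illustrative feature dimension
conv = torch.nn.Conv1d(in_channels=d, out_channels=32, kernel_size=16)
x_vec = torch.randn(8, d, 32)  # (batch, channels, time)
print(conv(x_vec).shape)       # torch.Size([8, 32, 17])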
