@sunrise2575
Created August 4, 2021 06:12
Code snippet collection

μ½”λ“œ μ˜ˆμ‹œ λͺ¨μŒ

sunrise2575 commented Aug 4, 2021

Python threading (non-blocking; caution: because of the GIL, this effectively runs single-threaded)

import threading

def work():
    print("child thread")

t = threading.Thread(target=work)
t.start()               # returns immediately; the child runs concurrently
print("parent thread")
t.join()                # block until the child thread finishes

GIL (Global Interpreter Lock)

In Python, a single global lock guards all of the interpreter's resources within one process, so only one thread at a time can hold the lock and run. Because of the GIL, only one thread executes computation at any moment, which is why multi-threaded CPU work finishes in roughly the same wall-clock time as single-threaded. The GIL only applies to CPU execution: while a thread has finished its CPU work and is blocked on I/O, it releases the GIL and another thread can run on the CPU at the same time.

sunrise2575 commented Aug 4, 2021

Python multiprocessing (non-blocking)

import multiprocessing

def work():
    print("child process")

if __name__ == '__main__':     # guard required on platforms that spawn (Windows/macOS)
    p = multiprocessing.Process(target=work)
    p.start()                  # returns immediately; the child runs in its own process
    print("parent process")
    p.join()                   # block until the child process exits
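
Since each process has its own interpreter and its own GIL, CPU-bound work does scale across processes. A small sketch using multiprocessing.Pool (the worker function and input sizes are made up for illustration):

import multiprocessing

def square_sum(n):
    # CPU-bound work; each pool worker runs it in a separate process, in parallel
    return sum(i * i for i in range(n))

if __name__ == '__main__':
    with multiprocessing.Pool(processes=4) as pool:
        print(pool.map(square_sum, [10_000_000] * 4))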


JavaScript coroutine (non-blocking)

async function main(){
  const c = () => new Promise(res => {
    // the executor runs synchronously at the moment c() is called
    console.log("child coroutine");
    res();
  });
  console.log("parent coroutine");
  await c();

  return 0;
}
main();
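
For comparison with the Python snippets above, the same parent/child ordering as a Python asyncio coroutine, sketched here (the names are illustrative):

import asyncio

async def child():
    print("child coroutine")

async def main():
    # create_task schedules the child but does not run it yet,
    # so the parent prints first, mirroring the JavaScript ordering
    task = asyncio.create_task(child())
    print("parent coroutine")
    await task

asyncio.run(main())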

sunrise2575 commented Aug 4, 2021

Rust thread (non-blocking)

use std::thread;

fn main() {
    let t = thread::spawn(move || {
        println!("child thread");
    });
    println!("parent thread");
    // join() returns a Result; unwrap it so a panic in the child is not silently ignored
    t.join().unwrap();
}


TensorFlow single-GPU & multi-GPU matrix multiplication

Python 3.7
TensorFlow 1.13
Numpy 1.16

To use TensorFlow 1.13, the three versions above (including Python and NumPy) must match exactly.

import numpy as np
import datetime
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1'   # expose exactly two GPUs to TensorFlow
#os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'    # uncomment to silence TensorFlow logging

import tensorflow as tf


def singleGPU(A, B, n: int):
    print("Single GPU")
    # Both matmul chains sit on the same device, so they execute sequentially
    with tf.device('/gpu:0'):
        a = tf.identity(A)
        b = tf.identity(B)
        for _ in range(n):
            a = tf.matmul(a, a)
            b = tf.matmul(b, b)

    with tf.device('/cpu:0'):
        total = tf.add(a, b)    # renamed from `sum` to avoid shadowing the builtin

    t_start = datetime.datetime.now()
    with tf.Session() as sess:
        sess.run(total)
    t_end = datetime.datetime.now()

    print(t_end - t_start)


def multiGPU(A, B, n: int):
    print("multi GPU")
    # Each chain sits on its own GPU; the session can run the two
    # independent subgraphs concurrently
    with tf.device('/gpu:0'):
        a = tf.identity(A)
        for _ in range(n):
            a = tf.matmul(a, a)

    with tf.device('/gpu:1'):
        b = tf.identity(B)
        for _ in range(n):
            b = tf.matmul(b, b)

    with tf.device('/cpu:0'):
        total = tf.add(a, b)

    t_start = datetime.datetime.now()
    with tf.Session() as sess:
        sess.run(total)
    t_end = datetime.datetime.now()

    print(t_end - t_start)


def main():
    # Build two random 16384 x 16384 matrices on the CPU
    with tf.device('/cpu:0'):
        A = tf.random.uniform((1 << 14, 1 << 14), dtype=tf.float32)
        B = tf.random.uniform((1 << 14, 1 << 14), dtype=tf.float32)

    n = 10

    print("A: {}, B: {}".format(A.shape, B.shape))

    singleGPU(A, B, n)
    multiGPU(A, B, n)


main()

sunrise2575 commented Aug 27, 2021

An example that trains on sequential data with a 1-D convolution.
If the sequential data carries vector values instead of scalars, just widen `in_channels` in MyModel to the vector dimension (see the sketch after the code).

Normalizing the input values into the range -1 to 1 before feeding them in should improve performance considerably; one way to do this is sketched below.
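
A small sketch of such a normalization, min-max scaling a whole tensor into [-1, 1] (the scaling scheme is an assumption; computing the statistics on the training split only is another common choice):

import torch

def normalize_to_unit_range(x: torch.Tensor) -> torch.Tensor:
    # Affine min-max scaling of the whole tensor into [-1, 1]
    x_min, x_max = x.min(), x.max()
    return 2 * (x - x_min) / (x_max - x_min) - 1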

import torch


class MyModel(torch.nn.Module):
    def __init__(self, time_length, kernel_size):
        super(MyModel, self).__init__()
        self.time_length = time_length
        # 1-D convolution over the time axis; widen in_channels for vector-valued input
        self.conv = torch.nn.Conv1d(
            in_channels=1, out_channels=32, kernel_size=kernel_size)
        # kernel_size=1 convolution acts as a per-timestep fully connected layer
        self.fc = torch.nn.Conv1d(
            in_channels=32, out_channels=1, kernel_size=1)
        self.relu = torch.nn.ReLU()

    def forward(self, x):
        # Use the first time_length - 1 steps; each window of kernel_size
        # past values predicts the value that follows it
        h = self.relu(self.conv(x[:, :, :self.time_length - 1]))

        y_hat = self.fc(h)
        return y_hat


def main_():
    batch_size = 100
    time_length = 32
    kernel_size = 16
    total_epoch = 350

    # Each sample is an arithmetic sequence; shape (batch, 1, time_length)
    x = torch.Tensor([[range(x, x + time_length)] for x in range(batch_size)])
    # Targets: the value following each kernel_size-wide window
    y = x[:, :, kernel_size:]

    cut = int(0.8 * batch_size)    # 80/20 train/test split

    x_train, y_train = x[:cut], y[:cut]
    x_test, y_test = x[cut:], y[cut:]

    model = MyModel(time_length=time_length, kernel_size=kernel_size)

    loss_fn = torch.nn.L1Loss()
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)

    for epoch in range(total_epoch):
        # train
        y_train_hat = model(x_train)
        loss_train = loss_fn(y_train_hat, y_train)

        optimizer.zero_grad()
        loss_train.backward()
        optimizer.step()

        # test (no gradient tracking needed for evaluation)
        with torch.no_grad():
            y_test_hat = model(x_test)
            loss_test = loss_fn(y_test_hat, y_test)

        if epoch % 25 == 0 or epoch == total_epoch - 1:
            print("epoch {:4d}, train loss: {:.3f}, test loss: {:.3f}".format(
                epoch, loss_train.item(), loss_test.item()))


main_()