Iz Beltagy ibeltagy

ibeltagy / bootstrap_public.sh
Last active August 24, 2022 17:53
Commands to set up a new environment
# Generate ssh key
ssh-keygen -t rsa -b 4096 -C "youremail@something.com"
eval "$(ssh-agent -s)"
ssh-add -K ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub
# Set up ssh key forwarding (on the client machine)
vim ~/.ssh/config
Host *
ForwardAgent yes
Lowering:
=====================
// attr [R] storage_scope = "global"
allocate R[float32 * ((bsz*d1)*d2)]
produce R {
// attr [iter_var(blockIdx.z, , blockIdx.z)] thread_extent = bsz
// attr [R.local] storage_scope = "local"
allocate R.local[float32 * 64]
// attr [A.shared] storage_scope = "shared"
allocate A.shared[float32 * 512]
import time
import torch
import tvm
from tvm.contrib import dlpack
from tvm import te
def _compile_function(b0: int = 4, b1: int = 4, b2: int = 16):
    bsz = te.var('bsz')
    d1 = te.var('d1')
Metric: CompileTime
TotalSamples: 13
Accumulator: 510ms604.405us
ValueRate: 012ms791.510us / second
Rate: 0.300801 / second
Percentiles: 1%=003ms193.068us; 5%=003ms193.068us; 10%=003ms370.462us; 20%=004ms592.736us; 50%=005ms794.690us; 80%=124ms659.441us; 90%=124ms151.474us; 95%=219ms912.882us; 99%=219ms912.882us
Metric: ExecuteTime
TotalSamples: 264
Accumulator: 08s532ms717.936us
ValueRate: 068ms308.366us / second
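The durations in the metrics report above are printed as concatenated unit fields (for example 08s532ms717.936us), which is awkward to compare by eye. The following is a minimal sketch of a converter to seconds. The helper names parse_duration and mean_duration are hypothetical, only the units visible in the excerpt (s, ms, us) are handled, and it assumes that Accumulator is the sum over all samples, so that Accumulator / TotalSamples gives a mean.

```python
import re

# Seconds per unit. Only the units that appear in the excerpt are covered
# (assumption: longer runs may print other units this sketch does not handle).
_UNIT_SECONDS = {"s": 1.0, "ms": 1e-3, "us": 1e-6}

# "ms" and "us" are listed before "s" so the two-letter units are tried first.
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ms|us|s)")


def parse_duration(text: str) -> float:
    """Convert a duration such as '08s532ms717.936us' to seconds."""
    text = text.strip()
    pos = 0
    total = 0.0
    for match in _TOKEN.finditer(text):
        # Reject anything between fields instead of silently skipping it.
        if match.start() != pos:
            raise ValueError(f"unexpected text in duration: {text!r}")
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    if pos == 0 or pos != len(text):
        raise ValueError(f"could not parse duration: {text!r}")
    return total


def mean_duration(accumulator: str, total_samples: int) -> float:
    """Mean seconds per sample, assuming the accumulator is a plain sum."""
    if total_samples <= 0:
        raise ValueError("total_samples must be positive")
    return parse_duration(accumulator) / total_samples


if __name__ == "__main__":
    # Values copied from the CompileTime and ExecuteTime entries above.
    print("CompileTime mean (s):", mean_duration("510ms604.405us", 13))
    print("ExecuteTime mean (s):", mean_duration("08s532ms717.936us", 264))
```

The parser raises on any unit it does not recognise rather than guessing, so an unexpected field shows up as an error instead of a wrong number.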
Start Duration Grid Size Block Size Regs* SSMem* DSMem* Size Throughput SrcMemType DstMemType Device Context Stream Name
4.60628s 6.7002ms (104 1 1) (256 1 1) 40 0B 0B - - - - Tesla K80 (0) 1 7 _ZN84_GLOBAL__N__60_tmpxft_00007ca2_00000000_11_Distributions_compute_75_cpp1_ii_c3aa7ee643distribution_elementwise_grid_stride_kernelIfLi4EZZZN2at6native19uniform_kernel_cudaERNS1_14TensorIteratorEddPNS1_9GeneratorEENKUlvE_clEvENKUlvE0_clEvEUlP24curandStatePhilox4_32_10E0_ZNS_27distribution_nullary_kernelIffLi4ESB_ZZZNS2_19uniform_kernel_cudaES4_ddS6_ENKS7_clEvENKS8_clEvEUlfE_EEvS4_PNS1_13CUDAGeneratorERKT2_T3_EUlifE_EEviSt4pairImmET1_SG_ [212]
4.61298s 6.6982ms (104 1 1) (256 1 1) 40 0B 0B - - - - Tesla K80 (0) 1 7 _ZN84_GLOBAL__N__60_tmpxft_00007ca2_00000000_11_Distributio
2019-09-20 23:35:07 [] Cluster configuration: {client_workers: [{10.6.32.15, n1-standard-16, europe-west4-a, xla-0ffn}, {10.6.32.27, n1-standard-16, europe-west4-a, xla-1hgb}, {10.6.32.60, n1-standard-16, europe-west4-a, xla-1jc5}, {10.6.32.109, n1-standard-16, europe-west4-a, xla-2k9d}, {10.6.32.12, n1-standard-16, europe-west4-a, xla-2wds}, {10.6.32.41, n1-standard-16, europe-west4-a, xla-38d4}, {10.6.32.118, n1-standard-16, europe-west4-a, xla-3wx4}, {10.6.32.116, n1-standard-16, europe-west4-a, xla-4zkl}, {10.6.32.52, n1-standard-16, europe-west4-a, xla-51lv}, {10.6.32.66, n1-standard-16, europe-west4-a, xla-52lk}, {10.6.32.127, n1-standard-16, europe-west4-a, xla-5317}, {10.6.32.126, n1-standard-16, europe-west4-a, xla-5520}, {10.6.47.196, n1-standard-16, europe-west4-a, xla-5vk3}, {10.6.32.10, n1-standard-16, europe-west4-a, xla-753h}, {10.6.32.51, n1-standard-16, europe-west4-a, xla-7llm}, {10.6.32.32, n1-standard-16, europe-west4-a, xla-8bsj}, {10.6.47.194, n1-standard-16, europe-west4-a, xla-9q8t},
beltagy@xla-group-0bcq:/usr/share/torch-xla-nightly/pytorch/xla$ conda activate pytorch-nightly
(pytorch-nightly) beltagy@xla-group-0bcq:/usr/share/torch-xla-nightly/pytorch/xla$ python torch_xla_py/xla_dist.py --tpu=tpu512 --conda-env=pytorch-nightly --env=ABC=1 -- python /usr/share/torch-xla-nightly/pytorch/xla/test/test_train_cifar.py
2019-09-20 04:23:48 [] Command to distribute: "python" "/usr/share/torch-xla-nightly/pytorch/xla/test/test_train_cifar.py"
2019-09-20 04:23:48 [] Cluster configuration: {client_workers: [{10.6.47.220, n1-standard-16, europe-west4-a, xla-group-0bcq}, {10.6.47.200, n1-standard-16, europe-west4-a, xla-group-0gt7}, {10.6.47.218, n1-standard-16, europe-west4-a, xla-group-1lhq}, {10.6.47.251, n1-standard-16, europe-west4-a, xla-group-1x8f}, {10.6.47.232, n1-standard-16, europe-west4-a, xla-group-25jd}, {10.6.47.245, n1-standard-16, europe-west4-a, xla-group-2802}, {10.6.47.204, n1-standard-16, europe-west4-a, xla-group-2sxg}, {10.6.47.237, n1-standard-16, europe-west4-a, xla-gro
2019-09-07 01:17:07 [] Command to distribute: "python" "/usr/share/torch-xla-nightly/pytorch/xla/test/test_train_imagenet.py" "--fake_data"
2019-09-07 01:17:07 [] Cluster configuration: {client_workers: [{10.6.47.193, n1-standard-16, europe-west4-a, xla-group-050v}, {10.6.32.48, n1-standard-16, europe-west4-a, xla-group-07s7}, {10.6.32.38, n1-standard-16, europe-west4-a, xla-group-0vzx}, {10.6.32.22, n1-standard-16, europe-west4-a, xla-group-0zqc}, {10.6.32.13, n1-standard-16, europe-west4-a, xla-group-1jd9}, {10.6.32.31, n1-standard-16, europe-west4-a, xla-group-2qw4}, {10.6.32.52, n1-standard-16, europe-west4-a, xla-group-3pkr}, {10.6.32.28, n1-standard-16, europe-west4-a, xla-group-3qrr}, {10.6.32.29, n1-standard-16, europe-west4-a, xla-group-4hv6}, {10.6.32.26, n1-standard-16, europe-west4-a, xla-group-4s6r}, {10.6.32.11, n1-standard-16, europe-west4-a, xla-group-4w79}, {10.6.32.7, n1-standard-16, europe-west4-a, xla-group-5w85}, {10.6.32.35, n1-standard-16, europe-west4-a, xla-group-6757}, {10.6.32.14,
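The "Cluster configuration" lines above pack every worker into one long line, so a small parser helps when checking how many workers were picked up. This is a minimal sketch that reads only the log text and does not use any torch_xla API. The field names (ip, machine_type, zone, name) are an interpretation of the four comma-separated values in each entry, and parse_workers is a hypothetical helper.

```python
import re
from typing import List, NamedTuple


class Worker(NamedTuple):
    # Field names are assumptions inferred from the values in the log line.
    ip: str
    machine_type: str
    zone: str
    name: str


# One complete "{ip, machine type, zone, name}" entry. Entries cut off by
# truncation (no closing brace) deliberately do not match.
_WORKER = re.compile(
    r"\{(\d{1,3}(?:\.\d{1,3}){3}), ([^,{}]+), ([^,{}]+), ([^,{}]+)\}"
)


def parse_workers(log_line: str) -> List[Worker]:
    """Return every complete worker entry found in a cluster configuration line."""
    return [Worker(*m.groups()) for m in _WORKER.finditer(log_line)]


if __name__ == "__main__":
    # Two complete entries copied from the log above, plus one truncated entry.
    line = (
        "2019-09-07 01:17:07 [] Cluster configuration: {client_workers: ["
        "{10.6.47.193, n1-standard-16, europe-west4-a, xla-group-050v}, "
        "{10.6.32.48, n1-standard-16, europe-west4-a, xla-group-07s7}, "
        "{10.6.32.14,"
    )
    for worker in parse_workers(line):
        print(worker.name, worker.ip, worker.zone)
```

Because truncated entries are skipped, the count returned for a clipped log line is a lower bound on the real number of workers.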
[ScheduleSyncTensorsGraph]
TensorsGraphInfo:
HloModule IrToHlo.92
%add_F32.12 (lhs.13: f32[], rhs.14: f32[]) -> f32[] {
%lhs.13 = f32[] parameter(0)
%rhs.14 = f32[] parameter(1)
ROOT %add.15 = f32[] add(f32[] %lhs.13, f32[] %rhs.14)
}
2019-08-27 18:23:18.396066: E tensorflow/compiler/xla/xla_client/xla_util.cc:72] >>> Dumping Computation 0
2019-08-27 18:23:18.396287: E tensorflow/compiler/xla/xla_client/xla_util.cc:72] HloModule SyncTensorsGraph.92
2019-08-27 18:23:18.396298: E tensorflow/compiler/xla/xla_client/xla_util.cc:72]
2019-08-27 18:23:18.396319: E tensorflow/compiler/xla/xla_client/xla_util.cc:72] %add_F32.12 (lhs.13: f32[], rhs.14: f32[]) -> f32[] {
2019-08-27 18:23:18.396328: E tensorflow/compiler/xla/xla_client/xla_util.cc:72] %lhs.13 = f32[] parameter(0)
2019-08-27 18:23:18.396353: E tensorflow/compiler/xla/xla_client/xla_util.cc:72] %rhs.14 = f32[] parameter(1)
2019-08-27 18: