View run-cudf-benchmark-perlmutter.sh
#!/bin/bash
#SBATCH --ntasks=16
#SBATCH --ntasks-per-node=1
#SBATCH --account=dasrepo_g
#SBATCH --constraint=gpu
#SBATCH --gpus-per-node=4
#SBATCH --qos=early_science
#SBATCH --time 00:09:00
View cuDF-merge-results-perlmutter.txt
-------------------------------
backend | dask
merge type | gpu
rows-per-chunk | 50000000
base-chunks | 4
other-chunks | 4
broadcast | default
protocol | ucx
device(s) | 0
rmm-pool | False
View gwas-gpu.py
import cupy
import numpy as np
import xarray as xr
import dask.array as da
from dask.array import stats
import fsspec
n = 10000 # Number of variants (i.e. genomic locations)
m = 100000 # Number of individuals (i.e. people)
c = 3 # Number of covariates (i.e. confounders)
View benchmark_array-selene_a100_cuda112_condacupy860_cub_cutensor-results.txt
----------------------------------------------------------------------------- benchmark: 48 tests -----------------------------------------------------------------------------
Name (time in us)                          Min             Max            Mean          StdDev         Median             IQR  Outliers                 OPS  Rounds  Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_Array_Slicing[shape0-cupy]  23.8450 (1.0)  68.3790 (2.03)  34.2448 (1.19)  19.1792 (8.08)  25.4280 (1.0)  14.3270 (5.18)       1;1  29,201.5031 (0.84)       5
View gpu-yarn.txt
## start cluster
REGION="us-east1"
CLUSTER_NAME="dask-rapids-test"
NUM_GPUS=2
NUM_WORKERS=2
gcloud dataproc clusters create $CLUSTER_NAME \
--region $REGION \
--image-version=2.0.0-RC22-ubuntu18 \
--master-machine-type n1-standard-16 \
--num-workers $NUM_WORKERS \
View yossi-patches.patch
diff --git a/src/ucp/core/ucp_types.h b/src/ucp/core/ucp_types.h
index 458317530..e2047d339 100644
--- a/src/ucp/core/ucp_types.h
+++ b/src/ucp/core/ucp_types.h
@@ -38,7 +38,7 @@ typedef uint8_t ucp_lane_map_t;
 /* Worker configuration index for endpoint and rkey */
 typedef uint8_t ucp_worker_cfg_index_t;
-#define UCP_WORKER_MAX_EP_CONFIG 16
+#define UCP_WORKER_MAX_EP_CONFIG 64
View sched-20210112-big-shuffle-not-optimized.svg
View sched-20210112-big-shuffle-optimized.svg
View profiled-main-optimized-ucx.html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
View profiled-main-not-optimized-ucx.html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">