@lcw
lcw / 45886739_1_rank.00_stderr
Last active February 24, 2023 18:50
Julia 1.9 MPI OOM
@lcw
lcw / 45886644_1_rank.0_stderr
Last active February 24, 2023 00:22
MPI.jl working
@lcw
lcw / 45886643_1_rank.00_stderr
Last active February 24, 2023 00:21
MPI.jl fail in MPI.Init()
signal (15): Terminated
in expression starting at /home/lwilcox/scalable/MPI/hello.jl:1
sweep_page at /cache/build/default-aws-shared0-3/julialang/julia-release-1-dot-8/src/gc.c:1379 [inlined]
sweep_pool_page at /cache/build/default-aws-shared0-3/julialang/julia-release-1-dot-8/src/gc.c:1447 [inlined]
sweep_pool_pagetable0 at /cache/build/default-aws-shared0-3/julialang/julia-release-1-dot-8/src/gc.c:1467 [inlined]
sweep_pool_pagetable1 at /cache/build/default-aws-shared0-3/julialang/julia-release-1-dot-8/src/gc.c:1487 [inlined]
sweep_pool_pagetable at /cache/build/default-aws-shared0-3/julialang/julia-release-1-dot-8/src/gc.c:1517 [inlined]
gc_sweep_pool at /cache/build/default-aws-shared0-3/julialang/julia-release-1-dot-8/src/gc.c:1592
_jl_gc_collect at /cache/build/default-aws-shared0-3/julialang/julia-release-1-dot-8/src/gc.c:3243
@lcw
lcw / 45886605_1_rank.00_stderr
Last active February 23, 2023 23:50
MPI.jl hang in MPI.Init()
[compute-2-30.hamming.cluster:40843] OPAL ERROR: Error in file ../../opal/runtime/opal_progress_threads.c at line 134
[compute-2-30.hamming.cluster:40843] OPAL ERROR: Error in file ../../opal/runtime/opal_progress_threads.c at line 193
[compute-2-30.hamming.cluster:40843] PMIX ERROR: ERROR in file ../../../../../../../opal/mca/pmix/pmix2x/pmix/src/runtime/pmix_progress_threads.c at line 140
[compute-2-30.hamming.cluster:40843] PMIX ERROR: ERROR in file ../../../../../../../opal/mca/pmix/pmix2x/pmix/src/runtime/pmix_progress_threads.c at line 199
--------------------------------------------------------------------------
It looks like pmix_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during pmix_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
@lcw
lcw / error.sh
Created January 26, 2023 17:15
Julia compute-sanitizer error
[lwilcox@compute-8-21 ~]$ julia -g2
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.8.4 (2022-12-23)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |
# This file is machine-generated - editing it directly is not advised
[[AbstractFFTs]]
deps = ["ChainRulesCore", "LinearAlgebra"]
git-tree-sha1 = "6f1d9bc1c08f9f4a8fa92e3ea3cb50153a1b40d4"
uuid = "621f4979-c628-5d54-868e-fcf4e3e8185c"
version = "1.1.0"
[[Adapt]]
deps = ["LinearAlgebra"]
@lcw
lcw / get_dod_certs.sh
Created June 9, 2021 20:53 — forked from jfeilbach/get_dod_certs.sh
get the DoD certs including root certs. download, verify, install, and revoke
#!/bin/bash
# DoD Root Certificate install 19 July 2019
# to do: add Firefox import
# set cert numbers as variables
# combine fingerprint functions
# check all root CA fingerprints
# compare against CRL
SECONDS='0'
NC='\e[0m'
@lcw
lcw / gpu_demo.md
Last active February 3, 2021 22:04
@lcw
lcw / Manifest.toml
Last active January 23, 2021 00:42
# This file is machine-generated - editing it directly is not advised
[[Adapt]]
deps = ["LinearAlgebra"]
git-tree-sha1 = "345a14764e43fe927d6f5c250fe4c8e4664e6ee8"
uuid = "79e6a3ab-5dfb-504d-930d-738a2a938a0e"
version = "2.4.0"
[[ArgCheck]]
git-tree-sha1 = "dedbbb2ddb876f899585c4ec4433265e3017215a"

GCM Tracer timings in CLIMA

We are experimenting with the number of tracers in ClimateMachine, starting from the ClimateMachine branch lcw/diff_nstate (commit f25ed2a9c93674ae27c3e0e5280c1aadab64807e). The following change is made to reduce the overall runtime of the simulation.

diff --git a/experiments/AtmosGCM/heldsuarez.jl b/experiments/AtmosGCM/heldsuarez.jl
index e2d980ee1..a1537958b 100755