Same runner.py as https://gist.github.com/wconstab/9802986a1353ee8eb14e12d1e6a23b79.
Run with:

rm mytest; LTC_SAVE_TENSORS_FILE=mytest PYTORCH_JIT_LOG_LEVEL=">>>graph_fuser" LTC_TS_CUDA=1 python bias_dropout_add_layernorm.py > console.log 2>&1
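For reference, here is a minimal sketch of the pattern the script name suggests (bias add -> dropout -> residual add -> layer norm) on the lazy device; this is a hypothetical reconstruction, and the actual bias_dropout_add_layernorm.py may differ:

import torch
import torch._lazy
import torch._lazy.ts_backend

# Initialize the TorchScript lazy-tensor backend; with LTC_TS_CUDA=1 the
# lazy device executes on CUDA (and hence through nvFuser).
torch._lazy.ts_backend.init()
device = "lazy"

x = torch.randn(8, 1024, device=device, requires_grad=True)
bias = torch.randn(1024, device=device, requires_grad=True)
residual = torch.randn(8, 1024, device=device)
ln = torch.nn.LayerNorm(1024).to(device)

# bias add -> dropout -> residual add -> layer norm, then a sum so
# backward has a scalar loss to start from.
out = ln(residual + torch.nn.functional.dropout(x + bias, p=0.1, training=True))
out.sum().backward()

# Flush the accumulated lazy graph (forward + backward) for execution.
torch._lazy.mark_step()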
Observed this warning for some reason:
[W manager.cpp:305] Warning: FALLBACK path has been taken. This is an indication that codegen failed for some reason. To debug try disable codegen fallback path via setting the env variable export PYTORCH_NVFUSER_DISABLE_FALLBACK=1 (function runCudaFusionGroup)
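Per the warning's own suggestion, the underlying codegen failure can be surfaced (instead of silently falling back) by re-running the same command with the fallback disabled, e.g.:

rm mytest; PYTORCH_NVFUSER_DISABLE_FALLBACK=1 LTC_SAVE_TENSORS_FILE=mytest PYTORCH_JIT_LOG_LEVEL=">>>graph_fuser" LTC_TS_CUDA=1 python bias_dropout_add_layernorm.py > console.log 2>&1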
But I don't see the fragmentation of backward that you mentioned. I'm also not sure the backward pass is complete in this case; it only includes native_layer_norm and sum.