Sweep logs for HEAD --accuracy --backend aot_eager --training --explain (TORCHDYNAMO_DYNAMIC_SHAPES=1) - fd66568b81615212727e18052b52b845293efc24 Sat Dec 17 15:52:17 UTC 2022
cuda train BERT_pytorch PASS
Dynamo produced 4 graph(s) covering 574 ops
cuda train Background_Matting PASS
Dynamo produced 2 graph(s) covering 366 ops
WARNING:root:DALLE2_pytorch failed to load
Eager model failed to run
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1019, in validate_model
self.model_iter_fn(model, example_inputs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 356, in forward_and_backward_pass
self.grad_scaler.scale(loss).backward()
File "/scratch/ezyang/work/a/pytorch/torch/_tensor.py", line 484, in backward
torch.autograd.backward(
File "/scratch/ezyang/work/a/pytorch/torch/autograd/__init__.py", line 197, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 2042, in run
) = runner.load_model(device, model_name, batch_size=batch_size)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 300, in load_model
self.validate_model(model, example_inputs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1021, in validate_model
raise NotImplementedError("Eager model failed to run") from e
NotImplementedError: Eager model failed to run
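A minimal standalone reproduction of this failure class (assuming nothing about DALLE2_pytorch itself): calling backward() on a tensor that does not require grad raises exactly this RuntimeError.

    import torch

    loss = torch.zeros(())  # requires_grad defaults to False
    try:
        loss.backward()
    except RuntimeError as e:
        # element 0 of tensors does not require grad and does not have a grad_fn
        print(e)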
cuda train LearningToPaint PASS
Dynamo produced 2 graph(s) covering 144 ops
cuda train Super_SloMo PASS
Dynamo produced 1 graph(s) covering 374 ops
cuda train alexnet PASS
Dynamo produced 2 graph(s) covering 44 ops
cuda train attention_is_all_you_need_pytorch PASS
Dynamo produced 6 graph(s) covering 615 ops
cuda train dcgan PASS
Dynamo produced 2 graph(s) covering 26 ops
cuda train densenet121 PASS
Dynamo produced 2 graph(s) covering 862 ops
WARNING:root:detectron2_fcos_r_50_fpn failed to load
FCOS train is not supported by upstream detectron2. See GH Issue: https://github.com/facebookresearch/detectron2/issues/4369.
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 2042, in run
) = runner.load_model(device, model_name, batch_size=batch_size)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 268, in load_model
benchmark = benchmark_cls(
File "/scratch/ezyang/work/a/torchbenchmark/torchbenchmark/util/model.py", line 19, in __call__
obj = type.__call__(cls, *args, **kwargs)
File "/scratch/ezyang/work/a/torchbenchmark/torchbenchmark/models/detectron2_fcos_r_50_fpn/__init__.py", line 15, in __init__
super().__init__(variant="COCO-Detection/fcos_R_50_FPN_1x.py", test=test, device=device,
File "/scratch/ezyang/work/a/torchbenchmark/torchbenchmark/util/framework/detectron2/model_factory.py", line 100, in __init__
loader = self.setup_train(cfg, args)
File "/scratch/ezyang/work/a/torchbenchmark/torchbenchmark/util/framework/detectron2/model_factory.py", line 110, in setup_train
raise NotImplementedError("FCOS train is not supported by upstream detectron2. " \
NotImplementedError: FCOS train is not supported by upstream detectron2. See GH Issue: https://github.com/facebookresearch/detectron2/issues/4369.
WARNING:root:detectron2_maskrcnn_r_50_c4 failed to load
Eager model failed to run
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1019, in validate_model
self.model_iter_fn(model, example_inputs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 355, in forward_and_backward_pass
loss = self.compute_loss(pred)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 345, in compute_loss
return reduce_to_scalar_loss(pred)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/testing.py", line 97, in reduce_to_scalar_loss
return sum([reduce_to_scalar_loss(x) for x in out]) / len(out)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/testing.py", line 97, in <listcomp>
return sum([reduce_to_scalar_loss(x) for x in out]) / len(out)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/testing.py", line 107, in reduce_to_scalar_loss
return sum([reduce_to_scalar_loss(value) for value in out.values()]) / len(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/testing.py", line 107, in <listcomp>
return sum([reduce_to_scalar_loss(value) for value in out.values()]) / len(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/testing.py", line 110, in reduce_to_scalar_loss
raise NotImplementedError("Don't know how to reduce", type(out))
NotImplementedError: ("Don't know how to reduce", <class 'detectron2.structures.instances.Instances'>)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 2042, in run
) = runner.load_model(device, model_name, batch_size=batch_size)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 300, in load_model
self.validate_model(model, example_inputs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1021, in validate_model
raise NotImplementedError("Eager model failed to run") from e
NotImplementedError: Eager model failed to run
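For context on the reduce_to_scalar_loss failures here and below, a simplified sketch of the function as reconstructed from the traceback frames (torch/_dynamo/testing.py): it recursively averages tensors, sequences, and dicts, and raises for any other type, which is why detectron2's Instances (and, below, numpy.ndarray and str outputs) cannot be reduced.

    import torch

    def reduce_to_scalar_loss(out):
        # Sketch reconstructed from the frames above, not the exact source.
        if isinstance(out, torch.Tensor):
            return out.float().mean()
        if isinstance(out, (list, tuple)):
            return sum(reduce_to_scalar_loss(x) for x in out) / len(out)
        if isinstance(out, dict):
            return sum(reduce_to_scalar_loss(v) for v in out.values()) / len(out)
        raise NotImplementedError("Don't know how to reduce", type(out))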
cuda train dlrm PASS
Dynamo produced 1 graph(s) covering 40 ops
WARNING:root:doctr_det_predictor failed to load
Eager model failed to run
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1019, in validate_model
self.model_iter_fn(model, example_inputs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 355, in forward_and_backward_pass
loss = self.compute_loss(pred)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 345, in compute_loss
return reduce_to_scalar_loss(pred)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/testing.py", line 107, in reduce_to_scalar_loss
return sum([reduce_to_scalar_loss(value) for value in out.values()]) / len(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/testing.py", line 107, in <listcomp>
return sum([reduce_to_scalar_loss(value) for value in out.values()]) / len(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/testing.py", line 97, in reduce_to_scalar_loss
return sum([reduce_to_scalar_loss(x) for x in out]) / len(out)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/testing.py", line 97, in <listcomp>
return sum([reduce_to_scalar_loss(x) for x in out]) / len(out)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/testing.py", line 110, in reduce_to_scalar_loss
raise NotImplementedError("Don't know how to reduce", type(out))
NotImplementedError: ("Don't know how to reduce", <class 'numpy.ndarray'>)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 2042, in run
) = runner.load_model(device, model_name, batch_size=batch_size)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 300, in load_model
self.validate_model(model, example_inputs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1021, in validate_model
raise NotImplementedError("Eager model failed to run") from e
NotImplementedError: Eager model failed to run
WARNING:root:doctr_reco_predictor failed to load
Eager model failed to run
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1019, in validate_model
self.model_iter_fn(model, example_inputs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 355, in forward_and_backward_pass
loss = self.compute_loss(pred)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 345, in compute_loss
return reduce_to_scalar_loss(pred)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/testing.py", line 107, in reduce_to_scalar_loss
return sum([reduce_to_scalar_loss(value) for value in out.values()]) / len(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/testing.py", line 107, in <listcomp>
return sum([reduce_to_scalar_loss(value) for value in out.values()]) / len(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/testing.py", line 97, in reduce_to_scalar_loss
return sum([reduce_to_scalar_loss(x) for x in out]) / len(out)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/testing.py", line 97, in <listcomp>
return sum([reduce_to_scalar_loss(x) for x in out]) / len(out)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/testing.py", line 97, in reduce_to_scalar_loss
return sum([reduce_to_scalar_loss(x) for x in out]) / len(out)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/testing.py", line 97, in <listcomp>
return sum([reduce_to_scalar_loss(x) for x in out]) / len(out)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/testing.py", line 110, in reduce_to_scalar_loss
raise NotImplementedError("Don't know how to reduce", type(out))
NotImplementedError: ("Don't know how to reduce", <class 'str'>)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 2042, in run
) = runner.load_model(device, model_name, batch_size=batch_size)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 300, in load_model
self.validate_model(model, example_inputs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1021, in validate_model
raise NotImplementedError("Eager model failed to run") from e
NotImplementedError: Eager model failed to run
/scratch/ezyang/work/a/pytorch/torch/utils/tensorboard/__init__.py:4: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
if not hasattr(tensorboard, "__version__") or LooseVersion(
/data/home/ezyang/local/a/pytorch-env/lib/python3.9/site-packages/gym/core.py:317: DeprecationWarning: WARN: Initializing wrapper in old step API which returns one bool instead of two. It is recommended to set `new_step_api=True` to use new step API. This will be the default behaviour in future.
deprecation(
cuda train drq PASS
Dynamo produced 4 graph(s) covering 61 ops
cuda train fastNLP_Bert PASS
Dynamo produced 6 graph(s) covering 629 ops
cuda train functorch_dp_cifar10 WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((s2 - 1)//2*(Mod((s2 - 1)//2**2/((s2 - 1)//2 + 1) + 2*(s2 - 1)//2/((s2 - 1)//2 + 1) + 1/((s2 - 1)//2 + 1), 1)) + Mod((s2 - 1)//2**2/((s2 - 1)//2 + 1) + 2*(s2 - 1)//2/((s2 - 1)//2 + 1) + 1/((s2 - 1)//2 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((s2 - 1)//4*(Mod((s2 - 1)//4**2/((s2 - 1)//4 + 1) + 2*(s2 - 1)//4/((s2 - 1)//4 + 1) + 1/((s2 - 1)//4 + 1), 1)) + Mod((s2 - 1)//4**2/((s2 - 1)//4 + 1) + 2*(s2 - 1)//4/((s2 - 1)//4 + 1) + 1/((s2 - 1)//4 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((s2 - 1)//8*(Mod((s2 - 1)//8**2/((s2 - 1)//8 + 1) + 2*(s2 - 1)//8/((s2 - 1)//8 + 1) + 1/((s2 - 1)//8 + 1), 1)) + Mod((s2 - 1)//8**2/((s2 - 1)//8 + 1) + 2*(s2 - 1)//8/((s2 - 1)//8 + 1) + 1/((s2 - 1)//8 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((s2 - 1)//16*(Mod((s2 - 1)//16**2/((s2 - 1)//16 + 1) + 2*(s2 - 1)//16/((s2 - 1)//16 + 1) + 1/((s2 - 1)//16 + 1), 1)) + Mod((s2 - 1)//16**2/((s2 - 1)//16 + 1) + 2*(s2 - 1)//16/((s2 - 1)//16 + 1) + 1/((s2 - 1)//16 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((s2 - 1)//2*(Mod((s2 - 1)//2**2/((s2 - 1)//2 + 1) + 2*(s2 - 1)//2/((s2 - 1)//2 + 1) + 1/((s2 - 1)//2 + 1), 1)) + Mod((s2 - 1)//2**2/((s2 - 1)//2 + 1) + 2*(s2 - 1)//2/((s2 - 1)//2 + 1) + 1/((s2 - 1)//2 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((s2 - 1)//4*(Mod((s2 - 1)//4**2/((s2 - 1)//4 + 1) + 2*(s2 - 1)//4/((s2 - 1)//4 + 1) + 1/((s2 - 1)//4 + 1), 1)) + Mod((s2 - 1)//4**2/((s2 - 1)//4 + 1) + 2*(s2 - 1)//4/((s2 - 1)//4 + 1) + 1/((s2 - 1)//4 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((s2 - 1)//8*(Mod((s2 - 1)//8**2/((s2 - 1)//8 + 1) + 2*(s2 - 1)//8/((s2 - 1)//8 + 1) + 1/((s2 - 1)//8 + 1), 1)) + Mod((s2 - 1)//8**2/((s2 - 1)//8 + 1) + 2*(s2 - 1)//8/((s2 - 1)//8 + 1) + 1/((s2 - 1)//8 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((s2 - 1)//16*(Mod((s2 - 1)//16**2/((s2 - 1)//16 + 1) + 2*(s2 - 1)//16/((s2 - 1)//16 + 1) + 1/((s2 - 1)//16 + 1), 1)) + Mod((s2 - 1)//16**2/((s2 - 1)//16 + 1) + 2*(s2 - 1)//16/((s2 - 1)//16 + 1) + 1/((s2 - 1)//16 + 1), 1) - 0, s2)
PASS
Dynamo produced 2 graph(s) covering 138 ops
cuda train functorch_maml_omniglot PASS
Dynamo produced 2 graph(s) covering 28 ops
cuda train hf_Albert PASS
Dynamo produced 4 graph(s) covering 571 ops
cuda train hf_Bart PASS
Dynamo produced 39 graph(s) covering 571 ops
cuda train hf_Bert PASS
Dynamo produced 5 graph(s) covering 556 ops
cuda train hf_Bert_large PASS
Dynamo produced 5 graph(s) covering 1096 ops
cuda train hf_BigBird PASS
Dynamo produced 63 graph(s) covering 806 ops
cuda train hf_DistilBert PASS
Dynamo produced 4 graph(s) covering 217 ops
cuda train hf_GPT2 PASS
Dynamo produced 60 graph(s) covering 924 ops
cuda train hf_GPT2_large PASS
Dynamo produced 0 graph(s) covering 0 ops
cuda train hf_Longformer [2022-12-17 16:17:15,491] torch._dynamo.convert_frame: [WARNING] torch._dynamo hit config.cache_size_limit (64)
function: '_chunk' (/data/home/ezyang/local/a/pytorch-env/lib/python3.9/site-packages/transformers/models/longformer/modeling_longformer.py:770)
reasons: hidden_states.stride()[0] == hidden_states.size()[2] and hidden_states.stride()[1] == hidden_states.size()[0]*hidden_states.size()[2] and hidden_states.stride()[2] == 1 and hidden_states.storage_offset() == 0 and Eq(hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2], 512*hidden_states.size()[0]*hidden_states.size()[2]*hidden_states.size()[1]//512) and Eq(Mod(hidden_states.size()[1], hidden_states.size()[1]//512), 0) and Ne(hidden_states.size()[1]/hidden_states.size()[1]//512, 1) and Ne(hidden_states.size()[1]//512, 1) and Ne(hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512, 1) and hidden_states.size()[1]//512 >= 2 and hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512 >= hidden_states.size()[2] and hidden_states.size()[1]/hidden_states.size()[1]//512 >= 2 and hidden_states.size()[0]*hidden_states.size()[2] < hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512 and Ne(hidden_states.size()[1]//512, 0) and hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512 >= 0 and hidden_states.size()[1]//512 > 1 and Eq(hidden_states.size()[1]/hidden_states.size()[1]//512, 512) and Ne(2*hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2] - hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512, 0) and Ne(2*hidden_states.size()[1]//512 - 1, 1) and Ne((hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512)//2, 1) and 2*hidden_states.size()[1]//512 - 1 >= 2 and (hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512)//2 >= hidden_states.size()[2] and hidden_states.size()[0]*hidden_states.size()[2] < (hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512)//2 and Ne((hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512)//2, hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512) and Ne((hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512)//2, 0) and (hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512)//2 >= 0 and 2*hidden_states.size()[1]//512 - 1 > 1 and 1 < 2*hidden_states.size()[1]//512*(hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512)//2 - (hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512)//2 and hidden_states.size()[1]/hidden_states.size()[1]//512 >= 0 and Ne(2*hidden_states.size()[1]//512 - 1, -1) and 2*hidden_states.size()[1]//512 - 1 >= 0 and hidden_states.size()[0] != 0 and hidden_states.size()[0] != 1 and hidden_states.size()[1] != 0 and hidden_states.size()[1] != 1 and hidden_states.size()[2] != 0 and hidden_states.size()[2] != 1
to diagnose recompilation issues, see https://github.com/pytorch/torchdynamo/blob/main/TROUBLESHOOTING.md.
PASS
Dynamo produced 124 graph(s) covering 1682 ops
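The cache_size_limit warning above means Dynamo recompiled '_chunk' more than 64 times under different shape guards. If that recompilation is expected, the limit can be raised; a hedged sketch (128 is an arbitrary value, not taken from this run):

    import torch._dynamo

    # Default is the 64 hit above; raising it trades compile time for fewer
    # cache-limit fallbacks on shape-polymorphic functions like '_chunk'.
    torch._dynamo.config.cache_size_limit = 128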
cuda train hf_Reformer PASS
Dynamo produced 60 graph(s) covering 518 ops
cuda train hf_T5 WARNING:common:fp64 golden ref were not generated for hf_T5. Setting accuracy check to cosine
PASS
Dynamo produced 1 graph(s) covering 881 ops
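When no fp64 golden reference can be generated (as for hf_T5 above), the harness falls back to a cosine-similarity accuracy check. A purely hypothetical illustration of that kind of check (cosine_close and its 0.99 threshold are invented for this sketch; the real logic lives in benchmarks/dynamo/common.py):

    import torch

    def cosine_close(ref, res, threshold=0.99):
        # Hypothetical: compare flattened outputs by cosine similarity
        # instead of elementwise tolerance against an fp64 reference.
        sim = torch.nn.functional.cosine_similarity(
            ref.flatten().float(), res.flatten().float(), dim=0
        )
        return sim.item() >= threshold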
cuda train hf_T5_base WARNING:common:fp64 golden ref were not generated for hf_T5_base. Setting accuracy check to cosine
PASS
Dynamo produced 1 graph(s) covering 1643 ops
cuda train hf_T5_large PASS
Dynamo produced 0 graph(s) covering 0 ops
cuda train lennard_jones PASS
Dynamo produced 1 graph(s) covering 9 ops
cuda train maml_omniglot PASS
Dynamo produced 2 graph(s) covering 28 ops
cuda train mnasnet1_0 PASS
Dynamo produced 1 graph(s) covering 152 ops
cuda train mobilenet_v2 PASS
Dynamo produced 1 graph(s) covering 153 ops
cuda train mobilenet_v2_quantized_qat WARNING:common:fp64 golden ref were not generated for mobilenet_v2_quantized_qat. Setting accuracy check to cosine
ERROR:common:output with shape [1] doesn't match the broadcast shape [32]
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1184, in check_accuracy
new_result = optimized_model_iter_fn(model_copy, example_inputs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/eval_frame.py", line 212, in _fn
return fn(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1061, in run_n_iterations
self.model_iter_fn(mod, inputs, collect_outputs=False)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 351, in forward_and_backward_pass
cloned_inputs = clone_inputs(inputs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 352, in <graph break in forward_and_backward_pass>
self.optimizer_zero_grad(mod)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 354, in <graph break in forward_and_backward_pass>
pred = mod(*cloned_inputs)
File "/scratch/ezyang/work/a/pytorch/torch/fx/graph_module.py", line 660, in call_wrapped
return self._wrapped_call(self, *args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/fx/graph_module.py", line 279, in __call__
raise e
File "/scratch/ezyang/work/a/pytorch/torch/fx/graph_module.py", line 269, in __call__
return super(self.cls, obj).__call__(*args, **kwargs) # type: ignore[misc]
File "/scratch/ezyang/work/a/pytorch/torch/nn/modules/module.py", line 1482, in _call_impl
return forward_call(*args, **kwargs)
File "<eval_with_key>.8", line 4, in forward
def forward(self, x : torch.Tensor) -> torch.Tensor:
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/eval_frame.py", line 212, in _fn
return fn(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_functorch/aot_autograd.py", line 2476, in forward
return compiled_fn(full_args)
File "/scratch/ezyang/work/a/pytorch/torch/_functorch/aot_autograd.py", line 1024, in g
return f(*args)
File "/scratch/ezyang/work/a/pytorch/torch/_functorch/aot_autograd.py", line 2045, in debug_compiled_function
return compiled_function(*args)
File "/scratch/ezyang/work/a/pytorch/torch/_functorch/aot_autograd.py", line 1962, in compiled_function
original_inpt.copy_(updated_inpt)
RuntimeError: output with shape [1] doesn't match the broadcast shape [32]
TorchDynamo optimized model failed to run because of following error
FAIL
Dynamo produced 1 graph(s) covering 203 ops
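The FAIL above ends in AOT Autograd copying an updated input back onto the original (original_inpt.copy_(updated_inpt)) with mismatched shapes. The error class reproduces in isolation (this says nothing about the QAT model itself):

    import torch

    dst = torch.zeros(1)
    src = torch.zeros(32)
    try:
        dst.copy_(src)  # a [32] source cannot broadcast into a [1] destination
    except RuntimeError as e:
        # output with shape [1] doesn't match the broadcast shape [32]
        print(e)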
cuda train mobilenet_v3_large PASS
Dynamo produced 1 graph(s) covering 187 ops
cuda train moco [2022-12-17 16:24:06,568] torch._dynamo.convert_frame: [WARNING] torch._dynamo hit config.cache_size_limit (64)
function: '<graph break in _momentum_update_key_encoder>' (/scratch/ezyang/work/a/torchbenchmark/torchbenchmark/models/moco/moco/builder.py:50)
reasons: ___tuple_iterator_len(___stack0) == 160
to diagnose recompilation issues, see https://github.com/pytorch/torchdynamo/blob/main/TROUBLESHOOTING.md.
ERROR:common:
from user code:
File "/scratch/ezyang/work/a/torchbenchmark/torchbenchmark/models/moco/moco/builder.py", line 172, in concat_all_gather
torch.distributed.all_gather(tensors_gather, tensor, async_op=False)
Set torch._dynamo.config.verbose=True for more information
You can suppress this exception and fall back to eager by setting:
torch._dynamo.config.suppress_errors = True
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/torch/_subclasses/fake_tensor.py", line 915, in __torch_dispatch__
r = func(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_ops.py", line 284, in __call__
return self._op(*args, **kwargs or {})
File "/scratch/ezyang/work/a/pytorch/torch/_ops.py", line 377, in _get_dispatch
final_key = resolve_key(self, key)
File "/scratch/ezyang/work/a/pytorch/torch/_ops.py", line 106, in resolve_key
raise NotImplementedError(f"could not find kernel for {op} at dispatch key {k}")
NotImplementedError: could not find kernel for c10d.allgather_.default at dispatch key DispatchKey.Meta
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1055, in run_node
return node.target(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/distributed/distributed_c10d.py", line 1429, in wrapper
return func(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/distributed/distributed_c10d.py", line 2424, in all_gather
work = default_pg.allgather([tensor_list], [tensor])
File "/scratch/ezyang/work/a/pytorch/torch/_subclasses/fake_tensor.py", line 920, in __torch_dispatch__
return run_fallback_kernel(self, func, args, kwargs, not_implemented_error)
File "/scratch/ezyang/work/a/pytorch/torch/_subclasses/fake_tensor.py", line 1099, in run_fallback_kernel
args = tree_map(to_real_tensor, args)
File "/scratch/ezyang/work/a/pytorch/torch/utils/_pytree.py", line 195, in tree_map
return tree_unflatten([fn(i) for i in flat_args], spec)
File "/scratch/ezyang/work/a/pytorch/torch/utils/_pytree.py", line 195, in <listcomp>
return tree_unflatten([fn(i) for i in flat_args], spec)
File "/scratch/ezyang/work/a/pytorch/torch/_subclasses/fake_tensor.py", line 1092, in to_real_tensor
out = torch.zeros_like(e, device=e.fake_device)
RuntimeError: Cannot call strides() on tensor with symbolic sizes/strides
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1014, in get_fake_value
return wrap_fake_exception(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 704, in wrap_fake_exception
return fn()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1015, in <lambda>
lambda: run_node(tx.output, node, args, kwargs, nnmodule)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1064, in run_node
raise RuntimeError(
RuntimeError: Failed running call_function <function all_gather at 0x7f2ee2449dc0>(*([FakeTensor(FakeTensor(..., device='meta', size=(s0, s1, s2, s2)), cuda:0)], FakeTensor(FakeTensor(..., device='meta', size=(s0, s1, s2, s2)), cuda:0)), **{'async_op': False}):
Cannot call strides() on tensor with symbolic sizes/strides
(scroll up for backtrace)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1184, in check_accuracy
new_result = optimized_model_iter_fn(model_copy, example_inputs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/eval_frame.py", line 212, in _fn
return fn(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1061, in run_n_iterations
self.model_iter_fn(mod, inputs, collect_outputs=False)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 351, in forward_and_backward_pass
cloned_inputs = clone_inputs(inputs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 352, in <graph break in forward_and_backward_pass>
self.optimizer_zero_grad(mod)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 354, in <graph break in forward_and_backward_pass>
pred = mod(*cloned_inputs)
File "/scratch/ezyang/work/a/pytorch/torch/nn/modules/module.py", line 1482, in _call_impl
return forward_call(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/nn/parallel/distributed.py", line 1098, in forward
output = self._run_ddp_forward(*inputs, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/nn/parallel/distributed.py", line 1051, in _run_ddp_forward
return module_to_run(*inputs[0], **kwargs[0]) # type: ignore[index]
File "/scratch/ezyang/work/a/pytorch/torch/nn/modules/module.py", line 1482, in _call_impl
return forward_call(*args, **kwargs)
File "/scratch/ezyang/work/a/torchbenchmark/torchbenchmark/models/moco/moco/builder.py", line 130, in forward
self._momentum_update_key_encoder() # update the key encoder
File "/scratch/ezyang/work/a/torchbenchmark/torchbenchmark/models/moco/moco/builder.py", line 133, in <graph break in forward>
im_k, idx_unshuffle = self._batch_shuffle_ddp(im_k)
File "/scratch/ezyang/work/a/pytorch/torch/autograd/grad_mode.py", line 34, in decorate_context
return func(*args, **kwargs)
File "/scratch/ezyang/work/a/torchbenchmark/torchbenchmark/models/moco/moco/builder.py", line 76, in _batch_shuffle_ddp
x_gather = concat_all_gather(x)
File "/scratch/ezyang/work/a/pytorch/torch/autograd/grad_mode.py", line 34, in decorate_context
return func(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/eval_frame.py", line 330, in catch_errors
return hijacked_callback(frame, cache_size, hooks)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 480, in _convert_frame
result = inner_convert(frame, cache_size, hooks)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 103, in _fn
return fn(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 90, in time_wrapper
r = func(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 339, in _convert_frame_assert
return _compile(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 400, in _compile
out_code = transform_code_object(code, transform)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/bytecode_transformation.py", line 341, in transform_code_object
transformations(instructions, code_options)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 387, in transform
tracer.run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1684, in run
super().run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 538, in run
and self.step()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 501, in step
getattr(self, inst.opname)(inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 307, in wrapper
return inner_fn(self, inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1015, in CALL_FUNCTION_KW
self.call_function(fn, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/torch.py", line 468, in call_function
tensor_variable = wrap_fx_proxy(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/builder.py", line 733, in wrap_fx_proxy
return wrap_fx_proxy_cls(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/builder.py", line 773, in wrap_fx_proxy_cls
example_value = get_fake_value(proxy.node, tx)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1034, in get_fake_value
raise TorchRuntimeError() from e
torch._dynamo.exc.TorchRuntimeError:
from user code:
File "/scratch/ezyang/work/a/torchbenchmark/torchbenchmark/models/moco/moco/builder.py", line 172, in concat_all_gather
torch.distributed.all_gather(tensors_gather, tensor, async_op=False)
Set torch._dynamo.config.verbose=True for more information
You can suppress this exception and fall back to eager by setting:
torch._dynamo.config.suppress_errors = True
TorchDynamo optimized model failed to run because of following error
FAIL
Dynamo produced 68 graph(s) covering 507 ops
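The root cause above is a missing Meta kernel for the c10d.allgather_ collective once fake tensors with symbolic shapes are involved. The log's own suggested workaround, as a snippet:

    import torch._dynamo

    # Suppress compilation errors and fall back to eager for frames Dynamo
    # cannot handle (here, concat_all_gather's torch.distributed.all_gather).
    torch._dynamo.config.suppress_errors = True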
cuda train nvidia_deeprecommender PASS
Dynamo produced 1 graph(s) covering 13 ops
cuda train phlippe_densenet PASS
Dynamo produced 1 graph(s) covering 186 ops
cuda train phlippe_resnet PASS
Dynamo produced 1 graph(s) covering 71 ops
--dataroot /scratch/ezyang/work/a/torchbenchmark/torchbenchmark/data/.data/pytorch_CycleGAN_and_pix2pix_inputs/datasets/horse2zebra --name horse2zebra --model cycle_gan --display_id 0 --n_epochs 3 --n_epochs_decay 3 --gpu_ids 0 --checkpoints_dir /scratch/ezyang/work/a/torchbenchmark/torchbenchmark/models/pytorch_CycleGAN_and_pix2pix/.data/checkpoints
cuda train pytorch_CycleGAN_and_pix2pix PASS
Dynamo produced 1 graph(s) covering 91 ops
cuda train pytorch_stargan PASS
Dynamo produced 1 graph(s) covering 60 ops
cuda train pytorch_struct PASS
Dynamo produced 1 graph(s) covering 47 ops
cuda train pytorch_unet PASS
Dynamo produced 2 graph(s) covering 270 ops
cuda train resnet152 PASS
Dynamo produced 1 graph(s) covering 515 ops
cuda train resnet18 PASS
Dynamo produced 1 graph(s) covering 69 ops
cuda train resnet50 PASS
Dynamo produced 1 graph(s) covering 175 ops
cuda train resnet50_quantized_qat WARNING:common:fp64 golden ref were not generated for resnet50_quantized_qat. Setting accuracy check to cosine
ERROR:common:output with shape [1] doesn't match the broadcast shape [64]
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1184, in check_accuracy
new_result = optimized_model_iter_fn(model_copy, example_inputs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/eval_frame.py", line 212, in _fn
return fn(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1061, in run_n_iterations
self.model_iter_fn(mod, inputs, collect_outputs=False)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 351, in forward_and_backward_pass
cloned_inputs = clone_inputs(inputs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 352, in <graph break in forward_and_backward_pass>
self.optimizer_zero_grad(mod)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 354, in <graph break in forward_and_backward_pass>
pred = mod(*cloned_inputs)
File "/scratch/ezyang/work/a/pytorch/torch/fx/graph_module.py", line 660, in call_wrapped
return self._wrapped_call(self, *args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/fx/graph_module.py", line 279, in __call__
raise e
File "/scratch/ezyang/work/a/pytorch/torch/fx/graph_module.py", line 269, in __call__
return super(self.cls, obj).__call__(*args, **kwargs) # type: ignore[misc]
File "/scratch/ezyang/work/a/pytorch/torch/nn/modules/module.py", line 1482, in _call_impl
return forward_call(*args, **kwargs)
File "<eval_with_key>.8", line 4, in forward
def forward(self, x : torch.Tensor) -> torch.Tensor:
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/eval_frame.py", line 212, in _fn
return fn(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_functorch/aot_autograd.py", line 2476, in forward
return compiled_fn(full_args)
File "/scratch/ezyang/work/a/pytorch/torch/_functorch/aot_autograd.py", line 1024, in g
return f(*args)
File "/scratch/ezyang/work/a/pytorch/torch/_functorch/aot_autograd.py", line 2045, in debug_compiled_function
return compiled_function(*args)
File "/scratch/ezyang/work/a/pytorch/torch/_functorch/aot_autograd.py", line 1962, in compiled_function
original_inpt.copy_(updated_inpt)
RuntimeError: output with shape [1] doesn't match the broadcast shape [64]
TorchDynamo optimized model failed to run because of following error
FAIL
Dynamo produced 1 graph(s) covering 163 ops
cuda train resnext50_32x4d PASS
Dynamo produced 1 graph(s) covering 175 ops
cuda train shufflenet_v2_x1_0 PASS
Dynamo produced 1 graph(s) covering 367 ops
/data/home/ezyang/local/a/pytorch-env/lib/python3.9/site-packages/gym/core.py:317: DeprecationWarning: WARN: Initializing wrapper in old step API which returns one bool instead of two. It is recommended to set `new_step_api=True` to use new step API. This will be the default behaviour in future.
deprecation(
/data/home/ezyang/local/a/pytorch-env/lib/python3.9/site-packages/gym/wrappers/step_api_compatibility.py:39: DeprecationWarning: WARN: Initializing environment in old step API which returns one bool instead of two. It is recommended to set `new_step_api=True` to use new step API. This will be the default behaviour in future.
deprecation(
cuda train soft_actor_critic PASS
Dynamo produced 3 graph(s) covering 20 ops
cuda train speech_transformer [2022-12-17 16:31:25,292] torch._dynamo.variables.builtin: [WARNING] incorrect arg count <bound method BuiltinVariable._call_min_max of BuiltinVariable(max)> missing a required argument: 'b' and no constant handler
PASS
Dynamo produced 15 graph(s) covering 849 ops
cuda train squeezenet1_1 PASS
Dynamo produced 1 graph(s) covering 66 ops
cuda train tacotron2 ERROR:common:one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [4, 80, 724]], which is output 0 of AsStridedBackward0, is at version 2; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1184, in check_accuracy
new_result = optimized_model_iter_fn(model_copy, example_inputs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/eval_frame.py", line 212, in _fn
return fn(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1061, in run_n_iterations
self.model_iter_fn(mod, inputs, collect_outputs=False)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 351, in forward_and_backward_pass
cloned_inputs = clone_inputs(inputs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 352, in <graph break in forward_and_backward_pass>
self.optimizer_zero_grad(mod)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 356, in <graph break in forward_and_backward_pass>
self.grad_scaler.scale(loss).backward()
File "/scratch/ezyang/work/a/pytorch/torch/_tensor.py", line 484, in backward
torch.autograd.backward(
File "/scratch/ezyang/work/a/pytorch/torch/autograd/__init__.py", line 197, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
File "/scratch/ezyang/work/a/pytorch/torch/autograd/function.py", line 273, in apply
return user_fn(self, *args)
File "/scratch/ezyang/work/a/pytorch/torch/_functorch/aot_autograd.py", line 1871, in backward
list(ctx.symints) + list(ctx.saved_tensors) + list(contiguous_args)
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [4, 80, 724]], which is output 0 of AsStridedBackward0, is at version 2; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
TorchDynamo optimized model failed to run because of following error
FAIL
Dynamo produced 11 graph(s) covering 28361 ops
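The tacotron2 failure is a saved-tensor version-counter error in the AOT Autograd backward; the RuntimeError itself suggests anomaly detection to locate the in-place op:

    import torch

    # Per the hint above: record forward-pass provenance so the failing
    # backward reports which operation mutated the saved tensor.
    torch.autograd.set_detect_anomaly(True)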
cuda train timm_efficientdet ERROR:common:
from user code:
File "/data/home/ezyang/local/a/pytorch-env/lib/python3.9/site-packages/effdet/efficientdet.py", line 211, in forward
input_node = resample(input_node)
File "/data/home/ezyang/local/a/pytorch-env/lib/python3.9/site-packages/effdet/efficientdet.py", line 134, in forward
return F.interpolate(
Set torch._dynamo.config.verbose=True for more information
You can suppress this exception and fall back to eager by setting:
torch._dynamo.config.suppress_errors = True
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1055, in run_node
return node.target(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/nn/functional.py", line 3924, in interpolate
return torch._C._nn.upsample_nearest2d(input, output_size, scale_factors)
RuntimeError: Cannot call sizes() on tensor with symbolic sizes/strides
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1014, in get_fake_value
return wrap_fake_exception(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 704, in wrap_fake_exception
return fn()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1015, in <lambda>
lambda: run_node(tx.output, node, args, kwargs, nnmodule)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1064, in run_node
raise RuntimeError(
RuntimeError: Failed running call_function <function interpolate at 0x7f9351c53160>(*(FakeTensor(FakeTensor(..., device='meta',
size=(s0, 88, ceiling(ceiling(ceiling(ceiling(ceiling(ceiling(ceiling(s2/2)/2)/2)/2)/2)/2)/2), ceiling(ceiling(ceiling(ceiling(ceiling(ceiling(ceiling(s2/2)/2)/2)/2)/2)/2)/2)),
grad_fn=<MaxPool2DWithIndicesBackward0>), cuda:0), (10, 10), None, 'nearest', None), **{'recompute_scale_factor': False}):
Cannot call sizes() on tensor with symbolic sizes/strides
(scroll up for backtrace)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1184, in check_accuracy
new_result = optimized_model_iter_fn(model_copy, example_inputs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/eval_frame.py", line 212, in _fn
return fn(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1061, in run_n_iterations
self.model_iter_fn(mod, inputs, collect_outputs=False)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 351, in forward_and_backward_pass
cloned_inputs = clone_inputs(inputs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 352, in <graph break in forward_and_backward_pass>
self.optimizer_zero_grad(mod)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 354, in <graph break in forward_and_backward_pass>
pred = mod(*cloned_inputs)
File "/scratch/ezyang/work/a/pytorch/torch/nn/modules/module.py", line 1482, in _call_impl
return forward_call(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/eval_frame.py", line 333, in catch_errors
return callback(frame, cache_size, hooks)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 480, in _convert_frame
result = inner_convert(frame, cache_size, hooks)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 103, in _fn
return fn(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 90, in time_wrapper
r = func(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 339, in _convert_frame_assert
return _compile(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 400, in _compile
out_code = transform_code_object(code, transform)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/bytecode_transformation.py", line 341, in transform_code_object
transformations(instructions, code_options)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 387, in transform
tracer.run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1684, in run
super().run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 538, in run
and self.step()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 501, in step
getattr(self, inst.opname)(inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 307, in wrapper
return inner_fn(self, inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 966, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/nn_module.py", line 220, in call_function
return tx.inline_user_function_return(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 471, in inline_user_function_return
result = InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1762, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1817, in inline_call_
tracer.run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 538, in run
and self.step()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 501, in step
getattr(self, inst.opname)(inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 307, in wrapper
return inner_fn(self, inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 966, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/nn_module.py", line 220, in call_function
return tx.inline_user_function_return(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 471, in inline_user_function_return
result = InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1762, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1817, in inline_call_
tracer.run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 538, in run
and self.step()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 501, in step
getattr(self, inst.opname)(inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 307, in wrapper
return inner_fn(self, inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 966, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/nn_module.py", line 220, in call_function
return tx.inline_user_function_return(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 471, in inline_user_function_return
result = InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1762, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1817, in inline_call_
tracer.run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 538, in run
and self.step()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 501, in step
getattr(self, inst.opname)(inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 307, in wrapper
return inner_fn(self, inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 966, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/nn_module.py", line 220, in call_function
return tx.inline_user_function_return(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 471, in inline_user_function_return
result = InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1762, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1817, in inline_call_
tracer.run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 538, in run
and self.step()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 501, in step
getattr(self, inst.opname)(inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 307, in wrapper
return inner_fn(self, inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 966, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/nn_module.py", line 220, in call_function
return tx.inline_user_function_return(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 471, in inline_user_function_return
result = InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1762, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1817, in inline_call_
tracer.run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 538, in run
and self.step()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 501, in step
getattr(self, inst.opname)(inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 307, in wrapper
return inner_fn(self, inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 966, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/nn_module.py", line 220, in call_function
return tx.inline_user_function_return(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 471, in inline_user_function_return
result = InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1762, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1817, in inline_call_
tracer.run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 538, in run
and self.step()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 501, in step
getattr(self, inst.opname)(inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 307, in wrapper
return inner_fn(self, inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 966, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/nn_module.py", line 182, in call_function
tx.call_function(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/nn_module.py", line 220, in call_function
return tx.inline_user_function_return(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 471, in inline_user_function_return
result = InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1762, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1817, in inline_call_
tracer.run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 538, in run
and self.step()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 501, in step
getattr(self, inst.opname)(inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 307, in wrapper
return inner_fn(self, inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1015, in CALL_FUNCTION_KW
self.call_function(fn, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/torch.py", line 468, in call_function
tensor_variable = wrap_fx_proxy(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/builder.py", line 733, in wrap_fx_proxy
return wrap_fx_proxy_cls(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/builder.py", line 773, in wrap_fx_proxy_cls
example_value = get_fake_value(proxy.node, tx)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1034, in get_fake_value
raise TorchRuntimeError() from e
torch._dynamo.exc.TorchRuntimeError:
from user code:
File "/data/home/ezyang/local/a/pytorch-env/lib/python3.9/site-packages/effdet/efficientdet.py", line 211, in forward
input_node = resample(input_node)
File "/data/home/ezyang/local/a/pytorch-env/lib/python3.9/site-packages/effdet/efficientdet.py", line 134, in forward
return F.interpolate(
Set torch._dynamo.config.verbose=True for more information
You can suppress this exception and fall back to eager by setting:
torch._dynamo.config.suppress_errors = True
TorchDynamo optimized model failed to run because of following error
FAIL
Dynamo produced 0 graph(s) covering 0 ops
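timm_efficientdet fails inside F.interpolate on a fake tensor with symbolic sizes ("Cannot call sizes() on tensor with symbolic sizes/strides"). As the error text suggests, fuller diagnostics come from:

    import torch._dynamo

    # Per the suggestion printed above; emits detailed Dynamo debug output.
    torch._dynamo.config.verbose = True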
cuda train timm_efficientnet PASS
Dynamo produced 1 graph(s) covering 313 ops
cuda train timm_regnet PASS
Dynamo produced 1 graph(s) covering 458 ops
cuda train timm_resnest PASS
Dynamo produced 1 graph(s) covering 180 ops
cuda train timm_vision_transformer PASS
Dynamo produced 1 graph(s) covering 441 ops
cuda train timm_vision_transformer_large PASS
Dynamo produced 0 graph(s) covering 0 ops
cuda train timm_vovnet PASS
Dynamo produced 1 graph(s) covering 169 ops
cuda train tts_angular [2022-12-17 16:53:45,348] torch._dynamo.optimizations.training: [WARNING] Unable to use Aot Autograd because of presence of LSTM
[2022-12-17 16:53:45,430] torch._dynamo.optimizations.training: [WARNING] Unable to use Aot Autograd because of presence of LSTM
[2022-12-17 16:53:45,506] torch._dynamo.optimizations.training: [WARNING] Unable to use Aot Autograd because of presence of LSTM
PASS
Dynamo produced 4 graph(s) covering 11 ops
cuda train vgg16 PASS
Dynamo produced 1 graph(s) covering 40 ops
cuda train vision_maskrcnn Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/torchbench.py", line 368, in <module>
main(TorchBenchmarkRunner(), original_dir)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1697, in main
return maybe_fresh_cache(run, args.cold_start_latency and args.only)(
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 863, in inner
return fn(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 2076, in run
runner.run_one_model(
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1360, in run_one_model
status = self.check_accuracy(
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1166, in check_accuracy
if not same(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 739, in same
return len(ref) == len(res) and all(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 740, in <genexpr>
same(ai, bi, fp64_refi, cos_similarity, tol, equal_nan, exact_dtype)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 739, in same
return len(ref) == len(res) and all(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 740, in <genexpr>
same(ai, bi, fp64_refi, cos_similarity, tol, equal_nan, exact_dtype)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 750, in same
same(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 807, in same
ref_error = rmse(fp64_ref, ref).item()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 722, in rmse
return torch.sqrt(torch.mean(torch.square(ref - res)))
RuntimeError: The size of tensor a (38) must match the size of tensor b (39) at non-singleton dimension 0
ERROR
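The vision_maskrcnn ERROR is in the accuracy check itself: rmse assumes ref and res share a shape, but the eager and compiled runs returned different numbers of detections (38 vs. 39). The mismatch in isolation, using the same formula as torch/_dynamo/utils.py in the trace:

    import torch

    def rmse(ref, res):
        return torch.sqrt(torch.mean(torch.square(ref - res)))

    try:
        rmse(torch.zeros(38), torch.zeros(39))
    except RuntimeError as e:
        # The size of tensor a (38) must match the size of tensor b (39) ...
        print(e)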
cuda train yolov3 [2022-12-17 16:55:28,573] torch._dynamo.convert_frame: [WARNING] torch._dynamo hit config.cache_size_limit (64)
function: 'forward' (/scratch/ezyang/work/a/pytorch/torch/nn/modules/container.py:202)
reasons: ___check_obj_id(self, 140578271510384)
to diagnose recompilation issues, see https://github.com/pytorch/torchdynamo/blob/main/TROUBLESHOOTING.md.
PASS
Dynamo produced 93 graph(s) covering 349 ops
cuda train AlbertForMaskedLM PASS
Dynamo produced 4 graph(s) covering 574 ops
cuda train AlbertForQuestionAnswering PASS
Dynamo produced 4 graph(s) covering 577 ops
cuda train AllenaiLongformerBase [2022-12-17 16:57:48,426] torch._dynamo.convert_frame: [WARNING] torch._dynamo hit config.cache_size_limit (64)
function: '_chunk' (/data/home/ezyang/local/a/pytorch-env/lib/python3.9/site-packages/transformers/models/longformer/modeling_longformer.py:770)
reasons: hidden_states.stride()[0] == hidden_states.size()[2] and hidden_states.stride()[1] == hidden_states.size()[0]*hidden_states.size()[2] and hidden_states.stride()[2] == 1 and hidden_states.storage_offset() == 0 and Eq(hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2], 512*hidden_states.size()[0]*hidden_states.size()[2]*hidden_states.size()[1]//512) and Eq(Mod(hidden_states.size()[1], hidden_states.size()[1]//512), 0) and Ne(hidden_states.size()[1]/hidden_states.size()[1]//512, 1) and Ne(hidden_states.size()[1]//512, 1) and Ne(hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512, 1) and hidden_states.size()[1]//512 >= 2 and hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512 >= hidden_states.size()[2] and hidden_states.size()[1]/hidden_states.size()[1]//512 >= 2 and hidden_states.size()[0]*hidden_states.size()[2] < hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512 and Ne(hidden_states.size()[1]//512, 0) and hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512 >= 0 and hidden_states.size()[1]//512 > 1 and Eq(hidden_states.size()[1]/hidden_states.size()[1]//512, 512) and Ne(2*hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2] - hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512, 0) and Ne(2*hidden_states.size()[1]//512 - 1, 1) and Ne((hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512)//2, 1) and 2*hidden_states.size()[1]//512 - 1 >= 2 and (hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512)//2 >= hidden_states.size()[2] and hidden_states.size()[0]*hidden_states.size()[2] < (hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512)//2 and Ne((hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512)//2, hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512) and Ne((hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512)//2, 0) and (hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512)//2 >= 0 and 2*hidden_states.size()[1]//512 - 1 > 1 and 1 < 2*hidden_states.size()[1]//512*(hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512)//2 - (hidden_states.size()[1]*hidden_states.size()[0]*hidden_states.size()[2]/hidden_states.size()[1]//512)//2 and hidden_states.size()[1]/hidden_states.size()[1]//512 >= 0 and Ne(2*hidden_states.size()[1]//512 - 1, -1) and 2*hidden_states.size()[1]//512 - 1 >= 0 and hidden_states.size()[0] != 0 and hidden_states.size()[0] != 1 and hidden_states.size()[1] != 0 and hidden_states.size()[1] != 1 and hidden_states.size()[2] != 0 and hidden_states.size()[2] != 1
to diagnose recompilation issues, see https://github.com/pytorch/torchdynamo/blob/main/TROUBLESHOOTING.md.
PASS
Dynamo produced 124 graph(s) covering 1685 ops
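Note on the cache_size_limit hit above: the 64 in the warning is dynamo's per-code-object recompile budget, and every guard-set miss on `_chunk` burns one entry, so Longformer exhausts it under dynamic shapes. A minimal sketch of raising the budget, assuming this nightly's plain-attribute config (the `toy` function is illustrative, not from the sweep):

    import torch
    import torch._dynamo as dynamo

    # Raise the per-frame recompile budget before the first compiled call;
    # 64 was the default when this sweep ran.
    dynamo.config.cache_size_limit = 128

    @dynamo.optimize("aot_eager")
    def toy(x):
        return x * 2

    toy(torch.randn(8))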
cuda train BartForCausalLM PASS
Dynamo produced 26 graph(s) covering 412 ops
cuda train BartForConditionalGeneration PASS
Dynamo produced 76 graph(s) covering 1131 ops
cuda train BertForMaskedLM PASS
Dynamo produced 5 graph(s) covering 559 ops
cuda train BertForQuestionAnswering PASS
Dynamo produced 5 graph(s) covering 569 ops
cuda train BlenderbotForCausalLM PASS
Dynamo produced 0 graph(s) covering 0 ops
cuda train BlenderbotSmallForCausalLM PASS
Dynamo produced 18 graph(s) covering 279 ops
cuda train BlenderbotSmallForConditionalGeneration PASS
Dynamo produced 51 graph(s) covering 753 ops
cuda train CamemBert PASS
Dynamo produced 5 graph(s) covering 572 ops
cuda train DebertaForMaskedLM PASS
Dynamo produced 77 graph(s) covering 1027 ops
cuda train DebertaForQuestionAnswering PASS
Dynamo produced 77 graph(s) covering 1037 ops
cuda train DebertaV2ForMaskedLM PASS
Dynamo produced 0 graph(s) covering 0 ops
cuda train DebertaV2ForQuestionAnswering PASS
Dynamo produced 172 graph(s) covering 2189 ops
WARNING:__main__:Sequence Length not defined for DistilBertForMaskedLM. Choosing 128 arbitrarily
cuda train DistilBertForMaskedLM PASS
Dynamo produced 4 graph(s) covering 221 ops
WARNING:__main__:Sequence Length not defined for DistilBertForQuestionAnswering. Choosing 128 arbitrarily
cuda train DistilBertForQuestionAnswering PASS
Dynamo produced 4 graph(s) covering 231 ops
cuda train DistillGPT2 PASS
Dynamo produced 30 graph(s) covering 462 ops
If you want to use `ElectraForCausalLM` as a standalone, add `is_decoder=True.`
cuda train ElectraForCausalLM PASS
Dynamo produced 5 graph(s) covering 562 ops
cuda train ElectraForQuestionAnswering PASS
Dynamo produced 5 graph(s) covering 568 ops
cuda train GPT2ForSequenceClassification PASS
Dynamo produced 60 graph(s) covering 924 ops
cuda train GoogleFnet PASS
Dynamo produced 28 graph(s) covering 203 ops
cuda train LayoutLMForMaskedLM PASS
Dynamo produced 4 graph(s) covering 564 ops
cuda train LayoutLMForSequenceClassification PASS
Dynamo produced 5 graph(s) covering 562 ops
WARNING:__main__:Sequence Length not defined for M2M100ForConditionalGeneration. Choosing 128 arbitrarily
cuda train M2M100ForConditionalGeneration PASS
Dynamo produced 101 graph(s) covering 1174 ops
cuda train MBartForCausalLM PASS
Dynamo produced 38 graph(s) covering 412 ops
cuda train MBartForConditionalGeneration PASS
Dynamo produced 100 graph(s) covering 1136 ops
WARNING:__main__:Sequence Length not defined for MT5ForConditionalGeneration. Choosing 128 arbitrarily
cuda train MT5ForConditionalGeneration WARNING:common:fp64 golden ref were not generated for MT5ForConditionalGeneration. Setting accuracy check to cosine
PASS
Dynamo produced 1 graph(s) covering 1282 ops
If you want to use `MegatronBertForCausalLM` as a standalone, add `is_decoder=True.`
cuda train MegatronBertForCausalLM PASS
Dynamo produced 4 graph(s) covering 1105 ops
cuda train MegatronBertForQuestionAnswering PASS
Dynamo produced 4 graph(s) covering 1111 ops
cuda train MobileBertForMaskedLM PASS
Dynamo produced 5 graph(s) covering 1822 ops
cuda train MobileBertForQuestionAnswering PASS
Dynamo produced 5 graph(s) covering 1829 ops
cuda train OPTForCausalLM PASS
Dynamo produced 38 graph(s) covering 464 ops
cuda train PLBartForCausalLM PASS
Dynamo produced 14 graph(s) covering 214 ops
cuda train PLBartForConditionalGeneration PASS
Dynamo produced 40 graph(s) covering 584 ops
WARNING:__main__:Sequence Length not defined for PegasusForCausalLM. Choosing 128 arbitrarily
cuda train PegasusForCausalLM PASS
Dynamo produced 39 graph(s) covering 423 ops
WARNING:__main__:Sequence Length not defined for PegasusForConditionalGeneration. Choosing 128 arbitrarily
cuda train PegasusForConditionalGeneration PASS
Dynamo produced 101 graph(s) covering 1140 ops
If you want to use `RobertaLMHeadModel` as a standalone, add `is_decoder=True.`
cuda train RobertaForCausalLM PASS
Dynamo produced 5 graph(s) covering 576 ops
cuda train RobertaForQuestionAnswering PASS
Dynamo produced 5 graph(s) covering 582 ops
WARNING:__main__:Sequence Length not defined for Speech2Text2ForCausalLM. Choosing 128 arbitrarily
cuda train Speech2Text2ForCausalLM PASS
Dynamo produced 15 graph(s) covering 242 ops
cuda train T5ForConditionalGeneration WARNING:common:fp64 golden ref were not generated for T5ForConditionalGeneration. Setting accuracy check to cosine
PASS
Dynamo produced 1 graph(s) covering 885 ops
cuda train T5Small WARNING:common:fp64 golden ref were not generated for T5Small. Setting accuracy check to cosine
PASS
Dynamo produced 1 graph(s) covering 885 ops
cuda train TrOCRForCausalLM PASS
Dynamo produced 26 graph(s) covering 424 ops
WARNING:__main__:Sequence Length not defined for XGLMForCausalLM. Choosing 128 arbitrarily
cuda train XGLMForCausalLM PASS
Dynamo produced 75 graph(s) covering 843 ops
cuda train XLNetLMHeadModel PASS
Dynamo produced 4 graph(s) covering 1014 ops
cuda train YituTechConvBert PASS
Dynamo produced 5 graph(s) covering 834 ops
cuda train adv_inception_v3 PASS
Dynamo produced 3 graph(s) covering 628 ops
cuda train beit_base_patch16_224 WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod(s4**2, 197) - 0, s4)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((s4**2)//197 - 197, s4)
PASS
Dynamo produced 2 graph(s) covering 515 ops
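Note: the RecursionError warnings above come from the shape-guard simplifier asking sympy to solve modular constraints on beit's 197 tokens (14*14 patches plus the class token) for the free symbol; when sympy.solve recurses too deep, symbolic_shapes logs the warning, keeps the guard unsolved, and the run still passes. An illustrative sketch of the first logged query (the exact exception depends on the sympy version):

    import sympy

    s4 = sympy.Symbol("s4", positive=True, integer=True)
    try:
        # the query behind the first warning: solve Mod(s4**2, 197) == 0 for s4
        sympy.solve(sympy.Mod(s4**2, 197), s4)
    except (RecursionError, NotImplementedError):
        # this sweep's sympy hit RecursionError; other versions may raise
        # NotImplementedError instead
        pass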
cuda train botnet26t_256 PASS
Dynamo produced 35 graph(s) covering 414 ops
cuda train cait_m36_384 PASS
Dynamo produced 2 graph(s) covering 1546 ops
cuda train coat_lite_mini PASS
Dynamo produced 2 graph(s) covering 651 ops
cuda train convit_base WARNING:common:fp64 golden ref were not generated for convit_base. Setting accuracy check to cosine
PASS
Dynamo produced 58 graph(s) covering 1132 ops
cuda train convmixer_768_32 PASS
Dynamo produced 2 graph(s) covering 232 ops
cuda train convnext_base WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((s2 - 4)//4*(Mod((s2 - 4)//4**2/((s2 - 4)//4 + 1) + 2*(s2 - 4)//4/((s2 - 4)//4 + 1) + 1/((s2 - 4)//4 + 1), 1)) + Mod((s2 - 4)//4**2/((s2 - 4)//4 + 1) + 2*(s2 - 4)//4/((s2 - 4)//4 + 1) + 1/((s2 - 4)//4 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(s0*(Mod(((s2 - 4)//4 - 1)//2**2 + 2*((s2 - 4)//4 - 1)//2, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(((s2 - 4)//4 - 1)//2*(Mod(((s2 - 4)//4 - 1)//2**2/(((s2 - 4)//4 - 1)//2 + 1) + 2*((s2 - 4)//4 - 1)//2/(((s2 - 4)//4 - 1)//2 + 1) + 1/(((s2 - 4)//4 - 1)//2 + 1), 1)) + Mod(((s2 - 4)//4 - 1)//2**2/(((s2 - 4)//4 - 1)//2 + 1) + 2*((s2 - 4)//4 - 1)//2/(((s2 - 4)//4 - 1)//2 + 1) + 1/(((s2 - 4)//4 - 1)//2 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(s0*(Mod((((s2 - 4)//4 - 1)//2 - 1)//2**2 + 2*(((s2 - 4)//4 - 1)//2 - 1)//2, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((((s2 - 4)//4 - 1)//2 - 1)//2*(Mod((((s2 - 4)//4 - 1)//2 - 1)//2**2/((((s2 - 4)//4 - 1)//2 - 1)//2 + 1) + 2*(((s2 - 4)//4 - 1)//2 - 1)//2/((((s2 - 4)//4 - 1)//2 - 1)//2 + 1) + 1/((((s2 - 4)//4 - 1)//2 - 1)//2 + 1), 1)) + Mod((((s2 - 4)//4 - 1)//2 - 1)//2**2/((((s2 - 4)//4 - 1)//2 - 1)//2 + 1) + 2*(((s2 - 4)//4 - 1)//2 - 1)//2/((((s2 - 4)//4 - 1)//2 - 1)//2 + 1) + 1/((((s2 - 4)//4 - 1)//2 - 1)//2 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(s0*(Mod(((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2**2 + 2*((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2*(Mod(((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2**2/(((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2 + 1) + 2*((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2/(((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2 + 1) + 1/(((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2 + 1), 1)) + Mod(((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2**2/(((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2 + 1) + 2*((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2/(((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2 + 1) + 1/(((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((s2 - 4)//4*(Mod((s2 - 4)//4**2/((s2 - 4)//4 + 1) + 2*(s2 - 4)//4/((s2 - 4)//4 + 1) + 1/((s2 - 4)//4 + 1), 1)) + Mod((s2 - 4)//4**2/((s2 - 4)//4 + 1) + 2*(s2 - 4)//4/((s2 - 4)//4 + 1) + 1/((s2 - 4)//4 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(s0*(Mod(((s2 - 4)//4 - 1)//2**2 + 2*((s2 - 4)//4 - 1)//2, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(((s2 - 4)//4 - 1)//2*(Mod(((s2 - 4)//4 - 1)//2**2/(((s2 - 4)//4 - 1)//2 + 1) + 2*((s2 - 4)//4 - 1)//2/(((s2 - 4)//4 - 1)//2 + 1) + 1/(((s2 - 4)//4 - 1)//2 + 1), 1)) + Mod(((s2 - 4)//4 - 1)//2**2/(((s2 - 4)//4 - 1)//2 + 1) + 2*((s2 - 4)//4 - 1)//2/(((s2 - 4)//4 - 1)//2 + 1) + 1/(((s2 - 4)//4 - 1)//2 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(s0*(Mod((((s2 - 4)//4 - 1)//2 - 1)//2**2 + 2*(((s2 - 4)//4 - 1)//2 - 1)//2, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((((s2 - 4)//4 - 1)//2 - 1)//2*(Mod((((s2 - 4)//4 - 1)//2 - 1)//2**2/((((s2 - 4)//4 - 1)//2 - 1)//2 + 1) + 2*(((s2 - 4)//4 - 1)//2 - 1)//2/((((s2 - 4)//4 - 1)//2 - 1)//2 + 1) + 1/((((s2 - 4)//4 - 1)//2 - 1)//2 + 1), 1)) + Mod((((s2 - 4)//4 - 1)//2 - 1)//2**2/((((s2 - 4)//4 - 1)//2 - 1)//2 + 1) + 2*(((s2 - 4)//4 - 1)//2 - 1)//2/((((s2 - 4)//4 - 1)//2 - 1)//2 + 1) + 1/((((s2 - 4)//4 - 1)//2 - 1)//2 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(s0*(Mod(((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2**2 + 2*((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2*(Mod(((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2**2/(((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2 + 1) + 2*((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2/(((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2 + 1) + 1/(((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2 + 1), 1)) + Mod(((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2**2/(((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2 + 1) + 2*((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2/(((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2 + 1) + 1/(((((s2 - 4)//4 - 1)//2 - 1)//2 - 1)//2 + 1), 1) - 0, s2)
PASS
Dynamo produced 3 graph(s) covering 1048 ops
cuda train crossvit_9_240 ERROR:common:
from user code:
File "/data/home/ezyang/local/a/pytorch-env/lib/python3.9/site-packages/timm/models/crossvit.py", line 394, in forward_features
x_ = scale_image(x_, ss, self.crop_scale)
File "/data/home/ezyang/local/a/pytorch-env/lib/python3.9/site-packages/timm/models/crossvit.py", line 281, in scale_image
x = torch.nn.functional.interpolate(x, size=ss, mode='bicubic', align_corners=False)
Set torch._dynamo.config.verbose=True for more information
You can suppress this exception and fall back to eager by setting:
torch._dynamo.config.suppress_errors = True
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1055, in run_node
return node.target(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/nn/functional.py", line 3960, in interpolate
return torch._C._nn.upsample_bicubic2d(input, output_size, align_corners, scale_factors)
RuntimeError: Cannot call sizes() on tensor with symbolic sizes/strides
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1014, in get_fake_value
return wrap_fake_exception(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 704, in wrap_fake_exception
return fn()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1015, in <lambda>
lambda: run_node(tx.output, node, args, kwargs, nnmodule)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1064, in run_node
raise RuntimeError(
RuntimeError: Failed running call_function <function interpolate at 0x7f8e72ec8160>(*(FakeTensor(FakeTensor(..., device='meta', size=(s0, 3, 240, 240)), cuda:0),), **{'size': (224, 224), 'mode': 'bicubic', 'align_corners': False}):
Cannot call sizes() on tensor with symbolic sizes/strides
(scroll up for backtrace)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1184, in check_accuracy
new_result = optimized_model_iter_fn(model_copy, example_inputs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/eval_frame.py", line 212, in _fn
return fn(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1061, in run_n_iterations
self.model_iter_fn(mod, inputs, collect_outputs=False)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/timm_models.py", line 315, in forward_and_backward_pass
cloned_inputs = clone_inputs(inputs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/timm_models.py", line 316, in <graph break in forward_and_backward_pass>
self.optimizer_zero_grad(mod)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/timm_models.py", line 318, in <graph break in forward_and_backward_pass>
pred = mod(*cloned_inputs)
File "/scratch/ezyang/work/a/pytorch/torch/nn/modules/module.py", line 1482, in _call_impl
return forward_call(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/eval_frame.py", line 333, in catch_errors
return callback(frame, cache_size, hooks)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 480, in _convert_frame
result = inner_convert(frame, cache_size, hooks)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 103, in _fn
return fn(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 90, in time_wrapper
r = func(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 339, in _convert_frame_assert
return _compile(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 400, in _compile
out_code = transform_code_object(code, transform)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/bytecode_transformation.py", line 341, in transform_code_object
transformations(instructions, code_options)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 387, in transform
tracer.run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1684, in run
super().run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 538, in run
and self.step()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 501, in step
getattr(self, inst.opname)(inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 307, in wrapper
return inner_fn(self, inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 966, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/functions.py", line 244, in call_function
return super().call_function(tx, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/functions.py", line 214, in call_function
return super(UserFunctionVariable, self).call_function(tx, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/functions.py", line 67, in call_function
return tx.inline_user_function_return(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 471, in inline_user_function_return
result = InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1762, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1817, in inline_call_
tracer.run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 538, in run
and self.step()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 501, in step
getattr(self, inst.opname)(inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 307, in wrapper
return inner_fn(self, inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 966, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/functions.py", line 214, in call_function
return super(UserFunctionVariable, self).call_function(tx, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/functions.py", line 67, in call_function
return tx.inline_user_function_return(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 471, in inline_user_function_return
result = InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1762, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1817, in inline_call_
tracer.run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 538, in run
and self.step()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 501, in step
getattr(self, inst.opname)(inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 307, in wrapper
return inner_fn(self, inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1015, in CALL_FUNCTION_KW
self.call_function(fn, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/torch.py", line 468, in call_function
tensor_variable = wrap_fx_proxy(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/builder.py", line 733, in wrap_fx_proxy
return wrap_fx_proxy_cls(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/builder.py", line 773, in wrap_fx_proxy_cls
example_value = get_fake_value(proxy.node, tx)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1034, in get_fake_value
raise TorchRuntimeError() from e
torch._dynamo.exc.TorchRuntimeError:
from user code:
File "/data/home/ezyang/local/a/pytorch-env/lib/python3.9/site-packages/timm/models/crossvit.py", line 394, in forward_features
x_ = scale_image(x_, ss, self.crop_scale)
File "/data/home/ezyang/local/a/pytorch-env/lib/python3.9/site-packages/timm/models/crossvit.py", line 281, in scale_image
x = torch.nn.functional.interpolate(x, size=ss, mode='bicubic', align_corners=False)
Set torch._dynamo.config.verbose=True for more information
You can suppress this exception and fall back to eager by setting:
torch._dynamo.config.suppress_errors = True
TorchDynamo optimized model failed to run because of following error
FAIL
Dynamo produced 0 graph(s) covering 0 ops
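Note: the crossvit failure above is a generic dynamic-shapes gap rather than a model bug; with TORCHDYNAMO_DYNAMIC_SHAPES=1 the fake tensor reaching `upsample_bicubic2d` carries symbolic sizes, and that kernel's fake-tensor path still calls `sizes()`. A minimal sketch that should hit the same TorchRuntimeError on this build (shapes are illustrative):

    import torch
    import torch._dynamo as dynamo

    @dynamo.optimize("aot_eager")
    def scale(x):
        # the same call as timm's scale_image: bicubic resize to a fixed size
        return torch.nn.functional.interpolate(
            x, size=(224, 224), mode="bicubic", align_corners=False)

    # expected on this nightly: "Cannot call sizes() on tensor with
    # symbolic sizes/strides"
    scale(torch.randn(2, 3, 240, 240))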
cuda train cspdarknet53 PASS
Dynamo produced 41 graph(s) covering 407 ops
cuda train deit_base_distilled_patch16_224 PASS
Dynamo produced 2 graph(s) covering 449 ops
cuda train dla102 PASS
Dynamo produced 3 graph(s) covering 832 ops
cuda train dm_nfnet_f0 PASS
Dynamo produced 3 graph(s) covering 1502 ops
cuda train dpn107 PASS
Dynamo produced 2 graph(s) covering 744 ops
cuda train eca_botnext26ts_256 PASS
Dynamo produced 33 graph(s) covering 430 ops
cuda train eca_halonext26ts WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod(s3**2, 64) - 0, s3)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod(((s2 - 1)//2 + 1)//(s2/8)**2, ((128*s0)//((s0*(s2 - 1)//2**2 + 2*s0*(s2 - 1)//2 + s0)//(s2**2/32))*((s2 - 1)//2 + 1)//(s2/8)**2*(s0*(s2 - 1)//2**2 + 2*s0*(s2 - 1)//2 + s0)//(s2**2/32))//(128*s0)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod(((s2 - 1)//2 + 1)//(s2/8)**2, 4) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((((s2 - 1)//2 + 1)//(s2/8)**2)//4 - 4, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(4*(Mod((((s2 - 1)//2 + 1)//(s2/8)**2)//4, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod(((s2 - 1)//2 + 1)//(s2/8)**2, ((128*s0)//((s0*(s2 - 1)//2**2 + 2*s0*(s2 - 1)//2 + s0)//(s2**2/32))*((s2 - 1)//2 + 1)//(s2/8)**2*(s0*(s2 - 1)//2**2 + 2*s0*(s2 - 1)//2 + s0)//(s2**2/32))//(128*s0)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod(((s2 - 1)//2 + 1)//(s2/8)**2, 4) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((((s2 - 1)//2 + 1)//(s2/8)**2)//4 - 4, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(4*(Mod((((s2 - 1)//2 + 1)//(s2/8)**2)//4, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod((128*s0)//((s0*(s2 - 1)//2**2 + 2*s0*(s2 - 1)//2 + s0)//(s2**2/32))*((s2 - 1)//2**2 + 2*(s2 - 1)//2 + 1)//(s2**2/32), 128) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod(s3**2, 16) - 0, s3)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod(s3**2, 64) - 0, s3)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod(s0**3, 64) - 0, s0)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod(((s2 - 1)//2 + 1)//(s2/8)**2, ((128*s0)//((s0*(s2 - 1)//2**2 + 2*s0*(s2 - 1)//2 + s0)//(s2**2/32))*((s2 - 1)//2 + 1)//(s2/8)**2*(s0*(s2 - 1)//2**2 + 2*s0*(s2 - 1)//2 + s0)//(s2**2/32))//(128*s0)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod(((s2 - 1)//2 + 1)//(s2/8)**2, 4) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((((s2 - 1)//2 + 1)//(s2/8)**2)//4 - 4, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(4*(Mod((((s2 - 1)//2 + 1)//(s2/8)**2)//4, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod(((s2 - 1)//2 + 1)//(s2/8)**2, ((128*s0)//((s0*(s2 - 1)//2**2 + 2*s0*(s2 - 1)//2 + s0)//(s2**2/32))*((s2 - 1)//2 + 1)//(s2/8)**2*(s0*(s2 - 1)//2**2 + 2*s0*(s2 - 1)//2 + s0)//(s2**2/32))//(128*s0)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod(((s2 - 1)//2 + 1)//(s2/8)**2, 4) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((((s2 - 1)//2 + 1)//(s2/8)**2)//4 - 4, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(4*(Mod((((s2 - 1)//2 + 1)//(s2/8)**2)//4, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod((128*s0)//((s0*(s2 - 1)//2**2 + 2*s0*(s2 - 1)//2 + s0)//(s2**2/32))*((s2 - 1)//2**2 + 2*(s2 - 1)//2 + 1)//(s2**2/32), 128) - 0, s2)
PASS
Dynamo produced 36 graph(s) covering 493 ops
cuda train ese_vovnet19b_dw PASS
Dynamo produced 2 graph(s) covering 158 ops
cuda train fbnetc_100 PASS
Dynamo produced 2 graph(s) covering 381 ops
cuda train fbnetv3_b PASS
Dynamo produced 2 graph(s) covering 608 ops
cuda train gernet_l PASS
Dynamo produced 2 graph(s) covering 407 ops
cuda train ghostnet_100 PASS
Dynamo produced 2 graph(s) covering 326 ops
cuda train gluon_inception_v3 PASS
Dynamo produced 3 graph(s) covering 628 ops
cuda train gluon_xception65 [2022-12-17 17:58:35,313] torch._dynamo.utils: [ERROR] RMSE (res-fp64): 0.01022, (ref-fp64): 0.00320 and shape=torch.Size([728])
[2022-12-17 17:58:35,313] torch._dynamo.utils: [ERROR] Accuracy failed for key name mid.block19.rep.bn1.weight.grad
FAIL
Dynamo produced 2 graph(s) covering 354 ops
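Note: gluon_xception65 compiles fine and fails only the accuracy check; the harness compares every output and gradient against an fp64 golden reference and flags a key when the compiled run drifts much further from it than eager does (0.01022 vs 0.00320 above, on a 728-element batchnorm weight grad). A sketch of that comparison under assumed names (`rmse` and `grad_close_enough` are illustrative helpers, not the harness's own, and the multiplier is a placeholder; the real tolerance logic lives in the benchmark's accuracy path):

    import torch

    def rmse(a, b):
        return torch.sqrt(torch.mean((a.double() - b.double()) ** 2))

    def grad_close_enough(res, ref, golden, multiplier=2.0):
        # fail the key (here mid.block19.rep.bn1.weight.grad) when the
        # compiled result is much further from the fp64 golden reference
        # than the eager result is
        return rmse(res, golden) <= multiplier * rmse(ref, golden)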
cuda train gmixer_24_224 PASS
Dynamo produced 2 graph(s) covering 640 ops
cuda train gmlp_s16_224 PASS
Dynamo produced 2 graph(s) covering 496 ops
cuda train hrnet_w18 ERROR:common:
from user code:
File "/data/home/ezyang/local/a/pytorch-env/lib/python3.9/site-packages/timm/models/hrnet.py", line 713, in stages
yl = self.stage2(xl)
File "/data/home/ezyang/local/a/pytorch-env/lib/python3.9/site-packages/timm/models/hrnet.py", line 495, in forward
y = y + fuse_outer[j](x[j])
Set torch._dynamo.config.verbose=True for more information
You can suppress this exception and fall back to eager by setting:
torch._dynamo.config.suppress_errors = True
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1060, in run_node
return nnmodule(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/nn/modules/module.py", line 1482, in _call_impl
return forward_call(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/nn/modules/upsampling.py", line 156, in forward
return F.interpolate(input, self.size, self.scale_factor, self.mode, self.align_corners,
File "/scratch/ezyang/work/a/pytorch/torch/nn/functional.py", line 3924, in interpolate
return torch._C._nn.upsample_nearest2d(input, output_size, scale_factors)
RuntimeError: Cannot call sizes() on tensor with symbolic sizes/strides
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1014, in get_fake_value
return wrap_fake_exception(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 704, in wrap_fake_exception
return fn()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1015, in <lambda>
lambda: run_node(tx.output, node, args, kwargs, nnmodule)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1064, in run_node
raise RuntimeError(
RuntimeError: Failed running call_module sub1_1_2(*(FakeTensor(FakeTensor(..., device='meta', size=(s0, 18, (s2 - 1)//8 + 1, (s2 - 1)//8 + 1),
grad_fn=<NativeBatchNormLegitBackward0>), cuda:0),), **{}):
Cannot call sizes() on tensor with symbolic sizes/strides
(scroll up for backtrace)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1184, in check_accuracy
new_result = optimized_model_iter_fn(model_copy, example_inputs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/eval_frame.py", line 212, in _fn
return fn(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1061, in run_n_iterations
self.model_iter_fn(mod, inputs, collect_outputs=False)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/timm_models.py", line 315, in forward_and_backward_pass
cloned_inputs = clone_inputs(inputs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/timm_models.py", line 316, in <graph break in forward_and_backward_pass>
self.optimizer_zero_grad(mod)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/timm_models.py", line 318, in <graph break in forward_and_backward_pass>
pred = mod(*cloned_inputs)
File "/scratch/ezyang/work/a/pytorch/torch/nn/modules/module.py", line 1482, in _call_impl
return forward_call(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/eval_frame.py", line 333, in catch_errors
return callback(frame, cache_size, hooks)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 480, in _convert_frame
result = inner_convert(frame, cache_size, hooks)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 103, in _fn
return fn(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 90, in time_wrapper
r = func(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 339, in _convert_frame_assert
return _compile(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 400, in _compile
out_code = transform_code_object(code, transform)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/bytecode_transformation.py", line 341, in transform_code_object
transformations(instructions, code_options)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 387, in transform
tracer.run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1684, in run
super().run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 538, in run
and self.step()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 501, in step
getattr(self, inst.opname)(inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 307, in wrapper
return inner_fn(self, inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 966, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/functions.py", line 244, in call_function
return super().call_function(tx, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/functions.py", line 214, in call_function
return super(UserFunctionVariable, self).call_function(tx, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/functions.py", line 67, in call_function
return tx.inline_user_function_return(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 471, in inline_user_function_return
result = InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1762, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1817, in inline_call_
tracer.run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 538, in run
and self.step()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 501, in step
getattr(self, inst.opname)(inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 307, in wrapper
return inner_fn(self, inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 966, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/functions.py", line 244, in call_function
return super().call_function(tx, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/functions.py", line 214, in call_function
return super(UserFunctionVariable, self).call_function(tx, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/functions.py", line 67, in call_function
return tx.inline_user_function_return(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 471, in inline_user_function_return
result = InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1762, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1817, in inline_call_
tracer.run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 538, in run
and self.step()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 501, in step
getattr(self, inst.opname)(inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 307, in wrapper
return inner_fn(self, inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 966, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/nn_module.py", line 182, in call_function
tx.call_function(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/nn_module.py", line 220, in call_function
return tx.inline_user_function_return(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 471, in inline_user_function_return
result = InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1762, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1817, in inline_call_
tracer.run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 538, in run
and self.step()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 501, in step
getattr(self, inst.opname)(inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 307, in wrapper
return inner_fn(self, inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 966, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/nn_module.py", line 182, in call_function
tx.call_function(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/nn_module.py", line 201, in call_function
return wrap_fx_proxy(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/builder.py", line 733, in wrap_fx_proxy
return wrap_fx_proxy_cls(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/builder.py", line 773, in wrap_fx_proxy_cls
example_value = get_fake_value(proxy.node, tx)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1034, in get_fake_value
raise TorchRuntimeError() from e
torch._dynamo.exc.TorchRuntimeError:
from user code:
File "/data/home/ezyang/local/a/pytorch-env/lib/python3.9/site-packages/timm/models/hrnet.py", line 713, in stages
yl = self.stage2(xl)
File "/data/home/ezyang/local/a/pytorch-env/lib/python3.9/site-packages/timm/models/hrnet.py", line 495, in forward
y = y + fuse_outer[j](x[j])
Set torch._dynamo.config.verbose=True for more information
You can suppress this exception and fall back to eager by setting:
torch._dynamo.config.suppress_errors = True
TorchDynamo optimized model failed to run because of following error
FAIL
Dynamo produced 0 graph(s) covering 0 ops
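Note: as the trace itself suggests, both this hrnet failure and the crossvit one can be papered over by letting dynamo fall back to eager on the offending frame; the cost is a graph break instead of a FAIL, so those ops drop out of the covered totals. The one-line sketch:

    import torch._dynamo as dynamo

    # log the compile failure and run the frame eagerly instead of raising
    dynamo.config.suppress_errors = True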
cuda train inception_v3 PASS
Dynamo produced 3 graph(s) covering 628 ops
cuda train jx_nest_base PASS
Dynamo produced 2 graph(s) covering 1240 ops
cuda train lcnet_050 PASS
Dynamo produced 2 graph(s) covering 166 ops
cuda train levit_128 ERROR:common:
from user code:
File "/data/home/ezyang/local/a/pytorch-env/lib/python3.9/site-packages/timm/models/levit.py", line 293, in forward
q, k, v = self.qkv(x).view(
File "/data/home/ezyang/local/a/pytorch-env/lib/python3.9/site-packages/timm/models/levit.py", line 172, in forward
return self.bn(x.flatten(0, 1)).reshape_as(x)
Set torch._dynamo.config.verbose=True for more information
You can suppress this exception and fall back to eager by setting:
torch._dynamo.config.suppress_errors = True
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1057, in run_node
return getattr(args[0], node.target)(*args[1:], **kwargs)
RuntimeError: Cannot call sizes() on tensor with symbolic sizes/strides
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1014, in get_fake_value
return wrap_fake_exception(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 704, in wrap_fake_exception
return fn()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1015, in <lambda>
lambda: run_node(tx.output, node, args, kwargs, nnmodule)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1064, in run_node
raise RuntimeError(
RuntimeError: Failed running call_method reshape_as(*(FakeTensor(FakeTensor(..., device='meta',
size=(s0*(s2 - 1)//16**2 + 2*s0*(s2 - 1)//16 + s0, 256),
grad_fn=<NativeBatchNormLegitBackward0>), cuda:0), FakeTensor(FakeTensor(..., device='meta',
size=(s0, (s2 - 1)//16**2 + 2*(s2 - 1)//16 + 1, 256),
grad_fn=<ViewBackward0>), cuda:0)), **{}):
Cannot call sizes() on tensor with symbolic sizes/strides
(scroll up for backtrace)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1184, in check_accuracy
new_result = optimized_model_iter_fn(model_copy, example_inputs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/eval_frame.py", line 212, in _fn
return fn(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/common.py", line 1061, in run_n_iterations
self.model_iter_fn(mod, inputs, collect_outputs=False)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/timm_models.py", line 315, in forward_and_backward_pass
cloned_inputs = clone_inputs(inputs)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/timm_models.py", line 316, in <graph break in forward_and_backward_pass>
self.optimizer_zero_grad(mod)
File "/scratch/ezyang/work/a/pytorch/benchmarks/dynamo/timm_models.py", line 318, in <graph break in forward_and_backward_pass>
pred = mod(*cloned_inputs)
File "/scratch/ezyang/work/a/pytorch/torch/nn/modules/module.py", line 1482, in _call_impl
return forward_call(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/eval_frame.py", line 333, in catch_errors
return callback(frame, cache_size, hooks)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 480, in _convert_frame
result = inner_convert(frame, cache_size, hooks)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 103, in _fn
return fn(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 90, in time_wrapper
r = func(*args, **kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 339, in _convert_frame_assert
return _compile(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 400, in _compile
out_code = transform_code_object(code, transform)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/bytecode_transformation.py", line 341, in transform_code_object
transformations(instructions, code_options)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/convert_frame.py", line 387, in transform
tracer.run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1684, in run
super().run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 538, in run
and self.step()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 501, in step
getattr(self, inst.opname)(inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 307, in wrapper
return inner_fn(self, inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 966, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/functions.py", line 244, in call_function
return super().call_function(tx, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/functions.py", line 214, in call_function
return super(UserFunctionVariable, self).call_function(tx, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/functions.py", line 67, in call_function
return tx.inline_user_function_return(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 471, in inline_user_function_return
result = InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1762, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1817, in inline_call_
tracer.run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 538, in run
and self.step()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 501, in step
getattr(self, inst.opname)(inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 307, in wrapper
return inner_fn(self, inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 966, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/nn_module.py", line 182, in call_function
tx.call_function(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/nn_module.py", line 220, in call_function
return tx.inline_user_function_return(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 471, in inline_user_function_return
result = InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1762, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1817, in inline_call_
tracer.run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 538, in run
and self.step()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 501, in step
getattr(self, inst.opname)(inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 307, in wrapper
return inner_fn(self, inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 966, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/nn_module.py", line 220, in call_function
return tx.inline_user_function_return(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 471, in inline_user_function_return
result = InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1762, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1817, in inline_call_
tracer.run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 538, in run
and self.step()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 501, in step
getattr(self, inst.opname)(inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 307, in wrapper
return inner_fn(self, inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 966, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/nn_module.py", line 220, in call_function
return tx.inline_user_function_return(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 471, in inline_user_function_return
result = InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1762, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1817, in inline_call_
tracer.run()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 538, in run
and self.step()
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 501, in step
getattr(self, inst.opname)(inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 307, in wrapper
return inner_fn(self, inst)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 966, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/symbolic_convert.py", line 435, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/misc.py", line 598, in call_function
return self.obj.call_method(tx, self.name, args, kwargs).add_options(self)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/tensor.py", line 368, in call_method
return wrap_fx_proxy(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/builder.py", line 733, in wrap_fx_proxy
return wrap_fx_proxy_cls(
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/variables/builder.py", line 773, in wrap_fx_proxy_cls
example_value = get_fake_value(proxy.node, tx)
File "/scratch/ezyang/work/a/pytorch/torch/_dynamo/utils.py", line 1034, in get_fake_value
raise TorchRuntimeError() from e
torch._dynamo.exc.TorchRuntimeError:
from user code:
File "/data/home/ezyang/local/a/pytorch-env/lib/python3.9/site-packages/timm/models/levit.py", line 293, in forward
q, k, v = self.qkv(x).view(
File "/data/home/ezyang/local/a/pytorch-env/lib/python3.9/site-packages/timm/models/levit.py", line 172, in forward
return self.bn(x.flatten(0, 1)).reshape_as(x)
Set torch._dynamo.config.verbose=True for more information
You can suppress this exception and fall back to eager by setting:
torch._dynamo.config.suppress_errors = True
TorchDynamo optimized model failed to run because of following error
FAIL
Dynamo produced 0 graph(s) covering 0 ops
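Note: the levit failure is the same sizes()-on-symbolic-tensor gap surfacing through a tensor method; `reshape_as` needs the target's concrete shape, which is symbolic once dynamo marks the batch and resolution dynamic. A minimal sketch of the levit.py pattern (shapes illustrative; reproducing standalone depends on which dimensions this build treats as dynamic):

    import torch
    import torch._dynamo as dynamo

    @dynamo.optimize("aot_eager")
    def bn_flat(x):
        y = x.flatten(0, 1)     # (B, N, C) -> (B*N, C), as levit's BN wrapper does
        return y.reshape_as(x)  # needs x's concrete sizes; fails with symbolic shapes

    bn_flat(torch.randn(2, 197, 256))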
cuda train mixer_b16_224 PASS
Dynamo produced 2 graph(s) covering 232 ops
cuda train mixnet_l PASS
Dynamo produced 2 graph(s) covering 675 ops
cuda train mnasnet_100 PASS
Dynamo produced 2 graph(s) covering 302 ops
cuda train mobilenetv2_100 PASS
Dynamo produced 2 graph(s) covering 302 ops
cuda train mobilenetv3_large_100 PASS
Dynamo produced 2 graph(s) covering 313 ops
cuda train mobilevit_s PASS
Dynamo produced 2 graph(s) covering 631 ops
cuda train nfnet_l0 PASS
Dynamo produced 2 graph(s) covering 548 ops
cuda train pit_b_224 PASS
Dynamo produced 8 graph(s) covering 494 ops
cuda train pnasnet5large PASS
Dynamo produced 3 graph(s) covering 4198 ops
cuda train poolformer_m36 WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((s2 - 3)//4*(Mod((s2 - 3)//4**2/((s2 - 3)//4 + 1) + 2*(s2 - 3)//4/((s2 - 3)//4 + 1) + 1/((s2 - 3)//4 + 1), 1)) + Mod((s2 - 3)//4**2/((s2 - 3)//4 + 1) + 2*(s2 - 3)//4/((s2 - 3)//4 + 1) + 1/((s2 - 3)//4 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((s2 - 3)//8*(Mod((s2 - 3)//8**2/((s2 - 3)//8 + 1) + 2*(s2 - 3)//8/((s2 - 3)//8 + 1) + 1/((s2 - 3)//8 + 1), 1)) + Mod((s2 - 3)//8**2/((s2 - 3)//8 + 1) + 2*(s2 - 3)//8/((s2 - 3)//8 + 1) + 1/((s2 - 3)//8 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((s2 - 3)//16*(Mod((s2 - 3)//16**2/((s2 - 3)//16 + 1) + 2*(s2 - 3)//16/((s2 - 3)//16 + 1) + 1/((s2 - 3)//16 + 1), 1)) + Mod((s2 - 3)//16**2/((s2 - 3)//16 + 1) + 2*(s2 - 3)//16/((s2 - 3)//16 + 1) + 1/((s2 - 3)//16 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((s2 - 3)//32*(Mod((s2 - 3)//32**2/((s2 - 3)//32 + 1) + 2*(s2 - 3)//32/((s2 - 3)//32 + 1) + 1/((s2 - 3)//32 + 1), 1)) + Mod((s2 - 3)//32**2/((s2 - 3)//32 + 1) + 2*(s2 - 3)//32/((s2 - 3)//32 + 1) + 1/((s2 - 3)//32 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((s2 - 3)//4*(Mod((s2 - 3)//4**2/((s2 - 3)//4 + 1) + 2*(s2 - 3)//4/((s2 - 3)//4 + 1) + 1/((s2 - 3)//4 + 1), 1)) + Mod((s2 - 3)//4**2/((s2 - 3)//4 + 1) + 2*(s2 - 3)//4/((s2 - 3)//4 + 1) + 1/((s2 - 3)//4 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((s2 - 3)//8*(Mod((s2 - 3)//8**2/((s2 - 3)//8 + 1) + 2*(s2 - 3)//8/((s2 - 3)//8 + 1) + 1/((s2 - 3)//8 + 1), 1)) + Mod((s2 - 3)//8**2/((s2 - 3)//8 + 1) + 2*(s2 - 3)//8/((s2 - 3)//8 + 1) + 1/((s2 - 3)//8 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((s2 - 3)//16*(Mod((s2 - 3)//16**2/((s2 - 3)//16 + 1) + 2*(s2 - 3)//16/((s2 - 3)//16 + 1) + 1/((s2 - 3)//16 + 1), 1)) + Mod((s2 - 3)//16**2/((s2 - 3)//16 + 1) + 2*(s2 - 3)//16/((s2 - 3)//16 + 1) + 1/((s2 - 3)//16 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((s2 - 3)//32*(Mod((s2 - 3)//32**2/((s2 - 3)//32 + 1) + 2*(s2 - 3)//32/((s2 - 3)//32 + 1) + 1/((s2 - 3)//32 + 1), 1)) + Mod((s2 - 3)//32**2/((s2 - 3)//32 + 1) + 2*(s2 - 3)//32/((s2 - 3)//32 + 1) + 1/((s2 - 3)//32 + 1), 1) - 0, s2)
PASS
Dynamo produced 3 graph(s) covering 1392 ops
cuda train regnety_002 PASS
Dynamo produced 2 graph(s) covering 366 ops
cuda train repvgg_a2 PASS
Dynamo produced 2 graph(s) covering 395 ops
cuda train res2net101_26w_4s PASS
Dynamo produced 2 graph(s) covering 805 ops
cuda train res2net50_14w_8s PASS
Dynamo produced 2 graph(s) covering 701 ops
cuda train res2next50 PASS
Dynamo produced 2 graph(s) covering 397 ops
cuda train resmlp_12_224 PASS
Dynamo produced 2 graph(s) covering 208 ops
cuda train resnest101e PASS
Dynamo produced 2 graph(s) covering 1284 ops
cuda train rexnet_100 PASS
Dynamo produced 2 graph(s) covering 402 ops
cuda train sebotnet33ts_256 PASS
Dynamo produced 45 graph(s) covering 564 ops
cuda train selecsls42b PASS
Dynamo produced 2 graph(s) covering 134 ops
cuda train spnasnet_100 PASS
Dynamo produced 2 graph(s) covering 374 ops
cuda train swin_base_patch4_window7_224 WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod(s3**2, 49) - 0, s3)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((s3**2)//49 - 49, s3)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod(s3**2, 49) - 0, s3)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((s3**2)//49 - 49, s3)
PASS
Dynamo produced 3 graph(s) covering 4024 ops
cuda train swsl_resnext101_32x16d PASS
Dynamo produced 2 graph(s) covering 413 ops
cuda train tf_efficientnet_b0 PASS
Dynamo produced 3 graph(s) covering 1086 ops
cuda train tf_mixnet_l PASS
Dynamo produced 3 graph(s) covering 2428 ops
cuda train tinynet_a PASS
Dynamo produced 2 graph(s) covering 433 ops
cuda train tnt_s_patch16_224 PASS
Dynamo produced 2 graph(s) covering 956 ops
cuda train twins_pcpvt_base WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1, s2//4) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(4*(Mod(64*s0*(s2//4 - 8)//8*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 8)//8 + 64*s0*(s2//4 - 8)//8 + 64*s0*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 8)//8, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(64*s0*(Mod((s2//4 - 8)//8*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 8)//8 + (s2//4 - 8)//8 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 8)//8, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(s0*(Mod((s2//4 - 8)//8*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 8)//8 + (s2//4 - 8)//8 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 8)//8, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(128*s0*(Mod((s2//4 - 8)//8*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 8)//8 + (s2//4 - 8)//8 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 8)//8, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(4*(Mod(64*s0*s2//4*((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4), 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(s0*(Mod(s2//4*((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4), 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod(s2//4*((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(s0*(Mod((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1, s2//8) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(4*(Mod(128*s0*(s2//8 - 4)//4*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 4)//4 + 128*s0*(s2//8 - 4)//4 + 128*s0*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 4)//4, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(128*s0*(Mod((s2//8 - 4)//4*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 4)//4 + (s2//8 - 4)//4 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 4)//4, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(s0*(Mod((s2//8 - 4)//4*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 4)//4 + (s2//8 - 4)//4 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 4)//4, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(256*s0*(Mod((s2//8 - 4)//4*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 4)//4 + (s2//8 - 4)//4 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 4)//4, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(4*(Mod(128*s0*s2//8*((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8), 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(s0*(Mod(s2//8*((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8), 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(s0*(Mod((s2//8 - 2)//2*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + (s2//8 - 2)//2 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod((s2//8 - 2)//2*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + (s2//8 - 2)//2 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + 1, s2//16) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(4*(Mod(320*s0*(s2//16 - 2)//2*(((s2//8 - 2)//2*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + (s2//8 - 2)//2 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + 1)//(s2//16) - 2)//2 + 320*s0*(s2//16 - 2)//2 + 320*s0*(((s2//8 - 2)//2*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + (s2//8 - 2)//2 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + 1)//(s2//16) - 2)//2, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(320*s0*(Mod((s2//16 - 2)//2*(((s2//8 - 2)//2*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + (s2//8 - 2)//2 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + 1)//(s2//16) - 2)//2 + (s2//16 - 2)//2 + (((s2//8 - 2)//2*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + (s2//8 - 2)//2 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + 1)//(s2//16) - 2)//2, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(s0*(Mod((s2//16 - 2)//2*(((s2//8 - 2)//2*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + (s2//8 - 2)//2 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + 1)//(s2//16) - 2)//2 + (s2//16 - 2)//2 + (((s2//8 - 2)//2*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + (s2//8 - 2)//2 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + 1)//(s2//16) - 2)//2, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(640*s0*(Mod((s2//16 - 2)//2*(((s2//8 - 2)//2*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + (s2//8 - 2)//2 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + 1)//(s2//16) - 2)//2 + (s2//16 - 2)//2 + (((s2//8 - 2)//2*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + (s2//8 - 2)//2 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + 1)//(s2//16) - 2)//2, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(4*(Mod(320*s0*s2//16*((s2//8 - 2)//2*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + (s2//8 - 2)//2 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + 1)//(s2//16), 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(s0*(Mod(s2//16*((s2//8 - 2)//2*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + (s2//8 - 2)//2 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + 1)//(s2//16), 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(1024*s0*(Mod((s2//16 - 2)//2*(((s2//8 - 2)//2*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + (s2//8 - 2)//2 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + 1)//(s2//16) - 2)//2 + (s2//16 - 2)//2 + (((s2//8 - 2)//2*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + (s2//8 - 2)//2 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + 1)//(s2//16) - 2)//2, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod((s2//16 - 2)//2*(((s2//8 - 2)//2*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + (s2//8 - 2)//2 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + 1)//(s2//16) - 2)//2 + (s2//16 - 2)//2 + (((s2//8 - 2)//2*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + (s2//8 - 2)//2 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + 1)//(s2//16) - 2)//2 + 1, s2//32) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(4*(Mod(512*s0*s2//32*((s2//16 - 2)//2*(((s2//8 - 2)//2*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + (s2//8 - 2)//2 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + 1)//(s2//16) - 2)//2 + (s2//16 - 2)//2 + (((s2//8 - 2)//2*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + (s2//8 - 2)//2 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + 1)//(s2//16) - 2)//2 + 1)//(s2//32), 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(s0*(Mod(s2//32*((s2//16 - 2)//2*(((s2//8 - 2)//2*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + (s2//8 - 2)//2 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + 1)//(s2//16) - 2)//2 + (s2//16 - 2)//2 + (((s2//8 - 2)//2*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + (s2//8 - 2)//2 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + 1)//(s2//16) - 2)//2 + 1)//(s2//32), 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(1024*s0*(Mod(s2//32*((s2//16 - 2)//2*(((s2//8 - 2)//2*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + (s2//8 - 2)//2 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + 1)//(s2//16) - 2)//2 + (s2//16 - 2)//2 + (((s2//8 - 2)//2*(((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + (s2//8 - 2)//2 + (((s2//4 - 2)//2*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + (s2//4 - 2)//2 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 2)//2 + 1)//(s2//8) - 2)//2 + 1)//(s2//16) - 2)//2 + 1)//(s2//32), 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1, s2//4) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(4*(Mod(64*s0*(s2//4 - 8)//8*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 8)//8 + 64*s0*(s2//4 - 8)//8 + 64*s0*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 8)//8, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(64*s0*(Mod((s2//4 - 8)//8*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 8)//8 + (s2//4 - 8)//8 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 8)//8, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(s0*(Mod((s2//4 - 8)//8*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 8)//8 + (s2//4 - 8)//8 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 8)//8, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(128*s0*(Mod((s2//4 - 8)//8*(((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 8)//8 + (s2//4 - 8)//8 + (((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4) - 8)//8, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(4*(Mod(64*s0*s2//4*((s2 - 4)//4**2 + 2*(s2 - 4)//4 + 1)//(s2//4), 1)) - 0, s2)
TIMEOUT
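The RecursionError warnings above come from the dynamic-shapes machinery: each pooling/striding stage nests another floor-division into the guard over the symbolic input size s2, and sympy.solve can exhaust Python's recursion limit while trying to invert the resulting Mod expression. Below is a minimal sketch of the pattern under stated assumptions: the try/except-and-warn structure and the logged message mirror the WARNING lines above, but try_solve is an illustrative name, not PyTorch's actual code.

    import logging
    import sympy

    log = logging.getLogger("torch.fx.experimental.symbolic_shapes")

    s0, s2 = sympy.symbols("s0 s2", positive=True, integer=True)
    # One representative guard from the log; sympy renders s2 // 4 as floor(s2/4).
    guard = s0 * sympy.Mod(
        (s2 // 4) * (((s2 - 4) // 4) ** 2 + 2 * ((s2 - 4) // 4) + 1) // (s2 // 4), 1
    )

    def try_solve(expr, var):
        # Downgrade a RecursionError from sympy.solve to a warning instead of
        # aborting compilation, matching the WARNING lines above.
        try:
            return sympy.solve(expr, var)
        except RecursionError:
            log.warning("RecursionError in sympy.solve(%s - 0, %s)", expr, var)
            return None

    # try_solve(guard, s2)  # depending on the sympy version this may recurse
    # deeply, hang, or raise NotImplementedError rather than solve.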
cuda train visformer_small WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(1152*s0*(Mod((((s2 - 1)//2 - 3)//4 - 1)//2**2 + 2*(((s2 - 1)//2 - 3)//4 - 1)//2, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((((s2 - 1)//2 - 3)//4 - 1)//2*(Mod((((s2 - 1)//2 - 3)//4 - 1)//2**2/((((s2 - 1)//2 - 3)//4 - 1)//2 + 1) + 2*(((s2 - 1)//2 - 3)//4 - 1)//2/((((s2 - 1)//2 - 3)//4 - 1)//2 + 1) + 1/((((s2 - 1)//2 - 3)//4 - 1)//2 + 1), 1)) + Mod((((s2 - 1)//2 - 3)//4 - 1)//2**2/((((s2 - 1)//2 - 3)//4 - 1)//2 + 1) + 2*(((s2 - 1)//2 - 3)//4 - 1)//2/((((s2 - 1)//2 - 3)//4 - 1)//2 + 1) + 1/((((s2 - 1)//2 - 3)//4 - 1)//2 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(2304*s0*(Mod(((((s2 - 1)//2 - 3)//4 - 1)//2 - 1)//2**2 + 2*((((s2 - 1)//2 - 3)//4 - 1)//2 - 1)//2, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(((((s2 - 1)//2 - 3)//4 - 1)//2 - 1)//2*(Mod(((((s2 - 1)//2 - 3)//4 - 1)//2 - 1)//2**2/(((((s2 - 1)//2 - 3)//4 - 1)//2 - 1)//2 + 1) + 2*((((s2 - 1)//2 - 3)//4 - 1)//2 - 1)//2/(((((s2 - 1)//2 - 3)//4 - 1)//2 - 1)//2 + 1) + 1/(((((s2 - 1)//2 - 3)//4 - 1)//2 - 1)//2 + 1), 1)) + Mod(((((s2 - 1)//2 - 3)//4 - 1)//2 - 1)//2**2/(((((s2 - 1)//2 - 3)//4 - 1)//2 - 1)//2 + 1) + 2*((((s2 - 1)//2 - 3)//4 - 1)//2 - 1)//2/(((((s2 - 1)//2 - 3)//4 - 1)//2 - 1)//2 + 1) + 1/(((((s2 - 1)//2 - 3)//4 - 1)//2 - 1)//2 + 1), 1) - 0, s2)
PASS
Dynamo produced 3 graph(s) covering 734 ops
cuda train vit_base_patch16_224 PASS
Dynamo produced 2 graph(s) covering 443 ops
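Each "Dynamo produced N graph(s) covering M ops" line is the --explain summary for that model: how many FX graphs Dynamo captured (extra graphs indicate graph breaks) and how many operators they cover. The following sketch reproduces the line for an arbitrary model; the ExplainOutput attribute names are an assumption taken from newer PyTorch releases, since the return type of torch._dynamo.explain has changed across versions.

    import torch
    import torch._dynamo as dynamo

    def explain_summary(model, *example_inputs):
        # Assumption: explain(model)(*inputs) returns an ExplainOutput with
        # graph_count/op_count fields; older releases returned a plain tuple.
        out = dynamo.explain(model)(*example_inputs)
        print(f"Dynamo produced {out.graph_count} graph(s) covering {out.op_count} ops")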
cuda train volo_d1_224 WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(s0*(Mod(((s2 - 1)//2 - 3)//4**2 + 2*((s2 - 1)//2 - 3)//4, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(((s2 - 1)//2 - 3)//4*(Mod(((s2 - 1)//2 - 3)//4**2/(((s2 - 1)//2 - 3)//4 + 1) + 2*((s2 - 1)//2 - 3)//4/(((s2 - 1)//2 - 3)//4 + 1) + 1/(((s2 - 1)//2 - 3)//4 + 1), 1)) + Mod(((s2 - 1)//2 - 3)//4**2/(((s2 - 1)//2 - 3)//4 + 1) + 2*((s2 - 1)//2 - 3)//4/(((s2 - 1)//2 - 3)//4 + 1) + 1/(((s2 - 1)//2 - 3)//4 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(s0*(Mod(((s2 - 1)//2 - 3)//8**2 + 2*((s2 - 1)//2 - 3)//8, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(((s2 - 1)//2 - 3)//8*(Mod(((s2 - 1)//2 - 3)//8**2/(((s2 - 1)//2 - 3)//8 + 1) + 2*((s2 - 1)//2 - 3)//8/(((s2 - 1)//2 - 3)//8 + 1) + 1/(((s2 - 1)//2 - 3)//8 + 1), 1)) + Mod(((s2 - 1)//2 - 3)//8**2/(((s2 - 1)//2 - 3)//8 + 1) + 2*((s2 - 1)//2 - 3)//8/(((s2 - 1)//2 - 3)//8 + 1) + 1/(((s2 - 1)//2 - 3)//8 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod(((s2 - 1)//2 - 3)//8**2 + 2*((s2 - 1)//2 - 3)//8 + 1, ceiling(((s2 - 1)//2 - 3)//4/2 + 1/2)**2) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(6*(Mod(((s2 - 1)//2 - 3)//8**2 + 2*((s2 - 1)//2 - 3)//8, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(s0*(Mod((((s2 - 1)//2 - 3)//4 - 1)//2**2 + 2*(((s2 - 1)//2 - 3)//4 - 1)//2, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve((((s2 - 1)//2 - 3)//4 - 1)//2*(Mod((((s2 - 1)//2 - 3)//4 - 1)//2**2/((((s2 - 1)//2 - 3)//4 - 1)//2 + 1) + 2*(((s2 - 1)//2 - 3)//4 - 1)//2/((((s2 - 1)//2 - 3)//4 - 1)//2 + 1) + 1/((((s2 - 1)//2 - 3)//4 - 1)//2 + 1), 1)) + Mod((((s2 - 1)//2 - 3)//4 - 1)//2**2/((((s2 - 1)//2 - 3)//4 - 1)//2 + 1) + 2*(((s2 - 1)//2 - 3)//4 - 1)//2/((((s2 - 1)//2 - 3)//4 - 1)//2 + 1) + 1/((((s2 - 1)//2 - 3)//4 - 1)//2 + 1), 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(384*s0*(Mod((((s2 - 1)//2 - 3)//4 - 1)//2**2 + 2*(((s2 - 1)//2 - 3)//4 - 1)//2, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(6*s0*(Mod(ceiling(((s2 - 1)//2 - 3)//4/2 + 1/2)**2, ((s2 - 1)//2 - 3)//8**2 + 2*((s2 - 1)//2 - 3)//8 + 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(6*(Mod(ceiling(((s2 - 1)//2 - 3)//4/2 + 1/2)**2, 1)) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(Mod(ceiling(((s2 - 1)//2 - 3)//4/2 + 1/2)**2, ((s2 - 1)//2 - 3)//8 + 1) - 0, s2)
WARNING:torch.fx.experimental.symbolic_shapes:RecursionError in sympy.solve(s0*(Mod(((s2 - 1)//2 - 3)//8*(ceiling(((s2 - 1)//2 - 3)//4/2 + 1/2)**2)//(((s2 - 1)//2 - 3)//8 + 1) + (ceiling(((s2 - 1)//2 - 3)//4/2 + 1/2)**2)//(((s2 - 1)//2 - 3)//8 + 1), 1)) - 0, s2)
[2022-12-17 19:09:03,297] torch._dynamo.utils: [ERROR] RMSE (res-fp64): 0.00361, (ref-fp64): 0.00018 and shape=torch.Size([64, 3, 7, 7])
[2022-12-17 19:09:03,297] torch._dynamo.utils: [ERROR] Accuracy failed for key name patch_embed.conv.0.weight.grad
FAIL
Dynamo produced 3 graph(s) covering 1568 ops
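The volo_d1_224 failure above is the RMSE-based gradient check: the compiled run's gradient is accepted only if its error against an fp64 golden reference (0.00361 here, for patch_embed.conv.0.weight.grad) stays within a multiplier of eager's own error against that reference (0.00018). A simplified sketch of the comparison follows; the multiplier and the per-tensor traversal in the real harness differ, so treat the constant as an assumption.

    import torch

    def rmse(a: torch.Tensor, b: torch.Tensor) -> float:
        # Root-mean-square error, computed in fp64 for stability.
        return torch.sqrt(torch.mean((a.double() - b.double()) ** 2)).item()

    def grad_accuracy_ok(res, ref, fp64_ref, multiplier: float = 2.0) -> bool:
        # res: gradient from the compiled (aot_eager) run
        # ref: gradient from the eager run
        # fp64_ref: gradient from the fp64 golden run
        res_err = rmse(res, fp64_ref)  # 0.00361 in the failing case above
        ref_err = rmse(ref, fp64_ref)  # 0.00018
        return res_err <= multiplier * ref_err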
cuda train xcit_large_24_p8_224 WARNING:common:fp64 golden ref were not generated for xcit_large_24_p8_224. Setting accuracy check to cosine
FAIL
Dynamo produced 0 graph(s) covering 0 ops
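When no fp64 golden reference can be generated (as for xcit_large_24_p8_224 above), the harness relaxes the check to cosine similarity between eager and compiled outputs. A minimal sketch of such a fallback is below; the 0.99 threshold is an assumption for illustration, not the harness's actual constant.

    import torch
    import torch.nn.functional as F

    def cosine_ok(res: torch.Tensor, ref: torch.Tensor, threshold: float = 0.99) -> bool:
        # Flatten both tensors and compare direction rather than magnitude.
        sim = F.cosine_similarity(res.flatten().double(), ref.flatten().double(), dim=0)
        return sim.item() >= threshold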