Test failures with new configs in test_grad_scaling_autocast in test_torch.py #126638
Labels
- module: optimizer (Related to torch.optim)
- triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
This issue tracks my observations when updating `test_grad_scaling_autocast` in test_torch.py with the new OptimizerInfo infrastructure (#123451). While I was able to combine the tests that call `_grad_scaling_autocast_test` into one test (#125538), I observe test failures when I try to use `_get_optim_inputs_including_global_cliquey_kwargs` to avoid hardcoded configs. The following is the test case:
I made the following observations about the failing configs generated from `_get_optim_inputs_including_global_cliquey_kwargs`:

1. When `optimizer_ctor` is SGD, the test fails for the config `{'weight_decay': 0.1, 'maximize': True, 'fused': True}`.
2. When `context` is `partial(self.assertRaises, AssertionError)` for Adam and AdamW, the tests fail for the configs `{'lr': 0.01, 'fused': False}` and `{'lr': 0.01, 'fused': True}` with the error `AssertionError: AssertionError not raised`.
3. After changing `context` to `contextlib.nullcontext` for Adam and AdamW (since the AssertionError was not raised in observation 2), the tests fail for all the configs with the error `AssertionError: Tensor-likes are not close!`. In this case, I am confused as to why the error is thrown even for the configs that already failed in observation 2; the mismatched-element percentage is around 3.1% for `{'lr': 0.01, 'fused': False}` and `{'lr': 0.01, 'fused': True}`, but either 39.1% or 100% for the other configs.

Please let me know if I can provide any additional information or perform any other tests. I would be happy to work on this.
cc @vincentqb @jbschlosser @albanD @janeyx99 @crcrpar