Set flag FLAGS_enable_cublas_tensor_op_math True in default. #896

GhostScreaming · 2022-11-08T08:26:06Z

When FP32 and FP16 models run on an A100 machine, they can be accelerated with Tensor Cores. Although NVIDIA states that FP32 computation is routed to Tensor Cores automatically on A100, we observed that this is not the case. We therefore explicitly set the flag FLAGS_enable_cublas_tensor_op_math to True by default.
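For anyone who needs to override this default when launching a job, here is a minimal sketch. It assumes the flag is picked up from the process environment at framework start-up, which is how `FLAGS_*` options are commonly passed. The helper name `tensor_core_env` and the `"True"`/`"False"` string values are illustrative assumptions and are not part of this PR.

```python
import os


def tensor_core_env(enabled=True, base_env=None):
    """Return a copy of the environment with the cuBLAS Tensor Core flag set.

    Hypothetical helper: only the flag name comes from this PR. The
    "True"/"False" spelling is an assumption about how the value is parsed.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["FLAGS_enable_cublas_tensor_op_math"] = "True" if enabled else "False"
    return env


# Build an environment with the flag switched off, e.g. to compare timings
# against the new default. The caller's environment is left untouched.
env_off = tensor_core_env(enabled=False, base_env={"PATH": "/usr/bin"})
print(env_off["FLAGS_enable_cublas_tensor_op_math"])
```

The returned dictionary can be passed as `env=` to `subprocess.run` when starting a training script, so the two settings can be benchmarked side by side without editing the code.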

sneaxiy

LGTM.

Set flag FLAGS_enable_cublas_tensor_op_math on in default.

0ccab14

sneaxiy approved these changes Nov 8, 2022
