
CUDA version 12.4 not supported for this cm command #1243

Open

EtienneMassart opened this issue May 3, 2024 · 1 comment

Comments

@EtienneMassart

Running this command from the cm playground gives an error message:

```
cm run script --tags=run-mlperf,inference,_performance-only,_short \
  --division=open \
  --category=edge \
  --device=cuda \
  --model=gptj-99 \
  --precision=float32 \
  --implementation=nvidia \
  --backend=tensorrt \
  --scenario=Offline \
  --execution_mode=test \
  --power=no \
  --adr.python.version_min=3.8 \
  --clean \
  --compliance=no \
  --quiet \
  --time
```

It requires installing libnccl2=2.18.3, which only supports CUDA 11.0 and 12.0-12.2. I tried changing the version installed by ~/CM/repos/mlcommons@ck/cm-mlops/script/install-nccl-libs/ but ran into another error later:

```
Building CXX object caffe2/CMakeFiles/op_registration_test.dir/__/aten/src/ATen/core/op_registration/op_registration_test.cpp.o
ninja: build stopped: subcommand failed.
```

I don't know whether this is related to the change I made, but I haven't found a fix for it.
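For context, the "CUDA version 12.4 not supported" error comes from a compatibility gate of the kind sketched below. This is a hypothetical illustration, not the actual CM code: the supported set here is taken from the libnccl2 2.18.3 constraint described above (CUDA 11.0 and 12.0-12.2), and the function name `check_cuda_supported` is invented for this sketch.

```python
# Hypothetical sketch of the kind of version gate that produces the
# "CUDA version 12.4 not supported" error. The real CM check differs;
# the supported set below reflects what libnccl2 2.18.3 accepts
# according to the issue text (CUDA 11.0 and 12.0-12.2).
SUPPORTED_CUDA = {"11.0", "12.0", "12.1", "12.2"}

def check_cuda_supported(version: str) -> bool:
    """Return True if the detected CUDA version is in the supported set."""
    return version in SUPPORTED_CUDA

# CUDA 12.4 falls outside the set, so the run is rejected up front.
print(check_cuda_supported("12.4"))  # False
print(check_cuda_supported("12.2"))  # True
```

Because the gate is an exact-match check rather than a range check, a newer toolkit such as 12.4 fails even though it is backward compatible in many cases, which is why bumping the pinned NCCL version is the usual fix.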

@arjunsuresh (Contributor) commented May 24, 2024

Hi @EtienneMassart, have you managed to solve this? CUDA 12.4 is supported in CM now. Please follow this documentation for MLPerf inference, which supports the Nvidia and reference implementations.
