
Error when running Gemma inference on GPU #47

Open
LarryHawkingYoung opened this issue Mar 18, 2024 · 1 comment
Labels
stat:awaiting response (Status - Awaiting response from author), type:support (Support issues)

Comments

@LarryHawkingYoung

When I run

docker run -t --rm \
    --gpus all \
    -v ${CKPT_PATH}:/tmp/ckpt \
    ${DOCKER_URI} \
    python scripts/run.py \
    --device=cuda \
    --ckpt=/tmp/ckpt \
    --variant="${VARIANT}" \
    --prompt="${PROMPT}"

It returns the error:

docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
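For what it's worth, this particular Docker error usually indicates that the NVIDIA container runtime is not installed or not registered with the Docker daemon, rather than anything in the Gemma script itself. A common sanity check (this is my own suggestion, not from this repo; it assumes an NVIDIA GPU, and the CUDA image tag is just an example):

```shell
# If this fails with the same "could not select device driver" error,
# install/configure the NVIDIA Container Toolkit and restart the Docker
# daemon before retrying the Gemma command.
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```

If `nvidia-smi` prints the GPU table here, the runtime is fine and the problem lies elsewhere.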

But if I run on CPU with this command:

docker run -t --rm \
    -v ${CKPT_PATH}:/tmp/ckpt \
    ${DOCKER_URI} \
    python scripts/run.py \
    --ckpt=/tmp/ckpt \
    --variant="${VARIANT}" \
    --prompt="${PROMPT}"

it works fine.

@pengchongjin
Collaborator

Which model variant and which GPU did you use?

One guess is that you may be running out of GPU memory if you try to run the 7B un-quantized model on a 16 GB GPU. Try the 7B quantized model or a 2B model instead; either should work.
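A rough back-of-envelope estimate of the weight memory shows why the 7B un-quantized model is tight on a 16 GB card (this sketch is mine, not from the Gemma repo; it assumes bfloat16 weights at 2 bytes per parameter and ignores activations and the KV cache):

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed to hold the model weights alone, in GiB.

    Defaults to 2 bytes/param (bfloat16/float16); real usage is higher
    because of activations, the KV cache, and framework overhead.
    """
    return num_params * bytes_per_param / (1024 ** 3)

print(f"7B bf16: {weight_memory_gib(7e9):.1f} GiB")    # ~13.0 GiB, tight on 16 GB
print(f"7B int8: {weight_memory_gib(7e9, 1):.1f} GiB") # ~6.5 GiB
print(f"2B bf16: {weight_memory_gib(2e9):.1f} GiB")    # ~3.7 GiB
```

So weights alone for un-quantized 7B already consume most of a 16 GB GPU, which is consistent with the out-of-memory guess above.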

@tilakrayal added the type:support and stat:awaiting response labels (and removed question) on Apr 24, 2024
3 participants