When deploying the Docker container with source "s3" and model_id "mistralai/Mistral-7B-Instruct-v0.2" (lorax-launcher --port 8080 --source "s3"), it fails with the following error message:
2024-05-14T20:25:15.424-07:00
huggingface_hub.utils._errors.EntryNotFoundError: No .safetensors weights found for model mistralai/Mistral-7B-Instruct-v0.2
2024-05-14T20:25:15.424-07:00
Error: DownloadError
File "/opt/conda/lib/python3.10/site-packages/lorax_server/cli.py", line 123, in download_weights
_download_weights(model_id, revision, extension, auto_convert, source, api_token)
File "/opt/conda/lib/python3.10/site-packages/lorax_server/utils/weights.py", line 447, in download_weights
model_source.weight_files()
File "/opt/conda/lib/python3.10/site-packages/lorax_server/utils/sources/s3.py", line 222, in weight_files
return weight_files_s3(self.bucket, self.model_id, self.revision, extension)
File "/opt/conda/lib/python3.10/site-packages/lorax_server/utils/sources/s3.py", line 156, in weight_files_s3
pt_filenames = weight_s3_files(bucket, model_id, extension=".bin")
File "/opt/conda/lib/python3.10/site-packages/lorax_server/utils/sources/s3.py", line 86, in weight_s3_files
raise EntryNotFoundError(
Based on the error message above, the model_id (mistralai/Mistral-7B-Instruct-v0.2) has been passed in correctly. However, the top-level folder name of the Mistral model in S3 is models--mistralai--Mistral-7B-Instruct-v0.2. Thus, the root cause of this bug is in the weight_s3_files function (link) below:
def weight_s3_files(bucket: Any, model_id: str, extension: str = ".safetensors") -> List[str]:
    """Get the weights filenames from s3"""
    model_files = bucket.objects.filter(Prefix=model_id)
    filenames = [f.key.removeprefix(model_id).lstrip("/") for f in model_files if f.key.endswith(extension)]
    if not filenames:
        raise EntryNotFoundError(
            f"No {extension} weights found for model {model_id}",
            None,
        )
    return filenames
In the line model_files = bucket.objects.filter(Prefix=model_id), model_files comes back empty because the prefix model_id (mistralai/Mistral-7B-Instruct-v0.2) doesn't match the actual folder name models--mistralai--Mistral-7B-Instruct-v0.2.
A fix for this bug could be to convert model_id into the folder name, as the get_s3_model_local_dir function (link) does, before filtering the S3 bucket.
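As a minimal sketch of that idea (the helper name model_id_to_s3_prefix is hypothetical, not the actual lorax code), the repo-style model_id could be mapped to the Hugging Face cache-style folder name before it is used as an S3 prefix:

```python
def model_id_to_s3_prefix(model_id: str) -> str:
    """Map a repo id like 'mistralai/Mistral-7B-Instruct-v0.2' to the
    Hugging Face cache-style folder name
    'models--mistralai--Mistral-7B-Instruct-v0.2'.

    Hypothetical helper illustrating the proposed fix: replace '/' with
    '--' and prepend 'models--', mirroring the hub cache layout.
    """
    return "models--" + model_id.replace("/", "--")


# The filter call in weight_s3_files would then use the converted prefix,
# e.g. bucket.objects.filter(Prefix=model_id_to_s3_prefix(model_id)),
# so that objects stored under the cache-style folder are actually matched.
```

This only addresses the prefix mismatch; the removeprefix call in the list comprehension would need the same converted prefix to keep stripping keys correctly.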
Information
Docker
The CLI directly
Tasks
An officially supported command
My own modifications
Reproduction
Dockerfile:
ARG VERSION
FROM ghcr.io/predibase/lorax:$VERSION
COPY sagemaker_entrypoint.sh entrypoint.sh
RUN chmod +x entrypoint.sh
ENTRYPOINT ["./entrypoint.sh"]
sagemaker_entrypoint.sh:
#!/bin/bash

if [[ -z "${HF_MODEL_ID}" ]]; then
    echo "HF_MODEL_ID must be set"
    exit 1
fi
export MODEL_ID="${HF_MODEL_ID}"

if [[ -n "${HF_MODEL_REVISION}" ]]; then
    export REVISION="${HF_MODEL_REVISION}"
fi

if [[ -n "${SM_NUM_GPUS}" ]]; then
    export NUM_SHARD="${SM_NUM_GPUS}"
fi

if [[ -n "${HF_MODEL_QUANTIZE}" ]]; then
    export QUANTIZE="${HF_MODEL_QUANTIZE}"
fi

if [[ -n "${HF_MODEL_TRUST_REMOTE_CODE}" ]]; then
    export TRUST_REMOTE_CODE="${HF_MODEL_TRUST_REMOTE_CODE}"
fi

if [[ -z "${ADAPTER_BUCKET}" ]]; then
    echo "Warning: ADAPTER_BUCKET not set. Only able to load local or HuggingFace Hub models."
else
    export PREDIBASE_MODEL_BUCKET="${ADAPTER_BUCKET}"
fi
lorax-launcher --port 8080 --source "s3"
Expected behavior
lorax-launcher should be able to filter the Mistral base model saved in S3 and find the .safetensors files.
System Info
lorax_version: "a7e8175"
Python 3.10.8
Platform: ml.g5.16xlarge (AWS)