Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ModuleNotFoundError: No module named 'deepspeed' #177

Open
qxpBlog opened this issue Mar 14, 2024 · 1 comment
Open

ModuleNotFoundError: No module named 'deepspeed' #177

qxpBlog opened this issue Mar 14, 2024 · 1 comment

Comments

@qxpBlog
Copy link

qxpBlog commented Mar 14, 2024

When i run the following command, the problem with the title arises

bash /home/iotsc01/xinpengq/LMOps-main/minillm/scripts/llama2/sft/sft_7B.sh /home/iotsc01/LMOps-main/minillm

the scripts file is following:

#! /bin/bash
BASE_MODEL=/home/iotsc01/.cache/huggingface/hub/models--meta-llama--Llama-2-7b-hf/snapshots/8cca527612d856d7d32bd94f8103728d614eb852
PYTHON_ENV_PATH="/home/iotsc01/anaconda3/envs/distil/bin/python"
MASTER_ADDR=localhost
MASTER_PORT=${2-2012}
NNODES=1
NODE_RANK=0
GPUS_PER_NODE=${3-16}

DISTRIBUTED_ARGS="--nproc_per_node $GPUS_PER_NODE \
                  --nnodes $NNODES \
                  --node_rank $NODE_RANK \
                  --master_addr $MASTER_ADDR \
                  --master_port $MASTER_PORT"

# model
BASE_PATH=${1-"/home/MiniLLM"}
CKPT="${BASE_MODEL}"
CKPT_NAME="llama2-7B"
# data
DATA_DIR="${BASE_PATH}/processed_data/dolly/full/llama2/"
# hp
BATCH_SIZE=1
LR=0.00001
GRAD_ACC=2
EVAL_BATCH_SIZE=2
# length
MAX_LENGTH=512
# runtime
SAVE_PATH="${BASE_PATH}/results/llama2/train/sft"
# seed
SEED=10
SEED_ORDER=10


OPTS=""
# model
OPTS+=" --base-path ${BASE_PATH}"
OPTS+=" --model-path ${CKPT}"
OPTS+=" --ckpt-name ${CKPT_NAME}"
OPTS+=" --n-gpu ${GPUS_PER_NODE}"
OPTS+=" --model-type llama2"
OPTS+=" --gradient-checkpointing"
# data
OPTS+=" --data-dir ${DATA_DIR}"
OPTS+=" --num-workers 0"
OPTS+=" --dev-num 500"
# hp
OPTS+=" --lr ${LR}"
OPTS+=" --batch-size ${BATCH_SIZE}"
OPTS+=" --eval-batch-size ${EVAL_BATCH_SIZE}"
OPTS+=" --gradient-accumulation-steps ${GRAD_ACC}"
OPTS+=" --warmup-iters 0"
OPTS+=" --lr-decay-style cosine"
OPTS+=" --weight-decay 1e-2"
OPTS+=" --clip-grad 1.0"
OPTS+=" --epochs 10"
# length
OPTS+=" --max-length ${MAX_LENGTH}"
OPTS+=" --max-prompt-length 256"
# runtime
OPTS+=" --do-train"
OPTS+=" --do-valid"
OPTS+=" --eval-gen"
OPTS+=" --save-interval -1"
OPTS+=" --eval-interval -1"
OPTS+=" --log-interval 4"
OPTS+=" --mid-log-num 1"
OPTS+=" --save ${SAVE_PATH}"
# seed
OPTS+=" --seed ${SEED}"
OPTS+=" --seed-order ${SEED_ORDER}"
# deepspeed
OPTS+=" --deepspeed"
OPTS+=" --deepspeed_config ${BASE_PATH}/configs/deepspeed/ds_config.json"
# type
OPTS+=" --type lm"
# gen
OPTS+=" --do-sample"
OPTS+=" --top-k 0"
OPTS+=" --top-p 1.0"
OPTS+=" --temperature 1.0"


export NCCL_DEBUG=""
export WANDB_DISABLED=True
export TF_CPP_MIN_LOG_LEVEL=3
export PYTHONPATH=${BASE_PATH}
CMD="torchrun ${DISTRIBUTED_ARGS} ${BASE_PATH}/finetune.py ${OPTS} $@"

echo ${CMD}
echo "PYTHONPATH=${PYTHONPATH}"
mkdir -p ${SAVE_PATH}
${CMD}

my anaconda3 environment distil is following:

# packages in environment at /home/iotsc01/anaconda3/envs/distil:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
_openmp_mutex             5.1                       1_gnu    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
absl-py                   2.1.0                    pypi_0    pypi
accelerate                0.28.0                   pypi_0    pypi
aiohttp                   3.9.3                    pypi_0    pypi
aiosignal                 1.3.1                    pypi_0    pypi
annotated-types           0.6.0                    pypi_0    pypi
async-timeout             4.0.3                    pypi_0    pypi
attrs                     23.2.0                   pypi_0    pypi
blas                      1.0                         mkl    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
bzip2                     1.0.8                h5eee18b_5    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
c-ares                    1.19.1               h5eee18b_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
ca-certificates           2023.12.12           h06a4308_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
certifi                   2024.2.2                 pypi_0    pypi
cffi                      1.16.0           py38h5eee18b_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
charset-normalizer        3.3.2                    pypi_0    pypi
click                     8.1.7                    pypi_0    pypi
cmake                     3.28.3                   pypi_0    pypi
datasets                  2.18.0                   pypi_0    pypi
deepspeed                 0.10.0                   pypi_0    pypi
dill                      0.3.8                    pypi_0    pypi
expat                     2.5.0                h6a678d5_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
filelock                  3.13.1                   pypi_0    pypi
frozenlist                1.4.1                    pypi_0    pypi
fsspec                    2024.2.0                 pypi_0    pypi
hjson                     3.1.0                    pypi_0    pypi
huggingface-hub           0.21.4                   pypi_0    pypi
idna                      3.6                      pypi_0    pypi
importlib-metadata        7.0.2                    pypi_0    pypi
intel-openmp              2023.1.0         hdb19cb5_46306    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
jinja2                    3.1.3                    pypi_0    pypi
joblib                    1.3.2                    pypi_0    pypi
krb5                      1.20.1               h143b758_1    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
ld_impl_linux-64          2.38                 h1181459_1    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
libcurl                   8.5.0                h251f7ec_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
libedit                   3.1.20230828         h5eee18b_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
libev                     4.33                 h7f8727e_1    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
libffi                    3.4.4                h6a678d5_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
libgcc-ng                 11.2.0               h1234567_1    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
libgomp                   11.2.0               h1234567_1    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
libnghttp2                1.57.0               h2d74bed_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
libssh2                   1.10.0               hdbd6064_2    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
libstdcxx-ng              11.2.0               h1234567_1    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
libuuid                   1.41.5               h5eee18b_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
libuv                     1.44.2               h5eee18b_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
lit                       18.1.1                   pypi_0    pypi
lz4-c                     1.9.4                h6a678d5_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
markdown-it-py            3.0.0                    pypi_0    pypi
markupsafe                2.1.5                    pypi_0    pypi
mdurl                     0.1.2                    pypi_0    pypi
mkl                       2023.1.0         h213fc3f_46344    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
mkl-service               2.4.0            py38h5eee18b_1    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
mkl_fft                   1.3.8            py38h5eee18b_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
mkl_random                1.2.4            py38hdb19cb5_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
mpmath                    1.3.0                    pypi_0    pypi
multidict                 6.0.5                    pypi_0    pypi
multiprocess              0.70.16                  pypi_0    pypi
ncurses                   6.4                  h6a678d5_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
networkx                  3.1                      pypi_0    pypi
ninja                     1.11.1.1                 pypi_0    pypi
nltk                      3.8.1                    pypi_0    pypi
numerize                  0.12                     pypi_0    pypi
numpy                     1.24.3           py38hf6e8229_1    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
numpy-base                1.24.3           py38h060ed82_1    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
nvidia-cublas-cu11        11.10.3.66               pypi_0    pypi
nvidia-cuda-cupti-cu11    11.7.101                 pypi_0    pypi
nvidia-cuda-nvrtc-cu11    11.7.99                  pypi_0    pypi
nvidia-cuda-runtime-cu11  11.7.99                  pypi_0    pypi
nvidia-cudnn-cu11         8.5.0.96                 pypi_0    pypi
nvidia-cufft-cu11         10.9.0.58                pypi_0    pypi
nvidia-curand-cu11        10.2.10.91               pypi_0    pypi
nvidia-cusolver-cu11      11.4.0.1                 pypi_0    pypi
nvidia-cusparse-cu11      11.7.4.91                pypi_0    pypi
nvidia-nccl-cu11          2.14.3                   pypi_0    pypi
nvidia-nvtx-cu11          11.7.91                  pypi_0    pypi
openssl                   3.0.13               h7f8727e_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
packaging                 24.0                     pypi_0    pypi
pandas                    2.0.3                    pypi_0    pypi
peft                      0.9.0                    pypi_0    pypi
pillow                    10.2.0                   pypi_0    pypi
pip                       23.3.1           py38h06a4308_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
protobuf                  3.20.3                   pypi_0    pypi
psutil                    5.9.8                    pypi_0    pypi
py-cpuinfo                9.0.0                    pypi_0    pypi
pyarrow                   15.0.1                   pypi_0    pypi
pyarrow-hotfix            0.6                      pypi_0    pypi
pycparser                 2.21               pyhd3eb1b0_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
pydantic                  1.9.0                    pypi_0    pypi
pydantic-core             2.16.3                   pypi_0    pypi
pygments                  2.17.2                   pypi_0    pypi
pynvml                    11.5.0                   pypi_0    pypi
python                    3.8.18               h955ad1f_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
python-dateutil           2.9.0.post0              pypi_0    pypi
pytz                      2024.1                   pypi_0    pypi
pyyaml                    6.0.1            py38h5eee18b_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
readline                  8.2                  h5eee18b_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
regex                     2023.12.25               pypi_0    pypi
requests                  2.31.0                   pypi_0    pypi
rhash                     1.4.3                hdbd6064_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
rich                      13.7.1                   pypi_0    pypi
rouge-score               0.1.2                    pypi_0    pypi
safetensors               0.4.2                    pypi_0    pypi
sentencepiece             0.2.0                    pypi_0    pypi
setuptools                68.2.2           py38h06a4308_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
six                       1.16.0                   pypi_0    pypi
sqlite                    3.41.2               h5eee18b_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
sympy                     1.12                     pypi_0    pypi
tbb                       2021.8.0             hdb19cb5_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
tk                        8.6.12               h1ccaba5_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
tokenizers                0.15.2                   pypi_0    pypi
torch                     2.0.1                    pypi_0    pypi
torchtyping               0.1.4                    pypi_0    pypi
torchvision               0.15.2                   pypi_0    pypi
tqdm                      4.66.2                   pypi_0    pypi
transformers              4.36.0.dev0              pypi_0    pypi
triton                    2.0.0                    pypi_0    pypi
typeguard                 4.1.5                    pypi_0    pypi
typing                    3.10.0.0         py38h06a4308_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
typing-extensions         4.10.0                   pypi_0    pypi
tzdata                    2024.1                   pypi_0    pypi
urllib3                   2.2.1                    pypi_0    pypi
wheel                     0.41.2           py38h06a4308_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
xxhash                    3.4.1                    pypi_0    pypi
xz                        5.4.6                h5eee18b_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
yaml                      0.2.5                h7b6447c_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
yarl                      1.9.4                    pypi_0    pypi
zipp                      3.18.0                   pypi_0    pypi
zlib                      1.2.13               h5eee18b_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
zstd                      1.5.5                hc292b87_0    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main

why i have downloaded the package deepspeed, but it does not work

@t1101675
Copy link
Contributor

Is the environment distil activated with conda activate distil? Can you import deepspeed in the interactive environment after simply running python3?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants