
UNSLOTH NOT DETECTING CUDA AND "str2optimizer32bit" #412

Open
glitch047 opened this issue May 2, 2024 · 2 comments
glitch047 commented May 2, 2024

I'm facing issues with UNSLOTH not detecting CUDA and encountering a "str2optimizer32bit" error. My setup includes an HP Z4 workstation with an Intel Core i7 processor and an NVIDIA 1080Ti GPU, running Ubuntu 22.04. CUDA version is 12.1, and PyTorch version is 2.3.0. Libnccl version is 2.18.3.

I've compiled the bitsandbytes library from source using the following steps:

git clone https://github.com/TimDettmers/bitsandbytes.git && cd bitsandbytes/
pip install -r requirements-dev.txt
cmake -DCOMPUTE_BACKEND=cuda -S .
make
pip install .
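After the build, a quick way to confirm the resulting shared object is loadable at all is the same `ctypes` call bitsandbytes itself uses. This is an illustrative helper (`can_load` is not part of any library), run against whatever `.so` path the build produced:

```python
import ctypes

def can_load(so_path: str) -> bool:
    """Return True if the shared object at so_path can be dlopen'ed.

    This mirrors the ctypes.CDLL call bitsandbytes performs internally;
    an OSError here means the loader cannot find or open the binary.
    """
    try:
        ctypes.CDLL(so_path)
        return True
    except OSError:
        return False

# e.g. can_load("bitsandbytes/libbitsandbytes_cuda121.so")
print(can_load("/definitely/not/here.so"))  # a missing file -> False
```

If this returns False for the freshly built binary, the problem is in the build or its runtime dependencies, not in Unsloth.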

I installed UNSLOTH with:

conda create --name unsloth_env python=3.10
conda activate unsloth_env
conda install pytorch-cuda=12.1 pytorch cudatoolkit xformers -c pytorch -c nvidia -c xformers
pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
pip install --no-deps trl peft accelerate bitsandbytes

Additionally, I updated my ~/.bashrc:

export BNB_CUDA_VERSION=121
# Add CUDA to PATH and LD_LIBRARY_PATH
export PATH=/usr/local/cuda-12.1/bin${PATH:+:$PATH}
export LD_LIBRARY_PATH=/usr/local/cuda-12.1/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
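The `${VAR:+:$VAR}` idiom in those exports prepends the CUDA directory and only appends a `:` separator when the variable already has a value, so an unset variable does not leave a trailing colon. A small Python sketch of the same logic (the helper name is mine, for illustration only):

```python
def prepend_path(new_dir, current):
    """Mimic bash's new_dir${VAR:+:$VAR}: prepend new_dir, adding a ':'
    separator only when the existing value is non-empty."""
    return new_dir + (":" + current if current else "")

print(prepend_path("/usr/local/cuda-12.1/bin", "/usr/bin"))
# -> /usr/local/cuda-12.1/bin:/usr/bin
print(prepend_path("/usr/local/cuda-12.1/bin", None))
# -> /usr/local/cuda-12.1/bin
```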

The Python site-packages directory contains libbitsandbytes_cuda121.so but not the CPU one.
The problem arises when running `python -m bitsandbytes`, which fails with the error below. The same error appears when running Unsloth.

    WARNING: BNB_CUDA_VERSION=121 environment variable detected; loading libbitsandbytes_cuda121_nocublaslt121.so.
This can be used to load a bitsandbytes version that is different from the PyTorch CUDA version.
If this was unintended set the BNB_CUDA_VERSION variable to an empty string: export BNB_CUDA_VERSION=
If you use the manual override make sure the right libcudart.so is in your LD_LIBRARY_PATH
For example by adding the following to your .bashrc: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<path_to_cuda_dir/lib64

Could not find the bitsandbytes CUDA binary at PosixPath('/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda121_nocublaslt121.so')
Could not load bitsandbytes native library: /home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so: cannot open shared object file: No such file or directory
Traceback (most recent call last):
  File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 109, in <module>
    lib = get_native_library()
  File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 96, in get_native_library
    dll = ct.cdll.LoadLibrary(str(binary_path))
  File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/ctypes/__init__.py", line 452, in LoadLibrary
    return self._dlltype(name)
  File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so: cannot open shared object file: No such file or directory

CUDA Setup failed despite CUDA being available. Please run the following command to get more information:

python -m bitsandbytes

Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++ BUG REPORT INFORMATION ++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++ OTHER +++++++++++++++++++++++++++
CUDA specs: CUDASpecs(highest_compute_capability=(6, 1), cuda_version_string='121', cuda_version_tuple=(12, 1))
PyTorch settings found: CUDA_VERSION=121, Highest Compute Capability: (6, 1).
WARNING: BNB_CUDA_VERSION=121 environment variable detected; loading libbitsandbytes_cuda121_nocublaslt121.so.
This can be used to load a bitsandbytes version that is different from the PyTorch CUDA version.
If this was unintended set the BNB_CUDA_VERSION variable to an empty string: export BNB_CUDA_VERSION=
If you use the manual override make sure the right libcudart.so is in your LD_LIBRARY_PATH
For example by adding the following to your .bashrc: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<path_to_cuda_dir/lib64

Library not found: /home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda121_nocublaslt121.so. Maybe you need to compile it from source?
If you compiled from source, try again with `make CUDA_VERSION=DETECTED_CUDA_VERSION`,
for example, `make CUDA_VERSION=113`.

The CUDA version for the compile might depend on your conda install, if using conda.
Inspect CUDA version via `conda list | grep cuda`.
To manually override the PyTorch CUDA version please see: https://github.com/TimDettmers/bitsandbytes/blob/main/docs/source/nonpytorchcuda.mdx
WARNING: Compute capability < 7.5 detected! Only slow 8-bit matmul is supported for your GPU!
If you run into issues with 8-bit matmul, you can try 4-bit quantization:
https://huggingface.co/blog/4bit-transformers-bitsandbytes
Found duplicate CUDA runtime files (see below).

We select the PyTorch default CUDA runtime, which is 12.1,
but this might mismatch with the CUDA version that is needed for bitsandbytes.
To override this behavior set the `BNB_CUDA_VERSION=<version string, e.g. 122>` environmental variable.

For example, if you want to use the CUDA version 122,
    BNB_CUDA_VERSION=122 python ...

OR set the environmental variable in your .bashrc:
    export BNB_CUDA_VERSION=122

In the case of a manual override, make sure you set LD_LIBRARY_PATH, e.g.
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.2,
* Found CUDA runtime at: /usr/local/cuda-12.1/lib64/libcudart.so
* Found CUDA runtime at: /usr/local/cuda-12.1/lib64/libcudart.so.12
* Found CUDA runtime at: /usr/local/cuda-12.1/lib64/libcudart.so.12.1.105
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++ DEBUG INFO END ++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Checking that the library is importable and CUDA is callable...
Couldn't load the bitsandbytes library, likely due to missing binaries.
Please ensure bitsandbytes is properly installed.

For source installations, compile the binaries with `cmake -DCOMPUTE_BACKEND=cuda -S .`.
See the documentation for more details if needed.

Trying a simple check anyway, but this will likely fail...
Traceback (most recent call last):
  File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/diagnostics/main.py", line 66, in main
    sanity_check()
  File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/diagnostics/main.py", line 40, in sanity_check
    adam.step()
  File "/home/llm/.local/lib/python3.10/site-packages/torch/optim/optimizer.py", line 391, in wrapper
    out = func(*args, **kwargs)
  File "/home/llm/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/optim/optimizer.py", line 287, in step
    self.update_step(group, p, gindex, pindex)
  File "/home/llm/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/optim/optimizer.py", line 496, in update_step
    F.optimizer_update_32bit(
  File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/functional.py", line 1584, in optimizer_update_32bit
    optim_func = str2optimizer32bit[optimizer_name][0]
NameError: name 'str2optimizer32bit' is not defined
Above we output some debug information.
Please provide this info when creating an issue via https://github.com/TimDettmers/bitsandbytes/issues/new/choose
WARNING: Please be sure to sanitize sensitive info from the output before posting it.
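Note the filename in the warning above: `libbitsandbytes_cuda121_nocublaslt121.so` has the version digits twice. That is consistent with the `BNB_CUDA_VERSION` override splicing its value into the stem of the default library name (which already contains `cuda121`, plus `_nocublaslt` because compute capability is below 7.5). An illustrative sketch of that name construction, not bitsandbytes' exact code:

```python
from pathlib import Path

def override_library_name(default_name, bnb_cuda_version):
    """Sketch: splice the BNB_CUDA_VERSION digits into the stem of the
    default binary name, as the doubled '121' suffix in the log suggests."""
    p = Path(default_name)
    return p.stem + bnb_cuda_version + p.suffix

# Default binary for CUDA 12.1 on a compute-capability-6.1 GPU:
default = "libbitsandbytes_cuda121_nocublaslt.so"
print(override_library_name(default, "121"))
# -> libbitsandbytes_cuda121_nocublaslt121.so  (the file that does not exist)
```

Since the PyTorch CUDA version here is already 12.1, the override is redundant, and `export BNB_CUDA_VERSION=` (empty, as the warning itself suggests) avoids the doubled suffix entirely.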

The code I ran was from Unsloth's GitHub repo (the free Llama 3 Colab notebook), and it gave the following error:

WARNING: BNB_CUDA_VERSION=121 environment variable detected; loading libbitsandbytes_cuda121_nocublaslt121.so.
This can be used to load a bitsandbytes version that is different from the PyTorch CUDA version.
If this was unintended set the BNB_CUDA_VERSION variable to an empty string: export BNB_CUDA_VERSION=
If you use the manual override make sure the right libcudart.so is in your LD_LIBRARY_PATH
For example by adding the following to your .bashrc: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<path_to_cuda_dir/lib64

Could not find the bitsandbytes CUDA binary at PosixPath('/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda121_nocublaslt121.so')
Could not load bitsandbytes native library: /home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so: cannot open shared object file: No such file or directory
Traceback (most recent call last):
  File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 109, in <module>
    lib = get_native_library()
  File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 96, in get_native_library
    dll = ct.cdll.LoadLibrary(str(binary_path))
  File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/ctypes/__init__.py", line 452, in LoadLibrary
    return self._dlltype(name)
  File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so: cannot open shared object file: No such file or directory

CUDA Setup failed despite CUDA being available. Please run the following command to get more information:

python -m bitsandbytes

Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues

/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/__init__.py:72: UserWarning: Unsloth: Running `ldconfig /usr/lib64-nvidia` to link CUDA.
  warnings.warn(
/sbin/ldconfig.real: Can't create temporary cache file /etc/ld.so.cache~: Permission denied
/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/__init__.py:103: UserWarning: Unsloth: CUDA is not linked properly.
Try running `python -m bitsandbytes` then `python -m xformers.info`
We tried running `ldconfig /usr/lib64-nvidia` ourselves, but it didn't work.
You need to run in your terminal `sudo ldconfig /usr/lib64-nvidia` yourself, then import Unsloth.
Also try `sudo ldconfig /usr/local/cuda-xx.x` - find the latest cuda version.
Unsloth will still run for now, but maybe it might crash - let's hope it works!
  warnings.warn(
Traceback (most recent call last):
  File "/mnt/ssd/unsloth/unsloth2.py", line 1, in <module>
    from unsloth import FastLanguageModel
  File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/__init__.py", line 113, in <module>
    from .models import *
  File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/models/__init__.py", line 15, in <module>
    from .loader import FastLanguageModel
  File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/models/loader.py", line 15, in <module>
    from .llama import FastLlamaModel, logger
  File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/models/llama.py", line 26, in <module>
    from ..kernels import *
  File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/kernels/__init__.py", line 15, in <module>
    from .cross_entropy_loss import fast_cross_entropy_loss
  File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/kernels/cross_entropy_loss.py", line 18, in <module>
    from .utils import calculate_settings, MAX_FUSED_SIZE
  File "/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/kernels/utils.py", line 36, in <module>
    cdequantize_blockwise_fp32      = bnb.functional.lib.cdequantize_blockwise_fp32
AttributeError: 'NoneType' object has no attribute 'cdequantize_blockwise_fp32'

How can I address and fix this issue?

Expected behavior: training of Llama 3 should have started, and bitsandbytes should match the installed torch and CUDA versions.

@sage-khan

Pay heed to the error which states:

Could not find the bitsandbytes CUDA binary at PosixPath('/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda121_nocublaslt121.so')
Could not load bitsandbytes native library: /home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so: cannot open shared object file: No such file or directory

You will need to find the files libbitsandbytes_cuda121_nocublaslt121.so and libbitsandbytes_cpu.so and place them in /home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/.

You may find libbitsandbytes_cuda121.so in /home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/. You can rename it to libbitsandbytes_cuda121_nocublaslt121.so.

mv /home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda121.so /home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda121_nocublaslt121.so

After that it may raise an error about a missing cudart module. For that, simply run:

pip3 install cudart cuda-python==12.1.0
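A slightly safer variant of the rename above is to copy rather than move, so the binary stays available under its original name as well. A small illustrative script (the helper name is mine; the site-packages path is the one from the traceback):

```python
import shutil
from pathlib import Path

def alias_bnb_binary(bnb_dir: Path, src_name: str, dst_name: str) -> Path:
    """Copy the built CUDA binary under the name the loader is searching
    for, keeping the original file intact (unlike mv)."""
    src, dst = bnb_dir / src_name, bnb_dir / dst_name
    if not dst.exists():
        shutil.copy2(src, dst)
    return dst

# Example usage with the paths from this issue:
# alias_bnb_binary(
#     Path("/home/llm/miniconda3/envs/unsloth_env/lib/python3.10/site-packages/bitsandbytes"),
#     "libbitsandbytes_cuda121.so",
#     "libbitsandbytes_cuda121_nocublaslt121.so",
# )
```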

@danielhanchen
Contributor

Wait, GTX 1080s might not work anymore, although I cannot confirm.

You first need to successfully install bitsandbytes in a standalone way, i.e.:

pip install bitsandbytes

then run python -m bitsandbytes

Once bitsandbytes is installed, Unsloth can work. I'm suspecting GTX 1080s might not work anymore :(
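The diagnostic output earlier in this thread makes the suspicion concrete: the reported compute capability is (6, 1) for the 1080 Ti, and bitsandbytes warns that below 7.5 only the slow 8-bit matmul path is available. A tiny sketch of that threshold check (the function is illustrative, not part of bitsandbytes):

```python
def bnb_support_tier(cc):
    """Rough guide from the bitsandbytes diagnostic in this thread:
    GPUs below compute capability 7.5 only get the slow 8-bit matmul path."""
    return "full (int8 + 4-bit)" if cc >= (7, 5) else "limited: slow 8-bit matmul only"

print(bnb_support_tier((6, 1)))  # GTX 1080 Ti, as reported in the log
# -> limited: slow 8-bit matmul only
print(bnb_support_tier((8, 6)))  # e.g. an RTX 30-series card
# -> full (int8 + 4-bit)
```

On a live system, `torch.cuda.get_device_capability()` reports the same (major, minor) tuple used here.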
