
[Bug]: RuntimeError: "slow_conv2d_cpu" not implemented for 'Half' #15779

Open
3 of 6 tasks
631068264 opened this issue May 14, 2024 · 1 comment
Labels
bug-report Report of a bug, yet to be confirmed

Comments


Checklist

  • The issue exists after disabling all extensions
  • The issue exists on a clean installation of webui
  • The issue is caused by an extension, but I believe it is caused by a bug in the webui
  • The issue exists in the current version of the webui
  • The issue has not been reported before recently
  • The issue has been reported before but has not been fixed yet

What happened?

Similar error on v1.9.3 on an M3 Mac when using the CLIP interrogator.

pip list | grep torch
open-clip-torch           2.20.0
pytorch-lightning         1.9.4
torch                     2.1.0
torchdiffeq               0.2.3
torchmetrics              1.4.0
torchsde                  0.2.6
torchvision               0.16.0

error

Applying attention optimization: sub-quadratic... done.
Model loaded in 6.7s (load weights from disk: 0.5s, create model: 0.7s, apply weights to model: 4.1s, apply half(): 0.7s, calculate empty prompt: 0.6s).
load checkpoint from /Users/xxxx/project/stable-diffusion-webui/models/BLIP/model_base_caption_capfilt_large.pth
[W NNPACK.cpp:64] Could not initialize NNPACK! Reason: Unsupported hardware.
*** Error interrogating
    Traceback (most recent call last):
      File "/Users/xxxx/project/stable-diffusion-webui/modules/interrogate.py", line 203, in interrogate
        image_features = self.clip_model.encode_image(clip_image).type(self.dtype)
      File "/Users/xxxx/project/stable-diffusion-webui/venv/lib/python3.10/site-packages/clip/model.py", line 341, in encode_image
        return self.visual(image.type(self.dtype))
      File "/Users/xxxx/project/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
      File "/Users/xxxx/project/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
        return forward_call(*args, **kwargs)
      File "/Users/xxxx/project/stable-diffusion-webui/venv/lib/python3.10/site-packages/clip/model.py", line 224, in forward
        x = self.conv1(x)  # shape = [*, width, grid, grid]
      File "/Users/xxxx/project/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
      File "/Users/xxxx/project/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
        return forward_call(*args, **kwargs)
      File "/Users/xxxx/project/stable-diffusion-webui/modules/devices.py", line 164, in forward_wrapper
        result = self.org_forward(*args, **kwargs)
      File "/Users/xxxx/project/stable-diffusion-webui/extensions-builtin/Lora/networks.py", line 518, in network_Conv2d_forward
        return originals.Conv2d_forward(self, input)
      File "/Users/xxxx/project/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 460, in forward
        return self._conv_forward(input, self.weight, self.bias)
      File "/Users/xxxx/project/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 456, in _conv_forward
        return F.conv2d(input, weight, bias, self.stride,
    RuntimeError: "slow_conv2d_cpu" not implemented for 'Half'
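The traceback boils down to F.conv2d being called with fp16 tensors on the CPU, where PyTorch has no half-precision convolution kernel. A minimal sketch of the failure, and of the fp32 upcast that --no-half effectively applies (the layer sizes here are arbitrary, not taken from CLIP):

```python
import torch
import torch.nn as nn

# fp16 weights, as after model.half()
conv = nn.Conv2d(3, 8, kernel_size=3).half()
x = torch.randn(1, 3, 32, 32, dtype=torch.half)

try:
    conv(x)  # on CPU this can raise: "slow_conv2d_cpu" not implemented for 'Half'
except RuntimeError as e:
    print(e)

# Upcasting both weights and input to fp32 avoids the missing kernel
out = conv.float()(x.float())
print(out.dtype)  # torch.float32
```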

Steps to reproduce the problem

Use img2img, then run the CLIP interrogator on an image; the error above is raised.

What should have happened?

The interrogator should return a prompt for the image.

What browsers do you use to access the UI?

Google Chrome

Sysinfo

sysinfo-2024-05-14-03-07.json

Console logs

################################################################
Install script for stable-diffusion + Web UI
Tested on Debian 11 (Bullseye), Fedora 34+ and openSUSE Leap 15.4 or newer.
################################################################

################################################################
Running on xxxx user
################################################################

################################################################
Repo already cloned, using it as install directory
################################################################

################################################################
python venv already activate or run without venv: /Users/xxxx/project/stable-diffusion-webui/venv
################################################################

################################################################
Launching launch.py...
################################################################
Python 3.10.11 (main, Apr  7 2023, 07:33:46) [Clang 14.0.0 (clang-1400.0.29.202)]
Version: v1.9.3
Commit hash: 1c0a0c4c26f78c32095ebc7f8af82f5c04fca8c0
Launching Web UI with arguments: --enable-insecure-extension-access --skip-install --skip-version-check --lowvram --skip-torch-cuda-test --upcast-sampling --no-half-vae --use-cpu interrogate
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
    PyTorch 2.0.1 with CUDA None (you have 2.1.0)
    Python  3.10.11 (you have 3.10.11)
  Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
  Memory-efficient attention, SwiGLU, sparse and more won't be available.
  Set XFORMERS_MORE_DETAILS=1 for more details
No module 'xformers'. Proceeding without it.
Warning: caught exception 'Torch not compiled with CUDA enabled', memory monitor disabled
ControlNet preprocessor location: /Users/xxxx/project/stable-diffusion-webui/extensions/sd-webui-controlnet/annotator/downloads
2024-05-14 10:44:10,187 - ControlNet - INFO - ControlNet v1.1.448
Loading weights [6ce0161689] from /Users/xxxx/project/stable-diffusion-webui/models/Stable-diffusion/v1-5-pruned-emaonly.safetensors
2024-05-14 10:44:10,402 - ControlNet - INFO - ControlNet UI callback registered.
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Creating model from config: /Users/xxxx/project/stable-diffusion-webui/configs/v1-inference.yaml
/Users/xxxx/project/stable-diffusion-webui/venv/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Startup time: 8.4s (prepare environment: 0.2s, import torch: 2.9s, import gradio: 1.2s, setup paths: 1.3s, initialize shared: 0.3s, other imports: 1.3s, load scripts: 0.5s, create ui: 0.4s, gradio launch: 0.3s).
OMP: Warning #191: Forking a process while a parallel region is active is potentially unsafe.
OMP: Warning #191: Forking a process while a parallel region is active is potentially unsafe.
OMP: Warning #191: Forking a process while a parallel region is active is potentially unsafe.
OMP: Warning #191: Forking a process while a parallel region is active is potentially unsafe.
OMP: Warning #191: Forking a process while a parallel region is active is potentially unsafe.
OMP: Warning #191: Forking a process while a parallel region is active is potentially unsafe.
Applying attention optimization: sub-quadratic... done.
Model loaded in 9.3s (load weights from disk: 0.5s, create model: 0.7s, apply weights to model: 6.0s, apply half(): 1.1s, calculate empty prompt: 0.8s).
load checkpoint from /Users/xxxx/project/stable-diffusion-webui/models/BLIP/model_base_caption_capfilt_large.pth
[W NNPACK.cpp:64] Could not initialize NNPACK! Reason: Unsupported hardware.
*** Error interrogating
    Traceback (most recent call last):
      File "/Users/xxxx/project/stable-diffusion-webui/modules/interrogate.py", line 203, in interrogate
        image_features = self.clip_model.encode_image(clip_image).type(self.dtype)
      File "/Users/xxxx/project/stable-diffusion-webui/venv/lib/python3.10/site-packages/clip/model.py", line 341, in encode_image
        return self.visual(image.type(self.dtype))
      File "/Users/xxxx/project/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
      File "/Users/xxxx/project/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
        return forward_call(*args, **kwargs)
      File "/Users/xxxx/project/stable-diffusion-webui/venv/lib/python3.10/site-packages/clip/model.py", line 224, in forward
        x = self.conv1(x)  # shape = [*, width, grid, grid]
      File "/Users/xxxx/project/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
      File "/Users/xxxx/project/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
        return forward_call(*args, **kwargs)
      File "/Users/xxxx/project/stable-diffusion-webui/modules/devices.py", line 164, in forward_wrapper
        result = self.org_forward(*args, **kwargs)
      File "/Users/xxxx/project/stable-diffusion-webui/extensions-builtin/Lora/networks.py", line 518, in network_Conv2d_forward
        return originals.Conv2d_forward(self, input)
      File "/Users/xxxx/project/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 460, in forward
        return self._conv_forward(input, self.weight, self.bias)
      File "/Users/xxxx/project/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 456, in _conv_forward
        return F.conv2d(input, weight, bias, self.stride,
    RuntimeError: "slow_conv2d_cpu" not implemented for 'Half'

---

Additional information

No response

@631068264 631068264 added the bug-report Report of a bug, yet to be confirmed label May 14, 2024

zero41120 commented May 27, 2024

Here is what I did to make CLIP interrogation work on my M3 Mac:

  1. First, I encountered the same error as you, with "Half" in the error message. My assumption is that something in the default Mac configuration makes the models run in fp16 (half-precision floats) instead of fp32 (single precision). This is the Mac config that A1111 loads by default (in webui-macos-env.sh):
    export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --no-half-vae --use-cpu interrogate"
  2. I learned that the --no-half flag prevents the system from using fp16. So, following the instructions in the setup files, find the commented-out line:
    #export COMMANDLINE_ARGS=""
  3. Uncomment it and use these flags:
export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --no-half-vae --use-cpu interrogate --no-half"

This makes the interrogation work, and SD1.5 models still generate images fine.

However, on my 18 GB M3 Mac, loading my favorite SDXL model with --no-half pushes the Python process's memory usage from the usual ~10 GB to over 20 GB, which halts image generation. To use the SDXL model again, I have to remove the --no-half flag.


From my understanding, the M3 Mac uses unified memory, so regular RAM and VRAM share the same 18 GB. When total usage exceeds 18 GB, the system starts swapping data between RAM and disk. This is expensive, and since the SDXL Python process exceeds the 18 GB my Mac has, even though the request is received, the swapping makes image generation impossible to complete in a reasonable time.

It's likely that the A1111 developers could fix this easily by making the interrogator use fp32 by default in the Mac environment, regardless of the --no-half flag. The relevant check is this line (in modules/interrogate.py):

if not shared.cmd_opts.no_half and not self.running_on_cpu:
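The suggested fix amounts to keeping that condition false whenever the interrogator runs on the CPU. A standalone sketch of the predicate (the function name and boolean arguments are illustrative, not the webui's actual API):

```python
def should_use_half(no_half: bool, running_on_cpu: bool) -> bool:
    """Decide whether the interrogation models may be cast to fp16.

    CPU conv kernels have no fp16 implementation ("slow_conv2d_cpu" not
    implemented for 'Half'), so stay in fp32 whenever the interrogator
    runs on the CPU -- which --use-cpu interrogate forces on macOS.
    """
    return not no_half and not running_on_cpu

print(should_use_half(False, True))   # False: CPU interrogation, fp16 never used
print(should_use_half(True, False))   # False: --no-half set
print(should_use_half(False, False))  # True: GPU without --no-half
```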
