Much higher RAM usage (2-3 times) compared to FastSDCPU when using the exact same models/settings #261

Open
JohnAlcatraz opened this issue May 12, 2024 · 2 comments

JohnAlcatraz commented May 12, 2024

Currently, stable-diffusion.cpp seems to use far more RAM than https://github.com/rupeshs/fastsdcpu (written in Python) for the same result.

I compared the Dreamshaper LCM model + TAESD at 5 steps and a resolution of 512x512 in stable-diffusion.cpp vs. FastSDCPU, both running on the CPU.

The speed is identical between the two projects: I get ~4.4 s/it with both.

But stable-diffusion.cpp peaks at 2 GB of RAM (1.6 GB with flash attention enabled), while FastSDCPU peaks at only 700 MB. So stable-diffusion.cpp needs 2-3x more RAM for the same result.

It looks like significant optimizations should be possible in stable-diffusion.cpp that would make it much more memory efficient.

FSSRepo (Contributor) commented May 13, 2024

Currently, im2col is being used for convolutions, which consumes a very large amount of RAM during the VAE phase.

I have been working on a kernel that merges im2col and matrix multiplications to avoid materializing a lot of data in memory, although that entails a 40% performance reduction. So far, I am only doing this for CUDA; for CPU it will be more difficult and will likely have a negative impact on performance.
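
To make that concrete, here is a minimal back-of-the-envelope sketch (not code from this repo) of how large the materialized im2col buffer gets for a single full-resolution decoder convolution; the 128-channel, 3x3, 512x512 shape is an assumed example:

```cpp
// Rough illustration only: im2col unfolds every KxK input patch into a
// column, so the scratch buffer holds C_in * K * K values for each of the
// H_out * W_out output positions, regardless of the output channel count.
#include <cstdint>
#include <cstdio>

int main() {
    // Assumed shape of one late VAE-decoder conv running at full resolution.
    const int64_t c_in = 128, k = 3, h_out = 512, w_out = 512;

    const int64_t elems = c_in * k * k * h_out * w_out;  // ~302M values
    const double f32_gib = double(elems) * 4.0 / (1024.0 * 1024.0 * 1024.0);
    const double f16_gib = double(elems) * 2.0 / (1024.0 * 1024.0 * 1024.0);

    std::printf("im2col scratch: %lld elements (~%.2f GiB f32, ~%.2f GiB f16)\n",
                (long long)elems, f32_gib, f16_gib);

    // A kernel that fuses im2col with the matrix multiplication only keeps a
    // small tile of these columns live at a time, which is why it can trade
    // some speed for a much lower peak allocation.
    return 0;
}
```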

JohnAlcatraz (Author) commented May 13, 2024

> Currently, im2col is being used for convolutions, which consumes a very large amount of RAM during the VAE phase.

But I did my comparison with TAESD instead of the full VAE, so I think that means the VAE isn't used at all? TAESD is already very lightweight.
