Fix for Nvidia installed deps detection algorithm in gpu.go #4106

Closed
wants to merge 2 commits into from


@alecvern alecvern commented May 2, 2024

Related to #3593, #4008

In brief: I hit a similar issue and, after a long analysis, traced the problem to the current gpu.go code.

Context:

  • Windows 11
  • Nvidia GPU
  • portable ollama build "ollama-windows-amd64.exe"

Currently, the code looks for the cudart64_*.dll library in the following locations:

  1. The folder where ollama is installed by the standard installer (C:\Users\%USERNAME%\AppData\Local\Programs\Ollama).
  2. The folder where the CUDA Toolkit is installed, hard-coded via
    var CudartWindowsGlobs = []string{ "c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v*\\bin\\cudart64_*.dll" }
  3. All folders listed in the Windows Path environment variable.

The problem is that among the paths in (3) there is a path to the PhysX installation folder (installed by default with Nvidia drivers).
Screenshot 2024-05-02 212136
The PhysX installation contains cudart64_*.dll, but two related libraries that ollama needs are missing: cublas and cublasLt.
As a result, the scanner picks up this folder every time a chat is started:

(when there is no other cudart on the system at all)

image

(and when everything is in place, among legitimate paths for cudart)

Screenshot 2024-05-02 233040

On most machines the problem is not obvious right now because:

  1. Most people install ollama through the installer, and the cublas path from the installed ollama folder takes priority over the one detected in the PhysX folder.
  2. Many people install the CUDA Toolkit, and ollama also treats this path as higher priority at startup.
  3. Some people (like me) use portable ollama and run it from the same folder where they keep their own copies of the Nvidia libraries. When portable ollama is run from a folder, that folder becomes the working directory of cmd.exe, which is automatically added to the environment variables during startup. This location also gets higher priority, I guess because the cudart library in the PhysX folder is usually newer than the one sitting next to portable ollama (ollama always tries the oldest version found first, so the PhysX copy is only tried later).

Everything would be fine for now if it were not for PhysX Legacy, which is installed together with some old games and applications.
physx legacy
In this case, portable ollama prioritizes the cudart from the PhysX Legacy folder even over the cudart in the folder the portable ollama .exe is running from. This leads to an unhandled error and a crash of the ollama server (the exact case of #3593): once it finds the cudart in the PhysX folder, it cannot discover the cublas and cublasLt libraries the LLM needs to work.

To summarize: the current implementation of gpu.go may backfire in the future. Even now it causes failures when three conditions are met:

  1. you use portable ollama (ollama-windows-amd64.exe) with the Nvidia libs alongside it in the same folder
  2. you have not installed the CUDA Toolkit
  3. you happen to have PhysX Legacy installed, or the current PhysX is installed along with the driver and, after some future update, ollama suddenly prioritizes an attempt to run through the cudart from the PhysX folder.

In any case, indexing the PhysX folder as if it legitimately contained the necessary Nvidia libraries is unintended behavior that leads to crashes when a chat session starts: two required libraries are missing in every version of PhysX.
The proposed code therefore checks for the presence of these libraries in the scanned folders, which also avoids similar cases unrelated to PhysX.


alecvern commented May 7, 2024

Obsolete after merging #4135.

@alecvern alecvern closed this May 7, 2024