
Local deployment: model not found #107

Open
Roy202307 opened this issue Jul 27, 2023 · 3 comments
Labels
bug Something isn't working

Comments

Roy202307 commented Jul 27, 2023

(venv) PS D:\python\LangChain-ChatGLM-Webui-master> python app.py
No sentence-transformers model found with name C:\Users\Administrator/.cache\torch\sentence_transformers\GanymedeNil_text2vec-base-chinese. Creating a new one with MEAN pooling.
No sentence-transformers model found with name D:\python\LangChain-ChatGLM-Webui-master\model_cache\GanymedeNil/text2vec-base-chinese\GanymedeNil_text2vec-base-chinese. Creating a new one with MEAN pooling.
Symbol nvrtcGetCUBIN not found in D:\NAVDIA_GPU\Toolkit\CUDA\v11.0\bin\nvrtc64_110_0.dll
Symbol nvrtcGetCUBINSize not found in D:\NAVDIA_GPU\Toolkit\CUDA\v11.0\bin\nvrtc64_110_0.dll
Symbol cudaLaunchKernel not found in C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common\cudart64_65.dll
No compiled kernel found.
Compiling kernels : C:\Users\Administrator.cache\huggingface\modules\transformers_modules\THUDM\chatglm-6b-int4\6c5205c47d0d2f7ea2e44715d279e537cae0911f\quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 C:\Users\Administrator.cache\huggingface\modules\transformers_modules\THUDM\chatglm-6b-int4\6c5205c47d0d2f7ea2e44715d279e537cae0911f\quantization_kernels_parallel.c -shared -o C:\Users\Administrator.cache\huggingface\modules\transformers_modules\THUDM\chatglm-6b-int4\6c5205c47d0d2f7ea2e44715d279e537cae0911f\quantization_kernels_parallel.so
'gcc' is not recognized as an internal or external command, operable program or batch file.
Compile default cpu kernel failed, using default cpu kernel code.
Compiling gcc -O3 -fPIC -std=c99 C:\Users\Administrator.cache\huggingface\modules\transformers_modules\THUDM\chatglm-6b-int4\6c5205c47d0d2f7ea2e44715d279e537cae0911f\quantization_kernels.c -shared -o C:\Users\Administrator.cache\huggingface\modules\transformers_modules\THUDM\chatglm-6b-int4\6c5205c47d0d2f7ea2e44715d279e537cae0911f\quantization_kernels.so
'gcc' is not recognized as an internal or external command, operable program or batch file.
Compile default cpu kernel failed.
Failed to load kernel.
Cannot load cpu kernel, don't use quantized model on cpu.
Using quantization cache
Applying quantization to glm layers
The dtype of attention mask (torch.int64) is not bool
Running on local URL: http://0.0.0.0:7860
Running on public URL: https://6d7cff42c16ffdbf72.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades (NEW!), check out Spaces: https://huggingface.co/spaces

====================================================================

The above is what is printed after running python app.py.
When I open the public URL, the page shows: "Model failed to load, please re-select a model and click the 'Reload model' button."
Without changing anything, I click the "Reload model" button,
and the console prints another long series of messages:
No sentence-transformers model found with name C:\Users\Administrator/.cache\torch\sentence_transformers\GanymedeNil_text2vec-base-chinese. Creating a new one with MEAN pooling.
No sentence-transformers model found with name D:\python\LangChain-ChatGLM-Webui-master\model_cache\GanymedeNil/text2vec-base-chinese\GanymedeNil_text2vec-base-chinese. Creating a new one with MEAN pooling.
No compiled kernel found.
Compiling kernels : C:\Users\Administrator.cache\huggingface\modules\transformers_modules\THUDM\chatglm-6b-int4\6c5205c47d0d2f7ea2e44715d279e537cae0911f\quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 C:\Users\Administrator.cache\huggingface\modules\transformers_modules\THUDM\chatglm-6b-int4\6c5205c47d0d2f7ea2e44715d279e537cae0911f\quantization_kernels_parallel.c -shared -o C:\Users\Administrator.cache\huggingface\modules\transformers_modules\THUDM\chatglm-6b-int4\6c5205c47d0d2f7ea2e44715d279e537cae0911f\quantization_kernels_parallel.so
'gcc' is not recognized as an internal or external command, operable program or batch file.
Compile default cpu kernel failed, using default cpu kernel code.
Compiling gcc -O3 -fPIC -std=c99 C:\Users\Administrator.cache\huggingface\modules\transformers_modules\THUDM\chatglm-6b-int4\6c5205c47d0d2f7ea2e44715d279e537cae0911f\quantization_kernels.c -shared -o C:\Users\Administrator.cache\huggingface\modules\transformers_modules\THUDM\chatglm-6b-int4\6c5205c47d0d2f7ea2e44715d279e537cae0911f\quantization_kernels.so
'gcc' is not recognized as an internal or external command, operable program or batch file.
Compile default cpu kernel failed.
Failed to load kernel.
Cannot load cpu kernel, don't use quantized model on cpu.
Using quantization cache
Applying quantization to glm layers

However, the page then says "Model reloaded successfully, you can start chatting."
But as soon as I type a message to start a conversation, the page shows ERROR,
and the console prints:
RuntimeError: Error in __cdecl faiss::FileIOReader::FileIOReader(const char *) at D:\a\faiss-wheels\faiss-wheels\faiss\faiss\impl\io.cpp:68: Error: 'f' failed: could not open faiss_index\index.faiss for reading: No such file or directory

What could be causing this?
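(Editor's note: the RuntimeError above is FAISS failing to open faiss_index\index.faiss, i.e. the vector index file does not exist yet. A minimal stdlib-only diagnostic, using the relative path taken from the traceback, could look like this; the "build the index first" hint is an assumption about the app's workflow, not output from the project itself:)

```python
# Quick diagnostic for the RuntimeError above: FAISS cannot open
# faiss_index/index.faiss because the file was never created.
# Stdlib-only sketch; the relative path comes from the traceback.
from pathlib import Path

index_file = Path("faiss_index") / "index.faiss"
if index_file.exists():
    print(f"Found {index_file} ({index_file.stat().st_size} bytes)")
else:
    # The index file is typically written only after a knowledge-base
    # document has been uploaded and vectorized successfully; chatting
    # before that step triggers the FileIOReader error in the console.
    print(f"Missing {index_file}: build the knowledge-base index first.")
```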


snuffcn commented Sep 21, 2023

Same situation here.

@thomas-yanxin thomas-yanxin added the bug Something isn't working label Jan 8, 2024

Yanllan commented Jan 15, 2024

From your error output, the model is running on the CPU. If you want to run ChatGLM-6B on the CPU, follow these steps:
1. Install a gcc compiler.
To run on the CPU, both gcc and OpenMP must be installed.
2. Modify the configuration.
In ChatGLM-6B/quantization.py, comment out from cpm_kernels.kernels.base import LazyKernelCModule, KernelFunction, round_up; also comment out kernels = Kernel(…) and replace it with kernels = CPUKernel().
Delete the already-cached files under the .cache directory, e.g. in your case C:\Users\Administrator.cache\huggingface\modules\transformers_modules\THUDM\chatglm-6b-int4\6c5205c47d0d2f7ea2e44715d279e537cae0911f\quantization_kernels_
Finally, in cli_demo.py change model = AutoModel.from_pretrained("THUDM\ChatGLM-6B", trust_remote_code=True).half().cuda() to model = AutoModel.from_pretrained("THUDM\ChatGLM-6B", trust_remote_code=True).float()
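(Editor's note: the final cli_demo.py edit above can be sketched as follows. This is a hypothetical illustration, not code from the repo: the helper only assembles the relevant source line as a string, so it runs without torch, transformers, or a GPU, while the AutoModel call it emits mirrors the one edited by hand above.)

```python
# Hypothetical sketch of the cli_demo.py edit described above.
# It builds the loading line as text; the call itself mirrors cli_demo.py.
def from_pretrained_line(use_gpu: bool) -> str:
    base = 'AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)'
    # GPU path: fp16 weights moved to CUDA. CPU path: full fp32, because
    # the quantized CPU kernels need gcc/OpenMP, which failed to compile
    # in the log above.
    suffix = ".half().cuda()" if use_gpu else ".float()"
    return "model = " + base + suffix

print(from_pretrained_line(False))
```

The point of the .float() variant is that fp16 (half precision) is only supported on CUDA devices; on the CPU the model must stay in fp32.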


Roy202307 commented Jan 24, 2024 via email
