Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

用qwen-7b-int4和int8进行lora微调后,微调和推理没问题,但部署后,请求报错 #935

Open
nauyiahc opened this issue May 15, 2024 · 1 comment

Comments

@nauyiahc
Copy link

openai方式请求报错
image

Describe the bug
INFO: 127.0.0.1:35572 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 411, in run_asgi
result = await app( # type: ignore[func-returns-value]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 69, in call
return await self.app(scope, receive, send)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in call
await super().call(scope, receive, send)
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/starlette/applications.py", line 123, in call
await self.middleware_stack(scope, receive, send)
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in call
raise exc
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in call
await self.app(scope, receive, _send)
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 65, in call
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
raise exc
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
await app(scope, receive, sender)
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 756, in call
await self.middleware_stack(scope, receive, send)
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 776, in app
await route.handle(scope, receive, send)
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 297, in handle
await self.app(scope, receive, send)
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 77, in app
await wrap_app_handling_exceptions(app, request)(scope, receive, send)
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
raise exc
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
await app(scope, receive, sender)
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/starlette/routing.py", line 72, in app
response = await func(request)
^^^^^^^^^^^^^^^^^^^
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/fastapi/routing.py", line 278, in app
raw_response = await run_endpoint_function(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
return await dependant.call(**values)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/chaiy/model/SFT/swift/swift/llm/deploy.py", line 415, in create_chat_completion
return await inference_pt_async(request, raw_request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/chaiy/model/SFT/swift/swift/llm/deploy.py", line 405, in inference_pt_async
return await _generate_full()
^^^^^^^^^^^^^^^^^^^^^^
File "/home/chaiy/model/SFT/swift/swift/llm/deploy.py", line 329, in _generate_full
response, _ = inference(
^^^^^^^^^^
File "/home/chaiy/model/SFT/swift/swift/llm/utils/utils.py", line 743, in inference
generate_ids = model.generate(
^^^^^^^^^^^^^^^
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/peft/peft_model.py", line 1190, in generate
outputs = self.base_model.generate(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/chaiy/.cache/huggingface/modules/transformers_modules/Qwen-7B-Chat-Int8/modeling_qwen.py", line 1259, in generate
return super().generate(
^^^^^^^^^^^^^^^^^
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/transformers/generation/utils.py", line 1622, in generate
result = self._sample(
^^^^^^^^^^^^^
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/transformers/generation/utils.py", line 2791, in _sample
outputs = self(
^^^^^
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/chaiy/.cache/huggingface/modules/transformers_modules/Qwen-7B-Chat-Int8/modeling_qwen.py", line 1043, in forward
transformer_outputs = self.transformer(
^^^^^^^^^^^^^^^^^
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/chaiy/.cache/huggingface/modules/transformers_modules/Qwen-7B-Chat-Int8/modeling_qwen.py", line 891, in forward
outputs = block(
^^^^^^
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/chaiy/.cache/huggingface/modules/transformers_modules/Qwen-7B-Chat-Int8/modeling_qwen.py", line 610, in forward
attn_outputs = self.attn(
^^^^^^^^^^
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/chaiy/.cache/huggingface/modules/transformers_modules/Qwen-7B-Chat-Int8/modeling_qwen.py", line 416, in forward
mixed_x_layer = self.c_attn(hidden_states)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/chaiy/software/miniconda3/envs/swift/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1582, in _call_impl
result = forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: QuantLinear.forward() got an unexpected keyword argument 'adapter_names'

Your hardware and system info
torch==2.1.3
peft==0.10.0

Additional context
用sft.sh 和infer.sh都没问题

@nauyiahc
Copy link
Author

部署脚本如下:

CUDA_VISIBLE_DEVICES=0
swift deploy
--model_type qwen-7b-chat-int4
--ckpt_dir "/home/model/swift/work/qwen/output/qwen-7b-chat-int8/v0-20240515-125128/checkpoint-93"
--infer_backend 'pt'
--host "0.0.0.0"
--port 8000

因为是量化模型,所以我指定了--infer_backend 'pt',但是还是要让我安装vllm,我安装了vllm0.3.1的版本

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant