Multiple issues with setting up the text chatbot service on SPR #1300
@tbykowsk:
Thank you @lvliang-intel, I can confirm that issues 1-7 for the instruction notebooks/setup_text_chatbot_service_on_spr.ipynb are resolved. However, the instruction lacks information on where to execute the step "Startup the backend server". A line could be added stating that the server has to be run from the main directory of the repository (

========================================================================

I also followed the quick start example you referred to, focusing on the steps for CPU. Sections "1. Setup Environment" and "2. Run the chatbot in command mode" executed without any issues.
Output from the server:
However, the next step "3.1.2 Test request command at client side" executed correctly. The main issue I noticed with this quick start example is that NeuralChat replies over 10 times more slowly than the NeuralChat from the previous instruction, notebooks/setup_text_chatbot_service_on_spr.ipynb. Both times I used the same question:
and received very similar answers. The NeuralChat from the quick start example took around 13 minutes to generate a response, whereas the one from the previous instruction took only around 50 seconds. I ran this experiment on the same platform. Log from the quick start example:
Log from the previous instruction:
Please notice the time difference between the question and the reply. Are there any optimizations missing from the quick start example?

========================================================================

I have also read the main NeuralChat README for completeness. In the step "Launch OpenAI-compatible Service" I was not able to run
I received the same error while running the example Python Code from
The Python code only worked when executed from the root directory of the repository ( Is there a configuration step missing in this README (analogously to the first instruction) regarding where the
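For reference, an OpenAI-compatible service expects the standard chat-completions request shape. Below is a minimal client sketch using only the Python standard library; the URL, port, and model name are assumptions for illustration, not values taken from the README:

```python
import json
from urllib import request


def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def send(url: str, payload: dict) -> str:
    """POST the payload to a running server and return the raw response body."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return resp.read().decode()


if __name__ == "__main__":
    payload = build_chat_request("Intel/neural-chat-7b-v3-1", "Tell me about Intel Xeon.")
    print(json.dumps(payload, indent=2))
    # Only uncomment with a server running; host/port are assumptions:
    # print(send("http://localhost:8000/v1/chat/completions", payload))
```

Running such a script from different directories is a quick way to confirm whether the failure is really path-dependent or caused by the request itself.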
Quick update: I have taken yesterday's release, v1.4.1, and can confirm that both issues with paths are resolved (for the instruction notebooks/setup_text_chatbot_service_on_spr.ipynb and for the main NeuralChat README). The performance issue with the quick start example is also fixed. The only remaining problem is in step "3.1.1 Verify the client connection to server is OK", unless receiving
@lvliang-intel:
Hi @tbykowsk,

Further explanation
Hi,
I have followed this instruction, intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/docs/notebooks/setup_text_chatbot_service_on_spr.ipynb at main · intel/intel-extension-for-transformers (github.com), and written down a couple of issues with potential solutions, which you may want to consider implementing.
I am using Ubuntu 22.04 LTS and Python 3.10.12.
```
!git clone https://github.com/intel/intel-extension-for-transformers.git
```
The instruction says to use HEAD on the master branch, even though the repository is being actively developed. This causes problems like 422 Unprocessable Entity using Neural Chat via OpenAI interface with meta-llama/llama-2-7b-chat-hf · Issue #1288 · intel/intel-extension-for-transformers (github.com).

I also encountered the aforementioned issue, and then decided to use the latest release, which is v1.3.1. It would be useful to add to the instruction the commit/release it was validated with.
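Pinning the clone to the validated tag would make the walkthrough reproducible. A sketch of that workflow, demonstrated on a throwaway local repository (the tag name v1.3.1 is the release mentioned above; the temporary repo is purely for illustration):

```shell
set -e
tmp=$(mktemp -d)

# Stand-in for the upstream repository, with a tagged release.
git init -q "$tmp/upstream"
git -C "$tmp/upstream" -c user.email=a@b -c user.name=x commit -q --allow-empty -m "init"
git -C "$tmp/upstream" tag v1.3.1

# The pattern the instruction could document: clone, then check out the
# release tag the notebook was validated with.
git clone -q "$tmp/upstream" "$tmp/clone"
git -C "$tmp/clone" checkout -q v1.3.1
git -C "$tmp/clone" describe --tags   # prints: v1.3.1

rm -rf "$tmp"
```

For the real repository the clone line would simply use the GitHub URL from the instruction instead of the local path.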
I continued with v1.3.1 from this point on.
The requirements.txt installs `torch==2.1.0`, but once the backend server is started, it complains about torch compatibility:

The work-around is to reinstall torch and its dependencies manually to get compatible versions:
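As a pre-flight sanity check before starting the backend, the installed version can be compared against the pin. A sketch using only the standard library; the `check_pin` helper is hypothetical, and the pin string mirrors the requirements entry above:

```python
from importlib import metadata


def check_pin(package: str, pinned: str) -> str:
    """Compare the installed version of `package` against a `==` pin.

    Returns a human-readable verdict instead of raising, so it can be
    printed as a pre-flight check before launching the server.
    """
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package} is not installed (requirements pin it to {pinned})"
    if installed == pinned:
        return f"{package} {installed} matches the pin"
    return f"{package} {installed} differs from the pinned {pinned}"


print(check_pin("torch", "2.1.0"))
```

Running this right after `pip install -r requirements.txt` would surface the mismatch before the server's own compatibility error does.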
```
!pip install nest_asyncio
```

It is a bit confusing that `nest_asyncio` has to be installed manually and is not added to requirements.txt for the backend.

```
!pip install -r ./examples/deployment/textbot/frontend/requirements.txt
```
This requirements.txt again installs `torch==2.1.0`, which makes the backend unusable. Please consider using compatible packages for both components, or suggest in the instruction creating a separate Python virtual environment for each component if the same host is used.

```
!nohup python app.py &
```
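The separate-environment suggestion above can be sketched as follows; the directory names are illustrative, and `--without-pip` is used only to keep the sketch fast (a real setup would omit it so pip is available in each environment):

```shell
set -e
base=$(mktemp -d)

# One virtual environment per component, so the backend's and the
# frontend's conflicting torch pins cannot clobber each other.
python3 -m venv --without-pip "$base/backend-env"
python3 -m venv --without-pip "$base/frontend-env"

# In a real setup (without --without-pip) each component would then be
# installed into its own environment, e.g.:
#   "$base/backend-env/bin/pip"  install -r requirements.txt
#   "$base/frontend-env/bin/pip" install -r ./examples/deployment/textbot/frontend/requirements.txt

ls "$base/backend-env/bin/python" "$base/frontend-env/bin/python"
rm -rf "$base"
```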
There is an issue with `fastchat.utils` when starting app.py:

I have worked around this by removing the reference to `violates_moderation` in app.py, but you may want to investigate the problem further.

```
!nohup python app.py &
```
There is an issue with the `gradio` package which occurs when a NeuralChat URL is loaded in a browser:

The solution is to update `gradio` to at least version 3.50.2. The incompatible version is installed in the source code of app.py, so this line has to be changed:
```
!nohup python app.py &
```
After the NeuralChat URL successfully loads in the browser, the chat replies only with:

It is caused by an error in the backend:

To fix this, one may add `neural_speed` to intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/requirements.txt, or install the package manually.

The frontend works with `Intel/neural-chat-7b-v3-1`
and `meta-llama/Llama-2-7b-chat-hf`, but fails with, for example, `meta-llama/Llama-2-13b-chat-hf`:

The backend seems to load `meta-llama/Llama-2-13b-chat-hf` correctly. Maybe enabling other models from the same family in the frontend would not require many changes.

Thank you for taking the time to read through all this text :)