
How to show the output results on a web service? Or how can I get the inference results for other applications? #126

xujiangyu opened this issue Jan 22, 2024 · 3 comments
Labels: question (Further information is requested)

Comments


xujiangyu commented Jan 22, 2024

Prerequisites

Before submitting your question, please ensure the following:

  • I am running the latest version of PowerInfer. Development is rapid, and as of now, there are no tagged versions.
  • I have carefully read and followed the instructions in the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).

Question Details

Please provide a clear and concise description of your question. If applicable, include steps to reproduce the issue or behaviors you've observed.

Additional Context

Please provide any additional information that may be relevant to your question, such as specific system configurations, environment details, or any other context that could be helpful in addressing your inquiry.

xujiangyu added the question (Further information is requested) label on Jan 22, 2024
hodlen (Collaborator) commented Jan 22, 2024

Hi @xujiangyu ! If you are referring to the examples/server application, you can access it by entering the server address (e.g., 127.0.0.1:8080) in your browser. This allows you to interact with the model via a simple UI and see the outputs. For more details, please refer to the server documentation. Additionally, all inference outputs from the server are also printed to stdout.
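
If you would rather query the server programmatically than through the browser UI, here is a minimal sketch, assuming the server keeps the upstream llama.cpp-style /completion endpoint and listens on 127.0.0.1:8080 (the prompt text is just an illustration; check the server README for the exact request fields):

```python
import requests

# Minimal sketch: POST a prompt to the running examples/server instance
# and print the generated text. Assumes the llama.cpp-style /completion
# endpoint; field names may differ depending on your build.
resp = requests.post(
    "http://127.0.0.1:8080/completion",
    json={
        "prompt": "Building a website can be done in 10 simple steps:",
        "n_predict": 128,  # number of tokens to generate
    },
    timeout=600,
)
resp.raise_for_status()
print(resp.json().get("content", ""))
```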

Most of the other example applications print their inference results to the command line. You can find usage instructions in the examples/[application] directory, where each application's README and source code are available.

xujiangyu (Author) replied:

Thank you for your reply. I'm wondering how to add background knowledge through the parameters, such as for a RAG flow. I checked the parameters of the main function and didn't see such a parameter.

hodlen (Collaborator) commented Jan 24, 2024

Adding background knowledge is an application-layer concern and amounts to injecting the relevant information into the prompt. This project focuses on LLM inference and doesn't provide built-in support for that.

I suggest using a wrapper such as the llama-cpp-python library (you can use our forked version here) or the server endpoint. You can then use any mainstream orchestration framework, such as LangChain, to build the RAG workflow.
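
As a rough illustration, here is a minimal RAG-style sketch with llama-cpp-python; the model path, retrieved snippet, and prompt template below are placeholders, and the same idea applies if you call the server endpoint or let a framework like LangChain handle the retrieval:

```python
from llama_cpp import Llama

# Load the model through the llama-cpp-python wrapper (path is a placeholder).
llm = Llama(model_path="/path/to/powerinfer-model.gguf", n_ctx=2048)

# "Background knowledge" is simply retrieved text injected into the prompt.
# In a real RAG flow, an orchestration framework would fetch these snippets
# from a vector store; here they are hard-coded for illustration.
retrieved_context = (
    "PowerInfer is a fast LLM inference engine that exploits activation sparsity."
)
question = "What makes PowerInfer fast?"

prompt = (
    "Use the following context to answer the question.\n"
    f"Context: {retrieved_context}\n"
    f"Question: {question}\n"
    "Answer:"
)

output = llm(prompt, max_tokens=128, stop=["\n"])
print(output["choices"][0]["text"].strip())
```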
