chat2db-sqlcoder-deploy

Languages: English | 中文

📖 Introduction

This project shows how to deploy the 8-bit quantized sqlcoder model on Alibaba Cloud for free and how to use the deployed model from the Chat2DB client.

Note: sqlcoder is built mainly for SQL generation, so it performs better at natural-language-to-SQL than at SQL explanation, optimization, or conversion. Treat its output as a reference only; neither the model nor the product guarantees that the result is correct.

📦 Hardware Requirements

| Model | Minimum GPU Memory (Inference) | Minimum GPU Memory (Efficient Tuning) |
|---|---|---|
| sqlcoder-int8 | 20GB | 20GB |
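
As a rough sanity check on the figures above, the memory taken by the weights alone scales with parameter count times bytes per parameter. The sketch below assumes about 15 billion parameters for sqlcoder (a figure from the model's public description, not from this repository) and ignores activations, the KV cache, and framework overhead, all of which come on top of the weights.

```python
# Rough weight-memory estimate; ignores activations, KV cache and CUDA overhead.
ASSUMED_PARAMS = 15_000_000_000  # assumption: ~15B parameters for sqlcoder


def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Return the memory needed for the weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    for label, bits in (("int8", 8), ("fp16", 16)):
        gb = weight_memory_gb(ASSUMED_PARAMS, bits)
        print(f"{label}: ~{gb:.0f} GB of weights")
```

Under that assumption, the fp16 weights need about twice the memory of the int8 weights, so a GPU that just fits the 8-bit model will likely not fit the non-quantized one.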

📦 Deployment

📦 Deploy 8-bit model on Alibaba Cloud DSW

Apply for a free trial of Alibaba Cloud DSW.
Create a DSW instance, select a resource group whose usage can be deducted from the free resource package, and select the instance image pytorch:1.12-gpu-py39-cu113-ubuntu20.04.
Install the dependencies listed in requirements.txt:
```
pip install -r requirements.txt
```
Download the latest bitsandbytes package to support 8-bit models:
```
pip install -i https://test.pypi.org/simple/ bitsandbytes
```
Under /mnt/workspace on the DSW instance, create two folders named sqlcoder-model and sqlcoder.
Download the sqlcoder model into the sqlcoder-model folder:
```
git clone https://huggingface.co/defog/sqlcoder 
```
Copy api.py and prompt.md into the sqlcoder folder.
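
Before starting the service, it may be worth checking the result of the steps above. If Git LFS is not installed, `git clone` can leave small text pointer stubs in place of the real weight files. The sketch below looks for such stubs and for the two files copied into the sqlcoder folder. The function names are hypothetical helpers, not part of this repository; the paths follow the layout described above.

```python
from pathlib import Path

# First line of a Git LFS pointer file (a small text stub that stands in
# for a large file whose real content was not downloaded).
LFS_MAGIC = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(model_dir: str) -> list[str]:
    """Return names of files in model_dir that are still LFS pointer stubs."""
    stubs = []
    for path in sorted(Path(model_dir).iterdir()):
        if path.is_file():
            with path.open("rb") as fh:
                if fh.read(len(LFS_MAGIC)) == LFS_MAGIC:
                    stubs.append(path.name)
    return stubs


def missing_files(app_dir: str, required=("api.py", "prompt.md")) -> list[str]:
    """Return the required files that are absent from app_dir."""
    return [name for name in required if not (Path(app_dir) / name).is_file()]


if __name__ == "__main__":
    model_dir = "/mnt/workspace/sqlcoder-model/sqlcoder"
    app_dir = "/mnt/workspace/sqlcoder"
    if Path(model_dir).is_dir():
        print("LFS stubs:", find_lfs_pointers(model_dir))
    else:
        print("Model folder not found:", model_dir)
    print("Missing files:", missing_files(app_dir))
```

An empty list from both checks suggests the folder layout is ready. If stubs are reported, installing Git LFS and fetching the files again is the usual remedy.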

Install the FastAPI-related packages:
```
pip install fastapi nest-asyncio pyngrok uvicorn
```

Start the API service from the sqlcoder folder:
```
python api.py
```
You will get an API URL such as https://dfb1-34-87-2-137.ngrok.io.
Configure this URL in the Chat2DB client to use the model for SQL generation.
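
To confirm the service is reachable before pointing Chat2DB at it, a small client can send one request to the URL. The sketch below is only an illustration: the request path `/` and the JSON field `"prompt"` are assumptions, because the interface of api.py is not described here. Check api.py for the actual route and payload and adjust both.

```python
import json
import urllib.request


def build_request(base_url: str, question: str, path: str = "/") -> urllib.request.Request:
    """Build a JSON POST request; `path` and the "prompt" field are hypothetical."""
    body = json.dumps({"prompt": question}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req: urllib.request.Request, timeout: float = 120.0) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Replace the URL with the one printed by api.py.
    req = build_request("https://dfb1-34-87-2-137.ngrok.io", "List all users created this week")
    print(req.get_method(), req.full_url)
    # print(send(req))  # uncomment once the service is running
```

The generous timeout is deliberate, since generation on a large model can take a while.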

📦 Deploy fp16 model on Alibaba Cloud DSW

If resources permit, you can deploy the non-quantized sqlcoder model instead. It generates SQL slightly more accurately than the 8-bit model, but it requires more GPU memory and takes longer per inference.

To do so, change the model-loading call in api.py so that it loads the weights in fp16:

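```
import torch
from transformers import AutoModelForCausalLM

# Load the non-quantized weights in half precision (fp16).
model = AutoModelForCausalLM.from_pretrained("/mnt/workspace/sqlcoder-model/sqlcoder",
                                             trust_remote_code=True,
                                             torch_dtype=torch.float16,
                                             device_map="auto",
                                             use_cache=True)
```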

📦 Deploy on other cloud platforms

Although this tutorial uses Alibaba Cloud DSW as an example, the scripts and commands contain nothing specific to that platform. In principle, sqlcoder can be deployed on any cloud platform by following the same steps.
