[Bug] Local image input is not read correctly when deploying a multimodal large model #1613

zhuyr97 · 2024-05-18T14:17:48Z

Checklist

1. I have searched related issues but cannot get the expected help.
2. The bug has not been fixed in the latest version.

Describe the bug

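The provided demo passes the image as a URL, and everything works with a network image URL. When I try to use a local image as input instead, the model responds as if it had not received any image; the image is not being passed to the model.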

Demo example:
from lmdeploy.serve.openai.api_client import APIClient

api_client = APIClient('http://0.0.0.0:23333')
model_name = api_client.available_models[0]
messages = [{
    'role': 'user',
    'content': [{
        'type': 'text',
        'text': 'Describe the image please',
    }, {
        'type': 'image_url',
        'image_url': {
            'url':
            'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg',
        },
    }]
}]
for item in api_client.chat_completions_v1(model=model_name,
                                           messages=messages):
    print(item)

Reproduction

lmdeploy serve api_server /root/autodl-tmp/Qwen-VL-Chat/ --server-port 23333 --tp 1 --cache-max-entry-count 0.5 --max-batch-size 32 --session-len 8192 --model-name qwen

Environment

lmdeploy 0.4.1
torchvision 0.17.2
timm 0.9.16

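I would like to modify the code so that it accepts either a local image or a network image URL. For a local image, I base64-encode the image and add the encoded data to the message content: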

import base64
import os
from lmdeploy.serve.openai.api_client import APIClient

# Instantiate the APIClient with the API server URL. This assumes the server
# runs locally and listens on port 23333 on all network interfaces.
api_client = APIClient('http://0.0.0.0:23333')
model_name = api_client.available_models[0]

# Initialize the message list that records the whole conversation
messages = []

# Loop for a multi-turn conversation
while True:
    # Get user input
    user_input = input("User: ")

    # End the conversation when the user types 'exit'
    if user_input.lower() == 'exit':
        print("Exiting chat.")
        break

    # Ask for an optional local image path
    image_path = input("Enter local image path or press Enter to use default image: ")

    # Use the local image if the path exists, otherwise the default image URL
    if image_path and os.path.exists(image_path):
        # Read the local image and base64-encode it
        try:
            with open(image_path, "rb") as image_file:
                encoded_image = base64.b64encode(image_file.read()).decode('utf-8')
            image_content = {
                'type': 'image',
                'image': {
                    'base64': encoded_image,
                },
            }
        except Exception as e:
            print(f"Error reading image file: {e}")
            continue
    else:
        # Use the default image URL
        image_content = {
            'type': 'image_url',
            'image_url': {
                'url': 'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg',
            },
        }

    # Build the user message content
    messages.append({
        'role': 'user',
        'content': [{
            'type': 'text',
            'text': user_input,
        }, image_content]
    })
    
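    # Reset the accumulated reply
    accumulated_content = ''

    # Iterate over the chunks returned by the API client's chat_completions_v1 method.
    # It takes the model name and the message list and streams back the model's reply.
    for item in api_client.chat_completions_v1(model=model_name, messages=messages, stream=True):
        # Check whether the current chunk contains new content
        if 'delta' in item['choices'][0] and 'content' in item['choices'][0]['delta']:
            new_content = item['choices'][0]['delta']['content']
            # Append the new content to the accumulated reply
            accumulated_content += new_content
            # Print only the new chunk so earlier text is not repeated
            print(new_content, end='', flush=True)

        # Check whether the reply has finished
        if item['choices'][0].get('finish_reason') == 'stop':
            # Post-completion handling goes here, e.g. printing an end marker
            print("\nResponse completed.")
            # Append the assistant's reply to the message list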
            messages.append({
                'role': 'assistant',
                'content': [{
                    'type': 'text',
                    'text': accumulated_content,
                }]
            })
            break

Error traceback

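Model response: Sorry, since I am a text generation model, I cannot view images or other forms of media. However, if you have any questions or need more information on a topic, I will do my best to help.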
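
The reply above suggests the image part never reached the model. A minimal client-side sanity check might help spot this before sending a request. It assumes the only content part types the server accepts are 'text' and 'image_url', which are the two that work in this thread; other lmdeploy versions may accept more. The names ASSUMED_SUPPORTED_TYPES and find_unrecognized_parts are hypothetical and not part of lmdeploy.

```python
# Assumption: only the two part types shown working in this thread are accepted.
ASSUMED_SUPPORTED_TYPES = {'text', 'image_url'}


def find_unrecognized_parts(messages):
    """Return (message_index, part_type) for content parts outside the assumed set."""
    problems = []
    for index, message in enumerate(messages):
        content = message.get('content')
        # Plain-string or missing content has no typed parts to check.
        if content is None or isinstance(content, str):
            continue
        for part in content:
            part_type = part.get('type')
            if part_type not in ASSUMED_SUPPORTED_TYPES:
                problems.append((index, part_type))
    return problems
```

Calling find_unrecognized_parts(messages) on the message list built by the script above would flag the 'image' part created for a local file.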

zhuyr97 · 2024-05-19T04:44:28Z

Resolved. Keep using the same key ('image_url') and pass the base64-encoded image as base64_image_url:
image_content = {
    'type': 'image_url',
    'image_url': {
        'url': base64_image_url,
    },
}
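
The comment above does not show how base64_image_url was built. A minimal sketch follows, assuming it is a data URL of the form data:<mime>;base64,<payload>, the convention used by OpenAI-style image_url fields. Whether lmdeploy 0.4.1 requires exactly this prefix is not confirmed in this thread. The helper names local_image_to_data_url and make_image_content are hypothetical.

```python
import base64
import mimetypes


def local_image_to_data_url(image_path: str) -> str:
    """Read a local image and return it as a base64 data URL (assumed format)."""
    # Guess the MIME type from the file extension; fall back to JPEG (an assumption).
    mime_type, _ = mimetypes.guess_type(image_path)
    if mime_type is None:
        mime_type = 'image/jpeg'
    with open(image_path, 'rb') as image_file:
        payload = base64.b64encode(image_file.read()).decode('utf-8')
    return f'data:{mime_type};base64,{payload}'


def make_image_content(image_path: str) -> dict:
    """Build an 'image_url' content part, the part type that worked in this thread."""
    return {
        'type': 'image_url',
        'image_url': {'url': local_image_to_data_url(image_path)},
    }
```

With this, the local-image branch of the script could set image_content = make_image_content(image_path) and leave the rest of the message structure unchanged.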

zhuyr97 closed this as completed May 19, 2024
