[Bug] When deploying a multimodal large model, local image input is not read correctly #1613

Closed
2 tasks done
zhuyr97 opened this issue May 18, 2024 · 1 comment

zhuyr97 commented May 18, 2024

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. The bug has not been fixed in the latest version.

Describe the bug

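In the provided demo, the image path is given as a URL, and everything works with a web image URL. However, when I try to use a local image as input, the model responds as if it never received the image; the image is not passed to the model.

Demo example: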
from lmdeploy.serve.openai.api_client import APIClient

api_client = APIClient(f'http://0.0.0.0:23333')
model_name = api_client.available_models[0]
messages = [{
    'role':
    'user',
    'content': [{
        'type': 'text',
        'text': 'Describe the image please',
    }, {
        'type': 'image_url',
        'image_url': {
            'url':
            'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg',
        },
    }]
}]
for item in api_client.chat_completions_v1(model=model_name,
                                           messages=messages):
    print(item)

Reproduction

lmdeploy serve api_server /root/autodl-tmp/Qwen-VL-Chat/ --server-port 23333 --tp 1 --cache-max-entry-count 0.5 --max-batch-size 32 --session-len 8192 --model-name qwen

Environment

lmdeploy 0.4.1
torchvision 0.17.2
timm 0.9.16

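I would like to modify the code so that it accepts either a local image or a web image URL as input. For a local image, I add the base64-encoded image data to the message content: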

import base64
import os
from lmdeploy.serve.openai.api_client import APIClient

# Instantiate the APIClient with the API server URL. This assumes the server
# runs locally and listens on port 23333 on all network interfaces.
api_client = APIClient(f'http://0.0.0.0:23333')
model_name = api_client.available_models[0]

# Initialize the message list that records the whole conversation
messages = []

# Loop for a multi-turn conversation
while True:
    # Read the user's input
    user_input = input("User: ")

    # Typing 'exit' ends the conversation
    if user_input.lower() == 'exit':
        print("Exiting chat.")
        break

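    # Ask whether to attach a local image path
    image_path = input("Enter local image path or press Enter to use default image: ")

    # Use the local image path supplied by the user, or fall back to the default image URL
    if image_path and os.path.exists(image_path):
        # Read the local image and base64-encode it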
        try:
            with open(image_path, "rb") as image_file:
                encoded_image = base64.b64encode(image_file.read()).decode('utf-8')
            image_content = {
                'type': 'image',
                'image': {
                    'base64': encoded_image,
                },
            }
        except Exception as e:
            print(f"Error reading image file: {e}")
            continue
    else:
        # Use the default image URL
        image_content = {
            'type': 'image_url',
            'image_url': {
                'url': 'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg',
            },
        }

    # Build the user message content
    messages.append({
        'role': 'user',
        'content': [{
            'type': 'text',
            'text': user_input,
        }, image_content]
    })
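
    # Reset the accumulated content
    accumulated_content = ''

    # Iterate over the chat_completions_v1 method of the API client.
    # It takes the model name and the message list and yields the model's reply chunk by chunk.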
    for item in api_client.chat_completions_v1(model=model_name, messages=messages, stream=True):
        # Check whether the current chunk contains new content
        if 'delta' in item['choices'][0] and 'content' in item['choices'][0]['delta']:
            new_content = item['choices'][0]['delta']['content']
            # Append the new content to the accumulated string
            accumulated_content += new_content
            # Print the accumulated content in real time
            print(accumulated_content, end='', flush=True)

        # Check whether the reply has finished
        if item['choices'][0].get('finish_reason') == 'stop':
            # Post-completion logic goes here, e.g. printing an end marker
            print("\nResponse completed.")
            # Append the assistant's reply to the message list
            messages.append({
                'role': 'assistant',
                'content': [{
                    'type': 'text',
                    'text': accumulated_content,
                }]
            })
            break
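
A side note on the streaming loop above: it prints the whole accumulated string on every chunk, so earlier text is repeated in the terminal. A minimal sketch of yielding only the newly received text is shown here. It assumes the chunks are dicts shaped the way the loop above indexes them (`choices[0]['delta']['content']` and `choices[0]['finish_reason']`); the helper name `iter_new_text` is hypothetical and not part of lmdeploy.

```python
from typing import Iterable, Iterator


def iter_new_text(chunks: Iterable[dict]) -> Iterator[str]:
    """Yield only the new text carried by each streamed chunk.

    Hypothetical helper: assumes each chunk looks like
    {'choices': [{'delta': {'content': '...'}, 'finish_reason': ...}]}.
    """
    for chunk in chunks:
        choice = chunk["choices"][0]
        # 'delta' may be missing or None on some chunks, so fall back to {}.
        text = (choice.get("delta") or {}).get("content")
        if text:
            yield text
        # Stop once the server marks the reply as finished.
        if choice.get("finish_reason") == "stop":
            break


# Possible use inside the chat loop (needs a running server, so left as a comment):
# reply = ''
# for piece in iter_new_text(api_client.chat_completions_v1(
#         model=model_name, messages=messages, stream=True)):
#     print(piece, end='', flush=True)
#     reply += piece
```

Printing each piece once and concatenating the pieces separately keeps the terminal output and the stored assistant reply in step.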

Error traceback

Model response: Sorry, since I am a text generation model, I cannot view images or other forms of media. However, if you have any questions or need more information on a topic, I will do my best to help.

zhuyr97 commented May 19, 2024

Solved. Keep using the same key ('image_url') and pass the encoded base64_image_url as the URL:
image_content = {
    'type': 'image_url',
    'image_url': {
        'url': base64_image_url,
    },
}

@zhuyr97 zhuyr97 closed this as completed May 19, 2024