Add docs of support new vl model #1332
base: main
Conversation
@@ -0,0 +1,131 @@
# lmdeploy.vl New Model Support
> **Review comment:** How about "How to Add a Multimodal Vision Model (VLM)" as the title?
Currently, a number of VLM models adopt the architecture shown in the figure below. The image is passed through a Vision Encoder to obtain image features, which are then mapped into the text feature space by a Projection layer. Finally, the image features and text features are concatenated and fed into the LLM for inference. A characteristic of these VLM models is that when the concatenated features are fed into the LLM, the feature types are not distinguished and no interaction is computed between the two kinds of features.
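The flow described in this paragraph can be sketched in a few lines of plain Python. All shapes and helper names below are hypothetical stand-ins chosen purely to illustrate the concatenation; this is not LMDeploy code:

```python
# Hypothetical dimensions for illustration only
IMG_FEAT_DIM = 4      # vision encoder output width
TEXT_HIDDEN_DIM = 6   # LLM hidden width

def vision_encoder(image):
    # stand-in: one IMG_FEAT_DIM-wide feature vector per image patch
    return [[float(p)] * IMG_FEAT_DIM for p in image["patches"]]

def projection(img_feats):
    # stand-in linear map from IMG_FEAT_DIM into the text feature space
    return [[sum(f) / IMG_FEAT_DIM] * TEXT_HIDDEN_DIM for f in img_feats]

def embed_text(token_ids):
    # stand-in text embedding lookup
    return [[float(t)] * TEXT_HIDDEN_DIM for t in token_ids]

def build_llm_input(image, token_ids):
    img = projection(vision_encoder(image))
    txt = embed_text(token_ids)
    # plain concatenation: the LLM does not distinguish the two feature types
    return img + txt

inputs = build_llm_input({"patches": [1, 2, 3]}, [10, 11])
# 3 projected image-feature vectors followed by 2 text embeddings,
# all TEXT_HIDDEN_DIM wide
```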
> **Review comment:** Suggest rewording: "Currently, LMDeploy supports multimodal vision models with a LLaVA-like architecture. As shown in the figure below, in this architecture the image passes through ..."
For models with this architecture, support for a new model can be added quite easily with LMDeploy.
## Model Support
> **Review comment:** I feel the text immediately following "Model Support" could be merged into the preceding section. That way, VisonModel and VLChatTemplateWrapper could become H2 headings.
> \[!NOTE\]
>
> A VLM model usually has a corresponding LLM model that takes no image input, such as Qwen-VL-Chat and Qwen-7B-Chat. Please first make sure that this LLM model can run inference with the TurboMind engine, or that its model structure is the same as one already supported by TurboMind.
> **Review comment:** Delete "or that its model structure ... supported by TurboMind."
    def build_model(self):
        # init an empty model
        with init_empty_weights():
> **Review comment:** Where does init_empty_weights() come from?
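On this question: in the Hugging Face ecosystem, `init_empty_weights` is typically provided by the `accelerate` package (`from accelerate import init_empty_weights`), which places parameters on the meta device so no real storage is allocated. That provenance is an assumption here, since the excerpt does not show the import. A rough equivalent of the same pattern using only plain PyTorch's meta device:

```python
# Sketch of the empty-init pattern using PyTorch's meta device as a
# stand-in for accelerate's init_empty_weights (assumption: same idea).
import torch
import torch.nn as nn

# 1. Build the module under the meta device: shapes only, no weight storage.
with torch.device("meta"):
    model = nn.Linear(4, 2)

# 2. Materialize uninitialized storage on CPU, then load real weights
#    (a dummy state dict stands in for checkpoint files here).
model.to_empty(device="cpu")
state = {"weight": torch.zeros(2, 4), "bias": torch.zeros(2)}
model.load_state_dict(state)
```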
        # move model to cpu and load weight
        model.to_empty(device='cpu')
        load_model_from_weight_files(model, self.model_path)
> **Review comment:** Can load_model_from_weight_files be used generically across models?
Below, the Qwen/Qwen-VL-Chat model is used as an example to show how to add support for this kind of model with LMDeploy.
### VisonModel
> **Review comment:** Suggest the heading "Implement VisonModel".
Adding a new vision model mainly requires changes in two places:
1. Extract the vision model corresponding to the VLM model and implement the `forward` feature-extraction function
2. Modify the `load_vl_model` function so that the VLM model can find its corresponding VisionModel when it is loaded
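As a rough illustration of these two steps, the class below extracts features per image and a registry-style lookup stands in for the dispatch. All names and shapes here are hypothetical stand-ins, not the actual LMDeploy API:

```python
# Hypothetical sketch of the two steps: a vision model class with a
# forward() feature extractor, and a registry lookup standing in for
# the load_vl_model dispatch.

class QwenVisonModel:
    """Illustrative stub for the extracted vision tower of Qwen-VL-Chat."""

    def __init__(self, model_path):
        self.model_path = model_path

    def forward(self, images):
        # real code would run the vision encoder + projection here;
        # return one (stub) feature vector per input image
        return [[0.0] * 8 for _ in images]

# step 2: map an architecture name to its VisonModel class
_VL_MODELS = {"qwen-vl": QwenVisonModel}

def load_vl_model(model_path, arch="qwen-vl"):
    return _VL_MODELS[arch](model_path)

model = load_vl_model("Qwen/Qwen-VL-Chat")
feats = model.forward(["img0", "img1"])
```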
> **Review comment:** The example below does not contain a load_vl_model function.
Motivation
add docs of support-new-vl-model