Multi-Modal Documentation

📚 Tutorial

  1. MLLM Deployment Documentation

Multi-Modal Best Practices

A single round of dialogue can contain multiple images (or no images):

  1. Qwen-VL Best Practice
  2. Qwen-Audio Best Practice
  3. DeepSeek-VL Best Practice
  4. InternLM-XComposer2 Best Practice
  5. Phi3-Vision Best Practice

A single round of dialogue can only contain one image:

  1. Llava Best Practice
  2. Yi-VL Best Practice

The entire conversation revolves around one image:

  1. CogVLM Best Practice, CogVLM2 Best Practice, GLM4V Best Practice
  2. MiniCPM-V Best Practice
  3. InternVL-Chat-V1.5 Best Practice
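
The three groups above differ only in how many images a conversation may carry. The minimal sketch below shows one way to check a conversation against its group before sending it to a model. The names `ImagePolicy` and `check_images` are hypothetical and not part of any library. It also assumes that "only one image" means at most one, so text-only rounds are accepted; the individual best-practice documents are the authority on what each model actually allows.

```python
from enum import Enum
from typing import List


class ImagePolicy(Enum):
    """Hypothetical labels for the three model groups listed above."""
    MULTI_PER_ROUND = "multi_per_round"          # any number of images per round
    ONE_PER_ROUND = "one_per_round"              # at most one image per round (assumed)
    ONE_PER_CONVERSATION = "one_per_conversation"  # one image shared by all rounds


def check_images(policy: ImagePolicy, images_per_round: List[int]) -> bool:
    """Return True if the per-round image counts fit the given policy."""
    if any(n < 0 for n in images_per_round):
        raise ValueError("image counts must be non-negative")
    if policy is ImagePolicy.MULTI_PER_ROUND:
        # Zero, one, or many images are all acceptable in every round.
        return True
    if policy is ImagePolicy.ONE_PER_ROUND:
        # Each round is checked on its own.
        return all(n <= 1 for n in images_per_round)
    # ONE_PER_CONVERSATION: the whole dialogue shares a single image.
    return sum(images_per_round) <= 1
```

For example, a three-round dialogue with image counts `[1, 0, 1]` passes under `ImagePolicy.ONE_PER_ROUND` but fails under `ImagePolicy.ONE_PER_CONVERSATION`, because two images appear across the conversation.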