Visualize layer activations and weights to simplify the quantization process. #607

Open: HIT-cwh wants to merge 9 commits into main

HIT-cwh
Collaborator

@HIT-cwh HIT-cwh commented Oct 24, 2023

Usage

  1. Draw the first type of plot, which shows the absmax/absmean/max/mean/min values of a linear layer across different layers.

For instance, the command below plots the absolute maximum, computed across multiple tokens, of the input to every 'q_proj' linear layer in DecoderLayers 0 and 1.

lmdeploy quant_visualization draw \
    --modes 1  \
    --pretrained_model_name_or_path internlm/internlm-chat-7b \
    --work_dir work_dir \
    --use_input \
    --key absmax \
    --linear_name q_proj \
    --layers 0,1

Output:
image
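
To make the --key options concrete, here is a minimal pure-Python sketch of the five reductions. It is not the code in this PR: it assumes the statistic is reduced over the token axis and leaves one value per channel, and the helper name channel_stats and the sample numbers are made up for illustration.

```python
# Illustrative sketch only; not lmdeploy's implementation.
# Assumption: activations are laid out as [num_tokens][num_channels]
# and every statistic is reduced over the token axis.

def channel_stats(activations):
    """Return per-channel absmax/absmean/max/mean/min over all tokens."""
    columns = list(zip(*activations))  # one tuple of token values per channel
    return {
        "absmax": [max(abs(v) for v in col) for col in columns],
        "absmean": [sum(abs(v) for v in col) / len(col) for col in columns],
        "max": [max(col) for col in columns],
        "mean": [sum(col) / len(col) for col in columns],
        "min": [min(col) for col in columns],
    }


# Three tokens, two channels (made-up numbers).
acts = [[1.0, -4.0], [-3.0, 2.0], [2.0, 0.0]]
stats = channel_stats(acts)
print(stats["absmax"])  # largest magnitude seen in each channel
```

A channel whose absmax is far larger than the others is the kind of outlier this plot is meant to expose.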

  2. Draw the second type of plot, which shows the relationship between activations and weights.

For instance, the command below plots the relationship between the input of every 'q_proj' linear layer in DecoderLayers 0 and 1 and the corresponding weights.

lmdeploy quant_visualization draw \
    --modes 2  \
    --pretrained_model_name_or_path internlm/internlm-chat-7b \
    --work_dir work_dir \
    --key absmax \
    --linear_name q_proj \
    --layers 0,1

image
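
One way to read this plot is as a per-input-channel comparison of activation magnitude against weight magnitude. The sketch below pairs the two under that assumption. It is not taken from this PR: the function names are hypothetical, and the weight layout [out_features][in_features] is assumed because that is how torch.nn.Linear stores it.

```python
# Illustrative sketch only; not lmdeploy's implementation.
# Assumption: weight is [out_features][in_features], as in torch.nn.Linear,
# and act_absmax holds one value per input channel.

def per_input_channel_absmax(weight):
    """Largest weight magnitude attached to each input channel."""
    return [max(abs(w) for w in col) for col in zip(*weight)]


def pair_activation_with_weight(act_absmax, weight):
    """Return (activation absmax, weight absmax) for every input channel."""
    w_absmax = per_input_channel_absmax(weight)
    if len(act_absmax) != len(w_absmax):
        raise ValueError("activation and weight disagree on in_features")
    return list(zip(act_absmax, w_absmax))


# Two input channels, three output channels (made-up numbers).
act_absmax = [3.0, 4.0]
weight = [[0.5, -0.1], [-0.25, 0.2], [0.125, 0.05]]
pairs = pair_activation_with_weight(act_absmax, weight)
print(pairs)
```

Channels where the activation value is large while the weight value is small are typically the ones that make activation quantization hard.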

  3. Draw the third type of plot, a boxplot of the absmax/absmean/max/mean/min values of the input or output of a linear layer across different layers.

For instance, the command below draws a boxplot of the absolute maximum, computed across multiple tokens, of the input to every 'q_proj' linear layer.

lmdeploy quant_visualization draw \
    --modes 3  \
    --pretrained_model_name_or_path internlm/internlm-chat-7b \
    --work_dir work_dir \
    --key absmax \
    --linear_name q_proj \
    --use_input

image
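
For readers unfamiliar with boxplots, the sketch below computes the five numbers a box typically summarizes, one box per layer. It is an assumption that each box is built from the per-channel values of one layer; the helper box_summary and the sample data are hypothetical and not part of this PR.

```python
# Illustrative sketch only; not lmdeploy's implementation.
# Assumption: each layer contributes one list of per-channel absmax values.
import statistics


def box_summary(values):
    """Return min, quartiles, and max for one box."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


# Made-up per-channel absmax values for two layers.
absmax_by_layer = {
    0: [1.0, 2.0, 3.0, 4.0, 5.0],
    1: [2.0, 2.0, 2.0, 2.0, 10.0],  # one outlier channel
}
summary = {layer: box_summary(v) for layer, v in absmax_by_layer.items()}
print(summary[1])
```

In the second layer the box collapses around 2.0 while the maximum sits at 10.0, which is how an outlier channel shows up in this kind of plot.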

  4. Draw the fourth type of plot, which shows the relationship between the maximum and minimum values of activations.

For instance, the command below plots the maximum and minimum values, computed across multiple tokens, of the input to every 'q_proj' linear layer in DecoderLayers 0 and 1.

lmdeploy quant_visualization draw \
    --modes 4  \
    --pretrained_model_name_or_path internlm/internlm-chat-7b \
    --work_dir work_dir \
    --linear_name q_proj \
    --use_input \
    --layers 0,1

image
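
As background on why the max/min pair matters for quantization, the sketch below derives a textbook asymmetric scale and zero point from an observed range. This is a generic illustration, not something this PR implements; the function name asymmetric_qparams and the example range are assumptions.

```python
# Illustrative sketch only; a generic asymmetric (affine) quantization
# formula, not code from lmdeploy or this PR.

def asymmetric_qparams(x_min, x_max, num_bits=8):
    """Map the observed range [x_min, x_max] onto unsigned integers."""
    # Widen the range to include zero so that 0.0 is exactly representable.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    levels = 2 ** num_bits - 1
    scale = (x_max - x_min) / levels
    if scale == 0.0:
        return 1.0, 0  # degenerate all-zero tensor
    zero_point = round(-x_min / scale)
    return scale, zero_point


# Made-up range: activations between -51 and 204.
scale, zero_point = asymmetric_qparams(-51.0, 204.0)
print(scale, zero_point)
```

When max and min are strongly lopsided, a symmetric scheme would waste part of the integer range, which is one reason to inspect both values.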

@pppppM pppppM self-requested a review October 24, 2023 14:03
@LZHgrla
Collaborator

LZHgrla commented Oct 25, 2023

Hi, @HIT-cwh
Do we support the visualization of the weight values?

@HIT-cwh
Collaborator Author

HIT-cwh commented Oct 27, 2023

> Hi, @HIT-cwh Do we support the visualization of the weight values?

Support for this feature is currently in development and will be progressively enhanced in the forthcoming iterations.

@lvhan028
Collaborator

Can we use lmdeploy lite view to visualize the activations and weights? It's simpler than lmdeploy quant_visualization draw.

@lvhan028 lvhan028 self-requested a review November 15, 2023 06:44
@lvhan028
Collaborator

Maybe add a user guide on the usage of this great tool.

@HIT-cwh
Collaborator Author

HIT-cwh commented Nov 15, 2023

> Maybe add a user guide on the usage of this great tool.

The commit that fixes the checkpoint-loading bug has been split out. Please refer to PR #690.

4 participants