Small bug in the HF weight conversion code #47

yuanzhoulvpi2017 · 2023-11-03T01:28:09Z

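At this point in the code, the condition if config.num_hidden_layers % args.target_tensor_model_parallel_size != 0: is wrong. It should check args.target_pipeline_model_parallel_size, not args.target_tensor_model_parallel_size, because the very next statement divides the layer count by the pipeline parallel size. The error message, which mentions tensor parallelism, should be updated to match.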

    if config.num_hidden_layers % args.target_tensor_model_parallel_size != 0:
        raise ValueError(
            f"Number of layers ({config.num_hidden_layers}) must be divisible by number of tensor parallelism"
            f" ({args.target_tensor_model_parallel_size})"
        )
    num_layers = config.num_hidden_layers // args.target_pipeline_model_parallel_size

    layer_re = re.compile(r"model.layers\.(\d+)\.([a-z0-9_.]+)\.([a-z]+)")

https://github.com/alibaba/Megatron-LLaMA/blob/main/tools/checkpoint_conversion/llama_checkpoint_conversion.py#L675C47-L675C47
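To make the reported problem concrete, here is a minimal, self-contained sketch of the check as the report suggests it should read. The function name layers_per_pipeline_stage and its plain integer parameters are hypothetical stand-ins for config.num_hidden_layers and args.target_pipeline_model_parallel_size; this is not the repository's code or its eventual fix.

```python
def layers_per_pipeline_stage(num_hidden_layers: int, pipeline_parallel_size: int) -> int:
    """Return how many transformer layers each pipeline stage receives.

    Hypothetical helper for illustration only. The divisibility check uses
    the same value as the division below it, which is the change the report
    asks for.
    """
    if pipeline_parallel_size <= 0:
        raise ValueError(
            f"Pipeline parallel size must be positive, got {pipeline_parallel_size}"
        )
    if num_hidden_layers % pipeline_parallel_size != 0:
        raise ValueError(
            f"Number of layers ({num_hidden_layers}) must be divisible by "
            f"pipeline parallel size ({pipeline_parallel_size})"
        )
    return num_hidden_layers // pipeline_parallel_size


if __name__ == "__main__":
    # 40 layers over 4 pipeline stages divides evenly.
    print(layers_per_pipeline_stage(40, 4))

    # 40 layers over 3 pipeline stages does not divide evenly. A check against
    # a tensor parallel size of 8 would pass here (40 % 8 == 0), and the
    # integer division 40 // 3 would then leave one layer unassigned.
    try:
        layers_per_pipeline_stage(40, 3)
    except ValueError as err:
        print(f"rejected: {err}")
```

The quoted check can also fail in the other direction: with 30 layers, a tensor parallel size of 4 and a pipeline parallel size of 2, it raises an error even though the layers split evenly across the two pipeline stages.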


yuanzhoulvpi2017 changed the title from "hf" to "Small bug in the HF weight conversion code" on Nov 3, 2023

li-yi-dong assigned thuhujin Nov 3, 2023
