Configs in inference.py necessary for context length expansion in model serving? #157

spring1915 opened this issue Dec 13, 2023

In inference.py, there are two settings:

    # Read the base model's original context window from its config.
    orig_ctx_len = getattr(config, "max_position_embeddings", None)
    # If the requested context exceeds it, enable linear RoPE scaling.
    if orig_ctx_len and args.context_size > orig_ctx_len:
        scaling_factor = float(math.ceil(args.context_size / orig_ctx_len))
        config.rope_scaling = {"type": "linear", "factor": scaling_factor}

and

    # Resize embeddings, presumably for one pad token added to Llama's 32000-token vocab.
    model.resize_token_embeddings(32001)

Are these settings needed for a fine-tuned model with an extended context length to work properly at inference time? For example, if I fine-tuned the original Llama 2 model to a new context length of 16k, do I still need them when serving the model? This matters because it would save us the hassle of writing custom inference code for certain model-serving frameworks: we could just point the framework at the model's save location.
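
As context for the question: if these settings do turn out to be required, one way to avoid custom inference code might be to write them into the saved checkpoint's config.json, so that any serving framework loading the model through vanilla transformers picks them up automatically. Below is a minimal sketch of that idea; the save location path/to/llama2-16k and the 16,384-token target context are hypothetical, not taken from inference.py.

    import math

    from transformers import AutoConfig, AutoModelForCausalLM

    MODEL_PATH = "path/to/llama2-16k"  # hypothetical save location
    TARGET_CTX = 16384                 # illustrative fine-tuned context length

    config = AutoConfig.from_pretrained(MODEL_PATH)
    orig_ctx_len = getattr(config, "max_position_embeddings", None)
    if orig_ctx_len and TARGET_CTX > orig_ctx_len:
        # Same linear RoPE scaling rule as inference.py, but persisted
        # into the config instead of applied ad hoc at load time.
        config.rope_scaling = {
            "type": "linear",
            "factor": float(math.ceil(TARGET_CTX / orig_ctx_len)),
        }
        config.max_position_embeddings = TARGET_CTX

    model = AutoModelForCausalLM.from_pretrained(MODEL_PATH, config=config)
    # No-op if the fine-tuned checkpoint already has 32001 embedding rows.
    model.resize_token_embeddings(32001)
    # save_pretrained writes config.json alongside the weights, so a
    # framework pointed at MODEL_PATH sees rope_scaling directly.
    model.save_pretrained(MODEL_PATH)

If that works, frameworks that load checkpoints via from_pretrained should apply the scaling from config.json without any custom loading code, which is exactly the serving setup described above.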
