Implement dynamic LoRA swapping #262

Merged
merged 30 commits into from May 12, 2024

EricLBuehler
Owner

@EricLBuehler EricLBuehler commented May 3, 2024

Dynamic LoRA swapping, first raised in #259, lets the user change the set of active LoRA adapters at runtime. The active set can be configured per request, so users can implement their own routing logic on top of it.

Usage

Pre-loading

Adapters may be pre-loaded (but not activated) to avoid the cost of loading them at runtime. All pre-loaded adapters must share the same ordering, so the logical place to specify them is the LoRA ordering file, in the preload_adapters field:

{
    "order": ["..."],
    "layers": {"...": "123"},
    "base_model_id": "...",
    "preload_adapters": [{"name": "...", "adapter_model_id": "..."}] # New field here
}
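As a rough illustration of the file layout above, the following sketch builds an ordering dictionary with a preload_adapters list and checks each entry before serializing it. Every concrete value (adapter names, model IDs, the layer key) is a hypothetical placeholder, since the snippet above elides them. The helper check_preload_adapters is not part of this PR, and its rules (the field may be absent, names must be unique) are assumptions. The "# New field here" annotation is left out because standard JSON has no comment syntax.

```python
import json

# Hypothetical placeholder values; the original snippet elides all of them.
ordering = {
    "order": ["adapter_a", "adapter_b"],
    "layers": {"example.layer": "123"},
    "base_model_id": "example-org/base-model",
    "preload_adapters": [
        {"name": "adapter_c", "adapter_model_id": "example-org/adapter-c"},
    ],
}

# The two keys each pre-load entry carries in the snippet above.
REQUIRED_PRELOAD_KEYS = {"name", "adapter_model_id"}


def check_preload_adapters(ordering: dict) -> list:
    """Return the names of the adapters listed for pre-loading.

    Assumptions (not confirmed by the PR text): the field may be absent,
    and adapter names must be unique. Raises ValueError otherwise.
    """
    names = []
    for entry in ordering.get("preload_adapters", []):
        missing = REQUIRED_PRELOAD_KEYS - entry.keys()
        if missing:
            raise ValueError(f"preload entry {entry!r} is missing {sorted(missing)}")
        if entry["name"] in names:
            raise ValueError(f"duplicate preload adapter name {entry['name']!r}")
        names.append(entry["name"])
    return names


preloaded = check_preload_adapters(ordering)
ordering_json = json.dumps(ordering, indent=4)  # text suitable for writing to the ordering file
```

Checking the entries before writing the file surfaces a malformed entry early, rather than at model load time.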

Runtime APIs

APIs to dynamically activate LoRA adapters by name are exposed through the HTTP server and through the Rust and Python APIs.
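Since adapters are activated by name and the choice can be made per request, user-side routing reduces to mapping a request to a list of adapter names. The sketch below shows one way that mapping might look. The function route_adapters, the rule format, and all adapter names are hypothetical. The call that actually activates the adapters is deliberately omitted, because this description does not give the signatures of the HTTP, Rust, or Python APIs.

```python
from typing import Callable, List, Set, Tuple

# A hypothetical routing rule: a predicate on the prompt and the adapter
# names to activate when the predicate matches.
Rule = Tuple[Callable[[str], bool], List[str]]


def route_adapters(prompt: str, rules: List[Rule], available: Set[str],
                   default: List[str]) -> List[str]:
    """Pick the adapter names to activate for one request.

    The first matching rule wins; `default` is used when nothing matches.
    Names outside `available` (for example, adapters that were never
    loaded) are rejected with KeyError.
    """
    chosen = default
    for matches, adapters in rules:
        if matches(prompt):
            chosen = adapters
            break
    unknown = [name for name in chosen if name not in available]
    if unknown:
        raise KeyError(f"adapters not available: {unknown}")
    return list(chosen)


# Hypothetical adapter names and rules, for illustration only.
available = {"sql_adapter", "chat_adapter"}
rules = [(lambda p: "select" in p.lower(), ["sql_adapter"])]

sql_choice = route_adapters("SELECT * FROM users", rules, available, ["chat_adapter"])
chat_choice = route_adapters("Tell me a story", rules, available, ["chat_adapter"])
```

The returned names would then be handed to whichever activation API is in use for that request.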

@EricLBuehler EricLBuehler added the new feature (New feature or request), backend (Backend work), and models (Additions to model or architectures) labels May 3, 2024

github-actions bot commented May 3, 2024

Code Metrics Report
  ───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
Rust                        72     23863     1572       530    21761       1325
───────────────────────────────────────────────────────────────────────────────
Total                       72     23863     1572       530    21761       1325
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop 85,737
Estimated Schedule Effort 11.916649 months
Estimated People Required 5.112342
───────────────────────────────────────────────────────────────────────────────
Processed 793364 bytes, 0.793 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────
  

@EricLBuehler EricLBuehler merged commit 96f25d5 into master May 12, 2024
10 checks passed
@EricLBuehler EricLBuehler deleted the lora_swapping branch May 12, 2024 14:09