
Staging PR for implementing Phi-2 support. #97

Open · wants to merge 54 commits into base: main

Commits on Jan 18, 2024

  1. 1a21471

Commits on Jan 22, 2024

  1. formatting and typo fix

    cm2435 committed Jan 22, 2024
    b60c138
  2. 598dda3
  3. Contributed ReLU Triton kernel, removed a small amount of boilerplate from the layernorm, added test suite for kernels.
    cm2435 committed Jan 22, 2024
    3954f16
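
The commit above contributes a Triton GPU kernel for ReLU; as a plain-Python reference for what that kernel must produce (illustrative only — the function name and shape here are not taken from the PR):

```python
def relu(xs):
    """Elementwise ReLU over a list of floats: max(x, 0) per element."""
    return [x if x > 0.0 else 0.0 for x in xs]

# Negative inputs are clamped to zero, non-negative inputs pass through.
print(relu([-2.0, -0.5, 0.0, 1.5]))  # [0.0, 0.0, 0.0, 1.5]
```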

Commits on Jan 24, 2024

  1. 48eb887
  2. 4a7c20c
  3. b6d224b

Commits on Feb 4, 2024

  1. e520e9a
  2. cbd31c3
  3. updated tests

    cm2435 committed Feb 4, 2024
    b62c886
  4. formatting

    cm2435 committed Feb 4, 2024
    e3e41a7

Commits on Feb 6, 2024

  1. uncommented init.py

    cm2435 committed Feb 6, 2024
    440ef5d
  2. doing work to implement model

    cm2435 committed Feb 6, 2024
    a1e2b0d
  3. updated pre_patch for phi2

    cm2435 committed Feb 6, 2024
    f2112b1

Commits on Feb 12, 2024

  1. 0386c96

Commits on Feb 19, 2024

  1. 6ec3c4f

Commits on Feb 26, 2024

  1. Quick fixes (unslothai#101)

    * Fix tokenizer, dropout, bias for LoRA
    * Fix LoRA downcasting
    * Saving to GGUF
    * colab_quantize_to_gguf
    * move save modules
    * save module
    * Temp downgrade due to TRL issue
    * Fix up bugs
    * Faster saving + other changes
    * Saving modules
    * spelling
    * patch saving
    * original_model
    * saving to RAM leakage?
    * new_save_directory
    * Quick fixes
    * Many iterative "Update" commits to save.py, llama.py, loader.py, _utils.py, dpo.py, __init__.py, pyproject.toml
    danielhanchen authored and cm2435 committed Feb 26, 2024
    f070fad
  2. Revert quantization methods

    danielhanchen authored and cm2435 committed Feb 26, 2024
    2904ad9
  3. getattr issues (unslothai#103)

    Same squashed history as unslothai#101, plus:

    * getattr
    danielhanchen authored and cm2435 committed Feb 26, 2024
    2da8a7d
  4. Update _utils.py

    danielhanchen authored and cm2435 committed Feb 26, 2024
    24f943f
  5. Quick fixes (unslothai#106)

    Same squashed history as unslothai#103, plus:

    * RSLoRA and LoftQ direct support
    * Fix DPO + GGUF
    * further llama.py updates
    danielhanchen authored and cm2435 committed Feb 26, 2024
    ddb7bee
  6. Hotfix for Jan 2024 Release (unslothai#110)

    Same squashed history as unslothai#106, plus:

    * Fix quantization_method
    * Fix quantization_config
    * patch model
    * tokenizer_save_settings
    * quantization and loftq
    * further llama.py and save.py updates
    danielhanchen authored and cm2435 committed Feb 26, 2024
    9a9e6d4
  7. Fixed saving! (unslothai#113)

    Same squashed history as unslothai#110, plus:

    * upload_to_huggingface
    * further save.py updates
    danielhanchen authored and cm2435 committed Feb 26, 2024
    b392c28
  8. Update save.py

    danielhanchen authored and cm2435 committed Feb 26, 2024
    3c880df
  9. Update save.py

    danielhanchen authored and cm2435 committed Feb 26, 2024
    770b5ac
  10. Update save.py

    danielhanchen authored and cm2435 committed Feb 26, 2024
    164319a
  11. Hotfix (unslothai#118)

    * faster saving & inference
    * iterative updates to llama.py, save.py, mistral.py
    danielhanchen authored and cm2435 committed Feb 26, 2024
    8f996e2
  12. 2-4x faster native HF inference (unslothai#119)

    Same squashed history as unslothai#118, plus:

    * fast inference
    * Mistral correct RoPE scaling
    * Max sequence lengths
    * Apache 2
    * fast_linear_forward
    * No print
    * inference
    * Fast inference RoPE
    * RoPE
    * LoRA
    * Fast LoRA saving
    * iterative updates to llama.py, save.py, utils.py
    danielhanchen authored and cm2435 committed Feb 26, 2024
    7e6f313
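
The "Mistral correct RoPE scaling" and "Fast inference RoPE" commits concern rotary position embeddings. As background for what those kernels compute, a minimal sketch of rotating one feature pair (the function name and argument layout here are illustrative, not the repository's API):

```python
import math

def rope_rotate(x_even, x_odd, position, dim_pair_index, head_dim, base=10000.0):
    """Rotate one (even, odd) feature pair by the RoPE angle for this position.

    RoPE encodes position by rotating each feature pair through
    theta = position * base^(-2i / head_dim). Scaling fixes like the one
    mentioned above adjust how theta is derived, not the rotation itself.
    """
    theta = position * base ** (-2.0 * dim_pair_index / head_dim)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return (x_even * cos_t - x_odd * sin_t,
            x_even * sin_t + x_odd * cos_t)

# Position 0 rotates by angle 0, leaving the pair unchanged.
print(rope_rotate(1.0, 2.0, position=0, dim_pair_index=0, head_dim=64))  # (1.0, 2.0)
```

Because this is a pure rotation, it preserves the norm of each pair, which is why RoPE can be applied to queries and keys without rescaling attention.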
  13. Fix bugs (unslothai#129)

    Same squashed history as unslothai#119, plus:

    * hidden_states
    * q_len == 1
    * q_len issue
    * incorrect inference
    * Update to transformers 4.37
    * Graceful FA2 error + torch 2.1.1
    * Fix saving and bnb-4bit
    * remove patching
    * Repatch
    * iterative updates to fast_lora.py, swiglu.py, mapper.py, pyproject.toml, llama.py, mistral.py
    danielhanchen authored and cm2435 committed Feb 26, 2024
    f61ed0e
  14. More bug fixes (unslothai#133)

    Same squashed history as unslothai#129, plus further iterative updates to fast_lora.py, swiglu.py, save.py, utils.py, llama.py
    danielhanchen authored and cm2435 committed Feb 26, 2024
    43c146d
  15. Inference bug fix (unslothai#134)

    Same squashed history as unslothai#133, plus:

    * Revert "Update llama.py"

    This reverts commit a208ec4.
    danielhanchen authored and cm2435 committed Feb 26, 2024
    cb4c49c
  16. Fix bugs + more accurate Swiglu (unslothai#137)

    * faster saving & inference
    
    * Update llama.py
    
    * Update save.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update mistral.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * fast inference
    
    * Update llama.py
    
    * Update save.py
    
    * Update llama.py
    
    * Mistral correct RoPE scaling
    
    * Max sequence lengths
    
    * Apache 2
    
    * fast_linear_forward
    
    * Update utils.py
    
    * Update utils.py
    
    * No print
    
    * Update utils.py
    
    * Update utils.py
    
    * inference
    
    * Update llama.py
    
    * Fast inference RoPE
    
    * Update llama.py
    
    * Update llama.py
    
    * RoPE
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * LoRA
    
    * Fast LoRA saving
    
    * Update llama.py
    
    * hidden_states
    
    * q_len == 1
    
    * q_len issue
    
    * Update mistral.py
    
    * Update mistral.py
    
    * incorrect inference
    
    * Update to transformers 4.37
    
    * Graceful FA2 error + torch 2.1.1
    
    * Update mapper.py
    
    * Update pyproject.toml
    
    * Fix saving and bnb-4bit
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * remove patching
    
    * Update llama.py
    
    * Update llama.py
    
    * Update swiglu.py
    
    * Repatch
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update llama.py
    
    * Update fast_lora.py
    
    * Update llama.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update swiglu.py
    
    * Update fast_lora.py
    
    * Update swiglu.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update save.py
    
    * Update fast_lora.py
    
    * Update utils.py
    
    * Update llama.py
    
    * Update fast_lora.py
    
    * Update swiglu.py
    
    * Update save.py
    
    * Update save.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Revert "Update llama.py"
    
    This reverts commit a208ec4.
    
    * Update llama.py
    
    * Works?
    
    * Update pyproject.toml
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Swiglu
    
    * Update swiglu.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update swiglu.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * attention_mask
    
    * Update llama.py
    
    * Update llama.py
    
    * labels
    
    * Update mistral.py
    
    * Update llama.py
    
    * attention mask
    danielhanchen authored and cm2435 committed Feb 26, 2024
    d08a042
  17. 1 more bug (unslothai#138)

    * faster saving & inference
    
    * Update llama.py
    
    * Update save.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update mistral.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * fast inference
    
    * Update llama.py
    
    * Update save.py
    
    * Update llama.py
    
    * Mistral correct RoPE scaling
    
    * Max sequence lengths
    
    * Apache 2
    
    * fast_linear_forward
    
    * Update utils.py
    
    * Update utils.py
    
    * No print
    
    * Update utils.py
    
    * Update utils.py
    
    * inference
    
    * Update llama.py
    
    * Fast inference RoPE
    
    * Update llama.py
    
    * Update llama.py
    
    * RoPE
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * LoRA
    
    * Fast LoRA saving
    
    * Update llama.py
    
    * hidden_states
    
    * q_len == 1
    
    * q_len issue
    
    * Update mistral.py
    
    * Update mistral.py
    
    * incorrect inference
    
    * Update to transformers 4.37
    
    * Graceful FA2 error + torch 2.1.1
    
    * Update mapper.py
    
    * Update pyproject.toml
    
    * Fix saving and bnb-4bit
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * remove patching
    
    * Update llama.py
    
    * Update llama.py
    
    * Update swiglu.py
    
    * Repatch
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update llama.py
    
    * Update fast_lora.py
    
    * Update llama.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update swiglu.py
    
    * Update fast_lora.py
    
    * Update swiglu.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update save.py
    
    * Update fast_lora.py
    
    * Update utils.py
    
    * Update llama.py
    
    * Update fast_lora.py
    
    * Update swiglu.py
    
    * Update save.py
    
    * Update save.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Revert "Update llama.py"
    
    This reverts commit a208ec4.
    
    * Update llama.py
    
    * Works?
    
    * Update pyproject.toml
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Swiglu
    
    * Update swiglu.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update swiglu.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * attention_mask
    
    * Update llama.py
    
    * Update llama.py
    
    * labels
    
    * Update mistral.py
    
    * Update llama.py
    
    * attention mask
    
    * Update save.py
    
    * Update save.py
    danielhanchen authored and cm2435 committed Feb 26, 2024
    c1d6501
  18. Fix saving issues (unslothai#139)

    * Update mistral.py
    
    * attention mask
    
    * Update llama.py
    
    * Update llama.py
    
    * Update mistral.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update dpo.py
    
    * Patch saving
    
    * Update save.py
    
    * Update save.py
    
    * patch_saving_functions
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * print
    danielhanchen authored and cm2435 committed Feb 26, 2024
    7a2f5d2
  19. Nightly (unslothai#140)

    * Mistral patch
    
    * Update mistral.py
    
    * Update save.py
    
    * saving
    danielhanchen authored and cm2435 committed Feb 26, 2024
    77866a2
  20. Fix inference attention mask (unslothai#142)

    * Update llama.py
    
    * Update llama.py
    danielhanchen authored and cm2435 committed Feb 26, 2024
    7b667a4
  21. Hotfix - fix inference (unslothai#146)

    * Fast inference repatch
    
    * Update llama.py
    
    * Update utils.py
    
    * Update utils.py
    
    * Update utils.py
    
    * Update mistral.py
    
    * Update __init__.py
    
    * Fix inference
    
    * Update mistral.py
    
    * fast lm_head
    
    * Remove fast path
    
    * Update rope_embedding.py
    
    * Update loader.py
    
    * LlamaAttention_fast_forward_inference
    
    * if past_key_value is not None and q_len == 1:
    
    * revert inference
    
    * Update loader.py
    
    * past_key_value
    danielhanchen authored and cm2435 committed Feb 26, 2024
    d129628
  22. 2x faster inference (unslothai#151)

    * Update llama.py
    
    * Update llama.py
    
    * Fix SDPA
    
    * Update llama.py
    
    * padding
    
    * Inference
    
    * Update llama.py
    
    * Revert
    
    * Update mistral.py
    
    * faster inference
    
    * inference
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * inference
    
    * Update llama.py
    
    * Update utils.py
    
    * faster inference
    
    * Update llama.py
    
    * revert
    
    * lm_head
    
    * Update llama.py
    
    * inference
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update mistral.py
    
    * Update llama.py
    
    * faster inference
    
    * Update llama.py
    
    * fast inference
    
    * Update llama.py
    
    * Update llama.py
    
    * Update mistral.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * torch compile
    
    * past_key_values
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update utils.py
    
    * Update utils.py
    
    * Update utils.py
    
    * Update utils.py
    
    * Update llama.py
    
    * fast inference + saving config.json
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update mistral.py
    
    * fast inference again
    
    * more temp matrices
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * fast inference
    
    * Update mistral.py
    
    * Update llama.py
    
    * SDPA
    
    * attention_mask
    
    * New version
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update utils.py
    
    * Update utils.py
    danielhanchen authored and cm2435 committed Feb 26, 2024
    60acab2
  23. ReadMe Revamp (unslothai#156)

    * HF Perf Button
    
    * Update README.md
    
    Adding new buttons cleanup
    
    * Update README.md
    
    * Delete images/Discord.png
    
    * Delete images/try live demo green.png
    
    * new transparent logos
    
    * Revamping page
    
    * Revamp mainpage
    
    * Update README.md
    
    * Update README.md
    
    * Update README.md
    
    * Update README.md
    
    * Update README.md
    
    * Update README.md
    
    * Update README.md
    
    * finetune button
    
    * Delete start free finetune button.png
    
    * free finetune button
    
    * Add files via upload
    
    * Update README.md
    
    * Update README.md
    
    * Add files via upload
    
    * Add files via upload
    
    * Update README.md
    
    * Add files via upload
    
    * Update README.md
    
    * Update README.md
    
    * Update README.md
    
    * Update README.md
    
    * Update README.md
    
    * Update README.md
    
    * Update README.md
    
    * Update README.md
    
    * Update README.md
    
    * Update README.md
    
    * Squashed commit of the following:
    
    commit 35f2ab4a8b4deecbbbe9fbd95f4efde8694233db
    Author: Daniel Han <danielhanchen@gmail.com>
    Date:   Sun Feb 4 17:35:56 2024 +1100
    
        2x faster inference (#151)
    
    commit 051a73b0e63d3ae3acd7c4d962349280f69bbdb0
    Author: Daniel Han <danielhanchen@gmail.com>
    Date:   Wed Jan 31 04:03:37 2024 +1100
    
        Hotfix - fix inference (#146)
    
        * faster saving & inference
    
        * Update llama.py
    
        * Update save.py
    
        * Update llama.py
    
        * Update llama.py
    
        * Update llama.py
    
        * Update llama.py
    
        * Update mistral.py
    
        * Update llama.py
    
        * Update llama.py
    
        * Update llama.py
    
        * Update llama.py
    
        * Update llama.py
    
        * fast inference
    
        * Update llama.py
    
        * Update save.py
    
        * Update llama.py
    
        * Mistral correct RoPE scaling
    
        * Max sequence lengths
    
        * Apache 2
    
        * fast_linear_forward
    
        * Update utils.py
    
        * Update utils.py
    
        * No print
    
        * Update utils.py
    
        * Update utils.py
    
        * inference
    
        * Update llama.py
    
        * Fast inference RoPE
    
        * Update llama.py
    
        * Update llama.py
    
        * RoPE
    
        * Update llama.py
    
        * Update llama.py
    
        * Update llama.py
    
        * Update llama.py
    
        * Update llama.py
    
        * Update llama.py
    
        * Update llama.py
    
        * Update llama.py
    
        * LoRA
    
        * Fast LoRA saving
    
        * Update llama.py
    
        * hidden_states
    
        * q_len == 1
    
        * q_len issue
    
        * Update mistral.py
    
        * Update mistral.py
    
        * incorrect inference
    
        * Update to transformers 4.37
    
        * Graceful FA2 error + torch 2.1.1
    
        * Update mapper.py
    
        * Update pyproject.toml
    
        * Fix saving and bnb-4bit
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * remove patching
    
        * Update llama.py
    
        * Update llama.py
    
        * Update swiglu.py
    
        * Repatch
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update llama.py
    
        * Update fast_lora.py
    
        * Update llama.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update swiglu.py
    
        * Update fast_lora.py
    
        * Update swiglu.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update save.py
    
        * Update fast_lora.py
    
        * Update utils.py
    
        * Update llama.py
    
        * Update fast_lora.py
    
        * Update swiglu.py
    
        * Update save.py
    
        * Update save.py
    
        * Update llama.py
    
        * Update llama.py
    
        * Update llama.py
    
        * Update llama.py
    
        * Revert "Update llama.py"
    
        This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
    
        * Update llama.py
    
        * Works?
    
        * Update pyproject.toml
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Swiglu
    
        * Update swiglu.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update swiglu.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * attention_mask
    
        * Update llama.py
    
        * Update llama.py
    
        * labels
    
        * Update mistral.py
    
        * Update llama.py
    
        * attention mask
    
        * Update save.py
    
        * Update save.py
    
        * Update mistral.py
    
        * attention mask
    
        * Update llama.py
    
        * Update llama.py
    
        * Update mistral.py
    
        * Update llama.py
    
        * Update llama.py
    
        * Update llama.py
    
        * Update dpo.py
    
        * Patch saving
    
        * Update save.py
    
        * Update save.py
    
        * patch_saving_functions
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * print
    
        * Mistral patch
    
        * Update mistral.py
    
        * Update save.py
    
        * saving
    
        * Update llama.py
    
        * Update llama.py
    
        * Fast inference repatch
    
        * Update llama.py
    
        * Update utils.py
    
        * Update utils.py
    
        * Update utils.py
    
        * Update mistral.py
    
        * Update __init__.py
    
        * Fix inference
    
        * Update mistral.py
    
        * fast lm_head
    
        * Remove fast path
    
        * Update rope_embedding.py
    
        * Update loader.py
    
        * LlamaAttention_fast_forward_inference
    
        * if past_key_value is not None and q_len == 1:
    
        * revert inference
    
        * Update loader.py
    
        * past_key_value
    
    commit 05624642802c7f90dcc7aeea0e1c8d447cde006e
    Author: Daniel Han <danielhanchen@gmail.com>
    Date:   Mon Jan 29 17:49:54 2024 +1100
    
        Fix inference attention mask (#142)
    
    commit 206a9b65f090bd71ccaad7dd88b67ba2bfde0b58
    Author: Daniel Han <danielhanchen@gmail.com>
    Date:   Mon Jan 29 03:45:07 2024 +1100
    
        Nightly (#140)
    
    commit 8faf469f028a05852b2dc29ec8df1f36998fab33
    Author: Daniel Han <danielhanchen@gmail.com>
    Date:   Mon Jan 29 02:52:39 2024 +1100
    
        Fix saving issues (#139)
    
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Swiglu
    
        * Update swiglu.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update swiglu.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * Update fast_lora.py
    
        * attention_mask
    
        * Update llama.py
    
        * Update llama.py
    
        * labels
    
        * Update mistral.py
    
        * Update llama.py
    
        * attention mask
    
        * Update save.py
    
        * Update save.py
    
        * Update mistral.py
    
        * attention mask
    
        * Update llama.py
    
        * Update llama.py
    
        * Update mistral.py
    
        * Update llama.py
    
        * Update llama.py
    
        * Update llama.py
    
        * Update dpo.py
    
        * Patch saving
    
        * Update save.py
    
        * Update save.py
    
        * patch_saving_functions
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * print
    
    commit 1ecc0185a5759c7a0c95dfc96aceea5023cebdfc
    Author: Daniel Han <danielhanchen@gmail.com>
    Date:   Sun Jan 28 04:30:29 2024 +1100
    
        1 more bug (#138)
    
    commit cd32ba76b71adf3317ede9de7d1cf6f30ad3bf0d
    Author: Daniel Han <danielhanchen@gmail.com>
    Date:   Sun Jan 28 04:20:06 2024 +1100
    
        Fix bugs + more accurate Swiglu (#137)
    
    commit 89daa0efcc38c7690abbb8170b5d9f3d364796ce
    Author: Daniel Han <danielhanchen@gmail.com>
    Date:   Sat Jan 27 04:50:22 2024 +1100
    
        Inference bug fix (#134)
    
    commit 87a7ef1049f6fca409a0673f51f4758e0aff248d
    Author: Daniel Han <danielhanchen@gmail.com>
    Date:   Sat Jan 27 04:47:54 2024 +1100
    
        More bug fixes (#133)
    
    commit 3d67790901696e953171f64b4bf9d980780051a0
    Author: Daniel Han <danielhanchen@gmail.com>
    Date:   Fri Jan 26 04:19:17 2024 +1100
    
        Fix bugs (#129)
    
    commit a833f403462e9cfc1f96b3b84d9da15d7d8db5ee
    Author: Daniel Han <danielhanchen@gmail.com>
    Date:   Tue Jan 23 03:55:24 2024 +1100
    
        2-4x faster native HF inference (#119)
    
    commit b370c9c8aacc31a7845404566dd95dfa8c0e3bac
    Author: Daniel Han <danielhanchen@gmail.com>
    Date:   Sun Jan 21 22:20:22 2024 +1100
    
        Hotfix (#118)
    
    commit 57a5b5a49da588b1db8e9a988cc985dc20393d34
    Author: Daniel Han-Chen <danielhanchen@gmail.com>
    Date:   Sun Jan 21 05:00:37 2024 +1100
    
        Update save.py
    
    commit 5145a61e69ab9b3035465f649e1c1e5aae749f8f
    Author: Daniel Han-Chen <danielhanchen@gmail.com>
    Date:   Sun Jan 21 04:21:54 2024 +1100
    
        Update save.py
    
    commit a7bd8d119c16433de4f8b6a36903ef7131f225e5
    Author: Daniel Han-Chen <danielhanchen@gmail.com>
    Date:   Sun Jan 21 04:13:03 2024 +1100
    
        Update save.py
    
    commit be4b97e7d89074b6dd1d2e984fa429051d328192
    Author: Daniel Han <danielhanchen@gmail.com>
    Date:   Sun Jan 21 03:43:49 2024 +1100
    
        Fixed saving! (#113)
    
        * Fix tokenizer, dropout, bias for LoRA
    
        * Update loader.py
    
        * Fix LoRA downcasting
    
        * Update _utils.py
    
        * Saving to GGUF
    
        * fix
    
        * colab_quantize_to_gguf
    
        * move save modules
    
        * save module
    
        * Update __init__.py
    
        * Update save.py
    
        * Temp downgrade due to TRL issue
    
        * Fix up bugs
    
        * Faster saving + other changes
    
        * Update llama.py
    
        * Saving modules
    
        * spelling
    
        * Update llama.py
    
        * Update save.py
    
        * Update save.py
    
        * Update loader.py
    
        * Update llama.py
    
        * patch saving
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * patch saving
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * original_model
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * saving to RAM leakage?
    
        * Update save.py
    
        * new_save_directory
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update pyproject.toml
    
        * Update pyproject.toml
    
        * Update pyproject.toml
    
        * Quick fixes
    
        * Update llama.py
    
        * Update llama.py
    
        * Update dpo.py
    
        * Update dpo.py
    
        * Update llama.py
    
        * Update save.py
    
        * getattr
    
        * RSLoRA and LoftQ direct support
    
        * Update llama.py
    
        * Update llama.py
    
        * Update llama.py
    
        * Fix DPO + GGUF
    
        * Fix quantization_method
    
        * Fix quantization_config
    
        * patch model
    
        * Update llama.py
    
        * Update llama.py
    
        * Update llama.py
    
        * Update save.py
    
        * Update save.py
    
        * tokenizer_save_settings
    
        * Update save.py
    
        * quantization and loftq
    
        * Update save.py
    
        * Update llama.py
    
        * Update save.py
    
        * upload_to_huggingface
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
    commit abb462be71e8cf01ad989dca0efaa17441113651
    Author: Daniel Han <danielhanchen@gmail.com>
    Date:   Sat Jan 20 23:23:00 2024 +1100
    
        Hotfix for Jan 2024 Release (#110)
    
    commit 31e2d71720e64b854145d7779833b7d2d3d4177e
    Author: Daniel Han <danielhanchen@gmail.com>
    Date:   Sat Jan 20 04:25:06 2024 +1100
    
        Quick fixes (#106)
    
        * Fix tokenizer, dropout, bias for LoRA
    
        * Update loader.py
    
        * Fix LoRA downcasting
    
        * Update _utils.py
    
        * Saving to GGUF
    
        * fix
    
        * colab_quantize_to_gguf
    
        * move save modules
    
        * save module
    
        * Update __init__.py
    
        * Update save.py
    
        * Temp downgrade due to TRL issue
    
        * Fix up bugs
    
        * Faster saving + other changes
    
        * Update llama.py
    
        * Saving modules
    
        * spelling
    
        * Update llama.py
    
        * Update save.py
    
        * Update save.py
    
        * Update loader.py
    
        * Update llama.py
    
        * patch saving
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * patch saving
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * original_model
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * saving to RAM leakage?
    
        * Update save.py
    
        * new_save_directory
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update pyproject.toml
    
        * Update pyproject.toml
    
        * Update pyproject.toml
    
        * Quick fixes
    
        * Update llama.py
    
        * Update llama.py
    
        * Update dpo.py
    
        * Update dpo.py
    
        * Update llama.py
    
        * Update save.py
    
        * getattr
    
        * RSLoRA and LoftQ direct support
    
        * Update llama.py
    
        * Update llama.py
    
        * Update llama.py
    
        * Fix DPO + GGUF
    
    commit 8846337e5c8c2f206a4ac8fe6d239f3d1221f7ac
    Author: Daniel Han-Chen <danielhanchen@gmail.com>
    Date:   Sat Jan 20 02:30:31 2024 +1100
    
        Update _utils.py
    
    commit d378df87e5f3945474915a098c9aa58313465064
    Merge: c1e7480 920e3c2
    Author: Daniel Han-Chen <danielhanchen@gmail.com>
    Date:   Fri Jan 19 23:15:38 2024 +1100
    
        Merge branch 'main' of https://github.com/unslothai/unsloth
    
    commit c1e7480ac2ad0e5efa05e84fe0997619ccdd86a4
    Author: Daniel Han-Chen <danielhanchen@gmail.com>
    Date:   Fri Jan 19 23:15:20 2024 +1100
    
        Revert quantization methods
    
    commit 920e3c2ea07a044addeb7c3fa8be6f0189cb7f84
    Author: Daniel Han <danielhanchen@gmail.com>
    Date:   Fri Jan 19 22:57:22 2024 +1100
    
        getattr issues (#103)
    
        * Fix tokenizer, dropout, bias for LoRA
    
        * Update loader.py
    
        * Fix LoRA downcasting
    
        * Update _utils.py
    
        * Saving to GGUF
    
        * fix
    
        * colab_quantize_to_gguf
    
        * move save modules
    
        * save module
    
        * Update __init__.py
    
        * Update save.py
    
        * Temp downgrade due to TRL issue
    
        * Fix up bugs
    
        * Faster saving + other changes
    
        * Update llama.py
    
        * Saving modules
    
        * spelling
    
        * Update llama.py
    
        * Update save.py
    
        * Update save.py
    
        * Update loader.py
    
        * Update llama.py
    
        * patch saving
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * patch saving
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * original_model
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * saving to RAM leakage?
    
        * Update save.py
    
        * new_save_directory
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update pyproject.toml
    
        * Update pyproject.toml
    
        * Update pyproject.toml
    
        * Quick fixes
    
        * Update llama.py
    
        * Update llama.py
    
        * Update dpo.py
    
        * Update dpo.py
    
        * Update llama.py
    
        * Update save.py
    
        * getattr
    
    commit fc25ab0df032f8ee5ea750f27c68d63f49d2d9a9
    Author: Daniel Han <danielhanchen@gmail.com>
    Date:   Fri Jan 19 22:52:30 2024 +1100
    
        Quick fixes (#101)
    
        * Fix tokenizer, dropout, bias for LoRA
    
        * Update loader.py
    
        * Fix LoRA downcasting
    
        * Update _utils.py
    
        * Saving to GGUF
    
        * fix
    
        * colab_quantize_to_gguf
    
        * move save modules
    
        * save module
    
        * Update __init__.py
    
        * Update save.py
    
        * Temp downgrade due to TRL issue
    
        * Fix up bugs
    
        * Faster saving + other changes
    
        * Update llama.py
    
        * Saving modules
    
        * spelling
    
        * Update llama.py
    
        * Update save.py
    
        * Update save.py
    
        * Update loader.py
    
        * Update llama.py
    
        * patch saving
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * patch saving
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * original_model
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * saving to RAM leakage?
    
        * Update save.py
    
        * new_save_directory
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update pyproject.toml
    
        * Update pyproject.toml
    
        * Update pyproject.toml
    
        * Quick fixes
    
        * Update llama.py
    
        * Update llama.py
    
        * Update dpo.py
    
        * Update dpo.py
    
        * Update llama.py
    
        * Update save.py
    
    commit b8b1eafda35d124046e11766aeeb6343957e0daf
    Author: Daniel Han <danielhanchen@gmail.com>
    Date:   Fri Jan 19 04:51:19 2024 +1100
    
        2024 Release (#96)
    
        * Fix tokenizer, dropout, bias for LoRA
    
        * Update loader.py
    
        * Fix LoRA downcasting
    
        * Update _utils.py
    
        * Saving to GGUF
    
        * fix
    
        * colab_quantize_to_gguf
    
        * move save modules
    
        * save module
    
        * Update __init__.py
    
        * Update save.py
    
        * Temp downgrade due to TRL issue
    
        * Fix up bugs
    
        * Faster saving + other changes
    
        * Update llama.py
    
        * Saving modules
    
        * spelling
    
        * Update llama.py
    
        * Update save.py
    
        * Update save.py
    
        * Update loader.py
    
        * Update llama.py
    
        * patch saving
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * patch saving
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * original_model
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * saving to RAM leakage?
    
        * Update save.py
    
        * new_save_directory
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update save.py
    
        * Update pyproject.toml
    
        * Update pyproject.toml
    
        * Update pyproject.toml
    
    commit 4112eb4a3df4c0911e36211b47381086c963b4e0
    Author: Daniel Han-Chen <danielhanchen@gmail.com>
    Date:   Fri Jan 19 03:41:00 2024 +1100
    
        Update pyproject.toml
    
    commit 59d74753362ff59e664cb6d650b564511e6e20f3
    Author: Daniel Han-Chen <danielhanchen@gmail.com>
    Date:   Fri Jan 19 03:35:17 2024 +1100
    
        Update pyproject.toml
    
    commit c1ac4d2707574868767345e76ebe49c8353f9057
    Author: Daniel Han <danielhanchen@gmail.com>
    Date:   Thu Jan 11 04:08:03 2024 +1100
    
        Fix some bugs (#83)
    
        * Fix tokenizer, dropout, bias for LoRA
    
        * Update loader.py
    
        * Fix LoRA downcasting
    
        * Update _utils.py
    
        * Saving to GGUF
    
        * fix
    
        * colab_quantize_to_gguf
    
        * move save modules
    
        * save module
    
        * Update __init__.py
    
        * Update save.py
    
        * Temp downgrade due to TRL issue
    
        * Fix up bugs
    
    commit d3887c7fd93d9b910bf6ee3ab3c7fd485fc55e46
    Author: Daniel Han <danielhanchen@gmail.com>
    Date:   Wed Jan 10 23:10:48 2024 +1100
    
        Update README.md (#81)
    
    commit b5d94d9a0ad9532494e1b3c7badbb94fa92c50eb
    Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
    Date:   Wed Jan 10 23:10:23 2024 +1100
    
        Discord button redo (#80)
    
    commit 01d7f58e11373ab07b9282a42bc14f542dbdabf0
    Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
    Date:   Wed Jan 10 23:02:20 2024 +1100
    
        Update logos (#79)
    
        * HF Perf Button
    
        * Update README.md
    
        Adding new buttons cleanup
    
        * Update README.md
    
        * Delete images/Discord.png
    
        * Delete images/try live demo green.png
    
        * new transparent logos
    
        * Revamping page
    
        * Revamp mainpage
    
        * Update README.md
    
        * Update README.md
    
    commit 9faaf5b388e025f8ffc302450a12ffb84e7e1750
    Author: Daniel Han <danielhanchen@gmail.com>
    Date:   Wed Jan 10 20:03:01 2024 +1100
    
        Create FUNDING.yml (#78)
    
    commit 82e6fece0b78011707090639823d2d7acf5a3864
    Author: Daniel Han-Chen <danielhanchen@gmail.com>
    Date:   Wed Jan 10 01:02:44 2024 +1100
    
        fix_tokenizer
    
    commit b52278199b7ae2764f242622275bb8a85ba7b721
    Author: Daniel Han-Chen <danielhanchen@gmail.com>
    Date:   Tue Jan 9 23:40:43 2024 +1100
    
        check_tokenizer
    
    ---------
    
    Co-authored-by: Daniel Han <danielhanchen@gmail.com>
    2 people authored and cm2435 committed Feb 26, 2024
    Commit 3618e5b
  24. Torch 2.2 (unslothai#157)

    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update save.py
    
    * Update fast_lora.py
    
    * Update utils.py
    
    * Update llama.py
    
    * Update fast_lora.py
    
    * Update swiglu.py
    
    * Update save.py
    
    * Update save.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Revert "Update llama.py"
    
    This reverts commit a208ec4.
    
    * Update llama.py
    
    * Works?
    
    * Update pyproject.toml
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Swiglu
    
    * Update swiglu.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update swiglu.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * attention_mask
    
    * Update llama.py
    
    * Update llama.py
    
    * labels
    
    * Update mistral.py
    
    * Update llama.py
    
    * attention mask
    
    * Update save.py
    
    * Update save.py
    
    * Update mistral.py
    
    * attention mask
    
    * Update llama.py
    
    * Update llama.py
    
    * Update mistral.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update dpo.py
    
    * Patch saving
    
    * Update save.py
    
    * Update save.py
    
    * patch_saving_functions
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * print
    
    * Mistral patch
    
    * Update mistral.py
    
    * Update save.py
    
    * saving
    
    * Update llama.py
    
    * Update llama.py
    
    * Fast inference repatch
    
    * Update llama.py
    
    * Update utils.py
    
    * Update utils.py
    
    * Update utils.py
    
    * Update mistral.py
    
    * Update __init__.py
    
    * Fix inference
    
    * Update mistral.py
    
    * fast lm_head
    
    * Remove fast path
    
    * Update rope_embedding.py
    
    * Update loader.py
    
    * LlamaAttention_fast_forward_inference
    
    * if past_key_value is not None and q_len == 1:
    
    * revert inference
    
    * Update loader.py
    
    * past_key_value
    
    * Update llama.py
    
    * Update llama.py
    
    * Fix SDPA
    
    * Update llama.py
    
    * padding
    
    * Inference
    
    * Update llama.py
    
    * Revert
    
    * Update mistral.py
    
    * faster inference
    
    * inference
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * inference
    
    * Update llama.py
    
    * Update utils.py
    
    * faster inference
    
    * Update llama.py
    
    * revert
    
    * lm_head
    
    * Update llama.py
    
    * inference
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update mistral.py
    
    * Update llama.py
    
    * faster inference
    
    * Update llama.py
    
    * fast inference
    
    * Update llama.py
    
    * Update llama.py
    
    * Update mistral.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * torch compile
    
    * past_key_values
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update utils.py
    
    * Update utils.py
    
    * Update utils.py
    
    * Update utils.py
    
    * Update llama.py
    
    * fast inference + saving config.json
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update mistral.py
    
    * fast inference again
    
    * more temp matrices
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * fast inference
    
    * Update mistral.py
    
    * Update llama.py
    
    * SDPA
    
    * attention_mask
    
    * New version
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update utils.py
    
    * Update utils.py
    
    * Update save.py
    
    * Update save.py
    
    * Torch 2.2.0
    
    * Update save.py
    
    * mistral swa
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Fix SWA inference
    
    * Fix llm_int8_skip_modules
    
    * SWA inference
    
    * Update save.py
    
    * Update save.py
    
    * Update pyproject.toml
    
    * __version__
    
    * __version__
    
    * Update save.py
    
    * Update save.py
    
    * Update mistral.py
    danielhanchen authored and cm2435 committed Feb 26, 2024
    Commit 2d5f7ed
  25. Nightly (unslothai#161)

    * Update fast_lora.py
    
    * Update utils.py
    
    * Update llama.py
    
    * Update fast_lora.py
    
    * Update swiglu.py
    
    * Update save.py
    
    * Update save.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Revert "Update llama.py"
    
    This reverts commit a208ec4.
    
    * Update llama.py
    
    * Works?
    
    * Update pyproject.toml
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Swiglu
    
    * Update swiglu.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update swiglu.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * attention_mask
    
    * Update llama.py
    
    * Update llama.py
    
    * labels
    
    * Update mistral.py
    
    * Update llama.py
    
    * attention mask
    
    * Update save.py
    
    * Update save.py
    
    * Update mistral.py
    
    * attention mask
    
    * Update llama.py
    
    * Update llama.py
    
    * Update mistral.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update dpo.py
    
    * Patch saving
    
    * Update save.py
    
    * Update save.py
    
    * patch_saving_functions
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * print
    
    * Mistral patch
    
    * Update mistral.py
    
    * Update save.py
    
    * saving
    
    * Update llama.py
    
    * Update llama.py
    
    * Fast inference repatch
    
    * Update llama.py
    
    * Update utils.py
    
    * Update utils.py
    
    * Update utils.py
    
    * Update mistral.py
    
    * Update __init__.py
    
    * Fix inference
    
    * Update mistral.py
    
    * fast lm_head
    
    * Remove fast path
    
    * Update rope_embedding.py
    
    * Update loader.py
    
    * LlamaAttention_fast_forward_inference
    
    * if past_key_value is not None and q_len == 1:
    
    * revert inference
    
    * Update loader.py
    
    * past_key_value
    
    * Update llama.py
    
    * Update llama.py
    
    * Fix SDPA
    
    * Update llama.py
    
    * padding
    
    * Inference
    
    * Update llama.py
    
    * Revert
    
    * Update mistral.py
    
    * faster inference
    
    * inference
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * inference
    
    * Update llama.py
    
    * Update utils.py
    
    * faster inference
    
    * Update llama.py
    
    * revert
    
    * lm_head
    
    * Update llama.py
    
    * inference
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update mistral.py
    
    * Update llama.py
    
    * faster inference
    
    * Update llama.py
    
    * fast inference
    
    * Update llama.py
    
    * Update llama.py
    
    * Update mistral.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * torch compile
    
    * past_key_values
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update utils.py
    
    * Update utils.py
    
    * Update utils.py
    
    * Update utils.py
    
    * Update llama.py
    
    * fast inference + saving config.json
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update mistral.py
    
    * fast inference again
    
    * more temp matrices
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * fast inference
    
    * Update mistral.py
    
    * Update llama.py
    
    * SDPA
    
    * attention_mask
    
    * New version
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update utils.py
    
    * Update utils.py
    
    * Update save.py
    
    * Update save.py
    
    * Torch 2.2.0
    
    * Update save.py
    
    * mistral swa
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Fix SWA inference
    
    * Fix llm_int8_skip_modules
    
    * SWA inference
    
    * Update save.py
    
    * Update save.py
    
    * Update pyproject.toml
    
    * __version__
    
    * __version__
    
    * Update save.py
    
    * Update save.py
    
    * Update mistral.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    danielhanchen authored and cm2435 committed Feb 26, 2024
    Commit e62c037
  26. Update README.md (unslothai#162)

    danielhanchen authored and cm2435 committed Feb 26, 2024
    Commit e81b78d
  27. Update mapper.py

    danielhanchen authored and cm2435 committed Feb 26, 2024
    Commit c3ea900
  28. Update README.md (unslothai#164)

    danielhanchen authored and cm2435 committed Feb 26, 2024
    Commit 1f5f2e3
  29. Update README.md (unslothai#165)

    danielhanchen authored and cm2435 committed Feb 26, 2024
    Commit 53b7af5
  30. Commit 38c3f43
  31. Prelim Feb release (unslothai#173)

    * Works?
    
    * Update pyproject.toml
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Swiglu
    
    * Update swiglu.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update swiglu.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * Update fast_lora.py
    
    * attention_mask
    
    * Update llama.py
    
    * Update llama.py
    
    * labels
    
    * Update mistral.py
    
    * Update llama.py
    
    * attention mask
    
    * Update save.py
    
    * Update save.py
    
    * Update mistral.py
    
    * attention mask
    
    * Update llama.py
    
    * Update llama.py
    
    * Update mistral.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update dpo.py
    
    * Patch saving
    
    * Update save.py
    
    * Update save.py
    
    * patch_saving_functions
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * print
    
    * Mistral patch
    
    * Update mistral.py
    
    * Update save.py
    
    * saving
    
    * Update llama.py
    
    * Update llama.py
    
    * Fast inference repatch
    
    * Update llama.py
    
    * Update utils.py
    
    * Update utils.py
    
    * Update utils.py
    
    * Update mistral.py
    
    * Update __init__.py
    
    * Fix inference
    
    * Update mistral.py
    
    * fast lm_head
    
    * Remove fast path
    
    * Update rope_embedding.py
    
    * Update loader.py
    
    * LlamaAttention_fast_forward_inference
    
    * if past_key_value is not None and q_len == 1:
    
    * revert inference
    
    * Update loader.py
    
    * past_key_value
    
    * Update llama.py
    
    * Update llama.py
    
    * Fix SDPA
    
    * Update llama.py
    
    * padding
    
    * Inference
    
    * Update llama.py
    
    * Revert
    
    * Update mistral.py
    
    * faster inference
    
    * inference
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * inference
    
    * Update llama.py
    
    * Update utils.py
    
    * faster inference
    
    * Update llama.py
    
    * revert
    
    * lm_head
    
    * Update llama.py
    
    * inference
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update mistral.py
    
    * Update llama.py
    
    * faster inference
    
    * Update llama.py
    
    * fast inference
    
    * Update llama.py
    
    * Update llama.py
    
    * Update mistral.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * torch compile
    
    * past_key_values
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update utils.py
    
    * Update utils.py
    
    * Update utils.py
    
    * Update utils.py
    
    * Update llama.py
    
    * fast inference + saving config.json
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update mistral.py
    
    * fast inference again
    
    * more temp matrices
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * fast inference
    
    * Update mistral.py
    
    * Update llama.py
    
    * SDPA
    
    * attention_mask
    
    * New version
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update utils.py
    
    * Update utils.py
    
    * Update save.py
    
    * Update save.py
    
    * Torch 2.2.0
    
    * Update save.py
    
    * mistral swa
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Update save.py
    
    * Fix SWA inference
    
    * Fix llm_int8_skip_modules
    
    * SWA inference
    
    * Update save.py
    
    * Update save.py
    
    * Update pyproject.toml
    
    * __version__
    
    * __version__
    
    * Update save.py
    
    * Update save.py
    
    * Update mistral.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Update llama.py
    
    * Chat Templates
    
    * Update chat_templates.py
    
    * Update chat_templates.py
    
    * Update chat_templates.py
    
    * Update chat_templates.py
    
    * patch tokenizer
    
    * Update chat_templates.py
    
    * Saving, LlamaRotaryEmbedding issues
    
    * Update llama.py
    
    * Update mistral.py
    danielhanchen authored and cm2435 committed Feb 26, 2024
    c5fd5cb
  32. 0bb66a9
  33. e031ed8
  34. fix partial rope embedding

    cm2435 committed Feb 26, 2024
    c8198a0
  35. added new kernel test

    cm2435 committed Feb 26, 2024
    69005f4
  36. resolve merge conflicts

    cm2435 committed Feb 26, 2024
    63328d1

Commits on Mar 4, 2024

  1. 2d19215
  2. 1142330