<h1 align="center" style="margin:0;">
<a href="https://unsloth.ai/docs"><picture>
<source media="(prefers-color-scheme: dark)" srcset="https://raw.githubusercontent.com/unslothai/unsloth/main/images/STUDIO%20WHITE%20LOGO.png">
<source media="(prefers-color-scheme: light)" srcset="https://raw.githubusercontent.com/unslothai/unsloth/main/images/STUDIO%20BLACK%20LOGO.png">
<img alt="Unsloth logo" src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/STUDIO%20BLACK%20LOGO.png" height="60" style="max-width:100%;">
ReadMe Revamp (#156)
* HF Perf button; new buttons cleanup
* New transparent logos; deleted images/Discord.png and images/try live demo green.png
* Free finetune button (replacing start free finetune button.png)
* Revamped main page, plus many iterative README.md updates and image uploads
* Squashed commit of the following:
commit efa0d2332ebc6d8f215aec07d5cc9907f4e84f34
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Feb 4 17:35:56 2024 +1100
2x faster inference (#151)
* Faster inference paths in llama.py and mistral.py: LlamaAttention_fast_forward_inference with a past_key_value fast path when q_len == 1, fast lm_head, SDPA fixes, padding and attention_mask handling, torch compile, past_key_values, and more temporary matrices for decoding
* Fast inference + saving config.json; patch_saving_functions and a Mistral saving patch
* SwiGLU and fast LoRA kernel updates (swiglu.py, fast_lora.py)
* Many iterative updates to llama.py, mistral.py, save.py, utils.py, rope_embedding.py, loader.py, dpo.py, __init__.py, and pyproject.toml; one revert of commit a208ec46e012cf470ecefe6268a66358215df7b6
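The `past_key_value is not None and q_len == 1` condition mentioned above gates a single-token decode fast path: once a KV cache exists and only one new token is generated per step, the new query may attend to every cached position, so the causal mask can be skipped. A minimal NumPy sketch of the idea (illustrative only; `decode_one_token` is a hypothetical name, not the actual kernel in `llama.py`):

```python
import numpy as np

def decode_one_token(q, k_cache, v_cache, k_new, v_new):
    """Single-token decode step: q has seq_len == 1, so no causal
    mask is needed -- the new token sees every cached position."""
    k = np.concatenate([k_cache, k_new], axis=0)   # (t + 1, head_dim)
    v = np.concatenate([v_cache, v_new], axis=0)
    scores = (q @ k.T) / np.sqrt(q.shape[-1])      # (1, t + 1)
    weights = np.exp(scores - scores.max())        # stable softmax
    weights /= weights.sum()
    return weights @ v, k, v                       # output plus updated cache
```

The returned `k`/`v` become the cache for the next step, which is why the bullets above mention pre-allocating extra temporary matrices for decoding.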
commit 2f55935f941eb61816b145575389f91dde4e00f7
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 31 04:03:37 2024 +1100
Hotfix - fix inference (#146)
* Faster saving & inference; fast_linear_forward and fast lm_head
* Mistral correct RoPE scaling; max sequence lengths; Apache 2 license
* Fast inference RoPE; fast LoRA saving; hidden_states and q_len == 1 fixes; fixed incorrect inference
* Update to transformers 4.37; graceful FA2 error + torch 2.1.1
* Fix saving and bnb-4bit; patch_saving_functions and a Mistral patch
* Fast inference repatch; LlamaAttention_fast_forward_inference gated on past_key_value is not None and q_len == 1
* attention_mask and labels handling; SwiGLU updates; LoRA patching removed then repatched
* Many iterative updates to fast_lora.py, llama.py, mistral.py, save.py, swiglu.py, utils.py, mapper.py, rope_embedding.py, loader.py, dpo.py, __init__.py, and pyproject.toml; one revert of commit a208ec46e012cf470ecefe6268a66358215df7b6
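The "Mistral correct RoPE scaling" item above concerns rotary position embeddings. As a rough illustration of what linear RoPE position scaling computes, here is a generic sketch under assumed conventions (hypothetical helper, not the code in `rope_embedding.py`):

```python
import numpy as np

def rope_angles(position_ids, head_dim, base=10000.0, scaling_factor=1.0):
    """Rotation angles for rotary position embeddings (RoPE).

    Linear scaling divides positions by `scaling_factor`, letting a
    model trained on a short context interpolate over a longer one."""
    inv_freq = 1.0 / (base ** (np.arange(0, head_dim, 2) / head_dim))
    positions = np.asarray(position_ids, dtype=np.float64) / scaling_factor
    return np.outer(positions, inv_freq)   # (seq_len, head_dim // 2)
```

Each angle pair then rotates the corresponding two query/key channels; getting `scaling_factor` and `base` right per model family is exactly the kind of detail the fix addresses.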
commit a3a2ad93821cede32723843dfb3dfbfe0387d25e
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 17:49:54 2024 +1100
Fix inference attention mask (#142)
(squashed history omitted: verbatim repeat of the beginning of the Hotfix - fix inference (#146) body above)
commit 90309ca8dcb06f0611c1bde4a61eb08fb7317993
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 03:45:07 2024 +1100
Nightly (#140)
(squashed history omitted: verbatim repeat of the beginning of the Hotfix - fix inference (#146) body above)
commit a16bc73e8077fd3c6a034741ae782bcfeb9fa278
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 02:52:39 2024 +1100
Fix saving issues (#139)
(squashed history omitted: verbatim repeat of the beginning of the Hotfix - fix inference (#146) body above)
commit af332245543b1f9ac129b67e5c350047c967846d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:30:29 2024 +1100
1 more bug (#138)
(squashed history omitted: verbatim repeat of the beginning of the Hotfix - fix inference (#146) body above)
commit e2bbd3819e0899e09787a985cd11c08961f09c09
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:20:06 2024 +1100
Fix bugs + more accurate Swiglu (#137)
(squashed history omitted: verbatim repeat of the beginning of the Hotfix - fix inference (#146) body above)
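"More accurate Swiglu" in the commit title above refers to the gated activation used in Llama-style MLP blocks. A minimal reference sketch of the math (plain NumPy with hypothetical helper names; the repo's `swiglu.py` provides fused, faster kernels rather than this naive form):

```python
import numpy as np

def silu(x):
    # SiLU (swish) activation: x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

def swiglu(gate, up):
    # SwiGLU gating as in Llama-style MLPs:
    # h -> down_proj(silu(gate_proj(h)) * up_proj(h))
    return silu(gate) * up
```

Accuracy here hinges on computing `silu` and the product in sufficient precision, which is what a "more accurate Swiglu" kernel tightens up.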
commit a81aff286f1e67c82b2a5105679c85866f624629
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:50:22 2024 +1100
Inference bug fix (#134)
(squashed history omitted: verbatim repeat of the beginning of the Hotfix - fix inference (#146) body above)
commit 7da0c50f757b6b2d9cbe660ee68d23700f2e2b0d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:47:54 2024 +1100
More bug fixes (#133)
(squashed history omitted: verbatim repeat of the beginning of the Hotfix - fix inference (#146) body above)
commit 62fae3aa740869db2fe1522ea38b334ef090d5e7
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 26 04:19:17 2024 +1100
Fix bugs (#129)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
commit 04f8771821a57fda5109d60b0fe49bb31d0df15b
Author: Daniel Han <danielhanchen@gmail.com>
Date: Tue Jan 23 03:55:24 2024 +1100
2-4x faster native HF inference (#119)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
commit 3a9b2dee98fd0547789da9b68e765f054484abc4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 22:20:22 2024 +1100
Hotfix (#118)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
commit a6f4fb007510aeb2a86500d874f2117e81853d7e
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 05:00:37 2024 +1100
Update save.py
commit 705cac03576fe2fff3923841c102a8bd6b72a65b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:21:54 2024 +1100
Update save.py
commit 16edcb3be2c328f3377aff6555e6435b28980a52
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:13:03 2024 +1100
Update save.py
commit 3d05a74b12edd39638aacf3b44eca65818c6708a
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 03:43:49 2024 +1100
Fixed saving! (#113)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
* upload_to_huggingface
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
commit bb05d6b6e2af2c8807ae4842dcbc2805c9356599
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 23:23:00 2024 +1100
Hotfix for Jan 2024 Release (#110)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
commit 12e75c93d040f99d5a0cc4c4ee162d804c9fbbf4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 04:25:06 2024 +1100
Quick fixes (#106)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
commit 52b5ef31e0cdd96d5b980a1581d3c26c5b89c86c
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sat Jan 20 02:30:31 2024 +1100
Update _utils.py
commit 1a19c38675a35e6121fa4a95438525f306bca26b
Merge: 0a52390 0d6e52b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:38 2024 +1100
Merge branch 'main' of https://github.com/unslothai/unsloth
commit 0a52390ac29a78399b033349070fe1d1280bd296
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:20 2024 +1100
Revert quantization methods
commit 0d6e52b5c7723ed5c78b54c9a6eb67a1997f6038
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:57:22 2024 +1100
getattr issues (#103)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
commit b3fcea642127ee381a3cf19d33fb8910d066643c
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:52:30 2024 +1100
Quick fixes (#101)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
commit d691516ab9d64ea61b0af277f3955336a434694d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 04:51:19 2024 +1100
2024 Release (#96)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
commit 9e2dec16fb29ee97572b4431e892e3f7ca867422
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:41:00 2024 +1100
Update pyproject.toml
commit 396c7245dda2c913e6b97729fd34e7551dc8e9fa
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:35:17 2024 +1100
Update pyproject.toml
commit 738e91591f3fb39ce03238134fd0d82a84f4b2e3
Author: Daniel Han <danielhanchen@gmail.com>
Date: Thu Jan 11 04:08:03 2024 +1100
Fix some bugs (#83)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
commit a1da50b5ce53f8e57a1b01db607b32f4d0d862e5
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 23:10:48 2024 +1100
Update README.md (#81)
commit 606e8a928440f396601c1d57a003c0401ba26ec0
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:10:23 2024 +1100
Discord button redo (#80)
commit 0169294ffb19fdb877170529381f25bd0f83fc3c
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:02:20 2024 +1100
Update logos (#79)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
commit b2a8c33430e4a31cf7baafe184d448bb50595bb1
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 20:03:01 2024 +1100
Create FUNDING.yml (#78)
commit c9c1abf29045b3831f62099ff03c5b54b99522a6
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Wed Jan 10 01:02:44 2024 +1100
fix_tokenizer
commit 6efffb46e42543986c637690a045092226af5d61
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Tue Jan 9 23:40:43 2024 +1100
check_tokenizer
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
2024-02-06 15:00:12 +00:00
</picture></a>
</h1>
<h3 align="center" style="margin: 0; margin-top: 0;">
Run and train AI models with a unified local interface.
</h3>
<p align="center">
<a href="#-features">Features</a> •
<a href="#-quickstart">Quickstart</a> •
<a href="#-free-notebooks">Notebooks</a> •
<a href="https://unsloth.ai/docs">Documentation</a> •
<a href="https://discord.com/invite/unsloth">Discord</a>
</p>
<a href="https://unsloth.ai/docs/new/studio">
<img alt="unsloth studio ui homepage" src="https://raw.githubusercontent.com/unslothai/unsloth/main/studio/frontend/public/studio%20github%20landscape%20colab%20display.png" style="max-width: 100%; margin-bottom: 0;"></a>
Unsloth Studio (Beta) lets you run and train text, [audio](https://unsloth.ai/docs/basics/text-to-speech-tts-fine-tuning), [embedding](https://unsloth.ai/docs/new/embedding-finetuning), and [vision](https://unsloth.ai/docs/basics/vision-fine-tuning) models on Windows, Linux, and macOS.
## ⭐ Features
Unsloth provides several key features for both inference and training:
### Inference
* **Search, download, and run models**, including GGUF, LoRA adapters, and safetensors.
* **Export models**: [Save or export](https://unsloth.ai/docs/new/studio/export) models to GGUF, 16-bit safetensors, and other formats.
* **Tool calling**: Support for [self-healing tool calling](https://unsloth.ai/docs/new/studio/chat#auto-healing-tool-calling) and web search.
* **[Code execution](https://unsloth.ai/docs/new/studio/chat#code-execution)**: Lets LLMs test code in Claude artifacts and sandboxed environments.
* [Auto-tune inference parameters](https://unsloth.ai/docs/new/studio/chat#auto-parameter-tuning) and customize chat templates.
* Upload images, audio, PDFs, code, DOCX, and other file types to chat with.
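For intuition on the chat-template bullet above: a chat template is just a deterministic rendering of a message list into the single prompt string a model was trained on. A minimal hand-rolled sketch in ChatML-style markup (illustrative only; real templates are Jinja2 strings shipped with each tokenizer, and `render_chatml` is a hypothetical helper, not an Unsloth API):

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render a message list as a ChatML-style prompt (illustrative sketch)."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")  # cue the model to reply
    return "\n".join(parts)

prompt = render_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user",   "content": "Hi!"},
])
print(prompt)
```

Customizing a template only changes this rendering step; the model's weights are untouched, which is why a wrong template silently degrades output quality.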
### Training
* Train **500+ models** up to **2x faster** with up to **70% less VRAM**, with no accuracy loss.
* Supports full fine-tuning, pretraining, and 4-bit, 16-bit, and FP8 training.
* **Observability**: Monitor training live, track loss and GPU usage, and customize graphs.
* **Data Recipes**: [Auto-create datasets](https://unsloth.ai/docs/new/studio/data-recipe) from **PDF, CSV, DOCX**, and more. Edit data in a visual, node-based workflow.
* **Reinforcement Learning**: The most efficient [RL](https://unsloth.ai/docs/get-started/reinforcement-learning-rl-guide) library, using **80% less VRAM** for GRPO, [FP8](https://unsloth.ai/docs/get-started/reinforcement-learning-rl-guide/fp8-reinforcement-learning), and more.
* [Multi-GPU](https://unsloth.ai/docs/basics/multi-gpu-training-with-unsloth) training is supported, with major improvements coming soon.
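A back-of-envelope sketch of where VRAM savings like those above come from (illustrative arithmetic only, not Unsloth's internal accounting): weight memory scales linearly with bits per parameter, and LoRA keeps optimizer state tiny by training only small adapter matrices. The 8B model size and 0.5% adapter fraction are assumptions for the example.

```python
def weight_gib(n_params: float, bits: int) -> float:
    """Memory for the weights alone, in GiB (ignores activations and KV cache)."""
    return n_params * bits / 8 / 2**30

n = 8e9                         # an 8B-parameter model (illustrative)
full_fp16 = weight_gib(n, 16)   # ~14.9 GiB of weights
quant_4b  = weight_gib(n, 4)    # ~3.7 GiB of weights

# LoRA trains only low-rank adapters, here assumed ~0.5% of parameters. Adam
# then keeps fp32 copies plus two moment buffers for the adapters only (3x 32-bit).
adapter   = 0.005 * n
opt_state = weight_gib(adapter, 32) * 3

print(f"fp16 full weights:   {full_fp16:5.1f} GiB")
print(f"4-bit QLoRA weights: {quant_4b:5.1f} GiB + {opt_state:.2f} GiB optimizer state")
```

The dominant savings come from quantizing the frozen base weights; shrinking optimizer state via LoRA is what makes fine-tuning fit alongside them on a single consumer GPU.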
## ⚡ Quickstart
Unsloth can be used in two ways: through **[Unsloth Studio](https://unsloth.ai/docs/new/studio/)**, the web UI, or through **Unsloth Core**, the code-based library. Each has different requirements.
### Unsloth Studio (web UI)
Unsloth Studio (Beta) works on **Windows, Linux, WSL**, and **macOS**.
ReadMe Revamp (#156)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* finetune button
* Delete start free finetune button.png
* free finetune button
* Add files via upload
* Update README.md
* Update README.md
* Add files via upload
* Add files via upload
* Update README.md
* Add files via upload
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Squashed commit of the following:
commit efa0d2332ebc6d8f215aec07d5cc9907f4e84f34
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Feb 4 17:35:56 2024 +1100
2x faster inference (#151)
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
* Fast inference repatch
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update mistral.py
* Update __init__.py
* Fix inference
* Update mistral.py
* fast lm_head
* Remove fast path
* Update rope_embedding.py
* Update loader.py
* LlamaAttention_fast_forward_inference
* if past_key_value is not None and q_len == 1:
* revert inference
* Update loader.py
* past_key_value
* Update llama.py
* Update llama.py
* Fix SDPA
* Update llama.py
* padding
* Inference
* Update llama.py
* Revert
* Update mistral.py
* faster inference
* inference
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* inference
* Update llama.py
* Update utils.py
* faster inference
* Update llama.py
* revert
* lm_head
* Update llama.py
* inference
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* faster inference
* Update llama.py
* fast inference
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* torch compile
* past_key_values
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update llama.py
* fast inference + saving config.json
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* fast inference again
* more temp matrices
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update mistral.py
* Update llama.py
* SDPA
* attention_mask
* New version
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update utils.py
* Update utils.py
commit 2f55935f941eb61816b145575389f91dde4e00f7
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 31 04:03:37 2024 +1100
Hotfix - fix inference (#146)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
* Fast inference repatch
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update mistral.py
* Update __init__.py
* Fix inference
* Update mistral.py
* fast lm_head
* Remove fast path
* Update rope_embedding.py
* Update loader.py
* LlamaAttention_fast_forward_inference
* if past_key_value is not None and q_len == 1:
* revert inference
* Update loader.py
* past_key_value
commit a3a2ad93821cede32723843dfb3dfbfe0387d25e
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 17:49:54 2024 +1100
Fix inference attention mask (#142)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
commit 90309ca8dcb06f0611c1bde4a61eb08fb7317993
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 03:45:07 2024 +1100
Nightly (#140)
commit a16bc73e8077fd3c6a034741ae782bcfeb9fa278
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 02:52:39 2024 +1100
Fix saving issues (#139)
commit af332245543b1f9ac129b67e5c350047c967846d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:30:29 2024 +1100
1 more bug (#138)
commit e2bbd3819e0899e09787a985cd11c08961f09c09
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:20:06 2024 +1100
Fix bugs + more accurate Swiglu (#137)
commit a81aff286f1e67c82b2a5105679c85866f624629
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:50:22 2024 +1100
Inference bug fix (#134)
commit 7da0c50f757b6b2d9cbe660ee68d23700f2e2b0d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:47:54 2024 +1100
More bug fixes (#133)
commit 62fae3aa740869db2fe1522ea38b334ef090d5e7
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 26 04:19:17 2024 +1100
Fix bugs (#129)
commit 04f8771821a57fda5109d60b0fe49bb31d0df15b
Author: Daniel Han <danielhanchen@gmail.com>
Date: Tue Jan 23 03:55:24 2024 +1100
2-4x faster native HF inference (#119)
commit 3a9b2dee98fd0547789da9b68e765f054484abc4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 22:20:22 2024 +1100
Hotfix (#118)
commit a6f4fb007510aeb2a86500d874f2117e81853d7e
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 05:00:37 2024 +1100
Update save.py
commit 705cac03576fe2fff3923841c102a8bd6b72a65b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:21:54 2024 +1100
Update save.py
commit 16edcb3be2c328f3377aff6555e6435b28980a52
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:13:03 2024 +1100
Update save.py
commit 3d05a74b12edd39638aacf3b44eca65818c6708a
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 03:43:49 2024 +1100
Fixed saving! (#113)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
* upload_to_huggingface
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
commit bb05d6b6e2af2c8807ae4842dcbc2805c9356599
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 23:23:00 2024 +1100
Hotfix for Jan 2024 Release (#110)
commit 12e75c93d040f99d5a0cc4c4ee162d804c9fbbf4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 04:25:06 2024 +1100
Quick fixes (#106)
commit 52b5ef31e0cdd96d5b980a1581d3c26c5b89c86c
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sat Jan 20 02:30:31 2024 +1100
Update _utils.py
commit 1a19c38675a35e6121fa4a95438525f306bca26b
Merge: 0a52390 0d6e52b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:38 2024 +1100
Merge branch 'main' of https://github.com/unslothai/unsloth
commit 0a52390ac29a78399b033349070fe1d1280bd296
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:20 2024 +1100
Revert quantization methods
commit 0d6e52b5c7723ed5c78b54c9a6eb67a1997f6038
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:57:22 2024 +1100
getattr issues (#103)
commit b3fcea642127ee381a3cf19d33fb8910d066643c
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:52:30 2024 +1100
Quick fixes (#101)
commit d691516ab9d64ea61b0af277f3955336a434694d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 04:51:19 2024 +1100
2024 Release (#96)
commit 9e2dec16fb29ee97572b4431e892e3f7ca867422
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:41:00 2024 +1100
Update pyproject.toml
commit 396c7245dda2c913e6b97729fd34e7551dc8e9fa
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:35:17 2024 +1100
Update pyproject.toml
commit 738e91591f3fb39ce03238134fd0d82a84f4b2e3
Author: Daniel Han <danielhanchen@gmail.com>
Date: Thu Jan 11 04:08:03 2024 +1100
Fix some bugs (#83)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
commit a1da50b5ce53f8e57a1b01db607b32f4d0d862e5
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 23:10:48 2024 +1100
Update README.md (#81)
commit 606e8a928440f396601c1d57a003c0401ba26ec0
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:10:23 2024 +1100
Discord button redo (#80)
commit 0169294ffb19fdb877170529381f25bd0f83fc3c
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:02:20 2024 +1100
Update logos (#79)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
commit b2a8c33430e4a31cf7baafe184d448bb50595bb1
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 20:03:01 2024 +1100
Create FUNDING.yml (#78)
commit c9c1abf29045b3831f62099ff03c5b54b99522a6
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Wed Jan 10 01:02:44 2024 +1100
fix_tokenizer
commit 6efffb46e42543986c637690a045092226af5d61
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Tue Jan 9 23:40:43 2024 +1100
check_tokenizer
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
2024-02-06 15:00:12 +00:00
2026-03-19 14:26:28 +00:00
* **CPU:** Supported for Chat and Data Recipes currently
* **NVIDIA:** Training works on RTX 30/40/50 series, Blackwell, DGX Spark, Station and more
* **macOS:** Currently supports Chat and Data Recipes. **MLX training** is coming very soon
* **AMD:** Chat works. Train with [Unsloth Core](#unsloth-core-code-based). Studio support is coming soon
* **Coming soon:** Training support for Apple MLX, AMD, and Intel
* **Multi-GPU:** Available now, with a major upgrade on the way
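As a rough illustration of how the support matrix above could drive backend selection, here is a minimal, stdlib-only sketch. The function name and return labels are hypothetical (this is not Unsloth's API); a real launcher would also probe for CUDA/ROCm rather than falling back straight to CPU.

```python
import platform

def pick_backend() -> str:
    """Hypothetical helper mapping the host platform to a feature tier,
    mirroring the support matrix above: macOS gets Chat + Data Recipes
    (MLX training pending); other hosts fall back to the CPU tier here,
    since this stdlib-only sketch does not probe for CUDA or ROCm.
    """
    if platform.system() == "Darwin":
        return "macos-chat"   # Chat + Data Recipes today, MLX training soon
    return "cpu-chat"         # CPU: Chat and Data Recipes

print(pick_backend())
```

A real implementation would check for an NVIDIA or AMD GPU first (e.g. via the relevant runtime library) and only then fall back to the CPU tier.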
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
commit 62fae3aa740869db2fe1522ea38b334ef090d5e7
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 26 04:19:17 2024 +1100
Fix bugs (#129)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
commit 04f8771821a57fda5109d60b0fe49bb31d0df15b
Author: Daniel Han <danielhanchen@gmail.com>
Date: Tue Jan 23 03:55:24 2024 +1100
2-4x faster native HF inference (#119)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
commit 3a9b2dee98fd0547789da9b68e765f054484abc4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 22:20:22 2024 +1100
Hotfix (#118)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
commit a6f4fb007510aeb2a86500d874f2117e81853d7e
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 05:00:37 2024 +1100
Update save.py
commit 705cac03576fe2fff3923841c102a8bd6b72a65b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:21:54 2024 +1100
Update save.py
commit 16edcb3be2c328f3377aff6555e6435b28980a52
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:13:03 2024 +1100
Update save.py
commit 3d05a74b12edd39638aacf3b44eca65818c6708a
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 03:43:49 2024 +1100
Fixed saving! (#113)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
* upload_to_huggingface
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
commit bb05d6b6e2af2c8807ae4842dcbc2805c9356599
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 23:23:00 2024 +1100
Hotfix for Jan 2024 Release (#110)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
commit 12e75c93d040f99d5a0cc4c4ee162d804c9fbbf4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 04:25:06 2024 +1100
Quick fixes (#106)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
commit 52b5ef31e0cdd96d5b980a1581d3c26c5b89c86c
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sat Jan 20 02:30:31 2024 +1100
Update _utils.py
commit 1a19c38675a35e6121fa4a95438525f306bca26b
Merge: 0a52390 0d6e52b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:38 2024 +1100
Merge branch 'main' of https://github.com/unslothai/unsloth
commit 0a52390ac29a78399b033349070fe1d1280bd296
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:20 2024 +1100
Revert quantization methods
commit 0d6e52b5c7723ed5c78b54c9a6eb67a1997f6038
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:57:22 2024 +1100
getattr issues (#103)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
commit b3fcea642127ee381a3cf19d33fb8910d066643c
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:52:30 2024 +1100
Quick fixes (#101)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
commit d691516ab9d64ea61b0af277f3955336a434694d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 04:51:19 2024 +1100
2024 Release (#96)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
commit 9e2dec16fb29ee97572b4431e892e3f7ca867422
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:41:00 2024 +1100
Update pyproject.toml
commit 396c7245dda2c913e6b97729fd34e7551dc8e9fa
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:35:17 2024 +1100
Update pyproject.toml
commit 738e91591f3fb39ce03238134fd0d82a84f4b2e3
Author: Daniel Han <danielhanchen@gmail.com>
Date: Thu Jan 11 04:08:03 2024 +1100
Fix some bugs (#83)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
commit a1da50b5ce53f8e57a1b01db607b32f4d0d862e5
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 23:10:48 2024 +1100
Update README.md (#81)
commit 606e8a928440f396601c1d57a003c0401ba26ec0
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:10:23 2024 +1100
Discord button redo (#80)
commit 0169294ffb19fdb877170529381f25bd0f83fc3c
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:02:20 2024 +1100
Update logos (#79)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
commit b2a8c33430e4a31cf7baafe184d448bb50595bb1
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 20:03:01 2024 +1100
Create FUNDING.yml (#78)
commit c9c1abf29045b3831f62099ff03c5b54b99522a6
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Wed Jan 10 01:02:44 2024 +1100
fix_tokenizer
commit 6efffb46e42543986c637690a045092226af5d61
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Tue Jan 9 23:40:43 2024 +1100
check_tokenizer
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
2024-02-06 15:00:12 +00:00
2026-03-19 09:28:53 +00:00
#### macOS, Linux, and WSL Setup:
```bash
curl -fsSL https://raw.githubusercontent.com/unslothai/unsloth/main/install.sh | sh
```
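If `curl` is not installed on your system, a `wget` equivalent of the same one-liner should work. This is a suggested form (the flags are my assumption, not taken from official docs): `-qO-` downloads quietly and writes the script to stdout, which is then piped into `sh` exactly as with `curl`:

```shell
# Assumed wget equivalent of the curl install one-liner above.
# -q  : quiet (no progress output)
# -O- : write the downloaded installer script to stdout, piped into sh
wget -qO- https://raw.githubusercontent.com/unslothai/unsloth/main/install.sh | sh
```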
If you don't have `curl`, use `wget` instead. Then, to launch Unsloth Studio after setup:
```bash
source unsloth_studio/bin/activate
unsloth studio -H 0.0.0.0 -p 8888
```
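Once launched, you can sanity-check that Studio is serving on the port above. This is a hypothetical check (it assumes the server answers plain HTTP at its root path, which is not confirmed by the docs):

```shell
# Hypothetical health check: succeeds silently if Studio responds on port 8888.
# -s : silent, -f : fail on HTTP errors instead of printing the body
curl -sf http://localhost:8888/ > /dev/null && echo "Studio is up"
```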
#### Windows PowerShell Setup:
```powershell
irm https://raw.githubusercontent.com/unslothai/unsloth/main/install.ps1 | iex
```
* Move rm -rf llama.cpp inside build branch to preserve existing install
When _SKIP_GGUF_BUILD is set (user declined sudo or sudo unavailable),
the previous rm -rf would destroy an already-working llama-server before
the skip check ran. Move it inside the else branch so existing builds
are preserved when the rebuild is skipped.
---------
Co-authored-by: Daniel Han <danielhanchen@users.noreply.github.com>
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
2026-03-19 09:09:09 +00:00
```
Then to launch after setup:
```powershell
& .\unsloth_studio\Scripts\unsloth.exe studio -H 0.0.0.0 -p 8888
```
#### MacOS, Linux, WSL developer installs:
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv unsloth_studio --python 3.13
source unsloth_studio/bin/activate
uv pip install unsloth --torch-backend=auto
unsloth studio setup
unsloth studio -H 0.0.0.0 -p 8888
```
#### Windows PowerShell developer installs:
```powershell
winget install -e --id Python.Python.3.13
winget install --id=astral-sh.uv -e
uv venv unsloth_studio --python 3.13
.\unsloth_studio\Scripts\activate
uv pip install unsloth --torch-backend=auto
unsloth studio setup
unsloth studio -H 0.0.0.0 -p 8888
```
#### Docker
Use our [`unsloth/unsloth`](https://hub.docker.com/r/unsloth/unsloth) Docker image. Run:
```bash
docker run -d -e JUPYTER_PASSWORD="mypassword" \
-p 8888:8888 -p 8000:8000 -p 2222:22 \
-v $(pwd)/work:/workspace/work \
--gpus all \
unsloth/unsloth
```
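If you prefer Docker Compose, the same run command can be expressed as a compose file. This is a sketch derived from the flags above, assuming Docker Compose v2 and the NVIDIA container toolkit for the GPU reservation; change the password and volume path to suit your setup:

```yaml
services:
  unsloth:
    image: unsloth/unsloth
    environment:
      # Same as -e JUPYTER_PASSWORD="mypassword" above
      - JUPYTER_PASSWORD=mypassword
    ports:
      - "8888:8888"   # Jupyter
      - "8000:8000"   # Studio
      - "2222:22"     # SSH
    volumes:
      - ./work:/workspace/work
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```

Start it with `docker compose up -d` from the directory containing the file.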
#### Nightly Install - MacOS, Linux, WSL:
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
git clone --filter=blob:none https://github.com/unslothai/unsloth.git unsloth_studio
cd unsloth_studio
uv venv --python 3.13
source .venv/bin/activate
uv pip install -e . --torch-backend=auto
unsloth studio setup
unsloth studio -H 0.0.0.0 -p 8888
```
Then to launch every time:
```bash
cd unsloth_studio
source .venv/bin/activate
unsloth studio -H 0.0.0.0 -p 8888
```
#### Nightly Install - Windows:
Run in Windows PowerShell:
```powershell
winget install -e --id Python.Python.3.13
winget install --id=astral-sh.uv -e
git clone --filter=blob:none https://github.com/unslothai/unsloth.git unsloth_studio
cd unsloth_studio
uv venv --python 3.13
.\.venv\Scripts\activate
uv pip install -e . --torch-backend=auto
unsloth studio setup
unsloth studio -H 0.0.0.0 -p 8888
```
Then to launch every time:
```powershell
cd unsloth_studio
.\.venv\Scripts\activate
unsloth studio -H 0.0.0.0 -p 8888
```
### Unsloth Core (code-based)
#### Linux, WSL
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv unsloth_env --python 3.13
source unsloth_env/bin/activate
uv pip install unsloth --torch-backend=auto
```
#### Windows PowerShell
```powershell
winget install -e --id Python.Python.3.13
winget install --id=astral-sh.uv -e
uv venv unsloth_env --python 3.13
.\unsloth_env\Scripts\activate
uv pip install unsloth --torch-backend=auto
```
For Windows, `pip install unsloth` works only if you already have PyTorch installed. Read our [Windows Guide](https://unsloth.ai/docs/get-started/install/windows-installation).
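Before running `pip install unsloth`, you can confirm PyTorch is importable with a small stdlib-only check (a sketch for convenience, not part of the installer):

```python
import importlib.util

# True if PyTorch is already importable in this environment.
has_torch = importlib.util.find_spec("torch") is not None

if has_torch:
    import torch
    print(f"PyTorch {torch.__version__} found - `pip install unsloth` should work.")
else:
    print("PyTorch not found - install it first (see the Windows Guide).")
```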
You can use the same Docker image as Unsloth Studio.
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
commit af332245543b1f9ac129b67e5c350047c967846d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:30:29 2024 +1100
1 more bug (#138)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
commit e2bbd3819e0899e09787a985cd11c08961f09c09
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:20:06 2024 +1100
Fix bugs + more accurate Swiglu (#137)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
commit a81aff286f1e67c82b2a5105679c85866f624629
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:50:22 2024 +1100
Inference bug fix (#134)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
commit 7da0c50f757b6b2d9cbe660ee68d23700f2e2b0d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:47:54 2024 +1100
More bug fixes (#133)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
commit 62fae3aa740869db2fe1522ea38b334ef090d5e7
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 26 04:19:17 2024 +1100
Fix bugs (#129)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
commit 04f8771821a57fda5109d60b0fe49bb31d0df15b
Author: Daniel Han <danielhanchen@gmail.com>
Date: Tue Jan 23 03:55:24 2024 +1100
2-4x faster native HF inference (#119)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
commit 3a9b2dee98fd0547789da9b68e765f054484abc4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 22:20:22 2024 +1100
Hotfix (#118)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
commit a6f4fb007510aeb2a86500d874f2117e81853d7e
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 05:00:37 2024 +1100
Update save.py
commit 705cac03576fe2fff3923841c102a8bd6b72a65b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:21:54 2024 +1100
Update save.py
commit 16edcb3be2c328f3377aff6555e6435b28980a52
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:13:03 2024 +1100
Update save.py
commit 3d05a74b12edd39638aacf3b44eca65818c6708a
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 03:43:49 2024 +1100
Fixed saving! (#113)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
* upload_to_huggingface
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
commit bb05d6b6e2af2c8807ae4842dcbc2805c9356599
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 23:23:00 2024 +1100
Hotfix for Jan 2024 Release (#110)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
commit 12e75c93d040f99d5a0cc4c4ee162d804c9fbbf4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 04:25:06 2024 +1100
Quick fixes (#106)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
commit 52b5ef31e0cdd96d5b980a1581d3c26c5b89c86c
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sat Jan 20 02:30:31 2024 +1100
Update _utils.py
commit 1a19c38675a35e6121fa4a95438525f306bca26b
Merge: 0a52390 0d6e52b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:38 2024 +1100
Merge branch 'main' of https://github.com/unslothai/unsloth
commit 0a52390ac29a78399b033349070fe1d1280bd296
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:20 2024 +1100
Revert quantization methods
commit 0d6e52b5c7723ed5c78b54c9a6eb67a1997f6038
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:57:22 2024 +1100
getattr issues (#103)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
commit b3fcea642127ee381a3cf19d33fb8910d066643c
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:52:30 2024 +1100
Quick fixes (#101)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
commit d691516ab9d64ea61b0af277f3955336a434694d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 04:51:19 2024 +1100
2024 Release (#96)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
commit 9e2dec16fb29ee97572b4431e892e3f7ca867422
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:41:00 2024 +1100
Update pyproject.toml
commit 396c7245dda2c913e6b97729fd34e7551dc8e9fa
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:35:17 2024 +1100
Update pyproject.toml
commit 738e91591f3fb39ce03238134fd0d82a84f4b2e3
Author: Daniel Han <danielhanchen@gmail.com>
Date: Thu Jan 11 04:08:03 2024 +1100
Fix some bugs (#83)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
commit a1da50b5ce53f8e57a1b01db607b32f4d0d862e5
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 23:10:48 2024 +1100
Update README.md (#81)
commit 606e8a928440f396601c1d57a003c0401ba26ec0
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:10:23 2024 +1100
Discord button redo (#80)
commit 0169294ffb19fdb877170529381f25bd0f83fc3c
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:02:20 2024 +1100
Update logos (#79)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
commit b2a8c33430e4a31cf7baafe184d448bb50595bb1
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 20:03:01 2024 +1100
Create FUNDING.yml (#78)
commit c9c1abf29045b3831f62099ff03c5b54b99522a6
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Wed Jan 10 01:02:44 2024 +1100
fix_tokenizer
commit 6efffb46e42543986c637690a045092226af5d61
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Tue Jan 9 23:40:43 2024 +1100
check_tokenizer
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
2024-02-06 15:00:12 +00:00
2026-03-17 14:53:50 +00:00
For RTX 50x, B200, and RTX 6000 GPUs: `uv pip install unsloth --torch-backend=auto`. Read our guides for [Blackwell](https://unsloth.ai/docs/blog/fine-tuning-llms-with-blackwell-rtx-50-series-and-unsloth) and [DGX Spark](https://unsloth.ai/docs/blog/fine-tuning-llms-with-nvidia-dgx-spark-and-unsloth).<br>

#### AMD, Intel

To install Unsloth on **AMD** and **Intel** GPUs, follow our [AMD Guide](https://unsloth.ai/docs/get-started/install/amd) and [Intel Guide](https://unsloth.ai/docs/get-started/install/intel).
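A minimal setup sketch for an NVIDIA Blackwell card, assuming a Linux machine with `curl` available and a recent NVIDIA driver installed (verify package versions against the guides above):

```shell
# Install uv if it is not already present
curl -LsSf https://astral.sh/uv/install.sh | sh

# Let uv select the PyTorch wheel matching the detected CUDA backend,
# then install Unsloth against it
uv pip install unsloth --torch-backend=auto

# Sanity check: confirm PyTorch can see the GPU
python -c "import torch; print(torch.cuda.is_available())"
```

The `--torch-backend=auto` flag delegates CUDA-version matching to uv, which avoids the common failure mode of installing a PyTorch build compiled for a different CUDA toolkit than the driver supports.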
Date: Sun Jan 21 05:00:37 2024 +1100
Update save.py
commit 705cac03576fe2fff3923841c102a8bd6b72a65b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:21:54 2024 +1100
Update save.py
commit 16edcb3be2c328f3377aff6555e6435b28980a52
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:13:03 2024 +1100
Update save.py
commit 3d05a74b12edd39638aacf3b44eca65818c6708a
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 03:43:49 2024 +1100
Fixed saving! (#113)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
* upload_to_huggingface
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
commit bb05d6b6e2af2c8807ae4842dcbc2805c9356599
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 23:23:00 2024 +1100
Hotfix for Jan 2024 Release (#110)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
commit 12e75c93d040f99d5a0cc4c4ee162d804c9fbbf4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 04:25:06 2024 +1100
Quick fixes (#106)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
commit 52b5ef31e0cdd96d5b980a1581d3c26c5b89c86c
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sat Jan 20 02:30:31 2024 +1100
Update _utils.py
commit 1a19c38675a35e6121fa4a95438525f306bca26b
Merge: 0a52390 0d6e52b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:38 2024 +1100
Merge branch 'main' of https://github.com/unslothai/unsloth
commit 0a52390ac29a78399b033349070fe1d1280bd296
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:20 2024 +1100
Revert quantization methods
commit 0d6e52b5c7723ed5c78b54c9a6eb67a1997f6038
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:57:22 2024 +1100
getattr issues (#103)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
commit b3fcea642127ee381a3cf19d33fb8910d066643c
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:52:30 2024 +1100
Quick fixes (#101)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
commit d691516ab9d64ea61b0af277f3955336a434694d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 04:51:19 2024 +1100
2024 Release (#96)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
commit 9e2dec16fb29ee97572b4431e892e3f7ca867422
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:41:00 2024 +1100
Update pyproject.toml
commit 396c7245dda2c913e6b97729fd34e7551dc8e9fa
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:35:17 2024 +1100
Update pyproject.toml
commit 738e91591f3fb39ce03238134fd0d82a84f4b2e3
Author: Daniel Han <danielhanchen@gmail.com>
Date: Thu Jan 11 04:08:03 2024 +1100
Fix some bugs (#83)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
commit a1da50b5ce53f8e57a1b01db607b32f4d0d862e5
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 23:10:48 2024 +1100
Update README.md (#81)
commit 606e8a928440f396601c1d57a003c0401ba26ec0
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:10:23 2024 +1100
Discord button redo (#80)
commit 0169294ffb19fdb877170529381f25bd0f83fc3c
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:02:20 2024 +1100
Update logos (#79)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
commit b2a8c33430e4a31cf7baafe184d448bb50595bb1
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 20:03:01 2024 +1100
Create FUNDING.yml (#78)
commit c9c1abf29045b3831f62099ff03c5b54b99522a6
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Wed Jan 10 01:02:44 2024 +1100
fix_tokenizer
commit 6efffb46e42543986c637690a045092226af5d61
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Tue Jan 9 23:40:43 2024 +1100
check_tokenizer
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
2024-02-06 15:00:12 +00:00
## ✨ Free Notebooks
commit efa0d2332ebc6d8f215aec07d5cc9907f4e84f34
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Feb 4 17:35:56 2024 +1100
2x faster inference (#151)
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
* Fast inference repatch
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update mistral.py
* Update __init__.py
* Fix inference
* Update mistral.py
* fast lm_head
* Remove fast path
* Update rope_embedding.py
* Update loader.py
* LlamaAttention_fast_forward_inference
* if past_key_value is not None and q_len == 1:
* revert inference
* Update loader.py
* past_key_value
* Update llama.py
* Update llama.py
* Fix SDPA
* Update llama.py
* padding
* Inference
* Update llama.py
* Revert
* Update mistral.py
* faster inference
* inference
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* inference
* Update llama.py
* Update utils.py
* faster inference
* Update llama.py
* revert
* lm_head
* Update llama.py
* inference
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* faster inference
* Update llama.py
* fast inference
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* torch compile
* past_key_values
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update llama.py
* fast inference + saving config.json
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* fast inference again
* more temp matrices
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update mistral.py
* Update llama.py
* SDPA
* attention_mask
* New version
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update utils.py
* Update utils.py
commit 2f55935f941eb61816b145575389f91dde4e00f7
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 31 04:03:37 2024 +1100
Hotfix - fix inference (#146)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
* Fast inference repatch
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update mistral.py
* Update __init__.py
* Fix inference
* Update mistral.py
* fast lm_head
* Remove fast path
* Update rope_embedding.py
* Update loader.py
* LlamaAttention_fast_forward_inference
* if past_key_value is not None and q_len == 1:
* revert inference
* Update loader.py
* past_key_value
commit a3a2ad93821cede32723843dfb3dfbfe0387d25e
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 17:49:54 2024 +1100
Fix inference attention mask (#142)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
commit 90309ca8dcb06f0611c1bde4a61eb08fb7317993
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 03:45:07 2024 +1100
Nightly (#140)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
Fine-tune for free with our notebooks: add your dataset, run all cells, then export and deploy your trained model. Read our [guide](https://unsloth.ai/docs/get-started/fine-tuning-llms-guide) for a full walkthrough.
| Model | Free Notebooks | Performance | Memory use |
|-----------|---------|--------|----------|
| **Qwen3.5 (4B)** | [▶️ Start for free](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Qwen3_5_(4B)_Vision.ipynb) | 1.5x faster | 60% less |
| **gpt-oss (20B)** | [▶️ Start for free](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/gpt-oss-(20B)-Fine-tuning.ipynb) | 2x faster | 70% less |
| **gpt-oss (20B): GRPO** | [▶️ Start for free](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/gpt-oss-(20B)-GRPO.ipynb) | 2x faster | 80% less |
| **Qwen3: Advanced GRPO** | [▶️ Start for free](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Qwen3_(4B)-GRPO.ipynb) | 2x faster | 50% less |
| **Gemma 3 (4B) Vision** | [▶️ Start for free](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Gemma3_(4B)-Vision.ipynb) | 1.7x faster | 60% less |
| **embeddinggemma (300M)** | [▶️ Start for free](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/EmbeddingGemma_(300M).ipynb) | 2x faster | 20% less |
| **Mistral Ministral 3 (3B)** | [▶️ Start for free](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Ministral_3_VL_(3B)_Vision.ipynb) | 1.5x faster | 60% less |
| **Llama 3.1 (8B) Alpaca** | [▶️ Start for free](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.1_(8B)-Alpaca.ipynb) | 2x faster | 70% less |
| **Llama 3.2 Conversational** | [▶️ Start for free](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.2_(1B_and_3B)-Conversational.ipynb) | 2x faster | 70% less |
| **Orpheus-TTS (3B)** | [▶️ Start for free](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Orpheus_(3B)-TTS.ipynb) | 1.5x faster | 50% less |
- See all our notebooks for: [Kaggle](https://github.com/unslothai/notebooks?tab=readme-ov-file#-kaggle-notebooks), [GRPO](https://unsloth.ai/docs/get-started/unsloth-notebooks#grpo-reasoning-rl-notebooks), [TTS](https://unsloth.ai/docs/get-started/unsloth-notebooks#text-to-speech-tts-notebooks), [embedding](https://unsloth.ai/docs/new/embedding-finetuning) & [Vision](https://unsloth.ai/docs/get-started/unsloth-notebooks#vision-multimodal-notebooks)
- See [all our models](https://unsloth.ai/docs/get-started/unsloth-model-catalog) and [all our notebooks](https://unsloth.ai/docs/get-started/unsloth-notebooks)
- See detailed documentation for Unsloth [here](https://unsloth.ai/docs)
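The notebooks above all start from the "add your dataset" step: raw records are rendered into a single text column before training. A minimal sketch of that step in plain Python (the Alpaca-style template and field names here are illustrative, not the exact format any one notebook uses):

```python
# Sketch of the "add your dataset" step: render raw records into the
# single "text" column a fine-tuning notebook trains on.
# The Alpaca-style template below is illustrative only.

ALPACA_TEMPLATE = (
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)

def format_record(record: dict) -> dict:
    """Render one raw record into the training text column."""
    return {"text": ALPACA_TEMPLATE.format(
        instruction=record.get("instruction", ""),
        input=record.get("input", ""),
        output=record.get("output", ""),
    )}

rows = [
    {"instruction": "Add the numbers.", "input": "2 and 3", "output": "5"},
]
formatted = [format_record(r) for r in rows]
print(formatted[0]["text"].splitlines()[0])  # → ### Instruction:
```

In the notebooks this mapping is typically applied with `Dataset.map` before the trainer is constructed; anything with the same `{"text": ...}` shape will work.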
## 🦥 Unsloth News
- **Introducing Unsloth Studio**: our new web UI for running and training LLMs. [Blog](https://unsloth.ai/docs/new/studio)
- **Qwen3.5**: 0.8B, 2B, 4B, 9B, 27B, 35-A3B and 112B-A10B are now supported. [Guide + notebooks](https://unsloth.ai/docs/models/qwen3.5/fine-tune)
- Train **MoE LLMs 12x faster** with 35% less VRAM: DeepSeek, GLM, Qwen and gpt-oss. [Blog](https://unsloth.ai/docs/new/faster-moe)
- **Embedding models**: Unsloth now supports ~1.8-3.3x faster embedding fine-tuning. [Blog](https://unsloth.ai/docs/new/embedding-finetuning) • [Notebooks](https://unsloth.ai/docs/get-started/unsloth-notebooks#embedding-models)
- New **7x longer context RL** than all other setups, via our new batching algorithms. [Blog](https://unsloth.ai/docs/new/grpo-long-context)
- New RoPE & MLP **Triton Kernels** & **Padding Free + Packing**: 3x faster training & 30% less VRAM. [Blog](https://unsloth.ai/docs/new/3x-faster-training-packing)
- **500K Context**: Training a 20B model with >500K context is now possible on an 80GB GPU. [Blog](https://unsloth.ai/docs/blog/500k-context-length-fine-tuning)
- **FP8 & Vision RL**: You can now do FP8 & VLM GRPO on consumer GPUs. [FP8 Blog](https://unsloth.ai/docs/get-started/reinforcement-learning-rl-guide/fp8-reinforcement-learning) • [Vision RL](https://unsloth.ai/docs/get-started/reinforcement-learning-rl-guide/vision-reinforcement-learning-vlm-rl)
- **gpt-oss** by OpenAI: Read our [RL blog](https://unsloth.ai/docs/models/gpt-oss-how-to-run-and-fine-tune/gpt-oss-reinforcement-learning), [Flex Attention blog](https://unsloth.ai/docs/models/gpt-oss-how-to-run-and-fine-tune/long-context-gpt-oss-training) and [Guide](https://unsloth.ai/docs/models/gpt-oss-how-to-run-and-fine-tune).
## 🔗 Links and Resources
| Type | Links |
| ---- | ----- |
| <img width="15" src="https://redditinc.com/hs-fs/hubfs/Reddit%20Inc/Brand/Reddit_Logo.png" /> **r/unsloth Reddit** | [Join Reddit community](https://reddit.com/r/unsloth) |
| 📚 **Documentation & Wiki** | [Read Our Docs](https://unsloth.ai/docs) |
| <img width="13" src="https://upload.wikimedia.org/wikipedia/commons/0/09/X_(formerly_Twitter)_logo_late_2025.svg" /> **Twitter (aka X)** | [Follow us on X](https://twitter.com/unslothai) |
| 💾 **Installation** | [Pip & Docker Install](https://unsloth.ai/docs/get-started/install) |
| 🔮 **Our Models** | [Unsloth Catalog](https://unsloth.ai/docs/get-started/unsloth-model-catalog) |
| ✍️ **Blog** | [Read our Blogs](https://unsloth.ai/blog) |
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
commit 7da0c50f757b6b2d9cbe660ee68d23700f2e2b0d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:47:54 2024 +1100
More bug fixes (#133)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
commit 62fae3aa740869db2fe1522ea38b334ef090d5e7
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 26 04:19:17 2024 +1100
Fix bugs (#129)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
commit 04f8771821a57fda5109d60b0fe49bb31d0df15b
Author: Daniel Han <danielhanchen@gmail.com>
Date: Tue Jan 23 03:55:24 2024 +1100
2-4x faster native HF inference (#119)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
commit 3a9b2dee98fd0547789da9b68e765f054484abc4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 22:20:22 2024 +1100
Hotfix (#118)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
commit a6f4fb007510aeb2a86500d874f2117e81853d7e
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 05:00:37 2024 +1100
Update save.py
commit 705cac03576fe2fff3923841c102a8bd6b72a65b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:21:54 2024 +1100
Update save.py
commit 16edcb3be2c328f3377aff6555e6435b28980a52
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:13:03 2024 +1100
Update save.py
commit 3d05a74b12edd39638aacf3b44eca65818c6708a
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 03:43:49 2024 +1100
Fixed saving! (#113)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
* upload_to_huggingface
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
commit bb05d6b6e2af2c8807ae4842dcbc2805c9356599
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 23:23:00 2024 +1100
Hotfix for Jan 2024 Release (#110)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
commit 12e75c93d040f99d5a0cc4c4ee162d804c9fbbf4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 04:25:06 2024 +1100
Quick fixes (#106)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
commit 52b5ef31e0cdd96d5b980a1581d3c26c5b89c86c
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sat Jan 20 02:30:31 2024 +1100
Update _utils.py
commit 1a19c38675a35e6121fa4a95438525f306bca26b
Merge: 0a52390 0d6e52b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:38 2024 +1100
Merge branch 'main' of https://github.com/unslothai/unsloth
commit 0a52390ac29a78399b033349070fe1d1280bd296
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:20 2024 +1100
Revert quantization methods
commit 0d6e52b5c7723ed5c78b54c9a6eb67a1997f6038
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:57:22 2024 +1100
getattr issues (#103)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
commit b3fcea642127ee381a3cf19d33fb8910d066643c
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:52:30 2024 +1100
Quick fixes (#101)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
commit d691516ab9d64ea61b0af277f3955336a434694d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 04:51:19 2024 +1100
2024 Release (#96)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
commit 9e2dec16fb29ee97572b4431e892e3f7ca867422
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:41:00 2024 +1100
Update pyproject.toml
commit 396c7245dda2c913e6b97729fd34e7551dc8e9fa
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:35:17 2024 +1100
Update pyproject.toml
commit 738e91591f3fb39ce03238134fd0d82a84f4b2e3
Author: Daniel Han <danielhanchen@gmail.com>
Date: Thu Jan 11 04:08:03 2024 +1100
Fix some bugs (#83)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
commit a1da50b5ce53f8e57a1b01db607b32f4d0d862e5
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 23:10:48 2024 +1100
Update README.md (#81)
commit 606e8a928440f396601c1d57a003c0401ba26ec0
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:10:23 2024 +1100
Discord button redo (#80)
commit 0169294ffb19fdb877170529381f25bd0f83fc3c
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:02:20 2024 +1100
Update logos (#79)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
commit b2a8c33430e4a31cf7baafe184d448bb50595bb1
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 20:03:01 2024 +1100
Create FUNDING.yml (#78)
commit c9c1abf29045b3831f62099ff03c5b54b99522a6
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Wed Jan 10 01:02:44 2024 +1100
fix_tokenizer
commit 6efffb46e42543986c637690a045092226af5d61
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Tue Jan 9 23:40:43 2024 +1100
check_tokenizer
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
2024-02-06 15:00:12 +00:00
2024-12-20 10:20:15 +00:00
### Citation
You can cite the Unsloth repo as follows:
```bibtex
@software{unsloth,
  author = {Daniel Han and Michael Han and Unsloth team},
  title = {Unsloth},
  url = {https://github.com/unslothai/unsloth},
  year = {2023}
}
```
If you trained a model with 🦥Unsloth, you can use this cool sticker! <img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/made%20with%20unsloth.png" width="200" align="center" />
### License
Unsloth uses a dual-licensing model of Apache 2.0 and AGPL-3.0. The core Unsloth package remains licensed under **[Apache 2.0](https://github.com/unslothai/unsloth?tab=Apache-2.0-1-ov-file)**, while certain optional components, such as the Unsloth Studio UI, are licensed under the open-source **[AGPL-3.0](https://github.com/unslothai/unsloth?tab=AGPL-3.0-2-ov-file)** license.
This structure helps support ongoing Unsloth development while keeping the project open source and enabling the broader ecosystem to continue growing.
### Thank You to
- The [llama.cpp library](https://github.com/ggml-org/llama.cpp), which lets users run and save models trained with Unsloth
- The Hugging Face team and their libraries: [transformers ](https://github.com/huggingface/transformers ) and [TRL ](https://github.com/huggingface/trl )
- The PyTorch and [Torch AO](https://github.com/unslothai/unsloth/pull/3391) teams for their contributions
- And of course, every single person who has contributed to or used Unsloth!