ReadMe Revamp (#156)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* finetune button
* Delete start free finetune button.png
* free finetune button
* Add files via upload
* Update README.md
* Update README.md
* Add files via upload
* Add files via upload
* Update README.md
* Add files via upload
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Squashed commit of the following:
commit efa0d2332ebc6d8f215aec07d5cc9907f4e84f34
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Feb 4 17:35:56 2024 +1100
2x faster inference (#151)
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
* Fast inference repatch
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update mistral.py
* Update __init__.py
* Fix inference
* Update mistral.py
* fast lm_head
* Remove fast path
* Update rope_embedding.py
* Update loader.py
* LlamaAttention_fast_forward_inference
* if past_key_value is not None and q_len == 1:
* revert inference
* Update loader.py
* past_key_value
* Update llama.py
* Update llama.py
* Fix SDPA
* Update llama.py
* padding
* Inference
* Update llama.py
* Revert
* Update mistral.py
* faster inference
* inference
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* inference
* Update llama.py
* Update utils.py
* faster inference
* Update llama.py
* revert
* lm_head
* Update llama.py
* inference
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* faster inference
* Update llama.py
* fast inference
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* torch compile
* past_key_values
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update llama.py
* fast inference + saving config.json
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* fast inference again
* more temp matrices
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update mistral.py
* Update llama.py
* SDPA
* attention_mask
* New version
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update utils.py
* Update utils.py
commit 2f55935f941eb61816b145575389f91dde4e00f7
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 31 04:03:37 2024 +1100
Hotfix - fix inference (#146)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
* Fast inference repatch
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update mistral.py
* Update __init__.py
* Fix inference
* Update mistral.py
* fast lm_head
* Remove fast path
* Update rope_embedding.py
* Update loader.py
* LlamaAttention_fast_forward_inference
* if past_key_value is not None and q_len == 1:
* revert inference
* Update loader.py
* past_key_value
commit a3a2ad93821cede32723843dfb3dfbfe0387d25e
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 17:49:54 2024 +1100
Fix inference attention mask (#142)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
commit 90309ca8dcb06f0611c1bde4a61eb08fb7317993
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 03:45:07 2024 +1100
Nightly (#140)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
commit a16bc73e8077fd3c6a034741ae782bcfeb9fa278
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 02:52:39 2024 +1100
Fix saving issues (#139)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
commit af332245543b1f9ac129b67e5c350047c967846d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:30:29 2024 +1100
1 more bug (#138)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
commit e2bbd3819e0899e09787a985cd11c08961f09c09
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:20:06 2024 +1100
Fix bugs + more accurate Swiglu (#137)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
commit a81aff286f1e67c82b2a5105679c85866f624629
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:50:22 2024 +1100
Inference bug fix (#134)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
commit 7da0c50f757b6b2d9cbe660ee68d23700f2e2b0d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:47:54 2024 +1100
More bug fixes (#133)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
commit 62fae3aa740869db2fe1522ea38b334ef090d5e7
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 26 04:19:17 2024 +1100
Fix bugs (#129)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
commit 04f8771821a57fda5109d60b0fe49bb31d0df15b
Author: Daniel Han <danielhanchen@gmail.com>
Date: Tue Jan 23 03:55:24 2024 +1100
2-4x faster native HF inference (#119)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
commit 3a9b2dee98fd0547789da9b68e765f054484abc4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 22:20:22 2024 +1100
Hotfix (#118)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
commit a6f4fb007510aeb2a86500d874f2117e81853d7e
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 05:00:37 2024 +1100
Update save.py
commit 705cac03576fe2fff3923841c102a8bd6b72a65b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:21:54 2024 +1100
Update save.py
commit 16edcb3be2c328f3377aff6555e6435b28980a52
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:13:03 2024 +1100
Update save.py
commit 3d05a74b12edd39638aacf3b44eca65818c6708a
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 03:43:49 2024 +1100
Fixed saving! (#113)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
* upload_to_huggingface
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
commit bb05d6b6e2af2c8807ae4842dcbc2805c9356599
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 23:23:00 2024 +1100
Hotfix for Jan 2024 Release (#110)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
commit 12e75c93d040f99d5a0cc4c4ee162d804c9fbbf4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 04:25:06 2024 +1100
Quick fixes (#106)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
commit 52b5ef31e0cdd96d5b980a1581d3c26c5b89c86c
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sat Jan 20 02:30:31 2024 +1100
Update _utils.py
commit 1a19c38675a35e6121fa4a95438525f306bca26b
Merge: 0a52390 0d6e52b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:38 2024 +1100
Merge branch 'main' of https://github.com/unslothai/unsloth
commit 0a52390ac29a78399b033349070fe1d1280bd296
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:20 2024 +1100
Revert quantization methods
commit 0d6e52b5c7723ed5c78b54c9a6eb67a1997f6038
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:57:22 2024 +1100
getattr issues (#103)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
commit b3fcea642127ee381a3cf19d33fb8910d066643c
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:52:30 2024 +1100
Quick fixes (#101)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
commit d691516ab9d64ea61b0af277f3955336a434694d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 04:51:19 2024 +1100
2024 Release (#96)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
commit 9e2dec16fb29ee97572b4431e892e3f7ca867422
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:41:00 2024 +1100
Update pyproject.toml
commit 396c7245dda2c913e6b97729fd34e7551dc8e9fa
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:35:17 2024 +1100
Update pyproject.toml
commit 738e91591f3fb39ce03238134fd0d82a84f4b2e3
Author: Daniel Han <danielhanchen@gmail.com>
Date: Thu Jan 11 04:08:03 2024 +1100
Fix some bugs (#83)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
commit a1da50b5ce53f8e57a1b01db607b32f4d0d862e5
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 23:10:48 2024 +1100
Update README.md (#81)
commit 606e8a928440f396601c1d57a003c0401ba26ec0
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:10:23 2024 +1100
Discord button redo (#80)
commit 0169294ffb19fdb877170529381f25bd0f83fc3c
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:02:20 2024 +1100
Update logos (#79)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
commit b2a8c33430e4a31cf7baafe184d448bb50595bb1
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 20:03:01 2024 +1100
Create FUNDING.yml (#78)
commit c9c1abf29045b3831f62099ff03c5b54b99522a6
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Wed Jan 10 01:02:44 2024 +1100
fix_tokenizer
commit 6efffb46e42543986c637690a045092226af5d61
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Tue Jan 9 23:40:43 2024 +1100
check_tokenizer
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
<div align="center">
<a href="https://unsloth.ai"><picture>
<source media="(prefers-color-scheme: dark)" srcset="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20logo%20white%20text.png">
<source media="(prefers-color-scheme: light)" srcset="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20logo%20black%20text.png">
<img alt="unsloth logo" src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20logo%20black%20text.png" height="110" style="max-width: 100%;">
</picture></a>
< a href = "https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.1_(8B)-Alpaca.ipynb" > < img src = "https://raw.githubusercontent.com/unslothai/unsloth/main/images/start free finetune button.png" height = "48" > < / a >
< a href = "https://discord.com/invite/unsloth" > < img src = "https://raw.githubusercontent.com/unslothai/unsloth/main/images/Discord button.png" height = "48" > < / a >
< a href = "https://docs.unsloth.ai" > < img src = "https://raw.githubusercontent.com/unslothai/unsloth/refs/heads/main/images/Documentation%20Button.png" height = "48" > < / a >
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
commit 90309ca8dcb06f0611c1bde4a61eb08fb7317993
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 03:45:07 2024 +1100
Nightly (#140)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
commit a16bc73e8077fd3c6a034741ae782bcfeb9fa278
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 02:52:39 2024 +1100
Fix saving issues (#139)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
commit af332245543b1f9ac129b67e5c350047c967846d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:30:29 2024 +1100
1 more bug (#138)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
commit e2bbd3819e0899e09787a985cd11c08961f09c09
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:20:06 2024 +1100
Fix bugs + more accurate Swiglu (#137)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
commit a81aff286f1e67c82b2a5105679c85866f624629
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:50:22 2024 +1100
Inference bug fix (#134)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
commit 7da0c50f757b6b2d9cbe660ee68d23700f2e2b0d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:47:54 2024 +1100
More bug fixes (#133)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
commit 62fae3aa740869db2fe1522ea38b334ef090d5e7
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 26 04:19:17 2024 +1100
Fix bugs (#129)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
commit 04f8771821a57fda5109d60b0fe49bb31d0df15b
Author: Daniel Han <danielhanchen@gmail.com>
Date: Tue Jan 23 03:55:24 2024 +1100
2-4x faster native HF inference (#119)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
commit 3a9b2dee98fd0547789da9b68e765f054484abc4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 22:20:22 2024 +1100
Hotfix (#118)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
commit a6f4fb007510aeb2a86500d874f2117e81853d7e
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 05:00:37 2024 +1100
Update save.py
commit 705cac03576fe2fff3923841c102a8bd6b72a65b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:21:54 2024 +1100
Update save.py
commit 16edcb3be2c328f3377aff6555e6435b28980a52
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:13:03 2024 +1100
Update save.py
commit 3d05a74b12edd39638aacf3b44eca65818c6708a
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 03:43:49 2024 +1100
Fixed saving! (#113)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
* upload_to_huggingface
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
commit bb05d6b6e2af2c8807ae4842dcbc2805c9356599
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 23:23:00 2024 +1100
Hotfix for Jan 2024 Release (#110)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
commit 12e75c93d040f99d5a0cc4c4ee162d804c9fbbf4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 04:25:06 2024 +1100
Quick fixes (#106)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
commit 52b5ef31e0cdd96d5b980a1581d3c26c5b89c86c
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sat Jan 20 02:30:31 2024 +1100
Update _utils.py
commit 1a19c38675a35e6121fa4a95438525f306bca26b
Merge: 0a52390 0d6e52b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:38 2024 +1100
Merge branch 'main' of https://github.com/unslothai/unsloth
commit 0a52390ac29a78399b033349070fe1d1280bd296
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:20 2024 +1100
Revert quantization methods
commit 0d6e52b5c7723ed5c78b54c9a6eb67a1997f6038
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:57:22 2024 +1100
getattr issues (#103)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
commit b3fcea642127ee381a3cf19d33fb8910d066643c
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:52:30 2024 +1100
Quick fixes (#101)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
commit d691516ab9d64ea61b0af277f3955336a434694d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 04:51:19 2024 +1100
2024 Release (#96)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
commit 9e2dec16fb29ee97572b4431e892e3f7ca867422
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:41:00 2024 +1100
Update pyproject.toml
commit 396c7245dda2c913e6b97729fd34e7551dc8e9fa
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:35:17 2024 +1100
Update pyproject.toml
commit 738e91591f3fb39ce03238134fd0d82a84f4b2e3
Author: Daniel Han <danielhanchen@gmail.com>
Date: Thu Jan 11 04:08:03 2024 +1100
Fix some bugs (#83)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
commit a1da50b5ce53f8e57a1b01db607b32f4d0d862e5
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 23:10:48 2024 +1100
Update README.md (#81)
commit 606e8a928440f396601c1d57a003c0401ba26ec0
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:10:23 2024 +1100
Discord button redo (#80)
commit 0169294ffb19fdb877170529381f25bd0f83fc3c
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:02:20 2024 +1100
Update logos (#79)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
commit b2a8c33430e4a31cf7baafe184d448bb50595bb1
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 20:03:01 2024 +1100
Create FUNDING.yml (#78)
commit c9c1abf29045b3831f62099ff03c5b54b99522a6
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Wed Jan 10 01:02:44 2024 +1100
fix_tokenizer
commit 6efffb46e42543986c637690a045092226af5d61
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Tue Jan 9 23:40:43 2024 +1100
check_tokenizer
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
### Finetune Qwen3, Llama 4, Gemma 3, Phi-4 & Mistral 2x faster with 80% less VRAM!
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
commit e2bbd3819e0899e09787a985cd11c08961f09c09
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:20:06 2024 +1100
Fix bugs + more accurate Swiglu (#137)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
commit a81aff286f1e67c82b2a5105679c85866f624629
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:50:22 2024 +1100
Inference bug fix (#134)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
commit 7da0c50f757b6b2d9cbe660ee68d23700f2e2b0d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:47:54 2024 +1100
More bug fixes (#133)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
commit 62fae3aa740869db2fe1522ea38b334ef090d5e7
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 26 04:19:17 2024 +1100
Fix bugs (#129)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
commit 04f8771821a57fda5109d60b0fe49bb31d0df15b
Author: Daniel Han <danielhanchen@gmail.com>
Date: Tue Jan 23 03:55:24 2024 +1100
2-4x faster native HF inference (#119)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
commit 3a9b2dee98fd0547789da9b68e765f054484abc4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 22:20:22 2024 +1100
Hotfix (#118)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
commit a6f4fb007510aeb2a86500d874f2117e81853d7e
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 05:00:37 2024 +1100
Update save.py
commit 705cac03576fe2fff3923841c102a8bd6b72a65b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:21:54 2024 +1100
Update save.py
commit 16edcb3be2c328f3377aff6555e6435b28980a52
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:13:03 2024 +1100
Update save.py
commit 3d05a74b12edd39638aacf3b44eca65818c6708a
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 03:43:49 2024 +1100
Fixed saving! (#113)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
* upload_to_huggingface
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
commit bb05d6b6e2af2c8807ae4842dcbc2805c9356599
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 23:23:00 2024 +1100
Hotfix for Jan 2024 Release (#110)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
commit 12e75c93d040f99d5a0cc4c4ee162d804c9fbbf4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 04:25:06 2024 +1100
Quick fixes (#106)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
commit 52b5ef31e0cdd96d5b980a1581d3c26c5b89c86c
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sat Jan 20 02:30:31 2024 +1100
Update _utils.py
commit 1a19c38675a35e6121fa4a95438525f306bca26b
Merge: 0a52390 0d6e52b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:38 2024 +1100
Merge branch 'main' of https://github.com/unslothai/unsloth
commit 0a52390ac29a78399b033349070fe1d1280bd296
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:20 2024 +1100
Revert quantization methods
commit 0d6e52b5c7723ed5c78b54c9a6eb67a1997f6038
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:57:22 2024 +1100
getattr issues (#103)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
commit b3fcea642127ee381a3cf19d33fb8910d066643c
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:52:30 2024 +1100
Quick fixes (#101)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
commit d691516ab9d64ea61b0af277f3955336a434694d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 04:51:19 2024 +1100
2024 Release (#96)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
commit 9e2dec16fb29ee97572b4431e892e3f7ca867422
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:41:00 2024 +1100
Update pyproject.toml
commit 396c7245dda2c913e6b97729fd34e7551dc8e9fa
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:35:17 2024 +1100
Update pyproject.toml
commit 738e91591f3fb39ce03238134fd0d82a84f4b2e3
Author: Daniel Han <danielhanchen@gmail.com>
Date: Thu Jan 11 04:08:03 2024 +1100
Fix some bugs (#83)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
commit a1da50b5ce53f8e57a1b01db607b32f4d0d862e5
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 23:10:48 2024 +1100
Update README.md (#81)
commit 606e8a928440f396601c1d57a003c0401ba26ec0
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:10:23 2024 +1100
Discord button redo (#80)
commit 0169294ffb19fdb877170529381f25bd0f83fc3c
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:02:20 2024 +1100
Update logos (#79)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
commit b2a8c33430e4a31cf7baafe184d448bb50595bb1
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 20:03:01 2024 +1100
Create FUNDING.yml (#78)
commit c9c1abf29045b3831f62099ff03c5b54b99522a6
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Wed Jan 10 01:02:44 2024 +1100
fix_tokenizer
commit 6efffb46e42543986c637690a045092226af5d61
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Tue Jan 9 23:40:43 2024 +1100
check_tokenizer
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>

</div>
## ✨ Finetune for Free
Our notebooks are beginner friendly. Read our [guide](https://docs.unsloth.ai/get-started/fine-tuning-guide), add your dataset, click "Run All", and export your finetuned model to GGUF, Ollama, vLLM or Hugging Face. A minimal end-to-end code sketch follows the notebook list below.
| Unsloth supports | Free Notebooks | Performance | Memory use |
|-----------|---------|--------|----------|
| **GRPO (R1 reasoning)** | [▶️ Start for free](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.1_(8B)-GRPO.ipynb) | 2x faster | 80% less |
| **Gemma 3 (4B)** | [▶️ Start for free](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Gemma3_(4B).ipynb) | 1.6x faster | 60% less |
| **Llama 3.2 (3B)** | [▶️ Start for free](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.2_(1B_and_3B)-Conversational.ipynb) | 2x faster | 70% less |
| **Phi-4 (14B)** | [▶️ Start for free](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Phi_4-Conversational.ipynb) | 2x faster | 70% less |
| **Llama 3.2 Vision (11B)** | [▶️ Start for free](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.2_(11B)-Vision.ipynb) | 2x faster | 50% less |
| **Llama 3.1 (8B)** | [▶️ Start for free](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.1_(8B)-Alpaca.ipynb) | 2x faster | 70% less |
| **Qwen 2.5 (7B)** | [▶️ Start for free](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Qwen2.5_(7B)-Alpaca.ipynb) | 2x faster | 70% less |
| **Mistral v0.3 (7B)** | [▶️ Start for free](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Mistral_v0.3_(7B)-Conversational.ipynb) | 2.2x faster | 75% less |
| **Ollama** | [▶️ Start for free](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3_(8B)-Ollama.ipynb) | 1.9x faster | 60% less |
| **DPO Zephyr** | [▶️ Start for free](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Zephyr_(7B)-DPO.ipynb) | 1.9x faster | 50% less |
- See [all our notebooks](https://docs.unsloth.ai/get-started/unsloth-notebooks) and [all our models](https://docs.unsloth.ai/get-started/all-our-models)
- **Kaggle Notebooks** for [Llama 3.2 (1B and 3B)](https://www.kaggle.com/danielhanchen/kaggle-llama-3-2-1b-3b-unsloth-notebook), [Llama 3.1 (8B)](https://www.kaggle.com/danielhanchen/kaggle-llama-3-1-8b-unsloth-notebook), [Phi-4 (14B)](https://www.kaggle.com/code/danielhanchen/phi-4-finetuning-unsloth-notebook), [Mistral (7B)](https://www.kaggle.com/code/danielhanchen/kaggle-mistral-7b-unsloth-notebook)
- See detailed documentation for Unsloth [here](https://docs.unsloth.ai/).
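Each notebook wraps roughly the same workflow: load a 4-bit base model, attach LoRA adapters, train with TRL's `SFTTrainer`, then export. The sketch below is a minimal outline of that flow, not the notebooks verbatim; the dataset name and output paths are placeholders, and exact argument names can vary across Unsloth and TRL versions.

```python
# Minimal Unsloth finetune-and-export sketch (assumes recent `unsloth`,
# `trl`, `transformers`, and `datasets`; argument names may vary by version).
from unsloth import FastLanguageModel
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

# Load a 4-bit quantized base model plus its tokenizer.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",  # any model from the table above
    max_seq_length = 2048,
    load_in_4bit = True,
)

# Attach LoRA adapters so only a small fraction of weights are trained.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
)

# Placeholder dataset: it must already provide a formatted "text" column
# (the notebooks show the prompt-formatting step for Alpaca-style data).
dataset = load_dataset("your_username/your_dataset", split = "train")

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = 2048,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        max_steps = 60,
        output_dir = "outputs",
    ),
)
trainer.train()

# Export: GGUF for llama.cpp / Ollama, or merged 16-bit weights for
# vLLM / Hugging Face (method names follow Unsloth's patched save API).
model.save_pretrained_gguf("model_gguf", tokenizer, quantization_method = "q4_k_m")
model.push_to_hub_merged("your_username/model", tokenizer, save_method = "merged_16bit")
```

The GGUF output drops straight into Ollama or llama.cpp, while the merged 16-bit checkpoint is what vLLM and standard Hugging Face loaders expect.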
ReadMe Revamp (#156)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* finetune button
* Delete start free finetune button.png
* free finetune button
* Add files via upload
* Update README.md
* Update README.md
* Add files via upload
* Add files via upload
* Update README.md
* Add files via upload
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Squashed commit of the following:
commit efa0d2332ebc6d8f215aec07d5cc9907f4e84f34
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Feb 4 17:35:56 2024 +1100
2x faster inference (#151)
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
* Fast inference repatch
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update mistral.py
* Update __init__.py
* Fix inference
* Update mistral.py
* fast lm_head
* Remove fast path
* Update rope_embedding.py
* Update loader.py
* LlamaAttention_fast_forward_inference
* if past_key_value is not None and q_len == 1:
* revert inference
* Update loader.py
* past_key_value
* Update llama.py
* Update llama.py
* Fix SDPA
* Update llama.py
* padding
* Inference
* Update llama.py
* Revert
* Update mistral.py
* faster inference
* inference
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* inference
* Update llama.py
* Update utils.py
* faster inference
* Update llama.py
* revert
* lm_head
* Update llama.py
* inference
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* faster inference
* Update llama.py
* fast inference
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* torch compile
* past_key_values
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update llama.py
* fast inference + saving config.json
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* fast inference again
* more temp matrices
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update mistral.py
* Update llama.py
* SDPA
* attention_mask
* New version
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update utils.py
* Update utils.py
commit 2f55935f941eb61816b145575389f91dde4e00f7
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 31 04:03:37 2024 +1100
Hotfix - fix inference (#146)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
* Fast inference repatch
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update mistral.py
* Update __init__.py
* Fix inference
* Update mistral.py
* fast lm_head
* Remove fast path
* Update rope_embedding.py
* Update loader.py
* LlamaAttention_fast_forward_inference
* if past_key_value is not None and q_len == 1:
* revert inference
* Update loader.py
* past_key_value
commit a3a2ad93821cede32723843dfb3dfbfe0387d25e
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 17:49:54 2024 +1100
Fix inference attention mask (#142)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
commit 90309ca8dcb06f0611c1bde4a61eb08fb7317993
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 03:45:07 2024 +1100
Nightly (#140)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
commit a16bc73e8077fd3c6a034741ae782bcfeb9fa278
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 02:52:39 2024 +1100
Fix saving issues (#139)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
commit af332245543b1f9ac129b67e5c350047c967846d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:30:29 2024 +1100
1 more bug (#138)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
commit e2bbd3819e0899e09787a985cd11c08961f09c09
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:20:06 2024 +1100
Fix bugs + more accurate Swiglu (#137)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
commit a81aff286f1e67c82b2a5105679c85866f624629
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:50:22 2024 +1100
Inference bug fix (#134)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
commit 7da0c50f757b6b2d9cbe660ee68d23700f2e2b0d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:47:54 2024 +1100
More bug fixes (#133)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
commit 62fae3aa740869db2fe1522ea38b334ef090d5e7
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 26 04:19:17 2024 +1100
Fix bugs (#129)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
commit 04f8771821a57fda5109d60b0fe49bb31d0df15b
Author: Daniel Han <danielhanchen@gmail.com>
Date: Tue Jan 23 03:55:24 2024 +1100
2-4x faster native HF inference (#119)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
commit 3a9b2dee98fd0547789da9b68e765f054484abc4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 22:20:22 2024 +1100
Hotfix (#118)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
commit a6f4fb007510aeb2a86500d874f2117e81853d7e
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 05:00:37 2024 +1100
Update save.py
commit 705cac03576fe2fff3923841c102a8bd6b72a65b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:21:54 2024 +1100
Update save.py
commit 16edcb3be2c328f3377aff6555e6435b28980a52
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:13:03 2024 +1100
Update save.py
commit 3d05a74b12edd39638aacf3b44eca65818c6708a
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 03:43:49 2024 +1100
Fixed saving! (#113)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
* upload_to_huggingface
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
commit bb05d6b6e2af2c8807ae4842dcbc2805c9356599
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 23:23:00 2024 +1100
Hotfix for Jan 2024 Release (#110)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
commit 12e75c93d040f99d5a0cc4c4ee162d804c9fbbf4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 04:25:06 2024 +1100
Quick fixes (#106)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
commit 52b5ef31e0cdd96d5b980a1581d3c26c5b89c86c
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sat Jan 20 02:30:31 2024 +1100
Update _utils.py
commit 1a19c38675a35e6121fa4a95438525f306bca26b
Merge: 0a52390 0d6e52b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:38 2024 +1100
Merge branch 'main' of https://github.com/unslothai/unsloth
commit 0a52390ac29a78399b033349070fe1d1280bd296
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:20 2024 +1100
Revert quantization methods
commit 0d6e52b5c7723ed5c78b54c9a6eb67a1997f6038
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:57:22 2024 +1100
getattr issues (#103)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
commit b3fcea642127ee381a3cf19d33fb8910d066643c
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:52:30 2024 +1100
Quick fixes (#101)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
commit d691516ab9d64ea61b0af277f3955336a434694d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 04:51:19 2024 +1100
2024 Release (#96)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
commit 9e2dec16fb29ee97572b4431e892e3f7ca867422
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:41:00 2024 +1100
Update pyproject.toml
commit 396c7245dda2c913e6b97729fd34e7551dc8e9fa
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:35:17 2024 +1100
Update pyproject.toml
commit 738e91591f3fb39ce03238134fd0d82a84f4b2e3
Author: Daniel Han <danielhanchen@gmail.com>
Date: Thu Jan 11 04:08:03 2024 +1100
Fix some bugs (#83)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
commit a1da50b5ce53f8e57a1b01db607b32f4d0d862e5
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 23:10:48 2024 +1100
Update README.md (#81)
commit 606e8a928440f396601c1d57a003c0401ba26ec0
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:10:23 2024 +1100
Discord button redo (#80)
commit 0169294ffb19fdb877170529381f25bd0f83fc3c
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:02:20 2024 +1100
Update logos (#79)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
commit b2a8c33430e4a31cf7baafe184d448bb50595bb1
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 20:03:01 2024 +1100
Create FUNDING.yml (#78)
commit c9c1abf29045b3831f62099ff03c5b54b99522a6
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Wed Jan 10 01:02:44 2024 +1100
fix_tokenizer
commit 6efffb46e42543986c637690a045092226af5d61
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Tue Jan 9 23:40:43 2024 +1100
check_tokenizer
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
## ⚡ Quickstart
- **Install with pip (recommended)** for Linux devices:
```bash
pip install unsloth
```
For Windows install instructions, see [here](https://docs.unsloth.ai/get-started/installing-+-updating/windows-installation).
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
* upload_to_huggingface
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
commit bb05d6b6e2af2c8807ae4842dcbc2805c9356599
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 23:23:00 2024 +1100
Hotfix for Jan 2024 Release (#110)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
commit 12e75c93d040f99d5a0cc4c4ee162d804c9fbbf4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 04:25:06 2024 +1100
Quick fixes (#106)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
commit 52b5ef31e0cdd96d5b980a1581d3c26c5b89c86c
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sat Jan 20 02:30:31 2024 +1100
Update _utils.py
commit 1a19c38675a35e6121fa4a95438525f306bca26b
Merge: 0a52390 0d6e52b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:38 2024 +1100
Merge branch 'main' of https://github.com/unslothai/unsloth
commit 0a52390ac29a78399b033349070fe1d1280bd296
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:20 2024 +1100
Revert quantization methods
commit 0d6e52b5c7723ed5c78b54c9a6eb67a1997f6038
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:57:22 2024 +1100
getattr issues (#103)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
commit b3fcea642127ee381a3cf19d33fb8910d066643c
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:52:30 2024 +1100
Quick fixes (#101)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
commit d691516ab9d64ea61b0af277f3955336a434694d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 04:51:19 2024 +1100
2024 Release (#96)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
commit 9e2dec16fb29ee97572b4431e892e3f7ca867422
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:41:00 2024 +1100
Update pyproject.toml
commit 396c7245dda2c913e6b97729fd34e7551dc8e9fa
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:35:17 2024 +1100
Update pyproject.toml
commit 738e91591f3fb39ce03238134fd0d82a84f4b2e3
Author: Daniel Han <danielhanchen@gmail.com>
Date: Thu Jan 11 04:08:03 2024 +1100
Fix some bugs (#83)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
commit a1da50b5ce53f8e57a1b01db607b32f4d0d862e5
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 23:10:48 2024 +1100
Update README.md (#81)
commit 606e8a928440f396601c1d57a003c0401ba26ec0
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:10:23 2024 +1100
Discord button redo (#80)
commit 0169294ffb19fdb877170529381f25bd0f83fc3c
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:02:20 2024 +1100
Update logos (#79)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
commit b2a8c33430e4a31cf7baafe184d448bb50595bb1
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 20:03:01 2024 +1100
Create FUNDING.yml (#78)
commit c9c1abf29045b3831f62099ff03c5b54b99522a6
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Wed Jan 10 01:02:44 2024 +1100
fix_tokenizer
commit 6efffb46e42543986c637690a045092226af5d61
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Tue Jan 9 23:40:43 2024 +1100
check_tokenizer
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
## 🦥 Unsloth.ai News
- 📣 NEW! **[Llama 4](https://unsloth.ai/blog/llama4)**: Meta's latest models, including Scout & Maverick, are now supported.
- 📣 NEW! [**EVERYTHING** is now supported](https://unsloth.ai/blog/gemma3#everything), including FFT, ALL models (Mixtral, MoE, Cohere, Mamba) and all training algorithms (KTO, DoRA) etc. Multi-GPU support is coming very soon.
To enable full fine-tuning, set `full_finetuning = True`; for 8-bit fine-tuning, set `load_in_8bit = True` (see the loading sketch after this list).
- 📣 NEW! **Gemma 3** by Google: [Read Blog](https://unsloth.ai/blog/gemma3). We [uploaded GGUFs, 4-bit models](https://huggingface.co/collections/unsloth/gemma-3-67d12b7e8816ec6efa7e4e5b).
- 📣 NEW! Introducing Long-context [Reasoning (GRPO)](https://unsloth.ai/blog/grpo) in Unsloth. Train your own reasoning model with just 5GB VRAM. Transform Llama, Phi, Mistral etc. into reasoning LLMs!
- 📣 NEW! [DeepSeek-R1](https://unsloth.ai/blog/deepseek-r1) - the most powerful open reasoning models with Llama & Qwen distillations. Run or fine-tune them now [with our guide](https://unsloth.ai/blog/deepseek-r1). All model uploads: [here](https://huggingface.co/collections/unsloth/deepseek-r1-all-versions-678e1c48f5d2fce87892ace5).
- 📣 NEW! [Phi-4](https://unsloth.ai/blog/phi4) by Microsoft: We also [fixed bugs](https://unsloth.ai/blog/phi4) in Phi-4 and [uploaded GGUFs, 4-bit](https://huggingface.co/collections/unsloth/phi-4-all-versions-677eecf93784e61afe762afa).
- 📣 Introducing Unsloth [Dynamic 4-bit Quantization](https://unsloth.ai/blog/dynamic-4bit)! We dynamically opt not to quantize certain parameters, which greatly increases accuracy while using <10% more VRAM than BnB 4-bit. See our collection on [Hugging Face here](https://huggingface.co/collections/unsloth/unsloth-4-bit-dynamic-quants-67503bb873f89e15276c44e7).
- 📣 [Vision models](https://unsloth.ai/blog/vision) now supported! [Llama 3.2 Vision (11B)](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.2_(11B)-Vision.ipynb), [Qwen 2.5 VL (7B)](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Qwen2_VL_(7B)-Vision.ipynb) and [Pixtral (12B) 2409](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Pixtral_(12B)-Vision.ipynb)
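
As a minimal sketch of the full and 8-bit fine-tuning flags mentioned above - assuming the `FastLanguageModel.from_pretrained` keyword names below match your installed Unsloth version, and using a placeholder checkpoint name - loading a model looks like this:

```python
# Minimal sketch: loading a model for full or 8-bit fine-tuning with Unsloth.
# The checkpoint name is a placeholder; swap in any supported model.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name      = "unsloth/gemma-3-4b-it",  # hypothetical example checkpoint
    max_seq_length  = 2048,
    full_finetuning = True,   # full fine-tuning instead of LoRA
    # load_in_8bit  = True,   # alternatively, 8-bit fine-tuning
)
```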
<details>
<summary>Click for more news</summary>
- 📣 [Llama 3.3 (70B)](https://huggingface.co/collections/unsloth/llama-33-all-versions-67535d7d994794b9d7cf5e9f), Meta's latest model, is now supported.
- 📣 We worked with Apple to add [Cut Cross Entropy](https://arxiv.org/abs/2411.09009). Unsloth now supports 89K context for Meta's Llama 3.3 (70B) on an 80GB GPU - 13x longer than HF+FA2. For Llama 3.1 (8B), Unsloth enables 342K context, surpassing its native 128K support.
- 📣 We found and helped fix a [gradient accumulation bug](https://unsloth.ai/blog/gradient)! Please update Unsloth and transformers.
- 📣 Try out the [Chat interface](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Unsloth_Studio.ipynb)!
- 📣 NEW! Qwen-2.5, including the [Coder](https://unsloth.ai/blog/qwen-coder) models, is now supported with bugfixes. The 14B model fits in a Colab GPU! [Qwen 2.5 conversational notebook](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Qwen2.5_Coder_(14B)-Conversational.ipynb)
- 📣 NEW! [Mistral Small (22B) notebook](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Mistral_Small_(22B)-Alpaca.ipynb): finetuning fits in under 16GB of VRAM!
- 📣 NEW! `pip install unsloth` now works! Head over to [PyPI](https://pypi.org/project/unsloth/) to check it out! This enables installs without a git pull. Use `pip install unsloth[colab-new]` for an install without dependencies.
- 📣 NEW! Continued Pretraining [notebook](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Mistral_v0.3_(7B)-CPT.ipynb) for other languages like Korean!
- 📣 [2x faster inference](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.1_(8B)-Inference.ipynb) added for all our models (a short sketch follows this news section)
- 📣 We cut memory usage by a [further 30%](https://unsloth.ai/blog/long-context) and now support [4x longer context windows](https://unsloth.ai/blog/long-context)!
</details>
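
As a short, hedged illustration of the faster-inference path announced above - assuming `FastLanguageModel.for_inference` is available under this name in your installed version, and reusing the `model` and `tokenizer` from the loading sketch earlier:

```python
# Minimal sketch: switching a loaded Unsloth model into its fast inference mode,
# then generating. Assumes `model` and `tokenizer` come from the loading sketch.
from unsloth import FastLanguageModel

FastLanguageModel.for_inference(model)  # enables Unsloth's native faster decoding

inputs  = tokenizer("Hello, my name is", return_tensors = "pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens = 64)
print(tokenizer.batch_decode(outputs)[0])
```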
ReadMe Revamp (#156)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* finetune button
* Delete start free finetune button.png
* free finetune button
* Add files via upload
* Update README.md
* Update README.md
* Add files via upload
* Add files via upload
* Update README.md
* Add files via upload
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Squashed commit of the following:
commit efa0d2332ebc6d8f215aec07d5cc9907f4e84f34
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Feb 4 17:35:56 2024 +1100
2x faster inference (#151)
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
* Fast inference repatch
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update mistral.py
* Update __init__.py
* Fix inference
* Update mistral.py
* fast lm_head
* Remove fast path
* Update rope_embedding.py
* Update loader.py
* LlamaAttention_fast_forward_inference
* if past_key_value is not None and q_len == 1:
* revert inference
* Update loader.py
* past_key_value
* Update llama.py
* Update llama.py
* Fix SDPA
* Update llama.py
* padding
* Inference
* Update llama.py
* Revert
* Update mistral.py
* faster inference
* inference
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* inference
* Update llama.py
* Update utils.py
* faster inference
* Update llama.py
* revert
* lm_head
* Update llama.py
* inference
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* faster inference
* Update llama.py
* fast inference
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* torch compile
* past_key_values
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update llama.py
* fast inference + saving config.json
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* fast inference again
* more temp matrices
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update mistral.py
* Update llama.py
* SDPA
* attention_mask
* New version
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update utils.py
* Update utils.py
commit 2f55935f941eb61816b145575389f91dde4e00f7
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 31 04:03:37 2024 +1100
Hotfix - fix inference (#146)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
* Fast inference repatch
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update mistral.py
* Update __init__.py
* Fix inference
* Update mistral.py
* fast lm_head
* Remove fast path
* Update rope_embedding.py
* Update loader.py
* LlamaAttention_fast_forward_inference
* if past_key_value is not None and q_len == 1:
* revert inference
* Update loader.py
* past_key_value
commit a3a2ad93821cede32723843dfb3dfbfe0387d25e
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 17:49:54 2024 +1100
Fix inference attention mask (#142)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
commit 90309ca8dcb06f0611c1bde4a61eb08fb7317993
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 03:45:07 2024 +1100
Nightly (#140)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
commit a16bc73e8077fd3c6a034741ae782bcfeb9fa278
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 02:52:39 2024 +1100
Fix saving issues (#139)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
commit af332245543b1f9ac129b67e5c350047c967846d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:30:29 2024 +1100
1 more bug (#138)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
commit e2bbd3819e0899e09787a985cd11c08961f09c09
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:20:06 2024 +1100
Fix bugs + more accurate Swiglu (#137)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
commit a81aff286f1e67c82b2a5105679c85866f624629
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:50:22 2024 +1100
Inference bug fix (#134)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
commit 7da0c50f757b6b2d9cbe660ee68d23700f2e2b0d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:47:54 2024 +1100
More bug fixes (#133)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
commit 62fae3aa740869db2fe1522ea38b334ef090d5e7
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 26 04:19:17 2024 +1100
Fix bugs (#129)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
commit 04f8771821a57fda5109d60b0fe49bb31d0df15b
Author: Daniel Han <danielhanchen@gmail.com>
Date: Tue Jan 23 03:55:24 2024 +1100
2-4x faster native HF inference (#119)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
commit 3a9b2dee98fd0547789da9b68e765f054484abc4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 22:20:22 2024 +1100
Hotfix (#118)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
commit a6f4fb007510aeb2a86500d874f2117e81853d7e
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 05:00:37 2024 +1100
Update save.py
commit 705cac03576fe2fff3923841c102a8bd6b72a65b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:21:54 2024 +1100
Update save.py
commit 16edcb3be2c328f3377aff6555e6435b28980a52
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:13:03 2024 +1100
Update save.py
commit 3d05a74b12edd39638aacf3b44eca65818c6708a
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 03:43:49 2024 +1100
Fixed saving! (#113)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
* upload_to_huggingface
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
commit bb05d6b6e2af2c8807ae4842dcbc2805c9356599
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 23:23:00 2024 +1100
Hotfix for Jan 2024 Release (#110)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
commit 12e75c93d040f99d5a0cc4c4ee162d804c9fbbf4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 04:25:06 2024 +1100
Quick fixes (#106)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
commit 52b5ef31e0cdd96d5b980a1581d3c26c5b89c86c
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sat Jan 20 02:30:31 2024 +1100
Update _utils.py
commit 1a19c38675a35e6121fa4a95438525f306bca26b
Merge: 0a52390 0d6e52b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:38 2024 +1100
Merge branch 'main' of https://github.com/unslothai/unsloth
commit 0a52390ac29a78399b033349070fe1d1280bd296
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:20 2024 +1100
Revert quantization methods
commit 0d6e52b5c7723ed5c78b54c9a6eb67a1997f6038
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:57:22 2024 +1100
getattr issues (#103)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
commit b3fcea642127ee381a3cf19d33fb8910d066643c
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:52:30 2024 +1100
Quick fixes (#101)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
commit d691516ab9d64ea61b0af277f3955336a434694d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 04:51:19 2024 +1100
2024 Release (#96)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
commit 9e2dec16fb29ee97572b4431e892e3f7ca867422
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:41:00 2024 +1100
Update pyproject.toml
commit 396c7245dda2c913e6b97729fd34e7551dc8e9fa
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:35:17 2024 +1100
Update pyproject.toml
commit 738e91591f3fb39ce03238134fd0d82a84f4b2e3
Author: Daniel Han <danielhanchen@gmail.com>
Date: Thu Jan 11 04:08:03 2024 +1100
Fix some bugs (#83)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
commit a1da50b5ce53f8e57a1b01db607b32f4d0d862e5
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 23:10:48 2024 +1100
Update README.md (#81)
commit 606e8a928440f396601c1d57a003c0401ba26ec0
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:10:23 2024 +1100
Discord button redo (#80)
commit 0169294ffb19fdb877170529381f25bd0f83fc3c
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:02:20 2024 +1100
Update logos (#79)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
commit b2a8c33430e4a31cf7baafe184d448bb50595bb1
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 20:03:01 2024 +1100
Create FUNDING.yml (#78)
commit c9c1abf29045b3831f62099ff03c5b54b99522a6
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Wed Jan 10 01:02:44 2024 +1100
fix_tokenizer
commit 6efffb46e42543986c637690a045092226af5d61
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Tue Jan 9 23:40:43 2024 +1100
check_tokenizer
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>

## 🔗 Links and Resources
| Type | Links |
| ------------------------------- | --------------------------------------- |
| 📚 **Documentation & Wiki** | [Read Our Docs](https://docs.unsloth.ai) |
| <img height="14" src="https://upload.wikimedia.org/wikipedia/commons/6/6f/Logo_of_Twitter.svg" /> **Twitter (aka X)** | [Follow us on X](https://twitter.com/unslothai) |
| 💾 **Installation** | [Pip install](https://docs.unsloth.ai/get-started/installing-+-updating) |
| 🔮 **Our Models** | [Unsloth Releases](https://docs.unsloth.ai/get-started/all-our-models) |
| ✍️ **Blog** | [Read our Blogs](https://unsloth.ai/blog) |
2024-12-20 10:20:15 +00:00
| <img height="14" src="https://redditinc.com/hs-fs/hubfs/Reddit%20Inc/Brand/Reddit_Logo.png" /> **Reddit** | [Join our Reddit page](https://reddit.com/r/unsloth) |
2024-02-06 15:00:12 +00:00
## ⭐ Key Features
2025-03-19 11:21:39 +00:00
- Supports **full-finetuning**, pretraining, 4-bit, 16-bit and **8-bit** training
2025-02-09 03:41:39 +00:00
- All kernels written in [OpenAI's Triton](https://openai.com/index/triton/) language. **Manual backprop engine**.
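For readers unfamiliar with Triton: the snippet below is a minimal, self-contained sketch (the classic elementwise add from the Triton tutorials) of what a GPU kernel "written in Triton" looks like. It is illustrative only; the names `add_kernel` and `add` are hypothetical, and nothing here is taken from Unsloth's actual kernels or its backprop engine.

```python
# Minimal illustrative Triton kernel (elementwise add). A generic
# tutorial-style sketch, NOT code from Unsloth; `add_kernel` and `add`
# are hypothetical names used only to show the shape of a Triton kernel.
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)                    # block index of this program
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements                    # guard the ragged final block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = (triton.cdiv(n, 1024),)                 # one program per 1024 elements
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```

Unsloth's real kernels (for example the fused LoRA and SwiGLU paths touched by the commits above) are considerably more involved, but Triton kernels in general follow this same load, compute, store pattern.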
* Fix inference
* Update mistral.py
* fast lm_head
* Remove fast path
* Update rope_embedding.py
* Update loader.py
* LlamaAttention_fast_forward_inference
* if past_key_value is not None and q_len == 1:
* revert inference
* Update loader.py
* past_key_value
commit a3a2ad93821cede32723843dfb3dfbfe0387d25e
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 17:49:54 2024 +1100
Fix inference attention mask (#142)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
commit 90309ca8dcb06f0611c1bde4a61eb08fb7317993
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 03:45:07 2024 +1100
Nightly (#140)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
commit a16bc73e8077fd3c6a034741ae782bcfeb9fa278
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 02:52:39 2024 +1100
Fix saving issues (#139)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
commit af332245543b1f9ac129b67e5c350047c967846d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:30:29 2024 +1100
1 more bug (#138)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
commit e2bbd3819e0899e09787a985cd11c08961f09c09
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:20:06 2024 +1100
Fix bugs + more accurate Swiglu (#137)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
commit a81aff286f1e67c82b2a5105679c85866f624629
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:50:22 2024 +1100
Inference bug fix (#134)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
commit 7da0c50f757b6b2d9cbe660ee68d23700f2e2b0d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:47:54 2024 +1100
More bug fixes (#133)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
commit 62fae3aa740869db2fe1522ea38b334ef090d5e7
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 26 04:19:17 2024 +1100
Fix bugs (#129)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
commit 04f8771821a57fda5109d60b0fe49bb31d0df15b
Author: Daniel Han <danielhanchen@gmail.com>
Date: Tue Jan 23 03:55:24 2024 +1100
2-4x faster native HF inference (#119)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
commit 3a9b2dee98fd0547789da9b68e765f054484abc4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 22:20:22 2024 +1100
Hotfix (#118)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
commit a6f4fb007510aeb2a86500d874f2117e81853d7e
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 05:00:37 2024 +1100
Update save.py
commit 705cac03576fe2fff3923841c102a8bd6b72a65b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:21:54 2024 +1100
Update save.py
commit 16edcb3be2c328f3377aff6555e6435b28980a52
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:13:03 2024 +1100
Update save.py
commit 3d05a74b12edd39638aacf3b44eca65818c6708a
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 03:43:49 2024 +1100
Fixed saving! (#113)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
* upload_to_huggingface
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
commit bb05d6b6e2af2c8807ae4842dcbc2805c9356599
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 23:23:00 2024 +1100
Hotfix for Jan 2024 Release (#110)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
commit 12e75c93d040f99d5a0cc4c4ee162d804c9fbbf4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 04:25:06 2024 +1100
Quick fixes (#106)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
commit 52b5ef31e0cdd96d5b980a1581d3c26c5b89c86c
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sat Jan 20 02:30:31 2024 +1100
Update _utils.py
commit 1a19c38675a35e6121fa4a95438525f306bca26b
Merge: 0a52390 0d6e52b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:38 2024 +1100
Merge branch 'main' of https://github.com/unslothai/unsloth
commit 0a52390ac29a78399b033349070fe1d1280bd296
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:20 2024 +1100
Revert quantization methods
commit 0d6e52b5c7723ed5c78b54c9a6eb67a1997f6038
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:57:22 2024 +1100
getattr issues (#103)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
commit b3fcea642127ee381a3cf19d33fb8910d066643c
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:52:30 2024 +1100
Quick fixes (#101)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
commit d691516ab9d64ea61b0af277f3955336a434694d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 04:51:19 2024 +1100
2024 Release (#96)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
commit 9e2dec16fb29ee97572b4431e892e3f7ca867422
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:41:00 2024 +1100
Update pyproject.toml
commit 396c7245dda2c913e6b97729fd34e7551dc8e9fa
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:35:17 2024 +1100
Update pyproject.toml
commit 738e91591f3fb39ce03238134fd0d82a84f4b2e3
Author: Daniel Han <danielhanchen@gmail.com>
Date: Thu Jan 11 04:08:03 2024 +1100
Fix some bugs (#83)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
commit a1da50b5ce53f8e57a1b01db607b32f4d0d862e5
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 23:10:48 2024 +1100
Update README.md (#81)
commit 606e8a928440f396601c1d57a003c0401ba26ec0
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:10:23 2024 +1100
Discord button redo (#80)
commit 0169294ffb19fdb877170529381f25bd0f83fc3c
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:02:20 2024 +1100
Update logos (#79)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
commit b2a8c33430e4a31cf7baafe184d448bb50595bb1
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 20:03:01 2024 +1100
Create FUNDING.yml (#78)
commit c9c1abf29045b3831f62099ff03c5b54b99522a6
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Wed Jan 10 01:02:44 2024 +1100
fix_tokenizer
commit 6efffb46e42543986c637690a045092226af5d61
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Tue Jan 9 23:40:43 2024 +1100
check_tokenizer
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
2024-02-06 15:00:12 +00:00
- **0% loss in accuracy** - no approximation methods are used; all computations are exact.
- No change of hardware needed. Supports NVIDIA GPUs from 2018 onwards, with minimum CUDA Capability 7.0 (V100, T4, Titan V, RTX 20, 30, 40x, A100, H100, L40 etc.) [Check your GPU!](https://developer.nvidia.com/cuda-gpus) GTX 1070 and 1080 work, but are slower. A quick programmatic check is sketched after this list.
- Works on **Linux** and **Windows**
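The CUDA Capability requirement above can be verified programmatically. Below is a minimal sketch, assuming PyTorch is installed and CUDA drivers are set up; the `MIN_CAPABILITY` constant and the printed messages are illustrative and not part of the project itself:

```python
# Minimal sketch: check whether the local GPU meets the CUDA Capability 7.0
# requirement mentioned above. Assumes PyTorch is installed; MIN_CAPABILITY
# is a name introduced here purely for illustration.
import torch

MIN_CAPABILITY = (7, 0)  # V100 / T4 / RTX 20-series and newer

if not torch.cuda.is_available():
    print("No CUDA GPU detected.")
else:
    major, minor = torch.cuda.get_device_capability(0)
    name = torch.cuda.get_device_name(0)
    if (major, minor) >= MIN_CAPABILITY:
        print(f"{name}: capability {major}.{minor} - supported.")
    else:
        # e.g. GTX 1070/1080 report capability 6.1: they still run,
        # but fall back to slower code paths, as noted above.
        print(f"{name}: capability {major}.{minor} - below 7.0, expect slower paths.")
```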
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
commit af332245543b1f9ac129b67e5c350047c967846d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:30:29 2024 +1100
1 more bug (#138)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
commit e2bbd3819e0899e09787a985cd11c08961f09c09
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:20:06 2024 +1100
Fix bugs + more accurate Swiglu (#137)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
commit a81aff286f1e67c82b2a5105679c85866f624629
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:50:22 2024 +1100
Inference bug fix (#134)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
commit 7da0c50f757b6b2d9cbe660ee68d23700f2e2b0d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:47:54 2024 +1100
More bug fixes (#133)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
commit 62fae3aa740869db2fe1522ea38b334ef090d5e7
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 26 04:19:17 2024 +1100
Fix bugs (#129)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
commit 04f8771821a57fda5109d60b0fe49bb31d0df15b
Author: Daniel Han <danielhanchen@gmail.com>
Date: Tue Jan 23 03:55:24 2024 +1100
2-4x faster native HF inference (#119)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
commit 3a9b2dee98fd0547789da9b68e765f054484abc4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 22:20:22 2024 +1100
Hotfix (#118)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
commit a6f4fb007510aeb2a86500d874f2117e81853d7e
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 05:00:37 2024 +1100
Update save.py
commit 705cac03576fe2fff3923841c102a8bd6b72a65b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:21:54 2024 +1100
Update save.py
commit 16edcb3be2c328f3377aff6555e6435b28980a52
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:13:03 2024 +1100
Update save.py
commit 3d05a74b12edd39638aacf3b44eca65818c6708a
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 03:43:49 2024 +1100
Fixed saving! (#113)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
* upload_to_huggingface
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
commit bb05d6b6e2af2c8807ae4842dcbc2805c9356599
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 23:23:00 2024 +1100
Hotfix for Jan 2024 Release (#110)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
commit 12e75c93d040f99d5a0cc4c4ee162d804c9fbbf4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 04:25:06 2024 +1100
Quick fixes (#106)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
commit 52b5ef31e0cdd96d5b980a1581d3c26c5b89c86c
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sat Jan 20 02:30:31 2024 +1100
Update _utils.py
commit 1a19c38675a35e6121fa4a95438525f306bca26b
Merge: 0a52390 0d6e52b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:38 2024 +1100
Merge branch 'main' of https://github.com/unslothai/unsloth
commit 0a52390ac29a78399b033349070fe1d1280bd296
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:20 2024 +1100
Revert quantization methods
commit 0d6e52b5c7723ed5c78b54c9a6eb67a1997f6038
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:57:22 2024 +1100
getattr issues (#103)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
commit b3fcea642127ee381a3cf19d33fb8910d066643c
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:52:30 2024 +1100
Quick fixes (#101)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
commit d691516ab9d64ea61b0af277f3955336a434694d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 04:51:19 2024 +1100
2024 Release (#96)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
commit 9e2dec16fb29ee97572b4431e892e3f7ca867422
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:41:00 2024 +1100
Update pyproject.toml
commit 396c7245dda2c913e6b97729fd34e7551dc8e9fa
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:35:17 2024 +1100
Update pyproject.toml
commit 738e91591f3fb39ce03238134fd0d82a84f4b2e3
Author: Daniel Han <danielhanchen@gmail.com>
Date: Thu Jan 11 04:08:03 2024 +1100
Fix some bugs (#83)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
commit a1da50b5ce53f8e57a1b01db607b32f4d0d862e5
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 23:10:48 2024 +1100
Update README.md (#81)
commit 606e8a928440f396601c1d57a003c0401ba26ec0
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:10:23 2024 +1100
Discord button redo (#80)
commit 0169294ffb19fdb877170529381f25bd0f83fc3c
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:02:20 2024 +1100
Update logos (#79)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
commit b2a8c33430e4a31cf7baafe184d448bb50595bb1
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 20:03:01 2024 +1100
Create FUNDING.yml (#78)
commit c9c1abf29045b3831f62099ff03c5b54b99522a6
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Wed Jan 10 01:02:44 2024 +1100
fix_tokenizer
commit 6efffb46e42543986c637690a045092226af5d61
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Tue Jan 9 23:40:43 2024 +1100
check_tokenizer
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
- Supports 4bit and 16bit QLoRA / LoRA finetuning via [bitsandbytes](https://github.com/TimDettmers/bitsandbytes); see the sketch after this list.
- If you trained a model with 🦥Unsloth, you can use this cool sticker! <img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/made with unsloth.png" height="50" align="center" />
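A minimal sketch of what the 4-bit QLoRA path above looks like in code (the model name and LoRA hyperparameters are illustrative assumptions, not a tested recipe; check the Unsloth docs for vetted configurations):

```python
# Minimal sketch of a 4-bit QLoRA setup with Unsloth; the model name and
# LoRA hyperparameters below are illustrative, not prescriptive.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name     = "unsloth/mistral-7b-bnb-4bit",  # any supported base model
    max_seq_length = 2048,
    load_in_4bit   = True,  # 4-bit QLoRA via bitsandbytes; False gives 16-bit LoRA
)

# Attach LoRA adapters; rank and target modules are typical choices, not mandates.
model = FastLanguageModel.get_peft_model(
    model,
    r              = 16,
    lora_alpha     = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
)
```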
## 💾 Install Unsloth
You can also see our documentation for more detailed installation and updating instructions [here](https://docs.unsloth.ai/get-started/installing-+-updating).
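As a quick sanity check after installing (the exact pip command for your CUDA and PyTorch versions is in the docs linked above; `pip install unsloth` is the simplest case):

```python
# Minimal post-install smoke test; assumes Unsloth was installed with, e.g.:
#   pip install unsloth
# (see the docs linked above for CUDA/PyTorch-specific install commands).
from unsloth import FastLanguageModel  # raises ImportError if the install failed
print("Unsloth imported successfully")
```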
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
commit 7da0c50f757b6b2d9cbe660ee68d23700f2e2b0d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:47:54 2024 +1100
More bug fixes (#133)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
commit 62fae3aa740869db2fe1522ea38b334ef090d5e7
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 26 04:19:17 2024 +1100
Fix bugs (#129)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
commit 04f8771821a57fda5109d60b0fe49bb31d0df15b
Author: Daniel Han <danielhanchen@gmail.com>
Date: Tue Jan 23 03:55:24 2024 +1100
2-4x faster native HF inference (#119)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
commit 3a9b2dee98fd0547789da9b68e765f054484abc4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 22:20:22 2024 +1100
Hotfix (#118)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
commit a6f4fb007510aeb2a86500d874f2117e81853d7e
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 05:00:37 2024 +1100
Update save.py
commit 705cac03576fe2fff3923841c102a8bd6b72a65b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:21:54 2024 +1100
Update save.py
commit 16edcb3be2c328f3377aff6555e6435b28980a52
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:13:03 2024 +1100
Update save.py
commit 3d05a74b12edd39638aacf3b44eca65818c6708a
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 03:43:49 2024 +1100
Fixed saving! (#113)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
* upload_to_huggingface
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
commit bb05d6b6e2af2c8807ae4842dcbc2805c9356599
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 23:23:00 2024 +1100
Hotfix for Jan 2024 Release (#110)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
commit 12e75c93d040f99d5a0cc4c4ee162d804c9fbbf4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 04:25:06 2024 +1100
Quick fixes (#106)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
commit 52b5ef31e0cdd96d5b980a1581d3c26c5b89c86c
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sat Jan 20 02:30:31 2024 +1100
Update _utils.py
commit 1a19c38675a35e6121fa4a95438525f306bca26b
Merge: 0a52390 0d6e52b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:38 2024 +1100
Merge branch 'main' of https://github.com/unslothai/unsloth
commit 0a52390ac29a78399b033349070fe1d1280bd296
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:20 2024 +1100
Revert quantization methods
commit 0d6e52b5c7723ed5c78b54c9a6eb67a1997f6038
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:57:22 2024 +1100
getattr issues (#103)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
commit b3fcea642127ee381a3cf19d33fb8910d066643c
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:52:30 2024 +1100
Quick fixes (#101)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
commit d691516ab9d64ea61b0af277f3955336a434694d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 04:51:19 2024 +1100
2024 Release (#96)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
commit 9e2dec16fb29ee97572b4431e892e3f7ca867422
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:41:00 2024 +1100
Update pyproject.toml
commit 396c7245dda2c913e6b97729fd34e7551dc8e9fa
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:35:17 2024 +1100
Update pyproject.toml
commit 738e91591f3fb39ce03238134fd0d82a84f4b2e3
Author: Daniel Han <danielhanchen@gmail.com>
Date: Thu Jan 11 04:08:03 2024 +1100
Fix some bugs (#83)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
commit a1da50b5ce53f8e57a1b01db607b32f4d0d862e5
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 23:10:48 2024 +1100
Update README.md (#81)
commit 606e8a928440f396601c1d57a003c0401ba26ec0
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:10:23 2024 +1100
Discord button redo (#80)
commit 0169294ffb19fdb877170529381f25bd0f83fc3c
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:02:20 2024 +1100
Update logos (#79)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
commit b2a8c33430e4a31cf7baafe184d448bb50595bb1
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 20:03:01 2024 +1100
Create FUNDING.yml (#78)
commit c9c1abf29045b3831f62099ff03c5b54b99522a6
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Wed Jan 10 01:02:44 2024 +1100
fix_tokenizer
commit 6efffb46e42543986c637690a045092226af5d61
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Tue Jan 9 23:40:43 2024 +1100
check_tokenizer
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
### Pip Installation
**Install with pip (recommended) for Linux devices:**
```bash
pip install unsloth
```
See [here](https://github.com/unslothai/unsloth/blob/main/README.md#advanced-pip-installation) for advanced pip installation instructions.
### Windows Installation
> [!WARNING]
> Unsloth does not support Python 3.13. Use Python 3.12, 3.11, or 3.10.
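Before proceeding, it may help to confirm you are on a supported interpreter (a trivial check, assuming `python` is on your PATH):
```bash
python --version   # should report 3.10.x, 3.11.x, or 3.12.x
```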
1. **Install NVIDIA Video Driver:**
You should install the latest version of your GPU's drivers. Download them here: [NVIDIA GPU Drivers](https://www.nvidia.com/Download/index.aspx).
2. **Install Visual Studio C++:**
You will need Visual Studio with C++ installed. By default, C++ is not installed with [Visual Studio](https://visualstudio.microsoft.com/vs/community/), so make sure you select all of the C++ options, as well as the Windows 10/11 SDK options. For detailed instructions, see [here](https://docs.unsloth.ai/get-started/installing-+-updating).
3. **Install CUDA Toolkit:**
Follow the instructions to install the [CUDA Toolkit](https://developer.nvidia.com/cuda-toolkit-archive).
4. **Install PyTorch:**
You will need the version of PyTorch that is compatible with your CUDA drivers, so make sure to select it carefully.
[Install PyTorch](https://pytorch.org/get-started/locally/).
5. **Install Unsloth:**
```bash
pip install unsloth
```
#### Notes
To run Unsloth directly on Windows:
- Install Triton from the [Windows fork](https://github.com/woct0rdho/triton-windows) and follow its instructions (be aware that the Windows fork requires PyTorch >= 2.4 and CUDA 12); see the sketch after this list.
- In the SFTTrainer, set `dataset_num_proc=1` to avoid a crashing issue:
```python
trainer = SFTTrainer(
    dataset_num_proc=1,
    ...
)
```
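A minimal sketch of that Windows-specific Triton setup, assuming the fork is published on PyPI under the name `triton-windows` (an assumption; check the fork's README for its current install method, since it may ship wheels instead):
```bash
# Assumes the Windows Triton fork's PyPI name `triton-windows` (hypothetical here);
# the fork requires PyTorch >= 2.4 and CUDA 12.
pip install triton-windows
```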
#### Advanced/Troubleshooting
For **advanced installation instructions** or if you see weird errors during installation:
1. Install `torch` and `triton`. Go to https://pytorch.org to install them. For example, `pip install torch torchvision torchaudio triton`.
2. Confirm that CUDA is installed correctly. Try `nvcc`. If that fails, you need to install `cudatoolkit` or the CUDA drivers.
3. Install `xformers` manually. You can try installing `vllm` and seeing if `vllm` succeeds. Check whether `xformers` succeeded with `python -m xformers.info`; go to https://github.com/facebookresearch/xformers. Another option is to install `flash-attn` for Ampere GPUs.
4. Double check that your versions of Python, CUDA, cuDNN, `torch`, `triton`, and `xformers` are compatible with one another. The [PyTorch Compatibility Matrix](https://github.com/pytorch/pytorch/blob/main/RELEASE.md#release-compatibility-matrix) may be useful.
5. Finally, install `bitsandbytes` and check it with `python -m bitsandbytes`.
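Collected in one place, the checks above look roughly like this (a minimal sketch; take the exact `torch` install command for your CUDA version from https://pytorch.org):
```bash
# Step 1: install torch and triton
pip install torch torchvision torchaudio triton

# Step 2: confirm the CUDA compiler toolchain is visible
nvcc --version

# Step 3: check that xformers built against your torch install
python -m xformers.info

# Step 5: verify bitsandbytes can see your CUDA libraries
pip install bitsandbytes
python -m bitsandbytes
```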
### Conda Installation (Optional)
`⚠️ Only use Conda if you have it. If not, use Pip`. Select `pytorch-cuda=11.8` for CUDA 11.8 or `pytorch-cuda=12.1` for CUDA 12.1. We support `python=3.10,3.11,3.12`.
```bash
conda create --name unsloth_env \
    python=3.11 \
    pytorch-cuda=12.1 \
    pytorch cudatoolkit xformers -c pytorch -c nvidia -c xformers \
    -y
conda activate unsloth_env
```
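For CUDA 11.8, the same command applies with the CUDA pin swapped (a sketch of the variant mentioned above):
```bash
conda create --name unsloth_env \
    python=3.11 \
    pytorch-cuda=11.8 \
    pytorch cudatoolkit xformers -c pytorch -c nvidia -c xformers \
    -y
conda activate unsloth_env
```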
ReadMe Revamp (#156)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* finetune button
* Delete start free finetune button.png
* free finetune button
* Add files via upload
* Update README.md
* Update README.md
* Add files via upload
* Add files via upload
* Update README.md
* Add files via upload
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Squashed commit of the following:
commit efa0d2332ebc6d8f215aec07d5cc9907f4e84f34
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Feb 4 17:35:56 2024 +1100
2x faster inference (#151)
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
* Fast inference repatch
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update mistral.py
* Update __init__.py
* Fix inference
* Update mistral.py
* fast lm_head
* Remove fast path
* Update rope_embedding.py
* Update loader.py
* LlamaAttention_fast_forward_inference
* if past_key_value is not None and q_len == 1:
* revert inference
* Update loader.py
* past_key_value
* Update llama.py
* Update llama.py
* Fix SDPA
* Update llama.py
* padding
* Inference
* Update llama.py
* Revert
* Update mistral.py
* faster inference
* inference
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* inference
* Update llama.py
* Update utils.py
* faster inference
* Update llama.py
* revert
* lm_head
* Update llama.py
* inference
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* faster inference
* Update llama.py
* fast inference
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* torch compile
* past_key_values
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update llama.py
* fast inference + saving config.json
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* fast inference again
* more temp matrices
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update mistral.py
* Update llama.py
* SDPA
* attention_mask
* New version
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update utils.py
* Update utils.py
commit 2f55935f941eb61816b145575389f91dde4e00f7
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 31 04:03:37 2024 +1100
Hotfix - fix inference (#146)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
* Fast inference repatch
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update mistral.py
* Update __init__.py
* Fix inference
* Update mistral.py
* fast lm_head
* Remove fast path
* Update rope_embedding.py
* Update loader.py
* LlamaAttention_fast_forward_inference
* if past_key_value is not None and q_len == 1:
* revert inference
* Update loader.py
* past_key_value
commit a3a2ad93821cede32723843dfb3dfbfe0387d25e
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 17:49:54 2024 +1100
Fix inference attention mask (#142)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
commit 90309ca8dcb06f0611c1bde4a61eb08fb7317993
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 03:45:07 2024 +1100
Nightly (#140)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
commit a16bc73e8077fd3c6a034741ae782bcfeb9fa278
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 02:52:39 2024 +1100
Fix saving issues (#139)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
commit af332245543b1f9ac129b67e5c350047c967846d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:30:29 2024 +1100
1 more bug (#138)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
commit e2bbd3819e0899e09787a985cd11c08961f09c09
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:20:06 2024 +1100
Fix bugs + more accurate Swiglu (#137)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
pip install unsloth
```
<details>
<summary>If you're looking to install Conda in a Linux environment, <a href="https://docs.anaconda.com/miniconda/">read here</a>, or run the below 🔽</summary>
```bash
mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm -rf ~/miniconda3/miniconda.sh
~/miniconda3/bin/conda init bash
~/miniconda3/bin/conda init zsh
```
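
After the installer finishes, you can optionally confirm Conda is available in your current shell (a quick sanity check, not part of the official steps):
```bash
source ~/.bashrc
conda --version
```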
</details>
### Advanced Pip Installation
`⚠️ Do **NOT** use this if you have Conda.` Pip is a bit more complex, since there are dependency issues. The pip command is different for `torch 2.2, 2.3, 2.4, 2.5` and for each CUDA version.
For other torch versions, we support `torch211`, `torch212`, `torch220`, `torch230` and `torch240`. For CUDA versions, we support `cu118`, `cu121` and `cu124`. For Ampere devices (A100, H100, RTX 3090) and above, use `cu118-ampere`, `cu121-ampere` or `cu124-ampere`.
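
If you're unsure which torch and CUDA versions you have, you can check with this one-liner (illustrative only; it assumes torch is already installed):
```bash
python -c "import torch; print('torch =', torch.__version__, '| CUDA =', torch.version.cuda)"
```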
For example, if you have `torch 2.4` and `CUDA 12.1`, use:
```bash
pip install --upgrade pip
pip install "unsloth[cu121-torch240] @ git+https://github.com/unslothai/unsloth.git"
```
Another example, if you have `torch 2.5` and `CUDA 12.4`, use:
```bash
pip install --upgrade pip
pip install "unsloth[cu124-torch250] @ git+https://github.com/unslothai/unsloth.git"
```
And other examples:
```bash
pip install "unsloth[cu121-ampere-torch240] @ git+https://github.com/unslothai/unsloth.git"
pip install "unsloth[cu118-ampere-torch240] @ git+https://github.com/unslothai/unsloth.git"
pip install "unsloth[cu121-torch240] @ git+https://github.com/unslothai/unsloth.git"
pip install "unsloth[cu118-torch240] @ git+https://github.com/unslothai/unsloth.git"
pip install "unsloth[cu121-torch230] @ git+https://github.com/unslothai/unsloth.git"
pip install "unsloth[cu121-ampere-torch230] @ git+https://github.com/unslothai/unsloth.git"
pip install "unsloth[cu121-torch250] @ git+https://github.com/unslothai/unsloth.git"
pip install "unsloth[cu124-ampere-torch250] @ git+https://github.com/unslothai/unsloth.git"
```
Or, run the below in a terminal to get the **optimal** pip installation command:
```bash
wget -qO- https://raw.githubusercontent.com/unslothai/unsloth/main/unsloth/_auto_install.py | python -
```
Or, run the below manually in a Python REPL:
```python
try: import torch
except: raise ImportError('Install torch via `pip install torch`')
from packaging.version import Version as V
v = V(torch.__version__)
cuda = str(torch.version.cuda)
is_ampere = torch.cuda.get_device_capability()[0] >= 8  # compute capability 8.0+ means Ampere or newer
if cuda != "12.1" and cuda != "11.8" and cuda != "12.4": raise RuntimeError(f"CUDA = {cuda} not supported!")
# Map the installed torch version to the matching extras tag
if v <= V('2.1.0'): raise RuntimeError(f"Torch = {v} too old!")
elif v <= V('2.1.1'): x = 'cu{}{}-torch211'
elif v <= V('2.1.2'): x = 'cu{}{}-torch212'
elif v < V('2.3.0'): x = 'cu{}{}-torch220'
elif v < V('2.4.0'): x = 'cu{}{}-torch230'
elif v < V('2.5.0'): x = 'cu{}{}-torch240'
elif v < V('2.6.0'): x = 'cu{}{}-torch250'
else: raise RuntimeError(f"Torch = {v} too new!")
x = x.format(cuda.replace(".", ""), "-ampere" if is_ampere else "")
print(f'pip install --upgrade pip && pip install "unsloth[{x}] @ git+https://github.com/unslothai/unsloth.git"')
```
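
For instance, on a machine with `torch 2.4.0`, CUDA 12.1 and an Ampere GPU, the snippet above would print something like (illustrative output, not captured from a real run):
```bash
pip install --upgrade pip && pip install "unsloth[cu121-ampere-torch240] @ git+https://github.com/unslothai/unsloth.git"
```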
## 📜 Documentation
- Go to our official [Documentation](https://docs.unsloth.ai) for saving to GGUF, checkpointing, evaluation and more!
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
commit 12e75c93d040f99d5a0cc4c4ee162d804c9fbbf4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 04:25:06 2024 +1100
Quick fixes (#106)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
commit 52b5ef31e0cdd96d5b980a1581d3c26c5b89c86c
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sat Jan 20 02:30:31 2024 +1100
Update _utils.py
commit 1a19c38675a35e6121fa4a95438525f306bca26b
Merge: 0a52390 0d6e52b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:38 2024 +1100
Merge branch 'main' of https://github.com/unslothai/unsloth
commit 0a52390ac29a78399b033349070fe1d1280bd296
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:20 2024 +1100
Revert quantization methods
commit 0d6e52b5c7723ed5c78b54c9a6eb67a1997f6038
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:57:22 2024 +1100
getattr issues (#103)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
commit b3fcea642127ee381a3cf19d33fb8910d066643c
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:52:30 2024 +1100
Quick fixes (#101)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
commit d691516ab9d64ea61b0af277f3955336a434694d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 04:51:19 2024 +1100
2024 Release (#96)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
commit 9e2dec16fb29ee97572b4431e892e3f7ca867422
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:41:00 2024 +1100
Update pyproject.toml
commit 396c7245dda2c913e6b97729fd34e7551dc8e9fa
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:35:17 2024 +1100
Update pyproject.toml
commit 738e91591f3fb39ce03238134fd0d82a84f4b2e3
Author: Daniel Han <danielhanchen@gmail.com>
Date: Thu Jan 11 04:08:03 2024 +1100
Fix some bugs (#83)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
commit a1da50b5ce53f8e57a1b01db607b32f4d0d862e5
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 23:10:48 2024 +1100
Update README.md (#81)
commit 606e8a928440f396601c1d57a003c0401ba26ec0
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:10:23 2024 +1100
Discord button redo (#80)
commit 0169294ffb19fdb877170529381f25bd0f83fc3c
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:02:20 2024 +1100
Update logos (#79)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
commit b2a8c33430e4a31cf7baafe184d448bb50595bb1
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 20:03:01 2024 +1100
Create FUNDING.yml (#78)
commit c9c1abf29045b3831f62099ff03c5b54b99522a6
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Wed Jan 10 01:02:44 2024 +1100
fix_tokenizer
commit 6efffb46e42543986c637690a045092226af5d61
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Tue Jan 9 23:40:43 2024 +1100
check_tokenizer
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
- We support Hugging Face's TRL, Trainer, Seq2SeqTrainer, and even plain PyTorch code!
- We're in 🤗Hugging Face's official docs! Check out the [SFT docs](https://huggingface.co/docs/trl/main/en/sft_trainer#accelerate-fine-tuning-2x-using-unsloth) and [DPO docs](https://huggingface.co/docs/trl/main/en/dpo_trainer#accelerate-dpo-fine-tuning-using-unsloth)!
- If you want to download models from the ModelScope community, set the environment variable `UNSLOTH_USE_MODELSCOPE=1` and install the modelscope library with `pip install modelscope -U`; a minimal sketch follows below.
> unsloth_cli.py also supports `UNSLOTH_USE_MODELSCOPE=1` to download models and datasets. Please remember to use the model and dataset IDs from the ModelScope community.
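A minimal sketch of the ModelScope path (illustrative only: we assume the variable is set before `unsloth` is imported, and the model ID below is a placeholder standing in for a real ModelScope community ID):

```python
import os
os.environ["UNSLOTH_USE_MODELSCOPE"] = "1"  # set before importing unsloth

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit",  # placeholder ModelScope ID
    max_seq_length = 2048,
    load_in_4bit = True,
)
```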
```python
from unsloth import FastLanguageModel, FastModel
import torch
from trl import SFTTrainer, SFTConfig
from datasets import load_dataset
max_seq_length = 2048 # Supports RoPE Scaling internally, so choose any!

# Get LAION dataset
url = "https://huggingface.co/datasets/laion/OIG/resolve/main/unified_chip2.jsonl"
dataset = load_dataset("json", data_files = {"train" : url}, split = "train")

# 4bit pre quantized models we support for 4x faster downloading + no OOMs.
fourbit_models = [
    "unsloth/Meta-Llama-3.1-8B-bnb-4bit", # Llama-3.1 2x faster
    "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit",
    "unsloth/Meta-Llama-3.1-70B-bnb-4bit",
    "unsloth/Meta-Llama-3.1-405B-bnb-4bit", # 4bit for 405b!
    "unsloth/Mistral-Small-Instruct-2409", # Mistral 22b 2x faster!
    "unsloth/mistral-7b-instruct-v0.3-bnb-4bit",
    "unsloth/Phi-3.5-mini-instruct", # Phi-3.5 2x faster!
    "unsloth/Phi-3-medium-4k-instruct",
    "unsloth/gemma-2-9b-bnb-4bit",
    "unsloth/gemma-2-27b-bnb-4bit", # Gemma 2x faster!
    "unsloth/Llama-3.2-1B-bnb-4bit", # NEW! Llama 3.2 models
    "unsloth/Llama-3.2-1B-Instruct-bnb-4bit",
    "unsloth/Llama-3.2-3B-bnb-4bit",
    "unsloth/Llama-3.2-3B-Instruct-bnb-4bit",
    "unsloth/Llama-3.3-70B-Instruct-bnb-4bit", # NEW! Llama 3.3 70B!
] # More models at https://huggingface.co/unsloth

model, tokenizer = FastModel.from_pretrained(
    model_name = "unsloth/gemma-3-4B-it",
    max_seq_length = max_seq_length, # Choose any for long context!
    load_in_4bit = True,  # 4 bit quantization to reduce memory
    load_in_8bit = False, # [NEW!] A bit more accurate, uses 2x memory
    full_finetuning = False, # [NEW!] We have full finetuning now!
    # token = "hf_...", # use one if using gated models
)

# Do model patching and add fast LoRA weights
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 3407,
    max_seq_length = max_seq_length,
    use_rslora = False,  # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
)

trainer = SFTTrainer(
    model = model,
    train_dataset = dataset,
    tokenizer = tokenizer,
    args = SFTConfig(
        dataset_text_field = "text",
        max_seq_length = max_seq_length,
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 10,
        max_steps = 60,
        logging_steps = 1,
        output_dir = "outputs",
        optim = "adamw_8bit",
        seed = 3407,
    ),
)
trainer.train()

# Go to https://github.com/unslothai/unsloth/wiki for advanced tips like
# (1) Saving to GGUF / merging to 16bit for vLLM
# (2) Continued training from a saved LoRA adapter
# (3) Adding an evaluation loop / OOMs
# (4) Customized chat templates
```
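After `trainer.train()` finishes, a minimal generation sketch (an illustrative addition, not part of the original example; it assumes a CUDA GPU and the `model`/`tokenizer` objects from the snippet above):

```python
# Switch the model into Unsloth's faster inference mode
FastLanguageModel.for_inference(model)

inputs = tokenizer("Continue the sequence: 1, 1, 2, 3, 5, 8,", return_tensors = "pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens = 32)
print(tokenizer.batch_decode(outputs))
```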
<a name="RL"></a>
## 💡 Reinforcement Learning
RL methods including DPO, GRPO, PPO, Reward Modelling, and Online DPO all work with Unsloth. We're in 🤗Hugging Face's official docs! We're on the [GRPO docs](https://huggingface.co/learn/nlp-course/en/chapter12/6) and the [DPO docs](https://huggingface.co/docs/trl/main/en/dpo_trainer#accelerate-dpo-fine-tuning-using-unsloth)! List of RL notebooks (a minimal GRPO sketch follows the list):
- ORPO notebook: [Link](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3_(8B)-ORPO.ipynb)
- DPO Zephyr notebook: [Link](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Zephyr_(7B)-DPO.ipynb)
- KTO notebook: [Link](https://colab.research.google.com/drive/1MRgGtLWuZX4ypSfGguFgC-IblTvO2ivM?usp=sharing)
- SimPO notebook: [Link](https://colab.research.google.com/drive/1Hs5oQDovOay4mFA6Y9lQhVJ8TnbFLFh2?usp=sharing)
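As a taste of the RL side, here is a minimal GRPO sketch on top of an Unsloth model, following TRL's documented `GRPOTrainer`/`GRPOConfig` API. It is an illustrative sketch, not the notebook code: the toy prompt dataset and the toy reward function below are our assumptions.

```python
from unsloth import FastLanguageModel
from trl import GRPOTrainer, GRPOConfig
from datasets import Dataset

max_seq_length = 1024
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit",
    max_seq_length = max_seq_length,
    load_in_4bit = True,
)
model = FastLanguageModel.get_peft_model(model, r = 16, lora_alpha = 16,
                                         max_seq_length = max_seq_length)

# Toy prompt dataset and reward function (illustrative only):
# the reward prefers completions close to 50 characters long.
dataset = Dataset.from_dict({"prompt" : ["What is 2 + 2?", "Name a prime number."] * 8})
def reward_len(completions, **kwargs):
    return [-abs(50 - len(completion)) for completion in completions]

trainer = GRPOTrainer(
    model = model,
    processing_class = tokenizer,
    reward_funcs = reward_len,
    args = GRPOConfig(
        output_dir = "grpo_outputs",
        per_device_train_batch_size = 2,
        num_generations = 2,        # completions sampled per prompt
        max_completion_length = 64,
        logging_steps = 1,
        max_steps = 10,
    ),
    train_dataset = dataset,
)
trainer.train()
```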
<details>
<summary>Click for DPO code</summary>
```python
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0" # Optional: set GPU device ID

from unsloth import FastLanguageModel
import torch
from trl import DPOTrainer, DPOConfig
max_seq_length = 2048

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/zephyr-sft-bnb-4bit",
    max_seq_length = max_seq_length,
    load_in_4bit = True,
)

# Do model patching and add fast LoRA weights
model = FastLanguageModel.get_peft_model(
    model,
    r = 64,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 64,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 3407,
    max_seq_length = max_seq_length,
)

dpo_trainer = DPOTrainer(
    model = model,
    ref_model = None,
    train_dataset = YOUR_DATASET_HERE,
    # eval_dataset = YOUR_DATASET_HERE,
    tokenizer = tokenizer,
    args = DPOConfig(
        per_device_train_batch_size = 4,
        gradient_accumulation_steps = 8,
        warmup_ratio = 0.1,
        num_train_epochs = 3,
        logging_steps = 1,
        optim = "adamw_8bit",
        seed = 42,
        output_dir = "outputs",
        max_length = 1024,
        max_prompt_length = 512,
        beta = 0.1,
    ),
)
dpo_trainer.train()
```
</details>
## 🥇 Performance Benchmarking
- For our most detailed benchmarks, read our [Llama 3.3 Blog](https://unsloth.ai/blog/llama3-3).
- Benchmarking of Unsloth was also conducted by [🤗Hugging Face](https://huggingface.co/blog/unsloth-trl).
We tested using the Alpaca Dataset with a batch size of 2, gradient accumulation steps of 4, and rank = 32, applying QLoRA on all linear layers (q, k, v, o, gate, up, down); a configuration sketch follows the table below:
| Model | VRAM | 🦥 Unsloth speed | 🦥 VRAM reduction | 🦥 Longer context | 😊 Hugging Face + FA2 |
|----------------|-------|-----------------|----------------|----------------|--------------------|
| Llama 3.3 (70B)| 80GB | 2x | >75% | 13x longer | 1x |
| Llama 3.1 (8B) | 80GB | 2x | >70% | 12x longer | 1x |
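For concreteness, that setup corresponds roughly to the following PEFT and training arguments. This is a hedged reconstruction from the prose above, not the actual benchmark script; in particular `lora_alpha = 32` is our assumption, not a published value.

```python
from unsloth import FastLanguageModel
from trl import SFTConfig

# QLoRA on all linear layers at rank 32, reusing a 4bit `model`
# loaded as in the quickstart example above.
model = FastLanguageModel.get_peft_model(
    model,
    r = 32,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 32,  # assumption: alpha = rank
)
args = SFTConfig(
    per_device_train_batch_size = 2,  # batch size of 2
    gradient_accumulation_steps = 4,  # effective batch size of 8
    output_dir = "outputs",
)
```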
### Context length benchmarks
#### Llama 3.1 (8B) max. context length
We tested Llama 3.1 (8B) Instruct and did 4bit QLoRA on all linear layers (Q, K, V, O, gate, up and down) with rank = 32 and a batch size of 1. We padded all sequences to a fixed maximum sequence length to mimic long-context finetuning workloads.
| GPU VRAM | 🦥Unsloth context length | Hugging Face + FA2 |
|----------|-----------------------|-----------------|
| 8 GB | 2,972 | OOM |
| 12 GB | 21,848 | 932 |
| 16 GB | 40,724 | 2,551 |
| 24 GB | 78,475 | 5,789 |
| 40 GB | 153,977 | 12,264 |
| 48 GB | 191,728 | 15,502 |
| 80 GB | 342,733 | 28,454 |
#### Llama 3.3 (70B) max. context length
We tested Llama 3.3 (70B) Instruct on an 80GB A100 and did 4bit QLoRA on all linear layers (Q, K, V, O, gate, up and down) with rank = 32 and a batch size of 1. We padded all sequences to a fixed maximum sequence length to mimic long-context finetuning workloads; a padding sketch follows the table.
| GPU VRAM | 🦥Unsloth context length | Hugging Face + FA2 |
|----------|------------------------|------------------|
| 48 GB | 12,106 | OOM |
| 80 GB | 89,389 | 6,916 |
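The pad-everything-to-one-length methodology can be mimicked with standard tokenizer options. A minimal sketch (our reconstruction, not the benchmark harness; the length and column name are illustrative):

```python
# Pad/truncate every example to a fixed maximum sequence length to mimic
# long-context finetuning workloads.
def tokenize_to_fixed_length(batch, tokenizer, max_length = 8192):
    return tokenizer(
        batch["text"],
        padding = "max_length",  # pad every sequence to exactly max_length
        truncation = True,
        max_length = max_length,
    )

# Usage with a HF dataset (assumes `dataset` has a "text" column):
# dataset = dataset.map(lambda b: tokenize_to_fixed_length(b, tokenizer), batched = True)
```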
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
commit 12e75c93d040f99d5a0cc4c4ee162d804c9fbbf4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 04:25:06 2024 +1100
Quick fixes (#106)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
commit 52b5ef31e0cdd96d5b980a1581d3c26c5b89c86c
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sat Jan 20 02:30:31 2024 +1100
Update _utils.py
commit 1a19c38675a35e6121fa4a95438525f306bca26b
Merge: 0a52390 0d6e52b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:38 2024 +1100
Merge branch 'main' of https://github.com/unslothai/unsloth
commit 0a52390ac29a78399b033349070fe1d1280bd296
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:20 2024 +1100
Revert quantization methods
commit 0d6e52b5c7723ed5c78b54c9a6eb67a1997f6038
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:57:22 2024 +1100
getattr issues (#103)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
commit b3fcea642127ee381a3cf19d33fb8910d066643c
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:52:30 2024 +1100
Quick fixes (#101)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
commit d691516ab9d64ea61b0af277f3955336a434694d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 04:51:19 2024 +1100
2024 Release (#96)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
commit 9e2dec16fb29ee97572b4431e892e3f7ca867422
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:41:00 2024 +1100
Update pyproject.toml
commit 396c7245dda2c913e6b97729fd34e7551dc8e9fa
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:35:17 2024 +1100
Update pyproject.toml
commit 738e91591f3fb39ce03238134fd0d82a84f4b2e3
Author: Daniel Han <danielhanchen@gmail.com>
Date: Thu Jan 11 04:08:03 2024 +1100
Fix some bugs (#83)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
commit a1da50b5ce53f8e57a1b01db607b32f4d0d862e5
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 23:10:48 2024 +1100
Update README.md (#81)
commit 606e8a928440f396601c1d57a003c0401ba26ec0
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:10:23 2024 +1100
Discord button redo (#80)
commit 0169294ffb19fdb877170529381f25bd0f83fc3c
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:02:20 2024 +1100
Update logos (#79)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
commit b2a8c33430e4a31cf7baafe184d448bb50595bb1
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 20:03:01 2024 +1100
Create FUNDING.yml (#78)
commit c9c1abf29045b3831f62099ff03c5b54b99522a6
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Wed Jan 10 01:02:44 2024 +1100
fix_tokenizer
commit 6efffb46e42543986c637690a045092226af5d61
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Tue Jan 9 23:40:43 2024 +1100
check_tokenizer
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
ReadMe Revamp (#156)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* finetune button
* Delete start free finetune button.png
* free finetune button
* Add files via upload
* Update README.md
* Update README.md
* Add files via upload
* Add files via upload
* Update README.md
* Add files via upload
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Squashed commit of the following:
commit efa0d2332ebc6d8f215aec07d5cc9907f4e84f34
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Feb 4 17:35:56 2024 +1100
2x faster inference (#151)
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
* Fast inference repatch
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update mistral.py
* Update __init__.py
* Fix inference
* Update mistral.py
* fast lm_head
* Remove fast path
* Update rope_embedding.py
* Update loader.py
* LlamaAttention_fast_forward_inference
* if past_key_value is not None and q_len == 1:
* revert inference
* Update loader.py
* past_key_value
* Update llama.py
* Update llama.py
* Fix SDPA
* Update llama.py
* padding
* Inference
* Update llama.py
* Revert
* Update mistral.py
* faster inference
* inference
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* inference
* Update llama.py
* Update utils.py
* faster inference
* Update llama.py
* revert
* lm_head
* Update llama.py
* inference
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* faster inference
* Update llama.py
* fast inference
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* torch compile
* past_key_values
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update llama.py
* fast inference + saving config.json
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* fast inference again
* more temp matrices
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update mistral.py
* Update llama.py
* SDPA
* attention_mask
* New version
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update utils.py
* Update utils.py
commit 2f55935f941eb61816b145575389f91dde4e00f7
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 31 04:03:37 2024 +1100
Hotfix - fix inference (#146)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
* Fast inference repatch
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update mistral.py
* Update __init__.py
* Fix inference
* Update mistral.py
* fast lm_head
* Remove fast path
* Update rope_embedding.py
* Update loader.py
* LlamaAttention_fast_forward_inference
* if past_key_value is not None and q_len == 1:
* revert inference
* Update loader.py
* past_key_value
commit a3a2ad93821cede32723843dfb3dfbfe0387d25e
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 17:49:54 2024 +1100
Fix inference attention mask (#142)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
commit 90309ca8dcb06f0611c1bde4a61eb08fb7317993
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 03:45:07 2024 +1100
Nightly (#140)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
commit a16bc73e8077fd3c6a034741ae782bcfeb9fa278
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 02:52:39 2024 +1100
Fix saving issues (#139)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
commit af332245543b1f9ac129b67e5c350047c967846d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:30:29 2024 +1100
1 more bug (#138)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
commit e2bbd3819e0899e09787a985cd11c08961f09c09
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:20:06 2024 +1100
Fix bugs + more accurate Swiglu (#137)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
commit a81aff286f1e67c82b2a5105679c85866f624629
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:50:22 2024 +1100
Inference bug fix (#134)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
commit 7da0c50f757b6b2d9cbe660ee68d23700f2e2b0d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:47:54 2024 +1100
More bug fixes (#133)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
commit 62fae3aa740869db2fe1522ea38b334ef090d5e7
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 26 04:19:17 2024 +1100
Fix bugs (#129)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
commit 04f8771821a57fda5109d60b0fe49bb31d0df15b
Author: Daniel Han <danielhanchen@gmail.com>
Date: Tue Jan 23 03:55:24 2024 +1100
2-4x faster native HF inference (#119)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
commit 3a9b2dee98fd0547789da9b68e765f054484abc4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 22:20:22 2024 +1100
Hotfix (#118)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
commit a6f4fb007510aeb2a86500d874f2117e81853d7e
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 05:00:37 2024 +1100
Update save.py
commit 705cac03576fe2fff3923841c102a8bd6b72a65b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:21:54 2024 +1100
Update save.py
commit 16edcb3be2c328f3377aff6555e6435b28980a52
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:13:03 2024 +1100
Update save.py
commit 3d05a74b12edd39638aacf3b44eca65818c6708a
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 03:43:49 2024 +1100
Fixed saving! (#113)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
* upload_to_huggingface
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
commit bb05d6b6e2af2c8807ae4842dcbc2805c9356599
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 23:23:00 2024 +1100
Hotfix for Jan 2024 Release (#110)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
commit 12e75c93d040f99d5a0cc4c4ee162d804c9fbbf4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 04:25:06 2024 +1100
Quick fixes (#106)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
commit 52b5ef31e0cdd96d5b980a1581d3c26c5b89c86c
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sat Jan 20 02:30:31 2024 +1100
Update _utils.py
commit 1a19c38675a35e6121fa4a95438525f306bca26b
Merge: 0a52390 0d6e52b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:38 2024 +1100
Merge branch 'main' of https://github.com/unslothai/unsloth
commit 0a52390ac29a78399b033349070fe1d1280bd296
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:20 2024 +1100
Revert quantization methods
commit 0d6e52b5c7723ed5c78b54c9a6eb67a1997f6038
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:57:22 2024 +1100
getattr issues (#103)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
commit b3fcea642127ee381a3cf19d33fb8910d066643c
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:52:30 2024 +1100
Quick fixes (#101)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
commit d691516ab9d64ea61b0af277f3955336a434694d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 04:51:19 2024 +1100
2024 Release (#96)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
commit 9e2dec16fb29ee97572b4431e892e3f7ca867422
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:41:00 2024 +1100
Update pyproject.toml
commit 396c7245dda2c913e6b97729fd34e7551dc8e9fa
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:35:17 2024 +1100
Update pyproject.toml
commit 738e91591f3fb39ce03238134fd0d82a84f4b2e3
Author: Daniel Han <danielhanchen@gmail.com>
Date: Thu Jan 11 04:08:03 2024 +1100
Fix some bugs (#83)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
commit a1da50b5ce53f8e57a1b01db607b32f4d0d862e5
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 23:10:48 2024 +1100
Update README.md (#81)
commit 606e8a928440f396601c1d57a003c0401ba26ec0
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:10:23 2024 +1100
Discord button redo (#80)
commit 0169294ffb19fdb877170529381f25bd0f83fc3c
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:02:20 2024 +1100
Update logos (#79)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
commit b2a8c33430e4a31cf7baafe184d448bb50595bb1
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 20:03:01 2024 +1100
Create FUNDING.yml (#78)
commit c9c1abf29045b3831f62099ff03c5b54b99522a6
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Wed Jan 10 01:02:44 2024 +1100
fix_tokenizer
commit 6efffb46e42543986c637690a045092226af5d61
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Tue Jan 9 23:40:43 2024 +1100
check_tokenizer
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>

<br>
### Citation
You can cite the Unsloth repo as follows:
```bibtex
@software{unsloth,
  author = {Daniel Han and Michael Han and Unsloth team},
  title = {Unsloth},
  url = {http://github.com/unslothai/unsloth},
  year = {2023}
}
```
### Thank You to
- Hugging Face's [TRL library](https://github.com/huggingface/trl), which serves as the foundation for Unsloth (see the sketch after this list)
- [Erik](https://github.com/erikwijmans) for his help adding [Apple's ML Cross Entropy](https://github.com/apple/ml-cross-entropy) to Unsloth
- [HuyNguyen-hust](https://github.com/HuyNguyen-hust) for making [RoPE Embeddings 28% faster](https://github.com/unslothai/unsloth/pull/238)
- [RandomInternetPreson](https://github.com/RandomInternetPreson) for confirming WSL support
- [152334H](https://github.com/152334H) for experimental DPO support
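
To make the TRL relationship above concrete, here is a minimal sketch of how an Unsloth model drops into TRL's `SFTTrainer`. The checkpoint name, the `imdb` dataset, and all hyperparameters are illustrative assumptions, not prescriptions from this changelog; the `use_rslora` and `loftq_config` arguments mirror the "RSLoRA and LoftQ direct support" change logged earlier in this history.

```python
# Minimal sketch: an Unsloth model plugging into TRL's SFTTrainer.
# Checkpoint, dataset, and hyperparameters below are illustrative only.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

# Load a 4-bit quantized base model through Unsloth's patched loader.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b-bnb-4bit",  # illustrative checkpoint
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters; use_rslora / loftq_config correspond to the
# "RSLoRA and LoftQ direct support" commit noted above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
    use_rslora=False,
    loftq_config=None,
)

dataset = load_dataset("imdb", split="train")  # any plain-text dataset works

# TRL's SFTTrainer drives the training loop on top of the patched model,
# which is why TRL is credited as the foundation for Unsloth.
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        max_steps=60,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()
```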