#!/usr/bin/env python3
"""
LocalAI Diffusers Backend
This backend provides gRPC access to diffusers pipelines with dynamic pipeline loading.
New pipelines added to diffusers become available automatically without code changes.
"""
from concurrent import futures
import traceback
import argparse
from collections import defaultdict
from enum import Enum
import signal
import sys
import time
import os
from PIL import Image
import torch
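

# The "dynamic pipeline loading" mentioned in the module docstring can be
# sketched as resolving a class by its string name at runtime. This generic
# helper is an illustrative assumption, not part of this backend's actual
# API; the real backend would resolve names against the `diffusers` package
# (e.g. resolve_class("diffusers", "StableDiffusionPipeline")), so names
# added in newer diffusers releases work without code changes here.
import importlib


def resolve_class(module_name: str, class_name: str):
    """Resolve a class by name from a module at runtime (dynamic loading)."""
    module = importlib.import_module(module_name)
    cls = getattr(module, class_name, None)
    if cls is None:
        raise ValueError(f"{module_name} has no attribute {class_name}")
    return cls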
import backend_pb2
import backend_pb2_grpc
import grpc
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..', 'common'))
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'common'))
from grpc_auth import get_auth_interceptors
# Import dynamic loader for pipeline discovery
from diffusers_dynamic_loader import (
    get_pipeline_registry,
    resolve_pipeline_class,
    get_available_pipelines,
    load_diffusers_pipeline,
)

# Import specific items still needed for special cases and the safety checker
from diffusers import DiffusionPipeline, ControlNetModel
from diffusers import FluxPipeline, FluxTransformer2DModel, AutoencoderKLWan
from diffusers.pipelines.stable_diffusion import safety_checker
from diffusers.utils import load_image, export_to_video
from compel import Compel, ReturnedEmbeddingsType
from optimum.quanto import freeze, qfloat8, quantize
from transformers import T5EncoderModel
from safetensors.torch import load_file
# Try to import sd_embed - it might not always be available
try:
    from sd_embed.embedding_funcs import (
        get_weighted_text_embeddings_sd15,
        get_weighted_text_embeddings_sdxl,
        get_weighted_text_embeddings_sd3,
        get_weighted_text_embeddings_flux1,
    )
    SD_EMBED_AVAILABLE = True
except ImportError:
    get_weighted_text_embeddings_sd15 = None
    get_weighted_text_embeddings_sdxl = None
    get_weighted_text_embeddings_sd3 = None
    get_weighted_text_embeddings_flux1 = None
    SD_EMBED_AVAILABLE = False
# Import LTX-2 specific utilities
from diffusers.pipelines.ltx2.export_utils import encode_video as ltx2_encode_video
from diffusers import LTX2VideoTransformer3DModel, GGUFQuantizationConfig
_ONE_DAY_IN_SECONDS = 60 * 60 * 24
COMPEL = os.environ.get("COMPEL", "0") == "1"
SD_EMBED = os.environ.get("SD_EMBED", "0") == "1"

# Warn if SD_EMBED is enabled but the module is not available
if SD_EMBED and not SD_EMBED_AVAILABLE:
    print("WARNING: SD_EMBED is enabled but sd_embed module is not available. Falling back to standard prompt processing.", file=sys.stderr)

XPU = os.environ.get("XPU", "0") == "1"
CLIPSKIP = os.environ.get("CLIPSKIP", "1") == "1"
SAFETENSORS = os.environ.get("SAFETENSORS", "1") == "1"
CHUNK_SIZE = os.environ.get("CHUNK_SIZE", "8")
FPS = os.environ.get("FPS", "7")
DISABLE_CPU_OFFLOAD = os.environ.get("DISABLE_CPU_OFFLOAD", "0") == "1"
FRAMES = os.environ.get("FRAMES", "64")

if XPU:
    print(torch.xpu.get_device_name(0))

# If MAX_WORKERS is specified in the environment use it, otherwise default to 1
MAX_WORKERS = int(os.environ.get("PYTHON_GRPC_MAX_WORKERS", "1"))

# https://github.com/CompVis/stable-diffusion/issues/239#issuecomment-1627615287
def sc(self, clip_input, images):
    return images, [False for _ in images]


# Patch StableDiffusionSafetyChecker so that, when called, it returns the images
# unchanged and flags every image as safe (has_nsfw_concept=False), effectively
# disabling the safety checker.
safety_checker.StableDiffusionSafetyChecker.forward = sc
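
# Illustrative sketch (not used by the backend) of the monkey-patching technique
# applied above: reassigning a method on a class changes behavior for all
# existing and future instances. _Checker and _passthrough are hypothetical toy
# names, shown only to document the idea.
class _Checker:
    def forward(self, images):
        return []  # pretend everything gets filtered out

def _passthrough(self, images):
    return images

# After this assignment every _Checker instance passes images through unchanged
_Checker.forward = _passthrough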

from diffusers.schedulers import (
    DDIMScheduler,
    DPMSolverMultistepScheduler,
    DPMSolverSinglestepScheduler,
    EulerAncestralDiscreteScheduler,
    EulerDiscreteScheduler,
    HeunDiscreteScheduler,
    KDPM2AncestralDiscreteScheduler,
    KDPM2DiscreteScheduler,
    LMSDiscreteScheduler,
    PNDMScheduler,
    UniPCMultistepScheduler,
)

def is_float(s):
    """Check if a string can be converted to float."""
    try:
        float(s)
        return True
    except ValueError:
        return False


def is_int(s):
    """Check if a string can be converted to int."""
    try:
        int(s)
        return True
    except ValueError:
        return False
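
# Illustrative sketch (not used by the backend): the kind of option-value
# coercion LoadModel performs on "optname:optvalue" strings with helpers like
# is_float/is_int above. Integers are tried before floats here so "8" stays an
# int. The name _coerce_option_value is hypothetical, for documentation only.
def _coerce_option_value(value: str):
    try:
        return int(value)
    except ValueError:
        pass
    try:
        return float(value)
    except ValueError:
        pass
    if value.lower() in ("true", "false"):
        return value.lower() == "true"
    return value  # leave non-numeric, non-boolean values as plain strings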

# The scheduler list mapping was taken from here: https://github.com/neggles/animatediff-cli/blob/6f336f5f4b5e38e85d7f06f1744ef42d0a45f2a7/src/animatediff/schedulers.py#L39
# Credits to https://github.com/neggles
# See https://github.com/huggingface/diffusers/issues/4167 for more details on sched mapping from A1111
class DiffusionScheduler(str, Enum):
    ddim = "ddim"                      # DDIM
    pndm = "pndm"                      # PNDM
    heun = "heun"                      # Heun
    unipc = "unipc"                    # UniPC
    euler = "euler"                    # Euler
    euler_a = "euler_a"                # Euler a
    lms = "lms"                        # LMS
    k_lms = "k_lms"                    # LMS Karras
    dpm_2 = "dpm_2"                    # DPM2
    k_dpm_2 = "k_dpm_2"                # DPM2 Karras
    dpm_2_a = "dpm_2_a"                # DPM2 a
    k_dpm_2_a = "k_dpm_2_a"            # DPM2 a Karras
    dpmpp_2m = "dpmpp_2m"              # DPM++ 2M
    k_dpmpp_2m = "k_dpmpp_2m"          # DPM++ 2M Karras
    dpmpp_sde = "dpmpp_sde"            # DPM++ SDE
    k_dpmpp_sde = "k_dpmpp_sde"        # DPM++ SDE Karras
    dpmpp_2m_sde = "dpmpp_2m_sde"      # DPM++ 2M SDE
    k_dpmpp_2m_sde = "k_dpmpp_2m_sde"  # DPM++ 2M SDE Karras


def get_scheduler(name: str, config: dict = None):
    # Avoid a shared mutable default argument: each call gets its own config dict
    if config is None:
        config = {}
    is_karras = name.startswith("k_")
    if is_karras:
        # Strip the "k_" prefix and add the Karras sigma flag to the config.
        # Note: str.lstrip() strips a set of characters, not a prefix, so slice instead.
        name = name[len("k_"):]
        config["use_karras_sigmas"] = True

    if name == DiffusionScheduler.ddim:
        sched_class = DDIMScheduler
    elif name == DiffusionScheduler.pndm:
        sched_class = PNDMScheduler
    elif name == DiffusionScheduler.heun:
        sched_class = HeunDiscreteScheduler
    elif name == DiffusionScheduler.unipc:
        sched_class = UniPCMultistepScheduler
    elif name == DiffusionScheduler.euler:
        sched_class = EulerDiscreteScheduler
    elif name == DiffusionScheduler.euler_a:
        sched_class = EulerAncestralDiscreteScheduler
    elif name == DiffusionScheduler.lms:
        sched_class = LMSDiscreteScheduler
    elif name == DiffusionScheduler.dpm_2:
        # Equivalent to DPM2 in K-Diffusion
        sched_class = KDPM2DiscreteScheduler
    elif name == DiffusionScheduler.dpm_2_a:
        # Equivalent to `DPM2 a` in K-Diffusion
        sched_class = KDPM2AncestralDiscreteScheduler
    elif name == DiffusionScheduler.dpmpp_2m:
        # Equivalent to `DPM++ 2M` in K-Diffusion
        sched_class = DPMSolverMultistepScheduler
        config["algorithm_type"] = "dpmsolver++"
        config["solver_order"] = 2
    elif name == DiffusionScheduler.dpmpp_sde:
        # Equivalent to `DPM++ SDE` in K-Diffusion
        sched_class = DPMSolverSinglestepScheduler
    elif name == DiffusionScheduler.dpmpp_2m_sde:
        # Equivalent to `DPM++ 2M SDE` in K-Diffusion
        sched_class = DPMSolverMultistepScheduler
        config["algorithm_type"] = "sde-dpmsolver++"
    else:
        raise ValueError(f"Invalid scheduler '{'k_' if is_karras else ''}{name}'")

    return sched_class.from_config(config)
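
# Illustrative sketch (not used by the backend) of the "k_" naming convention
# get_scheduler handles above: a Karras variant reuses its base scheduler class
# and only toggles use_karras_sigmas. The name _split_karras is hypothetical,
# shown only to document the convention.
def _split_karras(name: str):
    if name.startswith("k_"):
        return name[len("k_"):], True
    return name, False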


# Implement the BackendServicer class with the service methods
class BackendServicer(backend_pb2_grpc.BackendServicer):
    def _load_pipeline(self, request, modelFile, fromSingleFile, torchType, variant, device_map=None):
        """
        Load a diffusers pipeline dynamically using the dynamic loader.

        This method uses load_diffusers_pipeline() for most pipelines, falling back
        to explicit handling only for pipelines requiring custom initialization
        (e.g., quantization, special VAE handling).

        Args:
            request: The gRPC request containing pipeline configuration
            modelFile: Path to the model file (for single file loading)
            fromSingleFile: Whether to use from_single_file() vs from_pretrained()
            torchType: The torch dtype to use
            variant: Model variant (e.g., "fp16")
            device_map: Device mapping strategy (e.g., "auto" for multi-GPU)

        Returns:
            The loaded pipeline instance
        """
        pipeline_type = request.PipelineType

        # Handle the IMG2IMG request flag with the default pipeline
        if request.IMG2IMG and pipeline_type == "":
            pipeline_type = "StableDiffusionImg2ImgPipeline"

        # ================================================================
        # Special cases requiring custom initialization logic.
        # Only handle pipelines that truly need custom code (quantization,
        # special VAE handling, etc.). All other pipelines use dynamic loading.
        # ================================================================

        # FluxTransformer2DModel - requires quantization and custom transformer loading
        if pipeline_type == "FluxTransformer2DModel":
            dtype = torch.bfloat16
            bfl_repo = os.environ.get("BFL_REPO", "ChuckMcSneed/FLUX.1-dev")
            transformer = FluxTransformer2DModel.from_single_file(modelFile, torch_dtype=dtype, device_map=device_map)
            quantize(transformer, weights=qfloat8)
            freeze(transformer)
            text_encoder_2 = T5EncoderModel.from_pretrained(bfl_repo, subfolder="text_encoder_2", torch_dtype=dtype, device_map=device_map)
            quantize(text_encoder_2, weights=qfloat8)
            freeze(text_encoder_2)
            pipe = FluxPipeline.from_pretrained(bfl_repo, transformer=None, text_encoder_2=None, torch_dtype=dtype, device_map=device_map)
            pipe.transformer = transformer
            pipe.text_encoder_2 = text_encoder_2
            if request.LowVRAM:
                pipe.enable_model_cpu_offload()
            return pipe

        # WanPipeline - requires a special VAE with float32 dtype
        if pipeline_type == "WanPipeline":
            vae = AutoencoderKLWan.from_pretrained(
                request.Model,
                subfolder="vae",
                torch_dtype=torch.float32,
                device_map=device_map
            )
            pipe = load_diffusers_pipeline(
                class_name="WanPipeline",
                model_id=request.Model,
                vae=vae,
                torch_dtype=torchType,
                device_map=device_map
            )
            self.txt2vid = True
            return pipe

        # WanImageToVideoPipeline - requires a special VAE with float32 dtype
        if pipeline_type == "WanImageToVideoPipeline":
            vae = AutoencoderKLWan.from_pretrained(
                request.Model,
                subfolder="vae",
                torch_dtype=torch.float32,
                device_map=device_map
            )
            pipe = load_diffusers_pipeline(
                class_name="WanImageToVideoPipeline",
                model_id=request.Model,
                vae=vae,
                torch_dtype=torchType,
                device_map=device_map
            )
            self.img2vid = True
            return pipe

        # SanaPipeline - requires special VAE and text encoder dtype conversion
        if pipeline_type == "SanaPipeline":
            pipe = load_diffusers_pipeline(
                class_name="SanaPipeline",
                model_id=request.Model,
                variant="bf16",
                torch_dtype=torch.bfloat16,
                device_map=device_map
            )
            pipe.vae.to(torch.bfloat16)
            pipe.text_encoder.to(torch.bfloat16)
            return pipe

        # VideoDiffusionPipeline - alias for DiffusionPipeline with the txt2vid flag
        if pipeline_type == "VideoDiffusionPipeline":
            self.txt2vid = True
            pipe = load_diffusers_pipeline(
                class_name="DiffusionPipeline",
                model_id=request.Model,
                torch_dtype=torchType,
                device_map=device_map
            )
            return pipe

        # StableVideoDiffusionPipeline - needs the img2vid flag and CPU offload
        if pipeline_type == "StableVideoDiffusionPipeline":
            self.img2vid = True
            pipe = load_diffusers_pipeline(
                class_name="StableVideoDiffusionPipeline",
                model_id=request.Model,
                torch_dtype=torchType,
                variant=variant,
                device_map=device_map
            )
            if not DISABLE_CPU_OFFLOAD:
                pipe.enable_model_cpu_offload()
            return pipe

        # LTX2ImageToVideoPipeline - needs the img2vid flag, CPU offload, and special handling
        if pipeline_type == "LTX2ImageToVideoPipeline":
            self.img2vid = True
            self.ltx2_pipeline = True
            # Check if loading from a single file (GGUF)
            if fromSingleFile and LTX2VideoTransformer3DModel is not None:
                _, single_file_ext = os.path.splitext(modelFile)
                if single_file_ext == ".gguf":
                    # Load the transformer from a single GGUF file with quantization
                    quantization_config = GGUFQuantizationConfig(compute_dtype=torchType)
                    transformer = LTX2VideoTransformer3DModel.from_single_file(
                        modelFile,
                        config=request.Model,  # Use request.Model as the config/model_id
                        subfolder="transformer",
                        device_map=device_map,
                        quantization_config=quantization_config,
                    )
                    # Load the pipeline with the custom transformer
                    pipe = load_diffusers_pipeline(
                        class_name="LTX2ImageToVideoPipeline",
                        model_id=request.Model,
                        transformer=transformer,
                        torch_dtype=torchType,
                        device_map=device_map,
                    )
                else:
                    # Single file but not GGUF - use standard single file loading
                    pipe = load_diffusers_pipeline(
                        class_name="LTX2ImageToVideoPipeline",
                        model_id=modelFile,
                        from_single_file=True,
                        torch_dtype=torchType,
                        device_map=device_map,
                    )
            else:
                # Standard loading from pretrained
                pipe = load_diffusers_pipeline(
                    class_name="LTX2ImageToVideoPipeline",
                    model_id=request.Model,
                    torch_dtype=torchType,
                    variant=variant,
                    device_map=device_map
                )
            if not DISABLE_CPU_OFFLOAD:
                pipe.enable_model_cpu_offload()
            return pipe

        # LTX2Pipeline - text-to-video pipeline; needs the txt2vid flag, CPU offload, and special handling
        if pipeline_type == "LTX2Pipeline":
            self.txt2vid = True
            self.ltx2_pipeline = True
            # Check if loading from a single file (GGUF)
            if fromSingleFile and LTX2VideoTransformer3DModel is not None:
                _, single_file_ext = os.path.splitext(modelFile)
                if single_file_ext == ".gguf":
                    # Load the transformer from a single GGUF file with quantization
                    quantization_config = GGUFQuantizationConfig(compute_dtype=torchType)
                    transformer = LTX2VideoTransformer3DModel.from_single_file(
                        modelFile,
                        config=request.Model,  # Use request.Model as the config/model_id
                        subfolder="transformer",
                        device_map=device_map,
                        quantization_config=quantization_config,
                    )
                    # Load the pipeline with the custom transformer
                    pipe = load_diffusers_pipeline(
                        class_name="LTX2Pipeline",
                        model_id=request.Model,
                        transformer=transformer,
                        torch_dtype=torchType,
                        device_map=device_map,
                    )
                else:
                    # Single file but not GGUF - use standard single file loading
                    pipe = load_diffusers_pipeline(
                        class_name="LTX2Pipeline",
                        model_id=modelFile,
                        from_single_file=True,
                        torch_dtype=torchType,
                        device_map=device_map,
                    )
            else:
                # Standard loading from pretrained
                pipe = load_diffusers_pipeline(
                    class_name="LTX2Pipeline",
                    model_id=request.Model,
                    torch_dtype=torchType,
                    variant=variant,
                    device_map=device_map
                )
            if not DISABLE_CPU_OFFLOAD:
                pipe.enable_model_cpu_offload()
            return pipe

        # ================================================================
        # Dynamic pipeline loading - the default path for most pipelines.
        # Uses the dynamic loader to instantiate any pipeline by class name.
        # ================================================================

        # Build kwargs for dynamic loading
        load_kwargs = {"torch_dtype": torchType}
        # Add variant if not loading from a single file
        if not fromSingleFile and variant:
            load_kwargs["variant"] = variant
        # Add use_safetensors for from_pretrained
        if not fromSingleFile:
            load_kwargs["use_safetensors"] = SAFETENSORS
        # Add device_map for multi-GPU support (when TensorParallelSize > 1)
        if device_map:
            load_kwargs["device_map"] = device_map

        # Determine the pipeline class name - default to AutoPipelineForText2Image
        effective_pipeline_type = pipeline_type if pipeline_type else "AutoPipelineForText2Image"

        # Use the dynamic loader for all remaining pipelines
        try:
            pipe = load_diffusers_pipeline(
                class_name=effective_pipeline_type,
                model_id=modelFile if fromSingleFile else request.Model,
                from_single_file=fromSingleFile,
                **load_kwargs
            )
        except Exception as e:
            # Provide a helpful error listing the available pipelines
            available = get_available_pipelines()
            raise ValueError(
                f"Failed to load pipeline '{effective_pipeline_type}': {e}\n"
                f"Available pipelines: {', '.join(available[:30])}..."
            ) from e

        # Apply the LowVRAM optimization if supported and requested
        if request.LowVRAM and hasattr(pipe, 'enable_model_cpu_offload'):
            pipe.enable_model_cpu_offload()

        return pipe

    def Health(self, request, context):
        return backend_pb2.Reply(message=bytes("OK", "utf-8"))

    def LoadModel(self, request, context):
        try:
            print(f"Loading model {request.Model}...", file=sys.stderr)
            print(f"Request {request}", file=sys.stderr)
            torchType = torch.float32
            variant = None
            if request.F16Memory:
                torchType = torch.float16
                variant = "fp16"

            # The options are a list of strings in the form "optname:optvalue".
            # Store them all in a dict so they can be reused later when
            # generating images.
            self.options = {}
            for opt in request.Options:
                if ":" not in opt:
                    continue
                # Split only on the first ":" so option values may themselves contain colons
                key, value = opt.split(":", 1)
                # If the value is numeric or boolean, convert it to the appropriate type.
                # Integers are checked before floats so values like "8" stay ints.
                if is_int(value):
                    value = int(value)
                elif is_float(value):
                    value = float(value)
                elif value.lower() in ["true", "false"]:
                    value = value.lower() == "true"
                self.options[key] = value

            # From the options, extract "torch_dtype" if present and map it to a torch dtype
            if "torch_dtype" in self.options:
                if self.options["torch_dtype"] == "fp16":
                    torchType = torch.float16
                elif self.options["torch_dtype"] == "bf16":
                    torchType = torch.bfloat16
                elif self.options["torch_dtype"] == "fp32":
                    torchType = torch.float32
                # Remove it from the options so it is not forwarded to the pipeline
                del self.options["torch_dtype"]

            print(f"Options: {self.options}", file=sys.stderr)

            local = False
            modelFile = request.Model

            self.cfg_scale = 7
            self.PipelineType = request.PipelineType

            if request.CFGScale != 0:
                self.cfg_scale = request.CFGScale

            clipmodel = "Lykon/dreamshaper-8"
            if request.CLIPModel != "":
                clipmodel = request.CLIPModel
            clipsubfolder = "text_encoder"
            if request.CLIPSubfolder != "":
                clipsubfolder = request.CLIPSubfolder

            # Check if ModelFile exists
            if request.ModelFile != "":
                if os.path.exists(request.ModelFile):
                    local = True
                    modelFile = request.ModelFile

            fromSingleFile = request.Model.startswith("http") or request.Model.startswith("/") or local

            self.img2vid = False
            self.txt2vid = False
            self.ltx2_pipeline = False

            print(f"LoadModel: PipelineType from request: {request.PipelineType}", file=sys.stderr)

            # Determine device_map for multi-GPU support based on TensorParallelSize.
            # When TensorParallelSize > 1, use device_map="auto" to distribute the model across GPUs.
            device_map = None
            if hasattr(request, 'TensorParallelSize') and request.TensorParallelSize > 1:
                device_map = "auto"
                print(f"LoadModel: Multi-GPU mode enabled with TensorParallelSize={request.TensorParallelSize}, using device_map='auto'", file=sys.stderr)

            # Load the pipeline using the dynamic loader.
            # Special cases that require custom initialization are handled first.
            self.pipe = self._load_pipeline(
                request=request,
                modelFile=modelFile,
                fromSingleFile=fromSingleFile,
                torchType=torchType,
                variant=variant,
                device_map=device_map
            )

            print(f"LoadModel: After loading - ltx2_pipeline: {self.ltx2_pipeline}, img2vid: {self.img2vid}, txt2vid: {self.txt2vid}, PipelineType: {self.PipelineType}", file=sys.stderr)

            if CLIPSKIP and request.CLIPSkip != 0:
                self.clip_skip = request.CLIPSkip
            else:
                self.clip_skip = 0

            # torch_dtype needs to be customized: float16 for GPU, float32 for CPU
            # TODO: this needs to be customized
            if request.SchedulerType != "":
                self.pipe.scheduler = get_scheduler(request.SchedulerType, self.pipe.scheduler.config)

            if COMPEL:
                self.compel = Compel(
                    tokenizer=[self.pipe.tokenizer, self.pipe.tokenizer_2],
                    text_encoder=[self.pipe.text_encoder, self.pipe.text_encoder_2],
                    returned_embeddings_type=ReturnedEmbeddingsType.PENULTIMATE_HIDDEN_STATES_NON_NORMALIZED,
                    requires_pooled=[False, True]
                )

            if request.ControlNet:
                self.controlnet = ControlNetModel.from_pretrained(
                    request.ControlNet, torch_dtype=torchType, variant=variant, device_map=device_map
                )
                self.pipe.controlnet = self.controlnet
            else:
                self.controlnet = None

            if request.LoraAdapter and not os.path.isabs(request.LoraAdapter):
                # Make LoraAdapter relative to the model path
                request.LoraAdapter = os.path.join(request.ModelPath, request.LoraAdapter)

            device = "cpu" if not request.CUDA else "cuda"
            if XPU:
                device = "xpu"
            mps_available = hasattr(torch.backends, "mps") and torch.backends.mps.is_available()
            if mps_available:
                device = "mps"
            self.device = device

            if request.LoraAdapter:
                # Check if it is a local file and not a directory (a safetensors file is loaded differently)
                if os.path.exists(request.LoraAdapter) and not os.path.isdir(request.LoraAdapter):
                    self.pipe.load_lora_weights(request.LoraAdapter)
                else:
                    self.pipe.unet.load_attn_procs(request.LoraAdapter)

            if len(request.LoraAdapters) > 0:
                adapters_name = []
                adapters_weights = []
                for i, adapter in enumerate(request.LoraAdapters):
                    if not os.path.isabs(adapter):
                        adapter = os.path.join(request.ModelPath, adapter)
                    self.pipe.load_lora_weights(adapter, adapter_name=f"adapter_{i}")
                    adapters_name.append(f"adapter_{i}")
                for adapters_weight in request.LoraScales:
                    adapters_weights.append(adapters_weight)
                self.pipe.set_adapters(adapters_name, adapter_weights=adapters_weights)

            # Only move the pipeline to the device if NOT using device_map;
            # device_map handles device placement automatically.
            if device_map is None and device != "cpu":
                self.pipe.to(device)
                if self.controlnet:
                    self.controlnet.to(device)
        except Exception as err:
            return backend_pb2.Result(success=False, message=f"Unexpected {err=}, {type(err)=}")
        return backend_pb2.Result(message="Model loaded successfully", success=True)

    # https://github.com/huggingface/diffusers/issues/3064
    def load_lora_weights(self, checkpoint_path, multiplier, device, dtype):
        LORA_PREFIX_UNET = "lora_unet"
        LORA_PREFIX_TEXT_ENCODER = "lora_te"
        # Load the LoRA weights from a .safetensors file
        state_dict = load_file(checkpoint_path, device=device)
        updates = defaultdict(dict)
        for key, value in state_dict.items():
            # Keys usually look like:
            # "lora_te_text_model_encoder_layers_0_self_attn_k_proj.lora_down.weight"
            layer, elem = key.split('.', 1)
            updates[layer][elem] = value

        # Directly update the weights in the diffusers model
        for layer, elems in updates.items():
            if "text" in layer:
                layer_infos = layer.split(LORA_PREFIX_TEXT_ENCODER + "_")[-1].split("_")
                curr_layer = self.pipe.text_encoder
            else:
                layer_infos = layer.split(LORA_PREFIX_UNET + "_")[-1].split("_")
                curr_layer = self.pipe.unet

            # Find the target layer; module names may themselves contain
            # underscores, so greedily join segments until an attribute matches.
            temp_name = layer_infos.pop(0)
            while len(layer_infos) > -1:
                try:
                    curr_layer = curr_layer.__getattr__(temp_name)
                    if len(layer_infos) > 0:
                        temp_name = layer_infos.pop(0)
                    elif len(layer_infos) == 0:
                        break
                except Exception:
                    if len(temp_name) > 0:
                        temp_name += "_" + layer_infos.pop(0)
                    else:
                        temp_name = layer_infos.pop(0)

            # Get the up/down projection weights for this layer
            weight_up = elems['lora_up.weight'].to(dtype)
            weight_down = elems['lora_down.weight'].to(dtype)
            alpha = elems['alpha'] if 'alpha' in elems else None
            if alpha:
                alpha = alpha.item() / weight_up.shape[1]
            else:
                alpha = 1.0

            # Update the weight: W += multiplier * alpha * (up @ down)
            if len(weight_up.shape) == 4:
                curr_layer.weight.data += multiplier * alpha * torch.mm(weight_up.squeeze(3).squeeze(2), weight_down.squeeze(3).squeeze(2)).unsqueeze(2).unsqueeze(3)
            else:
                curr_layer.weight.data += multiplier * alpha * torch.mm(weight_up, weight_down)

    def GenerateImage(self, request, context):
        prompt = request.positive_prompt

        steps = 1
        if request.step != 0:
            steps = request.step

        # Create a dictionary of values for the pipeline parameters
        options = {
            "num_inference_steps": steps,
        }

        if hasattr(request, 'negative_prompt') and request.negative_prompt != "":
            options["negative_prompt"] = request.negative_prompt

        # Handle the image source: prioritize ref_images over request.src
        image_src = None
        if hasattr(request, 'ref_images') and request.ref_images and len(request.ref_images) > 0:
            # Use the first reference image if available
            image_src = request.ref_images[0]
            print(f"Using reference image: {image_src}", file=sys.stderr)
        elif request.src != "":
            # Fall back to request.src if there are no ref_images
            image_src = request.src
            print(f"Using source image: {image_src}", file=sys.stderr)
        else:
            print("No image source provided", file=sys.stderr)

        if image_src and not self.controlnet and not self.img2vid:
            image = Image.open(image_src)
            options["image"] = image
        elif self.controlnet and image_src:
            pose_image = load_image(image_src)
            options["image"] = pose_image

        if CLIPSKIP and self.clip_skip != 0:
            options["clip_skip"] = self.clip_skip

        kwargs = {}
        # Populate kwargs from self.options
        kwargs.update(self.options)

        # Merge the per-request options (image, negative_prompt, num_inference_steps)
        # into kwargs so they reach the pipeline call
        kwargs.update(options)

        # Set the seed, if one was provided (request.seed defaults to 0 when unset)
        if request.seed > 0:
            kwargs["generator"] = torch.Generator(device=self.device).manual_seed(
                request.seed
            )
        if self.PipelineType == "FluxPipeline":
            kwargs["max_sequence_length"] = 256

        if request.width:
            kwargs["width"] = request.width
        if request.height:
            kwargs["height"] = request.height

        if self.PipelineType == "FluxTransformer2DModel":
            kwargs["output_type"] = "pil"
            kwargs["generator"] = torch.Generator("cpu").manual_seed(0)
        if self.img2vid:
            # Load the conditioning image
            if image_src:
                image = load_image(image_src)
            else:
                # Fall back to request.src for img2vid if no ref_images
                image = load_image(request.src)
            image = image.resize((1024, 576))
            generator = torch.manual_seed(request.seed)
            frames = self.pipe(image, guidance_scale=self.cfg_scale, decode_chunk_size=CHUNK_SIZE, generator=generator).frames[0]
            export_to_video(frames, request.dst, fps=FPS)
            return backend_pb2.Result(message="Media generated successfully", success=True)

        if self.txt2vid:
            video_frames = self.pipe(prompt, guidance_scale=self.cfg_scale, num_inference_steps=steps, num_frames=int(FRAMES)).frames
            export_to_video(video_frames, request.dst)
            return backend_pb2.Result(message="Media generated successfully", success=True)
        print(f"Generating image with {kwargs=}", file=sys.stderr)

        image = {}
        if COMPEL:
            conditioning, pooled = self.compel.build_conditioning_tensor(prompt)
            kwargs["prompt_embeds"] = conditioning
            kwargs["pooled_prompt_embeds"] = pooled
            # pass the kwargs dictionary to the self.pipe method
            image = self.pipe(
                guidance_scale=self.cfg_scale,
                **kwargs
            ).images[0]
        elif SD_EMBED and SD_EMBED_AVAILABLE:
            if self.PipelineType == "StableDiffusionPipeline":
                (
                    kwargs["prompt_embeds"],
                    kwargs["negative_prompt_embeds"],
                ) = get_weighted_text_embeddings_sd15(
                    pipe=self.pipe,
                    prompt=prompt,
                    neg_prompt=request.negative_prompt if hasattr(request, 'negative_prompt') else None,
                )
            if self.PipelineType == "StableDiffusionXLPipeline":
                (
                    kwargs["prompt_embeds"],
                    kwargs["negative_prompt_embeds"],
                    kwargs["pooled_prompt_embeds"],
                    kwargs["negative_pooled_prompt_embeds"],
                ) = get_weighted_text_embeddings_sdxl(
                    pipe=self.pipe,
                    prompt=prompt,
                    neg_prompt=request.negative_prompt if hasattr(request, 'negative_prompt') else None,
                )
            if self.PipelineType == "StableDiffusion3Pipeline":
                (
                    kwargs["prompt_embeds"],
                    kwargs["negative_prompt_embeds"],
                    kwargs["pooled_prompt_embeds"],
                    kwargs["negative_pooled_prompt_embeds"],
                ) = get_weighted_text_embeddings_sd3(
                    pipe=self.pipe,
                    prompt=prompt,
                    neg_prompt=request.negative_prompt if hasattr(request, 'negative_prompt') else None,
                )
            if self.PipelineType == "FluxTransformer2DModel":
                (
                    kwargs["prompt_embeds"],
                    kwargs["pooled_prompt_embeds"],
                ) = get_weighted_text_embeddings_flux1(
                    pipe=self.pipe,
                    prompt=prompt,
                )
            image = self.pipe(
                guidance_scale=self.cfg_scale,
                **kwargs
            ).images[0]
        else:
            # pass the kwargs dictionary to the self.pipe method
            image = self.pipe(
                prompt,
                guidance_scale=self.cfg_scale,
                **kwargs
            ).images[0]

        # save the result
        image.save(request.dst)

        return backend_pb2.Result(message="Media generated", success=True)
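The generation path above assembles pipeline arguments incrementally, adding a key only when the corresponding request field is set (non-zero, non-empty). A minimal standalone sketch of that pattern; `build_image_kwargs` and the `SimpleNamespace` request are illustrative stand-ins, not part of this backend:

```python
from types import SimpleNamespace

def build_image_kwargs(req):
    # Mirror the handler's pattern: only set keys for fields the caller provided.
    kwargs = {"prompt": req.prompt}
    if req.width:
        kwargs["width"] = req.width
    if req.height:
        kwargs["height"] = req.height
    if req.seed > 0:
        # The real handler wraps this in torch.Generator(...).manual_seed(seed).
        kwargs["seed"] = req.seed
    return kwargs

req = SimpleNamespace(prompt="a cat", width=512, height=0, seed=3)
print(build_image_kwargs(req))  # {'prompt': 'a cat', 'width': 512, 'seed': 3}
```

Unset fields (height=0 here) never reach the pipeline, so diffusers' own defaults apply.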
    def GenerateVideo(self, request, context):
        try:
            prompt = request.prompt
            if not prompt:
                print("GenerateVideo: No prompt provided for video generation.", file=sys.stderr)
                return backend_pb2.Result(success=False, message="No prompt provided for video generation")

            # Debug: print raw request values
            print(f"GenerateVideo: Raw request values - num_frames: {request.num_frames}, fps: {request.fps}, cfg_scale: {request.cfg_scale}, step: {request.step}", file=sys.stderr)

            # Fill in defaults for any unset request fields
            num_frames = request.num_frames if request.num_frames > 0 else 81
            fps = request.fps if request.fps > 0 else 16
            cfg_scale = request.cfg_scale if request.cfg_scale > 0 else 4.0
            num_inference_steps = request.step if request.step > 0 else 40

            print(f"GenerateVideo: Using values - num_frames: {num_frames}, fps: {fps}, cfg_scale: {cfg_scale}, num_inference_steps: {num_inference_steps}", file=sys.stderr)
            # Prepare generation parameters
            kwargs = {
                "prompt": prompt,
                "negative_prompt": request.negative_prompt if request.negative_prompt else "",
                "height": request.height if request.height > 0 else 720,
                "width": request.width if request.width > 0 else 1280,
                "num_frames": num_frames,
                "guidance_scale": cfg_scale,
                "num_inference_steps": num_inference_steps,
            }
            # Add custom options from self.options (including guidance_scale_2 if specified)
            kwargs.update(self.options)
            # Set seed if provided
            if request.seed > 0:
                kwargs["generator"] = torch.Generator(device=self.device).manual_seed(request.seed)
            # Handle start and end images for video generation
            if request.start_image:
                kwargs["start_image"] = load_image(request.start_image)
            if request.end_image:
                kwargs["end_image"] = load_image(request.end_image)
            print(f"Generating video with {kwargs=}", file=sys.stderr)
            print(f"GenerateVideo: Pipeline type: {self.PipelineType}, ltx2_pipeline flag: {self.ltx2_pipeline}", file=sys.stderr)

            # Generate video frames based on pipeline type
            if self.ltx2_pipeline or self.PipelineType in ["LTX2Pipeline", "LTX2ImageToVideoPipeline"]:
                # LTX-2 generation with audio (supports both text-to-video and image-to-video).
                # Determine whether this is text-to-video (no image) or image-to-video (has image).
                has_image = bool(request.start_image)

                # Remove image-related parameters that might have been added earlier
                kwargs.pop("start_image", None)
                kwargs.pop("end_image", None)
                # LTX2ImageToVideoPipeline takes an 'image' parameter for image-to-video;
                # LTX2Pipeline (text-to-video) does not accept an image parameter.
                if has_image:
                    if self.PipelineType == "LTX2ImageToVideoPipeline":
                        image = load_image(request.start_image)
                        kwargs["image"] = image
                        print("LTX-2: Using image-to-video mode with image", file=sys.stderr)
                    else:
                        # LTX2Pipeline cannot consume an input image
                        return backend_pb2.Result(success=False, message="LTX2Pipeline does not support image-to-video. Use LTX2ImageToVideoPipeline for image-to-video generation.")
                else:
                    # Text-to-video: ensure no image-related kwargs are present
                    kwargs.pop("image", None)
                    print("LTX-2: Using text-to-video mode (no image)", file=sys.stderr)

                # LTX-2 uses 'frame_rate' instead of 'fps'
                frame_rate = float(fps)
                kwargs["frame_rate"] = frame_rate
                # LTX-2 requires output_type="np" and return_dict=False
                kwargs["output_type"] = "np"
                kwargs["return_dict"] = False

                # Generate video and audio
                print(f"LTX-2: Generating with kwargs: {kwargs}", file=sys.stderr)
                try:
                    video, audio = self.pipe(**kwargs)
                    print(f"LTX-2: Generated video shape: {video.shape}, audio shape: {audio.shape}", file=sys.stderr)
                except Exception as e:
                    print(f"LTX-2: Error during pipe() call: {e}", file=sys.stderr)
                    traceback.print_exc()
                    return backend_pb2.Result(success=False, message=f"Error generating video with LTX-2 pipeline: {e}")

                # Convert video to uint8 format
                video = (video * 255).round().astype("uint8")
                video = torch.from_numpy(video)

                print(f"LTX-2: Converted video, shape after conversion: {video.shape}", file=sys.stderr)
                print(f"LTX-2: Audio sample rate: {self.pipe.vocoder.config.output_sampling_rate}", file=sys.stderr)
                print(f"LTX-2: Output path: {request.dst}", file=sys.stderr)

                # Use LTX-2's encode_video function, which muxes in the audio
                try:
                    ltx2_encode_video(
                        video[0],
                        fps=frame_rate,
                        audio=audio[0].float().cpu(),
                        audio_sample_rate=self.pipe.vocoder.config.output_sampling_rate,
                        output_path=request.dst,
                    )
                    # Verify the file was created and has content
                    import os
                    if os.path.exists(request.dst):
                        file_size = os.path.getsize(request.dst)
                        print(f"LTX-2: Video file created successfully, size: {file_size} bytes", file=sys.stderr)
                        if file_size == 0:
                            return backend_pb2.Result(success=False, message="Video file was created but is empty (0 bytes). Check the LTX-2 encode_video function.")
                    else:
                        return backend_pb2.Result(success=False, message=f"Video file was not created at {request.dst}")
                except Exception as e:
                    print(f"LTX-2: Error encoding video: {e}", file=sys.stderr)
                    traceback.print_exc()
                    return backend_pb2.Result(success=False, message=f"Error encoding video: {e}")

                return backend_pb2.Result(message="Video generated successfully", success=True)
elif self . PipelineType == " WanPipeline " :
2025-08-28 08:26:42 +00:00
# WAN2.2 text-to-video generation
output = self . pipe ( * * kwargs )
frames = output . frames [ 0 ] # WAN2.2 returns frames in this format
elif self . PipelineType == " WanImageToVideoPipeline " :
# WAN2.2 image-to-video generation
if request . start_image :
# Load and resize the input image according to WAN2.2 requirements
image = load_image ( request . start_image )
# Use request dimensions or defaults, but respect WAN2.2 constraints
request_height = request . height if request . height > 0 else 480
request_width = request . width if request . width > 0 else 832
max_area = request_height * request_width
aspect_ratio = image . height / image . width
mod_value = self . pipe . vae_scale_factor_spatial * self . pipe . transformer . config . patch_size [ 1 ]
height = round ( ( max_area * aspect_ratio ) * * 0.5 / mod_value ) * mod_value
width = round ( ( max_area / aspect_ratio ) * * 0.5 / mod_value ) * mod_value
image = image . resize ( ( width , height ) )
kwargs [ " image " ] = image
kwargs [ " height " ] = height
kwargs [ " width " ] = width
output = self . pipe ( * * kwargs )
frames = output . frames [ 0 ]
elif self . img2vid :
# Generic image-to-video generation
if request . start_image :
image = load_image ( request . start_image )
image = image . resize ( ( request . width if request . width > 0 else 1024 ,
request . height if request . height > 0 else 576 ) )
kwargs [ " image " ] = image
output = self . pipe ( * * kwargs )
frames = output . frames [ 0 ]
elif self . txt2vid :
# Generic text-to-video generation
output = self . pipe ( * * kwargs )
frames = output . frames [ 0 ]
else :
2026-01-22 13:09:20 +00:00
print ( f " GenerateVideo: Pipeline { self . PipelineType } does not match any known video pipeline handler " , file = sys . stderr )
2025-08-28 08:26:42 +00:00
return backend_pb2 . Result ( success = False , message = f " Pipeline { self . PipelineType } does not support video generation " )
            # Export video (for non-LTX-2 pipelines)
            print(f"GenerateVideo: Exporting video to {request.dst} with fps={fps}", file=sys.stderr)
            export_to_video(frames, request.dst, fps=fps)

            # Verify the file was created
            import os
            if os.path.exists(request.dst):
                file_size = os.path.getsize(request.dst)
                print(f"GenerateVideo: Video file created, size: {file_size} bytes", file=sys.stderr)
                if file_size == 0:
                    return backend_pb2.Result(success=False, message="Video file was created but is empty (0 bytes)")
            else:
                return backend_pb2.Result(success=False, message=f"Video file was not created at {request.dst}")

            return backend_pb2.Result(message="Video generated successfully", success=True)
        except Exception as err:
            print(f"Error generating video: {err}", file=sys.stderr)
            traceback.print_exc()
            return backend_pb2.Result(success=False, message=f"Error generating video: {err}")
def serve(address):
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=MAX_WORKERS),
                         options=[
                             ('grpc.max_message_length', 50 * 1024 * 1024),  # 50MB
                             ('grpc.max_send_message_length', 50 * 1024 * 1024),  # 50MB
                             ('grpc.max_receive_message_length', 50 * 1024 * 1024),  # 50MB
                         ],
                         interceptors=get_auth_interceptors(),
                         )
    backend_pb2_grpc.add_BackendServicer_to_server(BackendServicer(), server)
    server.add_insecure_port(address)
    server.start()
    print("Server started. Listening on: " + address, file=sys.stderr)

    # Define the signal handler function
    def signal_handler(sig, frame):
        print("Received termination signal. Shutting down...")
        server.stop(0)
        sys.exit(0)

    # Set the signal handlers for SIGINT and SIGTERM
    signal.signal(signal.SIGINT, signal_handler)
    signal.signal(signal.SIGTERM, signal_handler)

    try:
        while True:
            time.sleep(_ONE_DAY_IN_SECONDS)
    except KeyboardInterrupt:
        server.stop(0)
if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Run the gRPC server.")
    parser.add_argument(
        "--addr", default="localhost:50051", help="The address to bind the server to."
    )
    args = parser.parse_args()

    serve(args.addr)
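The LTX-2 branch in GenerateVideo converts float frames in [0, 1] to uint8 via `(video * 255).round().astype("uint8")` before encoding. A pure-Python sketch of that scaling; `frames_to_uint8` is an illustrative helper (the real code operates on a NumPy array and does not clamp, since pipeline output is already in range):

```python
def frames_to_uint8(frame):
    # Scale floats in [0, 1] to ints in [0, 255], clamping out-of-range values;
    # mirrors the (video * 255).round().astype("uint8") step used for LTX-2 output.
    return [min(255, max(0, round(v * 255))) for v in frame]

print(frames_to_uint8([0.0, 0.25, 0.5, 1.0]))  # [0, 64, 128, 255]
```

Note that Python's `round` (like NumPy's) rounds halves to even, so 0.5 maps to 128 rather than 127.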