* feat: Add backend gallery
This PR add support to manage backends as similar to models. There is
now available a backend gallery which can be used to install and remove
extra backends.
The backend gallery can be configured similarly as a model gallery, and
API calls allows to install and remove new backends in runtime, and as
well during the startup phase of LocalAI.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add backends docs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* wip: Backend Dockerfile for python backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: drop extras images, build python backends separately
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fixup on all backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* test CI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Tweaks
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Drop old backends leftovers
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixup CI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Move dockerfile upper
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fix proto
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Feature dropped for consistency - we prefer model galleries
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add missing packages in the build image
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* exllama is ponly available on cublas
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* pin torch on chatterbox
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixups to index
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* CI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Debug CI
* Install accellerators deps
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add target arch
* Add cuda minor version
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Use self-hosted runners
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* ci: use quay for test images
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fixups for vllm and chatterbox
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Small fixups on CI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chatterbox is only available for nvidia
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Simplify CI builds
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Adapt test, use qwen3
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(model gallery): add jina-reranker-v1-tiny-en-gguf
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix(gguf-parser): recover from potential panics that can happen while reading ggufs with gguf-parser
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Use reranker from llama.cpp in AIO images
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Limit concurrent jobs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
* Add note to help run nvidia containers with SELinux
* Use correct CUDA container references as noted in the dockerhub overview
* Clean trailing whitespaces
The GGML format is now dead, since in the next version of LocalAI we
already bring many breaking compatibility changes, taking the occasion
also to drop ggml support (pre-gguf).
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Since the remote gallery was introduced this is now completely
superseded by it. In order to keep the code clean and remove redudant
parts let's simplify the usage.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Rename LocalAI-Extra-Usage -> Extra-Usage, add MACHINE_TAG as cli flag option, add docs about extra-usage and machine-tag
Signed-off-by: mintyleaf <mintyleafdev@gmail.com>
* feat(backend): add stablediffusion-ggml
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(ci): track stablediffusion-ggml
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Use default scheduler and sampler if not specified
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Move cfg scale out of diffusers block
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Make it working
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix: set free_params_immediately to false to call the model in sequence
https://github.com/leejet/stable-diffusion.cpp/issues/366
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(backends): Drop bert.cpp
use llama.cpp 3.2 as a drop-in replacement for bert.cpp
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(tests): make test more robust
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
update link to examples which have moved to their own repository
Signed-off-by: Philipp Seelig <philipp@daxbau.net>
Co-authored-by: Philipp Seelig <philipp@daxbau.net>
Co-authored-by: Dave <dave@gray101.com>
* chore(cli): be consistent between workers and expose ExtraLLamaCPPArgs to both
Fixes: https://github.com/mudler/LocalAI/issues/3427
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* bump grpcio
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Refer to the OpenAI documentation to update the openai-functions documentation
I saw the openai official website, apIn the description: The parameters `function_call` and `functions` have been replaced by `tool_choice` and `tools`.So I submitted this update;But I haven't read the code of localai, so I'm not sure if it also applies to localai.
Signed-off-by: 四少爷 <sex@jermey.cn>
* Update Usage Example
The original usage example was too outdated, and calling with the new version of the openai python package would result in errors. Therefore, the curl example was rewritten (as curl examples are also used elsewhere).
Signed-off-by: 四少爷 <sex@jermey.cn>
* add python example
Signed-off-by: 四少爷 <sex@jermey.cn>
---------
Signed-off-by: 四少爷 <sex@jermey.cn>
* feat(functions): enhance parsing with broken JSON when we parse the raw results
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* breaking: make function name by default
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(grammar): dynamically generate grammars with mutating keys
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactor: simplify condition
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Update docs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* docs(swagger): finish convering gallery section
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* docs: add section to explain how to install models with local-ai run
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Minor docs adjustments
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(llama.cpp): add embeddings
Also enable embeddings by default for llama.cpp models
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix(Makefile): prepare llama.cpp sources only once
Otherwise we keep cloning llama.cpp for each of the variants
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* do not set embeddings to false
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* docs: add embeddings to the YAML config reference
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(install.sh): support federated install
This allows to support federation by exposing:
- FEDERATED: true/false to share the instance
- FEDERATED_SERVER: true/false to start the federated load balancer (it
forwards requests to the federation)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* docs: update installer parameters
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Dave <dave@gray101.com>
This allows LocalAI to be less noisy avoiding to connect outside.
Needed if e.g. there is no plan into using p2p across separate networks.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Added some OpenVINO models
Added Phi-3 trust_remote_code: true
Added Hermes 2 Pro Llama3
Added Multilingual-E5-base embedding model with OpenVINO acceleration (CPU and XPU)
Added all-MiniLM-L6-v2 with OpenVINO acceleration (CPU and XPU)
* Added Remote Code for phi, fixed error on Yamllint
* update openvino.yaml
I need to go to rest: today is not my day...
* Update model-gallery.md with correct gallery file
The readme points to a file that hasn't been updated in months so when there are announcements about new models, user's won't get them pointing to the old file. Point to the updated files instead.
Signed-off-by: QuinnPiers <167640194+QuinnPiers@users.noreply.github.com>
* Update model-gallery.md
second pass with more understanding
Signed-off-by: QuinnPiers <167640194+QuinnPiers@users.noreply.github.com>
* Update model-gallery.md
Signed-off-by: QuinnPiers <167640194+QuinnPiers@users.noreply.github.com>
* Update model-gallery.md
Signed-off-by: QuinnPiers <167640194+QuinnPiers@users.noreply.github.com>
---------
Signed-off-by: QuinnPiers <167640194+QuinnPiers@users.noreply.github.com>
* feat: enable polling configs for systems with broken fsnotify (docker volumes on windows)
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* fix: update logging to make it clear that the config file is being polled
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
---------
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* fix: initial work towards not committing generated files to the repository
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* feat: improve build docs
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* fix: remove unused folder from .dockerignore and .gitignore
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* fix: attempt to fix extra backend tests
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* fix: attempt to fix other tests
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* fix: more test fixes
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* fix: fix apple tests
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* fix: more extras tests fixes
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* fix: add GOBIN to PATH in docker build
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* fix: extra tests and Dockerfile corrections
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* fix: remove build dependency checks
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* fix: add golang protobuf compilers to tests-linux action
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* fix: ensure protogen is run for extra backend installs
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* fix: use newer protobuf
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* fix: more missing protoc binaries
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* fix: missing dependencies during docker build
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* fix: don't install grpc compilers in the final stage if they aren't needed
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* fix: python-grpc-tools in 22.04 repos is too old
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* fix: add a couple of extra build dependencies to Makefile
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* fix: unbreak container rebuild functionality
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
---------
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* feat: migrate to alecthomas/kong for CLI
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* feat: bring in new flag for granular log levels
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* chore: go mod tidy
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* feat: allow loading cli flag values from ["./localai.yaml", "~/.config/localai.yaml", "/etc/localai.yaml"] in that order
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* feat: load from .env file instead of a yaml file
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* feat: better loading for environment files
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* feat(doc): add initial documentation about configuration
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* fix: remove test log lines
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* feat: integrate new documentation into existing pages
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* feat: add documentation on .env files
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* fix: cleanup some documentation table errors
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* feat: refactor CLI logic out to it's own package under core/cli
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
---------
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
* fix(seed): generate random seed per-request if -1 is set
Also update ci with new workflows and allow the aio tests to run with an
api key
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* docs(openvino): Add OpenVINO example
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* readme: update quickstart
* aio(gpu): fix dreamshaper
* tests(aio): allow to run tests also against an endpoint
* docs: split content
* tests: less verbosity
---------
Co-authored-by: Dave <dave@gray101.com>
* docs(aio): Add AIO images docs
* add image generation link to quickstart
* while reviewing I noticed this one link was missing, so quickly adding it.
Signed-off-by: Dave <dave@gray101.com>
Co-authored-by: Dave <dave@gray101.com>
* docs(mac): Improve documentation for mac build
- added documentation to build from current master
- added troubleshooting information
Signed-off-by: Sebastian <tauven@gmail.com>
* docs(max): fix typo
Signed-off-by: Sebastian <tauven@gmail.com>
---------
Signed-off-by: Sebastian <tauven@gmail.com>
* feat(elevenlabs): map elevenlabs API support to TTS
This allows elevenlabs Clients to work automatically with LocalAI by
supporting the elevenlabs API.
The elevenlabs server endpoint is implemented such as it is wired to the
TTS endpoints.
Fixes: https://github.com/mudler/LocalAI/issues/1809
* feat(openai/tts): compat layer with openai tts
Fixes: #1276
* fix: adapt tts CLI
The default sampler on some models don't return enough candidates which
leads to a false sense of randomness. Tracing back the code it looks
that with the temperature sampler there might not be enough
candidates to pick from, and since the seed and "randomness" take effect
while picking a good candidate this yields to the same results over and
over.
Fixes https://github.com/mudler/LocalAI/issues/1723 by updating the
examples and documentation to use mirostat instead.
* feat(refactor): refactor config and input reading
* feat(tts): read config file for TTS
* examples(kubernetes): Add simple deployment example
* examples(kubernetes): Add simple deployment for intel arc
* docs(sycl): add sycl example
* feat(tts): do not always pick a first model
* fixups to run vall-e-x on container
* Correctly resolve backend
* move downloader out
* separate startup functions for preloading configuration files
* docs: add popular model examples
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* shorteners
* Add llava
* Add mistral-openorca
* Better link to build section
* docs: update
* fixup
* Drop code dups
* Minor fixups
* Apply suggestions from code review
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
* ci: try to cache gRPC build during tests
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* ci: do not build all images for tests, just necessary
* ci: cache gRPC also in release pipeline
* fixes
* Update model_preload_test.go
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
* feat: allow to pass by models via args
* expose it also as an env/arg
* docs: enhancements to build/requirements
* do not display status always
* print download status
* not all mesages are debug
* feat(img2vid): Initial support for img2vid
* doc(SD): fix SDXL Example
* Minor fixups for img2vid
* docs(img2img): fix example curl call
* feat(txt2vid): initial support
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
* diffusers: be retro-compatible with CUDA settings
* docs(img2vid, txt2vid): examples
* Add notice on docs
---------
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
* Use cuda in transformers if available
tensorflow probably needs a different check.
Signed-off-by: Erich Schubert <kno10@users.noreply.github.com>
* feat: expose CUDA at top level
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* tests: add to tests and create workflow for py extra backends
* doc: update note on how to use core images
---------
Signed-off-by: Erich Schubert <kno10@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Erich Schubert <kno10@users.noreply.github.com>
* Update docs for new requirements.txt path
Signed-off-by: Marcus Köhler <khler.marcus@gmail.com>
* Fix typo (.PONY -> .PHONY) in python backend makefiles
Signed-off-by: Marcus Köhler <khler.marcus@gmail.com>
---------
Signed-off-by: Marcus Köhler <khler.marcus@gmail.com>
* Update path to sentencetransformers backend for local execution
Signed-off-by: Marcus Köhler <khler.marcus@gmail.com>
* Rename huggingface-embeddings -> sentencetransformers in embeddings.md for consistency with the backend structure
The Dockerfile still knows the "huggingface-embeddings"
backend (I assume for compatibility reasons) but uses the
sentencetransformers backend under the hood anyway.
I figured it would be good to update the docs to use the new naming to
make it less confusing moving forward. As the docker container knows
both the "huggingface-embeddings" and the "sentencetransformers"
backend, this should not break anything.
Signed-off-by: Marcus Köhler <khler.marcus@gmail.com>
---------
Signed-off-by: Marcus Köhler <khler.marcus@gmail.com>