LocalAI/pkg
Ettore Di Giacinto 6d5bde860b
feat(llama.cpp): upgrade and use libmtmd (#5379)
* WIP

* wip

* wip

* Make it compile

* Update json.hpp

* this shouldn't be private for now

* Add logs

* Reset auto detected template

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Re-enable grammars

* This seems to be broken - 360a9c98e1 (diff-a18a8e64e12a01167d8e98fc)[…]cccf0d4eed09d76d879L2998-L3207

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Placeholder

* Simplify image loading

* use completion type

* disable streaming

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* correctly return timings

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Remove some debug logging

* Adapt tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Keep header

* embedding: do not use oai type

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Sync from server.cpp

* Use utils and json directly from llama.cpp

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Sync with upstream

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: copy json.hpp from the correct location

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: add httplib

* sync llama.cpp

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Embeddings: set OAICOMPAT_TYPE_EMBEDDING

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: sync with server.cpp by including it

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* make it darwin-compatible

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-05-17 16:02:53 +02:00
assets fix: use rice when embedding large binaries (#5309) 2025-05-04 16:42:42 +02:00
concurrency chore: update jobresult_test.go (#4124) 2024-11-12 08:52:18 +01:00
downloader fix: typos (#5376) 2025-05-16 12:45:48 +02:00
functions chore(deps): update llama.cpp and sync with upstream changes (#4950) 2025-03-06 00:40:58 +01:00
grpc feat(video-gen): add endpoint for video generation (#5247) 2025-04-26 18:05:01 +02:00
langchain feat(llama.cpp): do not specify backends to autoload and add llama.cpp variants (#2232) 2024-05-04 17:56:12 +02:00
library fix: use rice when embedding large binaries (#5309) 2025-05-04 16:42:42 +02:00
model fix: typos (#5376) 2025-05-16 12:45:48 +02:00
oci chore: fix go.mod module (#2635) 2024-06-23 08:24:36 +00:00
startup chore: drop embedded models (#4715) 2025-01-30 00:03:01 +01:00
store chore: fix go.mod module (#2635) 2024-06-23 08:24:36 +00:00
templates feat(llama.cpp): upgrade and use libmtmd (#5379) 2025-05-17 16:02:53 +02:00
utils feat(tts): Implement naive response_format for tts endpoint (#4035) 2024-11-02 19:13:35 +00:00
xsync chore: fix go.mod module (#2635) 2024-06-23 08:24:36 +00:00
xsysinfo fix(gpu): do not assume gpu being returned has node and mem (#5310) 2025-05-03 19:00:24 +02:00