---
sidebar_position: 4
title: Starting an inference server
description: Starting an inference server for a model.
---
# Starting an inference server for a model
Once a model is downloaded, a model service can be started. A model service is an inference server that runs in a container and exposes the model through the well-known chat API common to many providers.
#### Prerequisites

- A model is downloaded.
#### Procedure
1. Click the **Podman AI Lab** icon in the left navigation pane.
2. In the Podman AI Lab navigation bar, click **Services**.
3. Click the **New Model Service** button at the top-right corner of the page. The Creating Model service page opens.

   :::note

   On a macOS machine, you get a notification to create a GPU-enabled Podman machine to run your GPU workloads. Click the **Create GPU enabled machine** button to proceed.

   :::

4. Select the model for which you want to start an inference server from the dropdown list, and edit the port number if needed.
5. Click **Create service**. Starting the inference server for the model takes some time.
6. Click the **Open service details** button.
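Once the service is running, clients can call it over HTTP. Below is a minimal sketch of building such a request, assuming the service exposes an OpenAI-style `/v1/chat/completions` endpoint on `localhost`; the port (`35000` here) is a placeholder for whatever you set in the New Model Service form.

```python
import json
import urllib.request

def build_chat_request(prompt: str, port: int = 35000) -> urllib.request.Request:
    """Build a chat-completion request for a local inference server.

    The endpoint path and port are assumptions; check the service
    details page for the actual values of your model service.
    """
    payload = {
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"http://localhost:{port}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("What is Podman?")
print(req.full_url)  # http://localhost:35000/v1/chat/completions
# To actually send it (requires the service to be running):
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

The service details page also shows ready-made client code snippets for the running service, which you can use instead of writing your own.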