---
sidebar_position: 5
slug: /python_api_reference
sidebar_custom_props: {
  categoryIcon: SiPython
}
---
# Python API

A complete reference for RAGFlow's Python APIs. Before proceeding, please ensure you [have your RAGFlow API key ready for authentication](https://ragflow.io/docs/dev/acquire_ragflow_api_key).

:::tip NOTE
Run the following command to install the Python SDK:

```bash
pip install ragflow-sdk
```

:::

---
## ERROR CODES
---
| Code | Message | Description |
|------|-----------------------|----------------------------|
| 400 | Bad Request | Invalid request parameters |
| 401 | Unauthorized | Unauthorized access |
| 403 | Forbidden | Access denied |
| 404 | Not Found | Resource not found |
| 500 | Internal Server Error | Server internal error |
| 1001 | Invalid Chunk ID | Invalid Chunk ID |
| 1002 | Chunk Update Failed | Chunk update failed |
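
The codes above can be kept in a small lookup table when reporting failures in client code. The following is a minimal sketch: `ERROR_CODES` and `describe_error` are illustrative names, not part of `ragflow-sdk`, and how a code reaches your program (for example, inside an exception message) is not specified in this section.

```python
# Illustrative lookup built from the table above; not part of ragflow-sdk.
ERROR_CODES = {
    400: ("Bad Request", "Invalid request parameters"),
    401: ("Unauthorized", "Unauthorized access"),
    403: ("Forbidden", "Access denied"),
    404: ("Not Found", "Resource not found"),
    500: ("Internal Server Error", "Server internal error"),
    1001: ("Invalid Chunk ID", "Invalid Chunk ID"),
    1002: ("Chunk Update Failed", "Chunk update failed"),
}


def describe_error(code: int) -> str:
    """Return 'code message: description', or a fallback for unknown codes."""
    if code in ERROR_CODES:
        message, description = ERROR_CODES[code]
        return f"{code} {message}: {description}"
    return f"{code} Unknown error"


print(describe_error(404))
```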
---
## OpenAI-Compatible API
---
### Create chat completion
Creates a model response for the given historical chat conversation via OpenAI's API.
#### Parameters
##### model: `str`, *Required*
The model used to generate the response. The server will parse this automatically, so you can set it to any value for now.
##### messages: `list[object]`, *Required*
A list of historical chat messages used to generate the response. This must contain at least one message with the `user` role.
##### stream: `boolean`
Whether to receive the response as a stream. Set this to `false` explicitly if you prefer to receive the entire response in one go instead of as a stream.
#### Returns
- Success: A response [message](https://platform.openai.com/docs/api-reference/chat/create) in the same format as OpenAI's.
- Failure: `Exception`
#### Examples

:::caution NOTE
Streaming via `client.chat.completions.create(stream=True, ...)` does not currently return `reference`, because `reference` is *only* included in the raw response payload in non-stream mode. To retrieve `reference`, set `stream=False` and read the raw response via `with_raw_response`.
:::

```python
from openai import OpenAI
import json

model = "model"
client = OpenAI(api_key="ragflow-api-key", base_url="http://ragflow_address/api/v1/chats_openai/<chat_id>")

stream = True
reference = True

request_kwargs = dict(
    model=model,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who are you?"},
        {"role": "assistant", "content": "I am an AI assistant named..."},
        {"role": "user", "content": "Can you tell me how to install neovim"},
    ],
    extra_body={
        "extra_body": {
            "reference": reference,
            "reference_metadata": {
                "include": True,
                "fields": ["author", "year", "source"],
            },
        }
    },
)

if stream:
    # Streaming mode: chunks are printed as they arrive (no `reference`).
    completion = client.chat.completions.create(stream=True, **request_kwargs)
    for chunk in completion:
        print(chunk)
else:
    # Non-stream mode: read the raw payload to access `reference`.
    resp = client.chat.completions.with_raw_response.create(
        stream=False, **request_kwargs
    )
    print("status:", resp.http_response.status_code)
    raw_text = resp.http_response.text
    print("raw:", raw_text)

    data = json.loads(raw_text)
    print("assistant:", data["choices"][0]["message"].get("content"))
    print("reference:", data["choices"][0]["message"].get("reference"))
```

When `extra_body.reference_metadata.include` is `true`, each reference chunk may include a `document_metadata` object in both streaming and non-streaming responses.
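
The non-stream branch of the example can be factored into a small helper that takes the raw response text and returns the answer together with its `reference`. This is a sketch only: `extract_answer_and_reference` is a hypothetical name, and the sample payload is made up for illustration, since the structure of `reference` is not documented in this section.

```python
import json


def extract_answer_and_reference(raw_text: str):
    """Parse a non-stream response body and return (content, reference).

    Either value is None when the corresponding key is absent.
    """
    data = json.loads(raw_text)
    message = data["choices"][0]["message"]
    return message.get("content"), message.get("reference")


# Hypothetical payload for illustration; real responses carry more fields.
sample = json.dumps({"choices": [{"message": {"content": "Hello"}}]})
content, reference = extract_answer_and_reference(sample)
print(content, reference)
```

With the example above, it would be called as `extract_answer_and_reference(resp.http_response.text)`.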
## DATASET MANAGEMENT
---
### Create dataset

```python
RAGFlow.create_dataset(
    name: str,
    avatar: Optional[str] = None,
    description: Optional[str] = None,
    embedding_model: Optional[str] = "BAAI/bge-large-zh-v1.5@BAAI",
    permission: str = "me",
    chunk_method: str = "naive",
    parser_config: DataSet.ParserConfig = None
) -> DataSet
```

Creates a dataset.

#### Parameters

##### name: `str`, *Required*

The unique name of the dataset to create. It must adhere to the following requirements:

- Maximum 128 characters.
- Case-insensitive.

##### avatar: `str`

Base64 encoding of the avatar. Defaults to `None`.

##### description: `str`

A brief description of the dataset to create. Defaults to `None`.

##### permission: `str`

Specifies who can access the dataset to create. Available options:

- `"me"`: (Default) Only you can manage the dataset.
- `"team"`: All team members can manage the dataset.

##### chunk_method: `str`

The chunking method of the dataset to create. Available options:

- `"naive"`: General (default)
- `"manual"`: Manual
- `"qa"`: Q&A
- `"table"`: Table
- `"paper"`: Paper
- `"book"`: Book
- `"laws"`: Laws
- `"presentation"`: Presentation
- `"picture"`: Picture
- `"one"`: One
- `"email"`: Email

##### parser_config

The parser configuration of the dataset. A `ParserConfig` object's attributes vary based on the selected `chunk_method`:

- `chunk_method`=`"naive"`:  
  `{"chunk_token_num":512,"delimiter":"\\n","html4excel":False,"layout_recognize":True,"raptor":{"use_raptor":False}}`
- `chunk_method`=`"qa"`:  
  `{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"manual"`:  
  `{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"table"`:  
  `None`
- `chunk_method`=`"paper"`:  
  `{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"book"`:  
  `{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"laws"`:  
  `{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"picture"`:  
  `None`
- `chunk_method`=`"presentation"`:  
  `{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"one"`:  
  `None`
- `chunk_method`=`"knowledge-graph"`:  
  `{"chunk_token_num":128,"delimiter":"\\n","entity_types":["organization","person","location","event","time"]}`
- `chunk_method`=`"email"`:  
  `None`
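
The per-method defaults listed above can be captured in a lookup table, which is handy when you want to start from a default and override a single field. A minimal sketch, assuming the values are transcribed from this list as written; `DEFAULT_PARSER_CONFIGS` and `default_parser_config` are illustrative names that do not exist in `ragflow-sdk`.

```python
import copy

# Shared default for methods whose only documented setting is RAPTOR being off.
RAPTOR_OFF = {"raptor": {"use_raptor": False}}

# Transcribed from the list above; None means "no parser configuration".
DEFAULT_PARSER_CONFIGS = {
    "naive": {
        "chunk_token_num": 512,
        "delimiter": "\\n",
        "html4excel": False,
        "layout_recognize": True,
        "raptor": {"use_raptor": False},
    },
    "qa": RAPTOR_OFF,
    "manual": RAPTOR_OFF,
    "table": None,
    "paper": RAPTOR_OFF,
    "book": RAPTOR_OFF,
    "laws": RAPTOR_OFF,
    "picture": None,
    "presentation": RAPTOR_OFF,
    "one": None,
    "knowledge-graph": {
        "chunk_token_num": 128,
        "delimiter": "\\n",
        "entity_types": ["organization", "person", "location", "event", "time"],
    },
    "email": None,
}


def default_parser_config(chunk_method: str):
    """Return an independent copy of the documented default for chunk_method."""
    if chunk_method not in DEFAULT_PARSER_CONFIGS:
        raise ValueError(f"Unknown chunk_method: {chunk_method!r}")
    return copy.deepcopy(DEFAULT_PARSER_CONFIGS[chunk_method])


config = default_parser_config("naive")
config["chunk_token_num"] = 256  # override one field; the table stays untouched
print(config)
```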

#### Returns

- Success: A `DataSet` object.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
dataset = rag_object.create_dataset(name="kb_1")
```

---
### Delete datasets

```python
RAGFlow.delete_datasets(ids: list[str] | None = None, delete_all: bool = False)
```

Deletes datasets by ID.

#### Parameters

##### ids: `list[str]` or `None`

The IDs of the datasets to delete. Defaults to `None`.

- If omitted, or set to `None` or an empty list, no datasets are deleted unless `delete_all` is `True`.
- If a list of IDs is provided, only the datasets matching those IDs are deleted.

##### delete_all: `bool`

Whether to delete all datasets owned by the current user when `ids` is omitted, or set to `None` or an empty list. Defaults to `False`.
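
The interaction between `ids` and `delete_all` can be summarized as a small decision function. This is one reading of the rules above, not the SDK's implementation: `deletion_scope` is a hypothetical helper, and the outcome when a non-empty `ids` is combined with `delete_all=True` is an assumption (here, the explicit IDs take precedence).

```python
from typing import Optional


def deletion_scope(ids: Optional[list] = None, delete_all: bool = False) -> str:
    """Return "selected", "all", or "none" for a given argument combination."""
    if ids:
        # A non-empty list of IDs: only those items are targeted (assumed to
        # take precedence over delete_all).
        return "selected"
    if delete_all:
        # ids omitted, None, or empty, and delete_all=True: everything goes.
        return "all"
    # ids omitted, None, or empty, and delete_all=False: nothing is deleted.
    return "none"


print(deletion_scope())
print(deletion_scope(ids=["id_1"]))
print(deletion_scope(delete_all=True))
```

Checking the scope before calling a delete method may help avoid an unintended bulk deletion.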

#### Returns

- Success: No value is returned.
- Failure: `Exception`

#### Examples

```python
rag_object.delete_datasets(ids=["d94a8dc02c9711f0930f7fbc369eab6d","e94a8dc02c9711f0930f7fbc369eab6e"])
rag_object.delete_datasets(delete_all=True)
```

---
### List datasets

```python
RAGFlow.list_datasets(
    page: int = 1,
    page_size: int = 30,
    orderby: str = "create_time",
    desc: bool = True,
    id: str = None,
    name: str = None,
    include_parsing_status: bool = False
) -> list[DataSet]
```

Lists datasets.

#### Parameters

##### page: `int`

Specifies the page on which the datasets will be displayed. Defaults to `1`.

##### page_size: `int`

The number of datasets on each page. Defaults to `30`.

##### orderby: `str`

The field by which datasets should be sorted. Available options:

- `"create_time"` (default)
- `"update_time"`

##### desc: `bool`

Indicates whether the retrieved datasets should be sorted in descending order. Defaults to `True`.

##### id: `str`

The ID of the dataset to retrieve. Defaults to `None`.

##### name: `str`

The name of the dataset to retrieve. Defaults to `None`.

##### include_parsing_status: `bool`

Whether to include document parsing status counts in each returned `DataSet` object. Defaults to `False`. When set to `True`, each `DataSet` object will include the following additional attributes:

- `unstart_count`: `int` Number of documents whose parsing has not started.
- `running_count`: `int` Number of documents currently being parsed.
- `cancel_count`: `int` Number of documents whose parsing was cancelled.
- `done_count`: `int` Number of documents that have been successfully parsed.
- `fail_count`: `int` Number of documents whose parsing failed.
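
The five counters above can be aggregated across datasets to get an overall picture of parsing progress. The sketch below uses `SimpleNamespace` objects as stand-ins for `DataSet` objects so that it runs without a server; `total_parsing_status` is an illustrative helper, not an SDK function.

```python
from types import SimpleNamespace

COUNT_FIELDS = (
    "unstart_count",
    "running_count",
    "cancel_count",
    "done_count",
    "fail_count",
)


def total_parsing_status(datasets) -> dict:
    """Sum each parsing-status counter over all given dataset objects."""
    totals = dict.fromkeys(COUNT_FIELDS, 0)
    for dataset in datasets:
        for field in COUNT_FIELDS:
            # Missing attributes count as 0 (e.g. include_parsing_status=False).
            totals[field] += getattr(dataset, field, 0)
    return totals


# Stand-ins for the objects returned by list_datasets(include_parsing_status=True).
fake_datasets = [
    SimpleNamespace(unstart_count=1, running_count=0, cancel_count=0, done_count=5, fail_count=2),
    SimpleNamespace(unstart_count=0, running_count=3, cancel_count=1, done_count=4, fail_count=0),
]
totals = total_parsing_status(fake_datasets)
print(totals)
```

With a live connection, the same helper could be called as `total_parsing_status(rag_object.list_datasets(include_parsing_status=True))`.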

#### Returns

- Success: A list of `DataSet` objects.
- Failure: `Exception`.

#### Examples

##### List all datasets

```python
for dataset in rag_object.list_datasets():
    print(dataset)
```

##### Retrieve a dataset by ID

```python
dataset = rag_object.list_datasets(id="id_1")
print(dataset[0])
```

##### List datasets with parsing status

```python
for dataset in rag_object.list_datasets(include_parsing_status=True):
    print(dataset.done_count, dataset.fail_count, dataset.running_count)
```

---
### Update dataset

```python
DataSet.update(update_message: dict)
```

Updates configurations for the current dataset.

#### Parameters

##### update_message: `dict[str, str|int]`, *Required*

A dictionary representing the attributes to update, with the following keys:

- `"name"`: `str` The revised name of the dataset.
  - Basic Multilingual Plane (BMP) only
  - Maximum 128 characters
  - Case-insensitive
- `"avatar"`: `str` The updated base64 encoding of the avatar.
  - Maximum 65535 characters
- `"embedding_model"`: `str` The updated embedding model name.
  - Ensure that `"chunk_count"` is `0` before updating `"embedding_model"`.
  - Maximum 255 characters
  - Must follow the `model_name@model_factory` format
- `"permission"`: `str` The updated dataset permission. Available options:
  - `"me"`: (Default) Only you can manage the dataset.
  - `"team"`: All team members can manage the dataset.
- `"pagerank"`: `int` The page rank of the dataset. Refer to [Set page rank](https://ragflow.io/docs/dev/set_page_rank).
  - Default: `0`
  - Minimum: `0`
  - Maximum: `100`
- `"chunk_method"`: `str` The chunking method for the dataset. Available options:
  - `"naive"`: General (default)
  - `"book"`: Book
  - `"email"`: Email
  - `"laws"`: Laws
  - `"manual"`: Manual
  - `"one"`: One
  - `"paper"`: Paper
  - `"picture"`: Picture
  - `"presentation"`: Presentation
  - `"qa"`: Q&A
  - `"table"`: Table
  - `"tag"`: Tag
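
The constraints listed above can be checked on the client before calling `DataSet.update()`. The following is a sketch of such a pre-flight check; `validate_update_message` is a hypothetical helper, it covers only the limits stated in this section, and the server may enforce additional rules.

```python
ALLOWED_PERMISSIONS = {"me", "team"}
ALLOWED_CHUNK_METHODS = {
    "naive", "book", "email", "laws", "manual", "one",
    "paper", "picture", "presentation", "qa", "table", "tag",
}


def validate_update_message(update_message: dict) -> list[str]:
    """Return a list of problems; an empty list means no documented rule is broken."""
    problems = []

    name = update_message.get("name")
    if name is not None:
        if len(name) > 128:
            problems.append("name exceeds 128 characters")
        if any(ord(ch) > 0xFFFF for ch in name):
            problems.append("name contains characters outside the BMP")

    avatar = update_message.get("avatar")
    if avatar is not None and len(avatar) > 65535:
        problems.append("avatar exceeds 65535 characters")

    model = update_message.get("embedding_model")
    if model is not None:
        model_name, separator, factory = model.rpartition("@")
        if len(model) > 255:
            problems.append("embedding_model exceeds 255 characters")
        if not (separator and model_name and factory):
            problems.append("embedding_model must follow model_name@model_factory")

    permission = update_message.get("permission")
    if permission is not None and permission not in ALLOWED_PERMISSIONS:
        problems.append("permission must be 'me' or 'team'")

    pagerank = update_message.get("pagerank")
    if pagerank is not None and not (isinstance(pagerank, int) and 0 <= pagerank <= 100):
        problems.append("pagerank must be an integer between 0 and 100")

    chunk_method = update_message.get("chunk_method")
    if chunk_method is not None and chunk_method not in ALLOWED_CHUNK_METHODS:
        problems.append("chunk_method is not a documented option")

    return problems


print(validate_update_message({"pagerank": 101}))
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a single pass.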

#### Returns

- Success: No value is returned.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
dataset = rag_object.list_datasets(name="kb_name")
dataset = dataset[0]
dataset.update({"embedding_model": "BAAI/bge-zh-v1.5", "chunk_method": "manual"})
```

---
## FILE MANAGEMENT WITHIN DATASET
---
### Upload documents

```python
DataSet.upload_documents(document_list: list[dict])
```

Uploads documents to the current dataset.

#### Parameters

##### document_list: `list[dict]`, *Required*

A list of dictionaries representing the documents to upload, each containing the following keys:

- `"display_name"`: (Optional) The file name to display in the dataset.
- `"blob"`: (Optional) The binary content of the file to upload.
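
A `document_list` in the shape described above can be assembled from local file paths. This is a minimal sketch: `build_document_list` is an illustrative helper rather than an SDK function, and it reads each file fully into memory, which may not suit very large files.

```python
import tempfile
from pathlib import Path


def build_document_list(paths) -> list[dict]:
    """Build upload entries with "display_name" and "blob" keys from file paths."""
    documents = []
    for raw_path in paths:
        path = Path(raw_path).expanduser()  # expand a leading "~" if present
        documents.append({"display_name": path.name, "blob": path.read_bytes()})
    return documents


# Demonstration with a temporary file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample = Path(tmp_dir) / "1.txt"
    sample.write_bytes(b"hello")
    document_list = build_document_list([sample])

print(document_list[0]["display_name"])
```

The resulting list can then be passed to `dataset.upload_documents(document_list)`.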

#### Returns

- Success: No value is returned.
- Failure: `Exception`

#### Examples

```python
dataset = rag_object.create_dataset(name="kb_name")
dataset.upload_documents([{"display_name": "1.txt", "blob": "<BINARY_CONTENT_OF_THE_DOC>"}, {"display_name": "2.pdf", "blob": "<BINARY_CONTENT_OF_THE_DOC>"}])
```

---
### Update document

```python
Document.update(update_message: dict)
```

Updates configurations for the current document.

#### Parameters

##### update_message: `dict[str, str|dict]`, *Required*

A dictionary representing the attributes to update, with the following keys:

- `"display_name"`: `str` The name of the document to update.
- `"meta_fields"`: `dict[str, Any]` The meta fields of the document.
- `"chunk_method"`: `str` The parsing method to apply to the document.
  - `"naive"`: General
  - `"manual"`: Manual
  - `"qa"`: Q&A
  - `"table"`: Table
  - `"paper"`: Paper
  - `"book"`: Book
  - `"laws"`: Laws
  - `"presentation"`: Presentation
  - `"picture"`: Picture
  - `"one"`: One
  - `"email"`: Email
- `"parser_config"`: `dict[str, Any]` The parsing configuration for the document. Its attributes vary based on the selected `"chunk_method"`:
  - `chunk_method`=`"naive"`:  
    `{"chunk_token_num":128,"delimiter":"\\n","html4excel":False,"layout_recognize":True,"raptor":{"use_raptor":False}}`
  - `chunk_method`=`"qa"`:  
    `{"raptor": {"use_raptor": False}}`
  - `chunk_method`=`"manual"`:  
    `{"raptor": {"use_raptor": False}}`
  - `chunk_method`=`"table"`:  
    `None`
  - `chunk_method`=`"paper"`:  
    `{"raptor": {"use_raptor": False}}`
  - `chunk_method`=`"book"`:  
    `{"raptor": {"use_raptor": False}}`
  - `chunk_method`=`"laws"`:  
    `{"raptor": {"use_raptor": False}}`
  - `chunk_method`=`"presentation"`:  
    `{"raptor": {"use_raptor": False}}`
  - `chunk_method`=`"picture"`:  
    `None`
  - `chunk_method`=`"one"`:  
    `None`
  - `chunk_method`=`"knowledge-graph"`:  
    `{"chunk_token_num":128,"delimiter":"\\n","entity_types":["organization","person","location","event","time"]}`
  - `chunk_method`=`"email"`:  
    `None`

#### Returns

- Success: No value is returned.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
dataset = rag_object.list_datasets(id="id")
dataset = dataset[0]
doc = dataset.list_documents(id="wdfxb5t547d")
doc = doc[0]
# update_message is a single dictionary, as in the signature above.
doc.update({"parser_config": {"chunk_token_num": 256}, "chunk_method": "manual"})
```

---
### Download document

```python
Document.download() -> bytes
```

Downloads the current document.

#### Returns

The downloaded document in bytes.

#### Examples

```python
import os

from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
dataset = rag_object.list_datasets(id="id")
dataset = dataset[0]
doc = dataset.list_documents(id="wdfxb5t547d")
doc = doc[0]
# open() does not expand "~", so resolve the home directory explicitly.
with open(os.path.expanduser("~/ragflow.txt"), "wb") as f:
    f.write(doc.download())
print(doc)
```

---
### List documents

```python
DataSet.list_documents(
    id: str = None,
    keywords: str = None,
    page: int = 1,
    page_size: int = 30,
    orderby: str = "create_time",
    desc: bool = True,
    create_time_from: int = 0,
    create_time_to: int = 0
) -> list[Document]
```

Lists documents in the current dataset.

#### Parameters

##### id: `str`

The ID of the document to retrieve. Defaults to `None`.

##### keywords: `str`

The keywords used to match document titles. Defaults to `None`.

##### page: `int`

Specifies the page on which the documents will be displayed. Defaults to `1`.

##### page_size: `int`

The maximum number of documents on each page. Defaults to `30`.

##### orderby: `str`

The field by which documents should be sorted. Available options:

- `"create_time"` (default)
- `"update_time"`

##### desc: `bool`

Indicates whether the retrieved documents should be sorted in descending order. Defaults to `True`.

##### create_time_from: `int`

Unix timestamp for filtering documents created after this time. `0` means no filter. Defaults to `0`.

##### create_time_to: `int`

Unix timestamp for filtering documents created before this time. `0` means no filter. Defaults to `0`.
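
Because results are returned one page at a time, retrieving everything means looping over `page`. The sketch below shows one way to do that, assuming pages are numbered from `1` (the documented default) and that a page shorter than `page_size` marks the end; `iter_all_pages` and `fake_list_documents` are illustrative names, and the fake function merely stands in for `dataset.list_documents` so the example runs offline.

```python
from typing import Callable, Iterator


def iter_all_pages(list_page: Callable[..., list], page_size: int = 30, **filters) -> Iterator:
    """Yield items from successive pages until a short or empty page is returned."""
    page = 1
    while True:
        items = list_page(page=page, page_size=page_size, **filters)
        yield from items
        if len(items) < page_size:
            break
        page += 1


def fake_list_documents(page: int = 1, page_size: int = 30, **_filters) -> list:
    """Offline stand-in that paginates over seven made-up document names."""
    data = [f"doc_{i}" for i in range(7)]
    start = (page - 1) * page_size
    return data[start:start + page_size]


all_docs = list(iter_all_pages(fake_list_documents, page_size=3))
print(len(all_docs))
```

Against a real dataset, the call would look like `iter_all_pages(dataset.list_documents, page_size=30, keywords="rag")`.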

#### Returns

- Success: A list of `Document` objects.
- Failure: `Exception`.

A `Document` object contains the following attributes:

- `id`: The document ID. Defaults to `""`.
- `name`: The document name. Defaults to `""`.
- `thumbnail`: The thumbnail image of the document. Defaults to `None`.
- `dataset_id`: The dataset ID associated with the document. Defaults to `None`.
- `chunk_method`: The chunking method name. Defaults to `"naive"`.
- `source_type`: The source type of the document. Defaults to `"local"`.
- `type`: Type or category of the document. Defaults to `""`. Reserved for future use.
- `created_by`: `str` The creator of the document. Defaults to `""`.
- `size`: `int` The document size in bytes. Defaults to `0`.
- `token_count`: `int` The number of tokens in the document. Defaults to `0`.
- `chunk_count`: `int` The number of chunks in the document. Defaults to `0`.
- `progress`: `float` The current processing progress as a percentage. Defaults to `0.0`.
- `progress_msg`: `str` A message indicating the current progress status. Defaults to `""`.
- `process_begin_at`: `datetime` The start time of document processing. Defaults to `None`.
- `process_duration`: `float` Duration of the processing in seconds. Defaults to `0.0`.
- `run`: `str` The document's processing status:
  - `"UNSTART"` (default)
  - `"RUNNING"`
  - `"CANCEL"`
  - `"DONE"`
  - `"FAIL"`
- `status`: `str` Reserved for future use.
- `parser_config`: `ParserConfig` Configuration object for the parser. Its attributes vary based on the selected `chunk_method`:
  - `chunk_method`=`"naive"`:  
    `{"chunk_token_num":128,"delimiter":"\\n","html4excel":False,"layout_recognize":True,"raptor":{"use_raptor":False}}`
  - `chunk_method`=`"qa"`:  
    `{"raptor": {"use_raptor": False}}`
  - `chunk_method`=`"manual"`:  
    `{"raptor": {"use_raptor": False}}`
  - `chunk_method`=`"table"`:  
    `None`
  - `chunk_method`=`"paper"`:  
    `{"raptor": {"use_raptor": False}}`
  - `chunk_method`=`"book"`:  
    `{"raptor": {"use_raptor": False}}`
  - `chunk_method`=`"laws"`:  
    `{"raptor": {"use_raptor": False}}`
  - `chunk_method`=`"presentation"`:  
    `{"raptor": {"use_raptor": False}}`
  - `chunk_method`=`"picture"`:  
    `None`
  - `chunk_method`=`"one"`:  
    `None`
  - `chunk_method`=`"email"`:  
    `None`

#### Examples

```python
import os

from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
dataset = rag_object.create_dataset(name="kb_1")

# open() does not expand "~", so resolve the home directory explicitly.
filename1 = os.path.expanduser("~/ragflow.txt")
with open(filename1, "rb") as f:
    blob = f.read()
dataset.upload_documents([{"display_name": os.path.basename(filename1), "blob": blob}])
for doc in dataset.list_documents(keywords="rag", page=1, page_size=12):
    print(doc)
```

---
### Delete documents

```python
DataSet.delete_documents(ids: list[str] | None = None, delete_all: bool = False)
```

Deletes documents by ID.

#### Parameters

##### ids: `list[str]` or `None`

The IDs of the documents to delete. Defaults to `None`.

- If omitted, or set to `None` or an empty list, no documents are deleted unless `delete_all` is `True`.
- If a list of IDs is provided, only the documents matching those IDs are deleted.

##### delete_all: `bool`

Whether to delete all documents in the current dataset when `ids` is omitted, or set to `None` or an empty list. Defaults to `False`.

#### Returns

- Success: No value is returned.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
dataset = rag_object.list_datasets(name="kb_1")
dataset = dataset[0]
dataset.delete_documents(ids=["id_1","id_2"])
dataset.delete_documents(delete_all=True)
```

---
### Parse documents

```python
DataSet.async_parse_documents(document_ids: list[str]) -> None
```

Parses documents in the current dataset.

#### Parameters

##### document_ids: `list[str]`, *Required*

The IDs of the documents to parse.

#### Returns

- Success: No value is returned.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
dataset = rag_object.create_dataset(name="dataset_name")
documents = [
    {"display_name": "test1.txt", "blob": open("./test_data/test1.txt", "rb").read()},
    {"display_name": "test2.txt", "blob": open("./test_data/test2.txt", "rb").read()},
    {"display_name": "test3.txt", "blob": open("./test_data/test3.txt", "rb").read()},
]
dataset.upload_documents(documents)
documents = dataset.list_documents(keywords="test")
ids = []
for document in documents:
    ids.append(document.id)
dataset.async_parse_documents(ids)
print("Async bulk parsing initiated.")
```

---
### Parse documents (with document status)

```python
DataSet.parse_documents(document_ids: list[str]) -> list[tuple[str, str, int, int]]
```

Parses documents in the current dataset and returns the parsing result of each document.

This method wraps `async_parse_documents()`. It waits for all parsing tasks to complete before returning detailed results, including the parsing status and statistics for each document. If a keyboard interruption occurs (e.g., `Ctrl+C`), all pending parsing tasks are cancelled gracefully.

#### Parameters

##### document_ids: `list[str]`, *Required*

The IDs of the documents to parse.

#### Returns

A list of tuples with detailed parsing results:

```python
[
  (document_id: str, status: str, chunk_count: int, token_count: int),
  ...
]
```

- `status`: The final parsing state (e.g., `success`, `failed`, `cancelled`).
- `chunk_count`: The number of content chunks created from the document.
- `token_count`: The total number of tokens processed.

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
# Use an existing dataset that already contains uploaded documents.
dataset = rag_object.list_datasets(name="dataset_name")[0]
documents = dataset.list_documents(keywords="test")
ids = [doc.id for doc in documents]
try:
    finished = dataset.parse_documents(ids)
    for doc_id, status, chunk_count, token_count in finished:
        print(f"Document {doc_id} parsing finished with status: {status}, chunks: {chunk_count}, tokens: {token_count}")
except KeyboardInterrupt:
    print("\nParsing interrupted by user. All pending tasks have been cancelled.")
except Exception as e:
    print(f"Parsing failed: {e}")
```
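
The result tuples described above lend themselves to a quick per-status summary. The following is a minimal sketch and not part of the SDK: `summarize_parse_results` is a hypothetical helper, and the sample tuples are made up using the example status values listed under Returns.

```python
from collections import defaultdict


def summarize_parse_results(results):
    """Aggregate (document_id, status, chunk_count, token_count) tuples by status."""
    summary = defaultdict(lambda: {"documents": 0, "chunks": 0, "tokens": 0})
    for _doc_id, status, chunk_count, token_count in results:
        entry = summary[status]
        entry["documents"] += 1
        entry["chunks"] += chunk_count
        entry["tokens"] += token_count
    return dict(summary)


# Made-up sample data in the shape returned by parse_documents().
sample = [
    ("doc_1", "success", 12, 3400),
    ("doc_2", "success", 8, 2100),
    ("doc_3", "failed", 0, 0),
]
summary = summarize_parse_results(sample)
print(summary)
```

In practice the list returned by `dataset.parse_documents(ids)` would be passed in place of `sample`.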
---
### Stop parsing documents

```python
DataSet.async_cancel_parse_documents(document_ids:list[str])-> None
```

Stops parsing specified documents.

#### Parameters

##### document_ids: `list[str]`, *Required*

The IDs of the documents for which parsing should be stopped.

#### Returns

- Success: No value is returned.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
dataset = rag_object.create_dataset(name="dataset_name")
documents = [
    {'display_name': 'test1.txt', 'blob': open('./test_data/test1.txt',"rb").read()},
    {'display_name': 'test2.txt', 'blob': open('./test_data/test2.txt',"rb").read()},
    {'display_name': 'test3.txt', 'blob': open('./test_data/test3.txt',"rb").read()}
]
dataset.upload_documents(documents)
documents = dataset.list_documents(keywords="test")
# Collect the IDs of the uploaded documents.
ids = []
for document in documents:
    ids.append(document.id)
dataset.async_parse_documents(ids)
print("Async bulk parsing initiated.")
dataset.async_cancel_parse_documents(ids)
print("Async bulk parsing cancelled.")
```
---
## CHUNK MANAGEMENT WITHIN DATASET
---
### Add chunk

```python
Document.add_chunk(content:str, important_keywords:list[str] = [], image_base64:str = None, *, tag_kwd:list[str] = []) -> Chunk
```

Adds a chunk to the current document.

#### Parameters

##### content: `str`, *Required*

The text content of the chunk.

##### important_keywords: `list[str]`

The key terms or phrases to tag with the chunk.

##### image_base64: `str`

A base64-encoded image to associate with the chunk. If the chunk already has an image, the new image will be vertically concatenated below the existing one.

##### tag_kwd: `list[str]`

Tag keywords to associate with the chunk.

#### Returns

- Success: A `Chunk` object.
- Failure: `Exception`.

A `Chunk` object contains the following attributes:

- `id`: `str` The chunk ID.
- `content`: `str` The text content of the chunk.
- `important_keywords`: `list[str]` A list of key terms or phrases tagged with the chunk.
- `tag_kwd`: `list[str]` A list of tag keywords associated with the chunk.
- `image_id`: `str` The image ID associated with the chunk (empty string if no image).
- `create_time`: `str` The time when the chunk was created (added to the document).
- `create_timestamp`: `float` The timestamp representing the creation time of the chunk, expressed in seconds since January 1, 1970.
- `dataset_id`: `str` The ID of the associated dataset.
- `document_name`: `str` The name of the associated document.
- `document_id`: `str` The ID of the associated document.
- `available`: `bool` The chunk's availability status in the dataset. Value options:
  - `False`: Unavailable
  - `True`: Available (default)

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
datasets = rag_object.list_datasets(id="123")
dataset = datasets[0]
doc = dataset.list_documents(id="wdfxb5t547d")
doc = doc[0]
chunk = doc.add_chunk(content="xxxxxxx")
```

Adding a chunk with an image:

```python
import base64

# Read the image and encode it as a base64 string.
with open("image.jpg", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode()
chunk = doc.add_chunk(content="description of image", image_base64=img_b64)
```
---
### List chunks

```python
Document.list_chunks(keywords: str = None, page: int = 1, page_size: int = 30, id: str = None) -> list[Chunk]
```

Lists chunks in the current document.

#### Parameters

##### keywords: `str`

The keywords used to match chunk content. Defaults to `None`.

##### page: `int`

Specifies the page on which the chunks will be displayed. Defaults to `1`.

##### page_size: `int`

The maximum number of chunks on each page. Defaults to `30`.

##### id: `str`

The ID of the chunk to retrieve. Defaults to `None`.

#### Returns

- Success: A list of `Chunk` objects.
- Failure: `Exception`.

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
dataset = rag_object.list_datasets(id="123")
dataset = dataset[0]
docs = dataset.list_documents(keywords="test", page=1, page_size=12)
for chunk in docs[0].list_chunks(keywords="rag", page=1, page_size=12):
    print(chunk)
```
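
Because `page` and `page_size` return one page at a time, collecting every chunk means walking the pages in order. The sketch below shows one way to do that under two assumptions that the reference does not state explicitly: pages start at `1`, and a page shorter than `page_size` is the last one. `iter_all_pages` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for `Document.list_chunks` so the snippet runs without a server.

```python
def iter_all_pages(fetch_page, page_size=30):
    """Yield every item from a paginated listing.

    fetch_page(page, page_size) must return a list; pages are assumed to start at 1.
    """
    page = 1
    while True:
        items = fetch_page(page, page_size)
        yield from items
        # Assumption: a page shorter than page_size is the last one.
        if len(items) < page_size:
            break
        page += 1


# Stand-in data source so the sketch runs without a RAGFlow server.
fake_chunks = [f"chunk_{i}" for i in range(65)]


def fake_fetch(page, page_size):
    start = (page - 1) * page_size
    return fake_chunks[start:start + page_size]


all_chunks = list(iter_all_pages(fake_fetch, page_size=30))
print(len(all_chunks))
```

With the SDK, the fetcher could be written as `lambda p, s: doc.list_chunks(page=p, page_size=s)`.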
---
### Delete chunks

```python
Document.delete_chunks(ids: list[str] | None = None, delete_all: bool = False)
```

Deletes chunks by ID, or all chunks in the current document.

#### Parameters

##### ids: `list[str]` or `None`

The IDs of the chunks to delete. Defaults to `None`.

- If omitted, or set to `None` or an empty list, no chunks are deleted unless `delete_all` is `True`.
- If a list of IDs is provided, only the chunks matching those IDs are deleted.

##### delete_all: `bool`

Whether to delete all chunks in the current document when `ids` is omitted, or set to `None` or an empty list. Defaults to `False`.

#### Returns

- Success: No value is returned.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
dataset = rag_object.list_datasets(id="123")
dataset = dataset[0]
doc = dataset.list_documents(id="wdfxb5t547d")
doc = doc[0]
chunk = doc.add_chunk(content="xxxxxxx")
# Delete specific chunks by ID.
doc.delete_chunks(["id_1","id_2"])
# Or delete every chunk in the document.
doc.delete_chunks(delete_all=True)
```
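
The interaction between `ids` and `delete_all` can be summarized as a small decision function. This is only a sketch of the parameter descriptions above, not the SDK's implementation; `resolve_deletion` is a hypothetical name.

```python
def resolve_deletion(ids=None, delete_all=False):
    """Return what a delete call targets: a list of IDs, "all", or "none"."""
    if ids:
        # A non-empty list of IDs is provided: only those items are deleted.
        return list(ids)
    if delete_all:
        # ids is omitted, None, or empty, and delete_all is True.
        return "all"
    # ids is omitted, None, or empty, and delete_all is False: nothing is deleted.
    return "none"


print(resolve_deletion(["id_1", "id_2"]))
print(resolve_deletion(delete_all=True))
print(resolve_deletion([]))
```

The same reading applies to the other `delete_*` methods in this reference that accept `ids` and `delete_all`.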
---
### Update chunk

```python
Chunk.update(update_message: dict)
```

Updates content or configurations for the current chunk.

#### Parameters

##### update_message: `dict[str, str|list[str]|int]`, *Required*

A dictionary representing the attributes to update, with the following keys:

- `"content"`: `str` The text content of the chunk.
- `"important_keywords"`: `list[str]` A list of key terms or phrases to tag with the chunk.
- `"tag_kwd"`: `list[str]` A list of tag keywords to associate with the chunk.
- `"available"`: `bool` The chunk's availability status in the dataset. Value options:
  - `False`: Unavailable
  - `True`: Available (default)

#### Returns

- Success: No value is returned.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
dataset = rag_object.list_datasets(id="123")
dataset = dataset[0]
doc = dataset.list_documents(id="wdfxb5t547d")
doc = doc[0]
chunk = doc.add_chunk(content="xxxxxxx")
chunk.update({"content":"sdfx..."})
```
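
Since `update_message` accepts only the keys listed above, it can be convenient to check a dictionary before sending it. The helper below is a hypothetical client-side sketch; whether the SDK or server rejects unknown keys is not stated in this reference.

```python
# Keys documented for Chunk.update() in this reference.
SUPPORTED_CHUNK_KEYS = {"content", "important_keywords", "tag_kwd", "available"}


def build_chunk_update(**fields):
    """Build an update_message dict, rejecting keys this reference does not list."""
    unknown = set(fields) - SUPPORTED_CHUNK_KEYS
    if unknown:
        raise ValueError(f"Unsupported chunk attributes: {sorted(unknown)}")
    return dict(fields)


update_message = build_chunk_update(
    content="Revised text",
    important_keywords=["rag", "chunk"],
    available=False,
)
print(update_message)
```

The resulting dictionary could then be passed to `chunk.update(update_message)`.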
---
### Retrieve chunks

```python
RAGFlow.retrieve(question:str="", dataset_ids:list[str]=None, document_ids:list[str]=None, page:int=1, page_size:int=30, similarity_threshold:float=0.2, vector_similarity_weight:float=0.3, top_k:int=1024, rerank_id:str=None, keyword:bool=False, cross_languages:list[str]=None, metadata_condition:dict=None) -> list[Chunk]
```

Retrieves chunks from specified datasets.

#### Parameters

##### question: `str`, *Required*

The user query or query keywords. Defaults to `""`.

##### dataset_ids: `list[str]`, *Required*

The IDs of the datasets to search. Defaults to `None`.

##### document_ids: `list[str]`

The IDs of the documents to search. Defaults to `None`. You must ensure all selected documents use the same embedding model. Otherwise, an error will occur.

##### page: `int`

The page of results to retrieve. Defaults to `1`.

##### page_size: `int`

The maximum number of chunks to retrieve. Defaults to `30`.

##### similarity_threshold: `float`

The minimum similarity score. Defaults to `0.2`.

##### vector_similarity_weight: `float`

The weight of vector cosine similarity. Defaults to `0.3`. If x represents the weight of vector cosine similarity, then (1 - x) is the weight of term similarity.

##### top_k: `int`

The number of chunks engaged in vector cosine computation. Defaults to `1024`.

##### rerank_id: `str`

The ID of the rerank model. Defaults to `None`.

##### keyword: `bool`

Indicates whether to enable keyword-based matching:

- `True`: Enable keyword-based matching.
- `False`: Disable keyword-based matching (default).

##### cross_languages: `list[str]`

The languages into which the query should be translated, so that keyword retrieval works across languages. Defaults to `None`.

##### metadata_condition: `dict`

The filter condition for `meta_fields`. Defaults to `None`.

#### Returns

- Success: A list of `Chunk` objects representing the document chunks.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
dataset = rag_object.list_datasets(name="ragflow")
dataset = dataset[0]
path = './test_data/ragflow_test.txt'
documents = [{"display_name": "test_retrieve_chunks.txt", "blob": open(path, "rb").read()}]
docs = dataset.upload_documents(documents)
doc = docs[0]
doc.add_chunk(content="This is a chunk addition test")
# Search the uploaded document for chunks matching the question.
for c in rag_object.retrieve(question="chunk addition test", dataset_ids=[dataset.id], document_ids=[doc.id]):
    print(c)
```
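
To make the roles of `vector_similarity_weight` and `similarity_threshold` concrete, the sketch below blends two similarity scores linearly and compares the result with the threshold. It assumes a plain weighted sum, which is the simplest reading of the parameter descriptions above; RAGFlow's actual scoring (for example when a rerank model is set) may differ, and `hybrid_score` is a hypothetical name.

```python
def hybrid_score(vector_similarity, term_similarity, vector_similarity_weight=0.3):
    """Blend vector and term similarity; the term weight is 1 - vector weight."""
    if not 0.0 <= vector_similarity_weight <= 1.0:
        raise ValueError("vector_similarity_weight must be between 0 and 1")
    term_weight = 1.0 - vector_similarity_weight
    return vector_similarity_weight * vector_similarity + term_weight * term_similarity


similarity_threshold = 0.2  # default documented above
score = hybrid_score(vector_similarity=0.8, term_similarity=0.5)
print(round(score, 2), score >= similarity_threshold)
```

Raising `vector_similarity_weight` toward `1.0` makes the result depend more on embedding similarity and less on term overlap.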
---
## CHAT ASSISTANT MANAGEMENT

---
### Create chat assistant

```python
RAGFlow.create_chat(
    name: str,
    icon: str = "",
    dataset_ids: list[str] | None = None,
    llm_id: str | None = None,
    llm_setting: dict | None = None,
    prompt_config: dict | None = None,
    **kwargs
) -> Chat
```

Creates a chat assistant.

#### Parameters

##### name: `str`, *Required*

The name of the chat assistant.

##### icon: `str`

Base64 encoding of the avatar. Defaults to `""`.

##### dataset_ids: `list[str] | None`

The IDs of the associated datasets. Defaults to `None`. When omitted or empty, the SDK creates an empty chat assistant and you can attach datasets later.

##### llm_id: `str | None`

The LLM model name/ID to use. If `None`, the user's default chat model is used. Defaults to `None`.

##### llm_setting: `dict | None`

LLM generation settings. Defaults to `None` (server defaults apply). Supported keys:

- `"temperature"`: `float` Controls the randomness of the model's predictions. Defaults to `0.1`.
- `"top_p"`: `float` Nucleus sampling threshold. Defaults to `0.3`.
- `"presence_penalty"`: `float` Penalizes tokens that have already appeared. Defaults to `0.4`.
- `"frequency_penalty"`: `float` Reduces repetition of frequent tokens. Defaults to `0.7`.
- `"max_token"`: `int` Maximum number of tokens in the response. Defaults to `512`.

##### prompt_config: `dict | None`

Instructions for the LLM to follow. Defaults to `None` (server defaults apply). Supported keys:

- `"system"`: `str` The system prompt content.
- `"empty_response"`: `str` Response when nothing is retrieved. Leave blank to let the LLM improvise. Defaults to `None`.
- `"prologue"`: `str` The opening greeting shown to the user. Defaults to `"Hi! I'm your assistant. What can I do for you?"`.
- `"quote"`: `bool` Whether to display source references. Defaults to `True`.
- `"parameters"`: `list[dict]` Variables used in the system prompt. Each entry has `"key"` (`str`) and `"optional"` (`bool`). The `knowledge` variable is reserved for retrieved chunks. Defaults to `[{"key": "knowledge", "optional": True}]`.

#### Returns

- Success: A `Chat` object representing the chat assistant.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
datasets = rag_object.list_datasets(name="kb_1")
dataset_ids = []
for dataset in datasets:
    dataset_ids.append(dataset.id)
assistant = rag_object.create_chat("Miss R", dataset_ids=dataset_ids)
```
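
The `llm_setting` and `prompt_config` dictionaries can be assembled ahead of the call. The sketch below spells out the defaults listed above; the system prompt text and its `{knowledge}` placeholder syntax are illustrative assumptions, and the call itself is left as a comment so the snippet runs without a server.

```python
# Generation settings, using the defaults documented above.
llm_setting = {
    "temperature": 0.1,
    "top_p": 0.3,
    "presence_penalty": 0.4,
    "frequency_penalty": 0.7,
    "max_token": 512,
}

# Prompt configuration; the system prompt wording is made up for illustration.
prompt_config = {
    "system": "You are a helpful assistant. Answer using this knowledge: {knowledge}",
    "prologue": "Hi! I'm your assistant. What can I do for you?",
    "quote": True,
    "parameters": [{"key": "knowledge", "optional": True}],
}

chat_kwargs = {
    "name": "Miss R",
    "dataset_ids": [],
    "llm_setting": llm_setting,
    "prompt_config": prompt_config,
}

# With a configured client: assistant = rag_object.create_chat(**chat_kwargs)
print(sorted(chat_kwargs))
```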
---
### Update chat assistant

```python
Chat.update(update_message: dict)
```

Partially updates configurations for the current chat assistant.

`Chat.update()` uses `PATCH /api/v1/chats/{chat_id}`. Only the provided keys are changed; all other fields are preserved.

#### Parameters

##### update_message: `dict`, *Required*

A dictionary representing the attributes to update. Supported keys:

- `"name"`: `str` The revised name of the chat assistant.
- `"icon"`: `str` Base64 encoding of the avatar.
- `"dataset_ids"`: `list[str]` The datasets to associate with the chat assistant.
- `"llm_id"`: `str` The LLM model name/ID to use.
- `"llm_setting"`: `dict` LLM generation settings:
  - `"temperature"`: `float` Controls the randomness of the model's predictions.
  - `"top_p"`: `float` Nucleus sampling threshold.
  - `"presence_penalty"`: `float` Penalizes tokens that have already appeared.
  - `"frequency_penalty"`: `float` Reduces repetition of frequent tokens.
  - `"max_token"`: `int` Maximum number of tokens in the response.
- `"prompt_config"`: `dict` Instructions for the LLM to follow:
  - `"system"`: `str` The system prompt content.
  - `"empty_response"`: `str` Response when nothing is retrieved. Leave blank to let the LLM improvise.
  - `"prologue"`: `str` The opening greeting shown to the user.
  - `"quote"`: `bool` Whether to display source references.
  - `"parameters"`: `list[dict]` Variables used in the system prompt.
- `"similarity_threshold"`: `float` Minimum similarity score for retrieved chunks. Defaults to `0.2`.
- `"vector_similarity_weight"`: `float` Weight of vector cosine similarity in the hybrid score. Defaults to `0.3`.
- `"top_n"`: `int` Number of top chunks fed to the LLM. Defaults to `6`.
- `"top_k"`: `int` Candidate pool size for reranking. Defaults to `1024`.
- `"rerank_id"`: `str` Reranking model ID. If empty, vector cosine similarity is used.

#### Returns

- Success: No value is returned.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
datasets = rag_object.list_datasets(name="kb_1")
dataset_id = datasets[0].id
assistant = rag_object.create_chat("Miss R", dataset_ids=[dataset_id])
assistant.update({"name": "Stefan", "llm_setting": {"temperature": 0.8}, "top_n": 8})
```
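
The partial-update behavior described above ("only the provided keys are changed") can be pictured as a top-level dictionary merge. This is an illustrative sketch rather than the server's logic: `apply_partial_update` is a hypothetical helper, and how nested dictionaries such as `llm_setting` are merged is not specified in this reference, so the sketch replaces top-level values only.

```python
def apply_partial_update(current, update_message):
    """Return a copy of `current` with only the keys in `update_message` replaced."""
    updated = dict(current)          # leave the original configuration untouched
    updated.update(update_message)   # overwrite only the provided top-level keys
    return updated


current = {"name": "Miss R", "top_n": 6, "top_k": 1024, "llm_setting": {"temperature": 0.1}}
updated = apply_partial_update(current, {"name": "Stefan", "top_n": 8})
print(updated)
```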
---
### Delete chat assistants

```python
RAGFlow.delete_chats(ids: list[str] | None = None, delete_all: bool = False)
```

Deletes chat assistants by ID, or all chat assistants owned by the current user.

#### Parameters

##### ids: `list[str]` or `None`

The IDs of the chat assistants to delete. Defaults to `None`.

- If omitted, or set to `None` or an empty list, no chat assistants are deleted unless `delete_all` is `True`.
- If a list of IDs is provided, only the chat assistants matching those IDs are deleted.

##### delete_all: `bool`

Whether to delete all chat assistants owned by the current user when `ids` is omitted, or set to `None` or an empty list. Defaults to `False`.

#### Returns

- Success: No value is returned.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
# Delete specific chat assistants by ID.
rag_object.delete_chats(ids=["id_1","id_2"])
# Or delete every chat assistant owned by the current user.
rag_object.delete_chats(delete_all=True)
```
---
### List chat assistants

```python
RAGFlow.list_chats(
    page: int = 1,
    page_size: int = 30,
    orderby: str = "create_time",
    desc: bool = True,
    id: str | None = None,
    name: str | None = None,
    keywords: str | None = None,
    owner_ids: str | list[str] | None = None,
    parser_id: str | None = None
) -> list[Chat]
```

Lists chat assistants.

#### Parameters

##### page: `int`

Specifies the page on which the chat assistants will be displayed. Defaults to `1`.

##### page_size: `int`

The number of chat assistants on each page. Defaults to `30`.

##### orderby: `str`

The attribute by which the results are sorted. Available options:

- `"create_time"` (default)
- `"update_time"`

##### desc: `bool`

Indicates whether the retrieved chat assistants should be sorted in descending order. Defaults to `True`.

##### id: `str | None`

Exact match on chat assistant ID. Defaults to `None`.

##### name: `str | None`

Exact match on chat assistant name. Defaults to `None`.

##### keywords: `str | None`

Case-insensitive fuzzy match against chat assistant names. Defaults to `None`.

##### owner_ids: `str | list[str] | None`

Filter by owner tenant IDs. Defaults to `None`.

##### parser_id: `str | None`

Filter by parser type. Defaults to `None`.

When `id` or `name` is provided, exact filtering takes precedence over `keywords`.

#### Returns

- Success: A list of `Chat` objects.
- Failure: `Exception`.

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
for assistant in rag_object.list_chats():
    print(assistant)
```
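
The precedence rule between the exact filters (`id`, `name`) and `keywords` can be illustrated with plain dictionaries. The sketch below is a local model of the filtering described above, not the server's implementation: `select_chats` is a hypothetical function, and the "fuzzy match" is approximated here as a case-insensitive substring test.

```python
def select_chats(chats, id=None, name=None, keywords=None):
    """Filter a list of {"id", "name"} dicts following the parameter rules above."""
    if id is not None or name is not None:
        # Exact filters take precedence; keywords is ignored in this branch.
        return [
            c for c in chats
            if (id is None or c["id"] == id) and (name is None or c["name"] == name)
        ]
    if keywords:
        # Assumption: fuzzy matching behaves like a case-insensitive substring test.
        needle = keywords.lower()
        return [c for c in chats if needle in c["name"].lower()]
    return list(chats)


chats = [
    {"id": "1", "name": "Miss R"},
    {"id": "2", "name": "Mister R"},
    {"id": "3", "name": "Helper"},
]
by_keyword = select_chats(chats, keywords="mis")
by_name = select_chats(chats, name="Miss R", keywords="helper")
print([c["id"] for c in by_keyword], [c["id"] for c in by_name])
```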
---
## SESSION MANAGEMENT

---
### Create session with chat assistant

```python
Chat.create_session(name: str = "New session") -> Session
```

Creates a session with the current chat assistant.

#### Parameters

##### name: `str`

The name of the chat session to create. Defaults to `"New session"`.

#### Returns

- Success: A `Session` object containing the following attributes:
  - `id`: `str` The auto-generated unique identifier of the created session.
  - `name`: `str` The name of the created session.
  - `message`: `list[Message]` The opening message of the created session. Default: `[{"role": "assistant", "content": "Hi! I am your assistant, can I help you?"}]`
  - `chat_id`: `str` The ID of the associated chat assistant.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
assistant = rag_object.list_chats(name="Miss R")
assistant = assistant[0]
session = assistant.create_session()
```
---
### Update chat assistant's session

```python
Session.update(update_message: dict)
```

Updates the current session of the current chat assistant.

#### Parameters

##### update_message: `dict[str, Any]`, *Required*

A dictionary representing the attributes to update, with only one key:

- `"name"`: `str` The revised name of the session.

#### Returns

- Success: No value is returned.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
assistant = rag_object.list_chats(name="Miss R")
assistant = assistant[0]
session = assistant.create_session("session_name")
session.update({"name": "updated_name"})
```
---

### List chat assistant's sessions

```python
Chat.list_sessions(
    page: int = 1,
    page_size: int = 30,
    orderby: str = "create_time",
    desc: bool = True,
    id: str = None,
    name: str = None
) -> list[Session]
```

Lists sessions associated with the current chat assistant.

#### Parameters

##### page: `int`

Specifies the page on which the sessions will be displayed. Defaults to `1`.

##### page_size: `int`

The number of sessions on each page. Defaults to `30`.

##### orderby: `str`

The field by which sessions should be sorted. Available options:

- `"create_time"` (default)
- `"update_time"`

##### desc: `bool`

Indicates whether the retrieved sessions should be sorted in descending order. Defaults to `True`.

##### id: `str`

The ID of the chat session to retrieve. Defaults to `None`.

##### name: `str`

The name of the chat session to retrieve. Defaults to `None`.

#### Returns

- Success: A list of `Session` objects associated with the current chat assistant.
- Failure: `Exception`.

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
assistant = rag_object.list_chats(name="Miss R")
assistant = assistant[0]
for session in assistant.list_sessions():
    print(session)
```
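
Because `page` and `page_size` return one page at a time, collecting every session means walking the pages. The sketch below is one way to do that, assuming pages are 1-based and that an empty or short page marks the end. `iter_all_pages` and `fake_fetch` are hypothetical helpers, not part of the SDK; in real use the fetch function would wrap `assistant.list_sessions(page=..., page_size=...)`.

```python
from typing import Callable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_all_pages(fetch_page: Callable[[int, int], List[T]], page_size: int = 30) -> Iterator[T]:
    """Yield items page by page until an empty or short page is returned."""
    page = 1  # assumption: pages are 1-based, matching the documented default
    while True:
        items = fetch_page(page, page_size)
        if not items:
            break
        yield from items
        if len(items) < page_size:
            break
        page += 1


# Stand-in for: lambda page, page_size: assistant.list_sessions(page=page, page_size=page_size)
fake_sessions = [f"session_{i}" for i in range(65)]


def fake_fetch(page: int, page_size: int) -> List[str]:
    start = (page - 1) * page_size
    return fake_sessions[start:start + page_size]


all_sessions = list(iter_all_pages(fake_fetch))
print(len(all_sessions))
```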
---

### Delete chat assistant's sessions

```python
Chat.delete_sessions(ids: list[str] | None = None, delete_all: bool = False)
```

Deletes sessions of the current chat assistant, either by ID or all at once.

#### Parameters

##### ids: `list[str]` or `None`

The IDs of the sessions to delete. Defaults to `None`.

- If omitted, or set to `None` or an empty list, no sessions are deleted unless `delete_all` is `True`.
- If a list of IDs is provided, only the sessions matching those IDs are deleted.

##### delete_all: `bool`

Whether to delete all sessions of the current chat assistant when `ids` is omitted, or set to `None` or an empty list. Defaults to `False`.

#### Returns

- Success: No value is returned.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
assistant = rag_object.list_chats(name="Miss R")
assistant = assistant[0]
# Delete specific sessions by ID.
assistant.delete_sessions(ids=["id_1", "id_2"])
# Delete every session of this chat assistant.
assistant.delete_sessions(delete_all=True)
```
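
The interaction between `ids` and `delete_all` can be summarized as a small decision function. This is only a sketch of the rules as documented above, not the server's implementation; `resolve_deletion` and the `existing` argument are hypothetical names used for illustration.

```python
from typing import Iterable, List, Optional


def resolve_deletion(
    ids: Optional[List[str]] = None,
    delete_all: bool = False,
    existing: Iterable[str] = (),
) -> List[str]:
    """Return the session IDs that would be deleted under the documented rules."""
    existing = list(existing)
    if ids:
        # A non-empty list of IDs: only matching sessions are deleted.
        wanted = set(ids)
        return [sid for sid in existing if sid in wanted]
    if delete_all:
        # ids omitted, None, or empty, and delete_all is True: everything goes.
        return existing
    # ids omitted, None, or empty, and delete_all is False: nothing is deleted.
    return []


print(resolve_deletion(ids=["id_1"], existing=["id_1", "id_2"]))
print(resolve_deletion(delete_all=True, existing=["id_1", "id_2"]))
print(resolve_deletion(existing=["id_1", "id_2"]))
```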
---

### Converse with chat assistant

```python
Session.ask(question: str = "", stream: bool = False, **kwargs) -> Message | Iterator[Message]
```

Asks a specified chat assistant a question to start an AI-powered conversation.

:::tip NOTE
In streaming mode, not all responses include a reference, as this depends on the system's judgement.
:::

#### Parameters

##### question: `str`, *Required*

The question to start an AI-powered conversation. Defaults to `""`.

##### stream: `bool`

Indicates whether to output responses in a streaming way:

- `True`: Enable streaming.
- `False`: Disable streaming (default).

##### **kwargs

The parameters in prompt(system).

#### Returns

- A `Message` object containing the response to the question if `stream` is set to `False`.
- An iterator containing multiple `Message` objects (`Iterator[Message]`) if `stream` is set to `True`.

The following shows the attributes of a `Message` object:

##### id: `str`

The auto-generated message ID.

##### content: `str`

The content of the message. Defaults to `"Hi! I am your assistant, can I help you?"`.

##### reference: `list[Chunk]`

A list of `Chunk` objects representing references to the message, each containing the following attributes:

- `id` `str`
The chunk ID.
- `content` `str`
The content of the chunk.
- `img_id` `str`
The ID of the snapshot of the chunk. Applicable only when the source of the chunk is an image, PPT, PPTX, or PDF file.
- `document_id` `str`
The ID of the referenced document.
- `document_name` `str`
The name of the referenced document.
- `document_metadata` `dict`
Optional document metadata, returned only when `extra_body.reference_metadata.include` is `true`.
- `position` `list[str]`
The location information of the chunk within the referenced document.
- `dataset_id` `str`
The ID of the dataset to which the referenced document belongs.
- `similarity` `float`
A composite similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity. It is the weighted sum of `vector_similarity` and `term_similarity`.
- `vector_similarity` `float`
A vector similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity between vector embeddings.
- `term_similarity` `float`
A keyword similarity score of the chunk ranging from `0` to `1` , with a higher value indicating greater similarity between keywords.
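
As a worked illustration of the weighted sum behind `similarity`, the sketch below combines the two component scores with a single weight. The function name `composite_similarity` and the `vector_weight` value are assumptions made for this example; the weight actually applied depends on the chat assistant's configuration.

```python
def composite_similarity(vector_similarity: float, term_similarity: float, vector_weight: float = 0.3) -> float:
    """Blend the two scores; both inputs and the result stay within [0, 1]."""
    if not 0.0 <= vector_weight <= 1.0:
        raise ValueError("vector_weight must be between 0 and 1")
    return vector_weight * vector_similarity + (1.0 - vector_weight) * term_similarity


# With equal weights, the composite score is the plain average of the two scores.
score = composite_similarity(vector_similarity=0.8, term_similarity=0.5, vector_weight=0.5)
print(round(score, 2))
```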

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
assistant = rag_object.list_chats(name="Miss R")
assistant = assistant[0]
session = assistant.create_session()

print("\n==================== Miss R =====================\n")
print("Hello. What can I do for you?")

while True:
    question = input("\n==================== User =====================\n> ")
    print("\n==================== Miss R =====================\n")

    # Track what has been printed so far and print only the new suffix of each message.
    cont = ""
    for ans in session.ask(question, stream=True):
        print(ans.content[len(cont):], end='', flush=True)
        cont = ans.content
```
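
The slicing in the loop above (`ans.content[len(cont):]`) treats each streamed message as carrying the answer accumulated so far and prints only the new part. The following self-contained sketch reproduces that technique without a server; `FakeMessage` and `iter_deltas` are hypothetical stand-ins, and only the `content` attribute mirrors the documented `Message` object.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class FakeMessage:
    """Stand-in for a streamed Message; only `content` is modeled."""
    content: str


def iter_deltas(messages: Iterable[FakeMessage]) -> Iterator[str]:
    """Yield only the text added by each message relative to the previous one."""
    seen = ""
    for msg in messages:
        yield msg.content[len(seen):]
        seen = msg.content


stream = [FakeMessage("Hel"), FakeMessage("Hello, "), FakeMessage("Hello, world")]
deltas = list(iter_deltas(stream))
print("".join(deltas))
```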
---

### Create session with agent

```python
Agent.create_session(**kwargs) -> Session
```

Creates a session with the current agent.

#### Parameters

##### **kwargs

The parameters in the `begin` component.

Also supports:

- `release` (`bool | str`, optional): Set to `True` (or `"true"`) to create the session in release mode (published version only).

#### Returns

- Success: A `Session` object containing the following attributes:
  - `id`: `str` The auto-generated unique identifier of the created session.
  - `message`: `list[Message]` The messages of the created session assistant. Default: `[{"role": "assistant", "content": "Hi! I am your assistant, can I help you?"}]`
  - `agent_id`: `str` The ID of the associated agent.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
agent_id = "AGENT_ID"
agent = rag_object.list_agents(id=agent_id)[0]
session = agent.create_session()
# Or create in release mode:
# session = agent.create_session(release=True)
```
---

### Converse with agent

```python
Session.ask(question: str = "", stream: bool = False) -> Message | Iterator[Message]
```

Asks a specified agent a question to start an AI-powered conversation.

:::tip NOTE
In streaming mode, not all responses include a reference, as this depends on the system's judgement.
:::

#### Parameters

##### question: `str`

The question to start an AI-powered conversation. If the **Begin** component takes parameters, a question is not required.

##### stream: `bool`

Indicates whether to output responses in a streaming way:

- `True`: Enable streaming.
- `False`: Disable streaming (default).

#### Returns

- A `Message` object containing the response to the question if `stream` is set to `False`.
- An iterator containing multiple `Message` objects (`Iterator[Message]`) if `stream` is set to `True`.

The following shows the attributes of a `Message` object:

##### id: `str`

The auto-generated message ID.

##### content: `str`

The content of the message. Defaults to `"Hi! I am your assistant, can I help you?"`.

##### reference: `list[Chunk]`

A list of `Chunk` objects representing references to the message, each containing the following attributes:

- `id` `str`
The chunk ID.
- `content` `str`
The content of the chunk.
- `image_id` `str`
The ID of the snapshot of the chunk. Applicable only when the source of the chunk is an image, PPT, PPTX, or PDF file.
- `document_id` `str`
The ID of the referenced document.
- `document_name` `str`
The name of the referenced document.
- `document_metadata` `dict`
Optional document metadata, returned only when `extra_body.reference_metadata.include` is `true`.
- `position` `list[str]`
The location information of the chunk within the referenced document.
- `dataset_id` `str`
The ID of the dataset to which the referenced document belongs.
- `similarity` `float`
A composite similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity. It is the weighted sum of `vector_similarity` and `term_similarity`.
- `vector_similarity` `float`
A vector similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity between vector embeddings.
- `term_similarity` `float`
A keyword similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity between keywords.

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
agent_id = "AGENT_ID"
agent = rag_object.list_agents(id=agent_id)[0]
session = agent.create_session()

print("\n===== Miss R ====\n")
print("Hello. What can I do for you?")

while True:
    question = input("\n===== User ====\n> ")
    print("\n==== Miss R ====\n")

    # Track what has been printed so far and print only the new suffix of each message.
    cont = ""
    for ans in session.ask(question, stream=True):
        print(ans.content[len(cont):], end='', flush=True)
        cont = ans.content
```
---

### List agent sessions

```python
Agent.list_sessions(
    page: int = 1,
    page_size: int = 30,
    orderby: str = "update_time",
    desc: bool = True,
    id: str = None
) -> list[Session]
```

Lists sessions associated with the current agent.

#### Parameters

##### page: `int`

Specifies the page on which the sessions will be displayed. Defaults to `1`.

##### page_size: `int`

The number of sessions on each page. Defaults to `30`.

##### orderby: `str`

The field by which sessions should be sorted. Available options:

- `"create_time"`
- `"update_time"` (default)

##### desc: `bool`

Indicates whether the retrieved sessions should be sorted in descending order. Defaults to `True`.

##### id: `str`

The ID of the agent session to retrieve. Defaults to `None`.

#### Returns

- Success: A list of `Session` objects associated with the current agent.
- Failure: `Exception`.

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
agent_id = "AGENT_ID"
agent = rag_object.list_agents(id=agent_id)[0]
sessions = agent.list_sessions()
for session in sessions:
    print(session)
```
---
### Delete agent's sessions

```python
Agent.delete_sessions(ids: list[str] | None = None, delete_all: bool = False)
```

Deletes sessions of the current agent, either by ID or all at once.

#### Parameters

##### ids: `list[str]` or `None`

The IDs of the sessions to delete. Defaults to `None`.

- If omitted, or set to `None` or an empty list, no sessions are deleted unless `delete_all` is `True`.
- If a list of IDs is provided, only the sessions matching those IDs are deleted.

##### delete_all: `bool`

Whether to delete all sessions of the current agent when `ids` is omitted, or set to `None` or an empty list. Defaults to `False`.

#### Returns

- Success: No value is returned.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
agent_id = "AGENT_ID"
agent = rag_object.list_agents(id=agent_id)[0]
# Delete specific sessions by ID.
agent.delete_sessions(ids=["id_1", "id_2"])
# Delete every session of this agent.
agent.delete_sessions(delete_all=True)
```
---
## AGENT MANAGEMENT
---

### List agents

```python
RAGFlow.list_agents(
    page: int = 1,
    page_size: int = 30,
    orderby: str = "create_time",
    desc: bool = True,
    id: str = None,
    title: str = None
) -> list[Agent]
```

Lists agents.

#### Parameters

##### page: `int`

Specifies the page on which the agents will be displayed. Defaults to `1`.

##### page_size: `int`

The number of agents on each page. Defaults to `30`.

##### orderby: `str`

The attribute by which the results are sorted. Available options:

- `"create_time"` (default)
- `"update_time"`

##### desc: `bool`

Indicates whether the retrieved agents should be sorted in descending order. Defaults to `True`.

##### id: `str`

The ID of the agent to retrieve. Defaults to `None`.

##### title: `str`

The title of the agent to retrieve. Defaults to `None`.

#### Returns

- Success: A list of `Agent` objects.
- Failure: `Exception`.

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
for agent in rag_object.list_agents():
    print(agent)
```
---

### Create agent

```python
RAGFlow.create_agent(
title: str,
dsl: dict,
description: str | None = None
) -> None
```
Creates an agent.
#### Parameters
##### title: `str`
Specifies the title of the agent.
##### dsl: `dict`
Specifies the canvas DSL of the agent.
##### description: `str`
The description of the agent. Defaults to `None` .
#### Returns
- Success: Nothing.
- Failure: `Exception` .
#### Examples
```python
from ragflow_sdk import RAGFlow
rag_object = RAGFlow(api_key="< YOUR_API_KEY > ", base_url="http://< YOUR_BASE_URL > :9380")
rag_object.create_agent(
title="Test Agent",
description="A test agent",
dsl={
# ... canvas DSL here ...
}
)
```
---
### Update agent
```python
RAGFlow.update_agent(
agent_id: str,
title: str | None = None,
description: str | None = None,
dsl: dict | None = None
) -> None
```
Updates an agent.
#### Parameters
##### agent_id: `str`
Specifies the ID of the agent to be updated.
##### title: `str`
Specifies the new title of the agent. Pass `None` to leave the title unchanged.
##### dsl: `dict`
Specifies the new canvas DSL of the agent. Pass `None` to leave the DSL unchanged.
##### description: `str`
The new description of the agent. Pass `None` to leave the description unchanged.
#### Returns
- Success: Nothing.
- Failure: `Exception` .
#### Examples
```python
from ragflow_sdk import RAGFlow
rag_object = RAGFlow(api_key="< YOUR_API_KEY > ", base_url="http://< YOUR_BASE_URL > :9380")
rag_object.update_agent(
agent_id="58af890a2a8911f0a71a11b922ed82d6",
title="Test Agent",
description="A test agent",
dsl={
# ... canvas DSL here ...
}
)
```
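
Since every field other than `agent_id` is optional and `None` means "leave unchanged", a caller can collect only the fields it actually wants to change. The helper below is a hypothetical convenience for doing so, not part of the SDK.

```python
from typing import Any, Dict, Optional


def build_update_kwargs(
    title: Optional[str] = None,
    description: Optional[str] = None,
    dsl: Optional[dict] = None,
) -> Dict[str, Any]:
    """Keep only the fields that were given a value, so the rest stay untouched."""
    candidates = {"title": title, "description": description, "dsl": dsl}
    return {key: value for key, value in candidates.items() if value is not None}


changes = build_update_kwargs(description="A revised description")
print(changes)
# Hypothetical usage with a real client:
# rag_object.update_agent(agent_id="<AGENT_ID>", **changes)
```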
---
### Delete agent
```python
RAGFlow.delete_agent(
agent_id: str
) -> None
```
Deletes an agent.
#### Parameters
##### agent_id: `str`
Specifies the ID of the agent to be deleted.
#### Returns
- Success: Nothing.
- Failure: `Exception` .
#### Examples
```python
from ragflow_sdk import RAGFlow
rag_object = RAGFlow(api_key="< YOUR_API_KEY > ", base_url="http://< YOUR_BASE_URL > :9380")
rag_object.delete_agent("58af890a2a8911f0a71a11b922ed82d6")
```
---
## Memory Management
### Create Memory
```python
RAGFlow.create_memory(
    name: str,
    memory_type: list[str],
    embd_id: str,
    llm_id: str
) -> Memory
```
Create a new memory.
#### Parameters
##### name: `str`, *Required*
The unique name of the memory to create. It must adhere to the following requirements:
- Basic Multilingual Plane (BMP) only
- Maximum 128 characters
##### memory_type: `list[str]`, *Required*
Specifies the types of memory to extract. Available options:
- `raw`: The raw dialogue content between the user and the agent. *Required by default*.
- `semantic` : General knowledge and facts about the user and world.
- `episodic` : Time-stamped records of specific events and experiences.
- `procedural` : Learned skills, habits, and automated procedures.
##### embd_id: `str`, *Required*
The name of the embedding model to use. For example: `"BAAI/bge-large-zh-v1.5@BAAI"`
- Maximum 255 characters
- Must follow `model_name@model_factory` format
##### llm_id: `str`, *Required*
The name of the chat model to use. For example: `"glm-4-flash@ZHIPU-AI"`
- Maximum 255 characters
- Must follow `model_name@model_factory` format
#### Returns
- Success: A `Memory` object.
- Failure: `Exception`
#### Examples
```python
from ragflow_sdk import RAGFlow
rag_object = RAGFlow(api_key="< YOUR_API_KEY > ", base_url="http://< YOUR_BASE_URL > :9380")
memory = rag_object.create_memory("name", ["raw"], "BAAI/bge-large-zh-v1.5@SILICONFLOW", "glm-4-flash@ZHIPU-AI")
```
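
The constraints on `name`, `embd_id`, and `llm_id` can be checked on the client before calling `create_memory`. The sketch below is one possible pre-check under the rules listed above; both function names are hypothetical, and treating an empty name as invalid is an assumption, since the reference does not state a minimum length.

```python
def is_valid_memory_name(name: str) -> bool:
    """At most 128 characters, all within the Basic Multilingual Plane (U+0000 to U+FFFF)."""
    # Assumption: an empty name is rejected.
    return 0 < len(name) <= 128 and all(ord(ch) <= 0xFFFF for ch in name)


def is_valid_model_id(model_id: str) -> bool:
    """At most 255 characters, in the form model_name@model_factory."""
    model_name, separator, model_factory = model_id.rpartition("@")
    return len(model_id) <= 255 and bool(separator) and bool(model_name) and bool(model_factory)


print(is_valid_memory_name("project notes"))
print(is_valid_model_id("glm-4-flash@ZHIPU-AI"))
```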
---
### Update Memory
```python
Memory.update(
update_dict: dict
) -> Memory
```
Updates configurations for a specified memory.
#### Parameters
##### update_dict: `dict`, *Required*
Configurations to update. Available configurations:
- `name`: `string`, *Optional*
  The revised name of the memory.
  - Basic Multilingual Plane (BMP) only
  - Maximum 128 characters
- `avatar`: `string`, *Optional*
  The updated base64 encoding of the avatar.
  - Maximum 65535 characters
- `permission`: `enum<string>`, *Optional*
  The updated memory permission. Available options:
  - `"me"`: (Default) Only you can manage the memory.
  - `"team"`: All team members can manage the memory.
- `llm_id`: `string`, *Optional*
  The name of the chat model to use. For example: `"glm-4-flash@ZHIPU-AI"`
  - Maximum 255 characters
  - Must follow `model_name@model_factory` format
- `description`: `string`, *Optional*
  The description of the memory. Defaults to `None`.
- `memory_size`: `int`, *Optional*
  The size limit of the memory. Defaults to `5*1024*1024` bytes. The size accounts for each message's content plus its embedding vector (≈ content + dimensions × 8 bytes). Example: a 1 KB message with a 1024-dimension embedding uses about 9 KB, so the 5 MB default limit holds roughly 500 such messages.
  - Maximum `10*1024*1024` bytes
- `forgetting_policy`: `enum<string>`, *Optional*
  Evicts existing data based on the chosen policy when the size limit is reached, freeing up space for new messages. Available options:
  - `"FIFO"`: (Default) Prioritizes messages with the earliest `forget_at` time for removal. When the pool of messages that have `forget_at` set is insufficient, it falls back to selecting messages in ascending order of their `valid_at` (oldest first).
- `temperature`: `float`, *Optional*
  Adjusts output randomness. Lower = more deterministic; higher = more creative.
  - Range [0, 1]
- `system_prompt`: `string`, *Optional*
  Defines the system-level instructions and role for the AI assistant. It is automatically assembled based on the selected `memory_type` by `PromptAssembler` in `memory/utils/prompt_util.py`. This prompt sets the foundational behavior and context for the entire conversation.
  - Keep the `OUTPUT REQUIREMENTS` and `OUTPUT FORMAT` parts unchanged.
- `user_prompt`: `string`, *Optional*
Represents the user's custom setting, which is the specific question or instruction the AI needs to respond to directly. Defaults to `None` .
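
The `"FIFO"` forgetting policy described above amounts to a two-stage ordering: messages with `forget_at` set come first, earliest first, followed by the remaining messages in ascending `valid_at`. The sketch below illustrates that ordering on plain dicts. It is not the server's implementation; `eviction_order` is a hypothetical name, and integers stand in for timestamps.

```python
from typing import List


def eviction_order(messages: List[dict]) -> List[dict]:
    """Order messages from first-to-evict to last-to-evict under the FIFO policy."""
    # Stage 1: messages already marked to be forgotten, earliest forget_at first.
    marked = sorted(
        (m for m in messages if m.get("forget_at") is not None),
        key=lambda m: m["forget_at"],
    )
    # Stage 2: fallback pool, oldest valid_at first.
    unmarked = sorted(
        (m for m in messages if m.get("forget_at") is None),
        key=lambda m: m["valid_at"],
    )
    return marked + unmarked


sample = [
    {"id": 1, "valid_at": 10, "forget_at": None},
    {"id": 2, "valid_at": 5, "forget_at": None},
    {"id": 3, "valid_at": 20, "forget_at": 50},
    {"id": 4, "valid_at": 1, "forget_at": 40},
]
order = [m["id"] for m in eviction_order(sample)]
print(order)
```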
#### Returns
- Success: A `Memory` object.
- Failure: `Exception`
#### Examples
```python
from ragflow_sdk import RAGFlow, Memory

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
memory_object = Memory(rag_object, {"id": "your memory_id"})
memory_object.update({"name": "New_name"})
```
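
The `memory_size` estimate given in the parameter list (content plus 8 bytes per embedding dimension) can be worked through directly. The sketch below only restates that arithmetic; the function name is hypothetical, and the real accounting on the server may differ.

```python
DEFAULT_MEMORY_SIZE = 5 * 1024 * 1024  # documented default, in bytes


def estimated_message_size(content_bytes: int, dimensions: int) -> int:
    """Approximate footprint of one message: content + dimensions x 8 bytes."""
    return content_bytes + dimensions * 8


# A 1 KB message with a 1024-dimension embedding: 1024 + 8192 = 9216 bytes (9 KB).
size = estimated_message_size(content_bytes=1024, dimensions=1024)
capacity = DEFAULT_MEMORY_SIZE // size
print(size, capacity)
```

By this estimate the default limit fits a few hundred such messages, the same order of magnitude as the rough figure quoted for `memory_size`.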
---
### List Memory
```python
RAGFlow.list_memory(
    page: int = 1,
    page_size: int = 50,
    tenant_id: str | list[str] = None,
    memory_type: str | list[str] = None,
    storage_type: str = None,
    keywords: str = None
) -> dict
```
Lists memories.
#### Parameters
##### page: `int`, *Optional*
Specifies the page on which the memories will be displayed. Defaults to `1`.
##### page_size: `int`, *Optional*
The number of memories on each page. Defaults to `50`.
##### tenant_id: `str` or `list[str]`, *Optional*
The ID of the owner. Supports searching by multiple IDs.
##### memory_type: `str` or `list[str]`, *Optional*
The type of memory (as set during creation). A memory matches if its type is **included in** the provided value(s). Available options:
- `raw`
- `semantic`
- `episodic`
- `procedural`
##### storage_type: `str`, *Optional*
The storage format of messages. Available options:
- `table`: (Default)
##### keywords: `str`, *Optional*
The name of the memory to retrieve. Supports fuzzy search.
#### Returns
Success: A dict containing a list of `Memory` objects and the total count.
```json
{"memory_list": list[Memory], "total_count": int}
```
Failure: `Exception`
#### Examples
```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
rag_object.list_memory()
```
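
Because the result carries `total_count` alongside the current page, the number of pages needed to see every memory follows from a ceiling division. The helper below is a hypothetical illustration of that calculation; it relies only on the documented `total_count` key and the default `page_size` of `50`.

```python
import math


def page_count(total_count: int, page_size: int = 50) -> int:
    """Number of pages needed to list `total_count` items."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return math.ceil(total_count / page_size)


# Example with a result shaped like the documented return value.
result = {"memory_list": [], "total_count": 120}
pages = page_count(result["total_count"])
print(pages)
```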
---
### Get Memory Config
```python
Memory.get_config()
```
Gets the configuration of a specified memory.
#### Parameters
None
#### Returns
Success: A `Memory` object.
Failure: `Exception`
#### Examples
```python
from ragflow_sdk import RAGFlow, Memory

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
memory_object = Memory(rag_object, {"id": "your memory_id"})
memory_object.get_config()
```
---
### Delete Memory
```python
RAGFlow.delete_memory(
memory_id: str
) -> None
```
Deletes a specified memory.
#### Parameters
##### memory_id: `str`, *Required*
The ID of the memory.
#### Returns
Success: Nothing
Failure: `Exception`
#### Examples
```python
from ragflow_sdk import RAGFlow
rag_object = RAGFlow(api_key="< YOUR_API_KEY > ", base_url="http://< YOUR_BASE_URL > :9380")
rag_object.delete_memory("your memory_id")
```
---
### List messages of a memory
```python
Memory.list_memory_messages(
agent_id: str | list[str]=None,
keywords: str=None,
page: int=1,
page_size: int=50
) -> dict
```
Lists the messages of a specified memory.
#### Parameters
##### agent_id: `str` or `list[str]`, *Optional*
Filters messages by the ID of their source agent. Supports multiple values.
##### keywords: `str`, *Optional*
Filters messages by their session ID. This field supports fuzzy search.
##### page: `int`, *Optional*
Specifies the page on which the messages will be displayed. Defaults to `1` .
##### page_size: `int`, *Optional*
The number of messages on each page. Defaults to `50` .
#### Returns
Success: A dict containing the messages and meta information.
```json
{"messages": {"message_list": [{message dict}], "total_count": int}, "storage_type": "table"}
```
Failure: `Exception`
#### Examples
```python
from ragflow_sdk import RAGFlow, Memory

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
memory_object = Memory(rag_object, {"id": "your memory_id"})
memory_object.list_memory_messages()
```
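
The returned dict nests the message list one level down, under `"messages"`. The sketch below shows one defensive way to unpack it, based only on the shape documented in the Returns section above; `unpack_messages` is a hypothetical helper, and the fields inside each message dict are placeholders.

```python
from typing import List, Tuple


def unpack_messages(result: dict) -> Tuple[List[dict], int]:
    """Return (message_list, total_count) from a list_memory_messages() result."""
    messages = result.get("messages", {})
    return messages.get("message_list", []), messages.get("total_count", 0)


# Sample shaped like the documented return value; the message fields are placeholders.
sample_result = {
    "messages": {"message_list": [{"message_id": 1}, {"message_id": 2}], "total_count": 2},
    "storage_type": "table",
}
message_list, total_count = unpack_messages(sample_result)
print(len(message_list), total_count)
```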
---
### Add Message
```python
RAGFlow.add_message(
memory_id: list[str],
agent_id: str,
session_id: str,
user_input: str,
agent_response: str,
user_id: str = ""
) -> str
```
Adds a message to the specified memories.
#### Parameters
##### memory_id: `list[str]`, *Required*
The IDs of the memories in which to save the message.
##### agent_id: `str`, *Required*
The ID of the message's source agent.
##### session_id: `str`, *Required*
The ID of the message's session.
##### user_input: `str`, *Required*
The text input provided by the user.
##### agent_response: `str`, *Required*
The text response generated by the AI agent.
##### user_id: `str`, *Optional*
The user participating in the conversation with the agent. Defaults to `""` .
#### Returns
Success: The string `"All add to task."`
Failure: `Exception`
#### Examples
```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
message_payload = {
    "memory_id": ["your memory_id"],  # placeholder: one or more memory IDs
    "agent_id": "your agent_id",      # placeholder
    "session_id": "your session_id",  # placeholder
    "user_id": "",
    "user_input": "Your question here",
    "agent_response": """
Your agent response here
"""
}
rag_object.add_message(**message_payload)
```
---
### Forget Message
```python
Memory.forget_message(message_id: int) -> bool
```
Forgets a specified message. Once forgotten, the message is no longer retrieved by agents and is prioritized for cleanup by the forgetting policy.
#### Parameters
##### message_id: `int`, *Required*
The ID of the message to forget.
#### Returns
Success: `True`
Failure: `Exception`
#### Examples
```python
from ragflow_sdk import RAGFlow, Memory

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
memory_object = Memory(rag_object, {"id": "your memory_id"})
message_id = 1  # placeholder: the ID of the message to forget
memory_object.forget_message(message_id)
```
---
### Update message status
```python
Memory.update_message_status(message_id: int, status: bool) -> bool
```
Enables or disables a message. Once a message is disabled, it will not be retrieved by agents.
#### Parameters
##### message_id: `int`, *Required*
The ID of the message to enable or disable.
##### status: `bool`, *Required*
The status of the message: `True` enables it and `False` disables it.
#### Returns
Success: `True`
Failure: `Exception`
#### Examples
```python
from ragflow_sdk import RAGFlow, Memory

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
memory_object = Memory(rag_object, {"id": "your memory_id"})
message_id = 1  # placeholder: the ID of the message to enable
memory_object.update_message_status(message_id, True)
```
---
### Search message
```python
RAGFlow.search_message(
query: str,
memory_id: list[str],
agent_id: str=None,
session_id: str=None,
similarity_threshold: float=0.2,
keywords_similarity_weight: float=0.7,
top_n: int=10
) -> list[dict]
```
Searches and retrieves messages from memory based on the provided `query` and other configuration parameters.
#### Parameters
##### query: `str`, *Required*
The search term or natural language question used to find relevant messages.
##### memory_id: `list[str]`, *Required*
The IDs of the memories to search. Supports multiple values.
##### agent_id: `str`, *Optional*
The ID of the message's source agent. Defaults to `None`.
##### session_id: `str`, *Optional*
The ID of the message's session. Defaults to `None`.
##### similarity_threshold: `float`, *Optional*
The minimum cosine similarity score required for a message to be considered a match. A higher value yields more precise but fewer results. Defaults to `0.2`.
- Range: [0.0, 1.0]
##### keywords_similarity_weight: `float`, *Optional*
Controls the influence of keyword matching versus semantic (embedding-based) matching in the final relevance score. A value of `0.5` gives them equal weight. Defaults to `0.7`.
- Range: [0.0, 1.0]
##### top_n: `int`, *Optional*
The maximum number of most relevant messages to return. This limits the result set size for efficiency. Defaults to `10`.
#### Returns
Success: A list of `message` dicts.
Failure: `Exception`
#### Examples
```python
from ragflow_sdk import RAGFlow
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
rag_object.search_message("your question", ["your memory_id"])
```
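
To build intuition for how `similarity_threshold`, `keywords_similarity_weight`, and `top_n` might interact, the sketch below ranks a few hand-made candidates locally. It is an illustration under stated assumptions, not the server's implementation: the linear blend in `hybrid_score`, the choice to apply the threshold to the blended score, and the `keyword_sim` and `vector_sim` fields are all hypothetical. With the default weight of `0.7`, the keyword-heavy candidate `a` ranks first; lowering the weight to `0.3` puts the embedding-heavy candidate `b` first.

```python
def hybrid_score(keyword_sim, vector_sim, keywords_similarity_weight=0.7):
    """Blend keyword and embedding similarity (assumed linear combination)."""
    w = keywords_similarity_weight
    return w * keyword_sim + (1.0 - w) * vector_sim


def rank_messages(candidates, similarity_threshold=0.2,
                  keywords_similarity_weight=0.7, top_n=10):
    """Score, filter, sort, and truncate candidates; returns their IDs."""
    scored = [
        (hybrid_score(c["keyword_sim"], c["vector_sim"],
                      keywords_similarity_weight), c["id"])
        for c in candidates
    ]
    # Assumption: the threshold is applied to the blended score.
    kept = [item for item in scored if item[0] >= similarity_threshold]
    kept.sort(key=lambda item: item[0], reverse=True)
    return [message_id for _, message_id in kept[:top_n]]


# Hypothetical candidates with made-up similarity values.
candidates = [
    {"id": "a", "keyword_sim": 0.9, "vector_sim": 0.1},
    {"id": "b", "keyword_sim": 0.1, "vector_sim": 0.9},
    {"id": "c", "keyword_sim": 0.05, "vector_sim": 0.1},
]

default_order = rank_messages(candidates)
semantic_order = rank_messages(candidates, keywords_similarity_weight=0.3)
```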
---
### Get recent messages
```python
RAGFlow.get_recent_messages(
    memory_id: list[str],
    agent_id: str=None,
    session_id: str=None,
    limit: int=10
) -> list[dict]
```
Retrieves the most recent messages from the specified memories. Use `limit` to control the number of messages returned.
#### Parameters
##### memory_id: `list[str]`, *Required*
The IDs of the memories to search. Supports multiple values.
##### agent_id: `str`, *Optional*
The ID of the message's source agent. Defaults to `None`.
##### session_id: `str`, *Optional*
The ID of the message's session. Defaults to `None`.
##### limit: `int`, *Optional*
The maximum number of messages to return. Defaults to `10`.
#### Returns
Success: A list of `message` dicts.
Failure: `Exception`
#### Examples
```python
from ragflow_sdk import RAGFlow
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
rag_object.get_recent_messages(["your memory_id"])
```
---
### Get message content
```python
Memory.get_message_content(message_id: int) -> dict
```
Retrieves the full content and embedding vector of a specific message by its unique message ID.
#### Parameters
##### message_id: `int`, *Required*
The ID of the message to retrieve.
#### Returns
Success: A `message` dict.
Failure: `Exception`
#### Examples
```python
from ragflow_sdk import RAGFlow, Memory
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
memory_object = Memory(rag_object, {"id": "your memory_id"})
message_id = 1  # Placeholder: the ID of the message to retrieve
memory_object.get_message_content(message_id)
```
---