## Summary
- Add null checks for `config.setting` in `get_chat_model()` and
`aget_chat_model()` to prevent `AttributeError` when memories are
disabled
- When the memory toggle creates a `UserConversationConfig` via
`get_or_create` with `setting=None`, accessing
`config.setting.price_tier` crashes — now falls through to the default
chat model instead
## Root Cause
The "Enable Memories" toggle PATCH endpoint uses `get_or_create` on
`UserConversationConfig`, which can create a config with `setting=None`.
Both `get_chat_model()` and `aget_chat_model()` then crash:
- For subscribed users: `if config:` passes but `return config.setting`
returns `None`, causing downstream crashes
- For non-subscribed users: `config.setting.price_tier` raises
`AttributeError` on `None`
## Fix
Change `if config:` → `if config and config.setting:` (subscribed path)
and add `and config.setting` guard before `.price_tier` access
(non-subscribed path), in both sync and async variants.
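
A minimal sketch of the guarded lookup, for illustration only. The dataclasses, the `is_subscribed` parameter, the `"free"` tier value and `DEFAULT_CHAT_MODEL` are hypothetical stand-ins, not the actual Khoj models or adapter code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChatModel:
    # Hypothetical stand-in for the real chat model record
    name: str
    price_tier: str = "free"


@dataclass
class UserConversationConfig:
    # `setting` may be None when the config was created by the memory toggle
    setting: Optional[ChatModel] = None


DEFAULT_CHAT_MODEL = ChatModel(name="default-model")


def get_chat_model(config: Optional[UserConversationConfig], is_subscribed: bool) -> ChatModel:
    # Subscribed path: only use the config when it actually carries a setting
    if is_subscribed and config and config.setting:
        return config.setting
    # Non-subscribed path: guard `.setting` before touching `.price_tier`
    if config and config.setting and config.setting.price_tier == "free":
        return config.setting
    # Otherwise fall through to the default chat model
    return DEFAULT_CHAT_MODEL
```

With this shape, a config whose `setting` is `None` behaves the same as having no config at all.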
## Test plan
- [ ] Toggle memories off with no prior chat model configured — settings
page should still load
- [ ] Chat responses should use default model when setting is None
- [ ] Existing users with configured chat models should be unaffected
Fixes #1287
Signed-off-by: majiayu000 <1835304752@qq.com>
Starlette 1.0.0 removed the deprecated TemplateResponse signature
where `name` was the first positional arg and `request` was passed
inside `context`. The new signature requires `request` as the first
positional argument: TemplateResponse(request, name=...).
This caused a 500 error in production on web client endpoints with:
"Jinja2Templates.TemplateResponse() missing 1 required positional
argument: 'name'" (with older Starlette) or "'request'" (with 1.0.0).
Update all TemplateResponse calls in web_client.py to use the new
Starlette 1.0.0 signature: pass `request` as the first positional
arg and `name` as an explicit keyword argument.
The issue didn't trigger locally because uv is used locally while pip
is used in docker builds. The two resolve dependencies, including the
starlette version to install, differently. Locally starlette 0.52.0
was installed, while production used starlette 1.0.0. This version
mismatch is what caused the issue.
Add banner to home, chat, shared chat and settings pages for coverage.
Link to settings account section to export data and mention Khoj
self-host option in banner
- Add missing skipif decorator to test_create_automation
- Change skip condition from 'is None' to 'not' (falsy check) to
also handle empty string, which happens when GitHub secrets are
unavailable in fork PRs
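
A small sketch of why the falsy check matters. The function names and the `api_key` parameter are hypothetical; the real tests apply the condition inside a skipif decorator.

```python
def should_skip_old(api_key):
    # Old condition: only skips when the variable is entirely unset
    return api_key is None


def should_skip(api_key):
    # New condition: also skips on an empty string, which is what an
    # unavailable secret looks like in a fork PR
    return not api_key
```

An empty string passes the old check, so the test would run without credentials; the new check skips it.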
Changes (4 files):
- pyproject.toml: authlib 1.6.6 → 1.6.9
- src/interface/web/package.json: dompurify ^3.2.6 → ^3.3.2, eslint-config-next 14.2.3 → 14.2.35
- documentation/package.json: @docusaurus/* → ^3.9.2, added serialize-javascript resolution
And regenerated lock files.
The only resolution override is serialize-javascript in documentation,
which is unavoidable since Docusaurus still pins old
copy-webpack-plugin and css-minimizer-webpack-plugin that depend on
serialize-javascript ^6.x.
## Summary
`src/khoj/processor/content/org_mode/orgnode.py:57` opens a file with
`open(filename, "r")` but never closes it. The file handle leaks for the
lifetime of the returned `Orgnode` list.
## Fix
Replaced bare `open()` with a `with` statement to ensure the file is
closed after `makelist()` finishes reading.
```python
# Before
def makelist_with_filepath(filename):
    f = open(filename, "r")
    return makelist(f, filename)

# After
def makelist_with_filepath(filename):
    with open(filename, "r") as f:
        return makelist(f, filename)
```
This is safe because `makelist()` fully consumes the file during the
call (building the Orgnode list from file contents), so the file handle
is no longer needed after it returns.
When PyMuPDFLoader fails to process an invalid PDF file, the exception
is caught but pdf_entry_by_pages is referenced before assignment,
causing an UnboundLocalError.
Initialized pdf_entry_by_pages to an empty list before the try block so
the return statement always has a valid value, even when an exception
occurs.
Verified with both invalid input (returns []) and valid PDFs (returns
extracted text).
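
A minimal sketch of the pattern, assuming a hypothetical `load_pages` helper in place of PyMuPDFLoader; only the initialization before the `try` block reflects the described fix.

```python
import logging

logger = logging.getLogger(__name__)


def load_pages(pdf_bytes: bytes) -> list[str]:
    # Hypothetical stand-in for the PDF loader: rejects anything without a PDF header
    if not pdf_bytes.startswith(b"%PDF-"):
        raise ValueError("not a valid PDF")
    return ["page 1 text"]


def extract_text(pdf_bytes: bytes) -> list[str]:
    # Initialize before the try block so the return always has a value
    pdf_entry_by_pages: list[str] = []
    try:
        pdf_entry_by_pages = load_pages(pdf_bytes)
    except Exception as e:
        logger.warning("Unable to process PDF: %s", e)
    return pdf_entry_by_pages
```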
Fixes #1289
Co-authored-by: BillionClaw <267901332+BillionClaw@users.noreply.github.com>
## Problem
When `ChatModel.friendly_name` is `None`, the `__str__` method returns
`None`, causing:
```
TypeError: __str__ returned non-string (type NoneType)
```
## Solution
Fall back to `name` field when `friendly_name` is `None`.
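
A minimal sketch of the fallback, using a plain class rather than the actual Django model. Note that `or` also falls back when `friendly_name` is an empty string, which is an assumption about the desired behavior.

```python
class ChatModel:
    # Plain-Python stand-in for the model; only the two fields involved are shown
    def __init__(self, name, friendly_name=None):
        self.name = name
        self.friendly_name = friendly_name

    def __str__(self):
        # Always return a string: prefer friendly_name, else fall back to name
        return self.friendly_name or self.name
```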
Related issue: #1251
Co-authored-by: 阳虎 <yanghu@yanghudeMacBook-Pro.local>
## Summary
In `extract_from_webpage()`, the `content` parameter is unconditionally
overwritten to `None` on the line before the `is_none_or_empty(content)`
check. This means any pre-fetched content (e.g. text content already
retrieved by the Exa search engine) is always discarded, forcing an
unnecessary re-scrape of the webpage.
## Bug
```python
async def extract_from_webpage(
    url: str,
    subqueries: set[str] = None,
    content: str = None,  # <-- caller passes pre-fetched content
    ...
) -> Tuple[set[str], str, Union[None, str]]:
    content = None  # <-- BUG: immediately overwrites it
    if is_none_or_empty(content):  # always True
        content = await scrape_webpage_with_fallback(url)
```
## Fix
Remove the `content = None` assignment so the passed-in content is used
when available, falling back to scraping only when needed.
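
A simplified sketch of the intended behavior. `get_page_content`, `scrape_webpage` and the `scrape_calls` list are hypothetical stand-ins, and `is_none_or_empty` here is a reduced version of the real helper.

```python
import asyncio
from typing import Optional

scrape_calls = []  # records which URLs were actually scraped


async def scrape_webpage(url: str) -> str:
    # Hypothetical stand-in for the real scraper
    scrape_calls.append(url)
    return f"scraped:{url}"


def is_none_or_empty(value) -> bool:
    # Reduced stand-in for the real helper
    return value is None or len(value) == 0


async def get_page_content(url: str, content: Optional[str] = None) -> str:
    # Use pre-fetched content when present, scrape only as a fallback
    if is_none_or_empty(content):
        content = await scrape_webpage(url)
    return content
```

Calling `asyncio.run(get_page_content(url, "prefetched"))` returns the passed-in text without touching the scraper.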
This bug was introduced in a refactor and causes:
- Wasted API calls to web scrapers for pages whose content is already
available
- Increased latency for search results that include inline content (e.g.
Exa)
Signed-off-by: JiangNan <1394485448@qq.com>
## Summary
Fix a Python operator precedence bug in the `research()` function that
causes `current_iteration` to be set to a boolean instead of the actual
count of previous iterations.
## Bug
```python
if current_iteration := len(previous_iterations) > 0:
```
Python evaluates this as:
```python
if current_iteration := (len(previous_iterations) > 0): # assigns True or False
```
So `current_iteration` becomes `True` (1) or `False` (0) regardless of
how many previous iterations exist.
## Fix
```python
if (current_iteration := len(previous_iterations)) > 0:
```
With parentheses, `current_iteration` is correctly set to the count
(e.g. 4), and then compared to 0.
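
The difference can be checked in isolation. The two wrapper functions below are hypothetical and exist only to make the assigned value observable.

```python
def buggy_count(previous_iterations):
    # Walrus binds the whole comparison, so a bool is assigned
    if current_iteration := len(previous_iterations) > 0:
        return current_iteration
    return 0


def fixed_count(previous_iterations):
    # Parentheses bind the length first, then compare it to 0
    if (current_iteration := len(previous_iterations)) > 0:
        return current_iteration
    return 0
```

For a list of four items, `buggy_count` returns `True` while `fixed_count` returns `4`.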
## Impact
When resuming research with previous iterations, the loop counter was
effectively reset to 1 instead of the true count. This allowed the
research loop to run significantly more iterations than `MAX_ITERATIONS`
intended, wasting compute and API calls.
Signed-off-by: JiangNan <1394485448@qq.com>
Remove redundant SDK version check in LauncherActivity since both
branches set the same orientation value. This simplifies the code
without changing behavior.
Signed-off-by: Olexandr88 <radole1203@gmail.com>
## Summary
- Fixes AttributeError: 'str' object has no attribute 'iter_content' in
text_to_speech endpoint
- When `ELEVEN_LABS_API_KEY` is not configured, the function was
returning a string instead of a Response object
## Changes
- Introduced `TextToSpeechError` exception class in `text_to_speech.py`
- Changed `generate_text_to_speech` to raise exception instead of
returning error string
- Updated API endpoint to catch the exception and return HTTP 501 (Not
Implemented)
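
A framework-free sketch of the raise-and-catch flow. The `(status, body)` tuple stands in for the real HTTP response, and the `api_key` parameter and placeholder audio bytes are assumptions for illustration.

```python
class TextToSpeechError(Exception):
    """Raised when text to speech cannot be generated."""


def generate_text_to_speech(text, api_key=None):
    # Raise instead of returning an error string the caller could mistake for audio
    if not api_key:
        raise TextToSpeechError("Eleven Labs API key is not configured")
    return b"audio-bytes"  # placeholder for the streamed audio response


def text_to_speech_endpoint(text, api_key=None):
    # Returns a (status, body) tuple as a stand-in for an HTTP response
    try:
        audio = generate_text_to_speech(text, api_key)
    except TextToSpeechError as e:
        return 501, str(e)
    return 200, audio
```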
## Test plan
- [x] Code passes ruff lint check
- [ ] Manual testing with and without Eleven Labs API key configured
Fixes #1049
---------
Signed-off-by: majiayu000 <1835304752@qq.com>
Co-authored-by: Debanjum <debanjum@gmail.com>
Add a "Copy References" button to the references pane in the web app.
In ReferencePanel Component
- Add a "Copy References" button to the `ReferencePanel` component.
- Implement functionality to copy all references (notes, online, and
code) as a markdown bullet list.
- Update the `TeaserReferencesSection` component to include the "Copy
References" button.
- Show copied to clipboard indicator when references copied on button click
Closes #1021
---------
Co-authored-by: Debanjum <debanjum@gmail.com>
- When you type in the search modal and the query matches the pattern
`file:`, you see a list of all files, both vault and non-vault
- This list is filtered down as you type more letters
### Technical Details
- Added file filter mode (`isFileFilterMode` state) to filter search
results by specific files
- Updated `getSuggestions()` function to search files from vault and
non-vault sources via the khoj backend.
- Updated the selection behavior to handle both file selection and
search result selection
Closes https://github.com/khoj-ai/khoj/issues/1025
---------
Co-authored-by: Debanjum <debanjum@gmail.com>
### **feat(obsidian): Enhance Sync Experience with Progress Bars and Bug
Fixes**
This pull request significantly improves the content synchronization
experience for Obsidian users by fixing a critical bug and introducing
new UI elements for better feedback and monitoring.
The previous implementation could fail with `403 Forbidden` errors when
syncing a large number of files due to server-side rate limiting. This
update addresses that issue and provides users with clear, real-time
feedback on storage usage and sync progress.
---
### Key Changes
* **Improve Sync Robustness**
Refactor `updateContentIndex` to sync files prioritized by file type
(md > pdf > image) and batched by size (10 MB) and item limits (50 items).
This respects server rate limits and ensures that large vaults can be
indexed reliably without triggering `403` errors.
* **Show Cloud Storage Usage Bar**
A progress bar has been added to the settings page to display cloud
storage usage.
* **Total Limit**: The storage limit (**10 MB** for free, **500 MB** for
premium) is now reliably determined by the `is_active` flag returned
from the `/api/v1/user` endpoint, eliminating fragile client-side
heuristics.
* **Used Space**: The used space is calculated via a **client-side
estimation** of all files configured for synchronization. This provides
a clear and immediate indicator of the vault's storage footprint.
* **Show Real-time Sync Progress Bar**
When a manual sync is triggered via the "Force Sync" button, a progress
bar now appears, providing real-time feedback on the operation.
* It displays the number of files processed against the total number of
files to be indexed or deleted.
* This is implemented using a **callback mechanism** (`onProgress`) to
cleanly communicate progress from the sync logic (`utils.ts`) to the UI
(`settings.ts`) without coupling them.
* **Auto-refresh Storage Used After Sync**
The Cloud Storage Usage bar is now automatically refreshed upon the
completion of a "Force Sync". This ensures the user immediately sees the
updated storage estimation without needing to reopen the settings panel.
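
The batching and progress callback described above can be sketched roughly as follows. The plugin itself is TypeScript; this Python version uses hypothetical names (`SyncFile`, `make_batches`, `sync`, `upload`) and assumes 10 MB means 10 * 1024 * 1024 bytes.

```python
from dataclasses import dataclass

MAX_BATCH_BYTES = 10 * 1024 * 1024  # size limit per batch (assumed byte value)
MAX_BATCH_ITEMS = 50                # item limit per batch
TYPE_PRIORITY = {"md": 0, "pdf": 1, "image": 2}


@dataclass
class SyncFile:
    path: str
    kind: str  # "md", "pdf", "image", ...
    size: int  # bytes


def make_batches(files, max_bytes=MAX_BATCH_BYTES, max_items=MAX_BATCH_ITEMS):
    # Sort by file type priority; unknown kinds go last
    ordered = sorted(files, key=lambda f: TYPE_PRIORITY.get(f.kind, len(TYPE_PRIORITY)))
    batches, current, current_bytes = [], [], 0
    for f in ordered:
        over_size = current_bytes + f.size > max_bytes
        over_count = len(current) >= max_items
        # Close the current batch before it would exceed either limit
        if current and (over_size or over_count):
            batches.append(current)
            current, current_bytes = [], 0
        current.append(f)
        current_bytes += f.size
    if current:
        batches.append(current)
    return batches


def sync(files, upload, on_progress=None):
    # Upload batch by batch, reporting (processed, total) after each one
    total, done = len(files), 0
    for batch in make_batches(files):
        upload(batch)
        done += len(batch)
        if on_progress:
            on_progress(done, total)
```

In this sketch a single file larger than the size limit still gets a batch of its own rather than being dropped.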
---
### Visuals
<img width="980" height="237" alt="image"
src="https://github.com/user-attachments/assets/2b3ce420-766b-476f-9fc0-c6b38c0226fb"
/>
---------
Co-authored-by: Debanjum <debanjum@gmail.com>
# Motivation
A major component of useful AI systems is adaptation to the user
context. This is a major reason why we'd enabled syncing knowledge
bases. The next step in this direction is to dynamically update the
evolving state of the user as conversations take place across time and
topics. This allows for more personalized conversations and to maintain
context across conversations.
# Overview
This change introduces medium and long term memories in Khoj.
- The scope of a conversation can be thought of as short term memory.
- Medium term memory extends to the past week.
- Long term memory extends to anytime in the past, where a search query
results in a match.
# Details
- Enable user to view and manage agent generated memories from their
settings page
- Fully integrate the memory object into all downstream usage, from
image generation, notes extraction, online search, etc.
- Scope memory per agent. The default agent has access to memories
created by other agents as well.
- Enable users and admins to enable/disable Khoj's memory system
---------
Co-authored-by: Debanjum <debanjum@gmail.com>
Fix
- Ensure researcher and coder know to save files to /home/user dir
- Make E2B code executor check for generated files in /home/user
- Do not re-add file types already downloaded from /home/user
Issues
- E2B has a mismatch in default home_dir for run_code & list_dir cmds
So run_code was run with /root as home dir. And list_dir("~") was
checking under /home/user. This caused files written to /home/user
by code not to be discovered by the list_files step.
- Previously the researcher did not know that generated files should
be written to /home/user. So it could tell the coder to save files to
a different directory. Now the researcher knows where to save files to
show them to user as well.
- Add excludeFolders field to KhojSetting interface
- Rename 'Sync Folders' to 'Include Folders' for clarity
- Add 'Exclude Folders' UI section with folder picker
- Filter out excluded folders during content sync
- Show file counts when syncing (X of Y files)
- Prevent excluding root folder
This allows users to exclude specific directories (e.g., Inbox,
Highlights) from being indexed, while the existing Include Folders acts
as a whitelist.
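
A rough sketch of the include/exclude filtering. The plugin itself is TypeScript; this Python version uses hypothetical helper names and assumes an empty include list means "sync everything".

```python
from pathlib import PurePosixPath


def is_under(path: str, folder: str) -> bool:
    # True when `folder` is an ancestor directory of `path`
    return PurePosixPath(folder) in PurePosixPath(path).parents


def should_sync(path: str, include_folders, exclude_folders) -> bool:
    # Exclusions win over inclusions
    if any(is_under(path, folder) for folder in exclude_folders):
        return False
    # Assumption: an empty include list means the whole vault is included
    if not include_folders:
        return True
    return any(is_under(path, folder) for folder in include_folders)
```

Comparing path components rather than string prefixes keeps a folder like `InboxArchive` from being caught by an `Inbox` exclusion.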
---------
Co-authored-by: Debanjum <debanjum@gmail.com>
This change had been removed in 9a8c707 to avoid overwrites. We now
use random filename for generated files to avoid overwrite from
subsequent runs.
Encourage model to write code that writes files in home folder to
capture with logical filenames.
Add khoj app landing page to khoj monorepo. Show it in a more natural
place: when users who are not logged in open the khoj app home page.
Authenticated users still see the logged in home page experience.
Delete old login html page. Login via popup on home is the single,
unified login experience.
Have docs mention khoj home url, no need to mention /login as login
popup shows on home page too