Commit graph

5168 commits

Author SHA1 Message Date
Debanjum
9258f57dce Release Khoj version 2.0.0-beta.28 2026-03-26 09:03:47 +05:30
Debanjum
171ac5d243 Add deprecation banner to top of web landing page as well 2026-03-26 08:59:05 +05:30
Debanjum
fdd5fd8f74 Fix getting billing config to show deprecation banner on Khoj cloud 2026-03-26 08:59:05 +05:30
Debanjum
8965db7087 Bump server, client dependencies 2026-03-26 08:59:05 +05:30
lif
8b8504edb8
Fix AttributeError when memories disabled and setting is None (#1296)
## Summary
- Add null checks for `config.setting` in `get_chat_model()` and
`aget_chat_model()` to prevent `AttributeError` when memories are
disabled
- When the memory toggle creates a `UserConversationConfig` via
`get_or_create` with `setting=None`, accessing
`config.setting.price_tier` crashes — now falls through to the default
chat model instead

## Root Cause
The "Enable Memories" toggle PATCH endpoint uses `get_or_create` on
`UserConversationConfig`, which can create a config with `setting=None`.
Both `get_chat_model()` and `aget_chat_model()` then crash:
- For subscribed users: `if config:` passes but `return config.setting`
returns `None`, causing downstream crashes
- For non-subscribed users: `config.setting.price_tier` raises
`AttributeError` on `None`

## Fix
Change `if config:` → `if config and config.setting:` (subscribed path)
and add `and config.setting` guard before `.price_tier` access
(non-subscribed path), in both sync and async variants.
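
The guard described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `ChatModelStub`, `ConfigStub`, `pick_chat_model`, `DEFAULT_MODEL`, and the `"free"` tier comparison are hypothetical stand-ins for the real Django models and helpers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChatModelStub:
    # Hypothetical stand-in for a chat model row
    name: str
    price_tier: str = "free"


@dataclass
class ConfigStub:
    # Mirrors a config row created via get_or_create with setting=None
    setting: Optional[ChatModelStub] = None


DEFAULT_MODEL = ChatModelStub(name="default")


def pick_chat_model(config: Optional[ConfigStub], subscribed: bool) -> ChatModelStub:
    # Subscribed path: only trust the config when it actually carries a setting
    if subscribed and config and config.setting:
        return config.setting
    # Non-subscribed path: guard before touching .price_tier (tier check is assumed)
    if config and config.setting and config.setting.price_tier == "free":
        return config.setting
    # Otherwise fall through to the default chat model
    return DEFAULT_MODEL
```

With this shape, a config whose `setting` is `None` never reaches an attribute access and the default model is returned instead.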

## Test plan
- [ ] Toggle memories off with no prior chat model configured — settings
page should still load
- [ ] Chat responses should use default model when setting is None
- [ ] Existing users with configured chat models should be unaffected

Fixes #1287

Signed-off-by: majiayu000 <1835304752@qq.com>
2026-03-26 08:26:25 +05:30
Debanjum
7475a781bc Release Khoj version 2.0.0-beta.27 2026-03-26 01:40:02 +05:30
Debanjum
a19e7acd5a Fix TemplateResponse calls to be compatible with Starlette 1.0.0
Starlette 1.0.0 removed the deprecated TemplateResponse signature
where `name` was the first positional arg and `request` was passed
inside `context`. The new signature requires `request` as the first
positional argument: TemplateResponse(request, name=...).

This caused a 500 error in production on web client endpoints with:
"Jinja2Templates.TemplateResponse() missing 1 required positional
argument: 'name'" (with older Starlette) or "'request'" (with 1.0.0).

Update all TemplateResponse calls in web_client.py to use the new
Starlette 1.0.0 signature: pass `request` as the first positional
arg and `name` as an explicit keyword argument.

The issue didn't trigger locally because uv is used locally and pip in docker
builds. These resolve dependencies, including the starlette version to
install, differently. Locally 0.52.0 was installed, while production
used starlette 1.0.0. This mismatch is what caused the issue.
2026-03-26 01:38:41 +05:30
Debanjum
b8797e00fa Release Khoj version 2.0.0-beta.26 2026-03-25 20:37:55 +05:30
Debanjum
f356386f3a Ignore dup file errors for pypi wheel validation. Expected Next 15 behavior 2026-03-25 20:30:12 +05:30
Debanjum
d4df9a73ec Use next Link instead of raw a html tags to wrap more links on web app 2026-03-25 20:07:43 +05:30
Debanjum
f7bce48934 Show Khoj cloud deprecation banner on web app to Khoj cloud users
Add banner to home, chat, shared chat and settings pages for coverage.
Link to settings account section to export data and mention Khoj
self-host option in banner
2026-03-25 19:31:53 +05:30
Debanjum
b8f82b27f5 Use next Link instead of raw a html tags to wrap Khoj home logo link 2026-03-25 19:14:59 +05:30
Debanjum
a9749c7184 Upgrade to Next.js 15 for web app 2026-03-25 18:32:48 +05:30
Debanjum
51a56af7ca Skip automation tests when GEMINI_API_KEY is not set
- Add missing skipif decorator to test_create_automation
- Change skip condition from 'is None' to 'not' (falsy check) to
  also handle empty string, which happens when GitHub secrets are
  unavailable in fork PRs
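
The difference between the two skip conditions can be illustrated with plain Python. This is only a sketch: `should_skip_old` and `should_skip_new` are hypothetical helper names, and the real tests presumably pass the condition to a skip decorator instead.

```python
from typing import Optional


def should_skip_old(api_key: Optional[str]) -> bool:
    # Old condition: skips only when the variable is missing entirely
    return api_key is None


def should_skip_new(api_key: Optional[str]) -> bool:
    # New condition: a falsy check also skips on an empty string
    return not api_key
```

An empty string, as seen when secrets are unavailable, is skipped only by the falsy check.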
2026-03-25 18:09:24 +05:30
Debanjum
7264ebf533 Bump package dependencies
Changes (4 files):
- pyproject.toml: authlib 1.6.6 → 1.6.9
- src/interface/web/package.json: dompurify ^3.2.6 → ^3.3.2, eslint-config-next 14.2.3 → 14.2.35
- documentation/package.json: @docusaurus/* → ^3.9.2, added serialize-javascript resolution

And regenerated lock files.

The only resolution override is serialize-javascript in documentation,
which is unavoidable since Docusaurus still pins old
copy-webpack-plugin and css-minimizer-webpack-plugin that depend on
serialize-javascript ^6.x.
2026-03-25 18:09:12 +05:30
Tay
0e169159f8
Close leaked file handle in orgnode parser (#1284)
## Summary

`src/khoj/processor/content/org_mode/orgnode.py:57` opens a file with
`open(filename, "r")` but never closes it. The file handle leaks for the
lifetime of the returned `Orgnode` list.

## Fix

Replaced bare `open()` with a `with` statement to ensure the file is
closed after `makelist()` finishes reading.

```python
# Before
def makelist_with_filepath(filename):
    f = open(filename, "r")
    return makelist(f, filename)

# After
def makelist_with_filepath(filename):
    with open(filename, "r") as f:
        return makelist(f, filename)
```

This is safe because `makelist()` fully consumes the file during the
call (building the Orgnode list from file contents), so the file handle
is no longer needed after it returns.
2026-03-25 18:03:20 +05:30
BillionToken
530443a4f6
Fix UnboundLocalError in PdfToEntries.extract_text when PDF processing fails (#1292)
When PyMuPDFLoader fails to process an invalid PDF file, the exception
is caught but pdf_entry_by_pages is referenced before assignment, 
causing an UnboundLocalError.

Initialized pdf_entry_by_pages to an empty list before the try block so 
the return statement always has a valid value, even when an exception
occurs.

Verified with both invalid input (returns []) and valid PDFs (returns
extracted text).
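
A minimal sketch of the pattern, assuming a simplified loader. `extract_text_sketch`, `load_pages`, and `broken_loader` are hypothetical names used only for illustration, not the actual `PdfToEntries` code:

```python
from typing import Callable, List


def extract_text_sketch(load_pages: Callable[[], List[str]]) -> List[str]:
    # Initialize before the try block so the return below always has a value
    pages: List[str] = []
    try:
        pages = load_pages()
    except Exception as error:
        # Assumed handling: report the failure and fall through with the empty list
        print(f"Unable to process PDF: {error}")
    return pages


def broken_loader() -> List[str]:
    # Simulates a loader that fails on an invalid PDF
    raise ValueError("invalid PDF")
```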

Fixes #1289

Co-authored-by: BillionClaw <267901332+BillionClaw@users.noreply.github.com>
2026-03-25 17:47:50 +05:30
yang1002378395-cmyk
e863126140
fix: ChatModel.__str__ returns None when friendly_name is null (#1277)
## Problem
When `ChatModel.friendly_name` is `None`, the `__str__` method returns
`None`, causing:
```
TypeError: __str__ returned non-string (type NoneType)
```

## Solution
Fall back to `name` field when `friendly_name` is `None`.
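
One way the fallback could look, shown on a plain dataclass rather than the Django model. `ChatModelSketch` is a hypothetical stand-in, and the exact expression used in the actual fix may differ:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChatModelSketch:
    name: str
    friendly_name: Optional[str] = None

    def __str__(self) -> str:
        # Always return a string: prefer friendly_name, else fall back to name
        return self.friendly_name if self.friendly_name is not None else self.name
```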

Related issue: #1251

Co-authored-by: 阳虎 <yanghu@yanghudeMacBook-Pro.local>
2026-03-20 00:58:44 +05:30
jnMetaCode
678549c6b0
Fix extract_from_webpage discarding pre-fetched content (#1269)
## Summary

In `extract_from_webpage()`, the `content` parameter is unconditionally
overwritten to `None` on the line before the `is_none_or_empty(content)`
check. This means any pre-fetched content (e.g. text content already
retrieved by the Exa search engine) is always discarded, forcing an
unnecessary re-scrape of the webpage.

## Bug

```python
async def extract_from_webpage(
    url: str,
    subqueries: set[str] = None,
    content: str = None,     # <-- caller passes pre-fetched content
    ...
) -> Tuple[set[str], str, Union[None, str]]:
    content = None            # <-- BUG: immediately overwrites it
    if is_none_or_empty(content):  # always True
        content = await scrape_webpage_with_fallback(url)
```

## Fix

Remove the `content = None` assignment so the passed-in content is used
when available, falling back to scraping only when needed.

This bug was introduced in a refactor and causes:
- Wasted API calls to web scrapers for pages whose content is already
available
- Increased latency for search results that include inline content (e.g.
Exa)
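
The intended behavior after the fix can be sketched as below. This is an illustration under assumptions: `get_content_sketch`, `scrape_webpage_stub`, and `scrape_calls` are hypothetical stand-ins, and a plain falsy check replaces `is_none_or_empty`.

```python
import asyncio
from typing import List, Optional

# Records which URLs the stub scraper was asked to fetch
scrape_calls: List[str] = []


async def scrape_webpage_stub(url: str) -> str:
    # Hypothetical stand-in for the real scraper
    scrape_calls.append(url)
    return f"scraped:{url}"


async def get_content_sketch(url: str, content: Optional[str] = None) -> str:
    # Use pre-fetched content when available, scrape only when it is missing
    if not content:
        content = await scrape_webpage_stub(url)
    return content
```

Calling `asyncio.run(get_content_sketch(url, content))` with non-empty content returns it unchanged and leaves `scrape_calls` untouched.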

Signed-off-by: JiangNan <1394485448@qq.com>
2026-03-17 10:33:52 +05:30
jnMetaCode
6735d33af2
Fix operator precedence in research iteration counter (#1271)
## Summary

Fix a Python operator precedence bug in the `research()` function that
causes `current_iteration` to be set to a boolean instead of the actual
count of previous iterations.

## Bug

```python
if current_iteration := len(previous_iterations) > 0:
```

Python evaluates this as:
```python
if current_iteration := (len(previous_iterations) > 0):  # assigns True or False
```

So `current_iteration` becomes `True` (1) or `False` (0) regardless of
how many previous iterations exist.

## Fix

```python
if (current_iteration := len(previous_iterations)) > 0:
```

With parentheses, `current_iteration` is correctly set to the count
(e.g. 4), and then compared to 0.

## Impact

When resuming research with previous iterations, the loop counter was
effectively reset to 1 instead of the true count. This allowed the
research loop to run significantly more iterations than `MAX_ITERATIONS`
intended, wasting compute and API calls.

Signed-off-by: JiangNan <1394485448@qq.com>
2026-03-17 10:28:56 +05:30
Olexandr88
0e9878c070
Remove redundant SDK version check in LauncherActivity of Android app (#1263)
Remove redundant SDK version check in LauncherActivity since both
branches set the same orientation value. This simplifies the code
without changing behavior

Signed-off-by: Olexandr88 <radole1203@gmail.com>
2026-03-06 12:10:20 +05:30
layla
2c82967807
Fix typos in telemetry error message and comment (#1265)
Fix spelling typos in telemetry.py. Corrects 'recieved' to 'received'
and 'equest' to 'request' in comments and error messages.
2026-03-06 12:04:37 +05:30
Debanjum
17be2d4800
Update Pipali project announcement in README 2026-03-06 12:02:50 +05:30
Debanjum
aeea140099
Update What's New in Readme - Mention Pipali release 2026-03-05 21:31:07 -08:00
saba imran
b864cb1f30 only show the payment card on the settings page if the user is subscribed 2026-03-02 10:35:09 -08:00
Debanjum
0b8cf5112f Drop trailing slash to get memories via api on web app in production
Trailing slash in api calls to server doesn't work in production
behind proxy, only in local next.js dev server.
2026-02-24 10:46:42 -08:00
Debanjum
94bae4789a Release Khoj version 2.0.0-beta.25 2026-02-22 11:56:01 -08:00
lif
5a51f17a71
Fix AttributeError when Eleven Labs API key is not set (#1238)
## Summary
- Fixes AttributeError: 'str' object has no attribute 'iter_content' in
text_to_speech endpoint
- When `ELEVEN_LABS_API_KEY` is not configured, the function was
returning a string instead of a Response object

## Changes
- Introduced `TextToSpeechError` exception class in `text_to_speech.py`
- Changed `generate_text_to_speech` to raise exception instead of
returning error string
- Updated API endpoint to catch the exception and return HTTP 501 (Not
Implemented)
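
A rough sketch of the raise-and-catch shape described above. `TextToSpeechError` and `generate_text_to_speech` are named in this commit, but the signatures, the placeholder audio bytes, and the `text_to_speech_endpoint_sketch` helper returning a `(status, body)` tuple are assumptions made for illustration:

```python
from typing import Optional, Tuple


class TextToSpeechError(Exception):
    """Raised when speech cannot be generated."""


def generate_text_to_speech(text: str, api_key: Optional[str]) -> bytes:
    # Raise instead of returning an error string the caller might misuse
    if not api_key:
        raise TextToSpeechError("Eleven Labs API key is not configured")
    return text.encode("utf-8")  # placeholder for real audio bytes


def text_to_speech_endpoint_sketch(text: str, api_key: Optional[str]) -> Tuple[int, bytes]:
    # Simplified endpoint: map the exception to HTTP 501 (Not Implemented)
    try:
        return 200, generate_text_to_speech(text, api_key)
    except TextToSpeechError as error:
        return 501, str(error).encode("utf-8")
```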

## Test plan
- [x] Code passes ruff lint check
- [ ] Manual testing with and without Eleven Labs API key configured

Fixes #1049

---------

Signed-off-by: majiayu000 <1835304752@qq.com>
Co-authored-by: Debanjum <debanjum@gmail.com>
2026-02-23 01:23:57 +05:30
Koshik Debanath
b0cd8dc8fd
Add ability to copy references from Web App (#1144)
Add a "Copy References" button to the references pane in the web app.

In ReferencePanel Component
- Add a "Copy References" button to the `ReferencePanel` component.
- Implement functionality to copy all references (notes, online, and
code) as a markdown bullet list.
- Update the `TeaserReferencesSection` component to include the "Copy
References" button.
- Show copied to clipboard indicator when references copied on button click

Closes #1021

---------

Co-authored-by: Debanjum <debanjum@gmail.com>
2026-02-23 01:11:45 +05:30
Sam Ho
0ba0f3d0f2
Show autocomplete suggestions for File Query Filters on Obsidian App (#1128)
- When you type in the search modal and the query matches the pattern `file:`, you
should see a list of all files, both vault and non-vault
- This list is filtered down as you type more letters


### Technical Details

- Added file filter mode (`isFileFilterMode` state) to filter search
results by specific files
- Updated `getSuggestions()` function to search files from vault and
non-vault via the Khoj backend.
- Updated the selection behavior to handle both file selection and
search result selection

Closes https://github.com/khoj-ai/khoj/issues/1025

---------

Co-authored-by: Debanjum <debanjum@gmail.com>
2026-02-23 00:33:17 +05:30
Debanjum
60a2f6d4da Constrain setuptools to resolve installing server in ci
Error is due to openai-whisper depending on pkg_resources, which setuptools
seems to have dropped.

See https://github.com/pypa/setuptools/issues/5174 for reference

Also bump pillow version
2026-02-22 10:48:04 -08:00
Debanjum
9dfef3f40b Bump more server dependencies and update uv lock file too 2026-02-22 10:28:53 -08:00
Debanjum
44b9240253 Bump server, desktop, obsidian and docs dependencies 2026-02-22 09:19:50 -08:00
Debanjum
21c51b9ace Ensure only serve home landing page files under specified path 2026-02-22 09:19:41 -08:00
Debanjum
9cbf620e45 Retry (with fallback) on Gemini fails with internal server error
Khoj should use a fallback model when available to retry request if
calls to Gemini fail with internal server error.
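
The retry behavior can be sketched generically as follows. Nothing here is the actual Khoj or Gemini client code: `InternalServerError`, `call_with_fallback`, `flaky_send`, and the model names are hypothetical.

```python
from typing import Callable, Optional


class InternalServerError(Exception):
    """Hypothetical stand-in for a provider's internal server error."""


def call_with_fallback(
    send: Callable[[str], str], primary: str, fallback: Optional[str] = None
) -> str:
    try:
        return send(primary)
    except InternalServerError:
        if fallback is None:
            raise  # no fallback available, surface the original error
        return send(fallback)  # retry the same request on the fallback model


def flaky_send(model: str) -> str:
    # Simulated client: the primary model always fails in this sketch
    if model == "primary-model":
        raise InternalServerError("500")
    return f"response from {model}"
```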
2026-01-06 12:14:08 -08:00
Henri Jamet
1fd6e16cff
Improve Obsidian Batch Sync. Show Progress, Storage Used on Settings Page (#1221)
### **feat(obsidian): Enhance Sync Experience with Progress Bars and Bug
Fixes**

This pull request significantly improves the content synchronization
experience for Obsidian users by fixing a critical bug and introducing
new UI elements for better feedback and monitoring.

The previous implementation could fail with `403 Forbidden` errors when
syncing a large number of files due to server-side rate limiting. This
update addresses that issue and provides users with clear, real-time
feedback on storage usage and sync progress.

---
### Key Changes

* **Improve Sync Robustness**
Refactor `updateContentIndex` to sync files prioritized by file type
(md > pdf > image) and batched by size (10 MB) and item limits (50 items).
This respects server rate limits and ensures that large vaults can be
indexed reliably without triggering `403` errors.

* **Show Cloud Storage Usage Bar**
A progress bar has been added to the settings page to display cloud
storage usage.
* **Total Limit**: The storage limit (**10 MB** for free, **500 MB** for
premium) is now reliably determined by the `is_active` flag returned
from the `/api/v1/user` endpoint, eliminating fragile client-side
heuristics.
* **Used Space**: The used space is calculated via a **client-side
estimation** of all files configured for synchronization. This provides
a clear and immediate indicator of the vault's storage footprint.

* **Show Real-time Sync Progress Bar**
When a manual sync is triggered via the "Force Sync" button, a progress
bar now appears, providing real-time feedback on the operation.
* It displays the number of files processed against the total number of
files to be indexed or deleted.
* This is implemented using a **callback mechanism** (`onProgress`) to
cleanly communicate progress from the sync logic (`utils.ts`) to the UI
(`settings.ts`) without coupling them.

* **Auto-refresh Storage Used After Sync**
The Cloud Storage Usage bar is now automatically refreshed upon the
completion of a "Force Sync". This ensures the user immediately sees the
updated storage estimation without needing to reopen the settings panel.
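
The prioritized batching described under "Improve Sync Robustness" could look roughly like this. It is a Python sketch of the idea, not the plugin's TypeScript: `batch_files`, the `FileInfo` tuple layout, and treating the size limit as 10 * 1024 * 1024 bytes are assumptions.

```python
from typing import Dict, List, Tuple

# Lower number syncs first: md > pdf > image
PRIORITY: Dict[str, int] = {"md": 0, "pdf": 1, "image": 2}
MAX_BATCH_BYTES = 10 * 1024 * 1024  # assumed byte value of the size limit
MAX_BATCH_ITEMS = 50

FileInfo = Tuple[str, str, int]  # (path, kind, size in bytes)


def batch_files(files: List[FileInfo]) -> List[List[FileInfo]]:
    # Stable sort keeps the original order within each file type
    ordered = sorted(files, key=lambda f: PRIORITY.get(f[1], len(PRIORITY)))
    batches: List[List[FileInfo]] = []
    current: List[FileInfo] = []
    current_bytes = 0
    for file in ordered:
        size = file[2]
        over_size = current_bytes + size > MAX_BATCH_BYTES
        over_count = len(current) >= MAX_BATCH_ITEMS
        # Close the running batch before it would exceed either limit
        if current and (over_size or over_count):
            batches.append(current)
            current, current_bytes = [], 0
        current.append(file)
        current_bytes += size
    if current:
        batches.append(current)
    return batches
```

A single file larger than the size limit still gets its own batch in this sketch; how the plugin treats that case is not stated here.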

---
### Visuals

<img width="980" height="237" alt="image"
src="https://github.com/user-attachments/assets/2b3ce420-766b-476f-9fc0-c6b38c0226fb"
/>

---------

Co-authored-by: Debanjum <debanjum@gmail.com>
2026-01-03 13:04:46 +05:30
sabaimran
d6c2d1fa49
Give Khoj Long Term Memories (#1168)
# Motivation
A major component of useful AI systems is adaptation to the user
context. This is a major reason why we'd enabled syncing knowledge
bases. The next step in this direction is to dynamically update the
evolving state of the user as conversations take place across time and
topics. This allows for more personalized conversations and to maintain
context across conversations.

# Overview
This change introduces medium and long term memories in Khoj. 
- The scope of a conversation can be thought of as short term memory. 
- Medium term memory extends to the past week.
- Long term memory extends to anytime in the past, where a search query
results in a match.

# Details
- Enable user to view and manage agent generated memories from their
settings page
- Fully integrate the memory object into all downstream usage, from
image generation, notes extraction, online search, etc.
- Scope memory per agent. The default agent has access to memories
created by other agents as well.
- Enable users and admins to enable/disable Khoj's memory system

---------

Co-authored-by: Debanjum <debanjum@gmail.com>
2026-01-03 09:07:05 +05:30
Debanjum
d55a00288b Release Khoj version 2.0.0-beta.24 2026-01-01 19:59:10 -08:00
Debanjum
ff4b9f3502 Render mermaid diagram wrapped in markdown codeblocks on web app 2026-01-01 19:57:49 -08:00
Debanjum
19900e42ef Fix registering subscription payment failures 2026-01-01 19:57:49 -08:00
Debanjum
a58ae3dd84 Make LLM actors write & code sandbox check for artifacts in /home/user
Fix
- Ensure researcher and coder know to save files to /home/user dir
- Make E2B code executor check for generated files in /home/user
- Do not re-add file types already downloaded from /home/user

Issues
- E2B has a mismatch in default home_dir for run_code & list_dir cmds.
So run_code was run with /root as home dir. And list_dir("~") was
checking under /home/user. This caused files written to /home/user
by code not to be discovered by the list_files step.
- Previously the researcher did not know that generated files should
be written to /home/user. So it could tell the coder to save files to
a different directory. Now the researcher knows where to save files to
show them to user as well.
2025-12-29 14:57:37 -08:00
Debanjum
b607a6187e Release Khoj version 2.0.0-beta.23 2025-12-29 01:42:25 -08:00
Boris Smus
f413ce7354
Enable excluding folders to sync from obsidian plugin settings (#1235)
- Add excludeFolders field to KhojSetting interface
- Rename 'Sync Folders' to 'Include Folders' for clarity
- Add 'Exclude Folders' UI section with folder picker
- Filter out excluded folders during content sync
- Show file counts when syncing (X of Y files)
- Prevent excluding root folder

This allows users to exclude specific directories (e.g., Inbox,
Highlights) from being indexed, while the existing Include Folders acts
as a whitelist.

---------

Co-authored-by: Debanjum <debanjum@gmail.com>
2025-12-29 15:09:19 +05:30
Debanjum
9b9cdc756f Capture more files generated by code execution in sandbox
This change had been removed in 9a8c707 to avoid overwrites. We now
use random filename for generated files to avoid overwrite from
subsequent runs.

Encourage model to write code that writes files in home folder to
capture with logical filenames.
2025-12-29 00:57:17 -08:00
Debanjum
c5650f166a Make Nano Banana Pro output 2K resolution images 2025-12-29 00:57:17 -08:00
Debanjum
1b7ccd141d Harden the user check of the Notion integration 2025-12-29 00:57:17 -08:00
Debanjum
b8eeefa0b1 Bump web app package dependencies 2025-12-29 00:57:17 -08:00
Debanjum
9801ffd2de Add Khoj app landing page. Show it when unauthenticated users open app
Add khoj app landing page to khoj monorepo. Show it in a more natural
place: when non-logged-in users open the khoj app home page.

Authenticated users still see logged in home page experience.
2025-12-29 00:57:17 -08:00
Debanjum
5e65754a8b Unify login via popup on home. No need for separate login html page.
Delete old login html page. Login via popup on home is the single,
unified login experience.

Have docs mention khoj home url, no need to mention /login as login
popup shows on home page too
2025-12-29 00:57:17 -08:00
Debanjum
f65f6ae848 Fix streaming thoughts from multi-turn tools after parallel tool calling
Single turn tools are still executed in parallel. Multi turn tools
like operator are executed in serial.
2025-12-29 00:57:17 -08:00