Integrate Find (#1834)

## Purpose

integrate Find to Docs

## Proposal

- [x]  add a `useSeachDocs` hook in charged of calling the search
endpoint.
- [x]  add a optional `path` param to the `search` route. This param
represents the parent document path in case of a sub-documents
(descendants) search.
- [x] ️return Indexer results directly without DB calls to retrieve the
Document objects. All informations necessary for display are indexed in
Find. We can skip the DB calls and improve performance.
- [x] ♻️ refactor react `DocSearchContent` components.
`DocSearchContent` and `DocSearchSubContent` are now merged a unique
component handling all search scenarios and relying on the unique
`search` route.
- [x] 🔥remove pagination logic in the Indexer. Removing the DB calls
also removes the DRF queryset object which handles the pagination. Also
we consider pagination not to be necessary for search v1.
- [x] 🔥remove the `document/<document_id>/descendants` route. This route
is not used anymore. The logic of finding the descendants are moved to
the internal `_list_descendants` method. This method is based on the
parent `path` instead of the parent `id` which has some consequence
about the user access management. Relying on the path prevents the use
of the `self.get_object()` method which used to handle the user access
logic.
- [x] handle fallback logic on DRF based title search in case of
non-configured, badly configured or failing at run time indexer.
- [x] handle language extension in `title` field. Find returns titles
with a language extension (ex: `{ title.fr: "rapport d'activité" }`
instead of `{ "title": "rapport d'activité" }`.
- [x] 🔧 add a `common.test` file to allow running the tests without
docker
- [x] ♻️ rename `SearchIndexer` -> `FindDocumentIndexer`. This class has
to do with Find in particular and the convention is more coherent with
`BaseDocumentIndexer`
- [x] ♻️ rename `SEARCH_INDEXER_URL` -> `INDEXING_URL` and
`SEARCH_INDEXER_QUERY_URL` -> `SEARCH_URL`. I found the original names
very confusing.
- [x] 🔧 update the environment variables to activate the
FindDocumentIndexer.
- [x] automate the generation of encryption key during bootstrap.
OIDC_STORE_REFRESH_TOKEN_KEY is a mandatory secret key. We can not push
it on Github and we want any contributor to be able to run the app by
only running the `make bootstrap`. We chose to generate and wright it
into the `common.local` during bootstrap.

## External contributions

Thank you for your contribution! 🎉  

Please ensure the following items are checked before submitting your
pull request:
- [x] I have read and followed the [contributing
guidelines](https://github.com/suitenumerique/docs/blob/main/CONTRIBUTING.md)
- [x] I have read and agreed to the [Code of
Conduct](https://github.com/suitenumerique/docs/blob/main/CODE_OF_CONDUCT.md)
- [x] I have signed off my commits with `git commit --signoff` (DCO
compliance)
- [x] I have signed my commits with my SSH or GPG key (`git commit -S`)
- [x] My commit messages follow the required format: `<gitmoji>(type)
title description`
- [x] I have added a changelog entry under `## [Unreleased]` section (if
noticeable change)
- [x] I have added corresponding tests for new features or bug fixes (if
applicable)

---------

Signed-off-by: charles <charles.englebert@protonmail.com>
This commit is contained in:
Charles Englebert 2026-03-17 17:32:03 +01:00 committed by GitHub
parent ad36210e45
commit 0fca6db79c
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
37 changed files with 1758 additions and 788 deletions

View file

@ -107,12 +107,15 @@ and this project adheres to
- ✨(frontend) Add stat for Crisp #1824
- ✨(auth) add silent login #1690
- 🔧(project) add DJANGO_EMAIL_URL_APP environment variable #1825
- ✨(frontend) activate Find search #1834
- ✨ handle searching on subdocuments #1834
### Changed
- ♿(frontend) improve accessibility:
- ♿️(frontend) fix subdoc opening and emoji pick focus #1745
- ✨(backend) add field for button label in email template #1817
- ✨(backend) improve fallback logic on search endpoint #1834
### Fixed
@ -126,6 +129,8 @@ and this project adheres to
### Removed
- 🔥(project) remove all code related to template #1780
- 🔥(api) remove `documents/<document_id>/descendants/` endpoint #1834
- 🔥(api) remove pagination on `documents/search/` endpoint #1834
### Security

View file

@ -79,10 +79,16 @@ create-env-local-files:
@touch env.d/development/kc_postgresql.local
.PHONY: create-env-local-files
generate-secret-keys:
generate-secret-keys: ## generate secret keys to be stored in common.local
@bin/generate-oidc-store-refresh-token-key.sh
.PHONY: generate-secret-keys
pre-bootstrap: \
data/media \
data/static \
create-env-local-files
create-env-local-files \
generate-secret-keys
.PHONY: pre-bootstrap
post-bootstrap: \

View file

@ -173,6 +173,11 @@ make frontend-test
make frontend-lint
```
Backend tests can be run without docker. This is useful to configure PyCharm or VSCode to do it.
Removing docker for testing requires to overwrite some URL and port values that are different in and out of
Docker. `env.d/development/common` contains all variables, some of them having to be overwritten by those in
`env.d/development/common.test`.
### Demo content
Create a basic demo site:

View file

@ -1,6 +0,0 @@
#!/usr/bin/env bash
# shellcheck source=bin/_config.sh
source "$(dirname "${BASH_SOURCE[0]}")/_config.sh"
_dc_run app-dev python -c 'from cryptography.fernet import Fernet;import sys; sys.stdout.write("\n" + Fernet.generate_key().decode() + "\n");'

View file

@ -0,0 +1,13 @@
#!/usr/bin/env bash
# Generate the secret OIDC_STORE_REFRESH_TOKEN_KEY and store it to common.local
set -eo pipefail
COMMON_LOCAL="env.d/development/common.local"
OIDC_STORE_REFRESH_TOKEN_KEY=$(openssl rand -base64 32)
echo "" >> "${COMMON_LOCAL}"
echo "OIDC_STORE_REFRESH_TOKEN_KEY=${OIDC_STORE_REFRESH_TOKEN_KEY}" >> "${COMMON_LOCAL}"
echo "✓ OIDC_STORE_REFRESH_TOKEN_KEY generated and stored in ${COMMON_LOCAL}"

View file

@ -108,6 +108,9 @@ These are the environment variables you can set for the `impress-backend` contai
| OIDC_RP_SCOPES | Scopes requested for OIDC | openid email |
| OIDC_RP_SIGN_ALGO | verification algorithm used OIDC tokens | RS256 |
| OIDC_STORE_ID_TOKEN | Store OIDC token | true |
| OIDC_STORE_ACCESS_TOKEN | If True stores OIDC access token in session. | false |
| OIDC_STORE_REFRESH_TOKEN | If True stores OIDC refresh token in session. | false |
| OIDC_STORE_REFRESH_TOKEN_KEY | Key to encrypt refresh token stored in session, must be a valid Fernet key | |
| OIDC_USERINFO_FULLNAME_FIELDS | OIDC token claims to create full name | ["first_name", "last_name"] |
| OIDC_USERINFO_SHORTNAME_FIELD | OIDC token claims to create shortname | first_name |
| OIDC_USE_NONCE | Use nonce for OIDC | true |
@ -117,8 +120,9 @@ These are the environment variables you can set for the `impress-backend` contai
| SEARCH_INDEXER_CLASS | Class of the backend for document indexation & search | |
| SEARCH_INDEXER_COUNTDOWN | Minimum debounce delay of indexation jobs (in seconds) | 1 |
| SEARCH_INDEXER_QUERY_LIMIT | Maximum number of results expected from search endpoint | 50 |
| SEARCH_INDEXER_SECRET | Token for indexation queries | |
| SEARCH_INDEXER_URL | Find application endpoint for indexation | |
| SEARCH_URL | Find application endpoint for search queries | |
| SEARCH_INDEXER_SECRET | Token required for indexation queries | |
| INDEXING_URL | Find application endpoint for indexation | |
| SENTRY_DSN | Sentry host | |
| SESSION_COOKIE_AGE | duration of the cookie session | 60*60*12 |
| SIGNUP_NEW_USER_TO_MARKETING_EMAIL | Register new user to the marketing onboarding. If True, see env LASUITE_MARKETING_* system | False |

View file

@ -1,8 +1,8 @@
# Setup the Find search for Impress
# Setup Find search for Docs
This configuration will enable the fulltext search feature for Docs :
- Each save on **core.Document** or **core.DocumentAccess** will trigger the indexer
- The `api/v1.0/documents/search/` will work as a proxy with the Find API for fulltext search.
This configuration will enable Find searches:
- Each save on **core.Document** or **core.DocumentAccess** will trigger the indexing of the document into Find.
- The `api/v1.0/documents/search/` will be used as proxy for searching documents from Find indexes.
## Create an index service for Docs
@ -15,27 +15,27 @@ See [how-to-use-indexer.md](how-to-use-indexer.md) for details.
## Configure settings of Docs
Add those Django settings the Docs application to enable the feature.
Find uses a service provider authentication for indexing and a OIDC authentication for searching.
Add those Django settings to the Docs application to enable the feature.
```shell
SEARCH_INDEXER_CLASS="core.services.search_indexers.FindDocumentIndexer"
SEARCH_INDEXER_COUNTDOWN=10 # Debounce delay in seconds for the indexer calls.
SEARCH_INDEXER_QUERY_LIMIT=50 # Maximum number of results expected from the search endpoint
# The token from service "docs" of Find application (development).
INDEXING_URL="http://find:8000/api/v1.0/documents/index/"
SEARCH_URL="http://find:8000/api/v1.0/documents/search/"
# Service provider authentication
SEARCH_INDEXER_SECRET="find-api-key-for-docs-with-exactly-50-chars-length"
SEARCH_INDEXER_URL="http://find:8000/api/v1.0/documents/index/"
# Search endpoint. Uses the OIDC token for authentication
SEARCH_INDEXER_QUERY_URL="http://find:8000/api/v1.0/documents/search/"
# Maximum number of results expected from the search endpoint
SEARCH_INDEXER_QUERY_LIMIT=50
# OIDC authentication
OIDC_STORE_ACCESS_TOKEN=True # Store the access token in the session
OIDC_STORE_REFRESH_TOKEN=True # Store the encrypted refresh token in the session
OIDC_STORE_REFRESH_TOKEN_KEY="<your-32-byte-encryption-key==>"
```
We also need to enable the **OIDC Token** refresh or the authentication will fail quickly.
```shell
# Store OIDC tokens in the session
OIDC_STORE_ACCESS_TOKEN = True # Store the access token in the session
OIDC_STORE_REFRESH_TOKEN = True # Store the encrypted refresh token in the session
OIDC_STORE_REFRESH_TOKEN_KEY = "your-32-byte-encryption-key==" # Must be a valid Fernet key (32 url-safe base64-encoded bytes)
```
`OIDC_STORE_REFRESH_TOKEN_KEY` must be a valid Fernet key (32 url-safe base64-encoded bytes).
To create one, use the `bin/generate-oidc-store-refresh-token-key.sh` command.

View file

@ -52,8 +52,8 @@ OIDC_REDIRECT_ALLOWED_HOSTS="localhost:8083,localhost:3000"
OIDC_AUTH_REQUEST_EXTRA_PARAMS={"acr_values": "eidas1"}
# Store OIDC tokens in the session. Needed by search/ endpoint.
# OIDC_STORE_ACCESS_TOKEN = True
# OIDC_STORE_REFRESH_TOKEN = True # Store the encrypted refresh token in the session.
OIDC_STORE_ACCESS_TOKEN=True
OIDC_STORE_REFRESH_TOKEN=True # Store the encrypted refresh token in the session.
# Must be a valid Fernet key (32 url-safe base64-encoded bytes)
# To create one, use the bin/fernetkey command.
@ -87,8 +87,9 @@ DOCSPEC_API_URL=http://docspec:4000/conversion
# Theme customization
THEME_CUSTOMIZATION_CACHE_TIMEOUT=15
# Indexer (disabled)
# SEARCH_INDEXER_CLASS="core.services.search_indexers.SearchIndexer"
# Indexer (disabled by default)
# SEARCH_INDEXER_CLASS=core.services.search_indexers.FindDocumentIndexer
SEARCH_INDEXER_SECRET=find-api-key-for-docs-with-exactly-50-chars-length # Key generated by create_demo in Find app.
SEARCH_INDEXER_URL="http://find:8000/api/v1.0/documents/index/"
SEARCH_INDEXER_QUERY_URL="http://find:8000/api/v1.0/documents/search/"
INDEXING_URL=http://find:8000/api/v1.0/documents/index/
SEARCH_URL=http://find:8000/api/v1.0/documents/search/
SEARCH_INDEXER_QUERY_LIMIT=50

View file

@ -0,0 +1,7 @@
# Test environment configuration for running tests without docker
# Base configuration is loaded from 'common' file
DJANGO_SETTINGS_MODULE=impress.settings
DJANGO_CONFIGURATION=Test
DB_PORT=15432
AWS_S3_ENDPOINT_URL=http://localhost:9000

View file

@ -47,10 +47,13 @@ class DocumentFilter(django_filters.FilterSet):
title = AccentInsensitiveCharFilter(
field_name="title", lookup_expr="unaccent__icontains", label=_("Title")
)
q = AccentInsensitiveCharFilter(
field_name="title", lookup_expr="unaccent__icontains", label=_("Search")
)
class Meta:
model = models.Document
fields = ["title"]
fields = ["title", "q"]
class ListDocumentFilter(DocumentFilter):
@ -70,7 +73,7 @@ class ListDocumentFilter(DocumentFilter):
class Meta:
model = models.Document
fields = ["is_creator_me", "is_favorite", "title"]
fields = ["is_creator_me", "is_favorite", "title", "q"]
# pylint: disable=unused-argument
def filter_is_creator_me(self, queryset, name, value):

View file

@ -1004,8 +1004,5 @@ class ThreadSerializer(serializers.ModelSerializer):
class SearchDocumentSerializer(serializers.Serializer):
"""Serializer for fulltext search requests through Find application"""
q = serializers.CharField(required=True, allow_blank=False, trim_whitespace=True)
page_size = serializers.IntegerField(
required=False, min_value=1, max_value=50, default=20
)
page = serializers.IntegerField(required=False, min_value=1, default=1)
q = serializers.CharField(required=True, allow_blank=True, trim_whitespace=True)
path = serializers.CharField(required=False, allow_blank=False)

View file

@ -72,7 +72,11 @@ from core.utils import (
)
from . import permissions, serializers, utils
from .filters import DocumentFilter, ListDocumentFilter, UserSearchFilter
from .filters import (
DocumentFilter,
ListDocumentFilter,
UserSearchFilter,
)
from .throttling import (
DocumentThrottle,
UserListThrottleBurst,
@ -604,20 +608,18 @@ class DocumentViewSet(
It performs early filtering on model fields, annotates user roles, and removes
descendant documents to keep only the highest ancestors readable by the current user.
"""
user = self.request.user
user = request.user
# Not calling filter_queryset. We do our own cooking.
queryset = self.get_queryset()
filterset = ListDocumentFilter(
self.request.GET, queryset=queryset, request=self.request
)
filterset = ListDocumentFilter(request.GET, queryset=queryset, request=request)
if not filterset.is_valid():
raise drf.exceptions.ValidationError(filterset.errors)
filter_data = filterset.form.cleaned_data
# Filter as early as possible on fields that are available on the model
for field in ["is_creator_me", "title"]:
for field in ["is_creator_me", "title", "q"]:
queryset = filterset.filters[field].filter(queryset, filter_data[field])
queryset = queryset.annotate_user_roles(user)
@ -1084,7 +1086,7 @@ class DocumentViewSet(
filter_data = filterset.form.cleaned_data
# Filter as early as possible on fields that are available on the model
for field in ["is_creator_me", "title"]:
for field in ["is_creator_me", "title", "q"]:
queryset = filterset.filters[field].filter(queryset, filter_data[field])
queryset = queryset.annotate_user_roles(user)
@ -1107,7 +1109,11 @@ class DocumentViewSet(
ordering=["path"],
)
def descendants(self, request, *args, **kwargs):
"""Handle listing descendants of a document"""
"""Deprecated endpoint to list descendants of a document."""
logger.warning(
"The 'descendants' endpoint is deprecated and will be removed in a future release. "
"The search endpoint should be used for all document retrieval use cases."
)
document = self.get_object()
queryset = document.get_descendants().filter(ancestors_deleted_at__isnull=True)
@ -1397,82 +1403,103 @@ class DocumentViewSet(
return duplicated_document
def _search_simple(self, request, text):
"""
Returns a queryset filtered by the content of the document title
"""
# As the 'list' view we get a prefiltered queryset (deleted docs are excluded)
queryset = self.get_queryset()
filterset = DocumentFilter({"title": text}, queryset=queryset)
if not filterset.is_valid():
raise drf.exceptions.ValidationError(filterset.errors)
queryset = filterset.filter_queryset(queryset)
return self.get_response_for_queryset(
queryset.order_by("-updated_at"),
context={
"request": request,
},
)
def _search_fulltext(self, indexer, request, params):
"""
Returns a queryset from the results the fulltext search of Find
"""
access_token = request.session.get("oidc_access_token")
user = request.user
text = params.validated_data["q"]
queryset = models.Document.objects.all()
# Retrieve the documents ids from Find.
results = indexer.search(
text=text,
token=access_token,
visited=get_visited_document_ids_of(queryset, user),
)
docs_by_uuid = {str(d.pk): d for d in queryset.filter(pk__in=results)}
ordered_docs = [docs_by_uuid[id] for id in results]
page = self.paginate_queryset(ordered_docs)
serializer = self.get_serializer(
page if page else ordered_docs,
many=True,
context={
"request": request,
},
)
return self.get_paginated_response(serializer.data)
@drf.decorators.action(detail=False, methods=["get"], url_path="search")
@method_decorator(refresh_oidc_access_token)
def search(self, request, *args, **kwargs):
"""
Returns a DRF response containing the filtered, annotated and ordered document list.
Returns an ordered list of documents best matching the search query parameter 'q'.
Applies filtering based on request parameter 'q' from `SearchDocumentSerializer`.
Depending of the configuration it can be:
- A fulltext search through the opensearch indexation app "find" if the backend is
enabled (see SEARCH_INDEXER_CLASS)
- A filtering by the model field 'title'.
The ordering is always by the most recent first.
It depends on a search configurable Search Indexer. If no Search Indexer is configured
or if it is not reachable, the function falls back to a basic title search.
"""
params = serializers.SearchDocumentSerializer(data=request.query_params)
params.is_valid(raise_exception=True)
indexer = get_document_indexer()
if indexer is None:
# fallback on title search if the indexer is not configured
return self._title_search(request, params.validated_data, *args, **kwargs)
if indexer:
return self._search_fulltext(indexer, request, params=params)
try:
return self._search_with_indexer(indexer, request, params=params)
except requests.exceptions.RequestException as e:
logger.error("Error while searching documents with indexer: %s", e)
# fallback on title search if the indexer is not reached
return self._title_search(request, params.validated_data, *args, **kwargs)
# The indexer is not configured, we fallback on a simple icontains filter by the
# model field 'title'.
return self._search_simple(request, text=params.validated_data["q"])
@staticmethod
def _search_with_indexer(indexer, request, params):
"""
Returns a list of documents matching the query (q) according to the configured indexer.
"""
queryset = models.Document.objects.all()
results = indexer.search(
q=params.validated_data["q"],
token=request.session.get("oidc_access_token"),
path=(
params.validated_data["path"]
if "path" in params.validated_data
else None
),
visited=get_visited_document_ids_of(queryset, request.user),
)
return drf_response.Response(
{
"count": len(results),
"next": None,
"previous": None,
"results": results,
}
)
def _title_search(self, request, validated_data, *args, **kwargs):
"""
Fallback search method when no indexer is configured.
Only searches in the title field of documents.
"""
if not validated_data.get("path"):
return self.list(request, *args, **kwargs)
return self._list_descendants(request, validated_data)
def _list_descendants(self, request, validated_data):
"""
List all documents whose path starts with the provided path parameter.
Includes the parent document itself.
Used internally by the search endpoint when path filtering is requested.
"""
# Get parent document without access filtering
parent_path = validated_data["path"]
try:
parent = models.Document.objects.annotate_user_roles(request.user).get(
path=parent_path
)
except models.Document.DoesNotExist as exc:
raise drf.exceptions.NotFound("Document not found from path.") from exc
abilities = parent.get_abilities(request.user)
if not abilities.get("search"):
raise drf.exceptions.PermissionDenied(
"You do not have permission to search within this document."
)
# Get descendants and include the parent, ordered by path
queryset = (
parent.get_descendants(include_self=True)
.filter(ancestors_deleted_at__isnull=True)
.order_by("path")
)
queryset = self.filter_queryset(queryset)
# filter by title
filterset = DocumentFilter(request.GET, queryset=queryset)
if not filterset.is_valid():
raise drf.exceptions.ValidationError(filterset.errors)
queryset = filterset.qs
return self.get_response_for_queryset(queryset)
@drf.decorators.action(detail=True, methods=["get"], url_path="versions")
def versions_list(self, request, *args, **kwargs):

View file

@ -1330,6 +1330,7 @@ class Document(MP_Node, BaseModel):
"versions_destroy": is_owner_or_admin,
"versions_list": has_access_role,
"versions_retrieve": has_access_role,
"search": can_get,
}
def send_email(self, subject, emails, context=None, language=None):

View file

@ -8,7 +8,6 @@ from functools import cache
from django.conf import settings
from django.contrib.auth.models import AnonymousUser
from django.core.exceptions import ImproperlyConfigured
from django.db.models import Subquery
from django.utils.module_loading import import_string
import requests
@ -78,7 +77,9 @@ def get_visited_document_ids_of(queryset, user):
if isinstance(user, AnonymousUser):
return []
qs = models.LinkTrace.objects.filter(user=user)
visited_ids = models.LinkTrace.objects.filter(user=user).values_list(
"document_id", flat=True
)
docs = (
queryset.exclude(accesses__user=user)
@ -86,7 +87,7 @@ def get_visited_document_ids_of(queryset, user):
deleted_at__isnull=True,
ancestors_deleted_at__isnull=True,
)
.filter(pk__in=Subquery(qs.values("document_id")))
.filter(pk__in=visited_ids)
.order_by("pk")
.distinct("pk")
)
@ -107,15 +108,13 @@ class BaseDocumentIndexer(ABC):
Initialize the indexer.
"""
self.batch_size = settings.SEARCH_INDEXER_BATCH_SIZE
self.indexer_url = settings.SEARCH_INDEXER_URL
self.indexer_url = settings.INDEXING_URL
self.indexer_secret = settings.SEARCH_INDEXER_SECRET
self.search_url = settings.SEARCH_INDEXER_QUERY_URL
self.search_url = settings.SEARCH_URL
self.search_limit = settings.SEARCH_INDEXER_QUERY_LIMIT
if not self.indexer_url:
raise ImproperlyConfigured(
"SEARCH_INDEXER_URL must be set in Django settings."
)
raise ImproperlyConfigured("INDEXING_URL must be set in Django settings.")
if not self.indexer_secret:
raise ImproperlyConfigured(
@ -123,9 +122,7 @@ class BaseDocumentIndexer(ABC):
)
if not self.search_url:
raise ImproperlyConfigured(
"SEARCH_INDEXER_QUERY_URL must be set in Django settings."
)
raise ImproperlyConfigured("SEARCH_URL must be set in Django settings.")
def index(self, queryset=None, batch_size=None):
"""
@ -185,7 +182,7 @@ class BaseDocumentIndexer(ABC):
"""
# pylint: disable-next=too-many-arguments,too-many-positional-arguments
def search(self, text, token, visited=(), nb_results=None):
def search(self, q, token, visited=(), nb_results=None, path=None):
"""
Search for documents in Find app.
Ensure the same default ordering as "Docs" list : -updated_at
@ -193,7 +190,7 @@ class BaseDocumentIndexer(ABC):
Returns ids of the documents
Args:
text (str): Text search content.
q (str): user query.
token (str): OIDC Authentication token.
visited (list, optional):
List of ids of active public documents with LinkTrace
@ -201,21 +198,24 @@ class BaseDocumentIndexer(ABC):
nb_results (int, optional):
The number of results to return.
Defaults to 50 if not specified.
path (str, optional):
The parent path to search descendants of.
"""
nb_results = nb_results or self.search_limit
response = self.search_query(
results = self.search_query(
data={
"q": text,
"q": q,
"visited": visited,
"services": ["docs"],
"nb_results": nb_results,
"order_by": "updated_at",
"order_direction": "desc",
"path": path,
},
token=token,
)
return [d["_id"] for d in response]
return results
@abstractmethod
def search_query(self, data, token) -> dict:
@ -226,11 +226,57 @@ class BaseDocumentIndexer(ABC):
"""
class SearchIndexer(BaseDocumentIndexer):
class FindDocumentIndexer(BaseDocumentIndexer):
"""
Document indexer that pushes documents to La Suite Find app.
Document indexer that indexes and searches documents with La Suite Find app.
"""
# pylint: disable=too-many-arguments,too-many-positional-arguments
def search(self, q, token, visited=(), nb_results=None, path=None):
"""format Find search results"""
search_results = super().search(q, token, visited, nb_results, path)
return [
{
**hit["_source"],
"id": hit["_id"],
"title": self.get_title(hit["_source"]),
}
for hit in search_results
]
@staticmethod
def get_title(source):
"""
Find returns the titles with an extension depending on the language.
This function extracts the title in a generic way.
Handles multiple cases:
- Localized title fields like "title.<some_extension>"
- Fallback to plain "title" field if localized version not found
- Returns empty string if no title field exists
Args:
source (dict): The _source dictionary from a search hit
Returns:
str: The extracted title or empty string if not found
Example:
>>> get_title({"title.fr": "Bonjour", "id": 1})
"Bonjour"
>>> get_title({"title": "Hello", "id": 1})
"Hello"
>>> get_title({"id": 1})
""
"""
titles = utils.get_value_by_pattern(source, r"^title\.")
for title in titles:
if title:
return title
if "title" in source:
return source["title"]
return ""
def serialize_document(self, document, accesses):
"""
Convert a Document to the JSON format expected by La Suite Find.

View file

@ -63,7 +63,7 @@ def batch_document_indexer_task(timestamp):
logger.info("Indexed %d documents", count)
def trigger_batch_document_indexer(item):
def trigger_batch_document_indexer(document):
"""
Trigger indexation task with debounce a delay set by the SEARCH_INDEXER_COUNTDOWN setting.
@ -82,14 +82,14 @@ def trigger_batch_document_indexer(item):
if batch_indexer_throttle_acquire(timeout=countdown):
logger.info(
"Add task for batch document indexation from updated_at=%s in %d seconds",
item.updated_at.isoformat(),
document.updated_at.isoformat(),
countdown,
)
batch_document_indexer_task.apply_async(
args=[item.updated_at], countdown=countdown
args=[document.updated_at], countdown=countdown
)
else:
logger.info("Skip task for batch document %s indexation", item.pk)
logger.info("Skip task for batch document %s indexation", document.pk)
else:
document_indexer_task.apply(args=[item.pk])
document_indexer_task.apply(args=[document.pk])

View file

@ -11,7 +11,7 @@ from django.db import transaction
import pytest
from core import factories
from core.services.search_indexers import SearchIndexer
from core.services.search_indexers import FindDocumentIndexer
@pytest.mark.django_db
@ -19,7 +19,7 @@ from core.services.search_indexers import SearchIndexer
def test_index():
"""Test the command `index` that run the Find app indexer for all the available documents."""
user = factories.UserFactory()
indexer = SearchIndexer()
indexer = FindDocumentIndexer()
with transaction.atomic():
doc = factories.DocumentFactory()
@ -36,7 +36,7 @@ def test_index():
str(no_title_doc.path): {"users": [user.sub]},
}
with mock.patch.object(SearchIndexer, "push") as mock_push:
with mock.patch.object(FindDocumentIndexer, "push") as mock_push:
call_command("index")
push_call_args = [call.args[0] for call in mock_push.call_args_list]

View file

@ -39,12 +39,10 @@ def indexer_settings_fixture(settings):
get_document_indexer.cache_clear()
settings.SEARCH_INDEXER_CLASS = "core.services.search_indexers.SearchIndexer"
settings.SEARCH_INDEXER_CLASS = "core.services.search_indexers.FindDocumentIndexer"
settings.SEARCH_INDEXER_SECRET = "ThisIsAKeyForTest"
settings.SEARCH_INDEXER_URL = "http://localhost:8081/api/v1.0/documents/index/"
settings.SEARCH_INDEXER_QUERY_URL = (
"http://localhost:8081/api/v1.0/documents/search/"
)
settings.INDEXING_URL = "http://localhost:8081/api/v1.0/documents/index/"
settings.SEARCH_URL = "http://localhost:8081/api/v1.0/documents/search/"
settings.SEARCH_INDEXER_COUNTDOWN = 1
yield settings

View file

@ -16,7 +16,16 @@ fake = Faker()
pytestmark = pytest.mark.django_db
def test_api_documents_list_filter_and_access_rights():
@pytest.mark.parametrize(
"title_search_field",
# for integration with indexer search we must have
# the same filtering behaviour with "q" and "title" parameters
[
("title"),
("q"),
],
)
def test_api_documents_list_filter_and_access_rights(title_search_field):
"""Filtering on querystring parameters should respect access rights."""
user = factories.UserFactory()
client = APIClient()
@ -76,7 +85,7 @@ def test_api_documents_list_filter_and_access_rights():
filters = {
"link_reach": random.choice([None, *models.LinkReachChoices.values]),
"title": random.choice([None, *word_list]),
title_search_field: random.choice([None, *word_list]),
"favorite": random.choice([None, True, False]),
"creator": random.choice([None, user, other_user]),
"ordering": random.choice(

View file

@ -59,6 +59,7 @@ def test_api_documents_retrieve_anonymous_public_standalone():
"partial_update": document.link_role == "editor",
"restore": False,
"retrieve": True,
"search": True,
"tree": True,
"update": document.link_role == "editor",
"versions_destroy": False,
@ -136,6 +137,7 @@ def test_api_documents_retrieve_anonymous_public_parent():
"partial_update": grand_parent.link_role == "editor",
"restore": False,
"retrieve": True,
"search": True,
"tree": True,
"update": grand_parent.link_role == "editor",
"versions_destroy": False,
@ -246,6 +248,7 @@ def test_api_documents_retrieve_authenticated_unrelated_public_or_authenticated(
"partial_update": document.link_role == "editor",
"restore": False,
"retrieve": True,
"search": True,
"tree": True,
"update": document.link_role == "editor",
"versions_destroy": False,
@ -330,6 +333,7 @@ def test_api_documents_retrieve_authenticated_public_or_authenticated_parent(rea
"partial_update": grand_parent.link_role == "editor",
"restore": False,
"retrieve": True,
"search": True,
"tree": True,
"update": grand_parent.link_role == "editor",
"versions_destroy": False,
@ -529,6 +533,7 @@ def test_api_documents_retrieve_authenticated_related_parent():
"partial_update": access.role not in ["reader", "commenter"],
"restore": access.role == "owner",
"retrieve": True,
"search": True,
"tree": True,
"update": access.role not in ["reader", "commenter"],
"versions_destroy": access.role in ["administrator", "owner"],

View file

@ -1,46 +1,31 @@
"""
Tests for Documents API endpoint in impress's core app: list
Tests for Documents API endpoint in impress's core app: search
"""
import random
from json import loads as json_loads
from django.test import RequestFactory
from unittest import mock
import pytest
import responses
from faker import Faker
from rest_framework import response as drf_response
from rest_framework.test import APIClient
from core import factories, models
from core import factories
from core.services.search_indexers import get_document_indexer
fake = Faker()
pytestmark = pytest.mark.django_db
def build_search_url(**kwargs):
"""Build absolute uri for search endpoint with ORDERED query arguments"""
return (
RequestFactory()
.get("/api/v1.0/documents/search/", dict(sorted(kwargs.items())))
.build_absolute_uri()
)
@pytest.mark.parametrize("role", models.LinkRoleChoices.values)
@pytest.mark.parametrize("reach", models.LinkReachChoices.values)
@mock.patch("core.services.search_indexers.FindDocumentIndexer.search_query")
@responses.activate
def test_api_documents_search_anonymous(reach, role, indexer_settings):
def test_api_documents_search_anonymous(search_query, indexer_settings):
"""
Anonymous users should not be allowed to search documents whatever the
link reach and link role
Anonymous users should be allowed to search documents with Find.
"""
indexer_settings.SEARCH_INDEXER_QUERY_URL = "http://find/api/v1.0/search"
indexer_settings.SEARCH_URL = "http://find/api/v1.0/search"
factories.DocumentFactory(link_reach=reach, link_role=role)
# Find response
# mock Find response
responses.add(
responses.POST,
"http://find/api/v1.0/search",
@ -48,7 +33,22 @@ def test_api_documents_search_anonymous(reach, role, indexer_settings):
status=200,
)
response = APIClient().get("/api/v1.0/documents/search/", data={"q": "alpha"})
q = "alpha"
response = APIClient().get("/api/v1.0/documents/search/", data={"q": q})
assert search_query.call_count == 1
assert search_query.call_args[1] == {
"data": {
"q": q,
"visited": [],
"services": ["docs"],
"nb_results": 50,
"order_by": "updated_at",
"order_direction": "desc",
"path": None,
},
"token": None,
}
assert response.status_code == 200
assert response.json() == {
@ -59,64 +59,121 @@ def test_api_documents_search_anonymous(reach, role, indexer_settings):
}
def test_api_documents_search_endpoint_is_none(indexer_settings):
@mock.patch("core.api.viewsets.DocumentViewSet.list")
def test_api_documents_search_fall_back_on_search_list(mock_list, indexer_settings):
"""
Missing SEARCH_INDEXER_QUERY_URL, so the indexer is not properly configured.
Should fallback on title filter
When indexer is not configured and no path is provided,
should fall back on list method
"""
indexer_settings.SEARCH_INDEXER_QUERY_URL = None
indexer_settings.SEARCH_URL = None
assert get_document_indexer() is None
user = factories.UserFactory()
document = factories.DocumentFactory(title="alpha")
access = factories.UserDocumentAccessFactory(document=document, user=user)
client = APIClient()
client.force_login(user)
response = client.get("/api/v1.0/documents/search/", data={"q": "alpha"})
assert response.status_code == 200
content = response.json()
results = content.pop("results")
assert content == {
"count": 1,
mocked_response = {
"count": 0,
"next": None,
"previous": None,
"results": [{"title": "mocked list result"}],
}
assert len(results) == 1
assert results[0] == {
"id": str(document.id),
"abilities": document.get_abilities(user),
"ancestors_link_reach": None,
"ancestors_link_role": None,
"computed_link_reach": document.computed_link_reach,
"computed_link_role": document.computed_link_role,
"created_at": document.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(document.creator.id),
"depth": 1,
"excerpt": document.excerpt,
"link_reach": document.link_reach,
"link_role": document.link_role,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 1,
"numchild": 0,
"path": document.path,
"title": document.title,
"updated_at": document.updated_at.isoformat().replace("+00:00", "Z"),
"deleted_at": None,
"user_role": access.role,
mock_list.return_value = drf_response.Response(mocked_response)
q = "alpha"
response = client.get("/api/v1.0/documents/search/", data={"q": q})
assert mock_list.call_count == 1
assert mock_list.call_args[0][0].GET.get("q") == q
assert response.json() == mocked_response
@mock.patch("core.api.viewsets.DocumentViewSet._list_descendants")
def test_api_documents_search_fallback_on_search_list_sub_docs(
mock_list_descendants, indexer_settings
):
"""
When indexer is not configured and path parameter is provided,
should call _list_descendants() method
"""
indexer_settings.SEARCH_URL = "http://find/api/v1.0/search"
assert get_document_indexer() is not None
user = factories.UserFactory()
client = APIClient()
client.force_login(user)
parent = factories.DocumentFactory(title="parent", users=[user])
mocked_response = {
"count": 0,
"next": None,
"previous": None,
"results": [{"title": "mocked _list_descendants result"}],
}
mock_list_descendants.return_value = drf_response.Response(mocked_response)
q = "alpha"
response = client.get(
"/api/v1.0/documents/search/", data={"q": q, "path": parent.path}
)
assert mock_list_descendants.call_count == 1
assert mock_list_descendants.call_args[0][0].GET.get("q") == q
assert mock_list_descendants.call_args[0][0].GET.get("path") == parent.path
assert response.json() == mocked_response
@mock.patch("core.api.viewsets.DocumentViewSet._title_search")
def test_api_documents_search_indexer_crashes(mock_title_search, indexer_settings):
"""
When indexer is configured but crashes -> falls back on title_search
"""
# indexer is properly configured
indexer_settings.SEARCH_URL = None
assert get_document_indexer() is None
# but returns an error when the query is sent
responses.add(
responses.POST,
"http://find/api/v1.0/search",
json=[{"error": "Some indexer error"}],
status=404,
)
user = factories.UserFactory()
client = APIClient()
client.force_login(user)
mocked_response = {
"count": 0,
"next": None,
"previous": None,
"results": [{"title": "mocked title_search result"}],
}
mock_title_search.return_value = drf_response.Response(mocked_response)
parent = factories.DocumentFactory(title="parent", users=[user])
q = "alpha"
response = client.get(
"/api/v1.0/documents/search/", data={"q": "alpha", "path": parent.path}
)
# the search endpoint did not crash
assert response.status_code == 200
# fallback on title_search
assert mock_title_search.call_count == 1
assert mock_title_search.call_args[0][0].GET.get("q") == q
assert mock_title_search.call_args[0][0].GET.get("path") == parent.path
assert response.json() == mocked_response
@responses.activate
def test_api_documents_search_invalid_params(indexer_settings):
"""Validate the format of documents as returned by the search view."""
indexer_settings.SEARCH_INDEXER_QUERY_URL = "http://find/api/v1.0/search"
indexer_settings.SEARCH_URL = "http://find/api/v1.0/search"
assert get_document_indexer() is not None
user = factories.UserFactory()
client = APIClient()
client.force_login(user)
@ -125,49 +182,28 @@ def test_api_documents_search_invalid_params(indexer_settings):
assert response.status_code == 400
assert response.json() == {"q": ["This field is required."]}
response = client.get("/api/v1.0/documents/search/", data={"q": " "})
assert response.status_code == 400
assert response.json() == {"q": ["This field may not be blank."]}
response = client.get(
"/api/v1.0/documents/search/", data={"q": "any", "page": "NaN"}
)
assert response.status_code == 400
assert response.json() == {"page": ["A valid integer is required."]}
@responses.activate
def test_api_documents_search_format(indexer_settings):
def test_api_documents_search_success(indexer_settings):
"""Validate the format of documents as returned by the search view."""
indexer_settings.SEARCH_INDEXER_QUERY_URL = "http://find/api/v1.0/search"
indexer_settings.SEARCH_URL = "http://find/api/v1.0/search"
assert get_document_indexer() is not None
user = factories.UserFactory()
client = APIClient()
client.force_login(user)
user_a, user_b, user_c = factories.UserFactory.create_batch(3)
document = factories.DocumentFactory(
title="alpha",
users=(user_a, user_c),
link_traces=(user, user_b),
)
access = factories.UserDocumentAccessFactory(document=document, user=user)
document = {"id": "doc-123", "title": "alpha", "path": "path/to/alpha.pdf"}
# Find response
responses.add(
responses.POST,
"http://find/api/v1.0/search",
json=[
{"_id": str(document.pk)},
{
"_id": str(document["id"]),
"_source": {"title": document["title"], "path": document["path"]},
},
],
status=200,
)
response = client.get("/api/v1.0/documents/search/", data={"q": "alpha"})
response = APIClient().get("/api/v1.0/documents/search/", data={"q": "alpha"})
assert response.status_code == 200
content = response.json()
@ -177,249 +213,6 @@ def test_api_documents_search_format(indexer_settings):
"next": None,
"previous": None,
}
assert len(results) == 1
assert results[0] == {
"id": str(document.id),
"abilities": document.get_abilities(user),
"ancestors_link_reach": None,
"ancestors_link_role": None,
"computed_link_reach": document.computed_link_reach,
"computed_link_role": document.computed_link_role,
"created_at": document.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(document.creator.id),
"depth": 1,
"excerpt": document.excerpt,
"link_reach": document.link_reach,
"link_role": document.link_role,
"nb_accesses_ancestors": 3,
"nb_accesses_direct": 3,
"numchild": 0,
"path": document.path,
"title": document.title,
"updated_at": document.updated_at.isoformat().replace("+00:00", "Z"),
"deleted_at": None,
"user_role": access.role,
}
@responses.activate
@pytest.mark.parametrize(
"pagination, status, expected",
(
(
{"page": 1, "page_size": 10},
200,
{
"count": 10,
"previous": None,
"next": None,
"range": (0, None),
},
),
(
{},
200,
{
"count": 10,
"previous": None,
"next": None,
"range": (0, None),
"api_page_size": 21, # default page_size is 20
},
),
(
{"page": 2, "page_size": 10},
404,
{},
),
(
{"page": 1, "page_size": 5},
200,
{
"count": 10,
"previous": None,
"next": {"page": 2, "page_size": 5},
"range": (0, 5),
},
),
(
{"page": 2, "page_size": 5},
200,
{
"count": 10,
"previous": {"page_size": 5},
"next": None,
"range": (5, None),
},
),
({"page": 3, "page_size": 5}, 404, {}),
),
)
def test_api_documents_search_pagination(
indexer_settings, pagination, status, expected
):
"""Documents should be ordered by descending "score" by default"""
indexer_settings.SEARCH_INDEXER_QUERY_URL = "http://find/api/v1.0/search"
assert get_document_indexer() is not None
user = factories.UserFactory()
client = APIClient()
client.force_login(user)
docs = factories.DocumentFactory.create_batch(10, title="alpha", users=[user])
docs_by_uuid = {str(doc.pk): doc for doc in docs}
api_results = [{"_id": id} for id in docs_by_uuid.keys()]
# reorder randomly to simulate score ordering
random.shuffle(api_results)
# Find response
# pylint: disable-next=assignment-from-none
api_search = responses.add(
responses.POST,
"http://find/api/v1.0/search",
json=api_results,
status=200,
)
response = client.get(
"/api/v1.0/documents/search/",
data={
"q": "alpha",
**pagination,
},
)
assert response.status_code == status
if response.status_code < 300:
previous_url = (
build_search_url(q="alpha", **expected["previous"])
if expected["previous"]
else None
)
next_url = (
build_search_url(q="alpha", **expected["next"])
if expected["next"]
else None
)
start, end = expected["range"]
content = response.json()
assert content["count"] == expected["count"]
assert content["previous"] == previous_url
assert content["next"] == next_url
results = content.pop("results")
# The find api results ordering by score is kept
assert [r["id"] for r in results] == [r["_id"] for r in api_results[start:end]]
# Check the query parameters.
assert api_search.call_count == 1
assert api_search.calls[0].response.status_code == 200
assert json_loads(api_search.calls[0].request.body) == {
"q": "alpha",
"visited": [],
"services": ["docs"],
"nb_results": 50,
"order_by": "updated_at",
"order_direction": "desc",
}
@responses.activate
@pytest.mark.parametrize(
"pagination, status, expected",
(
(
{"page": 1, "page_size": 10},
200,
{"count": 10, "previous": None, "next": None, "range": (0, None)},
),
(
{},
200,
{"count": 10, "previous": None, "next": None, "range": (0, None)},
),
(
{"page": 2, "page_size": 10},
404,
{},
),
(
{"page": 1, "page_size": 5},
200,
{
"count": 10,
"previous": None,
"next": {"page": 2, "page_size": 5},
"range": (0, 5),
},
),
(
{"page": 2, "page_size": 5},
200,
{
"count": 10,
"previous": {"page_size": 5},
"next": None,
"range": (5, None),
},
),
({"page": 3, "page_size": 5}, 404, {}),
),
)
def test_api_documents_search_pagination_endpoint_is_none(
indexer_settings, pagination, status, expected
):
"""Documents should be ordered by descending "-updated_at" by default"""
indexer_settings.SEARCH_INDEXER_QUERY_URL = None
assert get_document_indexer() is None
user = factories.UserFactory()
client = APIClient()
client.force_login(user)
factories.DocumentFactory.create_batch(10, title="alpha", users=[user])
response = client.get(
"/api/v1.0/documents/search/",
data={
"q": "alpha",
**pagination,
},
)
assert response.status_code == status
if response.status_code < 300:
previous_url = (
build_search_url(q="alpha", **expected["previous"])
if expected["previous"]
else None
)
next_url = (
build_search_url(q="alpha", **expected["next"])
if expected["next"]
else None
)
queryset = models.Document.objects.order_by("-updated_at")
start, end = expected["range"]
expected_results = [str(d.pk) for d in queryset[start:end]]
content = response.json()
assert content["count"] == expected["count"]
assert content["previous"] == previous_url
assert content["next"] == next_url
results = content.pop("results")
assert [r["id"] for r in results] == expected_results
assert results == [
{"id": document["id"], "title": document["title"], "path": document["path"]}
]

View file

@ -0,0 +1,956 @@
"""
Tests for search API endpoint in impress's core app when indexer is not
available and a path param is given.
"""
import random
from django.contrib.auth.models import AnonymousUser
import pytest
from rest_framework.test import APIClient
from core import factories
from core.api.filters import remove_accents
pytestmark = pytest.mark.django_db
@pytest.fixture(autouse=True)
def disable_indexer(indexer_settings):
"""Disable search indexer for all tests in this file."""
indexer_settings.SEARCH_INDEXER_CLASS = None
def test_api_documents_search_descendants_list_anonymous_public_standalone():
"""Anonymous users should be allowed to retrieve the descendants of a public document."""
document = factories.DocumentFactory(link_reach="public", title="doc parent")
child1, child2 = factories.DocumentFactory.create_batch(
2, parent=document, title="doc child"
)
grand_child = factories.DocumentFactory(parent=child1, title="doc grand child")
factories.UserDocumentAccessFactory(document=child1)
response = APIClient().get(
"/api/v1.0/documents/search/", data={"q": "doc", "path": document.path}
)
assert response.status_code == 200
assert response.json() == {
"count": 4,
"next": None,
"previous": None,
"results": [
{
# the search should include the parent document itself
"abilities": document.get_abilities(AnonymousUser()),
"ancestors_link_role": None,
"ancestors_link_reach": None,
"computed_link_reach": document.computed_link_reach,
"computed_link_role": document.computed_link_role,
"created_at": document.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(document.creator.id),
"deleted_at": None,
"depth": 1,
"excerpt": document.excerpt,
"id": str(document.id),
"is_favorite": False,
"link_reach": document.link_reach,
"link_role": document.link_role,
"numchild": 2,
"nb_accesses_ancestors": 0,
"nb_accesses_direct": 0,
"path": document.path,
"title": document.title,
"updated_at": document.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
{
"abilities": child1.get_abilities(AnonymousUser()),
"ancestors_link_reach": document.link_reach,
"ancestors_link_role": document.link_role,
"computed_link_reach": child1.computed_link_reach,
"computed_link_role": child1.computed_link_role,
"created_at": child1.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child1.creator.id),
"deleted_at": None,
"depth": 2,
"excerpt": child1.excerpt,
"id": str(child1.id),
"is_favorite": False,
"link_reach": child1.link_reach,
"link_role": child1.link_role,
"numchild": 1,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 1,
"path": child1.path,
"title": child1.title,
"updated_at": child1.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
{
"abilities": grand_child.get_abilities(AnonymousUser()),
"ancestors_link_reach": document.link_reach,
"ancestors_link_role": document.link_role
if (child1.link_reach == "public" and child1.link_role == "editor")
else document.link_role,
"computed_link_reach": "public",
"computed_link_role": grand_child.computed_link_role,
"created_at": grand_child.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(grand_child.creator.id),
"deleted_at": None,
"depth": 3,
"excerpt": grand_child.excerpt,
"id": str(grand_child.id),
"is_favorite": False,
"link_reach": grand_child.link_reach,
"link_role": grand_child.link_role,
"numchild": 0,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 0,
"path": grand_child.path,
"title": grand_child.title,
"updated_at": grand_child.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
{
"abilities": child2.get_abilities(AnonymousUser()),
"ancestors_link_reach": document.link_reach,
"ancestors_link_role": document.link_role,
"computed_link_reach": "public",
"computed_link_role": child2.computed_link_role,
"created_at": child2.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child2.creator.id),
"deleted_at": None,
"depth": 2,
"excerpt": child2.excerpt,
"id": str(child2.id),
"is_favorite": False,
"link_reach": child2.link_reach,
"link_role": child2.link_role,
"numchild": 0,
"nb_accesses_ancestors": 0,
"nb_accesses_direct": 0,
"path": child2.path,
"title": child2.title,
"updated_at": child2.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
],
}
def test_api_documents_search_descendants_list_anonymous_public_parent():
"""
Anonymous users should be allowed to retrieve the descendants of a document who
has a public ancestor.
"""
grand_parent = factories.DocumentFactory(
link_reach="public", title="grand parent doc"
)
parent = factories.DocumentFactory(
parent=grand_parent,
link_reach=random.choice(["authenticated", "restricted"]),
title="parent doc",
)
document = factories.DocumentFactory(
link_reach=random.choice(["authenticated", "restricted"]),
parent=parent,
title="document",
)
child1, child2 = factories.DocumentFactory.create_batch(
2, parent=document, title="child doc"
)
grand_child = factories.DocumentFactory(parent=child1, title="grand child doc")
factories.UserDocumentAccessFactory(document=child1)
response = APIClient().get(
"/api/v1.0/documents/search/", data={"q": "doc", "path": document.path}
)
assert response.status_code == 200
assert response.json() == {
"count": 4,
"next": None,
"previous": None,
"results": [
{
# the search should include the parent document itself
"abilities": document.get_abilities(AnonymousUser()),
"ancestors_link_reach": "public",
"ancestors_link_role": grand_parent.link_role,
"computed_link_reach": document.computed_link_reach,
"computed_link_role": document.computed_link_role,
"created_at": document.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(document.creator.id),
"deleted_at": None,
"depth": 3,
"excerpt": document.excerpt,
"id": str(document.id),
"is_favorite": False,
"link_reach": document.link_reach,
"link_role": document.link_role,
"numchild": 2,
"nb_accesses_ancestors": 0,
"nb_accesses_direct": 0,
"path": document.path,
"title": document.title,
"updated_at": document.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
{
"abilities": child1.get_abilities(AnonymousUser()),
"ancestors_link_reach": "public",
"ancestors_link_role": grand_parent.link_role,
"computed_link_reach": child1.computed_link_reach,
"computed_link_role": child1.computed_link_role,
"created_at": child1.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child1.creator.id),
"deleted_at": None,
"depth": 4,
"excerpt": child1.excerpt,
"id": str(child1.id),
"is_favorite": False,
"link_reach": child1.link_reach,
"link_role": child1.link_role,
"numchild": 1,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 1,
"path": child1.path,
"title": child1.title,
"updated_at": child1.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
{
"abilities": grand_child.get_abilities(AnonymousUser()),
"ancestors_link_reach": "public",
"ancestors_link_role": grand_child.ancestors_link_role,
"computed_link_reach": "public",
"computed_link_role": grand_child.computed_link_role,
"created_at": grand_child.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(grand_child.creator.id),
"deleted_at": None,
"depth": 5,
"excerpt": grand_child.excerpt,
"id": str(grand_child.id),
"is_favorite": False,
"link_reach": grand_child.link_reach,
"link_role": grand_child.link_role,
"numchild": 0,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 0,
"path": grand_child.path,
"title": grand_child.title,
"updated_at": grand_child.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
{
"abilities": child2.get_abilities(AnonymousUser()),
"ancestors_link_reach": "public",
"ancestors_link_role": grand_parent.link_role,
"computed_link_reach": "public",
"computed_link_role": child2.computed_link_role,
"created_at": child2.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child2.creator.id),
"deleted_at": None,
"depth": 4,
"excerpt": child2.excerpt,
"id": str(child2.id),
"is_favorite": False,
"link_reach": child2.link_reach,
"link_role": child2.link_role,
"numchild": 0,
"nb_accesses_ancestors": 0,
"nb_accesses_direct": 0,
"path": child2.path,
"title": child2.title,
"updated_at": child2.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
],
}
@pytest.mark.parametrize("reach", ["restricted", "authenticated"])
def test_api_documents_search_descendants_list_anonymous_restricted_or_authenticated(
reach,
):
"""
Anonymous users should not be able to retrieve descendants of a document that is not public.
"""
document = factories.DocumentFactory(title="parent", link_reach=reach)
child = factories.DocumentFactory(title="child", parent=document)
_grand_child = factories.DocumentFactory(title="grand child", parent=child)
response = APIClient().get(
"/api/v1.0/documents/search/", data={"q": "child", "path": document.path}
)
assert response.status_code == 403
assert response.json() == {
"detail": "You do not have permission to search within this document."
}
@pytest.mark.parametrize("reach", ["public", "authenticated"])
def test_api_documents_search_descendants_list_authenticated_unrelated_public_or_authenticated(
reach,
):
"""
Authenticated users should be able to retrieve the descendants of a public/authenticated
document to which they are not related.
"""
user = factories.UserFactory()
client = APIClient()
client.force_login(user)
document = factories.DocumentFactory(link_reach=reach, title="parent")
child1, child2 = factories.DocumentFactory.create_batch(
2, parent=document, link_reach="restricted", title="child"
)
grand_child = factories.DocumentFactory(parent=child1, title="grand child")
factories.UserDocumentAccessFactory(document=child1)
response = client.get(
"/api/v1.0/documents/search/", data={"q": "child", "path": document.path}
)
assert response.status_code == 200
assert response.json() == {
"count": 3,
"next": None,
"previous": None,
"results": [
{
"abilities": child1.get_abilities(user),
"ancestors_link_reach": reach,
"ancestors_link_role": document.link_role,
"computed_link_reach": child1.computed_link_reach,
"computed_link_role": child1.computed_link_role,
"created_at": child1.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child1.creator.id),
"deleted_at": None,
"depth": 2,
"excerpt": child1.excerpt,
"id": str(child1.id),
"is_favorite": False,
"link_reach": child1.link_reach,
"link_role": child1.link_role,
"numchild": 1,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 1,
"path": child1.path,
"title": child1.title,
"updated_at": child1.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
{
"abilities": grand_child.get_abilities(user),
"ancestors_link_reach": reach,
"ancestors_link_role": document.link_role,
"computed_link_reach": grand_child.computed_link_reach,
"computed_link_role": grand_child.computed_link_role,
"created_at": grand_child.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(grand_child.creator.id),
"deleted_at": None,
"depth": 3,
"excerpt": grand_child.excerpt,
"id": str(grand_child.id),
"is_favorite": False,
"link_reach": grand_child.link_reach,
"link_role": grand_child.link_role,
"numchild": 0,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 0,
"path": grand_child.path,
"title": grand_child.title,
"updated_at": grand_child.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
{
"abilities": child2.get_abilities(user),
"ancestors_link_reach": reach,
"ancestors_link_role": document.link_role,
"computed_link_reach": child2.computed_link_reach,
"computed_link_role": child2.computed_link_role,
"created_at": child2.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child2.creator.id),
"deleted_at": None,
"depth": 2,
"excerpt": child2.excerpt,
"id": str(child2.id),
"is_favorite": False,
"link_reach": child2.link_reach,
"link_role": child2.link_role,
"numchild": 0,
"nb_accesses_ancestors": 0,
"nb_accesses_direct": 0,
"path": child2.path,
"title": child2.title,
"updated_at": child2.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
],
}
@pytest.mark.parametrize("reach", ["public", "authenticated"])
def test_api_documents_search_descendants_list_authenticated_public_or_authenticated_parent(
reach,
):
"""
Authenticated users should be allowed to retrieve the descendants of a document who
has a public or authenticated ancestor.
"""
user = factories.UserFactory()
client = APIClient()
client.force_login(user)
grand_parent = factories.DocumentFactory(link_reach=reach, title="grand parent")
parent = factories.DocumentFactory(
parent=grand_parent, link_reach="restricted", title="parent"
)
document = factories.DocumentFactory(
link_reach="restricted", parent=parent, title="document"
)
child1, child2 = factories.DocumentFactory.create_batch(
2, parent=document, link_reach="restricted", title="child"
)
grand_child = factories.DocumentFactory(parent=child1, title="grand child")
factories.UserDocumentAccessFactory(document=child1)
response = client.get(
"/api/v1.0/documents/search/", data={"q": "child", "path": document.path}
)
assert response.status_code == 200
assert response.json() == {
"count": 3,
"next": None,
"previous": None,
"results": [
{
"abilities": child1.get_abilities(user),
"ancestors_link_reach": reach,
"ancestors_link_role": grand_parent.link_role,
"computed_link_reach": child1.computed_link_reach,
"computed_link_role": child1.computed_link_role,
"created_at": child1.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child1.creator.id),
"deleted_at": None,
"depth": 4,
"excerpt": child1.excerpt,
"id": str(child1.id),
"is_favorite": False,
"link_reach": child1.link_reach,
"link_role": child1.link_role,
"numchild": 1,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 1,
"path": child1.path,
"title": child1.title,
"updated_at": child1.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
{
"abilities": grand_child.get_abilities(user),
"ancestors_link_reach": reach,
"ancestors_link_role": grand_parent.link_role,
"computed_link_reach": grand_child.computed_link_reach,
"computed_link_role": grand_child.computed_link_role,
"created_at": grand_child.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(grand_child.creator.id),
"deleted_at": None,
"depth": 5,
"excerpt": grand_child.excerpt,
"id": str(grand_child.id),
"is_favorite": False,
"link_reach": grand_child.link_reach,
"link_role": grand_child.link_role,
"numchild": 0,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 0,
"path": grand_child.path,
"title": grand_child.title,
"updated_at": grand_child.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
{
"abilities": child2.get_abilities(user),
"ancestors_link_reach": reach,
"ancestors_link_role": grand_parent.link_role,
"computed_link_reach": child2.computed_link_reach,
"computed_link_role": child2.computed_link_role,
"created_at": child2.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child2.creator.id),
"deleted_at": None,
"depth": 4,
"excerpt": child2.excerpt,
"id": str(child2.id),
"is_favorite": False,
"link_reach": child2.link_reach,
"link_role": child2.link_role,
"numchild": 0,
"nb_accesses_ancestors": 0,
"nb_accesses_direct": 0,
"path": child2.path,
"title": child2.title,
"updated_at": child2.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
],
}
def test_api_documents_search_descendants_list_authenticated_unrelated_restricted():
"""
Authenticated users should not be allowed to retrieve the descendants of a document that is
restricted and to which they are not related.
"""
user = factories.UserFactory(with_owned_document=True)
client = APIClient()
client.force_login(user)
document = factories.DocumentFactory(link_reach="restricted", title="parent")
child1, _child2 = factories.DocumentFactory.create_batch(
2, parent=document, title="child"
)
_grand_child = factories.DocumentFactory(parent=child1, title="grand child")
factories.UserDocumentAccessFactory(document=child1)
response = client.get(
"/api/v1.0/documents/search/", data={"q": "child", "path": document.path}
)
assert response.status_code == 403
assert response.json() == {
"detail": "You do not have permission to search within this document."
}
def test_api_documents_search_descendants_list_authenticated_related_direct():
"""
Authenticated users should be allowed to retrieve the descendants of a document
to which they are directly related whatever the role.
"""
user = factories.UserFactory()
client = APIClient()
client.force_login(user)
document = factories.DocumentFactory(title="parent")
access = factories.UserDocumentAccessFactory(document=document, user=user)
factories.UserDocumentAccessFactory(document=document)
child1, child2 = factories.DocumentFactory.create_batch(
2, parent=document, title="child"
)
factories.UserDocumentAccessFactory(document=child1)
grand_child = factories.DocumentFactory(parent=child1, title="grand child")
response = client.get(
"/api/v1.0/documents/search/", data={"q": "child", "path": document.path}
)
assert response.status_code == 200
assert response.json() == {
"count": 3,
"next": None,
"previous": None,
"results": [
{
"abilities": child1.get_abilities(user),
"ancestors_link_reach": child1.ancestors_link_reach,
"ancestors_link_role": child1.ancestors_link_role,
"computed_link_reach": child1.computed_link_reach,
"computed_link_role": child1.computed_link_role,
"created_at": child1.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child1.creator.id),
"deleted_at": None,
"depth": 2,
"excerpt": child1.excerpt,
"id": str(child1.id),
"is_favorite": False,
"link_reach": child1.link_reach,
"link_role": child1.link_role,
"numchild": 1,
"nb_accesses_ancestors": 3,
"nb_accesses_direct": 1,
"path": child1.path,
"title": child1.title,
"updated_at": child1.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": access.role,
},
{
"abilities": grand_child.get_abilities(user),
"ancestors_link_reach": grand_child.ancestors_link_reach,
"ancestors_link_role": grand_child.ancestors_link_role,
"computed_link_reach": grand_child.computed_link_reach,
"computed_link_role": grand_child.computed_link_role,
"created_at": grand_child.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(grand_child.creator.id),
"deleted_at": None,
"depth": 3,
"excerpt": grand_child.excerpt,
"id": str(grand_child.id),
"is_favorite": False,
"link_reach": grand_child.link_reach,
"link_role": grand_child.link_role,
"numchild": 0,
"nb_accesses_ancestors": 3,
"nb_accesses_direct": 0,
"path": grand_child.path,
"title": grand_child.title,
"updated_at": grand_child.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": access.role,
},
{
"abilities": child2.get_abilities(user),
"ancestors_link_reach": child2.ancestors_link_reach,
"ancestors_link_role": child2.ancestors_link_role,
"computed_link_reach": child2.computed_link_reach,
"computed_link_role": child2.computed_link_role,
"created_at": child2.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child2.creator.id),
"deleted_at": None,
"depth": 2,
"excerpt": child2.excerpt,
"id": str(child2.id),
"is_favorite": False,
"link_reach": child2.link_reach,
"link_role": child2.link_role,
"numchild": 0,
"nb_accesses_ancestors": 2,
"nb_accesses_direct": 0,
"path": child2.path,
"title": child2.title,
"updated_at": child2.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": access.role,
},
],
}
def test_api_documents_search_descendants_list_authenticated_related_parent():
"""
Authenticated users should be allowed to retrieve the descendants of a document if they
are related to one of its ancestors whatever the role.
"""
user = factories.UserFactory()
client = APIClient()
client.force_login(user)
grand_parent = factories.DocumentFactory(link_reach="restricted", title="parent")
grand_parent_access = factories.UserDocumentAccessFactory(
document=grand_parent, user=user
)
parent = factories.DocumentFactory(
parent=grand_parent, link_reach="restricted", title="parent"
)
document = factories.DocumentFactory(
parent=parent, link_reach="restricted", title="document"
)
child1, child2 = factories.DocumentFactory.create_batch(
2, parent=document, title="child"
)
factories.UserDocumentAccessFactory(document=child1)
grand_child = factories.DocumentFactory(parent=child1, title="grand child")
response = client.get(
"/api/v1.0/documents/search/", data={"q": "child", "path": document.path}
)
assert response.status_code == 200
assert response.json() == {
"count": 3,
"next": None,
"previous": None,
"results": [
{
"abilities": child1.get_abilities(user),
"ancestors_link_reach": child1.ancestors_link_reach,
"ancestors_link_role": child1.ancestors_link_role,
"computed_link_reach": child1.computed_link_reach,
"computed_link_role": child1.computed_link_role,
"created_at": child1.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child1.creator.id),
"deleted_at": None,
"depth": 4,
"excerpt": child1.excerpt,
"id": str(child1.id),
"is_favorite": False,
"link_reach": child1.link_reach,
"link_role": child1.link_role,
"numchild": 1,
"nb_accesses_ancestors": 2,
"nb_accesses_direct": 1,
"path": child1.path,
"title": child1.title,
"updated_at": child1.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": grand_parent_access.role,
},
{
"abilities": grand_child.get_abilities(user),
"ancestors_link_reach": grand_child.ancestors_link_reach,
"ancestors_link_role": grand_child.ancestors_link_role,
"computed_link_reach": grand_child.computed_link_reach,
"computed_link_role": grand_child.computed_link_role,
"created_at": grand_child.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(grand_child.creator.id),
"deleted_at": None,
"depth": 5,
"excerpt": grand_child.excerpt,
"id": str(grand_child.id),
"is_favorite": False,
"link_reach": grand_child.link_reach,
"link_role": grand_child.link_role,
"numchild": 0,
"nb_accesses_ancestors": 2,
"nb_accesses_direct": 0,
"path": grand_child.path,
"title": grand_child.title,
"updated_at": grand_child.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": grand_parent_access.role,
},
{
"abilities": child2.get_abilities(user),
"ancestors_link_reach": child2.ancestors_link_reach,
"ancestors_link_role": child2.ancestors_link_role,
"computed_link_reach": child2.computed_link_reach,
"computed_link_role": child2.computed_link_role,
"created_at": child2.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child2.creator.id),
"deleted_at": None,
"depth": 4,
"excerpt": child2.excerpt,
"id": str(child2.id),
"is_favorite": False,
"link_reach": child2.link_reach,
"link_role": child2.link_role,
"numchild": 0,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 0,
"path": child2.path,
"title": child2.title,
"updated_at": child2.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": grand_parent_access.role,
},
],
}
def test_api_documents_search_descendants_list_authenticated_related_child():
"""
Authenticated users should not be allowed to retrieve all the descendants of a document
as a result of being related to one of its children.
"""
user = factories.UserFactory()
client = APIClient()
client.force_login(user)
document = factories.DocumentFactory(link_reach="restricted")
child1, _child2 = factories.DocumentFactory.create_batch(2, parent=document)
_grand_child = factories.DocumentFactory(parent=child1)
factories.UserDocumentAccessFactory(document=child1, user=user)
factories.UserDocumentAccessFactory(document=document)
response = client.get(
"/api/v1.0/documents/search/", data={"q": "doc", "path": document.path}
)
assert response.status_code == 403
assert response.json() == {
"detail": "You do not have permission to search within this document."
}
def test_api_documents_search_descendants_list_authenticated_related_team_none(
mock_user_teams,
):
"""
Authenticated users should not be able to retrieve the descendants of a restricted document
related to teams in which the user is not.
"""
mock_user_teams.return_value = []
user = factories.UserFactory(with_owned_document=True)
client = APIClient()
client.force_login(user)
document = factories.DocumentFactory(link_reach="restricted", title="document")
factories.DocumentFactory.create_batch(2, parent=document, title="child")
factories.TeamDocumentAccessFactory(document=document, team="myteam")
response = client.get(
"/api/v1.0/documents/search/", data={"q": "doc", "path": document.path}
)
assert response.status_code == 403
assert response.json() == {
"detail": "You do not have permission to search within this document."
}
def test_api_documents_search_descendants_list_authenticated_related_team_members(
mock_user_teams,
):
"""
Authenticated users should be allowed to retrieve the descendants of a document to which they
are related via a team whatever the role.
"""
mock_user_teams.return_value = ["myteam"]
user = factories.UserFactory()
client = APIClient()
client.force_login(user)
document = factories.DocumentFactory(link_reach="restricted", title="parent")
child1, child2 = factories.DocumentFactory.create_batch(
2, parent=document, title="child"
)
grand_child = factories.DocumentFactory(parent=child1, title="grand child")
access = factories.TeamDocumentAccessFactory(document=document, team="myteam")
response = client.get(
"/api/v1.0/documents/search/", data={"q": "child", "path": document.path}
)
# pylint: disable=R0801
assert response.status_code == 200
assert response.json() == {
"count": 3,
"next": None,
"previous": None,
"results": [
{
"abilities": child1.get_abilities(user),
"ancestors_link_reach": child1.ancestors_link_reach,
"ancestors_link_role": child1.ancestors_link_role,
"computed_link_reach": child1.computed_link_reach,
"computed_link_role": child1.computed_link_role,
"created_at": child1.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child1.creator.id),
"deleted_at": None,
"depth": 2,
"excerpt": child1.excerpt,
"id": str(child1.id),
"is_favorite": False,
"link_reach": child1.link_reach,
"link_role": child1.link_role,
"numchild": 1,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 0,
"path": child1.path,
"title": child1.title,
"updated_at": child1.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": access.role,
},
{
"abilities": grand_child.get_abilities(user),
"ancestors_link_reach": grand_child.ancestors_link_reach,
"ancestors_link_role": grand_child.ancestors_link_role,
"computed_link_reach": grand_child.computed_link_reach,
"computed_link_role": grand_child.computed_link_role,
"created_at": grand_child.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(grand_child.creator.id),
"deleted_at": None,
"depth": 3,
"excerpt": grand_child.excerpt,
"id": str(grand_child.id),
"is_favorite": False,
"link_reach": grand_child.link_reach,
"link_role": grand_child.link_role,
"numchild": 0,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 0,
"path": grand_child.path,
"title": grand_child.title,
"updated_at": grand_child.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": access.role,
},
{
"abilities": child2.get_abilities(user),
"ancestors_link_reach": child2.ancestors_link_reach,
"ancestors_link_role": child2.ancestors_link_role,
"computed_link_reach": child2.computed_link_reach,
"computed_link_role": child2.computed_link_role,
"created_at": child2.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child2.creator.id),
"deleted_at": None,
"depth": 2,
"excerpt": child2.excerpt,
"id": str(child2.id),
"is_favorite": False,
"link_reach": child2.link_reach,
"link_role": child2.link_role,
"numchild": 0,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 0,
"path": child2.path,
"title": child2.title,
"updated_at": child2.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": access.role,
},
],
}
@pytest.mark.parametrize(
"query,nb_results",
[
("", 7), # Empty string
("Project Alpha", 1), # Exact match
("project", 2), # Partial match (case-insensitive)
("Guide", 2), # Word match within a title
("Special", 0), # No match (nonexistent keyword)
("2024", 2), # Match by numeric keyword
("velo", 1), # Accent-insensitive match (velo vs vélo)
("bêta", 1), # Accent-insensitive match (bêta vs beta)
],
)
def test_api_documents_search_descendants_search_on_title(query, nb_results):
"""Authenticated users should be able to search documents by their unaccented title."""
user = factories.UserFactory()
client = APIClient()
client.force_login(user)
parent = factories.DocumentFactory(users=[user])
# Create documents with predefined titles
titles = [
"Project Alpha Documentation",
"Project Beta Overview",
"User Guide",
"Financial Report 2024",
"Annual Review 2024",
"Guide du vélo urbain", # <-- Title with accent for accent-insensitive test
]
for title in titles:
factories.DocumentFactory(title=title, parent=parent)
# Perform the search query
response = client.get(
"/api/v1.0/documents/search/", data={"q": query, "path": parent.path}
)
assert response.status_code == 200
results = response.json()["results"]
assert len(results) == nb_results
# Ensure all results contain the query in their title
for result in results:
assert (
remove_accents(query).lower().strip()
in remove_accents(result["title"]).lower()
)

View file

@ -101,6 +101,7 @@ def test_api_documents_trashbin_format():
"partial_update": False,
"restore": True,
"retrieve": True,
"search": False,
"tree": True,
"update": False,
"versions_destroy": False,

View file

@ -189,6 +189,7 @@ def test_models_documents_get_abilities_forbidden(
"versions_destroy": False,
"versions_list": False,
"versions_retrieve": False,
"search": False,
}
nb_queries = 1 if is_authenticated else 0
with django_assert_num_queries(nb_queries):
@ -255,6 +256,7 @@ def test_models_documents_get_abilities_reader(
"versions_destroy": False,
"versions_list": False,
"versions_retrieve": False,
"search": True,
}
nb_queries = 1 if is_authenticated else 0
with django_assert_num_queries(nb_queries):
@ -326,6 +328,7 @@ def test_models_documents_get_abilities_commenter(
"versions_destroy": False,
"versions_list": False,
"versions_retrieve": False,
"search": True,
}
nb_queries = 1 if is_authenticated else 0
with django_assert_num_queries(nb_queries):
@ -394,6 +397,7 @@ def test_models_documents_get_abilities_editor(
"versions_destroy": False,
"versions_list": False,
"versions_retrieve": False,
"search": True,
}
nb_queries = 1 if is_authenticated else 0
with django_assert_num_queries(nb_queries):
@ -451,6 +455,7 @@ def test_models_documents_get_abilities_owner(django_assert_num_queries):
"versions_destroy": True,
"versions_list": True,
"versions_retrieve": True,
"search": True,
}
with django_assert_num_queries(1):
assert document.get_abilities(user) == expected_abilities
@ -494,6 +499,7 @@ def test_models_documents_get_abilities_owner(django_assert_num_queries):
"versions_destroy": False,
"versions_list": False,
"versions_retrieve": False,
"search": False,
}
@ -541,6 +547,7 @@ def test_models_documents_get_abilities_administrator(django_assert_num_queries)
"versions_destroy": True,
"versions_list": True,
"versions_retrieve": True,
"search": True,
}
with django_assert_num_queries(1):
assert document.get_abilities(user) == expected_abilities
@ -598,6 +605,7 @@ def test_models_documents_get_abilities_editor_user(django_assert_num_queries):
"versions_destroy": False,
"versions_list": True,
"versions_retrieve": True,
"search": True,
}
with django_assert_num_queries(1):
assert document.get_abilities(user) == expected_abilities
@ -663,6 +671,7 @@ def test_models_documents_get_abilities_reader_user(
"versions_destroy": False,
"versions_list": True,
"versions_retrieve": True,
"search": True,
}
with override_settings(AI_ALLOW_REACH_FROM=ai_access_setting):
@ -729,6 +738,7 @@ def test_models_documents_get_abilities_commenter_user(
"versions_destroy": False,
"versions_list": True,
"versions_retrieve": True,
"search": True,
}
with override_settings(AI_ALLOW_REACH_FROM=ai_access_setting):
@ -791,6 +801,7 @@ def test_models_documents_get_abilities_preset_role(django_assert_num_queries):
"versions_destroy": False,
"versions_list": True,
"versions_retrieve": True,
"search": True,
}

View file

@ -1,5 +1,5 @@
"""
Unit tests for the Document model
Unit tests for FindDocumentIndexer
"""
# pylint: disable=too-many-lines
@ -12,7 +12,7 @@ from django.db import transaction
import pytest
from core import factories, models
from core.services.search_indexers import SearchIndexer
from core.services.search_indexers import FindDocumentIndexer
pytestmark = pytest.mark.django_db
@ -30,7 +30,7 @@ def reset_throttle():
reset_batch_indexer_throttle()
@mock.patch.object(SearchIndexer, "push")
@mock.patch.object(FindDocumentIndexer, "push")
@pytest.mark.usefixtures("indexer_settings")
@pytest.mark.django_db(transaction=True)
def test_models_documents_post_save_indexer(mock_push):
@ -41,7 +41,7 @@ def test_models_documents_post_save_indexer(mock_push):
accesses = {}
data = [call.args[0] for call in mock_push.call_args_list]
indexer = SearchIndexer()
indexer = FindDocumentIndexer()
assert len(data) == 1
@ -64,14 +64,14 @@ def test_models_documents_post_save_indexer_no_batches(indexer_settings):
"""Test indexation task on doculment creation, no throttle"""
indexer_settings.SEARCH_INDEXER_COUNTDOWN = 0
with mock.patch.object(SearchIndexer, "push") as mock_push:
with mock.patch.object(FindDocumentIndexer, "push") as mock_push:
with transaction.atomic():
doc1, doc2, doc3 = factories.DocumentFactory.create_batch(3)
accesses = {}
data = [call.args[0] for call in mock_push.call_args_list]
indexer = SearchIndexer()
indexer = FindDocumentIndexer()
# 3 calls
assert len(data) == 3
@ -91,7 +91,7 @@ def test_models_documents_post_save_indexer_no_batches(indexer_settings):
assert cache.get("file-batch-indexer-throttle") is None
@mock.patch.object(SearchIndexer, "push")
@mock.patch.object(FindDocumentIndexer, "push")
@pytest.mark.django_db(transaction=True)
def test_models_documents_post_save_indexer_not_configured(mock_push, indexer_settings):
"""Task should not start an indexation when disabled"""
@ -106,13 +106,13 @@ def test_models_documents_post_save_indexer_not_configured(mock_push, indexer_se
assert mock_push.assert_not_called
@mock.patch.object(SearchIndexer, "push")
@mock.patch.object(FindDocumentIndexer, "push")
@pytest.mark.django_db(transaction=True)
def test_models_documents_post_save_indexer_wrongly_configured(
mock_push, indexer_settings
):
"""Task should not start an indexation when disabled"""
indexer_settings.SEARCH_INDEXER_URL = None
indexer_settings.INDEXING_URL = None
user = factories.UserFactory()
@ -123,7 +123,7 @@ def test_models_documents_post_save_indexer_wrongly_configured(
assert mock_push.assert_not_called
@mock.patch.object(SearchIndexer, "push")
@mock.patch.object(FindDocumentIndexer, "push")
@pytest.mark.usefixtures("indexer_settings")
@pytest.mark.django_db(transaction=True)
def test_models_documents_post_save_indexer_with_accesses(mock_push):
@ -145,7 +145,7 @@ def test_models_documents_post_save_indexer_with_accesses(mock_push):
data = [call.args[0] for call in mock_push.call_args_list]
indexer = SearchIndexer()
indexer = FindDocumentIndexer()
assert len(data) == 1
assert sorted(data[0], key=itemgetter("id")) == sorted(
@ -158,7 +158,7 @@ def test_models_documents_post_save_indexer_with_accesses(mock_push):
)
@mock.patch.object(SearchIndexer, "push")
@mock.patch.object(FindDocumentIndexer, "push")
@pytest.mark.usefixtures("indexer_settings")
@pytest.mark.django_db(transaction=True)
def test_models_documents_post_save_indexer_deleted(mock_push):
@ -207,7 +207,7 @@ def test_models_documents_post_save_indexer_deleted(mock_push):
data = [call.args[0] for call in mock_push.call_args_list]
indexer = SearchIndexer()
indexer = FindDocumentIndexer()
assert len(data) == 2
@ -244,14 +244,14 @@ def test_models_documents_indexer_hard_deleted():
factories.UserDocumentAccessFactory(document=doc, user=user)
# Call task on deleted document.
with mock.patch.object(SearchIndexer, "push") as mock_push:
with mock.patch.object(FindDocumentIndexer, "push") as mock_push:
doc.delete()
# Hard delete document are not re-indexed.
assert mock_push.assert_not_called
@mock.patch.object(SearchIndexer, "push")
@mock.patch.object(FindDocumentIndexer, "push")
@pytest.mark.usefixtures("indexer_settings")
@pytest.mark.django_db(transaction=True)
def test_models_documents_post_save_indexer_restored(mock_push):
@ -308,7 +308,7 @@ def test_models_documents_post_save_indexer_restored(mock_push):
data = [call.args[0] for call in mock_push.call_args_list]
indexer = SearchIndexer()
indexer = FindDocumentIndexer()
# All docs are re-indexed
assert len(data) == 2
@ -337,16 +337,16 @@ def test_models_documents_post_save_indexer_restored(mock_push):
@pytest.mark.usefixtures("indexer_settings")
def test_models_documents_post_save_indexer_throttle():
"""Test indexation task skipping on document update"""
indexer = SearchIndexer()
indexer = FindDocumentIndexer()
user = factories.UserFactory()
with mock.patch.object(SearchIndexer, "push"):
with mock.patch.object(FindDocumentIndexer, "push"):
with transaction.atomic():
docs = factories.DocumentFactory.create_batch(5, users=(user,))
accesses = {str(item.path): {"users": [user.sub]} for item in docs}
with mock.patch.object(SearchIndexer, "push") as mock_push:
with mock.patch.object(FindDocumentIndexer, "push") as mock_push:
# Simulate 1 running task
cache.set("document-batch-indexer-throttle", 1)
@ -359,7 +359,7 @@ def test_models_documents_post_save_indexer_throttle():
assert [call.args[0] for call in mock_push.call_args_list] == []
with mock.patch.object(SearchIndexer, "push") as mock_push:
with mock.patch.object(FindDocumentIndexer, "push") as mock_push:
# No waiting task
cache.delete("document-batch-indexer-throttle")
@ -389,7 +389,7 @@ def test_models_documents_access_post_save_indexer():
"""Test indexation task on DocumentAccess update"""
users = factories.UserFactory.create_batch(3)
with mock.patch.object(SearchIndexer, "push"):
with mock.patch.object(FindDocumentIndexer, "push"):
with transaction.atomic():
doc = factories.DocumentFactory(users=users)
doc_accesses = models.DocumentAccess.objects.filter(document=doc).order_by(
@ -398,7 +398,7 @@ def test_models_documents_access_post_save_indexer():
reset_batch_indexer_throttle()
with mock.patch.object(SearchIndexer, "push") as mock_push:
with mock.patch.object(FindDocumentIndexer, "push") as mock_push:
with transaction.atomic():
for doc_access in doc_accesses:
doc_access.save()
@ -426,7 +426,7 @@ def test_models_items_access_post_save_indexer_no_throttle(indexer_settings):
reset_batch_indexer_throttle()
with mock.patch.object(SearchIndexer, "push") as mock_push:
with mock.patch.object(FindDocumentIndexer, "push") as mock_push:
with transaction.atomic():
for doc_access in doc_accesses:
doc_access.save()
@ -439,3 +439,70 @@ def test_models_items_access_post_save_indexer_no_throttle(indexer_settings):
assert [len(d) for d in data] == [1] * 3
# the same document is indexed 3 times
assert [d[0]["id"] for d in data] == [str(doc.pk)] * 3
@mock.patch.object(FindDocumentIndexer, "search_query")
@pytest.mark.usefixtures("indexer_settings")
def test_find_document_indexer_search(mock_search_query):
"""Test search function of FindDocumentIndexer returns formatted results"""
# Mock API response from Find
hits = [
{
"_id": "doc-123",
"_source": {
"title": "Test Document",
"content": "This is test content",
"updated_at": "2024-01-01T00:00:00Z",
"path": "/some/path/doc-123",
},
},
{
"_id": "doc-456",
"_source": {
"title.fr": "Document de test",
"content": "Contenu de test",
"updated_at": "2024-01-02T00:00:00Z",
},
},
]
mock_search_query.return_value = hits
q = "test"
token = "fake-token"
nb_results = 10
path = "/some/path/"
visited = ["doc-123"]
results = FindDocumentIndexer().search(
q=q, token=token, nb_results=nb_results, path=path, visited=visited
)
mock_search_query.assert_called_once()
call_args = mock_search_query.call_args
assert call_args[1]["data"] == {
"q": q,
"visited": visited,
"services": ["docs"],
"nb_results": nb_results,
"order_by": "updated_at",
"order_direction": "desc",
"path": path,
}
assert len(results) == 2
assert results == [
{
"id": hits[0]["_id"],
"title": hits[0]["_source"]["title"],
"content": hits[0]["_source"]["content"],
"updated_at": hits[0]["_source"]["updated_at"],
"path": hits[0]["_source"]["path"],
},
{
"id": hits[1]["_id"],
"title": hits[1]["_source"]["title.fr"],
"title.fr": hits[1]["_source"]["title.fr"], # <- Find response artefact
"content": hits[1]["_source"]["content"],
"updated_at": hits[1]["_source"]["updated_at"],
},
]

View file

@ -15,7 +15,7 @@ from requests import HTTPError
from core import factories, models, utils
from core.services.search_indexers import (
BaseDocumentIndexer,
SearchIndexer,
FindDocumentIndexer,
get_document_indexer,
get_visited_document_ids_of,
)
@ -78,41 +78,41 @@ def test_services_search_indexer_is_configured(indexer_settings):
# Valid class
indexer_settings.SEARCH_INDEXER_CLASS = (
"core.services.search_indexers.SearchIndexer"
"core.services.search_indexers.FindDocumentIndexer"
)
get_document_indexer.cache_clear()
assert get_document_indexer() is not None
indexer_settings.SEARCH_INDEXER_URL = ""
indexer_settings.INDEXING_URL = ""
# Invalid url
get_document_indexer.cache_clear()
assert not get_document_indexer()
def test_services_search_indexer_url_is_none(indexer_settings):
def test_services_indexing_url_is_none(indexer_settings):
"""
Indexer should raise RuntimeError if SEARCH_INDEXER_URL is None or empty.
Indexer should raise RuntimeError if INDEXING_URL is None or empty.
"""
indexer_settings.SEARCH_INDEXER_URL = None
indexer_settings.INDEXING_URL = None
with pytest.raises(ImproperlyConfigured) as exc_info:
SearchIndexer()
FindDocumentIndexer()
assert "SEARCH_INDEXER_URL must be set in Django settings." in str(exc_info.value)
assert "INDEXING_URL must be set in Django settings." in str(exc_info.value)
def test_services_search_indexer_url_is_empty(indexer_settings):
def test_services_indexing_url_is_empty(indexer_settings):
"""
Indexer should raise RuntimeError if SEARCH_INDEXER_URL is empty string.
Indexer should raise RuntimeError if INDEXING_URL is empty string.
"""
indexer_settings.SEARCH_INDEXER_URL = ""
indexer_settings.INDEXING_URL = ""
with pytest.raises(ImproperlyConfigured) as exc_info:
SearchIndexer()
FindDocumentIndexer()
assert "SEARCH_INDEXER_URL must be set in Django settings." in str(exc_info.value)
assert "INDEXING_URL must be set in Django settings." in str(exc_info.value)
def test_services_search_indexer_secret_is_none(indexer_settings):
@ -122,7 +122,7 @@ def test_services_search_indexer_secret_is_none(indexer_settings):
indexer_settings.SEARCH_INDEXER_SECRET = None
with pytest.raises(ImproperlyConfigured) as exc_info:
SearchIndexer()
FindDocumentIndexer()
assert "SEARCH_INDEXER_SECRET must be set in Django settings." in str(
exc_info.value
@ -136,39 +136,35 @@ def test_services_search_indexer_secret_is_empty(indexer_settings):
indexer_settings.SEARCH_INDEXER_SECRET = ""
with pytest.raises(ImproperlyConfigured) as exc_info:
SearchIndexer()
FindDocumentIndexer()
assert "SEARCH_INDEXER_SECRET must be set in Django settings." in str(
exc_info.value
)
def test_services_search_endpoint_is_none(indexer_settings):
def test_services_search_url_is_none(indexer_settings):
"""
Indexer should raise RuntimeError if SEARCH_INDEXER_QUERY_URL is None.
Indexer should raise RuntimeError if SEARCH_URL is None.
"""
indexer_settings.SEARCH_INDEXER_QUERY_URL = None
indexer_settings.SEARCH_URL = None
with pytest.raises(ImproperlyConfigured) as exc_info:
SearchIndexer()
FindDocumentIndexer()
assert "SEARCH_INDEXER_QUERY_URL must be set in Django settings." in str(
exc_info.value
)
assert "SEARCH_URL must be set in Django settings." in str(exc_info.value)
def test_services_search_endpoint_is_empty(indexer_settings):
def test_services_search_url_is_empty(indexer_settings):
"""
Indexer should raise RuntimeError if SEARCH_INDEXER_QUERY_URL is empty.
Indexer should raise RuntimeError if SEARCH_URL is empty.
"""
indexer_settings.SEARCH_INDEXER_QUERY_URL = ""
indexer_settings.SEARCH_URL = ""
with pytest.raises(ImproperlyConfigured) as exc_info:
SearchIndexer()
FindDocumentIndexer()
assert "SEARCH_INDEXER_QUERY_URL must be set in Django settings." in str(
exc_info.value
)
assert "SEARCH_URL must be set in Django settings." in str(exc_info.value)
@pytest.mark.usefixtures("indexer_settings")
@ -192,7 +188,7 @@ def test_services_search_indexers_serialize_document_returns_expected_json():
}
}
indexer = SearchIndexer()
indexer = FindDocumentIndexer()
result = indexer.serialize_document(document, accesses)
assert set(result.pop("users")) == {str(user_a.sub), str(user_b.sub)}
@ -221,7 +217,7 @@ def test_services_search_indexers_serialize_document_deleted():
parent.soft_delete()
document.refresh_from_db()
indexer = SearchIndexer()
indexer = FindDocumentIndexer()
result = indexer.serialize_document(document, {})
assert result["is_active"] is False
@ -232,7 +228,7 @@ def test_services_search_indexers_serialize_document_empty():
"""Empty documents returns empty content in the serialized json."""
document = factories.DocumentFactory(content="", title=None)
indexer = SearchIndexer()
indexer = FindDocumentIndexer()
result = indexer.serialize_document(document, {})
assert result["content"] == ""
@ -246,7 +242,7 @@ def test_services_search_indexers_index_errors(indexer_settings):
"""
factories.DocumentFactory()
indexer_settings.SEARCH_INDEXER_URL = "http://app-find/api/v1.0/documents/index/"
indexer_settings.INDEXING_URL = "http://app-find/api/v1.0/documents/index/"
responses.add(
responses.POST,
@ -256,10 +252,10 @@ def test_services_search_indexers_index_errors(indexer_settings):
)
with pytest.raises(HTTPError):
SearchIndexer().index()
FindDocumentIndexer().index()
@patch.object(SearchIndexer, "push")
@patch.object(FindDocumentIndexer, "push")
def test_services_search_indexers_batches_pass_only_batch_accesses(
mock_push, indexer_settings
):
@ -276,7 +272,7 @@ def test_services_search_indexers_batches_pass_only_batch_accesses(
access = factories.UserDocumentAccessFactory(document=document)
expected_user_subs[str(document.id)] = str(access.user.sub)
assert SearchIndexer().index() == 5
assert FindDocumentIndexer().index() == 5
# Should be 3 batches: 2 + 2 + 1
assert mock_push.call_count == 3
@ -299,7 +295,7 @@ def test_services_search_indexers_batches_pass_only_batch_accesses(
assert seen_doc_ids == {str(d.id) for d in documents}
@patch.object(SearchIndexer, "push")
@patch.object(FindDocumentIndexer, "push")
@pytest.mark.usefixtures("indexer_settings")
def test_services_search_indexers_batch_size_argument(mock_push):
"""
@ -314,7 +310,7 @@ def test_services_search_indexers_batch_size_argument(mock_push):
access = factories.UserDocumentAccessFactory(document=document)
expected_user_subs[str(document.id)] = str(access.user.sub)
assert SearchIndexer().index(batch_size=2) == 5
assert FindDocumentIndexer().index(batch_size=2) == 5
# Should be 3 batches: 2 + 2 + 1
assert mock_push.call_count == 3
@ -337,7 +333,7 @@ def test_services_search_indexers_batch_size_argument(mock_push):
assert seen_doc_ids == {str(d.id) for d in documents}
@patch.object(SearchIndexer, "push")
@patch.object(FindDocumentIndexer, "push")
@pytest.mark.usefixtures("indexer_settings")
def test_services_search_indexers_ignore_empty_documents(mock_push):
"""
@ -349,7 +345,7 @@ def test_services_search_indexers_ignore_empty_documents(mock_push):
empty_title = factories.DocumentFactory(title="")
empty_content = factories.DocumentFactory(content="")
assert SearchIndexer().index() == 3
assert FindDocumentIndexer().index() == 3
assert mock_push.call_count == 1
@ -365,7 +361,7 @@ def test_services_search_indexers_ignore_empty_documents(mock_push):
}
@patch.object(SearchIndexer, "push")
@patch.object(FindDocumentIndexer, "push")
def test_services_search_indexers_skip_empty_batches(mock_push, indexer_settings):
"""
Documents indexing batch can be empty if all the docs are empty.
@ -377,14 +373,14 @@ def test_services_search_indexers_skip_empty_batches(mock_push, indexer_settings
# Only empty docs
factories.DocumentFactory.create_batch(5, content="", title="")
assert SearchIndexer().index() == 1
assert FindDocumentIndexer().index() == 1
assert mock_push.call_count == 1
results = [doc["id"] for doc in mock_push.call_args[0][0]]
assert results == [str(document.id)]
@patch.object(SearchIndexer, "push")
@patch.object(FindDocumentIndexer, "push")
@pytest.mark.usefixtures("indexer_settings")
def test_services_search_indexers_ancestors_link_reach(mock_push):
"""Document accesses and reach should take into account ancestors link reaches."""
@ -395,7 +391,7 @@ def test_services_search_indexers_ancestors_link_reach(mock_push):
parent = factories.DocumentFactory(parent=grand_parent, link_reach="public")
document = factories.DocumentFactory(parent=parent, link_reach="restricted")
assert SearchIndexer().index() == 4
assert FindDocumentIndexer().index() == 4
results = {doc["id"]: doc for doc in mock_push.call_args[0][0]}
assert len(results) == 4
@ -405,7 +401,7 @@ def test_services_search_indexers_ancestors_link_reach(mock_push):
assert results[str(document.id)]["reach"] == "public"
@patch.object(SearchIndexer, "push")
@patch.object(FindDocumentIndexer, "push")
@pytest.mark.usefixtures("indexer_settings")
def test_services_search_indexers_ancestors_users(mock_push):
"""Document accesses and reach should include users from ancestors."""
@ -415,7 +411,7 @@ def test_services_search_indexers_ancestors_users(mock_push):
parent = factories.DocumentFactory(parent=grand_parent, users=[user_p])
document = factories.DocumentFactory(parent=parent, users=[user_d])
assert SearchIndexer().index() == 3
assert FindDocumentIndexer().index() == 3
results = {doc["id"]: doc for doc in mock_push.call_args[0][0]}
assert len(results) == 3
@ -428,7 +424,7 @@ def test_services_search_indexers_ancestors_users(mock_push):
}
@patch.object(SearchIndexer, "push")
@patch.object(FindDocumentIndexer, "push")
@pytest.mark.usefixtures("indexer_settings")
def test_services_search_indexers_ancestors_teams(mock_push):
"""Document accesses and reach should include teams from ancestors."""
@ -436,7 +432,7 @@ def test_services_search_indexers_ancestors_teams(mock_push):
parent = factories.DocumentFactory(parent=grand_parent, teams=["team_p"])
document = factories.DocumentFactory(parent=parent, teams=["team_d"])
assert SearchIndexer().index() == 3
assert FindDocumentIndexer().index() == 3
results = {doc["id"]: doc for doc in mock_push.call_args[0][0]}
assert len(results) == 3
@ -451,9 +447,9 @@ def test_push_uses_correct_url_and_data(mock_post, indexer_settings):
push() should call requests.post with the correct URL from settings
the timeout set to 10 seconds and the data as JSON.
"""
indexer_settings.SEARCH_INDEXER_URL = "http://example.com/index"
indexer_settings.INDEXING_URL = "http://example.com/index"
indexer = SearchIndexer()
indexer = FindDocumentIndexer()
sample_data = [{"id": "123", "title": "Test"}]
mock_response = mock_post.return_value
@ -464,7 +460,7 @@ def test_push_uses_correct_url_and_data(mock_post, indexer_settings):
mock_post.assert_called_once()
args, kwargs = mock_post.call_args
assert args[0] == indexer_settings.SEARCH_INDEXER_URL
assert args[0] == indexer_settings.INDEXING_URL
assert kwargs.get("json") == sample_data
assert kwargs.get("timeout") == 10
@ -542,9 +538,7 @@ def test_services_search_indexers_search_errors(indexer_settings):
"""
factories.DocumentFactory()
indexer_settings.SEARCH_INDEXER_QUERY_URL = (
"http://app-find/api/v1.0/documents/search/"
)
indexer_settings.SEARCH_URL = "http://app-find/api/v1.0/documents/search/"
responses.add(
responses.POST,
@ -554,17 +548,17 @@ def test_services_search_indexers_search_errors(indexer_settings):
)
with pytest.raises(HTTPError):
SearchIndexer().search("alpha", token="mytoken")
FindDocumentIndexer().search("alpha", token="mytoken")
@patch("requests.post")
def test_services_search_indexers_search(mock_post, indexer_settings):
"""
search() should call requests.post to SEARCH_INDEXER_QUERY_URL with the
search() should call requests.post to SEARCH_URL with the
document ids from linktraces.
"""
user = factories.UserFactory()
indexer = SearchIndexer()
indexer = FindDocumentIndexer()
mock_response = mock_post.return_value
mock_response.raise_for_status.return_value = None # No error
@ -582,7 +576,7 @@ def test_services_search_indexers_search(mock_post, indexer_settings):
args, kwargs = mock_post.call_args
assert args[0] == indexer_settings.SEARCH_INDEXER_QUERY_URL
assert args[0] == indexer_settings.SEARCH_URL
query_data = kwargs.get("json")
assert query_data["q"] == "alpha"
@ -605,7 +599,7 @@ def test_services_search_indexers_search_nb_results(mock_post, indexer_settings)
indexer_settings.SEARCH_INDEXER_QUERY_LIMIT = 25
user = factories.UserFactory()
indexer = SearchIndexer()
indexer = FindDocumentIndexer()
mock_response = mock_post.return_value
mock_response.raise_for_status.return_value = None # No error
@ -623,7 +617,7 @@ def test_services_search_indexers_search_nb_results(mock_post, indexer_settings)
args, kwargs = mock_post.call_args
assert args[0] == indexer_settings.SEARCH_INDEXER_QUERY_URL
assert args[0] == indexer_settings.SEARCH_URL
assert kwargs.get("json")["nb_results"] == 25
# The argument overrides the setting value
@ -631,5 +625,53 @@ def test_services_search_indexers_search_nb_results(mock_post, indexer_settings)
args, kwargs = mock_post.call_args
assert args[0] == indexer_settings.SEARCH_INDEXER_QUERY_URL
assert args[0] == indexer_settings.SEARCH_URL
assert kwargs.get("json")["nb_results"] == 109
def test_search_indexer_get_title_with_localized_field():
"""Test extracting title from localized title field."""
source = {"title.extension": "Bonjour", "id": 1, "content": "test"}
result = FindDocumentIndexer.get_title(source)
assert result == "Bonjour"
def test_search_indexer_get_title_with_multiple_localized_fields():
"""Test that first matching localized title is returned."""
source = {"title.extension": "Bonjour", "title.en": "Hello", "id": 1}
result = FindDocumentIndexer.get_title(source)
assert result in ["Bonjour", "Hello"]
def test_search_indexer_get_title_fallback_to_plain_title():
"""Test fallback to plain 'title' field when no localized field exists."""
source = {"title": "Hello World", "id": 1}
result = FindDocumentIndexer.get_title(source)
assert result == "Hello World"
def test_search_indexer_get_title_no_title_field():
"""Test that empty string is returned when no title field exists."""
source = {"id": 1, "content": "test"}
result = FindDocumentIndexer.get_title(source)
assert result == ""
def test_search_indexer_get_title_with_empty_localized_title():
"""Test that fallback works when localized title is empty."""
source = {"title.extension": "", "title": "Fallback Title", "id": 1}
result = FindDocumentIndexer.get_title(source)
assert result == "Fallback Title"
def test_search_indexer_get_title_with_multiple_extension():
"""Test extracting title from title field with multiple extensions."""
source = {"title.extension_1.extension_2": "Bonjour", "id": 1, "content": "test"}
result = FindDocumentIndexer.get_title(source)
assert result == "Bonjour"

View file

@ -205,3 +205,38 @@ def test_utils_users_sharing_documents_with_empty_result():
cached_data = cache.get(cache_key)
assert cached_data == {}
def test_utils_get_value_by_pattern_matching_key():
"""Test extracting value from a dictionary with a matching key pattern."""
data = {"title.extension": "Bonjour", "id": 1, "content": "test"}
result = utils.get_value_by_pattern(data, r"^title\.")
assert set(result) == {"Bonjour"}
def test_utils_get_value_by_pattern_multiple_matches():
"""Test that all matching keys are returned."""
data = {"title.extension_1": "Bonjour", "title.extension_2": "Hello", "id": 1}
result = utils.get_value_by_pattern(data, r"^title\.")
assert set(result) == {
"Bonjour",
"Hello",
}
def test_utils_get_value_by_pattern_multiple_extensions():
"""Test that all matching keys are returned."""
data = {"title.extension_1.extension_2": "Bonjour", "id": 1}
result = utils.get_value_by_pattern(data, r"^title\.")
assert set(result) == {"Bonjour"}
def test_utils_get_value_by_pattern_no_match():
"""Test that empty list is returned when no key matches the pattern."""
data = {"name": "Test", "id": 1}
result = utils.get_value_by_pattern(data, r"^title\.")
assert result == []

View file

@ -18,6 +18,27 @@ from core import enums, models
logger = logging.getLogger(__name__)
def get_value_by_pattern(data, pattern):
"""
Get all values from keys matching a regex pattern in a dictionary.
Args:
data (dict): Source dictionary to search
pattern (str): Regex pattern to match against keys
Returns:
list: List of values for all matching keys, empty list if no matches
Example:
>>> get_value_by_pattern({"title.fr": "Bonjour", "id": 1}, r"^title\\.")
["Bonjour"]
>>> get_value_by_pattern({"title.fr": "Bonjour", "title.en": "Hello"}, r"^title\\.")
["Bonjour", "Hello"]
"""
regex = re.compile(pattern)
return [value for key, value in data.items() if regex.match(key)]
def get_ancestor_to_descendants_map(paths, steplen):
"""
Given a list of document paths, return a mapping of ancestor_path -> set of descendant_paths.

View file

@ -113,8 +113,8 @@ class Base(Configuration):
SEARCH_INDEXER_BATCH_SIZE = values.IntegerValue(
default=100_000, environ_name="SEARCH_INDEXER_BATCH_SIZE", environ_prefix=None
)
SEARCH_INDEXER_URL = values.Value(
default=None, environ_name="SEARCH_INDEXER_URL", environ_prefix=None
INDEXING_URL = values.Value(
default=None, environ_name="INDEXING_URL", environ_prefix=None
)
SEARCH_INDEXER_COUNTDOWN = values.IntegerValue(
default=1, environ_name="SEARCH_INDEXER_COUNTDOWN", environ_prefix=None
@ -122,8 +122,8 @@ class Base(Configuration):
SEARCH_INDEXER_SECRET = values.Value(
default=None, environ_name="SEARCH_INDEXER_SECRET", environ_prefix=None
)
SEARCH_INDEXER_QUERY_URL = values.Value(
default=None, environ_name="SEARCH_INDEXER_QUERY_URL", environ_prefix=None
SEARCH_URL = values.Value(
default=None, environ_name="SEARCH_URL", environ_prefix=None
)
SEARCH_INDEXER_QUERY_LIMIT = values.PositiveIntegerValue(
default=50, environ_name="SEARCH_INDEXER_QUERY_LIMIT", environ_prefix=None

View file

@ -3,6 +3,7 @@ import {
StyleSchema,
} from '@blocknote/core';
import { useBlockNoteEditor } from '@blocknote/react';
import { useTreeContext } from '@gouvfr-lasuite/ui-kit';
import type { KeyboardEvent } from 'react';
import { useEffect, useRef, useState } from 'react';
import { useTranslation } from 'react-i18next';
@ -26,12 +27,13 @@ import {
import FoundPageIcon from '@/docs/doc-editor/assets/doc-found.svg';
import AddPageIcon from '@/docs/doc-editor/assets/doc-plus.svg';
import {
Doc,
getEmojiAndTitle,
useCreateChildDocTree,
useDocStore,
useTrans,
} from '@/docs/doc-management';
import { DocSearchSubPageContent, DocSearchTarget } from '@/docs/doc-search';
import { DocSearchContent, DocSearchTarget } from '@/docs/doc-search';
import { useResponsiveStore } from '@/stores';
const inputStyle = css`
@ -87,7 +89,7 @@ export const SearchPage = ({
const { isDesktop } = useResponsiveStore();
const { untitledDocument } = useTrans();
const isEditable = editor.isEditable;
const treeContext = useTreeContext<Doc>();
/**
* createReactInlineContentSpec add automatically the focus after
* the inline content, so we need to set the focus on the input
@ -226,9 +228,11 @@ export const SearchPage = ({
`}
$margin={{ top: '0.5rem' }}
>
<DocSearchSubPageContent
<DocSearchContent
groupName={t('Select a document')}
search={search}
filters={{ target: DocSearchTarget.CURRENT }}
target={DocSearchTarget.CURRENT}
parentPath={treeContext?.root?.path}
onSelect={(doc) => {
if (!isEditable) {
return;
@ -256,7 +260,7 @@ export const SearchPage = ({
editor.focus();
}}
renderElement={(doc) => {
renderSearchElement={(doc) => {
const { emoji, titleWithoutEmoji } = getEmojiAndTitle(
doc.title || untitledDocument,
);

View file

@ -9,5 +9,5 @@ export * from './useDocsFavorite';
export * from './useDuplicateDoc';
export * from './useMoveDoc';
export * from './useRestoreDoc';
export * from './useSubDocs';
export * from './useUpdateDoc';
export * from './useSearchDocs';

View file

@ -15,7 +15,6 @@ export type DocsParams = {
page: number;
ordering?: DocsOrdering;
is_creator_me?: boolean;
title?: string;
is_favorite?: boolean;
};
@ -31,9 +30,6 @@ export const constructParams = (params: DocsParams): URLSearchParams => {
if (params.is_creator_me !== undefined) {
searchParams.set('is_creator_me', params.is_creator_me.toString());
}
if (params.title && params.title.length > 0) {
searchParams.set('title', params.title);
}
if (params.is_favorite !== undefined) {
searchParams.set('is_favorite', params.is_favorite.toString());
}

View file

@ -0,0 +1,81 @@
import { useQuery } from '@tanstack/react-query';
import {
APIError,
APIList,
errorCauses,
fetchAPI,
useAPIInfiniteQuery,
} from '@/api';
import { Doc } from '@/docs/doc-management';
import { DocSearchTarget } from '@/docs/doc-search';
export type SearchDocsParams = {
page: number;
q: string;
target?: DocSearchTarget;
parentPath?: string;
};
const constructParams = ({
q,
page,
target,
parentPath,
}: SearchDocsParams): URLSearchParams => {
const searchParams = new URLSearchParams();
searchParams.set('q', q);
if (target === DocSearchTarget.CURRENT && parentPath) {
searchParams.set('path', parentPath);
}
if (page) {
searchParams.set('page', page.toString());
}
return searchParams;
};
const searchDocs = async ({
q,
page,
target,
parentPath,
}: SearchDocsParams): Promise<APIList<Doc>> => {
const searchParams = constructParams({ q, page, target, parentPath });
const response = await fetchAPI(
`documents/search/?${searchParams.toString()}`,
);
if (!response.ok) {
throw new APIError('Failed to get the docs', await errorCauses(response));
}
return response.json() as Promise<APIList<Doc>>;
};
export const KEY_LIST_SEARCH_DOC = 'search-docs';
export const useSearchDocs = (
{ q, page, target, parentPath }: SearchDocsParams,
queryConfig?: { enabled?: boolean },
) => {
return useQuery<APIList<Doc>, APIError, APIList<Doc>>({
queryKey: [KEY_LIST_SEARCH_DOC, 'search', { q, page, target, parentPath }],
queryFn: () => searchDocs({ q, page, target, parentPath }),
...queryConfig,
});
};
export const useInfiniteSearchDocs = (
params: SearchDocsParams,
queryConfig?: { enabled?: boolean },
) => {
return useAPIInfiniteQuery(
KEY_LIST_SEARCH_DOC,
searchDocs,
params,
queryConfig,
);
};

View file

@ -1,62 +0,0 @@
import { UseQueryOptions, useQuery } from '@tanstack/react-query';
import {
APIError,
InfiniteQueryConfig,
errorCauses,
fetchAPI,
useAPIInfiniteQuery,
} from '@/api';
import { DocsOrdering } from '../types';
import { DocsResponse, constructParams } from './useDocs';
export type SubDocsParams = {
page: number;
ordering?: DocsOrdering;
is_creator_me?: boolean;
title?: string;
is_favorite?: boolean;
parent_id: string;
};
export const getSubDocs = async (
params: SubDocsParams,
): Promise<DocsResponse> => {
const searchParams = constructParams(params);
searchParams.set('parent_id', params.parent_id);
const response: Response = await fetchAPI(
`documents/${params.parent_id}/descendants/?${searchParams.toString()}`,
);
if (!response.ok) {
throw new APIError(
'Failed to get the sub docs',
await errorCauses(response),
);
}
return response.json() as Promise<DocsResponse>;
};
export const KEY_LIST_SUB_DOC = 'sub-docs';
export function useSubDocs(
params: SubDocsParams,
queryConfig?: UseQueryOptions<DocsResponse, APIError, DocsResponse>,
) {
return useQuery<DocsResponse, APIError, DocsResponse>({
queryKey: [KEY_LIST_SUB_DOC, params],
queryFn: () => getSubDocs(params),
...queryConfig,
});
}
export const useInfiniteSubDocs = (
params: SubDocsParams,
queryConfig?: InfiniteQueryConfig<DocsResponse>,
) => {
return useAPIInfiniteQuery(KEY_LIST_SUB_DOC, getSubDocs, params, queryConfig);
};

View file

@ -4,7 +4,10 @@ import { InView } from 'react-intersection-observer';
import { Box } from '@/components/';
import { QuickSearchData, QuickSearchGroup } from '@/components/quick-search';
import { Doc, useInfiniteDocs } from '@/docs/doc-management';
import { useInfiniteSearchDocs } from '@/docs/doc-management/api/useSearchDocs';
import { DocSearchTarget } from '@/docs/doc-search';
import { Doc } from '../../doc-management';
import { DocSearchItem } from './DocSearchItem';
@ -15,6 +18,8 @@ type DocSearchContentProps = {
isSearchNotMandatory?: boolean;
onSelect: (doc: Doc) => void;
onLoadingChange?: (loading: boolean) => void;
target?: DocSearchTarget;
parentPath?: string;
renderSearchElement?: (doc: Doc) => React.ReactNode;
};
@ -25,6 +30,8 @@ export const DocSearchContent = ({
onSelect,
onLoadingChange,
renderSearchElement,
target,
parentPath,
isSearchNotMandatory,
}: DocSearchContentProps) => {
const {
@ -34,10 +41,17 @@ export const DocSearchContent = ({
isLoading,
fetchNextPage,
hasNextPage,
} = useInfiniteDocs({
} = useInfiniteSearchDocs(
{
q: search,
page: 1,
...(search ? { title: search } : {}),
});
target,
parentPath,
},
{
enabled: target !== DocSearchTarget.CURRENT || !!parentPath,
},
);
const loading = isFetching || isRefetching || isLoading;
const [docsData, setDocsData] = useState<QuickSearchData<Doc>>({
@ -79,12 +93,12 @@ export const DocSearchContent = ({
}, [
search,
data?.pages,
fetchNextPage,
hasNextPage,
filterResults,
groupName,
isSearchNotMandatory,
loading,
hasNextPage,
fetchNextPage,
]);
useEffect(() => {

View file

@ -1,4 +1,5 @@
import { Modal, ModalSize } from '@gouvfr-lasuite/cunningham-react';
import { TreeContextType, useTreeContext } from '@gouvfr-lasuite/ui-kit';
import Image from 'next/image';
import { useRouter } from 'next/router';
import { useState } from 'react';
@ -8,45 +9,39 @@ import { useDebouncedCallback } from 'use-debounce';
import { Box, ButtonCloseModal, Text } from '@/components';
import { QuickSearch } from '@/components/quick-search';
import { Doc, useDocUtils } from '@/docs/doc-management';
import {
DocSearchFilters,
DocSearchFiltersValues,
DocSearchTarget,
} from '@/docs/doc-search';
import { useResponsiveStore } from '@/stores';
import EmptySearchIcon from '../assets/illustration-docs-empty.png';
import { DocSearchContent } from './DocSearchContent';
import {
DocSearchFilters,
DocSearchFiltersValues,
DocSearchTarget,
} from './DocSearchFilters';
import { DocSearchItem } from './DocSearchItem';
import { DocSearchSubPageContent } from './DocSearchSubPageContent';
type DocSearchModalGlobalProps = {
onClose: () => void;
isOpen: boolean;
showFilters?: boolean;
defaultFilters?: DocSearchFiltersValues;
treeContext?: TreeContextType<Doc> | null;
};
const DocSearchModalGlobal = ({
showFilters = false,
defaultFilters,
treeContext,
...modalProps
}: DocSearchModalGlobalProps) => {
const { t } = useTranslation();
const [loading, setLoading] = useState(false);
const router = useRouter();
const isDocPage = router.pathname === '/docs/[id]';
const [search, setSearch] = useState('');
const [filters, setFilters] = useState<DocSearchFiltersValues>(
defaultFilters ?? {},
);
const target = filters.target ?? DocSearchTarget.ALL;
const { isDesktop } = useResponsiveStore();
const handleInputSearch = useDebouncedCallback(setSearch, 300);
const handleSelect = (doc: Doc) => {
@ -121,26 +116,23 @@ const DocSearchModalGlobal = ({
</Box>
)}
{search && (
<>
{target === DocSearchTarget.ALL && (
<DocSearchContent
groupName={t('Select a document')}
search={search}
onSelect={handleSelect}
onLoadingChange={setLoading}
target={
filters.target === DocSearchTarget.CURRENT
? DocSearchTarget.CURRENT
: DocSearchTarget.ALL
}
parentPath={
filters.target === DocSearchTarget.CURRENT
? treeContext?.root?.path
: undefined
}
/>
)}
{isDocPage && target === DocSearchTarget.CURRENT && (
<DocSearchSubPageContent
search={search}
filters={filters}
onSelect={handleSelect}
onLoadingChange={setLoading}
renderElement={(doc) => <DocSearchItem doc={doc} />}
/>
)}
</>
)}
</Box>
</QuickSearch>
</Box>
@ -158,6 +150,7 @@ const DocSearchModalDetail = ({
}: DocSearchModalDetailProps) => {
const { hasChildren, isChild } = useDocUtils(doc);
const isWithChildren = isChild || hasChildren;
const treeContext = useTreeContext<Doc>();
let defaultFilters = DocSearchTarget.ALL;
let showFilters = false;
@ -171,6 +164,7 @@ const DocSearchModalDetail = ({
{...modalProps}
showFilters={showFilters}
defaultFilters={{ target: defaultFilters }}
treeContext={treeContext}
/>
);
};

View file

@ -1,103 +0,0 @@
import { useTreeContext } from '@gouvfr-lasuite/ui-kit';
import { t } from 'i18next';
import React, { useEffect, useState } from 'react';
import { InView } from 'react-intersection-observer';
import { QuickSearchData, QuickSearchGroup } from '@/components/quick-search';
import { Doc, useInfiniteSubDocs } from '@/docs/doc-management';
import { DocSearchFiltersValues } from './DocSearchFilters';
type DocSearchSubPageContentProps = {
search: string;
filters: DocSearchFiltersValues;
onSelect: (doc: Doc) => void;
onLoadingChange?: (loading: boolean) => void;
renderElement: (doc: Doc) => React.ReactNode;
};
export const DocSearchSubPageContent = ({
search,
filters,
onSelect,
onLoadingChange,
renderElement,
}: DocSearchSubPageContentProps) => {
const treeContext = useTreeContext<Doc>();
const {
data: subDocsData,
isFetching,
isRefetching,
isLoading,
fetchNextPage: subDocsFetchNextPage,
hasNextPage: subDocsHasNextPage,
} = useInfiniteSubDocs(
{
page: 1,
title: search,
...filters,
parent_id: treeContext?.root?.id ?? '',
},
{
enabled: !!treeContext?.root?.id,
},
);
const [docsData, setDocsData] = useState<QuickSearchData<Doc>>({
groupName: '',
elements: [],
emptyString: '',
});
const loading = isFetching || isRefetching || isLoading;
useEffect(() => {
if (loading) {
return;
}
const subDocs = subDocsData?.pages.flatMap((page) => page.results) || [];
if (treeContext?.root) {
const isRootTitleIncludeSearch = treeContext.root?.title
?.toLowerCase()
.includes(search.toLowerCase());
if (isRootTitleIncludeSearch) {
subDocs.unshift(treeContext.root);
}
}
setDocsData({
groupName: subDocs.length > 0 ? t('Select a doc') : '',
elements: search ? subDocs : [],
emptyString: search ? t('No document found') : t('Search by title'),
endActions: subDocsHasNextPage
? [
{
content: <InView onChange={() => void subDocsFetchNextPage()} />,
},
]
: [],
});
}, [
loading,
search,
subDocsData?.pages,
subDocsFetchNextPage,
subDocsHasNextPage,
treeContext?.root,
]);
useEffect(() => {
onLoadingChange?.(loading);
}, [loading, onLoadingChange]);
return (
<QuickSearchGroup
onSelect={onSelect}
group={docsData}
renderElement={renderElement}
/>
);
};

View file

@ -1,4 +1,3 @@
export * from './DocSearchContent';
export * from './DocSearchModal';
export * from './DocSearchFilters';
export * from './DocSearchSubPageContent';