Integrate Find (#1834)

## Purpose

Integrate Find into Docs.

## Proposal

- [x] add a `useSeachDocs` hook in charge of calling the search
endpoint.
- [x] add an optional `path` param to the `search` route. This param
represents the parent document path in the case of a sub-document
(descendants) search.
- [x] ️return Indexer results directly, without DB calls to retrieve the
Document objects. All the information necessary for display is indexed
in Find, so we can skip the DB calls and improve performance.
- [x] ♻️ refactor the React `DocSearchContent` components.
`DocSearchContent` and `DocSearchSubContent` are now merged into a
single component handling all search scenarios and relying on the single
`search` route.
- [x] 🔥remove the pagination logic in the Indexer. Removing the DB
calls also removes the DRF queryset object which handled the pagination,
and we consider pagination unnecessary for search v1.
- [x] 🔥remove the `document/<document_id>/descendants` route. This
route is not used anymore. The logic for finding the descendants is
moved to the internal `_list_descendants` method. This method is based
on the parent `path` instead of the parent `id`, which has consequences
for user access management: relying on the path prevents the use of the
`self.get_object()` method which used to handle the user access logic.
- [x] handle fallback logic to the DRF-based title search when the
indexer is not configured, badly configured, or failing at run time.
- [x] handle the language extension in the `title` field. Find returns
titles with a language extension (e.g. `{ "title.fr": "rapport
d'activité" }` instead of `{ "title": "rapport d'activité" }`).
- [x] 🔧 add a `common.test` file to allow running the tests without
Docker.
- [x] ♻️ rename `SearchIndexer` -> `FindDocumentIndexer`. This class is
specific to Find, and the new name is more consistent with
`BaseDocumentIndexer`.
- [x] ♻️ rename `SEARCH_INDEXER_URL` -> `INDEXING_URL` and
`SEARCH_INDEXER_QUERY_URL` -> `SEARCH_URL`, as the original names were
very confusing.
- [x] 🔧 update the environment variables to activate the
FindDocumentIndexer.
- [x] automate the generation of the encryption key during bootstrap.
`OIDC_STORE_REFRESH_TOKEN_KEY` is a mandatory secret key. We cannot push
it to GitHub, and we want any contributor to be able to run the app by
simply running `make bootstrap`. We chose to generate it and write it
into `common.local` during bootstrap.
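The fallback behaviour described above can be sketched as follows. This is a minimal illustration, not the project's exact code: `indexer` and `title_search` are hypothetical stand-ins for the configured indexer instance and the DRF-based title search.

```python
import logging

logger = logging.getLogger(__name__)


def search_documents(indexer, query, title_search):
    """Use the configured indexer when possible, else fall back to title search."""
    if indexer is None:
        # No indexer configured: fall back to the DRF-based title search.
        return title_search(query)
    try:
        return indexer.search(q=query)
    except OSError as exc:  # the real code catches requests.exceptions.RequestException
        logger.error("Error while searching documents with indexer: %s", exc)
        # Indexer unreachable or failing at run time: same fallback.
        return title_search(query)
```

This keeps the search endpoint usable even when Find is down or unconfigured, at the cost of degraded (title-only) results.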
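The language-extension handling of the `title` field can be illustrated with a small helper. This is a hypothetical sketch of the idea, not the project's exact `get_title` implementation:

```python
import re


def extract_title(source):
    """Return the title from a Find hit, whatever the language suffix.

    Find may index the title under a language-suffixed key such as
    "title.fr". Fall back to a plain "title" key, else an empty string.
    """
    for key, value in source.items():
        # Match keys like "title.fr", "title.en", ... with a non-empty value.
        if re.match(r"^title\.", key) and value:
            return value
    return source.get("title", "")
```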
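For reference, a Fernet key (the format `OIDC_STORE_REFRESH_TOKEN_KEY` must have) is 32 random bytes encoded as url-safe base64, which can be produced with the standard library alone:

```python
import base64
import os

# 32 random bytes, url-safe base64-encoded: a 44-character Fernet key.
key = base64.urlsafe_b64encode(os.urandom(32)).decode()
env_line = f"OIDC_STORE_REFRESH_TOKEN_KEY={key}"
```

Appending a line like `env_line` to `env.d/development/common.local` mirrors what the bootstrap script does.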

## External contributions

Thank you for your contribution! 🎉  

Please ensure the following items are checked before submitting your
pull request:
- [x] I have read and followed the [contributing
guidelines](https://github.com/suitenumerique/docs/blob/main/CONTRIBUTING.md)
- [x] I have read and agreed to the [Code of
Conduct](https://github.com/suitenumerique/docs/blob/main/CODE_OF_CONDUCT.md)
- [x] I have signed off my commits with `git commit --signoff` (DCO
compliance)
- [x] I have signed my commits with my SSH or GPG key (`git commit -S`)
- [x] My commit messages follow the required format: `<gitmoji>(type)
title description`
- [x] I have added a changelog entry under `## [Unreleased]` section (if
noticeable change)
- [x] I have added corresponding tests for new features or bug fixes (if
applicable)

---------

Signed-off-by: charles <charles.englebert@protonmail.com>
Author: Charles Englebert, 2026-03-17 17:32:03 +01:00 (committed by GitHub)
Parent: ad36210e45
Commit: 0fca6db79c
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
37 changed files with 1758 additions and 788 deletions


@@ -107,12 +107,15 @@ and this project adheres to
 - ✨(frontend) Add stat for Crisp #1824
 - ✨(auth) add silent login #1690
 - 🔧(project) add DJANGO_EMAIL_URL_APP environment variable #1825
+- ✨(frontend) activate Find search #1834
+- ✨ handle searching on subdocuments #1834

 ### Changed

 - ♿(frontend) improve accessibility:
 - ♿️(frontend) fix subdoc opening and emoji pick focus #1745
 - ✨(backend) add field for button label in email template #1817
+- ✨(backend) improve fallback logic on search endpoint #1834

@@ -126,6 +129,8 @@ and this project adheres to
 ### Removed

 - 🔥(project) remove all code related to template #1780
+- 🔥(api) remove `documents/<document_id>/descendants/` endpoint #1834
+- 🔥(api) remove pagination on `documents/search/` endpoint #1834

 ### Security


@@ -79,10 +79,16 @@ create-env-local-files:
 	@touch env.d/development/kc_postgresql.local
 .PHONY: create-env-local-files

+generate-secret-keys: ## generate secret keys to be stored in common.local
+	@bin/generate-oidc-store-refresh-token-key.sh
+.PHONY: generate-secret-keys
+
 pre-bootstrap: \
 	data/media \
 	data/static \
-	create-env-local-files
+	create-env-local-files \
+	generate-secret-keys
 .PHONY: pre-bootstrap

 post-bootstrap: \


@@ -173,6 +173,11 @@ make frontend-test
 make frontend-lint
 ```

+Backend tests can be run without Docker. This is useful to configure PyCharm or VS Code to run them.
+Running the tests outside Docker requires overriding some URL and port values that differ inside and
+outside Docker. `env.d/development/common` contains all the variables, some of which have to be
+overridden by those in `env.d/development/common.test`.
+
 ### Demo content

 Create a basic demo site:


@@ -1,6 +0,0 @@
-#!/usr/bin/env bash
-
-# shellcheck source=bin/_config.sh
-source "$(dirname "${BASH_SOURCE[0]}")/_config.sh"
-
-_dc_run app-dev python -c 'from cryptography.fernet import Fernet;import sys; sys.stdout.write("\n" + Fernet.generate_key().decode() + "\n");'


@@ -0,0 +1,13 @@
+#!/usr/bin/env bash
+
+# Generate the secret OIDC_STORE_REFRESH_TOKEN_KEY and store it to common.local
+
+set -eo pipefail
+
+COMMON_LOCAL="env.d/development/common.local"
+
+OIDC_STORE_REFRESH_TOKEN_KEY=$(openssl rand -base64 32)
+
+echo "" >> "${COMMON_LOCAL}"
+echo "OIDC_STORE_REFRESH_TOKEN_KEY=${OIDC_STORE_REFRESH_TOKEN_KEY}" >> "${COMMON_LOCAL}"
+echo "✓ OIDC_STORE_REFRESH_TOKEN_KEY generated and stored in ${COMMON_LOCAL}"


@@ -108,6 +108,9 @@ These are the environment variables you can set for the `impress-backend` container
 | OIDC_RP_SCOPES | Scopes requested for OIDC | openid email |
 | OIDC_RP_SIGN_ALGO | Verification algorithm used for OIDC tokens | RS256 |
 | OIDC_STORE_ID_TOKEN | Store OIDC token | true |
+| OIDC_STORE_ACCESS_TOKEN | If true, stores the OIDC access token in the session | false |
+| OIDC_STORE_REFRESH_TOKEN | If true, stores the OIDC refresh token in the session | false |
+| OIDC_STORE_REFRESH_TOKEN_KEY | Key to encrypt the refresh token stored in the session; must be a valid Fernet key | |
 | OIDC_USERINFO_FULLNAME_FIELDS | OIDC token claims to create full name | ["first_name", "last_name"] |
 | OIDC_USERINFO_SHORTNAME_FIELD | OIDC token claims to create shortname | first_name |
 | OIDC_USE_NONCE | Use nonce for OIDC | true |
@@ -117,8 +120,9 @@ These are the environment variables you can set for the `impress-backend` container
 | SEARCH_INDEXER_CLASS | Class of the backend for document indexation & search | |
 | SEARCH_INDEXER_COUNTDOWN | Minimum debounce delay of indexation jobs (in seconds) | 1 |
 | SEARCH_INDEXER_QUERY_LIMIT | Maximum number of results expected from search endpoint | 50 |
-| SEARCH_INDEXER_SECRET | Token for indexation queries | |
-| SEARCH_INDEXER_URL | Find application endpoint for indexation | |
+| SEARCH_INDEXER_SECRET | Token required for indexation queries | |
+| INDEXING_URL | Find application endpoint for indexation | |
+| SEARCH_URL | Find application endpoint for search queries | |
 | SENTRY_DSN | Sentry host | |
 | SESSION_COOKIE_AGE | Duration of the cookie session | 60*60*12 |
 | SIGNUP_NEW_USER_TO_MARKETING_EMAIL | Register new user to the marketing onboarding. If True, see env LASUITE_MARKETING_* system | False |


@@ -1,8 +1,8 @@
-# Setup the Find search for Impress
+# Setup Find search for Docs

-This configuration will enable the fulltext search feature for Docs:
+This configuration will enable Find searches:

-- Each save on **core.Document** or **core.DocumentAccess** will trigger the indexer
-- The `api/v1.0/documents/search/` will work as a proxy with the Find API for fulltext search.
+- Each save on **core.Document** or **core.DocumentAccess** will trigger the indexing of the document into Find.
+- The `api/v1.0/documents/search/` endpoint will be used as a proxy for searching documents from the Find indexes.

 ## Create an index service for Docs

@@ -15,27 +15,27 @@ See [how-to-use-indexer.md](how-to-use-indexer.md) for details.

 ## Configure settings of Docs

-Add those Django settings the Docs application to enable the feature.
+Find uses service provider authentication for indexing and OIDC authentication for searching.
+Add these Django settings to the Docs application to enable the feature:

 ```shell
 SEARCH_INDEXER_CLASS="core.services.search_indexers.FindDocumentIndexer"
 SEARCH_INDEXER_COUNTDOWN=10 # Debounce delay in seconds for the indexer calls.
+SEARCH_INDEXER_QUERY_LIMIT=50 # Maximum number of results expected from the search endpoint

-# The token from service "docs" of Find application (development).
+INDEXING_URL="http://find:8000/api/v1.0/documents/index/"
+SEARCH_URL="http://find:8000/api/v1.0/documents/search/"

+# Service provider authentication
 SEARCH_INDEXER_SECRET="find-api-key-for-docs-with-exactly-50-chars-length"
-SEARCH_INDEXER_URL="http://find:8000/api/v1.0/documents/index/"

-# Search endpoint. Uses the OIDC token for authentication
-SEARCH_INDEXER_QUERY_URL="http://find:8000/api/v1.0/documents/search/"
-# Maximum number of results expected from the search endpoint
-SEARCH_INDEXER_QUERY_LIMIT=50
+# OIDC authentication
+OIDC_STORE_ACCESS_TOKEN=True # Store the access token in the session
+OIDC_STORE_REFRESH_TOKEN=True # Store the encrypted refresh token in the session
+OIDC_STORE_REFRESH_TOKEN_KEY="<your-32-byte-encryption-key==>"
 ```

-We also need to enable the **OIDC Token** refresh or the authentication will fail quickly.
-
-```shell
-# Store OIDC tokens in the session
-OIDC_STORE_ACCESS_TOKEN = True # Store the access token in the session
-OIDC_STORE_REFRESH_TOKEN = True # Store the encrypted refresh token in the session
-OIDC_STORE_REFRESH_TOKEN_KEY = "your-32-byte-encryption-key==" # Must be a valid Fernet key (32 url-safe base64-encoded bytes)
-```
+`OIDC_STORE_REFRESH_TOKEN_KEY` must be a valid Fernet key (32 url-safe base64-encoded bytes).
+To create one, use the `bin/generate-oidc-store-refresh-token-key.sh` command.


@@ -52,8 +52,8 @@ OIDC_REDIRECT_ALLOWED_HOSTS="localhost:8083,localhost:3000"
 OIDC_AUTH_REQUEST_EXTRA_PARAMS={"acr_values": "eidas1"}

 # Store OIDC tokens in the session. Needed by search/ endpoint.
-# OIDC_STORE_ACCESS_TOKEN = True
-# OIDC_STORE_REFRESH_TOKEN = True # Store the encrypted refresh token in the session.
+OIDC_STORE_ACCESS_TOKEN=True
+OIDC_STORE_REFRESH_TOKEN=True # Store the encrypted refresh token in the session.
 # Must be a valid Fernet key (32 url-safe base64-encoded bytes)
 # To create one, use the bin/fernetkey command.

@@ -87,8 +87,9 @@ DOCSPEC_API_URL=http://docspec:4000/conversion
 # Theme customization
 THEME_CUSTOMIZATION_CACHE_TIMEOUT=15

-# Indexer (disabled)
-# SEARCH_INDEXER_CLASS="core.services.search_indexers.SearchIndexer"
+# Indexer (disabled by default)
+# SEARCH_INDEXER_CLASS=core.services.search_indexers.FindDocumentIndexer
 SEARCH_INDEXER_SECRET=find-api-key-for-docs-with-exactly-50-chars-length # Key generated by create_demo in Find app.
-SEARCH_INDEXER_URL="http://find:8000/api/v1.0/documents/index/"
-SEARCH_INDEXER_QUERY_URL="http://find:8000/api/v1.0/documents/search/"
+INDEXING_URL=http://find:8000/api/v1.0/documents/index/
+SEARCH_URL=http://find:8000/api/v1.0/documents/search/
+SEARCH_INDEXER_QUERY_LIMIT=50


@@ -0,0 +1,7 @@
+# Test environment configuration for running tests without docker
+# Base configuration is loaded from the 'common' file
+
+DJANGO_SETTINGS_MODULE=impress.settings
+DJANGO_CONFIGURATION=Test
+DB_PORT=15432
+AWS_S3_ENDPOINT_URL=http://localhost:9000


@@ -47,10 +47,13 @@ class DocumentFilter(django_filters.FilterSet):
     title = AccentInsensitiveCharFilter(
         field_name="title", lookup_expr="unaccent__icontains", label=_("Title")
     )
+    q = AccentInsensitiveCharFilter(
+        field_name="title", lookup_expr="unaccent__icontains", label=_("Search")
+    )

     class Meta:
         model = models.Document
-        fields = ["title"]
+        fields = ["title", "q"]

 class ListDocumentFilter(DocumentFilter):
@@ -70,7 +73,7 @@ class ListDocumentFilter(DocumentFilter):
     class Meta:
         model = models.Document
-        fields = ["is_creator_me", "is_favorite", "title"]
+        fields = ["is_creator_me", "is_favorite", "title", "q"]

 # pylint: disable=unused-argument
 def filter_is_creator_me(self, queryset, name, value):


@@ -1004,8 +1004,5 @@ class ThreadSerializer(serializers.ModelSerializer):

 class SearchDocumentSerializer(serializers.Serializer):
     """Serializer for fulltext search requests through Find application"""

-    q = serializers.CharField(required=True, allow_blank=False, trim_whitespace=True)
-    page_size = serializers.IntegerField(
-        required=False, min_value=1, max_value=50, default=20
-    )
-    page = serializers.IntegerField(required=False, min_value=1, default=1)
+    q = serializers.CharField(required=True, allow_blank=True, trim_whitespace=True)
+    path = serializers.CharField(required=False, allow_blank=False)


@@ -72,7 +72,11 @@ from core.utils import (
 )

 from . import permissions, serializers, utils
-from .filters import DocumentFilter, ListDocumentFilter, UserSearchFilter
+from .filters import (
+    DocumentFilter,
+    ListDocumentFilter,
+    UserSearchFilter,
+)
 from .throttling import (
     DocumentThrottle,
     UserListThrottleBurst,
@@ -604,20 +608,18 @@ class DocumentViewSet(
         It performs early filtering on model fields, annotates user roles, and removes
         descendant documents to keep only the highest ancestors readable by the current user.
         """
-        user = self.request.user
+        user = request.user

         # Not calling filter_queryset. We do our own cooking.
         queryset = self.get_queryset()
-        filterset = ListDocumentFilter(
-            self.request.GET, queryset=queryset, request=self.request
-        )
+        filterset = ListDocumentFilter(request.GET, queryset=queryset, request=request)
         if not filterset.is_valid():
             raise drf.exceptions.ValidationError(filterset.errors)
         filter_data = filterset.form.cleaned_data

         # Filter as early as possible on fields that are available on the model
-        for field in ["is_creator_me", "title"]:
+        for field in ["is_creator_me", "title", "q"]:
             queryset = filterset.filters[field].filter(queryset, filter_data[field])

         queryset = queryset.annotate_user_roles(user)
@@ -1084,7 +1086,7 @@ class DocumentViewSet(
         filter_data = filterset.form.cleaned_data

         # Filter as early as possible on fields that are available on the model
-        for field in ["is_creator_me", "title"]:
+        for field in ["is_creator_me", "title", "q"]:
             queryset = filterset.filters[field].filter(queryset, filter_data[field])

         queryset = queryset.annotate_user_roles(user)
@@ -1107,7 +1109,11 @@ class DocumentViewSet(
         ordering=["path"],
     )
     def descendants(self, request, *args, **kwargs):
-        """Handle listing descendants of a document"""
+        """Deprecated endpoint to list descendants of a document."""
+        logger.warning(
+            "The 'descendants' endpoint is deprecated and will be removed in a future release. "
+            "The search endpoint should be used for all document retrieval use cases."
+        )
         document = self.get_object()
         queryset = document.get_descendants().filter(ancestors_deleted_at__isnull=True)
@@ -1397,82 +1403,103 @@ class DocumentViewSet(
         return duplicated_document

-    def _search_simple(self, request, text):
-        """
-        Returns a queryset filtered by the content of the document title
-        """
-        # As in the 'list' view, we get a prefiltered queryset (deleted docs are excluded)
-        queryset = self.get_queryset()
-        filterset = DocumentFilter({"title": text}, queryset=queryset)
-
-        if not filterset.is_valid():
-            raise drf.exceptions.ValidationError(filterset.errors)
-
-        queryset = filterset.filter_queryset(queryset)
-
-        return self.get_response_for_queryset(
-            queryset.order_by("-updated_at"),
-            context={
-                "request": request,
-            },
-        )
-
-    def _search_fulltext(self, indexer, request, params):
-        """
-        Returns a queryset from the results of the fulltext search of Find
-        """
-        access_token = request.session.get("oidc_access_token")
-        user = request.user
-        text = params.validated_data["q"]
-        queryset = models.Document.objects.all()
-
-        # Retrieve the document ids from Find.
-        results = indexer.search(
-            text=text,
-            token=access_token,
-            visited=get_visited_document_ids_of(queryset, user),
-        )
-
-        docs_by_uuid = {str(d.pk): d for d in queryset.filter(pk__in=results)}
-        ordered_docs = [docs_by_uuid[id] for id in results]
-
-        page = self.paginate_queryset(ordered_docs)
-        serializer = self.get_serializer(
-            page if page else ordered_docs,
-            many=True,
-            context={
-                "request": request,
-            },
-        )
-
-        return self.get_paginated_response(serializer.data)
-
     @drf.decorators.action(detail=False, methods=["get"], url_path="search")
     @method_decorator(refresh_oidc_access_token)
     def search(self, request, *args, **kwargs):
         """
-        Returns a DRF response containing the filtered, annotated and ordered document list.
-
-        Applies filtering based on request parameter 'q' from `SearchDocumentSerializer`.
-
-        Depending on the configuration it can be:
-        - A fulltext search through the opensearch indexation app "find" if the backend is
-          enabled (see SEARCH_INDEXER_CLASS)
-        - A filtering by the model field 'title'.
-
-        The ordering is always by the most recent first.
+        Returns an ordered list of documents best matching the search query parameter 'q'.
+
+        It depends on a configurable Search Indexer. If no Search Indexer is configured
+        or if it is not reachable, the function falls back to a basic title search.
         """
         params = serializers.SearchDocumentSerializer(data=request.query_params)
         params.is_valid(raise_exception=True)

         indexer = get_document_indexer()
+        if indexer is None:
+            # fallback on title search if the indexer is not configured
+            return self._title_search(request, params.validated_data, *args, **kwargs)

-        if indexer:
-            return self._search_fulltext(indexer, request, params=params)
-
-        # The indexer is not configured, we fallback on a simple icontains filter by the
-        # model field 'title'.
-        return self._search_simple(request, text=params.validated_data["q"])
+        try:
+            return self._search_with_indexer(indexer, request, params=params)
+        except requests.exceptions.RequestException as e:
+            logger.error("Error while searching documents with indexer: %s", e)
+            # fallback on title search if the indexer cannot be reached
+            return self._title_search(request, params.validated_data, *args, **kwargs)

+    @staticmethod
+    def _search_with_indexer(indexer, request, params):
+        """
+        Returns a list of documents matching the query (q) according to the configured indexer.
+        """
+        queryset = models.Document.objects.all()
+        results = indexer.search(
+            q=params.validated_data["q"],
+            token=request.session.get("oidc_access_token"),
+            path=(
+                params.validated_data["path"]
+                if "path" in params.validated_data
+                else None
+            ),
+            visited=get_visited_document_ids_of(queryset, request.user),
+        )
+
+        return drf_response.Response(
+            {
+                "count": len(results),
+                "next": None,
+                "previous": None,
+                "results": results,
+            }
+        )
+
+    def _title_search(self, request, validated_data, *args, **kwargs):
+        """
+        Fallback search method when no indexer is configured.
+        Only searches in the title field of documents.
+        """
+        if not validated_data.get("path"):
+            return self.list(request, *args, **kwargs)
+
+        return self._list_descendants(request, validated_data)
+
+    def _list_descendants(self, request, validated_data):
+        """
+        List all documents whose path starts with the provided path parameter.
+        Includes the parent document itself.
+
+        Used internally by the search endpoint when path filtering is requested.
+        """
+        # Get the parent document without access filtering
+        parent_path = validated_data["path"]
+        try:
+            parent = models.Document.objects.annotate_user_roles(request.user).get(
+                path=parent_path
+            )
+        except models.Document.DoesNotExist as exc:
+            raise drf.exceptions.NotFound("Document not found from path.") from exc
+
+        abilities = parent.get_abilities(request.user)
+        if not abilities.get("search"):
+            raise drf.exceptions.PermissionDenied(
+                "You do not have permission to search within this document."
+            )
+
+        # Get descendants and include the parent, ordered by path
+        queryset = (
+            parent.get_descendants(include_self=True)
+            .filter(ancestors_deleted_at__isnull=True)
+            .order_by("path")
+        )
+        queryset = self.filter_queryset(queryset)
+
+        # filter by title
+        filterset = DocumentFilter(request.GET, queryset=queryset)
+        if not filterset.is_valid():
+            raise drf.exceptions.ValidationError(filterset.errors)
+        queryset = filterset.qs
+
+        return self.get_response_for_queryset(queryset)

     @drf.decorators.action(detail=True, methods=["get"], url_path="versions")
     def versions_list(self, request, *args, **kwargs):


@@ -1330,6 +1330,7 @@ class Document(MP_Node, BaseModel):
             "versions_destroy": is_owner_or_admin,
             "versions_list": has_access_role,
             "versions_retrieve": has_access_role,
+            "search": can_get,
         }

     def send_email(self, subject, emails, context=None, language=None):


@@ -8,7 +8,6 @@ from functools import cache
 from django.conf import settings
 from django.contrib.auth.models import AnonymousUser
 from django.core.exceptions import ImproperlyConfigured
-from django.db.models import Subquery
 from django.utils.module_loading import import_string

 import requests
@@ -78,7 +77,9 @@ def get_visited_document_ids_of(queryset, user):
     if isinstance(user, AnonymousUser):
         return []

-    qs = models.LinkTrace.objects.filter(user=user)
+    visited_ids = models.LinkTrace.objects.filter(user=user).values_list(
+        "document_id", flat=True
+    )

     docs = (
         queryset.exclude(accesses__user=user)
@@ -86,7 +87,7 @@ def get_visited_document_ids_of(queryset, user):
             deleted_at__isnull=True,
             ancestors_deleted_at__isnull=True,
         )
-        .filter(pk__in=Subquery(qs.values("document_id")))
+        .filter(pk__in=visited_ids)
         .order_by("pk")
         .distinct("pk")
     )
@@ -107,15 +108,13 @@ class BaseDocumentIndexer(ABC):
         Initialize the indexer.
         """
         self.batch_size = settings.SEARCH_INDEXER_BATCH_SIZE
-        self.indexer_url = settings.SEARCH_INDEXER_URL
+        self.indexer_url = settings.INDEXING_URL
         self.indexer_secret = settings.SEARCH_INDEXER_SECRET
-        self.search_url = settings.SEARCH_INDEXER_QUERY_URL
+        self.search_url = settings.SEARCH_URL
         self.search_limit = settings.SEARCH_INDEXER_QUERY_LIMIT

         if not self.indexer_url:
-            raise ImproperlyConfigured(
-                "SEARCH_INDEXER_URL must be set in Django settings."
-            )
+            raise ImproperlyConfigured("INDEXING_URL must be set in Django settings.")

         if not self.indexer_secret:
             raise ImproperlyConfigured(
@@ -123,9 +122,7 @@ class BaseDocumentIndexer(ABC):
             )

         if not self.search_url:
-            raise ImproperlyConfigured(
-                "SEARCH_INDEXER_QUERY_URL must be set in Django settings."
-            )
+            raise ImproperlyConfigured("SEARCH_URL must be set in Django settings.")

     def index(self, queryset=None, batch_size=None):
         """
@@ -185,7 +182,7 @@ class BaseDocumentIndexer(ABC):
         """

     # pylint: disable-next=too-many-arguments,too-many-positional-arguments
-    def search(self, text, token, visited=(), nb_results=None):
+    def search(self, q, token, visited=(), nb_results=None, path=None):
         """
         Search for documents in Find app.
         Ensure the same default ordering as "Docs" list: -updated_at
@@ -193,7 +190,7 @@ class BaseDocumentIndexer(ABC):
         Returns ids of the documents

         Args:
-            text (str): Text search content.
+            q (str): user query.
             token (str): OIDC Authentication token.
             visited (list, optional):
                 List of ids of active public documents with LinkTrace
@@ -201,21 +198,24 @@ class BaseDocumentIndexer(ABC):
             nb_results (int, optional):
                 The number of results to return.
                 Defaults to 50 if not specified.
+            path (str, optional):
+                The parent path to search descendants of.
         """
         nb_results = nb_results or self.search_limit

-        response = self.search_query(
+        results = self.search_query(
             data={
-                "q": text,
+                "q": q,
                 "visited": visited,
                 "services": ["docs"],
                 "nb_results": nb_results,
                 "order_by": "updated_at",
                 "order_direction": "desc",
+                "path": path,
             },
             token=token,
         )

-        return [d["_id"] for d in response]
+        return results

     @abstractmethod
     def search_query(self, data, token) -> dict:
@@ -226,11 +226,57 @@ class BaseDocumentIndexer(ABC):
         """

-class SearchIndexer(BaseDocumentIndexer):
+class FindDocumentIndexer(BaseDocumentIndexer):
     """
-    Document indexer that pushes documents to La Suite Find app.
+    Document indexer that indexes and searches documents with La Suite Find app.
     """

+    # pylint: disable=too-many-arguments,too-many-positional-arguments
+    def search(self, q, token, visited=(), nb_results=None, path=None):
+        """format Find search results"""
+        search_results = super().search(q, token, visited, nb_results, path)
+
+        return [
+            {
+                **hit["_source"],
+                "id": hit["_id"],
+                "title": self.get_title(hit["_source"]),
+            }
+            for hit in search_results
+        ]
+
+    @staticmethod
+    def get_title(source):
+        """
+        Find returns the titles with an extension depending on the language.
+        This function extracts the title in a generic way.
+
+        Handles multiple cases:
+        - Localized title fields like "title.<some_extension>"
+        - Fallback to plain "title" field if localized version not found
+        - Returns empty string if no title field exists
+
+        Args:
+            source (dict): The _source dictionary from a search hit
+
+        Returns:
+            str: The extracted title or empty string if not found
+
+        Example:
+            >>> get_title({"title.fr": "Bonjour", "id": 1})
+            "Bonjour"
+            >>> get_title({"title": "Hello", "id": 1})
+            "Hello"
+            >>> get_title({"id": 1})
+            ""
+        """
+        titles = utils.get_value_by_pattern(source, r"^title\.")
+        for title in titles:
+            if title:
+                return title
+
+        if "title" in source:
+            return source["title"]
+
+        return ""
+
     def serialize_document(self, document, accesses):
         """
         Convert a Document to the JSON format expected by La Suite Find.


@@ -63,7 +63,7 @@ def batch_document_indexer_task(timestamp):
    logger.info("Indexed %d documents", count)


def trigger_batch_document_indexer(document):
    """
    Trigger indexation task with a debounce delay set by the SEARCH_INDEXER_COUNTDOWN setting.
@@ -82,14 +82,14 @@ def trigger_batch_document_indexer(item):
        if batch_indexer_throttle_acquire(timeout=countdown):
            logger.info(
                "Add task for batch document indexation from updated_at=%s in %d seconds",
                document.updated_at.isoformat(),
                countdown,
            )
            batch_document_indexer_task.apply_async(
                args=[document.updated_at], countdown=countdown
            )
        else:
            logger.info("Skip task for batch document %s indexation", document.pk)
    else:
        document_indexer_task.apply(args=[document.pk])
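The debounce above relies on `batch_indexer_throttle_acquire`: the first caller in a countdown window wins the slot and schedules the batch task, later callers within the window are skipped. A toy in-process version of that throttle (the real implementation is not shown in this diff and presumably uses a shared cache; the names and signature below are assumptions):

```python
import time

_slots = {}


def batch_indexer_throttle_acquire(timeout, key="batch-indexer", now=time.monotonic):
    """Return True only for the first caller within each `timeout` window."""
    current = now()
    expires_at = _slots.get(key)
    if expires_at is not None and current < expires_at:
        return False  # a batch task is already scheduled for this window
    _slots[key] = current + timeout
    return True


assert batch_indexer_throttle_acquire(timeout=5, key="demo") is True   # schedules the task
assert batch_indexer_throttle_acquire(timeout=5, key="demo") is False  # debounced
```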


@@ -11,7 +11,7 @@ from django.db import transaction
import pytest

from core import factories
from core.services.search_indexers import FindDocumentIndexer


@pytest.mark.django_db
@@ -19,7 +19,7 @@ from core.services.search_indexers import SearchIndexer
def test_index():
    """Test the command `index` that runs the Find app indexer for all the available documents."""
    user = factories.UserFactory()
    indexer = FindDocumentIndexer()

    with transaction.atomic():
        doc = factories.DocumentFactory()
@@ -36,7 +36,7 @@ def test_index():
        str(no_title_doc.path): {"users": [user.sub]},
    }

    with mock.patch.object(FindDocumentIndexer, "push") as mock_push:
        call_command("index")

    push_call_args = [call.args[0] for call in mock_push.call_args_list]


@@ -39,12 +39,10 @@ def indexer_settings_fixture(settings):
    get_document_indexer.cache_clear()

    settings.SEARCH_INDEXER_CLASS = "core.services.search_indexers.FindDocumentIndexer"
    settings.SEARCH_INDEXER_SECRET = "ThisIsAKeyForTest"
    settings.INDEXING_URL = "http://localhost:8081/api/v1.0/documents/index/"
    settings.SEARCH_URL = "http://localhost:8081/api/v1.0/documents/search/"
    settings.SEARCH_INDEXER_COUNTDOWN = 1

    yield settings


@@ -16,7 +16,16 @@ fake = Faker()
pytestmark = pytest.mark.django_db


@pytest.mark.parametrize(
    "title_search_field",
    # for integration with indexer search we must have
    # the same filtering behaviour with "q" and "title" parameters
    [
        "title",
        "q",
    ],
)
def test_api_documents_list_filter_and_access_rights(title_search_field):
    """Filtering on querystring parameters should respect access rights."""
    user = factories.UserFactory()
    client = APIClient()
@@ -76,7 +85,7 @@ def test_api_documents_list_filter_and_access_rights():
    filters = {
        "link_reach": random.choice([None, *models.LinkReachChoices.values]),
        title_search_field: random.choice([None, *word_list]),
        "favorite": random.choice([None, True, False]),
        "creator": random.choice([None, user, other_user]),
        "ordering": random.choice(


@@ -59,6 +59,7 @@ def test_api_documents_retrieve_anonymous_public_standalone():
        "partial_update": document.link_role == "editor",
        "restore": False,
        "retrieve": True,
        "search": True,
        "tree": True,
        "update": document.link_role == "editor",
        "versions_destroy": False,
@@ -136,6 +137,7 @@ def test_api_documents_retrieve_anonymous_public_parent():
        "partial_update": grand_parent.link_role == "editor",
        "restore": False,
        "retrieve": True,
        "search": True,
        "tree": True,
        "update": grand_parent.link_role == "editor",
        "versions_destroy": False,
@@ -246,6 +248,7 @@ def test_api_documents_retrieve_authenticated_unrelated_public_or_authenticated(
        "partial_update": document.link_role == "editor",
        "restore": False,
        "retrieve": True,
        "search": True,
        "tree": True,
        "update": document.link_role == "editor",
        "versions_destroy": False,
@@ -330,6 +333,7 @@ def test_api_documents_retrieve_authenticated_public_or_authenticated_parent(rea
        "partial_update": grand_parent.link_role == "editor",
        "restore": False,
        "retrieve": True,
        "search": True,
        "tree": True,
        "update": grand_parent.link_role == "editor",
        "versions_destroy": False,
@@ -529,6 +533,7 @@ def test_api_documents_retrieve_authenticated_related_parent():
        "partial_update": access.role not in ["reader", "commenter"],
        "restore": access.role == "owner",
        "retrieve": True,
        "search": True,
        "tree": True,
        "update": access.role not in ["reader", "commenter"],
        "versions_destroy": access.role in ["administrator", "owner"],


@@ -1,46 +1,31 @@
"""
Tests for Documents API endpoint in impress's core app: search
"""

from unittest import mock

import pytest
import responses
from faker import Faker
from rest_framework import response as drf_response
from rest_framework.test import APIClient

from core import factories
from core.services.search_indexers import get_document_indexer

fake = Faker()
pytestmark = pytest.mark.django_db


@mock.patch("core.services.search_indexers.FindDocumentIndexer.search_query")
@responses.activate
def test_api_documents_search_anonymous(search_query, indexer_settings):
    """
    Anonymous users should be allowed to search documents with Find.
    """
    indexer_settings.SEARCH_URL = "http://find/api/v1.0/search"

    # mock Find response
    responses.add(
        responses.POST,
        "http://find/api/v1.0/search",
@@ -48,7 +33,22 @@ def test_api_documents_search_anonymous(reach, role, indexer_settings):
        status=200,
    )

    q = "alpha"
    response = APIClient().get("/api/v1.0/documents/search/", data={"q": q})

    assert search_query.call_count == 1
    assert search_query.call_args[1] == {
        "data": {
            "q": q,
            "visited": [],
            "services": ["docs"],
            "nb_results": 50,
            "order_by": "updated_at",
            "order_direction": "desc",
            "path": None,
        },
        "token": None,
    }

    assert response.status_code == 200
    assert response.json() == {
@@ -59,64 +59,121 @@ def test_api_documents_search_anonymous(reach, role, indexer_settings):
    }


@mock.patch("core.api.viewsets.DocumentViewSet.list")
def test_api_documents_search_fall_back_on_search_list(mock_list, indexer_settings):
    """
    When the indexer is not configured and no path is provided, the view
    should fall back on the list method.
    """
    indexer_settings.SEARCH_URL = None
    assert get_document_indexer() is None

    user = factories.UserFactory()
    client = APIClient()
    client.force_login(user)

    mocked_response = {
        "count": 0,
        "next": None,
        "previous": None,
        "results": [{"title": "mocked list result"}],
    }
    mock_list.return_value = drf_response.Response(mocked_response)

    q = "alpha"
    response = client.get("/api/v1.0/documents/search/", data={"q": q})

    assert mock_list.call_count == 1
    assert mock_list.call_args[0][0].GET.get("q") == q
    assert response.json() == mocked_response


@mock.patch("core.api.viewsets.DocumentViewSet._list_descendants")
def test_api_documents_search_fallback_on_search_list_sub_docs(
    mock_list_descendants, indexer_settings
):
    """
    When the indexer is not configured and a path parameter is provided,
    the view should call the _list_descendants() method.
    """
    indexer_settings.SEARCH_URL = None
    assert get_document_indexer() is None

    user = factories.UserFactory()
    client = APIClient()
    client.force_login(user)

    parent = factories.DocumentFactory(title="parent", users=[user])

    mocked_response = {
        "count": 0,
        "next": None,
        "previous": None,
        "results": [{"title": "mocked _list_descendants result"}],
    }
    mock_list_descendants.return_value = drf_response.Response(mocked_response)

    q = "alpha"
    response = client.get(
        "/api/v1.0/documents/search/", data={"q": q, "path": parent.path}
    )

    assert mock_list_descendants.call_count == 1
    assert mock_list_descendants.call_args[0][0].GET.get("q") == q
    assert mock_list_descendants.call_args[0][0].GET.get("path") == parent.path
    assert response.json() == mocked_response


@mock.patch("core.api.viewsets.DocumentViewSet._title_search")
@responses.activate
def test_api_documents_search_indexer_crashes(mock_title_search, indexer_settings):
    """
    When the indexer is configured but crashes, the view should fall back
    on the _title_search() method.
    """
    # the indexer is properly configured
    indexer_settings.SEARCH_URL = "http://find/api/v1.0/search"
    assert get_document_indexer() is not None

    # but Find returns an error when the query is sent
    responses.add(
        responses.POST,
        "http://find/api/v1.0/search",
        json=[{"error": "Some indexer error"}],
        status=404,
    )

    user = factories.UserFactory()
    client = APIClient()
    client.force_login(user)

    mocked_response = {
        "count": 0,
        "next": None,
        "previous": None,
        "results": [{"title": "mocked title_search result"}],
    }
    mock_title_search.return_value = drf_response.Response(mocked_response)

    parent = factories.DocumentFactory(title="parent", users=[user])

    q = "alpha"
    response = client.get(
        "/api/v1.0/documents/search/", data={"q": q, "path": parent.path}
    )

    # the search endpoint did not crash
    assert response.status_code == 200
    # fallback on title_search
    assert mock_title_search.call_count == 1
    assert mock_title_search.call_args[0][0].GET.get("q") == q
    assert mock_title_search.call_args[0][0].GET.get("path") == parent.path
    assert response.json() == mocked_response


@responses.activate
def test_api_documents_search_invalid_params(indexer_settings):
    """Validate the query parameters expected by the search view."""
    indexer_settings.SEARCH_URL = "http://find/api/v1.0/search"
    assert get_document_indexer() is not None

    user = factories.UserFactory()
    client = APIClient()
    client.force_login(user)
@@ -125,49 +182,28 @@ def test_api_documents_search_invalid_params(indexer_settings):
    assert response.status_code == 400
    assert response.json() == {"q": ["This field is required."]}
    response = client.get("/api/v1.0/documents/search/", data={"q": " "})
    assert response.status_code == 400
    assert response.json() == {"q": ["This field may not be blank."]}

    response = client.get(
        "/api/v1.0/documents/search/", data={"q": "any", "page": "NaN"}
    )
    assert response.status_code == 400
    assert response.json() == {"page": ["A valid integer is required."]}
@responses.activate
def test_api_documents_search_success(indexer_settings):
    """Validate the format of documents as returned by the search view."""
    indexer_settings.SEARCH_URL = "http://find/api/v1.0/search"
    assert get_document_indexer() is not None

    document = {"id": "doc-123", "title": "alpha", "path": "path/to/alpha.pdf"}

    # Find response
    responses.add(
        responses.POST,
        "http://find/api/v1.0/search",
        json=[
            {
                "_id": str(document["id"]),
                "_source": {"title": document["title"], "path": document["path"]},
            },
        ],
        status=200,
    )

    response = APIClient().get("/api/v1.0/documents/search/", data={"q": "alpha"})

    assert response.status_code == 200
    content = response.json()
@@ -177,249 +213,6 @@ def test_api_documents_search_format(indexer_settings):
        "next": None,
        "previous": None,
    }
    assert results == [
        {"id": document["id"], "title": document["title"], "path": document["path"]}
    ]
@responses.activate
@pytest.mark.parametrize(
"pagination, status, expected",
(
(
{"page": 1, "page_size": 10},
200,
{
"count": 10,
"previous": None,
"next": None,
"range": (0, None),
},
),
(
{},
200,
{
"count": 10,
"previous": None,
"next": None,
"range": (0, None),
"api_page_size": 21, # default page_size is 20
},
),
(
{"page": 2, "page_size": 10},
404,
{},
),
(
{"page": 1, "page_size": 5},
200,
{
"count": 10,
"previous": None,
"next": {"page": 2, "page_size": 5},
"range": (0, 5),
},
),
(
{"page": 2, "page_size": 5},
200,
{
"count": 10,
"previous": {"page_size": 5},
"next": None,
"range": (5, None),
},
),
({"page": 3, "page_size": 5}, 404, {}),
),
)
def test_api_documents_search_pagination(
indexer_settings, pagination, status, expected
):
"""Documents should be ordered by descending "score" by default"""
indexer_settings.SEARCH_INDEXER_QUERY_URL = "http://find/api/v1.0/search"
assert get_document_indexer() is not None
user = factories.UserFactory()
client = APIClient()
client.force_login(user)
docs = factories.DocumentFactory.create_batch(10, title="alpha", users=[user])
docs_by_uuid = {str(doc.pk): doc for doc in docs}
api_results = [{"_id": id} for id in docs_by_uuid.keys()]
# reorder randomly to simulate score ordering
random.shuffle(api_results)
# Find response
# pylint: disable-next=assignment-from-none
api_search = responses.add(
responses.POST,
"http://find/api/v1.0/search",
json=api_results,
status=200,
)
response = client.get(
"/api/v1.0/documents/search/",
data={
"q": "alpha",
**pagination,
},
)
assert response.status_code == status
if response.status_code < 300:
previous_url = (
build_search_url(q="alpha", **expected["previous"])
if expected["previous"]
else None
)
next_url = (
build_search_url(q="alpha", **expected["next"])
if expected["next"]
else None
)
start, end = expected["range"]
content = response.json()
assert content["count"] == expected["count"]
assert content["previous"] == previous_url
assert content["next"] == next_url
results = content.pop("results")
# The find api results ordering by score is kept
assert [r["id"] for r in results] == [r["_id"] for r in api_results[start:end]]
# Check the query parameters.
assert api_search.call_count == 1
assert api_search.calls[0].response.status_code == 200
assert json_loads(api_search.calls[0].request.body) == {
"q": "alpha",
"visited": [],
"services": ["docs"],
"nb_results": 50,
"order_by": "updated_at",
"order_direction": "desc",
}
@responses.activate
@pytest.mark.parametrize(
"pagination, status, expected",
(
(
{"page": 1, "page_size": 10},
200,
{"count": 10, "previous": None, "next": None, "range": (0, None)},
),
(
{},
200,
{"count": 10, "previous": None, "next": None, "range": (0, None)},
),
(
{"page": 2, "page_size": 10},
404,
{},
),
(
{"page": 1, "page_size": 5},
200,
{
"count": 10,
"previous": None,
"next": {"page": 2, "page_size": 5},
"range": (0, 5),
},
),
(
{"page": 2, "page_size": 5},
200,
{
"count": 10,
"previous": {"page_size": 5},
"next": None,
"range": (5, None),
},
),
({"page": 3, "page_size": 5}, 404, {}),
),
)
def test_api_documents_search_pagination_endpoint_is_none(
indexer_settings, pagination, status, expected
):
"""Documents should be ordered by descending "-updated_at" by default"""
indexer_settings.SEARCH_INDEXER_QUERY_URL = None
assert get_document_indexer() is None
user = factories.UserFactory()
client = APIClient()
client.force_login(user)
factories.DocumentFactory.create_batch(10, title="alpha", users=[user])
response = client.get(
"/api/v1.0/documents/search/",
data={
"q": "alpha",
**pagination,
},
)
assert response.status_code == status
if response.status_code < 300:
previous_url = (
build_search_url(q="alpha", **expected["previous"])
if expected["previous"]
else None
)
next_url = (
build_search_url(q="alpha", **expected["next"])
if expected["next"]
else None
)
queryset = models.Document.objects.order_by("-updated_at")
start, end = expected["range"]
expected_results = [str(d.pk) for d in queryset[start:end]]
content = response.json()
assert content["count"] == expected["count"]
assert content["previous"] == previous_url
assert content["next"] == next_url
results = content.pop("results")
assert [r["id"] for r in results] == expected_results


@@ -0,0 +1,956 @@
"""
Tests for search API endpoint in impress's core app when indexer is not
available and a path param is given.
"""
import random
from django.contrib.auth.models import AnonymousUser
import pytest
from rest_framework.test import APIClient
from core import factories
from core.api.filters import remove_accents
pytestmark = pytest.mark.django_db
@pytest.fixture(autouse=True)
def disable_indexer(indexer_settings):
"""Disable search indexer for all tests in this file."""
indexer_settings.SEARCH_INDEXER_CLASS = None
def test_api_documents_search_descendants_list_anonymous_public_standalone():
"""Anonymous users should be allowed to retrieve the descendants of a public document."""
document = factories.DocumentFactory(link_reach="public", title="doc parent")
child1, child2 = factories.DocumentFactory.create_batch(
2, parent=document, title="doc child"
)
grand_child = factories.DocumentFactory(parent=child1, title="doc grand child")
factories.UserDocumentAccessFactory(document=child1)
response = APIClient().get(
"/api/v1.0/documents/search/", data={"q": "doc", "path": document.path}
)
assert response.status_code == 200
assert response.json() == {
"count": 4,
"next": None,
"previous": None,
"results": [
{
# the search should include the parent document itself
"abilities": document.get_abilities(AnonymousUser()),
"ancestors_link_role": None,
"ancestors_link_reach": None,
"computed_link_reach": document.computed_link_reach,
"computed_link_role": document.computed_link_role,
"created_at": document.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(document.creator.id),
"deleted_at": None,
"depth": 1,
"excerpt": document.excerpt,
"id": str(document.id),
"is_favorite": False,
"link_reach": document.link_reach,
"link_role": document.link_role,
"numchild": 2,
"nb_accesses_ancestors": 0,
"nb_accesses_direct": 0,
"path": document.path,
"title": document.title,
"updated_at": document.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
{
"abilities": child1.get_abilities(AnonymousUser()),
"ancestors_link_reach": document.link_reach,
"ancestors_link_role": document.link_role,
"computed_link_reach": child1.computed_link_reach,
"computed_link_role": child1.computed_link_role,
"created_at": child1.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child1.creator.id),
"deleted_at": None,
"depth": 2,
"excerpt": child1.excerpt,
"id": str(child1.id),
"is_favorite": False,
"link_reach": child1.link_reach,
"link_role": child1.link_role,
"numchild": 1,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 1,
"path": child1.path,
"title": child1.title,
"updated_at": child1.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
{
"abilities": grand_child.get_abilities(AnonymousUser()),
"ancestors_link_reach": document.link_reach,
"ancestors_link_role": document.link_role
if (child1.link_reach == "public" and child1.link_role == "editor")
else document.link_role,
"computed_link_reach": "public",
"computed_link_role": grand_child.computed_link_role,
"created_at": grand_child.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(grand_child.creator.id),
"deleted_at": None,
"depth": 3,
"excerpt": grand_child.excerpt,
"id": str(grand_child.id),
"is_favorite": False,
"link_reach": grand_child.link_reach,
"link_role": grand_child.link_role,
"numchild": 0,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 0,
"path": grand_child.path,
"title": grand_child.title,
"updated_at": grand_child.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
{
"abilities": child2.get_abilities(AnonymousUser()),
"ancestors_link_reach": document.link_reach,
"ancestors_link_role": document.link_role,
"computed_link_reach": "public",
"computed_link_role": child2.computed_link_role,
"created_at": child2.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child2.creator.id),
"deleted_at": None,
"depth": 2,
"excerpt": child2.excerpt,
"id": str(child2.id),
"is_favorite": False,
"link_reach": child2.link_reach,
"link_role": child2.link_role,
"numchild": 0,
"nb_accesses_ancestors": 0,
"nb_accesses_direct": 0,
"path": child2.path,
"title": child2.title,
"updated_at": child2.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
],
}
def test_api_documents_search_descendants_list_anonymous_public_parent():
"""
Anonymous users should be allowed to retrieve the descendants of a document that
has a public ancestor.
"""
grand_parent = factories.DocumentFactory(
link_reach="public", title="grand parent doc"
)
parent = factories.DocumentFactory(
parent=grand_parent,
link_reach=random.choice(["authenticated", "restricted"]),
title="parent doc",
)
document = factories.DocumentFactory(
link_reach=random.choice(["authenticated", "restricted"]),
parent=parent,
title="document",
)
child1, child2 = factories.DocumentFactory.create_batch(
2, parent=document, title="child doc"
)
grand_child = factories.DocumentFactory(parent=child1, title="grand child doc")
factories.UserDocumentAccessFactory(document=child1)
response = APIClient().get(
"/api/v1.0/documents/search/", data={"q": "doc", "path": document.path}
)
assert response.status_code == 200
assert response.json() == {
"count": 4,
"next": None,
"previous": None,
"results": [
{
# the search should include the parent document itself
"abilities": document.get_abilities(AnonymousUser()),
"ancestors_link_reach": "public",
"ancestors_link_role": grand_parent.link_role,
"computed_link_reach": document.computed_link_reach,
"computed_link_role": document.computed_link_role,
"created_at": document.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(document.creator.id),
"deleted_at": None,
"depth": 3,
"excerpt": document.excerpt,
"id": str(document.id),
"is_favorite": False,
"link_reach": document.link_reach,
"link_role": document.link_role,
"numchild": 2,
"nb_accesses_ancestors": 0,
"nb_accesses_direct": 0,
"path": document.path,
"title": document.title,
"updated_at": document.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
{
"abilities": child1.get_abilities(AnonymousUser()),
"ancestors_link_reach": "public",
"ancestors_link_role": grand_parent.link_role,
"computed_link_reach": child1.computed_link_reach,
"computed_link_role": child1.computed_link_role,
"created_at": child1.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child1.creator.id),
"deleted_at": None,
"depth": 4,
"excerpt": child1.excerpt,
"id": str(child1.id),
"is_favorite": False,
"link_reach": child1.link_reach,
"link_role": child1.link_role,
"numchild": 1,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 1,
"path": child1.path,
"title": child1.title,
"updated_at": child1.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
{
"abilities": grand_child.get_abilities(AnonymousUser()),
"ancestors_link_reach": "public",
"ancestors_link_role": grand_child.ancestors_link_role,
"computed_link_reach": "public",
"computed_link_role": grand_child.computed_link_role,
"created_at": grand_child.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(grand_child.creator.id),
"deleted_at": None,
"depth": 5,
"excerpt": grand_child.excerpt,
"id": str(grand_child.id),
"is_favorite": False,
"link_reach": grand_child.link_reach,
"link_role": grand_child.link_role,
"numchild": 0,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 0,
"path": grand_child.path,
"title": grand_child.title,
"updated_at": grand_child.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
{
"abilities": child2.get_abilities(AnonymousUser()),
"ancestors_link_reach": "public",
"ancestors_link_role": grand_parent.link_role,
"computed_link_reach": "public",
"computed_link_role": child2.computed_link_role,
"created_at": child2.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child2.creator.id),
"deleted_at": None,
"depth": 4,
"excerpt": child2.excerpt,
"id": str(child2.id),
"is_favorite": False,
"link_reach": child2.link_reach,
"link_role": child2.link_role,
"numchild": 0,
"nb_accesses_ancestors": 0,
"nb_accesses_direct": 0,
"path": child2.path,
"title": child2.title,
"updated_at": child2.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
],
}
@pytest.mark.parametrize("reach", ["restricted", "authenticated"])
def test_api_documents_search_descendants_list_anonymous_restricted_or_authenticated(
reach,
):
"""
Anonymous users should not be able to retrieve descendants of a document that is not public.
"""
document = factories.DocumentFactory(title="parent", link_reach=reach)
child = factories.DocumentFactory(title="child", parent=document)
_grand_child = factories.DocumentFactory(title="grand child", parent=child)
response = APIClient().get(
"/api/v1.0/documents/search/", data={"q": "child", "path": document.path}
)
assert response.status_code == 403
assert response.json() == {
"detail": "You do not have permission to search within this document."
}
@pytest.mark.parametrize("reach", ["public", "authenticated"])
def test_api_documents_search_descendants_list_authenticated_unrelated_public_or_authenticated(
reach,
):
"""
Authenticated users should be able to retrieve the descendants of a public/authenticated
document to which they are not related.
"""
user = factories.UserFactory()
client = APIClient()
client.force_login(user)
document = factories.DocumentFactory(link_reach=reach, title="parent")
child1, child2 = factories.DocumentFactory.create_batch(
2, parent=document, link_reach="restricted", title="child"
)
grand_child = factories.DocumentFactory(parent=child1, title="grand child")
factories.UserDocumentAccessFactory(document=child1)
response = client.get(
"/api/v1.0/documents/search/", data={"q": "child", "path": document.path}
)
assert response.status_code == 200
assert response.json() == {
"count": 3,
"next": None,
"previous": None,
"results": [
{
"abilities": child1.get_abilities(user),
"ancestors_link_reach": reach,
"ancestors_link_role": document.link_role,
"computed_link_reach": child1.computed_link_reach,
"computed_link_role": child1.computed_link_role,
"created_at": child1.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child1.creator.id),
"deleted_at": None,
"depth": 2,
"excerpt": child1.excerpt,
"id": str(child1.id),
"is_favorite": False,
"link_reach": child1.link_reach,
"link_role": child1.link_role,
"numchild": 1,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 1,
"path": child1.path,
"title": child1.title,
"updated_at": child1.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
{
"abilities": grand_child.get_abilities(user),
"ancestors_link_reach": reach,
"ancestors_link_role": document.link_role,
"computed_link_reach": grand_child.computed_link_reach,
"computed_link_role": grand_child.computed_link_role,
"created_at": grand_child.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(grand_child.creator.id),
"deleted_at": None,
"depth": 3,
"excerpt": grand_child.excerpt,
"id": str(grand_child.id),
"is_favorite": False,
"link_reach": grand_child.link_reach,
"link_role": grand_child.link_role,
"numchild": 0,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 0,
"path": grand_child.path,
"title": grand_child.title,
"updated_at": grand_child.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
{
"abilities": child2.get_abilities(user),
"ancestors_link_reach": reach,
"ancestors_link_role": document.link_role,
"computed_link_reach": child2.computed_link_reach,
"computed_link_role": child2.computed_link_role,
"created_at": child2.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child2.creator.id),
"deleted_at": None,
"depth": 2,
"excerpt": child2.excerpt,
"id": str(child2.id),
"is_favorite": False,
"link_reach": child2.link_reach,
"link_role": child2.link_role,
"numchild": 0,
"nb_accesses_ancestors": 0,
"nb_accesses_direct": 0,
"path": child2.path,
"title": child2.title,
"updated_at": child2.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
],
}


@pytest.mark.parametrize("reach", ["public", "authenticated"])
def test_api_documents_search_descendants_list_authenticated_public_or_authenticated_parent(
reach,
):
"""
Authenticated users should be allowed to retrieve the descendants of a document who
has a public or authenticated ancestor.
"""
user = factories.UserFactory()
client = APIClient()
client.force_login(user)
grand_parent = factories.DocumentFactory(link_reach=reach, title="grand parent")
parent = factories.DocumentFactory(
parent=grand_parent, link_reach="restricted", title="parent"
)
document = factories.DocumentFactory(
link_reach="restricted", parent=parent, title="document"
)
child1, child2 = factories.DocumentFactory.create_batch(
2, parent=document, link_reach="restricted", title="child"
)
grand_child = factories.DocumentFactory(parent=child1, title="grand child")
factories.UserDocumentAccessFactory(document=child1)
response = client.get(
"/api/v1.0/documents/search/", data={"q": "child", "path": document.path}
)
assert response.status_code == 200
assert response.json() == {
"count": 3,
"next": None,
"previous": None,
"results": [
{
"abilities": child1.get_abilities(user),
"ancestors_link_reach": reach,
"ancestors_link_role": grand_parent.link_role,
"computed_link_reach": child1.computed_link_reach,
"computed_link_role": child1.computed_link_role,
"created_at": child1.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child1.creator.id),
"deleted_at": None,
"depth": 4,
"excerpt": child1.excerpt,
"id": str(child1.id),
"is_favorite": False,
"link_reach": child1.link_reach,
"link_role": child1.link_role,
"numchild": 1,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 1,
"path": child1.path,
"title": child1.title,
"updated_at": child1.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
{
"abilities": grand_child.get_abilities(user),
"ancestors_link_reach": reach,
"ancestors_link_role": grand_parent.link_role,
"computed_link_reach": grand_child.computed_link_reach,
"computed_link_role": grand_child.computed_link_role,
"created_at": grand_child.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(grand_child.creator.id),
"deleted_at": None,
"depth": 5,
"excerpt": grand_child.excerpt,
"id": str(grand_child.id),
"is_favorite": False,
"link_reach": grand_child.link_reach,
"link_role": grand_child.link_role,
"numchild": 0,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 0,
"path": grand_child.path,
"title": grand_child.title,
"updated_at": grand_child.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
{
"abilities": child2.get_abilities(user),
"ancestors_link_reach": reach,
"ancestors_link_role": grand_parent.link_role,
"computed_link_reach": child2.computed_link_reach,
"computed_link_role": child2.computed_link_role,
"created_at": child2.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child2.creator.id),
"deleted_at": None,
"depth": 4,
"excerpt": child2.excerpt,
"id": str(child2.id),
"is_favorite": False,
"link_reach": child2.link_reach,
"link_role": child2.link_role,
"numchild": 0,
"nb_accesses_ancestors": 0,
"nb_accesses_direct": 0,
"path": child2.path,
"title": child2.title,
"updated_at": child2.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": None,
},
],
}


def test_api_documents_search_descendants_list_authenticated_unrelated_restricted():
"""
Authenticated users should not be allowed to retrieve the descendants of a document that is
restricted and to which they are not related.
"""
user = factories.UserFactory(with_owned_document=True)
client = APIClient()
client.force_login(user)
document = factories.DocumentFactory(link_reach="restricted", title="parent")
child1, _child2 = factories.DocumentFactory.create_batch(
2, parent=document, title="child"
)
_grand_child = factories.DocumentFactory(parent=child1, title="grand child")
factories.UserDocumentAccessFactory(document=child1)
response = client.get(
"/api/v1.0/documents/search/", data={"q": "child", "path": document.path}
)
assert response.status_code == 403
assert response.json() == {
"detail": "You do not have permission to search within this document."
}


def test_api_documents_search_descendants_list_authenticated_related_direct():
"""
Authenticated users should be allowed to retrieve the descendants of a document
to which they are directly related whatever the role.
"""
user = factories.UserFactory()
client = APIClient()
client.force_login(user)
document = factories.DocumentFactory(title="parent")
access = factories.UserDocumentAccessFactory(document=document, user=user)
factories.UserDocumentAccessFactory(document=document)
child1, child2 = factories.DocumentFactory.create_batch(
2, parent=document, title="child"
)
factories.UserDocumentAccessFactory(document=child1)
grand_child = factories.DocumentFactory(parent=child1, title="grand child")
response = client.get(
"/api/v1.0/documents/search/", data={"q": "child", "path": document.path}
)
assert response.status_code == 200
assert response.json() == {
"count": 3,
"next": None,
"previous": None,
"results": [
{
"abilities": child1.get_abilities(user),
"ancestors_link_reach": child1.ancestors_link_reach,
"ancestors_link_role": child1.ancestors_link_role,
"computed_link_reach": child1.computed_link_reach,
"computed_link_role": child1.computed_link_role,
"created_at": child1.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child1.creator.id),
"deleted_at": None,
"depth": 2,
"excerpt": child1.excerpt,
"id": str(child1.id),
"is_favorite": False,
"link_reach": child1.link_reach,
"link_role": child1.link_role,
"numchild": 1,
"nb_accesses_ancestors": 3,
"nb_accesses_direct": 1,
"path": child1.path,
"title": child1.title,
"updated_at": child1.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": access.role,
},
{
"abilities": grand_child.get_abilities(user),
"ancestors_link_reach": grand_child.ancestors_link_reach,
"ancestors_link_role": grand_child.ancestors_link_role,
"computed_link_reach": grand_child.computed_link_reach,
"computed_link_role": grand_child.computed_link_role,
"created_at": grand_child.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(grand_child.creator.id),
"deleted_at": None,
"depth": 3,
"excerpt": grand_child.excerpt,
"id": str(grand_child.id),
"is_favorite": False,
"link_reach": grand_child.link_reach,
"link_role": grand_child.link_role,
"numchild": 0,
"nb_accesses_ancestors": 3,
"nb_accesses_direct": 0,
"path": grand_child.path,
"title": grand_child.title,
"updated_at": grand_child.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": access.role,
},
{
"abilities": child2.get_abilities(user),
"ancestors_link_reach": child2.ancestors_link_reach,
"ancestors_link_role": child2.ancestors_link_role,
"computed_link_reach": child2.computed_link_reach,
"computed_link_role": child2.computed_link_role,
"created_at": child2.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child2.creator.id),
"deleted_at": None,
"depth": 2,
"excerpt": child2.excerpt,
"id": str(child2.id),
"is_favorite": False,
"link_reach": child2.link_reach,
"link_role": child2.link_role,
"numchild": 0,
"nb_accesses_ancestors": 2,
"nb_accesses_direct": 0,
"path": child2.path,
"title": child2.title,
"updated_at": child2.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": access.role,
},
],
}


def test_api_documents_search_descendants_list_authenticated_related_parent():
"""
Authenticated users should be allowed to retrieve the descendants of a document if they
are related to one of its ancestors whatever the role.
"""
user = factories.UserFactory()
client = APIClient()
client.force_login(user)
grand_parent = factories.DocumentFactory(link_reach="restricted", title="parent")
grand_parent_access = factories.UserDocumentAccessFactory(
document=grand_parent, user=user
)
parent = factories.DocumentFactory(
parent=grand_parent, link_reach="restricted", title="parent"
)
document = factories.DocumentFactory(
parent=parent, link_reach="restricted", title="document"
)
child1, child2 = factories.DocumentFactory.create_batch(
2, parent=document, title="child"
)
factories.UserDocumentAccessFactory(document=child1)
grand_child = factories.DocumentFactory(parent=child1, title="grand child")
response = client.get(
"/api/v1.0/documents/search/", data={"q": "child", "path": document.path}
)
assert response.status_code == 200
assert response.json() == {
"count": 3,
"next": None,
"previous": None,
"results": [
{
"abilities": child1.get_abilities(user),
"ancestors_link_reach": child1.ancestors_link_reach,
"ancestors_link_role": child1.ancestors_link_role,
"computed_link_reach": child1.computed_link_reach,
"computed_link_role": child1.computed_link_role,
"created_at": child1.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child1.creator.id),
"deleted_at": None,
"depth": 4,
"excerpt": child1.excerpt,
"id": str(child1.id),
"is_favorite": False,
"link_reach": child1.link_reach,
"link_role": child1.link_role,
"numchild": 1,
"nb_accesses_ancestors": 2,
"nb_accesses_direct": 1,
"path": child1.path,
"title": child1.title,
"updated_at": child1.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": grand_parent_access.role,
},
{
"abilities": grand_child.get_abilities(user),
"ancestors_link_reach": grand_child.ancestors_link_reach,
"ancestors_link_role": grand_child.ancestors_link_role,
"computed_link_reach": grand_child.computed_link_reach,
"computed_link_role": grand_child.computed_link_role,
"created_at": grand_child.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(grand_child.creator.id),
"deleted_at": None,
"depth": 5,
"excerpt": grand_child.excerpt,
"id": str(grand_child.id),
"is_favorite": False,
"link_reach": grand_child.link_reach,
"link_role": grand_child.link_role,
"numchild": 0,
"nb_accesses_ancestors": 2,
"nb_accesses_direct": 0,
"path": grand_child.path,
"title": grand_child.title,
"updated_at": grand_child.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": grand_parent_access.role,
},
{
"abilities": child2.get_abilities(user),
"ancestors_link_reach": child2.ancestors_link_reach,
"ancestors_link_role": child2.ancestors_link_role,
"computed_link_reach": child2.computed_link_reach,
"computed_link_role": child2.computed_link_role,
"created_at": child2.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child2.creator.id),
"deleted_at": None,
"depth": 4,
"excerpt": child2.excerpt,
"id": str(child2.id),
"is_favorite": False,
"link_reach": child2.link_reach,
"link_role": child2.link_role,
"numchild": 0,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 0,
"path": child2.path,
"title": child2.title,
"updated_at": child2.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": grand_parent_access.role,
},
],
}


def test_api_documents_search_descendants_list_authenticated_related_child():
"""
Authenticated users should not be allowed to retrieve all the descendants of a document
as a result of being related to one of its children.
"""
user = factories.UserFactory()
client = APIClient()
client.force_login(user)
document = factories.DocumentFactory(link_reach="restricted")
child1, _child2 = factories.DocumentFactory.create_batch(2, parent=document)
_grand_child = factories.DocumentFactory(parent=child1)
factories.UserDocumentAccessFactory(document=child1, user=user)
factories.UserDocumentAccessFactory(document=document)
response = client.get(
"/api/v1.0/documents/search/", data={"q": "doc", "path": document.path}
)
assert response.status_code == 403
assert response.json() == {
"detail": "You do not have permission to search within this document."
}


def test_api_documents_search_descendants_list_authenticated_related_team_none(
mock_user_teams,
):
"""
Authenticated users should not be able to retrieve the descendants of a restricted document
related to teams in which the user is not.
"""
mock_user_teams.return_value = []
user = factories.UserFactory(with_owned_document=True)
client = APIClient()
client.force_login(user)
document = factories.DocumentFactory(link_reach="restricted", title="document")
factories.DocumentFactory.create_batch(2, parent=document, title="child")
factories.TeamDocumentAccessFactory(document=document, team="myteam")
response = client.get(
"/api/v1.0/documents/search/", data={"q": "doc", "path": document.path}
)
assert response.status_code == 403
assert response.json() == {
"detail": "You do not have permission to search within this document."
}


def test_api_documents_search_descendants_list_authenticated_related_team_members(
mock_user_teams,
):
"""
Authenticated users should be allowed to retrieve the descendants of a document to which they
are related via a team whatever the role.
"""
mock_user_teams.return_value = ["myteam"]
user = factories.UserFactory()
client = APIClient()
client.force_login(user)
document = factories.DocumentFactory(link_reach="restricted", title="parent")
child1, child2 = factories.DocumentFactory.create_batch(
2, parent=document, title="child"
)
grand_child = factories.DocumentFactory(parent=child1, title="grand child")
access = factories.TeamDocumentAccessFactory(document=document, team="myteam")
response = client.get(
"/api/v1.0/documents/search/", data={"q": "child", "path": document.path}
)
# pylint: disable=R0801
assert response.status_code == 200
assert response.json() == {
"count": 3,
"next": None,
"previous": None,
"results": [
{
"abilities": child1.get_abilities(user),
"ancestors_link_reach": child1.ancestors_link_reach,
"ancestors_link_role": child1.ancestors_link_role,
"computed_link_reach": child1.computed_link_reach,
"computed_link_role": child1.computed_link_role,
"created_at": child1.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child1.creator.id),
"deleted_at": None,
"depth": 2,
"excerpt": child1.excerpt,
"id": str(child1.id),
"is_favorite": False,
"link_reach": child1.link_reach,
"link_role": child1.link_role,
"numchild": 1,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 0,
"path": child1.path,
"title": child1.title,
"updated_at": child1.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": access.role,
},
{
"abilities": grand_child.get_abilities(user),
"ancestors_link_reach": grand_child.ancestors_link_reach,
"ancestors_link_role": grand_child.ancestors_link_role,
"computed_link_reach": grand_child.computed_link_reach,
"computed_link_role": grand_child.computed_link_role,
"created_at": grand_child.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(grand_child.creator.id),
"deleted_at": None,
"depth": 3,
"excerpt": grand_child.excerpt,
"id": str(grand_child.id),
"is_favorite": False,
"link_reach": grand_child.link_reach,
"link_role": grand_child.link_role,
"numchild": 0,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 0,
"path": grand_child.path,
"title": grand_child.title,
"updated_at": grand_child.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": access.role,
},
{
"abilities": child2.get_abilities(user),
"ancestors_link_reach": child2.ancestors_link_reach,
"ancestors_link_role": child2.ancestors_link_role,
"computed_link_reach": child2.computed_link_reach,
"computed_link_role": child2.computed_link_role,
"created_at": child2.created_at.isoformat().replace("+00:00", "Z"),
"creator": str(child2.creator.id),
"deleted_at": None,
"depth": 2,
"excerpt": child2.excerpt,
"id": str(child2.id),
"is_favorite": False,
"link_reach": child2.link_reach,
"link_role": child2.link_role,
"numchild": 0,
"nb_accesses_ancestors": 1,
"nb_accesses_direct": 0,
"path": child2.path,
"title": child2.title,
"updated_at": child2.updated_at.isoformat().replace("+00:00", "Z"),
"user_role": access.role,
},
],
}


@pytest.mark.parametrize(
"query,nb_results",
[
("", 7), # Empty string
("Project Alpha", 1), # Exact match
("project", 2), # Partial match (case-insensitive)
("Guide", 2), # Word match within a title
("Special", 0), # No match (nonexistent keyword)
("2024", 2), # Match by numeric keyword
("velo", 1), # Accent-insensitive match (velo vs vélo)
("bêta", 1), # Accent-insensitive match (bêta vs beta)
],
)
def test_api_documents_search_descendants_search_on_title(query, nb_results):
"""Authenticated users should be able to search documents by their unaccented title."""
user = factories.UserFactory()
client = APIClient()
client.force_login(user)
parent = factories.DocumentFactory(users=[user])
# Create documents with predefined titles
titles = [
"Project Alpha Documentation",
"Project Beta Overview",
"User Guide",
"Financial Report 2024",
"Annual Review 2024",
"Guide du vélo urbain", # <-- Title with accent for accent-insensitive test
]
for title in titles:
factories.DocumentFactory(title=title, parent=parent)
# Perform the search query
response = client.get(
"/api/v1.0/documents/search/", data={"q": query, "path": parent.path}
)
assert response.status_code == 200
results = response.json()["results"]
assert len(results) == nb_results
# Ensure all results contain the query in their title
for result in results:
assert (
remove_accents(query).lower().strip()
in remove_accents(result["title"]).lower()
)
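The assertions above rely on a `remove_accents` helper. A minimal stdlib sketch of such unaccenting, consistent with the "velo vs vélo" and "bêta vs beta" cases in the parametrize table (the repo's actual helper may differ):

```python
import unicodedata


def remove_accents(value: str) -> str:
    # NFKD-decompose characters, then drop combining marks, so that
    # "vélo" becomes "velo" and "bêta" becomes "beta".
    return "".join(
        char
        for char in unicodedata.normalize("NFKD", value)
        if not unicodedata.combining(char)
    )
```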

@@ -101,6 +101,7 @@ def test_api_documents_trashbin_format():
         "partial_update": False,
         "restore": True,
         "retrieve": True,
+        "search": False,
         "tree": True,
         "update": False,
         "versions_destroy": False,
@@ -189,6 +189,7 @@ def test_models_documents_get_abilities_forbidden(
         "versions_destroy": False,
         "versions_list": False,
         "versions_retrieve": False,
+        "search": False,
     }
     nb_queries = 1 if is_authenticated else 0
     with django_assert_num_queries(nb_queries):
@@ -255,6 +256,7 @@ def test_models_documents_get_abilities_reader(
         "versions_destroy": False,
         "versions_list": False,
         "versions_retrieve": False,
+        "search": True,
     }
     nb_queries = 1 if is_authenticated else 0
     with django_assert_num_queries(nb_queries):
@@ -326,6 +328,7 @@ def test_models_documents_get_abilities_commenter(
         "versions_destroy": False,
         "versions_list": False,
         "versions_retrieve": False,
+        "search": True,
     }
     nb_queries = 1 if is_authenticated else 0
     with django_assert_num_queries(nb_queries):
@@ -394,6 +397,7 @@ def test_models_documents_get_abilities_editor(
         "versions_destroy": False,
         "versions_list": False,
         "versions_retrieve": False,
+        "search": True,
     }
     nb_queries = 1 if is_authenticated else 0
     with django_assert_num_queries(nb_queries):
@@ -451,6 +455,7 @@ def test_models_documents_get_abilities_owner(django_assert_num_queries):
         "versions_destroy": True,
         "versions_list": True,
         "versions_retrieve": True,
+        "search": True,
     }
     with django_assert_num_queries(1):
         assert document.get_abilities(user) == expected_abilities
@@ -494,6 +499,7 @@ def test_models_documents_get_abilities_owner(django_assert_num_queries):
         "versions_destroy": False,
         "versions_list": False,
         "versions_retrieve": False,
+        "search": False,
     }
@@ -541,6 +547,7 @@ def test_models_documents_get_abilities_administrator(django_assert_num_queries)
         "versions_destroy": True,
         "versions_list": True,
         "versions_retrieve": True,
+        "search": True,
     }
     with django_assert_num_queries(1):
         assert document.get_abilities(user) == expected_abilities
@@ -598,6 +605,7 @@ def test_models_documents_get_abilities_editor_user(django_assert_num_queries):
         "versions_destroy": False,
         "versions_list": True,
         "versions_retrieve": True,
+        "search": True,
     }
     with django_assert_num_queries(1):
         assert document.get_abilities(user) == expected_abilities
@@ -663,6 +671,7 @@ def test_models_documents_get_abilities_reader_user(
         "versions_destroy": False,
         "versions_list": True,
         "versions_retrieve": True,
+        "search": True,
     }
     with override_settings(AI_ALLOW_REACH_FROM=ai_access_setting):
@@ -729,6 +738,7 @@ def test_models_documents_get_abilities_commenter_user(
         "versions_destroy": False,
         "versions_list": True,
         "versions_retrieve": True,
+        "search": True,
     }
     with override_settings(AI_ALLOW_REACH_FROM=ai_access_setting):
@@ -791,6 +801,7 @@ def test_models_documents_get_abilities_preset_role(django_assert_num_queries):
         "versions_destroy": False,
         "versions_list": True,
         "versions_retrieve": True,
+        "search": True,
     }

@ -1,5 +1,5 @@
""" """
Unit tests for the Document model Unit tests for FindDocumentIndexer
""" """
# pylint: disable=too-many-lines # pylint: disable=too-many-lines
@ -12,7 +12,7 @@ from django.db import transaction
import pytest import pytest
from core import factories, models from core import factories, models
from core.services.search_indexers import SearchIndexer from core.services.search_indexers import FindDocumentIndexer
pytestmark = pytest.mark.django_db pytestmark = pytest.mark.django_db
@ -30,7 +30,7 @@ def reset_throttle():
reset_batch_indexer_throttle() reset_batch_indexer_throttle()
@mock.patch.object(SearchIndexer, "push") @mock.patch.object(FindDocumentIndexer, "push")
@pytest.mark.usefixtures("indexer_settings") @pytest.mark.usefixtures("indexer_settings")
@pytest.mark.django_db(transaction=True) @pytest.mark.django_db(transaction=True)
def test_models_documents_post_save_indexer(mock_push): def test_models_documents_post_save_indexer(mock_push):
@ -41,7 +41,7 @@ def test_models_documents_post_save_indexer(mock_push):
accesses = {} accesses = {}
data = [call.args[0] for call in mock_push.call_args_list] data = [call.args[0] for call in mock_push.call_args_list]
indexer = SearchIndexer() indexer = FindDocumentIndexer()
assert len(data) == 1 assert len(data) == 1
@ -64,14 +64,14 @@ def test_models_documents_post_save_indexer_no_batches(indexer_settings):
"""Test indexation task on doculment creation, no throttle""" """Test indexation task on doculment creation, no throttle"""
indexer_settings.SEARCH_INDEXER_COUNTDOWN = 0 indexer_settings.SEARCH_INDEXER_COUNTDOWN = 0
with mock.patch.object(SearchIndexer, "push") as mock_push: with mock.patch.object(FindDocumentIndexer, "push") as mock_push:
with transaction.atomic(): with transaction.atomic():
doc1, doc2, doc3 = factories.DocumentFactory.create_batch(3) doc1, doc2, doc3 = factories.DocumentFactory.create_batch(3)
accesses = {} accesses = {}
data = [call.args[0] for call in mock_push.call_args_list] data = [call.args[0] for call in mock_push.call_args_list]
indexer = SearchIndexer() indexer = FindDocumentIndexer()
# 3 calls # 3 calls
assert len(data) == 3 assert len(data) == 3
@ -91,7 +91,7 @@ def test_models_documents_post_save_indexer_no_batches(indexer_settings):
assert cache.get("file-batch-indexer-throttle") is None assert cache.get("file-batch-indexer-throttle") is None
@mock.patch.object(SearchIndexer, "push") @mock.patch.object(FindDocumentIndexer, "push")
@pytest.mark.django_db(transaction=True) @pytest.mark.django_db(transaction=True)
def test_models_documents_post_save_indexer_not_configured(mock_push, indexer_settings): def test_models_documents_post_save_indexer_not_configured(mock_push, indexer_settings):
"""Task should not start an indexation when disabled""" """Task should not start an indexation when disabled"""
@ -106,13 +106,13 @@ def test_models_documents_post_save_indexer_not_configured(mock_push, indexer_se
assert mock_push.assert_not_called assert mock_push.assert_not_called
@mock.patch.object(SearchIndexer, "push") @mock.patch.object(FindDocumentIndexer, "push")
@pytest.mark.django_db(transaction=True) @pytest.mark.django_db(transaction=True)
def test_models_documents_post_save_indexer_wrongly_configured( def test_models_documents_post_save_indexer_wrongly_configured(
mock_push, indexer_settings mock_push, indexer_settings
): ):
"""Task should not start an indexation when disabled""" """Task should not start an indexation when disabled"""
indexer_settings.SEARCH_INDEXER_URL = None indexer_settings.INDEXING_URL = None
user = factories.UserFactory() user = factories.UserFactory()
@ -123,7 +123,7 @@ def test_models_documents_post_save_indexer_wrongly_configured(
assert mock_push.assert_not_called assert mock_push.assert_not_called
@mock.patch.object(SearchIndexer, "push") @mock.patch.object(FindDocumentIndexer, "push")
@pytest.mark.usefixtures("indexer_settings") @pytest.mark.usefixtures("indexer_settings")
@pytest.mark.django_db(transaction=True) @pytest.mark.django_db(transaction=True)
def test_models_documents_post_save_indexer_with_accesses(mock_push): def test_models_documents_post_save_indexer_with_accesses(mock_push):
@ -145,7 +145,7 @@ def test_models_documents_post_save_indexer_with_accesses(mock_push):
data = [call.args[0] for call in mock_push.call_args_list] data = [call.args[0] for call in mock_push.call_args_list]
indexer = SearchIndexer() indexer = FindDocumentIndexer()
assert len(data) == 1 assert len(data) == 1
assert sorted(data[0], key=itemgetter("id")) == sorted( assert sorted(data[0], key=itemgetter("id")) == sorted(
@ -158,7 +158,7 @@ def test_models_documents_post_save_indexer_with_accesses(mock_push):
) )
@mock.patch.object(SearchIndexer, "push") @mock.patch.object(FindDocumentIndexer, "push")
@pytest.mark.usefixtures("indexer_settings") @pytest.mark.usefixtures("indexer_settings")
@pytest.mark.django_db(transaction=True) @pytest.mark.django_db(transaction=True)
def test_models_documents_post_save_indexer_deleted(mock_push): def test_models_documents_post_save_indexer_deleted(mock_push):
@ -207,7 +207,7 @@ def test_models_documents_post_save_indexer_deleted(mock_push):
data = [call.args[0] for call in mock_push.call_args_list] data = [call.args[0] for call in mock_push.call_args_list]
indexer = SearchIndexer() indexer = FindDocumentIndexer()
assert len(data) == 2 assert len(data) == 2
@ -244,14 +244,14 @@ def test_models_documents_indexer_hard_deleted():
factories.UserDocumentAccessFactory(document=doc, user=user) factories.UserDocumentAccessFactory(document=doc, user=user)
# Call task on deleted document. # Call task on deleted document.
with mock.patch.object(SearchIndexer, "push") as mock_push: with mock.patch.object(FindDocumentIndexer, "push") as mock_push:
doc.delete() doc.delete()
# Hard delete document are not re-indexed. # Hard delete document are not re-indexed.
assert mock_push.assert_not_called assert mock_push.assert_not_called
@mock.patch.object(SearchIndexer, "push") @mock.patch.object(FindDocumentIndexer, "push")
@pytest.mark.usefixtures("indexer_settings") @pytest.mark.usefixtures("indexer_settings")
@pytest.mark.django_db(transaction=True) @pytest.mark.django_db(transaction=True)
def test_models_documents_post_save_indexer_restored(mock_push): def test_models_documents_post_save_indexer_restored(mock_push):
@ -308,7 +308,7 @@ def test_models_documents_post_save_indexer_restored(mock_push):
data = [call.args[0] for call in mock_push.call_args_list] data = [call.args[0] for call in mock_push.call_args_list]
indexer = SearchIndexer() indexer = FindDocumentIndexer()
# All docs are re-indexed # All docs are re-indexed
assert len(data) == 2 assert len(data) == 2
@@ -337,16 +337,16 @@ def test_models_documents_post_save_indexer_restored(mock_push):
 @pytest.mark.usefixtures("indexer_settings")
 def test_models_documents_post_save_indexer_throttle():
     """Test indexation task skipping on document update"""
-    indexer = SearchIndexer()
+    indexer = FindDocumentIndexer()
     user = factories.UserFactory()

-    with mock.patch.object(SearchIndexer, "push"):
+    with mock.patch.object(FindDocumentIndexer, "push"):
         with transaction.atomic():
             docs = factories.DocumentFactory.create_batch(5, users=(user,))

     accesses = {str(item.path): {"users": [user.sub]} for item in docs}

-    with mock.patch.object(SearchIndexer, "push") as mock_push:
+    with mock.patch.object(FindDocumentIndexer, "push") as mock_push:
         # Simulate 1 running task
         cache.set("document-batch-indexer-throttle", 1)

@@ -359,7 +359,7 @@ def test_models_documents_post_save_indexer_throttle():
     assert [call.args[0] for call in mock_push.call_args_list] == []

-    with mock.patch.object(SearchIndexer, "push") as mock_push:
+    with mock.patch.object(FindDocumentIndexer, "push") as mock_push:
         # No waiting task
         cache.delete("document-batch-indexer-throttle")

@@ -389,7 +389,7 @@ def test_models_documents_access_post_save_indexer():
     """Test indexation task on DocumentAccess update"""
     users = factories.UserFactory.create_batch(3)

-    with mock.patch.object(SearchIndexer, "push"):
+    with mock.patch.object(FindDocumentIndexer, "push"):
         with transaction.atomic():
             doc = factories.DocumentFactory(users=users)

     doc_accesses = models.DocumentAccess.objects.filter(document=doc).order_by(

@@ -398,7 +398,7 @@ def test_models_documents_access_post_save_indexer():
     reset_batch_indexer_throttle()

-    with mock.patch.object(SearchIndexer, "push") as mock_push:
+    with mock.patch.object(FindDocumentIndexer, "push") as mock_push:
         with transaction.atomic():
             for doc_access in doc_accesses:
                 doc_access.save()

@@ -426,7 +426,7 @@ def test_models_items_access_post_save_indexer_no_throttle(indexer_settings):
     reset_batch_indexer_throttle()

-    with mock.patch.object(SearchIndexer, "push") as mock_push:
+    with mock.patch.object(FindDocumentIndexer, "push") as mock_push:
         with transaction.atomic():
             for doc_access in doc_accesses:
                 doc_access.save()

@@ -439,3 +439,70 @@ def test_models_items_access_post_save_indexer_no_throttle(indexer_settings):
     assert [len(d) for d in data] == [1] * 3
     # the same document is indexed 3 times
     assert [d[0]["id"] for d in data] == [str(doc.pk)] * 3
@mock.patch.object(FindDocumentIndexer, "search_query")
@pytest.mark.usefixtures("indexer_settings")
def test_find_document_indexer_search(mock_search_query):
"""Test search function of FindDocumentIndexer returns formatted results"""
# Mock API response from Find
hits = [
{
"_id": "doc-123",
"_source": {
"title": "Test Document",
"content": "This is test content",
"updated_at": "2024-01-01T00:00:00Z",
"path": "/some/path/doc-123",
},
},
{
"_id": "doc-456",
"_source": {
"title.fr": "Document de test",
"content": "Contenu de test",
"updated_at": "2024-01-02T00:00:00Z",
},
},
]
mock_search_query.return_value = hits
q = "test"
token = "fake-token"
nb_results = 10
path = "/some/path/"
visited = ["doc-123"]
results = FindDocumentIndexer().search(
q=q, token=token, nb_results=nb_results, path=path, visited=visited
)
mock_search_query.assert_called_once()
call_args = mock_search_query.call_args
assert call_args[1]["data"] == {
"q": q,
"visited": visited,
"services": ["docs"],
"nb_results": nb_results,
"order_by": "updated_at",
"order_direction": "desc",
"path": path,
}
assert len(results) == 2
assert results == [
{
"id": hits[0]["_id"],
"title": hits[0]["_source"]["title"],
"content": hits[0]["_source"]["content"],
"updated_at": hits[0]["_source"]["updated_at"],
"path": hits[0]["_source"]["path"],
},
{
"id": hits[1]["_id"],
"title": hits[1]["_source"]["title.fr"],
"title.fr": hits[1]["_source"]["title.fr"], # <- Find response artefact
"content": hits[1]["_source"]["content"],
"updated_at": hits[1]["_source"]["updated_at"],
},
]
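The expected results above encode the hit-to-result mapping: each Find hit is flattened into `{"id": hit["_id"], **hit["_source"]}`, and when the title comes back under a localized key such as `title.fr`, a plain `title` key is added for display. A minimal sketch of that mapping (an illustration under the behaviour pinned by this test, not the actual `FindDocumentIndexer.search` code; `format_hit` is a hypothetical name):

```python
import re


def format_hit(hit: dict) -> dict:
    """Flatten a Find hit into the flat result dict used for display."""
    source = hit["_source"]
    result = {"id": hit["_id"], **source}
    # Find may return a localized title key (e.g. "title.fr"); expose a
    # plain "title" key so display code stays locale-agnostic. The
    # localized key is kept as-is, matching the response artefact noted
    # in the assertions above.
    localized = [v for k, v in source.items() if re.match(r"^title\.", k)]
    if "title" not in result and localized:
        result["title"] = localized[0]
    return result
```

Note that the mapping deliberately carries every `_source` field through, so fields like `path` survive when present and are simply absent otherwise.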


@@ -15,7 +15,7 @@ from requests import HTTPError
 from core import factories, models, utils
 from core.services.search_indexers import (
     BaseDocumentIndexer,
-    SearchIndexer,
+    FindDocumentIndexer,
     get_document_indexer,
     get_visited_document_ids_of,
 )
@@ -78,41 +78,41 @@ def test_services_search_indexer_is_configured(indexer_settings):
     # Valid class
     indexer_settings.SEARCH_INDEXER_CLASS = (
-        "core.services.search_indexers.SearchIndexer"
+        "core.services.search_indexers.FindDocumentIndexer"
     )
     get_document_indexer.cache_clear()
     assert get_document_indexer() is not None

-    indexer_settings.SEARCH_INDEXER_URL = ""
+    indexer_settings.INDEXING_URL = ""
     # Invalid url
     get_document_indexer.cache_clear()
     assert not get_document_indexer()


-def test_services_search_indexer_url_is_none(indexer_settings):
+def test_services_indexing_url_is_none(indexer_settings):
     """
-    Indexer should raise RuntimeError if SEARCH_INDEXER_URL is None or empty.
+    Indexer should raise RuntimeError if INDEXING_URL is None or empty.
     """
-    indexer_settings.SEARCH_INDEXER_URL = None
+    indexer_settings.INDEXING_URL = None
     with pytest.raises(ImproperlyConfigured) as exc_info:
-        SearchIndexer()
+        FindDocumentIndexer()

-    assert "SEARCH_INDEXER_URL must be set in Django settings." in str(exc_info.value)
+    assert "INDEXING_URL must be set in Django settings." in str(exc_info.value)


-def test_services_search_indexer_url_is_empty(indexer_settings):
+def test_services_indexing_url_is_empty(indexer_settings):
     """
-    Indexer should raise RuntimeError if SEARCH_INDEXER_URL is empty string.
+    Indexer should raise RuntimeError if INDEXING_URL is empty string.
     """
-    indexer_settings.SEARCH_INDEXER_URL = ""
+    indexer_settings.INDEXING_URL = ""
     with pytest.raises(ImproperlyConfigured) as exc_info:
-        SearchIndexer()
+        FindDocumentIndexer()

-    assert "SEARCH_INDEXER_URL must be set in Django settings." in str(exc_info.value)
+    assert "INDEXING_URL must be set in Django settings." in str(exc_info.value)


 def test_services_search_indexer_secret_is_none(indexer_settings):
@@ -122,7 +122,7 @@ def test_services_search_indexer_secret_is_none(indexer_settings):
     indexer_settings.SEARCH_INDEXER_SECRET = None
     with pytest.raises(ImproperlyConfigured) as exc_info:
-        SearchIndexer()
+        FindDocumentIndexer()

     assert "SEARCH_INDEXER_SECRET must be set in Django settings." in str(
         exc_info.value
@@ -136,39 +136,35 @@ def test_services_search_indexer_secret_is_empty(indexer_settings):
     indexer_settings.SEARCH_INDEXER_SECRET = ""
     with pytest.raises(ImproperlyConfigured) as exc_info:
-        SearchIndexer()
+        FindDocumentIndexer()

     assert "SEARCH_INDEXER_SECRET must be set in Django settings." in str(
         exc_info.value
     )


-def test_services_search_endpoint_is_none(indexer_settings):
+def test_services_search_url_is_none(indexer_settings):
     """
-    Indexer should raise RuntimeError if SEARCH_INDEXER_QUERY_URL is None.
+    Indexer should raise RuntimeError if SEARCH_URL is None.
     """
-    indexer_settings.SEARCH_INDEXER_QUERY_URL = None
+    indexer_settings.SEARCH_URL = None
     with pytest.raises(ImproperlyConfigured) as exc_info:
-        SearchIndexer()
+        FindDocumentIndexer()

-    assert "SEARCH_INDEXER_QUERY_URL must be set in Django settings." in str(
-        exc_info.value
-    )
+    assert "SEARCH_URL must be set in Django settings." in str(exc_info.value)


-def test_services_search_endpoint_is_empty(indexer_settings):
+def test_services_search_url_is_empty(indexer_settings):
     """
-    Indexer should raise RuntimeError if SEARCH_INDEXER_QUERY_URL is empty.
+    Indexer should raise RuntimeError if SEARCH_URL is empty.
     """
-    indexer_settings.SEARCH_INDEXER_QUERY_URL = ""
+    indexer_settings.SEARCH_URL = ""
    with pytest.raises(ImproperlyConfigured) as exc_info:
-        SearchIndexer()
+        FindDocumentIndexer()

-    assert "SEARCH_INDEXER_QUERY_URL must be set in Django settings." in str(
-        exc_info.value
-    )
+    assert "SEARCH_URL must be set in Django settings." in str(exc_info.value)


 @pytest.mark.usefixtures("indexer_settings")
@@ -192,7 +188,7 @@ def test_services_search_indexers_serialize_document_returns_expected_json():
         }
     }

-    indexer = SearchIndexer()
+    indexer = FindDocumentIndexer()
     result = indexer.serialize_document(document, accesses)

     assert set(result.pop("users")) == {str(user_a.sub), str(user_b.sub)}

@@ -221,7 +217,7 @@ def test_services_search_indexers_serialize_document_deleted():
     parent.soft_delete()
     document.refresh_from_db()

-    indexer = SearchIndexer()
+    indexer = FindDocumentIndexer()
     result = indexer.serialize_document(document, {})

     assert result["is_active"] is False

@@ -232,7 +228,7 @@ def test_services_search_indexers_serialize_document_empty():
     """Empty documents returns empty content in the serialized json."""
     document = factories.DocumentFactory(content="", title=None)

-    indexer = SearchIndexer()
+    indexer = FindDocumentIndexer()
     result = indexer.serialize_document(document, {})

     assert result["content"] == ""
@@ -246,7 +242,7 @@ def test_services_search_indexers_index_errors(indexer_settings):
     """
     factories.DocumentFactory()

-    indexer_settings.SEARCH_INDEXER_URL = "http://app-find/api/v1.0/documents/index/"
+    indexer_settings.INDEXING_URL = "http://app-find/api/v1.0/documents/index/"

     responses.add(
         responses.POST,

@@ -256,10 +252,10 @@ def test_services_search_indexers_index_errors(indexer_settings):
     )

     with pytest.raises(HTTPError):
-        SearchIndexer().index()
+        FindDocumentIndexer().index()


-@patch.object(SearchIndexer, "push")
+@patch.object(FindDocumentIndexer, "push")
 def test_services_search_indexers_batches_pass_only_batch_accesses(
     mock_push, indexer_settings
 ):
@@ -276,7 +272,7 @@ def test_services_search_indexers_batches_pass_only_batch_accesses(
         access = factories.UserDocumentAccessFactory(document=document)
         expected_user_subs[str(document.id)] = str(access.user.sub)

-    assert SearchIndexer().index() == 5
+    assert FindDocumentIndexer().index() == 5

     # Should be 3 batches: 2 + 2 + 1
     assert mock_push.call_count == 3

@@ -299,7 +295,7 @@ def test_services_search_indexers_batches_pass_only_batch_accesses(
     assert seen_doc_ids == {str(d.id) for d in documents}


-@patch.object(SearchIndexer, "push")
+@patch.object(FindDocumentIndexer, "push")
 @pytest.mark.usefixtures("indexer_settings")
 def test_services_search_indexers_batch_size_argument(mock_push):
     """

@@ -314,7 +310,7 @@ def test_services_search_indexers_batch_size_argument(mock_push):
         access = factories.UserDocumentAccessFactory(document=document)
         expected_user_subs[str(document.id)] = str(access.user.sub)

-    assert SearchIndexer().index(batch_size=2) == 5
+    assert FindDocumentIndexer().index(batch_size=2) == 5

     # Should be 3 batches: 2 + 2 + 1
     assert mock_push.call_count == 3

@@ -337,7 +333,7 @@ def test_services_search_indexers_batch_size_argument(mock_push):
     assert seen_doc_ids == {str(d.id) for d in documents}


-@patch.object(SearchIndexer, "push")
+@patch.object(FindDocumentIndexer, "push")
 @pytest.mark.usefixtures("indexer_settings")
 def test_services_search_indexers_ignore_empty_documents(mock_push):
     """
@@ -349,7 +345,7 @@ def test_services_search_indexers_ignore_empty_documents(mock_push):
     empty_title = factories.DocumentFactory(title="")
     empty_content = factories.DocumentFactory(content="")

-    assert SearchIndexer().index() == 3
+    assert FindDocumentIndexer().index() == 3
     assert mock_push.call_count == 1

@@ -365,7 +361,7 @@ def test_services_search_indexers_ignore_empty_documents(mock_push):
     }


-@patch.object(SearchIndexer, "push")
+@patch.object(FindDocumentIndexer, "push")
 def test_services_search_indexers_skip_empty_batches(mock_push, indexer_settings):
     """
     Documents indexing batch can be empty if all the docs are empty.

@@ -377,14 +373,14 @@ def test_services_search_indexers_skip_empty_batches(mock_push, indexer_settings
     # Only empty docs
     factories.DocumentFactory.create_batch(5, content="", title="")

-    assert SearchIndexer().index() == 1
+    assert FindDocumentIndexer().index() == 1
     assert mock_push.call_count == 1

     results = [doc["id"] for doc in mock_push.call_args[0][0]]
     assert results == [str(document.id)]


-@patch.object(SearchIndexer, "push")
+@patch.object(FindDocumentIndexer, "push")
 @pytest.mark.usefixtures("indexer_settings")
 def test_services_search_indexers_ancestors_link_reach(mock_push):
     """Document accesses and reach should take into account ancestors link reaches."""
@@ -395,7 +391,7 @@ def test_services_search_indexers_ancestors_link_reach(mock_push):
     parent = factories.DocumentFactory(parent=grand_parent, link_reach="public")
     document = factories.DocumentFactory(parent=parent, link_reach="restricted")

-    assert SearchIndexer().index() == 4
+    assert FindDocumentIndexer().index() == 4

     results = {doc["id"]: doc for doc in mock_push.call_args[0][0]}
     assert len(results) == 4

@@ -405,7 +401,7 @@ def test_services_search_indexers_ancestors_link_reach(mock_push):
     assert results[str(document.id)]["reach"] == "public"


-@patch.object(SearchIndexer, "push")
+@patch.object(FindDocumentIndexer, "push")
 @pytest.mark.usefixtures("indexer_settings")
 def test_services_search_indexers_ancestors_users(mock_push):
     """Document accesses and reach should include users from ancestors."""

@@ -415,7 +411,7 @@ def test_services_search_indexers_ancestors_users(mock_push):
     parent = factories.DocumentFactory(parent=grand_parent, users=[user_p])
     document = factories.DocumentFactory(parent=parent, users=[user_d])

-    assert SearchIndexer().index() == 3
+    assert FindDocumentIndexer().index() == 3

     results = {doc["id"]: doc for doc in mock_push.call_args[0][0]}
     assert len(results) == 3

@@ -428,7 +424,7 @@ def test_services_search_indexers_ancestors_users(mock_push):
     }


-@patch.object(SearchIndexer, "push")
+@patch.object(FindDocumentIndexer, "push")
 @pytest.mark.usefixtures("indexer_settings")
 def test_services_search_indexers_ancestors_teams(mock_push):
     """Document accesses and reach should include teams from ancestors."""

@@ -436,7 +432,7 @@ def test_services_search_indexers_ancestors_teams(mock_push):
     parent = factories.DocumentFactory(parent=grand_parent, teams=["team_p"])
     document = factories.DocumentFactory(parent=parent, teams=["team_d"])

-    assert SearchIndexer().index() == 3
+    assert FindDocumentIndexer().index() == 3

     results = {doc["id"]: doc for doc in mock_push.call_args[0][0]}
     assert len(results) == 3
@@ -451,9 +447,9 @@ def test_push_uses_correct_url_and_data(mock_post, indexer_settings):
     push() should call requests.post with the correct URL from settings
     the timeout set to 10 seconds and the data as JSON.
     """
-    indexer_settings.SEARCH_INDEXER_URL = "http://example.com/index"
-    indexer = SearchIndexer()
+    indexer_settings.INDEXING_URL = "http://example.com/index"
+    indexer = FindDocumentIndexer()

     sample_data = [{"id": "123", "title": "Test"}]
     mock_response = mock_post.return_value

@@ -464,7 +460,7 @@ def test_push_uses_correct_url_and_data(mock_post, indexer_settings):
     mock_post.assert_called_once()
     args, kwargs = mock_post.call_args

-    assert args[0] == indexer_settings.SEARCH_INDEXER_URL
+    assert args[0] == indexer_settings.INDEXING_URL
     assert kwargs.get("json") == sample_data
     assert kwargs.get("timeout") == 10
@@ -542,9 +538,7 @@ def test_services_search_indexers_search_errors(indexer_settings):
     """
     factories.DocumentFactory()

-    indexer_settings.SEARCH_INDEXER_QUERY_URL = (
-        "http://app-find/api/v1.0/documents/search/"
-    )
+    indexer_settings.SEARCH_URL = "http://app-find/api/v1.0/documents/search/"

     responses.add(
         responses.POST,

@@ -554,17 +548,17 @@ def test_services_search_indexers_search_errors(indexer_settings):
     )

     with pytest.raises(HTTPError):
-        SearchIndexer().search("alpha", token="mytoken")
+        FindDocumentIndexer().search("alpha", token="mytoken")


 @patch("requests.post")
 def test_services_search_indexers_search(mock_post, indexer_settings):
     """
-    search() should call requests.post to SEARCH_INDEXER_QUERY_URL with the
+    search() should call requests.post to SEARCH_URL with the
     document ids from linktraces.
     """
     user = factories.UserFactory()

-    indexer = SearchIndexer()
+    indexer = FindDocumentIndexer()
     mock_response = mock_post.return_value
     mock_response.raise_for_status.return_value = None  # No error
@@ -582,7 +576,7 @@ def test_services_search_indexers_search(mock_post, indexer_settings):
     args, kwargs = mock_post.call_args

-    assert args[0] == indexer_settings.SEARCH_INDEXER_QUERY_URL
+    assert args[0] == indexer_settings.SEARCH_URL

     query_data = kwargs.get("json")
     assert query_data["q"] == "alpha"

@@ -605,7 +599,7 @@ def test_services_search_indexers_search_nb_results(mock_post, indexer_settings)
     indexer_settings.SEARCH_INDEXER_QUERY_LIMIT = 25
     user = factories.UserFactory()

-    indexer = SearchIndexer()
+    indexer = FindDocumentIndexer()
     mock_response = mock_post.return_value
     mock_response.raise_for_status.return_value = None  # No error

@@ -623,7 +617,7 @@ def test_services_search_indexers_search_nb_results(mock_post, indexer_settings)
     args, kwargs = mock_post.call_args

-    assert args[0] == indexer_settings.SEARCH_INDEXER_QUERY_URL
+    assert args[0] == indexer_settings.SEARCH_URL
     assert kwargs.get("json")["nb_results"] == 25

     # The argument overrides the setting value

@@ -631,5 +625,53 @@ def test_services_search_indexers_search_nb_results(mock_post, indexer_settings)
     args, kwargs = mock_post.call_args

-    assert args[0] == indexer_settings.SEARCH_INDEXER_QUERY_URL
+    assert args[0] == indexer_settings.SEARCH_URL
     assert kwargs.get("json")["nb_results"] == 109
def test_search_indexer_get_title_with_localized_field():
"""Test extracting title from localized title field."""
source = {"title.extension": "Bonjour", "id": 1, "content": "test"}
result = FindDocumentIndexer.get_title(source)
assert result == "Bonjour"
def test_search_indexer_get_title_with_multiple_localized_fields():
"""Test that first matching localized title is returned."""
source = {"title.extension": "Bonjour", "title.en": "Hello", "id": 1}
result = FindDocumentIndexer.get_title(source)
assert result in ["Bonjour", "Hello"]
def test_search_indexer_get_title_fallback_to_plain_title():
"""Test fallback to plain 'title' field when no localized field exists."""
source = {"title": "Hello World", "id": 1}
result = FindDocumentIndexer.get_title(source)
assert result == "Hello World"
def test_search_indexer_get_title_no_title_field():
"""Test that empty string is returned when no title field exists."""
source = {"id": 1, "content": "test"}
result = FindDocumentIndexer.get_title(source)
assert result == ""
def test_search_indexer_get_title_with_empty_localized_title():
"""Test that fallback works when localized title is empty."""
source = {"title.extension": "", "title": "Fallback Title", "id": 1}
result = FindDocumentIndexer.get_title(source)
assert result == "Fallback Title"
def test_search_indexer_get_title_with_multiple_extension():
"""Test extracting title from title field with multiple extensions."""
source = {"title.extension_1.extension_2": "Bonjour", "id": 1, "content": "test"}
result = FindDocumentIndexer.get_title(source)
assert result == "Bonjour"
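Taken together, the tests above pin down a small title-resolution rule: prefer any non-empty localized `title.<extension>` key, fall back to the plain `title` key, and return an empty string when neither exists. A minimal sketch of that behaviour (an illustration of the contract, not the actual `FindDocumentIndexer.get_title` implementation):

```python
import re


def get_title(source: dict) -> str:
    """Resolve the display title from a Find hit source dict."""
    # Prefer any non-empty localized key such as "title.fr" or
    # "title.ext_1.ext_2" (Find appends a language extension to the
    # indexed title field)...
    for key, value in source.items():
        if re.match(r"^title\.", key) and value:
            return value
    # ...then fall back to the plain "title" key, else empty string.
    return source.get("title", "")
```

When several localized keys are present, the sketch returns whichever matches first in dict order, which is all the "multiple localized fields" test requires.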


@@ -205,3 +205,38 @@ def test_utils_users_sharing_documents_with_empty_result():
     cached_data = cache.get(cache_key)
     assert cached_data == {}
def test_utils_get_value_by_pattern_matching_key():
"""Test extracting value from a dictionary with a matching key pattern."""
data = {"title.extension": "Bonjour", "id": 1, "content": "test"}
result = utils.get_value_by_pattern(data, r"^title\.")
assert set(result) == {"Bonjour"}
def test_utils_get_value_by_pattern_multiple_matches():
"""Test that all matching keys are returned."""
data = {"title.extension_1": "Bonjour", "title.extension_2": "Hello", "id": 1}
result = utils.get_value_by_pattern(data, r"^title\.")
assert set(result) == {
"Bonjour",
"Hello",
}
def test_utils_get_value_by_pattern_multiple_extensions():
"""Test that all matching keys are returned."""
data = {"title.extension_1.extension_2": "Bonjour", "id": 1}
result = utils.get_value_by_pattern(data, r"^title\.")
assert set(result) == {"Bonjour"}
def test_utils_get_value_by_pattern_no_match():
"""Test that empty list is returned when no key matches the pattern."""
data = {"name": "Test", "id": 1}
result = utils.get_value_by_pattern(data, r"^title\.")
assert result == []


@@ -18,6 +18,27 @@ from core import enums, models
logger = logging.getLogger(__name__)
def get_value_by_pattern(data, pattern):
"""
Get all values from keys matching a regex pattern in a dictionary.
Args:
data (dict): Source dictionary to search
pattern (str): Regex pattern to match against keys
Returns:
list: List of values for all matching keys, empty list if no matches
Example:
>>> get_value_by_pattern({"title.fr": "Bonjour", "id": 1}, r"^title\\.")
["Bonjour"]
>>> get_value_by_pattern({"title.fr": "Bonjour", "title.en": "Hello"}, r"^title\\.")
["Bonjour", "Hello"]
"""
regex = re.compile(pattern)
return [value for key, value in data.items() if regex.match(key)]
def get_ancestor_to_descendants_map(paths, steplen):
    """
    Given a list of document paths, return a mapping of ancestor_path -> set of descendant_paths.


@@ -113,8 +113,8 @@ class Base(Configuration):
     SEARCH_INDEXER_BATCH_SIZE = values.IntegerValue(
         default=100_000, environ_name="SEARCH_INDEXER_BATCH_SIZE", environ_prefix=None
     )
-    SEARCH_INDEXER_URL = values.Value(
-        default=None, environ_name="SEARCH_INDEXER_URL", environ_prefix=None
+    INDEXING_URL = values.Value(
+        default=None, environ_name="INDEXING_URL", environ_prefix=None
     )
     SEARCH_INDEXER_COUNTDOWN = values.IntegerValue(
         default=1, environ_name="SEARCH_INDEXER_COUNTDOWN", environ_prefix=None

@@ -122,8 +122,8 @@ class Base(Configuration):
     SEARCH_INDEXER_SECRET = values.Value(
         default=None, environ_name="SEARCH_INDEXER_SECRET", environ_prefix=None
     )
-    SEARCH_INDEXER_QUERY_URL = values.Value(
-        default=None, environ_name="SEARCH_INDEXER_QUERY_URL", environ_prefix=None
+    SEARCH_URL = values.Value(
+        default=None, environ_name="SEARCH_URL", environ_prefix=None
     )
     SEARCH_INDEXER_QUERY_LIMIT = values.PositiveIntegerValue(
         default=50, environ_name="SEARCH_INDEXER_QUERY_LIMIT", environ_prefix=None


@@ -3,6 +3,7 @@ import {
   StyleSchema,
 } from '@blocknote/core';
 import { useBlockNoteEditor } from '@blocknote/react';
+import { useTreeContext } from '@gouvfr-lasuite/ui-kit';
 import type { KeyboardEvent } from 'react';
 import { useEffect, useRef, useState } from 'react';
 import { useTranslation } from 'react-i18next';

@@ -26,12 +27,13 @@ import {
 import FoundPageIcon from '@/docs/doc-editor/assets/doc-found.svg';
 import AddPageIcon from '@/docs/doc-editor/assets/doc-plus.svg';
 import {
+  Doc,
   getEmojiAndTitle,
   useCreateChildDocTree,
   useDocStore,
   useTrans,
 } from '@/docs/doc-management';
-import { DocSearchSubPageContent, DocSearchTarget } from '@/docs/doc-search';
+import { DocSearchContent, DocSearchTarget } from '@/docs/doc-search';
 import { useResponsiveStore } from '@/stores';

 const inputStyle = css`

@@ -87,7 +89,7 @@ export const SearchPage = ({
   const { isDesktop } = useResponsiveStore();
   const { untitledDocument } = useTrans();
   const isEditable = editor.isEditable;
+  const treeContext = useTreeContext<Doc>();

   /**
    * createReactInlineContentSpec add automatically the focus after
    * the inline content, so we need to set the focus on the input

@@ -226,9 +228,11 @@ export const SearchPage = ({
           `}
           $margin={{ top: '0.5rem' }}
         >
-          <DocSearchSubPageContent
+          <DocSearchContent
+            groupName={t('Select a document')}
             search={search}
-            filters={{ target: DocSearchTarget.CURRENT }}
+            target={DocSearchTarget.CURRENT}
+            parentPath={treeContext?.root?.path}
             onSelect={(doc) => {
               if (!isEditable) {
                 return;

@@ -256,7 +260,7 @@ export const SearchPage = ({
               editor.focus();
             }}
-            renderElement={(doc) => {
+            renderSearchElement={(doc) => {
               const { emoji, titleWithoutEmoji } = getEmojiAndTitle(
                 doc.title || untitledDocument,
               );


@@ -9,5 +9,5 @@ export * from './useDocsFavorite';
 export * from './useDuplicateDoc';
 export * from './useMoveDoc';
 export * from './useRestoreDoc';
-export * from './useSubDocs';
 export * from './useUpdateDoc';
+export * from './useSearchDocs';


@@ -15,7 +15,6 @@ export type DocsParams = {
   page: number;
   ordering?: DocsOrdering;
   is_creator_me?: boolean;
-  title?: string;
   is_favorite?: boolean;
 };

@@ -31,9 +30,6 @@ export const constructParams = (params: DocsParams): URLSearchParams => {
   if (params.is_creator_me !== undefined) {
     searchParams.set('is_creator_me', params.is_creator_me.toString());
   }
-  if (params.title && params.title.length > 0) {
-    searchParams.set('title', params.title);
-  }
   if (params.is_favorite !== undefined) {
     searchParams.set('is_favorite', params.is_favorite.toString());
   }


@@ -0,0 +1,81 @@
import { useQuery } from '@tanstack/react-query';
import {
APIError,
APIList,
errorCauses,
fetchAPI,
useAPIInfiniteQuery,
} from '@/api';
import { Doc } from '@/docs/doc-management';
import { DocSearchTarget } from '@/docs/doc-search';
export type SearchDocsParams = {
page: number;
q: string;
target?: DocSearchTarget;
parentPath?: string;
};
const constructParams = ({
q,
page,
target,
parentPath,
}: SearchDocsParams): URLSearchParams => {
const searchParams = new URLSearchParams();
searchParams.set('q', q);
if (target === DocSearchTarget.CURRENT && parentPath) {
searchParams.set('path', parentPath);
}
if (page) {
searchParams.set('page', page.toString());
}
return searchParams;
};
const searchDocs = async ({
q,
page,
target,
parentPath,
}: SearchDocsParams): Promise<APIList<Doc>> => {
const searchParams = constructParams({ q, page, target, parentPath });
const response = await fetchAPI(
`documents/search/?${searchParams.toString()}`,
);
if (!response.ok) {
throw new APIError('Failed to get the docs', await errorCauses(response));
}
return response.json() as Promise<APIList<Doc>>;
};
export const KEY_LIST_SEARCH_DOC = 'search-docs';
export const useSearchDocs = (
{ q, page, target, parentPath }: SearchDocsParams,
queryConfig?: { enabled?: boolean },
) => {
return useQuery<APIList<Doc>, APIError, APIList<Doc>>({
queryKey: [KEY_LIST_SEARCH_DOC, 'search', { q, page, target, parentPath }],
queryFn: () => searchDocs({ q, page, target, parentPath }),
...queryConfig,
});
};
export const useInfiniteSearchDocs = (
params: SearchDocsParams,
queryConfig?: { enabled?: boolean },
) => {
return useAPIInfiniteQuery(
KEY_LIST_SEARCH_DOC,
searchDocs,
params,
queryConfig,
);
};
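
The param construction above can be exercised on its own. A minimal standalone sketch (the `DocSearchTarget` enum values are assumptions here; the hook only compares against `CURRENT`):

```typescript
// Standalone sketch of the hook's query-string construction.
// The enum values are assumptions; only the CURRENT comparison matters.
enum DocSearchTarget {
  ALL = 'all',
  CURRENT = 'current',
}

type SearchDocsParams = {
  page: number;
  q: string;
  target?: DocSearchTarget;
  parentPath?: string;
};

const constructParams = ({
  q,
  page,
  target,
  parentPath,
}: SearchDocsParams): URLSearchParams => {
  const searchParams = new URLSearchParams();
  searchParams.set('q', q);
  // `path` scopes the search to a parent document's descendants.
  if (target === DocSearchTarget.CURRENT && parentPath) {
    searchParams.set('path', parentPath);
  }
  if (page) {
    searchParams.set('page', page.toString());
  }
  return searchParams;
};

// Global search: no path scoping.
console.log(constructParams({ q: 'report', page: 1 }).toString());
// q=report&page=1

// Sub-document search: the parent path is forwarded to the search route.
console.log(
  constructParams({
    q: 'report',
    page: 1,
    target: DocSearchTarget.CURRENT,
    parentPath: '0001.0003',
  }).toString(),
);
// q=report&path=0001.0003&page=1
```

Note that a `parentPath` without `target === CURRENT` is deliberately ignored, so a stale path from a previous scoped search cannot leak into a global one.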

View file

@@ -1,62 +0,0 @@
-import { UseQueryOptions, useQuery } from '@tanstack/react-query';
-
-import {
-  APIError,
-  InfiniteQueryConfig,
-  errorCauses,
-  fetchAPI,
-  useAPIInfiniteQuery,
-} from '@/api';
-
-import { DocsOrdering } from '../types';
-
-import { DocsResponse, constructParams } from './useDocs';
-
-export type SubDocsParams = {
-  page: number;
-  ordering?: DocsOrdering;
-  is_creator_me?: boolean;
-  title?: string;
-  is_favorite?: boolean;
-  parent_id: string;
-};
-
-export const getSubDocs = async (
-  params: SubDocsParams,
-): Promise<DocsResponse> => {
-  const searchParams = constructParams(params);
-  searchParams.set('parent_id', params.parent_id);
-
-  const response: Response = await fetchAPI(
-    `documents/${params.parent_id}/descendants/?${searchParams.toString()}`,
-  );
-
-  if (!response.ok) {
-    throw new APIError(
-      'Failed to get the sub docs',
-      await errorCauses(response),
-    );
-  }
-
-  return response.json() as Promise<DocsResponse>;
-};
-
-export const KEY_LIST_SUB_DOC = 'sub-docs';
-
-export function useSubDocs(
-  params: SubDocsParams,
-  queryConfig?: UseQueryOptions<DocsResponse, APIError, DocsResponse>,
-) {
-  return useQuery<DocsResponse, APIError, DocsResponse>({
-    queryKey: [KEY_LIST_SUB_DOC, params],
-    queryFn: () => getSubDocs(params),
-    ...queryConfig,
-  });
-}
-
-export const useInfiniteSubDocs = (
-  params: SubDocsParams,
-  queryConfig?: InfiniteQueryConfig<DocsResponse>,
-) => {
-  return useAPIInfiniteQuery(KEY_LIST_SUB_DOC, getSubDocs, params, queryConfig);
-};

View file

@@ -4,7 +4,10 @@ import { InView } from 'react-intersection-observer';
 import { Box } from '@/components/';
 import { QuickSearchData, QuickSearchGroup } from '@/components/quick-search';
-import { Doc, useInfiniteDocs } from '@/docs/doc-management';
+import { useInfiniteSearchDocs } from '@/docs/doc-management/api/useSearchDocs';
+import { DocSearchTarget } from '@/docs/doc-search';
+
+import { Doc } from '../../doc-management';

 import { DocSearchItem } from './DocSearchItem';
@@ -15,6 +18,8 @@ type DocSearchContentProps = {
   isSearchNotMandatory?: boolean;
   onSelect: (doc: Doc) => void;
   onLoadingChange?: (loading: boolean) => void;
+  target?: DocSearchTarget;
+  parentPath?: string;
   renderSearchElement?: (doc: Doc) => React.ReactNode;
 };
@@ -25,6 +30,8 @@ export const DocSearchContent = ({
   onSelect,
   onLoadingChange,
   renderSearchElement,
+  target,
+  parentPath,
   isSearchNotMandatory,
 }: DocSearchContentProps) => {
@@ -34,10 +41,17 @@ export const DocSearchContent = ({
     isLoading,
     fetchNextPage,
     hasNextPage,
-  } = useInfiniteDocs({
+  } = useInfiniteSearchDocs(
+    {
+      q: search,
       page: 1,
-    ...(search ? { title: search } : {}),
-  });
+      target,
+      parentPath,
+    },
+    {
+      enabled: target !== DocSearchTarget.CURRENT || !!parentPath,
+    },
+  );

   const loading = isFetching || isRefetching || isLoading;

   const [docsData, setDocsData] = useState<QuickSearchData<Doc>>({
@@ -79,12 +93,12 @@ export const DocSearchContent = ({
   }, [
     search,
     data?.pages,
-    fetchNextPage,
-    hasNextPage,
     filterResults,
     groupName,
     isSearchNotMandatory,
     loading,
+    hasNextPage,
+    fetchNextPage,
   ]);

   useEffect(() => {
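
The `enabled` flag gates the query: a search scoped to the current document cannot run until the tree context has resolved the parent path, while a global search is always allowed. A reduced sketch of that predicate (string literals stand in for the `DocSearchTarget` enum):

```typescript
// Reduced sketch of the query gating used in DocSearchContent.
// String literals stand in for the DocSearchTarget enum members.
type Target = 'all' | 'current';

const isSearchEnabled = (target?: Target, parentPath?: string): boolean =>
  // A global search always runs; a "current doc" search waits for the path.
  target !== 'current' || !!parentPath;

console.log(isSearchEnabled('all')); // true
console.log(isSearchEnabled('current')); // false: path not resolved yet
console.log(isSearchEnabled('current', '0001.0002')); // true
```

This keeps the component from firing a `path`-less scoped request during the first render, before the tree context is available.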

View file

@@ -1,4 +1,5 @@
 import { Modal, ModalSize } from '@gouvfr-lasuite/cunningham-react';
+import { TreeContextType, useTreeContext } from '@gouvfr-lasuite/ui-kit';
 import Image from 'next/image';
 import { useRouter } from 'next/router';
 import { useState } from 'react';
@@ -8,45 +9,39 @@ import { useDebouncedCallback } from 'use-debounce';
 import { Box, ButtonCloseModal, Text } from '@/components';
 import { QuickSearch } from '@/components/quick-search';
 import { Doc, useDocUtils } from '@/docs/doc-management';
+import {
+  DocSearchFilters,
+  DocSearchFiltersValues,
+  DocSearchTarget,
+} from '@/docs/doc-search';
 import { useResponsiveStore } from '@/stores';

 import EmptySearchIcon from '../assets/illustration-docs-empty.png';
 import { DocSearchContent } from './DocSearchContent';
-import {
-  DocSearchFilters,
-  DocSearchFiltersValues,
-  DocSearchTarget,
-} from './DocSearchFilters';
-import { DocSearchItem } from './DocSearchItem';
-import { DocSearchSubPageContent } from './DocSearchSubPageContent';

 type DocSearchModalGlobalProps = {
   onClose: () => void;
   isOpen: boolean;
   showFilters?: boolean;
   defaultFilters?: DocSearchFiltersValues;
+  treeContext?: TreeContextType<Doc> | null;
 };

 const DocSearchModalGlobal = ({
   showFilters = false,
   defaultFilters,
+  treeContext,
   ...modalProps
 }: DocSearchModalGlobalProps) => {
   const { t } = useTranslation();
   const [loading, setLoading] = useState(false);
   const router = useRouter();
-  const isDocPage = router.pathname === '/docs/[id]';
   const [search, setSearch] = useState('');
   const [filters, setFilters] = useState<DocSearchFiltersValues>(
     defaultFilters ?? {},
   );
-  const target = filters.target ?? DocSearchTarget.ALL;
   const { isDesktop } = useResponsiveStore();
   const handleInputSearch = useDebouncedCallback(setSearch, 300);

   const handleSelect = (doc: Doc) => {
@@ -121,26 +116,23 @@ const DocSearchModalGlobal = ({
           </Box>
         )}
         {search && (
-          <>
-            {target === DocSearchTarget.ALL && (
           <DocSearchContent
             groupName={t('Select a document')}
             search={search}
             onSelect={handleSelect}
             onLoadingChange={setLoading}
+            target={
+              filters.target === DocSearchTarget.CURRENT
+                ? DocSearchTarget.CURRENT
+                : DocSearchTarget.ALL
+            }
+            parentPath={
+              filters.target === DocSearchTarget.CURRENT
+                ? treeContext?.root?.path
+                : undefined
+            }
           />
         )}
-            {isDocPage && target === DocSearchTarget.CURRENT && (
-              <DocSearchSubPageContent
-                search={search}
-                filters={filters}
-                onSelect={handleSelect}
-                onLoadingChange={setLoading}
-                renderElement={(doc) => <DocSearchItem doc={doc} />}
-              />
-            )}
-          </>
-        )}
       </Box>
     </QuickSearch>
   </Box>
@@ -158,6 +150,7 @@ const DocSearchModalDetail = ({
 }: DocSearchModalDetailProps) => {
   const { hasChildren, isChild } = useDocUtils(doc);
   const isWithChildren = isChild || hasChildren;
+  const treeContext = useTreeContext<Doc>();

   let defaultFilters = DocSearchTarget.ALL;
   let showFilters = false;
@@ -171,6 +164,7 @@ const DocSearchModalDetail = ({
       {...modalProps}
       showFilters={showFilters}
       defaultFilters={{ target: defaultFilters }}
+      treeContext={treeContext}
     />
   );
 };
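
The `target` and `parentPath` props passed to `DocSearchContent` both derive from the active filter. A reduced sketch of that mapping (`'all'`/`'current'` stand in for the `DocSearchTarget` enum, and `rootPath` is whatever `treeContext?.root?.path` resolves to):

```typescript
// Reduced sketch of the filter -> search-scope mapping in the modal.
// 'all' / 'current' stand in for DocSearchTarget; rootPath stands in
// for treeContext?.root?.path.
type Target = 'all' | 'current';

type SearchScope = { target: Target; parentPath?: string };

const toSearchScope = (filterTarget?: Target, rootPath?: string): SearchScope =>
  filterTarget === 'current'
    ? { target: 'current', parentPath: rootPath }
    : { target: 'all', parentPath: undefined };

// Scoped search forwards the tree root's path; anything else is global.
console.log(toSearchScope('current', '0001'));
// { target: 'current', parentPath: '0001' }
console.log(toSearchScope(undefined, '0001'));
// { target: 'all', parentPath: undefined }
```

Keeping this derivation in the modal lets the merged `DocSearchContent` stay ignorant of the tree context: it only sees a target and an optional path.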

View file

@@ -1,103 +0,0 @@
-import { useTreeContext } from '@gouvfr-lasuite/ui-kit';
-import { t } from 'i18next';
-import React, { useEffect, useState } from 'react';
-import { InView } from 'react-intersection-observer';
-
-import { QuickSearchData, QuickSearchGroup } from '@/components/quick-search';
-import { Doc, useInfiniteSubDocs } from '@/docs/doc-management';
-
-import { DocSearchFiltersValues } from './DocSearchFilters';
-
-type DocSearchSubPageContentProps = {
-  search: string;
-  filters: DocSearchFiltersValues;
-  onSelect: (doc: Doc) => void;
-  onLoadingChange?: (loading: boolean) => void;
-  renderElement: (doc: Doc) => React.ReactNode;
-};
-
-export const DocSearchSubPageContent = ({
-  search,
-  filters,
-  onSelect,
-  onLoadingChange,
-  renderElement,
-}: DocSearchSubPageContentProps) => {
-  const treeContext = useTreeContext<Doc>();
-
-  const {
-    data: subDocsData,
-    isFetching,
-    isRefetching,
-    isLoading,
-    fetchNextPage: subDocsFetchNextPage,
-    hasNextPage: subDocsHasNextPage,
-  } = useInfiniteSubDocs(
-    {
-      page: 1,
-      title: search,
-      ...filters,
-      parent_id: treeContext?.root?.id ?? '',
-    },
-    {
-      enabled: !!treeContext?.root?.id,
-    },
-  );
-
-  const [docsData, setDocsData] = useState<QuickSearchData<Doc>>({
-    groupName: '',
-    elements: [],
-    emptyString: '',
-  });
-
-  const loading = isFetching || isRefetching || isLoading;
-
-  useEffect(() => {
-    if (loading) {
-      return;
-    }
-
-    const subDocs = subDocsData?.pages.flatMap((page) => page.results) || [];
-
-    if (treeContext?.root) {
-      const isRootTitleIncludeSearch = treeContext.root?.title
-        ?.toLowerCase()
-        .includes(search.toLowerCase());
-
-      if (isRootTitleIncludeSearch) {
-        subDocs.unshift(treeContext.root);
-      }
-    }
-
-    setDocsData({
-      groupName: subDocs.length > 0 ? t('Select a doc') : '',
-      elements: search ? subDocs : [],
-      emptyString: search ? t('No document found') : t('Search by title'),
-      endActions: subDocsHasNextPage
-        ? [
-            {
-              content: <InView onChange={() => void subDocsFetchNextPage()} />,
-            },
-          ]
-        : [],
-    });
-  }, [
-    loading,
-    search,
-    subDocsData?.pages,
-    subDocsFetchNextPage,
-    subDocsHasNextPage,
-    treeContext?.root,
-  ]);
-
-  useEffect(() => {
-    onLoadingChange?.(loading);
-  }, [loading, onLoadingChange]);
-
-  return (
-    <QuickSearchGroup
-      onSelect={onSelect}
-      group={docsData}
-      renderElement={renderElement}
-    />
-  );
-};

View file

@ -1,4 +1,3 @@
export * from './DocSearchContent'; export * from './DocSearchContent';
export * from './DocSearchModal'; export * from './DocSearchModal';
export * from './DocSearchFilters'; export * from './DocSearchFilters';
export * from './DocSearchSubPageContent';