twenty/packages/twenty-server/scripts/setup-db.ts
Abdullah. 2a7a83de81
feat(search): Add unaccent support for accent-insensitive search. (#14464)
## Add accent-insensitive search functionality

### 🎯 Overview
Implements accent-insensitive search across all searchable fields in
Twenty CRM.
Users can now search for "jose" to find "José", "muller" to find
"Müller", "cafe" to find "café", etc.

### 🔍 Problem
Twenty's search functionality was accent-sensitive, requiring users to
type exact accented characters to find records.
This created a poor user experience, especially for international names
and content.

### 💡 Solution
Added PostgreSQL `unaccent` extension with a custom immutable wrapper
function to enable accent-insensitive full-text search across all
searchable field types.

### 📋 Changes Made
**Modified Files:**  
- `packages/twenty-server/scripts/setup-db.ts`  
- `packages/twenty-server/src/engine/api/graphql/graphql-query-runner/utils/compute-where-condition-parts.ts`  
- `packages/twenty-server/src/engine/workspace-manager/workspace-sync-metadata/utils/get-ts-vector-column-expression.util.ts`

### 🗄️ Database Setup (`setup-db.ts`)
```sql
-- Added unaccent extension
CREATE EXTENSION IF NOT EXISTS "unaccent";

-- Created immutable wrapper function
CREATE OR REPLACE FUNCTION unaccent_immutable(text) RETURNS text AS $$
  SELECT public.unaccent($1)
$$ LANGUAGE sql IMMUTABLE;
```

### 🔍 Search Vector Generation (`get-ts-vector-column-expression.util.ts`)
Applied `public.unaccent_immutable()` to all searchable field types:  
- TEXT fields (job titles, names, etc.)  
- FULL_NAME fields (first/last names)  
- EMAILS fields (both email address and domain)  
- ADDRESS fields  
- LINKS fields  
- RICH_TEXT and RICH_TEXT_V2 fields
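
As a rough illustration of what wrapping searchable columns looks like, the sketch below builds a `to_tsvector` expression around `public.unaccent_immutable()`. It is not the code from `get-ts-vector-column-expression.util.ts`: the helper names, the `'simple'` text search configuration, and the column names are assumptions made for the example.

```typescript
// Hypothetical helper, not the actual Twenty utility.
// Quotes a column name as a PostgreSQL identifier.
const quoteIdentifier = (name: string): string =>
  `"${name.replace(/"/g, '""')}"`;

// Builds a tsvector expression in which the concatenated column text is
// passed through public.unaccent_immutable() before tokenization.
// Assumption: the 'simple' configuration; the source does not state which
// configuration is used. The configuration name is interpolated as-is, so
// it must come from trusted code, never from user input.
export function buildUnaccentedTsVectorExpression(
  columnNames: string[],
  textSearchConfig = 'simple',
): string {
  if (columnNames.length === 0) {
    throw new Error('At least one searchable column is required');
  }

  const concatenatedColumns = columnNames
    .map((name) => `COALESCE(${quoteIdentifier(name)}, '')`)
    .join(" || ' ' || ");

  return `to_tsvector('${textSearchConfig}', public.unaccent_immutable(${concatenatedColumns}))`;
}

// Example with hypothetical column names.
console.log(buildUnaccentedTsVectorExpression(['jobTitle', 'city']));
```

Because accents are stripped before `to_tsvector` runs, the stored lexemes are already accent-free, so the query side only needs to strip accents from the search terms.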

### 🔎 Query Processing (`compute-where-condition-parts.ts`)
Enhanced search queries to use `public.unaccent_immutable()` for both:  
- Full-text search (`@@` operator with `to_tsquery`)  
- Pattern matching (`ILIKE` operator)
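
The two query paths above could be combined along the following lines. This is a sketch under stated assumptions, not the contents of `compute-where-condition-parts.ts`: the function name, the `searchVector` column name, the `:named` parameter style, and the prefix-matching (`:*`) term format are all hypothetical.

```typescript
// Hypothetical shape for one part of a WHERE clause.
interface SearchConditionPart {
  sql: string;
  params: Record<string, string>;
}

// Builds an accent-insensitive condition that tries full-text search first
// and falls back to ILIKE pattern matching. Both sides go through
// public.unaccent_immutable(), as described in the list above.
export function buildAccentInsensitiveSearchCondition(
  tableAlias: string,
  searchInput: string,
): SearchConditionPart {
  const terms = searchInput
    .trim()
    .split(/\s+/)
    // Drop characters that have special meaning in to_tsquery syntax.
    .map((term) => term.replace(/[&|!():*<>'\\]/g, ''))
    .filter((term) => term.length > 0);

  if (terms.length === 0) {
    throw new Error('Search input contains no usable terms');
  }

  // Assumption: every term is a prefix match and all terms must match.
  const tsQuery = terms.map((term) => `${term}:*`).join(' & ');
  // Escape LIKE wildcards so user input is matched literally.
  const likePattern = `%${searchInput.trim().replace(/[\\%_]/g, '\\$&')}%`;

  const vectorColumn = `"${tableAlias}"."searchVector"`;

  return {
    sql:
      `(${vectorColumn} @@ to_tsquery('simple', public.unaccent_immutable(:searchTerms))` +
      ` OR public.unaccent_immutable(${vectorColumn}::text) ILIKE public.unaccent_immutable(:searchTermsLike))`,
    params: { searchTerms: tsQuery, searchTermsLike: likePattern },
  };
}

// Example with a hypothetical table alias.
console.log(buildAccentInsensitiveSearchCondition('person', 'José García'));
```

Passing the search text as bound parameters rather than interpolating it keeps the accent handling in SQL and avoids injecting user input into the statement.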

### 🧠 Technical Rationale: Why the Wrapper Function?
**The Challenge:**  
PostgreSQL's `unaccent()` (provided by the `unaccent` extension, not built in) is marked as **STABLE**, but
`GENERATED ALWAYS AS` expressions (used for search vector columns)
require **IMMUTABLE** functions.

**The Solution:**  
Created an IMMUTABLE wrapper function that calls the underlying `unaccent()` function:  
- Satisfies PostgreSQL's immutability requirements for generated columns  
- Maintains the exact same functionality as the original `unaccent()`  
- Uses fully qualified `public.unaccent_immutable()` to ensure function resolution from workspace schemas

**Alternative Approaches Considered:**  
- Modifying `search_path`: would affect workspace isolation  
- Computing unaccent at query time: would hurt performance  
- Using triggers: would complicate data consistency
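
To make the immutability requirement concrete, here is a minimal sketch that builds the kind of `GENERATED ALWAYS AS ... STORED` column definition the rationale refers to. The function name, the `searchVector` column name, and the schema and table names are hypothetical; the actual migration code is not shown in this commit description.

```typescript
// Hypothetical helper: builds DDL for a stored generated tsvector column.
// PostgreSQL only accepts the statement if every function in the
// generation expression is IMMUTABLE.
export function buildSearchVectorColumnDdl(
  schemaName: string,
  tableName: string,
  generationExpression: string,
): string {
  return [
    `ALTER TABLE "${schemaName}"."${tableName}"`,
    `ADD COLUMN "searchVector" tsvector`,
    `GENERATED ALWAYS AS (${generationExpression}) STORED;`,
  ].join('\n');
}

// Accepted: the wrapper is declared IMMUTABLE.
const acceptedDdl = buildSearchVectorColumnDdl(
  'workspace_example',
  'person',
  `to_tsvector('simple', public.unaccent_immutable(COALESCE("jobTitle", '')))`,
);

// Rejected by PostgreSQL with "generation expression is not immutable",
// because public.unaccent() is only STABLE.
const rejectedDdl = buildSearchVectorColumnDdl(
  'workspace_example',
  'person',
  `to_tsvector('simple', public.unaccent(COALESCE("jobTitle", '')))`,
);

console.log(acceptedDdl);
console.log(rejectedDdl);
```

The schema-qualified call matters here as well: workspace tables live outside `public`, so an unqualified `unaccent_immutable(...)` would rely on `search_path` to resolve.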

### 🎯 Impact
For **Person** records, accent-insensitive search now works on:  
- Name (first/last name): `"jose garcia"` finds `"José García"`  
- Email: `"jose@cafe.com"` finds `"josé@café.com"`  
- Job Title: `"manager"` finds `"Managér"`; `"cafe"` finds `"Gerente de Café"`

Applies to all searchable standard objects:  
- Companies, People, Opportunities, Notes, Tasks, etc.  
- Any custom fields of searchable types (TEXT, EMAILS, etc.)
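
The matching behaviour described above can be approximated outside the database for quick checks or tests. The sketch below uses Unicode NFD normalization to strip combining marks, which is a different technique from PostgreSQL's `unaccent` (that extension uses a rules dictionary), so results may differ for characters that do not decompose, such as `ø` or `ß`. Function names are hypothetical.

```typescript
// Approximates accent folding by decomposing characters (NFD) and removing
// combining diacritical marks (U+0300 to U+036F). This is NOT the same
// algorithm as PostgreSQL's unaccent extension.
export function foldAccents(value: string): string {
  return value.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

// Case-insensitive, accent-insensitive substring match.
export function matchesAccentInsensitive(
  candidate: string,
  query: string,
): boolean {
  return foldAccents(candidate)
    .toLowerCase()
    .includes(foldAccents(query).toLowerCase());
}

console.log(matchesAccentInsensitive('José García', 'jose garcia'));
console.log(matchesAccentInsensitive('Müller', 'muller'));
console.log(matchesAccentInsensitive('Gerente de Café', 'cafe'));
```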

### Testing
- Database reset completes successfully  
- Workspace seeding works without errors  
- Search vectors generate with unaccent functionality  
- All searchable field types properly handle accented characters

---------

Co-authored-by: Félix Malfait <felix.malfait@gmail.com>
2025-09-17 09:01:35 +02:00


import console from 'console';

import { rawDataSource } from 'src/database/typeorm/raw/raw.datasource';

import { camelToSnakeCase, performQuery } from './utils';

rawDataSource
  .initialize()
  .then(async () => {
    await performQuery(
      'CREATE SCHEMA IF NOT EXISTS "public"',
      'create schema "public"',
    );
    await performQuery(
      'CREATE SCHEMA IF NOT EXISTS "core"',
      'create schema "core"',
    );

    await performQuery(
      'CREATE EXTENSION IF NOT EXISTS "uuid-ossp"',
      'create extension "uuid-ossp"',
    );

    await performQuery(
      'CREATE EXTENSION IF NOT EXISTS "unaccent"',
      'create extension "unaccent"',
    );

    // unaccent() is STABLE, but generated search vector columns require
    // IMMUTABLE functions, so they call this wrapper instead.
    await performQuery(
      `CREATE OR REPLACE FUNCTION unaccent_immutable(text) RETURNS text AS $$
        SELECT public.unaccent($1)
      $$ LANGUAGE sql IMMUTABLE;`,
      'create immutable unaccent wrapper function',
    );

    // We paused the work on FDW
    if (process.env.IS_FDW_ENABLED !== 'true') {
      return;
    }

    await performQuery(
      'CREATE EXTENSION IF NOT EXISTS "postgres_fdw"',
      'create extension "postgres_fdw"',
    );

    await performQuery(
      'CREATE EXTENSION IF NOT EXISTS "wrappers"',
      'create extension "wrappers"',
    );

    await performQuery(
      'CREATE EXTENSION IF NOT EXISTS "mysql_fdw"',
      'create extension "mysql_fdw"',
    );

    const supabaseWrappers = [
      'airtable',
      'bigQuery',
      'clickHouse',
      'firebase',
      'logflare',
      's3',
      'stripe',
    ]; // See https://supabase.github.io/wrappers/

    for (const wrapper of supabaseWrappers) {
      // Skip wrappers that already exist so the script can be re-run safely.
      if (await checkForeignDataWrapperExists(`${wrapper.toLowerCase()}_fdw`)) {
        continue;
      }

      await performQuery(
        `
          CREATE FOREIGN DATA WRAPPER "${wrapper.toLowerCase()}_fdw"
          HANDLER "${camelToSnakeCase(wrapper)}_fdw_handler"
          VALIDATOR "${camelToSnakeCase(wrapper)}_fdw_validator";
        `,
        `create ${wrapper} "wrappers"`,
        true,
        true,
      );
    }
  })
  .catch((err) => {
    console.error('Error during Data Source initialization:', err);
  });

// Returns true when a foreign data wrapper with the given name is registered.
async function checkForeignDataWrapperExists(
  wrapperName: string,
): Promise<boolean> {
  const result = await rawDataSource.query(
    `SELECT 1 FROM pg_foreign_data_wrapper WHERE fdwname = $1`,
    [wrapperName],
  );

  return result.length > 0;
}