OpenMetadata/docker/docker-compose-quickstart/Dockerfile.fuseki-alpine
Sriharsha Chintalapani 0ab31cb647
fix(rdf): reclaim Fuseki disk via compaction + upgrade Jena 4.10 → 5.6.0 (#28242)

PR #28117's SPARQL cleanup converged the logical RDF state but never freed
disk: TDB2 deletes only mark blocks free and the journal grows monotonically
until /$/compact runs. RdfIndexApp.clearRdfData() now calls a new
RdfStorageInterface.compactStorage() between clearAll() and reloadOntologies()
so each recreate run compacts into a fresh dataset directory. JenaFusekiStorage
posts to /$/compact/{dataset}?deleteOld=true and polls /$/tasks/{id} until
finished, with failures logged and swallowed (disk hygiene, not correctness).
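
The compact-then-poll flow described above can be sketched roughly as follows. This is not the JenaFusekiStorage code: the class name FusekiCompactionSketch, its helper methods, the one-second poll interval, and the regex-based JSON handling are all assumptions made for illustration. It also assumes the task response carries a "taskId" field and gains a "finished" field once the task is done, and that the admin endpoints are reachable without authentication.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of OpenMetadata or Jena.
final class FusekiCompactionSketch {
  // Assumed response shapes: {"taskId": "3", ...} and {..., "finished": "..."}.
  private static final Pattern TASK_ID =
      Pattern.compile("\"taskId\"\\s*:\\s*\"?([^\",}\\s]+)\"?");
  private static final Pattern FINISHED = Pattern.compile("\"finished\"\\s*:");

  private static String stripTrailingSlash(String url) {
    return url.endsWith("/") ? url.substring(0, url.length() - 1) : url;
  }

  static String compactUrl(String baseUrl, String dataset) {
    return stripTrailingSlash(baseUrl) + "/$/compact/" + dataset + "?deleteOld=true";
  }

  static String taskUrl(String baseUrl, String taskId) {
    return stripTrailingSlash(baseUrl) + "/$/tasks/" + taskId;
  }

  // Returns the task id from the compact response, or null if none is present.
  static String parseTaskId(String json) {
    Matcher m = TASK_ID.matcher(json);
    return m.find() ? m.group(1) : null;
  }

  static boolean isFinished(String json) {
    return FINISHED.matcher(json).find();
  }

  // Starts compaction and polls until finished or timed out.
  // Failures are logged and swallowed: the caller only learns true/false.
  static boolean compact(String baseUrl, String dataset, Duration timeout) {
    try {
      HttpClient client = HttpClient.newHttpClient();
      HttpRequest start = HttpRequest.newBuilder(URI.create(compactUrl(baseUrl, dataset)))
          .POST(HttpRequest.BodyPublishers.noBody())
          .build();
      HttpResponse<String> started = client.send(start, HttpResponse.BodyHandlers.ofString());
      String taskId = parseTaskId(started.body());
      if (started.statusCode() / 100 != 2 || taskId == null) {
        System.err.println("Compaction not started, HTTP " + started.statusCode());
        return false;
      }
      HttpRequest poll = HttpRequest.newBuilder(URI.create(taskUrl(baseUrl, taskId)))
          .GET()
          .build();
      long deadline = System.nanoTime() + timeout.toNanos();
      while (System.nanoTime() < deadline) {
        String body = client.send(poll, HttpResponse.BodyHandlers.ofString()).body();
        if (isFinished(body)) {
          return true;
        }
        Thread.sleep(1000); // assumed poll interval
      }
      System.err.println("Compaction task " + taskId + " did not finish in time");
      return false;
    } catch (Exception e) {
      if (e instanceof InterruptedException) {
        Thread.currentThread().interrupt();
      }
      System.err.println("Compaction failed, continuing: " + e);
      return false;
    }
  }
}
```

A call such as FusekiCompactionSketch.compact("http://localhost:3030", "openmetadata", Duration.ofMinutes(10)) returns false on any failure instead of throwing, which mirrors the "disk hygiene, not correctness" stance: a failed compaction should not abort the reindex.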

Also unifies the Jena classpath: openmetadata-service was on 4.10.0 and
openmetadata-integration-tests on 5.0.0. Both now pin to 5.6.0 via a single
root pom property, dropping the apache-jena-libs BOM in favour of explicit
jena-core/arq/rdfconnection deps (we're a remote-Fuseki client and never
embed TDB; pulling jena-tdb1/2 triggers a Jena 5/6 static-init regression).
Picks up CVE-2025-49656 and CVE-2025-50151 (admin-side fixes shipped in
Jena 5.5.0). Jena 6.x parked: both 6.0.0 and 6.1.0 hit a recursive clinit
bug where TypeMapper.reset reads RDF.dtLangString before RDF.<clinit>
completes.

Fuseki server bumped 4.10/5.0 → 5.6.0 across all in-repo Dockerfiles; the
unmaintained stain/jena-fuseki:* image references in dev compose files
switched to building from docker/rdf-store/Dockerfile, and Testcontainers
moved to secoresearch/fuseki:5.5.0 (maintained, CVE-fixed; the dataset is
created by JenaFusekiStorage.ensureDatasetExists() so the stain-only
FUSEKI_DATASET_1 env var is no longer needed).
2026-05-18 23:08:46 -07:00


# Use eclipse-temurin (Ubuntu-based, despite the "-alpine" filename), which supports ARM64
FROM eclipse-temurin:17-jre
# Install minimal packages
RUN apt-get update && \
apt-get install -y --no-install-recommends wget && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
# Set Fuseki version and paths
ENV FUSEKI_VERSION=5.6.0
ENV FUSEKI_HOME=/fuseki
ENV FUSEKI_BASE=/fuseki
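# Create fuseki user and directories.
# The base image is Ubuntu-based, so use groupadd/useradd (shadow) rather than
# BusyBox-style addgroup/adduser flags. No fixed UID/GID is requested because
# newer Ubuntu base images may already ship a user with ID 1000.
RUN groupadd --system fuseki && \
useradd --system --gid fuseki --home-dir ${FUSEKI_HOME} --shell /usr/sbin/nologin fuseki && \
mkdir -p ${FUSEKI_HOME} && \
chown -R fuseki:fuseki ${FUSEKI_HOME}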
# Switch to fuseki user
USER fuseki
WORKDIR ${FUSEKI_HOME}
# Download and install Fuseki
RUN wget -q https://archive.apache.org/dist/jena/binaries/apache-jena-fuseki-${FUSEKI_VERSION}.tar.gz && \
tar -xzf apache-jena-fuseki-${FUSEKI_VERSION}.tar.gz --strip-components=1 && \
rm apache-jena-fuseki-${FUSEKI_VERSION}.tar.gz && \
mkdir -p ${FUSEKI_HOME}/run ${FUSEKI_HOME}/databases
# JVM options
ENV JVM_ARGS="-Xmx4g -Xms2g"
# Expose port
EXPOSE 3030
# Start Fuseki with openmetadata dataset
CMD ["sh", "-c", "mkdir -p ${FUSEKI_HOME}/databases/openmetadata && exec ${FUSEKI_HOME}/fuseki-server --update --loc=${FUSEKI_HOME}/databases/openmetadata /openmetadata"]
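
Once the container is running, the /openmetadata dataset can be checked over the standard SPARQL 1.1 protocol. The sketch below builds a triple-count query request with the JDK HTTP client. The class name FusekiQuerySketch and its method are hypothetical, and the /openmetadata/sparql endpoint path assumes Fuseki's default service names for a dataset started from the command line.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of OpenMetadata or Jena.
final class FusekiQuerySketch {
  // Builds a SPARQL protocol request that counts triples in the default graph.
  // Data held in named graphs is not included in this count.
  static HttpRequest countTriplesRequest(String baseUrl) {
    String query = "SELECT (COUNT(*) AS ?triples) WHERE { ?s ?p ?o }";
    return HttpRequest.newBuilder(URI.create(baseUrl + "/openmetadata/sparql"))
        .header("Content-Type", "application/sparql-query")
        .header("Accept", "application/sparql-results+json")
        .POST(HttpRequest.BodyPublishers.ofString(query))
        .build();
  }
}
```

Sending the request with HttpClient.newHttpClient().send(FusekiQuerySketch.countTriplesRequest("http://localhost:3030"), HttpResponse.BodyHandlers.ofString()) should return a JSON result set if the server is up and port 3030 is published.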