perf(parsers): replace all subprocesses with in-process libraries
CBR: replaces the unrar/unar CLI with the `unrar` crate (vendored libunrar bindings, zero system dependencies). Eliminates XADRegexException, process forks, and temporary directories.

PDF: replaces pdfinfo + pdftoppm with pdfium-render. The PDF is opened only once to get the page count AND rasterize the first page. lopdf is kept for parse_metadata (page count only).

convert_cbr_to_cbz: rewritten without subprocesses or temporary directories; images are read into memory via unrar, then packed directly into a ZIP.

Dockerfile indexer: removes unrar-free, unar, and poppler-utils. Downloads libpdfium.so from bblanchon/pdfium-binaries at build time.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -21,11 +21,24 @@ RUN --mount=type=cache,target=/sccache \
cargo build --release -p indexer
FROM debian:bookworm-slim
RUN apt-get update && apt-get install -y --no-install-recommends \
ca-certificates wget \
unrar-free unar \
poppler-utils \
&& rm -rf /var/lib/apt/lists/*
# Download pdfium shared library (replaces pdftoppm + pdfinfo subprocesses)
RUN ARCH=$(dpkg --print-architecture) && \
case "$ARCH" in \
amd64) PDFIUM_ARCH="linux-x64" ;; \
arm64) PDFIUM_ARCH="linux-arm64" ;; \
*) echo "Unsupported arch: $ARCH" && exit 1 ;; \
esac && \
wget -q "https://github.com/bblanchon/pdfium-binaries/releases/latest/download/pdfium-${PDFIUM_ARCH}.tgz" -O /tmp/pdfium.tgz && \
tar -xzf /tmp/pdfium.tgz -C /tmp && \
cp /tmp/lib/libpdfium.so /usr/local/lib/ && \
rm -rf /tmp/pdfium.tgz /tmp/lib /tmp/include && \
ldconfig
COPY --from=builder /app/target/release/indexer /usr/local/bin/indexer
EXPOSE 7081
CMD ["/usr/local/bin/indexer"]
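
The architecture mapping in the pdfium download step can be checked outside the image build. A minimal sketch, assuming two hypothetical helpers, `pdfium_arch` and `pdfium_url` (neither is part of this commit), that mirror the `case` statement and the download URL from the Dockerfile above:

```shell
# Hypothetical helper (not in the commit): maps a Debian architecture name,
# as printed by `dpkg --print-architecture`, to the pdfium-binaries archive
# suffix used in the Dockerfile.
pdfium_arch() {
    case "$1" in
        amd64) echo "linux-x64" ;;
        arm64) echo "linux-arm64" ;;
        *) echo "Unsupported arch: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: builds the download URL the same way the Dockerfile does.
pdfium_url() {
    arch=$(pdfium_arch "$1") || return 1
    echo "https://github.com/bblanchon/pdfium-binaries/releases/latest/download/pdfium-${arch}.tgz"
}
```

Calling `pdfium_url "$(dpkg --print-architecture)"` prints the archive URL for the current machine; an unsupported architecture returns a non-zero status, matching the `exit 1` branch in the build step.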