Commit Graph

14 Commits

Author SHA1 Message Date
8d98056375 fix: fallback for fake cbr 2026-03-12 14:17:21 +01:00
6abaa96fba perf(parsers): remplacer tous les subprocesses par des libs in-process
CBR: remplace unrar/unar CLI par le crate `unrar` (bindings libunrar
vendorisé, zéro dépendance système). Supprime XADRegexException, les
forks de processus et les dossiers temporaires.

PDF: remplace pdfinfo + pdftoppm par pdfium-render. Le PDF est ouvert
une seule fois pour obtenir le nombre de pages ET rasteriser la première
page. lopdf reste pour parse_metadata (page count seul).

convert_cbr_to_cbz: reécrit sans subprocess ni dossier temporaire —
les images sont lues en mémoire via unrar puis packées directement en ZIP.

Dockerfile indexer: retire unrar-free, unar, poppler-utils. Télécharge
libpdfium.so depuis bblanchon/pdfium-binaries au build.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 16:46:43 +01:00
f2d9bedcc7 fix(parsers): corriger la génération de thumbnails CBR/CBZ/PDF
- CBR: contourner le bug XADRegexException de unar en appelant unar
  avec un symlink à nom neutre (archive.cbr) au lieu du chemin réel,
  qui peut contenir des caractères regex spéciaux comme [ ] ( )
- CBR/CBZ: remplacer le tri lexicographique par natord (tri naturel)
  pour que page2.jpg soit trié avant page10.jpg
- PDF: brancher pdftoppm -scale-to sur config.width.max(config.height)
  au lieu d'une valeur hardcodée (800px → 400px par défaut)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 16:17:20 +01:00
137e8ce11c fix: slow thumbnail and analyser test 2026-03-09 23:16:21 +01:00
e0b80cae38 feat: conversion CBR → CBZ via job asynchrone
Ajoute la possibilité de convertir un livre CBR en CBZ depuis le backoffice.
La conversion est sécurisée : le CBR original n'est supprimé qu'après vérification
du CBZ généré et mise à jour de la base de données.

- parsers: nouvelle fn `convert_cbr_to_cbz` (unar extract → zip pack → vérification → rename atomique)
- api: `POST /books/:id/convert` crée un job `cbr_to_cbz` (vérifie format CBR, détecte collision)
- indexer: nouveau `converter.rs` dispatché depuis `job.rs`
- backoffice: bouton "Convert to CBZ" sur la page détail (visible si CBR), label dans JobRow
- migrations: colonne `book_id` sur `index_jobs` + type `cbr_to_cbz` dans le check constraint

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 23:02:08 +01:00
cfc896e92f feat: two-phase indexation with direct thumbnail generation in indexer
Phase 1 (discovery): walkdir + filename-only metadata, zero archive I/O.
Books are visible immediately in the UI while Phase 2 runs in background.

Phase 2 (analysis): open each archive once via analyze_book() to extract
page_count and first page bytes, then generate WebP thumbnail directly in
the indexer — removing the HTTP roundtrip to the API checkup endpoint.

- Add parse_metadata_fast() (infallible, no archive I/O)
- Add analyze_book() returning (page_count, first_page_bytes) in one pass
- Add looks_like_image() magic bytes check for unrar p stdout validation
- Add lsar fallback in list_cbr_images() for UTF-16BE encoded filenames
- Add directory_mtimes table to skip unchanged dirs on incremental scans
- Add analyzer.rs: generate_thumbnail, analyze_library_books, regenerate_thumbnails
- Remove run_checkup() from API; indexer handles thumbnail jobs directly
- Remove api_base_url/api_bootstrap_token from IndexerConfig and AppState
- Add unar + poppler-utils to indexer Dockerfile
- Fix smoke.sh: wait for job completion, check thumbnail_url field

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 22:13:05 +01:00
0f5094575a docs: add AGENTS.md per module and unify ports to 70XX
- Add CLAUDE.md at root and AGENTS.md in apps/api, apps/indexer,
  apps/backoffice, crates/parsers with module-specific guidelines
- Unify all service ports to 70XX (no more internal/external split):
  API 7080, Indexer 7081, Backoffice 7082
- Update docker-compose.yml, Dockerfiles, config.rs defaults,
  .env.example, backoffice routes, bench.sh, smoke.sh

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 13:57:39 +01:00
c93a7d5d29 feat: thumbnails : part1 2026-03-08 17:54:47 +01:00
f0a967515b fix: improve series detection and add detailed indexing logs
- Fix series detection to handle path variations (symlinks, separators)
- Add comprehensive logging for job processing and file scanning
- Better error handling for path prefix stripping
- Track files scanned, indexed, and errors per library
2026-03-06 22:35:11 +01:00
5f51955f4d feat(indexing): Lot 4 - Progression temps reel, Full Rebuild, Optimisations
- Ajout migrations DB: index_job_errors, library_monitoring, full_rebuild_type
- API: endpoints progression temps reel (/jobs/:id/stream), active jobs, details
- API: support full_rebuild avec suppression donnees existantes
- Indexer: logs detailles avec timing [SCAN][META][PARSER][BDD]
- Indexer: optimisation parsing PDF (lopdf -> pdfinfo) 235x plus rapide
- Indexer: corrections chemins LIBRARIES_ROOT_PATH pour dev local
- Backoffice: composants JobProgress, JobsIndicator (header), JobsList
- Backoffice: SSE streaming pour progression temps reel
- Backoffice: boutons Index/Index Full sur page libraries
- Backoffice: highlight job apres creation avec redirection
- Fix: parsing volume type i32, sync meilisearch cleanup

Perf: parsing PDF passe de 8.7s a 37ms
Perf: indexation 45 fichiers en ~15s vs plusieurs minutes avant
2026-03-06 11:33:32 +01:00
82294a1bee feat: change volume from string to integer type
Parser:
- Change volume type from Option<String> to Option<i32>
- Parse volume as integer to remove leading zeros
- Keep original title with volume info

Indexer:
- Update SQL queries to insert volume as integer
- Add volume column to INSERT and UPDATE statements

API:
- Change BookItem.volume and BookDetails.volume to Option<i32>
- Add natural sorting for books

Backoffice:
- Update volume type to number
- Update book detail page
- Add CSS styles
2026-03-05 23:32:01 +01:00
d33a4b02d8 feat: add series support for book organization
API:
- Add /libraries/{id}/series endpoint to list series with book counts
- Add series filter to /books endpoint
- Fix SeriesItem to return first_book_id properly (using CTE with ROW_NUMBER)

Indexer:
- Parse series from parent folder name relative to library root
- Store series in database when indexing books

Backoffice:
- Add Books page with grid view, search, and pagination
- Add Series page showing series with cover images
- Add Library books page filtered by series
- Add book detail page
- Add Series column in libraries list with clickable link
- Create BookCard component for reusable book display
- Add CSS styles for books grid, series grid, and book details
- Add proxy API route for book cover images (fixing CORS issues)

Parser:
- Add series field to ParsedMetadata
- Extract series from file path relative to library root

Books without a parent folder are categorized as 'unclassified' series.
2026-03-05 22:58:28 +01:00
6eaf2ba5dc add indexing jobs, parsers, and search APIs 2026-03-05 15:05:34 +01:00
88db9805b5 bootstrap rust services, auth, and compose stack 2026-03-05 14:51:02 +01:00