feat: two-phase indexation with direct thumbnail generation in indexer
Phase 1 (discovery): walkdir + filename-only metadata, zero archive I/O. Books are visible immediately in the UI while Phase 2 runs in background. Phase 2 (analysis): open each archive once via analyze_book() to extract page_count and first page bytes, then generate WebP thumbnail directly in the indexer — removing the HTTP roundtrip to the API checkup endpoint. - Add parse_metadata_fast() (infallible, no archive I/O) - Add analyze_book() returning (page_count, first_page_bytes) in one pass - Add looks_like_image() magic bytes check for unrar p stdout validation - Add lsar fallback in list_cbr_images() for UTF-16BE encoded filenames - Add directory_mtimes table to skip unchanged dirs on incremental scans - Add analyzer.rs: generate_thumbnail, analyze_library_books, regenerate_thumbnails - Remove run_checkup() from API; indexer handles thumbnail jobs directly - Remove api_base_url/api_bootstrap_token from IndexerConfig and AppState - Add unar + poppler-utils to indexer Dockerfile - Fix smoke.sh: wait for job completion, check thumbnail_url field Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -251,9 +251,10 @@ stripstream-librarian/
|
||||
|------|---------|
|
||||
| `apps/api/src/books.rs` | Book CRUD endpoints |
|
||||
| `apps/api/src/pages.rs` | Page rendering & caching (LRU + disk) |
|
||||
| `apps/api/src/thumbnails.rs` | Thumbnail generation (triggered by indexer) |
|
||||
| `apps/api/src/thumbnails.rs` | Endpoints pour créer des jobs thumbnail (rebuild/regenerate) |
|
||||
| `apps/api/src/state.rs` | AppState, Semaphore concurrent_renders |
|
||||
| `apps/indexer/src/scanner.rs` | Filesystem scan, rayon parallel parsing |
|
||||
| `apps/indexer/src/scanner.rs` | Phase 1 discovery : scan rapide sans I/O archive, skip dossiers inchangés |
|
||||
| `apps/indexer/src/analyzer.rs` | Phase 2 analysis : `analyze_book` + génération thumbnails WebP |
|
||||
| `apps/indexer/src/batch.rs` | Bulk DB ops via UNNEST |
|
||||
| `apps/indexer/src/worker.rs` | Job loop, watcher, scheduler orchestration |
|
||||
| `crates/parsers/src/lib.rs` | Format detection, metadata parsing |
|
||||
@@ -302,5 +303,5 @@ fn remap_libraries_path(path: &str) -> String {
|
||||
- **Dependencies**: External crates are defined in workspace `Cargo.toml`, not individual `Cargo.toml`.
|
||||
- **Database**: PostgreSQL is required. Run migrations before starting services.
|
||||
- **External Tools**: 4 system tools required — `unrar` (CBR page count), `unar` (CBR extraction), `pdfinfo` (PDF page count), `pdftoppm` (PDF page render). Note: `unrar` and `unar` are distinct tools.
|
||||
- **Thumbnails**: generated by the **API** service (not the indexer). The indexer triggers a checkup via `POST /index/jobs/:id/thumbnails/checkup` after indexing.
|
||||
- **Thumbnails**: generated by the **indexer** service (phase 2, `analyzer.rs`). The API only creates jobs in DB — it does not generate thumbnails directly.
|
||||
- **Sub-AGENTS.md**: module-specific guidelines in `apps/api/`, `apps/indexer/`, `apps/backoffice/`, `crates/parsers/`.
|
||||
|
||||
Reference in New Issue
Block a user