feat: two-phase indexation with direct thumbnail generation in indexer

Phase 1 (discovery): walkdir + filename-only metadata, zero archive I/O.
Books are visible in the UI immediately while Phase 2 runs in the
background.
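
A minimal sketch of what a filename-only, infallible parser for Phase 1
could look like. Only the name parse_metadata_fast() comes from this
commit; the FastMetadata struct, its fields, and the trailing-number
volume heuristic are assumptions made for illustration, not the indexer's
actual code.

```rust
use std::path::Path;

/// Hypothetical metadata shape; the real struct in the indexer may differ.
#[derive(Debug, PartialEq)]
pub struct FastMetadata {
    pub title: String,
    pub format: Option<String>,
    pub volume: Option<u32>,
}

/// Derives metadata from the path alone: no file is opened, and every
/// failure mode degrades to an empty or `None` value instead of an error.
pub fn parse_metadata_fast(path: &Path) -> FastMetadata {
    let stem = path
        .file_stem()
        .map(|s| s.to_string_lossy().into_owned())
        .unwrap_or_default();

    // Lowercased extension, e.g. "cbz", "cbr", "pdf".
    let format = path
        .extension()
        .map(|e| e.to_string_lossy().to_ascii_lowercase());

    // Assumed naming convention: the last whitespace-separated token holds
    // the volume number, optionally prefixed ("T03", "Vol.2", "#7").
    let volume = stem.split_whitespace().last().and_then(|token| {
        token
            .trim_start_matches(|c: char| c.is_ascii_alphabetic() || c == '#' || c == '.')
            .parse::<u32>()
            .ok()
    });

    FastMetadata { title: stem, format, volume }
}
```

Because the function returns a plain value rather than a Result, the
discovery pass never has to handle a per-file error, which fits the
"visible immediately" goal described above.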

Phase 2 (analysis): open each archive once via analyze_book() to extract
page_count and the first page's bytes, then generate a WebP thumbnail
directly in the indexer, removing the HTTP round trip to the API checkup
endpoint.

- Add parse_metadata_fast() (infallible, no archive I/O)
- Add analyze_book() returning (page_count, first_page_bytes) in one pass
- Add looks_like_image() magic bytes check for unrar p stdout validation
- Add lsar fallback in list_cbr_images() for UTF-16BE encoded filenames
- Add directory_mtimes table to skip unchanged dirs on incremental scans
- Add analyzer.rs: generate_thumbnail, analyze_library_books, regenerate_thumbnails
- Remove run_checkup() from API; indexer handles thumbnail jobs directly
- Remove api_base_url/api_bootstrap_token from IndexerConfig and AppState
- Add unar + poppler-utils to indexer Dockerfile
- Fix smoke.sh: wait for job completion, check thumbnail_url field
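
For reference, a magic-bytes check like the looks_like_image() item above
can be sketched as follows. The function name comes from this commit; the
set of formats checked (JPEG, PNG, GIF, WebP) is an assumption, and the
real implementation may accept more or fewer signatures.

```rust
/// Returns true when `bytes` starts with a known image signature.
/// Assumed format list: JPEG, PNG, GIF, WebP.
pub fn looks_like_image(bytes: &[u8]) -> bool {
    // JPEG: SOI marker followed by the first byte of the next marker.
    const JPEG: &[u8] = &[0xFF, 0xD8, 0xFF];
    // PNG: fixed 8-byte file signature.
    const PNG: &[u8] = &[0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];

    bytes.starts_with(JPEG)
        || bytes.starts_with(PNG)
        || bytes.starts_with(b"GIF87a")
        || bytes.starts_with(b"GIF89a")
        // WebP: RIFF container, 4-byte size field, then the "WEBP" tag.
        || (bytes.len() >= 12 && &bytes[0..4] == b"RIFF" && &bytes[8..12] == b"WEBP")
}
```

A check of this kind lets the caller reject stdout that is not image data
(for example a textual message) before handing the bytes to the thumbnail
encoder.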

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Julien Froidefond

2026-03-09 22:13:05 +01:00

parent 36af34443e

commit cfc896e92f

22 changed files with 1274 additions and 768 deletions

									
										3

apps/indexer/src/lib.rs
									
												View File
												
				@@ -1,3 +1,4 @@

				pub mod analyzer;

				pub mod api;

				pub mod batch;

				pub mod job;

				@@ -15,6 +16,4 @@ pub struct AppState {

				    pub pool: PgPool,

				    pub meili_url: String,

				    pub meili_master_key: String,

				    pub api_base_url: String,

				    pub api_bootstrap_token: String,

				}
