Stripstream Librarian — Features & Business Rules
Libraries
Multi-Library Management
- Create and manage multiple independent libraries, each with its own root path
- Enable/disable libraries individually
- Deleting a library cascades to all its books, jobs, and metadata
Scanning & Indexing
- Incremental scan: uses directory mtime tracking to skip unchanged directories
- Full rebuild: force re-walk all directories, ignoring cached mtimes
- Rescan: deep rescan to discover newly supported formats
- Two-phase pipeline:
  - Phase 1 (Discovery): fast filename-based metadata extraction (no archive I/O)
  - Phase 2 (Analysis): extract page counts, first page image from archives
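The incremental scan described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the function name `dirs_to_rescan` and the in-memory `cached_mtimes` dict are hypothetical stand-ins for whatever store the indexer actually uses.

```python
import os


def dirs_to_rescan(root, cached_mtimes):
    """Return the directories under `root` whose mtime differs from the cache.

    `cached_mtimes` (hypothetical) maps directory path -> last seen mtime in
    nanoseconds. It is updated in place, so a second call skips directories
    that have not changed since the first one.
    """
    changed = []
    for dirpath, _dirnames, _filenames in os.walk(root):
        mtime = os.stat(dirpath).st_mtime_ns
        if cached_mtimes.get(dirpath) != mtime:
            changed.append(dirpath)
            cached_mtimes[dirpath] = mtime
    return changed
```

Every directory is still visited because a directory's mtime only reflects changes to its direct entries. In this sketch a full rebuild corresponds to calling the function with an empty cache.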
Real-Time Monitoring
- Automatic periodic scanning: configurable interval (default 5 seconds)
- Filesystem watcher: real-time detection of file changes for instant indexing
- Each can be toggled per library (`monitor_enabled`, `watcher_enabled`)
Books
Format Support
- CBZ (ZIP-based comic archives)
- CBR (RAR-based comic archives)
- EPUB
- Automatic format detection from file extension and magic bytes
Metadata Extraction
- Title: derived from filename or external metadata
- Series: derived from directory structure (first directory level under library root)
- Volume: extracted from filename with pattern detection:
  - `T##` (Tome): most common for French comics
  - `Vol.##`, `Vol ##`, `Volume ##`
  - `###` (standalone number)
  - `-##` (dash-separated)
- Author(s): single scalar and array support
- Page count: extracted from archive analysis
- Language, kind (ebook, comic, bd)
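The volume patterns listed above could be matched along these lines. The regular expressions, their order, and the name `extract_volume` are assumptions made for illustration; the actual parser may use different rules and priorities.

```python
import re

# Hypothetical patterns, tried in order of specificity.
VOLUME_PATTERNS = [
    re.compile(r"\bT(\d+)\b", re.IGNORECASE),                  # T05 (Tome)
    re.compile(r"\bVol(?:ume)?\.?\s*(\d+)\b", re.IGNORECASE),  # Vol.12, Vol 12, Volume 12
    re.compile(r"-\s*(\d+)\b"),                                # "- 02" (dash-separated)
    re.compile(r"\b(\d+)\b"),                                  # standalone number
]


def extract_volume(filename):
    """Return the volume number found in `filename`, or None.

    Assumes the filename ends with an extension such as .cbz or .cbr.
    """
    stem = filename.rsplit(".", 1)[0]  # drop the extension
    for pattern in VOLUME_PATTERNS:
        match = pattern.search(stem)
        if match:
            return int(match.group(1))
    return None
```

For example, `extract_volume("Asterix T05.cbz")` yields `5`, and a filename without any number yields `None`.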
Thumbnails
- Generated from the first page of each archive
- Output format configurable: WebP (default), JPEG, PNG
- Configurable dimensions (default 300×400)
- Lazy generation: created on first access if missing
- Bulk operations: rebuild missing or regenerate all
CBR to CBZ Conversion
- Convert RAR archives to ZIP format
- Tracked as background job with progress
Series
Automatic Aggregation
- Series derived from directory structure during scanning
- Books without series grouped as "unclassified"
Series Metadata
- Description, publisher, start year, status (`ongoing`, `ended`, `completed`, `on_hold`, `hiatus`)
- Total volume count (from external providers)
- Authors (aggregated from books or metadata)
Filtering & Discovery
- Filter by: series name (partial match), reading status, series status, metadata provider linkage
- Sort by: name, reading status, book count
- Missing books detection: identifies gaps in volume numbering within a series
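Gap detection in volume numbering can be illustrated with a short sketch. It assumes numbering starts at 1 and that the upper bound is either the provider's total volume count, when known, or the highest owned volume; both choices and the name `missing_volumes` are assumptions of this example.

```python
def missing_volumes(owned, total=None):
    """Return the volume numbers between 1 and the upper bound not in `owned`.

    `total` is the optional volume count from an external provider.
    """
    owned_set = set(owned)
    candidates = set(owned_set)
    if total:
        candidates.add(total)
    if not candidates:
        return []
    upper = max(candidates)
    return [v for v in range(1, upper + 1) if v not in owned_set]
```

With owned volumes 1, 2, 4 and 7 this reports 3, 5 and 6 as missing.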
Reading Progress
Per-Book Tracking
- Three states: `unread` (default), `reading`, `read`
- Current page tracking when status is `reading`
- `last_read_at` timestamp auto-updated
Series-Level Status
- Calculated from book statuses:
  - All read → series `read`
  - None read → series `unread`
  - Mixed → series `reading`
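The series-level rule above can be written as a small function. This sketch follows the three rules literally and counts only books in the `read` state, so a series with a book in `reading` but none `read` comes out as `unread`; whether the application treats that case the same way is an assumption.

```python
def series_status(book_statuses):
    """Derive a series status from the statuses of its books."""
    statuses = list(book_statuses)
    read_count = sum(1 for s in statuses if s == "read")
    if statuses and read_count == len(statuses):
        return "read"      # all books read
    if read_count == 0:
        return "unread"    # no book read (also covers an empty series)
    return "reading"       # mixed
```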
Bulk Operations
- Mark entire series as read (updates all books)
Search & Discovery
Full-Text Search
- PostgreSQL-based (`ILIKE` + `pg_trgm`)
- Searches across: book titles, series names, authors (scalar and array fields), series metadata authors
- Case-insensitive partial matching
- Library-scoped filtering
Results
- Book hits: title, authors, series, volume, language, kind
- Series hits: name, book count, read count, first book (for linking)
- Processing time included in response
Authors
- Unique author aggregation from books and series metadata
- Per-author book and series count
- Searchable by name (partial match)
- Sortable by name or book count
External Metadata
Supported Providers
| Provider | Focus |
|---|---|
| Google Books | General books (default fallback) |
| ComicVine | Comics |
| BedéThèque | Franco-Belgian comics |
| AniList | Manga/anime |
| Open Library | General books |
Provider Configuration
- Global default provider with library-level override
- Fallback provider if primary is unavailable
Matching Workflow
- Search: query a provider, get candidates with confidence scores
- Match: link a series to an external result (status `pending`)
- Approve: validate and sync metadata to series and books
- Reject: discard a match
Batch Processing
- Auto-match all series in a library via `metadata_batch` job
- Configurable confidence threshold
- Result statuses: `auto_matched`, `no_results`, `too_many_results`, `low_confidence`, `already_linked`
Metadata Refresh
- Update approved links with latest data from providers
- Change tracking reports per series/book
- Non-destructive: only updates when provider has new data
Field Locking
- Individual book fields can be locked to prevent external sync from overwriting manual edits
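A non-destructive refresh that respects locked fields might look like the sketch below. The dict-based book record, the `locked_fields` set, and the name `apply_provider_update` are hypothetical; treating `None` from the provider as "no new data" is also an assumption.

```python
def apply_provider_update(book, provider_data, locked_fields):
    """Update `book` in place from `provider_data`, skipping locked fields.

    Returns a change report mapping field -> (old_value, new_value).
    """
    changes = {}
    for field, new_value in provider_data.items():
        if field in locked_fields:
            continue  # manual edit is protected from external sync
        if new_value is None or book.get(field) == new_value:
            continue  # provider has nothing new for this field
        changes[field] = (book.get(field), new_value)
        book[field] = new_value
    return changes
```

The returned mapping is the kind of data a per-book change tracking report could be built from.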
AniList Reading Status Sync
Integration with AniList to synchronize reading progress in both directions for linked series.
Configuration
- AniList user ID required for pull/push operations
- Configured per library in the reading status provider settings
- Auto-push schedule configurable per library: `manual`, `hourly`, `daily`, `weekly`
Reading Status Match (`reading_status_match`)
- Pull reading progress from AniList and update local book statuses
- Maps AniList list status: `PLANNING` → `unread`, `CURRENT` → `reading`, `COMPLETED` → `read`
- Detailed per-series report: matched, updated, skipped, errors
- Rate limit handling: waits 10s and retries once on HTTP 429, aborts on second 429
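The rate limit handling above (wait 10 seconds, retry once, abort on a second 429) can be sketched as follows. The `request` callable, the `RateLimited` exception, and the injectable `sleep` are hypothetical names introduced for this example.

```python
import time


class RateLimited(Exception):
    """Raised when the retry also hits HTTP 429 (hypothetical name)."""


def call_with_retry(request, sleep=time.sleep, wait_seconds=10):
    """Call `request()` and retry once after a pause on HTTP 429.

    `request` is a hypothetical zero-argument callable returning
    (status_code, body). `sleep` is injectable so tests need not wait.
    """
    status, body = request()
    if status != 429:
        return status, body
    sleep(wait_seconds)
    status, body = request()
    if status == 429:
        raise RateLimited("second HTTP 429, aborting")
    return status, body
```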
Reading Status Push (`reading_status_push`)
- Differential push: only syncs series that changed since last push, have new books, or have never been synced
- Maps local status to AniList: `unread` → `PLANNING`, `reading` → `CURRENT`, `read` → `COMPLETED`
- Never auto-completes a series on AniList based solely on owned books (requires all books read)
- Per-series result tracking: pushed, skipped, no_books, error
- Same 429 retry logic as `reading_status_match`
- Auto-push schedule is checked every minute by the indexer scheduler
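The status mapping and the differential selection described above can be illustrated as follows. The mapping values come from the lists above; the series record with `last_pushed_at`, `changed_at`, and `book_added_at` numeric timestamps is a hypothetical shape chosen for the sketch.

```python
LOCAL_TO_ANILIST = {"unread": "PLANNING", "reading": "CURRENT", "read": "COMPLETED"}
ANILIST_TO_LOCAL = {anilist: local for local, anilist in LOCAL_TO_ANILIST.items()}


def needs_push(series):
    """Return True if `series` should be included in a differential push.

    `series` is a hypothetical dict with optional numeric timestamps:
    last_pushed_at, changed_at, book_added_at.
    """
    last_pushed = series.get("last_pushed_at")
    if last_pushed is None:
        return True  # never synced
    changed = (series.get("changed_at") or 0) > last_pushed
    new_books = (series.get("book_added_at") or 0) > last_pushed
    return changed or new_books
```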
External Integrations
Komga Sync
- Import reading progress from a Komga server
- Matches local series/books by name
- Detailed sync report: matched, already read, newly marked, unmatched
Prowlarr (Indexer Search)
- Search Prowlarr for missing volumes in a series
- Volume pattern matching against release titles
- Results: title, size, seeders/leechers, download URL, matched missing volumes
qBittorrent
- Add torrents directly from Prowlarr search results
- Connection test endpoint
Notifications
Telegram
- Real-time notifications via Telegram Bot API (`sendMessage` and `sendPhoto`)
- Configuration: bot token, chat ID, enable/disable toggle
- Test connection button in settings
Granular Event Toggles
16 individually configurable notification events grouped by category:
| Category | Events |
|---|---|
| Scans | scan_completed, scan_failed, scan_cancelled |
| Thumbnails | thumbnail_completed, thumbnail_failed, thumbnail_cancelled |
| Conversion | conversion_completed, conversion_failed, conversion_cancelled |
| Metadata | metadata_approved, metadata_batch_completed, metadata_refresh_completed |
| Reading status | reading_status_match_completed, reading_status_match_failed, reading_status_push_completed, reading_status_push_failed |
Thumbnail Images in Notifications
- Book cover thumbnails attached to applicable notifications (conversion, metadata approval)
- Uses `sendPhoto` multipart upload with fallback to text-only `sendMessage`
Implementation
- Shared `crates/notifications` crate used by both API and indexer
- Fire-and-forget: notification failures are logged but never block the main operation
- Messages formatted in HTML with event-specific icons
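The failure handling described above (photo first, text-only fallback, errors logged but never propagated) can be sketched like this. The `send_photo` and `send_message` callables are hypothetical wrappers around the corresponding Bot API methods and are assumed to raise on failure; the sketch is in Python for brevity and does not reflect the crate's actual code.

```python
import logging

logger = logging.getLogger("notifications")


def notify(send_photo, send_message, text, photo=None):
    """Send a notification without ever raising to the caller.

    Returns True if something was delivered, False otherwise.
    """
    try:
        if photo is not None:
            try:
                send_photo(photo, text)
                return True
            except Exception:
                logger.warning("sendPhoto failed, falling back to sendMessage")
        send_message(text)
        return True
    except Exception:
        # Logged only: the main operation must not be blocked.
        logger.exception("notification failed")
        return False
```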
Page Rendering & Caching
Page Extraction
- Render any page from supported archive formats
- 1-indexed page numbers
Image Processing
- Output formats: original, JPEG, PNG, WebP
- Quality parameter (1–100)
- Max width parameter (1–2160 px)
- Configurable resampling filter: lanczos3, nearest, triangle/bilinear
- Concurrent render limit (default 8) with semaphore
Caching
- LRU in-memory cache: 512 entries
- Disk cache: SHA256-keyed, two-level directory structure
- Cache key = hash(path + page + format + quality + width)
- Configurable cache directory and max size
- Manual cache clear via settings
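A disk cache path derived from the key components above might be computed as follows. The field separator, the use of the first four hex characters for the two directory levels, and the file extension are assumptions of this sketch; only the SHA256 keying and the two-level layout come from the description.

```python
import hashlib
from pathlib import Path


def cache_path(cache_dir, path, page, fmt, quality, width):
    """Return the on-disk cache location for one rendered page."""
    key_material = f"{path}|{page}|{fmt}|{quality}|{width}".encode("utf-8")
    digest = hashlib.sha256(key_material).hexdigest()
    # Two directory levels keep any single directory from growing too large.
    return Path(cache_dir) / digest[:2] / digest[2:4] / f"{digest}.{fmt}"
```

Changing any one parameter, for example the width, produces a different digest and therefore a separate cache entry.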
Background Jobs
Job Types
| Type | Description |
|---|---|
| `rebuild` | Incremental scan |
| `full_rebuild` | Full filesystem rescan |
| `rescan` | Deep rescan for new formats |
| `thumbnail_rebuild` | Generate missing thumbnails |
| `thumbnail_regenerate` | Clear and regenerate all thumbnails |
| `cbr_to_cbz` | Convert RAR to ZIP |
| `metadata_batch` | Auto-match series to metadata |
| `metadata_refresh` | Update approved metadata links |
| `reading_status_match` | Pull reading progress from AniList to local |
| `reading_status_push` | Differential push of reading statuses to AniList |
Job Lifecycle
- Status flow: `pending` → `running` → `success` | `failed` | `cancelled`
- Intermediate statuses: `extracting_pages`, `generating_thumbnails`
- Real-time progress via Server-Sent Events (SSE)
- Per-file error tracking (non-fatal: job continues on errors)
- Cancellation support for pending/running jobs
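The lifecycle above can be modelled as a transition table. The status names come from the list above, but the exact set of allowed transitions, in particular the ordering of the intermediate statuses, is an assumption of this sketch.

```python
TERMINAL = {"success", "failed", "cancelled"}

# Hypothetical transition table; intermediate ordering is assumed.
TRANSITIONS = {
    "pending": {"running", "cancelled"},
    "running": {"extracting_pages", "generating_thumbnails"} | TERMINAL,
    "extracting_pages": {"generating_thumbnails"} | TERMINAL,
    "generating_thumbnails": set(TERMINAL),
}


def can_transition(current, new):
    """Return True if a job may move from `current` to `new`."""
    return new in TRANSITIONS.get(current, set())
```

Terminal statuses have no entry in the table, so a finished job cannot be moved to any other status.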
Progress Tracking
- Percentage (0–100), current file, processed/total counts
- Timing: started_at, finished_at, phase2_started_at
- Stats JSON blob with job-specific metrics
Authentication & Security
Token System
- Bootstrap token: admin token via `API_BOOTSTRAP_TOKEN` env var
- API tokens: create, list, revoke with scopes
- Token format: `stl_{prefix}_{secret}` with Argon2 hashing
- Expiration dates, last usage tracking, revocation
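Generating and parsing a token in the `stl_{prefix}_{secret}` format could look like the sketch below. The prefix and secret lengths are assumptions, and Argon2 hashing is left out because it needs a third-party library in Python; presumably the secret is what gets hashed while the prefix serves as a lookup key, but that is also an assumption.

```python
import secrets


def generate_token():
    """Return a new token string in the stl_{prefix}_{secret} format."""
    prefix = secrets.token_hex(4)       # 8 hex chars, assumed length
    secret = secrets.token_urlsafe(32)  # may itself contain "_" or "-"
    return f"stl_{prefix}_{secret}"


def parse_token(token):
    """Return (prefix, secret), or None if the format is invalid."""
    parts = token.split("_", 2)  # at most 3 parts: secret may contain "_"
    if len(parts) != 3 or parts[0] != "stl" or not parts[1] or not parts[2]:
        return None
    return parts[1], parts[2]
```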
Access Control
- Two scopes: `admin` (full access) and `read` (read-only)
- Route-level middleware enforcement
- Rate limiting: configurable sliding window (default 120 req/s)
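A sliding-window limiter with the default of 120 requests per second can be sketched as follows. The class name, the explicit `now` argument, and the single shared window (rather than one per client or token) are simplifications made for this example.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` within any `window_seconds` span."""

    def __init__(self, max_requests=120, window_seconds=1.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = deque()  # timestamps of accepted requests

    def allow(self, now):
        """Return True and record the request if it fits in the window."""
        # Drop timestamps that have slid out of the window.
        while self._hits and now - self._hits[0] >= self.window_seconds:
            self._hits.popleft()
        if len(self._hits) >= self.max_requests:
            return False
        self._hits.append(now)
        return True
```

In real use `now` would come from a monotonic clock such as `time.monotonic()`.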
Backoffice (Web UI)
Dashboard
- Statistics cards: books, series, authors, libraries, pages, total size
- Interactive charts (recharts): donut, area, stacked bar, horizontal bar
- Reading status breakdown, format distribution, library distribution
- Currently reading section with progress bars
- Recently read section with cover thumbnails
- Reading activity over time (area chart)
- Books added over time (area chart)
- Per-library stacked reading progress
- Top series by book count
- Metadata coverage and provider breakdown
Pages
- Libraries: list, create, delete, configure monitoring and metadata provider
- Books: global list with filtering/sorting, detail view with metadata and page rendering
- Series: global list, per-library view, detail with metadata management
- Authors: list with book/series counts, detail with author's books
- Jobs: history, live progress via SSE, error details
- Tokens: create, list, revoke API tokens
- Settings: image processing, cache, thumbnails, external services (Prowlarr, qBittorrent), notifications (Telegram)
Interactive Features
- Real-time search with suggestions
- Metadata search and matching modals
- Prowlarr search modal for missing volumes
- Folder browser/picker for library paths
- Book/series editing forms
- Quick reading status toggles
- CBR to CBZ conversion trigger
API
Documentation
- OpenAPI/Swagger UI available at `/swagger-ui`
- Health check (`/health`), readiness (`/ready`), Prometheus metrics (`/metrics`)
Public Endpoints (no auth)
- `GET /health`, `GET /ready`, `GET /metrics`, `GET /swagger-ui`
Read Endpoints (read scope)
- Libraries, books, series, authors listing and detail
- Book pages and thumbnails
- Reading progress get/update
- Full-text search, collection statistics
Admin Endpoints (admin scope)
- Library CRUD and configuration
- Book metadata editing, CBR conversion
- Series metadata editing
- Indexing job management (trigger, cancel, stream)
- API token management
- Metadata operations (search, match, approve, reject, batch, refresh)
- External integrations (Prowlarr, qBittorrent, Komga)
- Application settings and cache management
Database
Key Design Decisions
- PostgreSQL with `pg_trgm` for full-text search (no external search engine)
- All deletions cascade from libraries
- Unique constraints: file paths, token prefixes, metadata links (library + series + provider)
- Directory mtime caching for incremental scan optimization
- Connection pool: 10 (API), 20 (indexer)
Archive Resilience
- CBZ: fallback streaming reader if central directory corrupted
- CBR: RAR extraction via system `unar`, fallback to CBZ parsing
- PDF: `pdfinfo` for page count, `pdftoppm` for rendering
- EPUB: ZIP-based extraction
- FD exhaustion detection: aborts if too many consecutive IO errors
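Aborting after too many consecutive IO errors can be illustrated with a small counter. The threshold of 10, the class name, and the exception type are hypothetical; the document does not state the actual limit.

```python
class ConsecutiveErrorGuard:
    """Raise once `limit` IO errors occur in a row (hypothetical helper)."""

    def __init__(self, limit=10):
        self.limit = limit
        self.count = 0

    def record(self, ok):
        """Record the outcome of one file operation."""
        if ok:
            self.count = 0  # any success resets the streak
            return
        self.count += 1
        if self.count >= self.limit:
            raise RuntimeError("too many consecutive IO errors, aborting")
```

Isolated per-file failures leave the job running; only an unbroken streak of errors, as would happen when file descriptors run out, triggers the abort.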