Stripstream Librarian — Features & Business Rules
Libraries
Multi-Library Management
- Create and manage multiple independent libraries, each with its own root path
- Enable/disable libraries individually
- Deleting a library cascades to all its books, jobs, and metadata
Scanning & Indexing
- Incremental scan: uses directory mtime tracking to skip unchanged directories
- Full rebuild: force re-walk all directories, ignoring cached mtimes
- Rescan: deep rescan to discover newly supported formats
- Two-phase pipeline:
  - Phase 1 (Discovery): fast filename-based metadata extraction (no archive I/O)
  - Phase 2 (Analysis): extract page counts, first page image from archives
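
The incremental scan described above can be illustrated with a minimal sketch. It assumes the cached mtimes are held in a plain dictionary keyed by directory path; the function name `dirs_to_rescan` and its parameters are hypothetical and not taken from the project.

```python
import os
from pathlib import Path
from typing import Dict, List


def dirs_to_rescan(root: Path, cached_mtimes: Dict[str, float],
                   full_rebuild: bool = False) -> List[Path]:
    """Return directories whose mtime differs from the cached value.

    With full_rebuild=True every directory is returned, ignoring the cache.
    The cache is updated in place (hypothetical storage; a real indexer
    would likely persist it in the database).
    """
    changed = []
    for dirpath, _dirnames, _filenames in os.walk(str(root)):
        mtime = os.stat(dirpath).st_mtime
        if full_rebuild or cached_mtimes.get(dirpath) != mtime:
            changed.append(Path(dirpath))
        cached_mtimes[dirpath] = mtime
    return changed
```

A directory's mtime changes when entries are added, removed, or renamed directly inside it, not when an existing file is modified in place, which is one reason a full rebuild mode is useful alongside the incremental one.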
Real-Time Monitoring
- Automatic periodic scanning: configurable interval (default 5 seconds)
- Filesystem watcher: real-time detection of file changes for instant indexing
- Each can be toggled per library (`monitor_enabled`, `watcher_enabled`)
Books
Format Support
- CBZ (ZIP-based comic archives)
- CBR (RAR-based comic archives)
- EPUB
- Automatic format detection from file extension and magic bytes
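
Detection from extension and magic bytes could look like the sketch below. The ZIP and RAR signatures are the standard ones; the function name `detect_format` and the choice to let the file header win over a misleading extension are assumptions, not confirmed project behavior.

```python
from pathlib import Path
from typing import Optional

ZIP_MAGIC = b"PK\x03\x04"   # local file header of a non-empty ZIP archive
RAR_MAGIC = b"Rar!\x1a\x07"  # common prefix of RAR 4 and RAR 5 signatures

EXTENSION_FORMATS = {".cbz": "cbz", ".cbr": "cbr", ".epub": "epub"}


def detect_format(filename: str, header: bytes) -> Optional[str]:
    """Guess the book format from the first bytes, then from the extension."""
    ext = Path(filename).suffix.lower()
    if header.startswith(RAR_MAGIC):
        return "cbr"
    if header.startswith(ZIP_MAGIC):
        # EPUB is also a ZIP container, so the extension disambiguates.
        return "epub" if ext == ".epub" else "cbz"
    # Unknown header: fall back to the extension alone (assumed behavior).
    return EXTENSION_FORMATS.get(ext)
```

With this ordering, a ZIP archive that was renamed to `.cbr` is still treated as CBZ.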
Metadata Extraction
- Title: derived from filename or external metadata
- Series: derived from directory structure (first directory level under library root)
- Volume: extracted from filename with pattern detection:
  - `T##` (Tome), most common for French comics
  - `Vol.##`, `Vol ##`, `Volume ##`
  - `###` (standalone number)
  - `-##` (dash-separated)
- Author(s): single scalar and array support
- Page count: extracted from archive analysis
- Language, kind (ebook, comic, bd)
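
As a rough illustration of the series and volume rules above, the following sketch derives the series from the first directory level and tries the volume patterns in order. The regular expressions, their priority, and the function names are assumptions made for the example; the project's actual patterns may differ.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# Tried in order; the first match wins (assumed priority).
VOLUME_PATTERNS = [
    re.compile(r"\bT(\d+)\b", re.IGNORECASE),                   # T05
    re.compile(r"\bVol(?:ume)?\.?\s*(\d+)\b", re.IGNORECASE),   # Vol.12, Vol 12, Volume 12
    re.compile(r"-\s*(\d+)\b"),                                 # - 07
    re.compile(r"\b(\d+)\b"),                                   # standalone number
]


def extract_volume(filename: str) -> Optional[int]:
    """Return the volume number found in the filename, or None."""
    stem = PurePosixPath(filename).stem
    for pattern in VOLUME_PATTERNS:
        match = pattern.search(stem)
        if match:
            return int(match.group(1))
    return None


def derive_series(library_root: str, book_path: str) -> Optional[str]:
    """Return the first directory under the library root, or None if the
    book sits directly in the root (and would be "unclassified")."""
    parts = PurePosixPath(book_path).relative_to(library_root).parts
    return parts[0] if len(parts) > 1 else None
```

For example, `extract_volume("Asterix T05.cbz")` yields `5`, and a file stored at `/lib/Asterix/Asterix T05.cbz` under the root `/lib` belongs to the series `Asterix`.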
Thumbnails
- Generated from the first page of each archive
- Output format configurable: WebP (default), JPEG, PNG
- Configurable dimensions (default 300×400)
- Lazy generation: created on first access if missing
- Bulk operations: rebuild missing or regenerate all
CBR to CBZ Conversion
- Convert RAR archives to ZIP format
- Tracked as background job with progress
Series
Automatic Aggregation
- Series derived from directory structure during scanning
- Books without series grouped as "unclassified"
Series Metadata
- Description, publisher, start year, status (`ongoing`, `ended`, `completed`, `on_hold`, `hiatus`)
- Total volume count (from external providers)
- Authors (aggregated from books or metadata)
Filtering & Discovery
- Filter by: series name (partial match), reading status, series status, metadata provider linkage
- Sort by: name, reading status, book count
- Missing books detection: identifies gaps in volume numbering within a series
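
Gap detection in volume numbering can be sketched as follows. This version only reports numbers missing between the lowest and highest owned volume; whether the project also uses the provider's total volume count is not stated here, so that is left out. The function name is hypothetical.

```python
from typing import Iterable, List


def missing_volumes(owned: Iterable[int]) -> List[int]:
    """Return volume numbers absent between the lowest and highest owned volume."""
    owned_set = set(owned)
    if not owned_set:
        return []
    low, high = min(owned_set), max(owned_set)
    return [v for v in range(low, high + 1) if v not in owned_set]
```

For a series with volumes 1, 2, 4, and 7 on disk, this reports 3, 5, and 6 as missing.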
Reading Progress
Per-Book Tracking
- Three states: `unread` (default), `reading`, `read`
- Current page tracking when status is `reading`
- `last_read_at` timestamp auto-updated
Series-Level Status
- Calculated from book statuses:
  - All read → series `read`
  - None read → series `unread`
  - Mixed → series `reading`
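
The aggregation rule above translates directly into a small function. The sketch follows the three rules literally, so a series whose books are only `unread` or `reading` counts as `unread`; treating an empty series as `unread` is an assumption.

```python
from typing import Iterable


def series_status(book_statuses: Iterable[str]) -> str:
    """Derive a series status from the statuses of its books."""
    statuses = list(book_statuses)
    read_count = sum(1 for s in statuses if s == "read")
    if statuses and read_count == len(statuses):
        return "read"      # all books read
    if read_count == 0:
        return "unread"    # no book read (also the assumed result for an empty series)
    return "reading"       # some, but not all, books read
```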
Bulk Operations
- Mark entire series as read (updates all books)
Search & Discovery
Full-Text Search
- PostgreSQL-based (`ILIKE` + `pg_trgm`)
- Searches across: book titles, series names, authors (scalar and array fields), series metadata authors
- Case-insensitive partial matching
- Library-scoped filtering
Results
- Book hits: title, authors, series, volume, language, kind
- Series hits: name, book count, read count, first book (for linking)
- Processing time included in response
Authors
- Unique author aggregation from books and series metadata
- Per-author book and series count
- Searchable by name (partial match)
- Sortable by name or book count
External Metadata
Supported Providers
| Provider | Focus |
|---|---|
| Google Books | General books (default fallback) |
| ComicVine | Comics |
| BedéThèque | Franco-Belgian comics |
| AniList | Manga/anime |
| Open Library | General books |
Provider Configuration
- Global default provider with library-level override
- Fallback provider if primary is unavailable
Matching Workflow
- Search: query a provider, get candidates with confidence scores
- Match: link a series to an external result (status `pending`)
- Approve: validate and sync metadata to series and books
- Reject: discard a match
Batch Processing
- Auto-match all series in a library via `metadata_batch` job
- Configurable confidence threshold
- Result statuses: `auto_matched`, `no_results`, `too_many_results`, `low_confidence`, `already_linked`
Metadata Refresh
- Update approved links with latest data from providers
- Change tracking reports per series/book
- Non-destructive: only updates when provider has new data
Field Locking
- Individual book fields can be locked to prevent external sync from overwriting manual edits
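
How a non-destructive refresh and field locking might interact is sketched below. The dictionary-based records, the function name `apply_provider_metadata`, and the rule that a `None` value from the provider means "no new data" are all assumptions made for illustration.

```python
from typing import Any, Dict, Set, Tuple


def apply_provider_metadata(
    current: Dict[str, Any],
    incoming: Dict[str, Any],
    locked_fields: Set[str],
) -> Tuple[Dict[str, Any], Dict[str, Tuple[Any, Any]]]:
    """Merge provider data into a record without touching locked fields.

    Returns the updated record and a change report mapping each modified
    field to its (old, new) values.
    """
    updated = dict(current)
    changes: Dict[str, Tuple[Any, Any]] = {}
    for field, new_value in incoming.items():
        if field in locked_fields:
            continue  # manual edit is protected from external sync
        if new_value is None:
            continue  # provider has no data for this field: keep what we have
        if current.get(field) != new_value:
            changes[field] = (current.get(field), new_value)
            updated[field] = new_value
    return updated, changes
```

The returned change report is the kind of structure a per-series or per-book change tracking view could be built from.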
External Integrations
Komga Sync
- Import reading progress from a Komga server
- Matches local series/books by name
- Detailed sync report: matched, already read, newly marked, unmatched
Prowlarr (Indexer Search)
- Search Prowlarr for missing volumes in a series
- Volume pattern matching against release titles
- Results: title, size, seeders/leechers, download URL, matched missing volumes
qBittorrent
- Add torrents directly from Prowlarr search results
- Connection test endpoint
Page Rendering & Caching
Page Extraction
- Render any page from supported archive formats
- 1-indexed page numbers
Image Processing
- Output formats: original, JPEG, PNG, WebP
- Quality parameter (1–100)
- Max width parameter (1–2160 px)
- Configurable resampling filter: lanczos3, nearest, triangle/bilinear
- Concurrent render limit (default 8) with semaphore
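
A semaphore-bounded render limit can be sketched with `asyncio`. The `render` callable stands in for the real page renderer and `render_all` is a hypothetical name; only the default limit of 8 comes from the text above.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def render_all(
    pages: Iterable[T],
    render: Callable[[T], Awaitable[R]],
    limit: int = 8,
) -> List[R]:
    """Render all pages, running at most `limit` renders at the same time."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(page: T) -> R:
        async with semaphore:  # waits here while `limit` renders are in flight
            return await render(page)

    # gather preserves the input order in its result list
    return await asyncio.gather(*(guarded(page) for page in pages))
```

Requests beyond the limit wait for a slot instead of being rejected, which keeps memory and CPU use bounded while image decoding is in progress.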
Caching
- LRU in-memory cache: 512 entries
- Disk cache: SHA256-keyed, two-level directory structure
- Cache key = hash(path + page + format + quality + width)
- Configurable cache directory and max size
- Manual cache clear via settings
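
The two cache layers could be approximated as below. The SHA256 key over path, page, format, quality, and width follows the description above; the field separator, the two-character directory sharding, and the class and function names are assumptions.

```python
import hashlib
from collections import OrderedDict
from pathlib import PurePosixPath
from typing import Any, Optional


def cache_key(path: str, page: int, fmt: str, quality: int, width: int) -> str:
    """SHA256 hex digest over all parameters that affect the rendered image."""
    raw = f"{path}|{page}|{fmt}|{quality}|{width}"  # separator is an assumption
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def cache_path(cache_dir: str, key: str) -> PurePosixPath:
    """Two-level layout, e.g. <cache_dir>/ab/cd/abcd... (assumed sharding)."""
    return PurePosixPath(cache_dir) / key[:2] / key[2:4] / key


class LruCache:
    """Minimal in-memory LRU cache; 512 entries matches the stated default."""

    def __init__(self, capacity: int = 512) -> None:
        self.capacity = capacity
        self._entries: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, key: str) -> Optional[Any]:
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key: str, value: Any) -> None:
        self._entries[key] = value
        self._entries.move_to_end(key)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used
```

Sharding by key prefix keeps any single directory from accumulating an unmanageable number of files.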
Background Jobs
Job Types
| Type | Description |
|---|---|
| `rebuild` | Incremental scan |
| `full_rebuild` | Full filesystem rescan |
| `rescan` | Deep rescan for new formats |
| `thumbnail_rebuild` | Generate missing thumbnails |
| `thumbnail_regenerate` | Clear and regenerate all thumbnails |
| `cbr_to_cbz` | Convert RAR to ZIP |
| `metadata_batch` | Auto-match series to metadata |
| `metadata_refresh` | Update approved metadata links |
Job Lifecycle
- Status flow: `pending` → `running` → `success` | `failed` | `cancelled`
- Intermediate statuses: `extracting_pages`, `generating_thumbnails`
- Real-time progress via Server-Sent Events (SSE)
- Per-file error tracking (non-fatal: job continues on errors)
- Cancellation support for pending/running jobs
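
The lifecycle above can be read as a small state machine. The sketch below encodes one plausible interpretation: the intermediate statuses are treated as sub-states of a running job, and any active state may end in a terminal one. The exact transitions the project allows are not specified here, so this table is an assumption.

```python
TERMINAL = {"success", "failed", "cancelled"}
ACTIVE = {"running", "extracting_pages", "generating_thumbnails"}


def can_transition(current: str, new: str) -> bool:
    """Return True if a job may move from `current` to `new` (assumed rules)."""
    if current in TERMINAL:
        return False  # finished jobs never change state again
    if current == "pending":
        return new in {"running", "cancelled"}
    if current in ACTIVE:
        # Move between active phases or finish; staying put is not a transition.
        return new in (ACTIVE | TERMINAL) - {current}
    return False  # unknown status
```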
Progress Tracking
- Percentage (0–100), current file, processed/total counts
- Timing: `started_at`, `finished_at`, `phase2_started_at`
- Stats JSON blob with job-specific metrics
Authentication & Security
Token System
- Bootstrap token: admin token via `API_BOOTSTRAP_TOKEN` env var
- API tokens: create, list, revoke with scopes
- Token format: `stl_{prefix}_{secret}` with Argon2 hashing
- Expiration dates, last usage tracking, revocation
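
Splitting a token of the form `stl_{prefix}_{secret}` might look like this. The sketch assumes the prefix contains no underscore and is used to look up the stored record, after which the secret would be checked against its Argon2 hash; that verification step is omitted because it needs a third-party library. The function name is hypothetical.

```python
from typing import Tuple


def parse_token(token: str) -> Tuple[str, str]:
    """Split 'stl_{prefix}_{secret}' into (prefix, secret).

    Raises ValueError for anything that does not match the expected shape.
    """
    parts = token.split("_", 2)  # at most 3 pieces; the secret may contain '_'
    if len(parts) != 3 or parts[0] != "stl" or not parts[1] or not parts[2]:
        raise ValueError("malformed API token")
    return parts[1], parts[2]
```

Looking a token up by its prefix avoids hashing the presented secret against every stored token, which matters because Argon2 is deliberately slow.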
Access Control
- Two scopes: `admin` (full access) and `read` (read-only)
- Route-level middleware enforcement
- Rate limiting: configurable sliding window (default 120 req/s)
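
A sliding-window limiter with the stated default of 120 requests per second can be sketched as follows. Keying the window per client and storing timestamps in an in-memory deque are assumptions; the class name is hypothetical.

```python
import time
from collections import deque
from typing import Deque, Dict, Optional


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within any `window_seconds` span."""

    def __init__(self, max_requests: int = 120, window_seconds: float = 1.0) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: Dict[str, Deque[float]] = {}

    def allow(self, client: str, now: Optional[float] = None) -> bool:
        """Record the request and return True, or return False if over the limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits.setdefault(client, deque())
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Unlike a fixed-window counter, this approach does not let a client send a double burst across a window boundary.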
Backoffice (Web UI)
Dashboard
- Statistics cards: books, series, authors, libraries
- Donut charts: reading status breakdown, format distribution
- Bar charts: books per language
- Per-library reading progress bars
- Top series by book/page count
- Monthly addition timeline
- Metadata coverage stats
Pages
- Libraries: list, create, delete, configure monitoring and metadata provider
- Books: global list with filtering/sorting, detail view with metadata and page rendering
- Series: global list, per-library view, detail with metadata management
- Authors: list with book/series counts, detail with author's books
- Jobs: history, live progress via SSE, error details
- Tokens: create, list, revoke API tokens
- Settings: image processing, cache, thumbnails, external services (Prowlarr, qBittorrent)
Interactive Features
- Real-time search with suggestions
- Metadata search and matching modals
- Prowlarr search modal for missing volumes
- Folder browser/picker for library paths
- Book/series editing forms
- Quick reading status toggles
- CBR to CBZ conversion trigger
API
Documentation
- OpenAPI/Swagger UI available at `/swagger-ui`
- Health check (`/health`), readiness (`/ready`), Prometheus metrics (`/metrics`)
Public Endpoints (no auth)
- `GET /health`, `GET /ready`, `GET /metrics`, `GET /swagger-ui`
Read Endpoints (read scope)
- Libraries, books, series, authors listing and detail
- Book pages and thumbnails
- Reading progress get/update
- Full-text search, collection statistics
Admin Endpoints (admin scope)
- Library CRUD and configuration
- Book metadata editing, CBR conversion
- Series metadata editing
- Indexing job management (trigger, cancel, stream)
- API token management
- Metadata operations (search, match, approve, reject, batch, refresh)
- External integrations (Prowlarr, qBittorrent, Komga)
- Application settings and cache management
Database
Key Design Decisions
- PostgreSQL with `pg_trgm` for full-text search (no external search engine)
- All deletions cascade from libraries
- Unique constraints: file paths, token prefixes, metadata links (library + series + provider)
- Directory mtime caching for incremental scan optimization
- Connection pool: 10 (API), 20 (indexer)
Archive Resilience
- CBZ: fallback streaming reader if central directory corrupted
- CBR: RAR extraction via system `unar`, fallback to CBZ parsing
- PDF: `pdfinfo` for page count, `pdftoppm` for rendering
- EPUB: ZIP-based extraction
- FD exhaustion detection: aborts if too many consecutive IO errors