docs: add comprehensive features list to README and docs/FEATURES.md

Replace the minimal README features section with a concise categorized
summary and link to a detailed docs/FEATURES.md covering all features,
business rules, API endpoints, and integrations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-21 14:34:36 +01:00


Stripstream Librarian — Features & Business Rules

Libraries

Multi-Library Management

Create and manage multiple independent libraries, each with its own root path
Enable/disable libraries individually
Deleting a library cascades to all its books, jobs, and metadata

Scanning & Indexing

Incremental scan: uses directory mtime tracking to skip unchanged directories
Full rebuild: force re-walk all directories, ignoring cached mtimes
Rescan: deep rescan to discover newly supported formats
Two-phase pipeline:
- Phase 1 (Discovery): fast filename-based metadata extraction (no archive I/O)
- Phase 2 (Analysis): extract page counts, first page image from archives
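
As a rough illustration of the incremental scan, the sketch below compares each directory's current mtime with a cached value and reports only the directories that changed. The function name `changed_directories` and the shape of the cache (a dict of path to mtime) are assumptions for the example, not the project's actual code. A directory's mtime only changes when its direct entries change, so the walk still visits every directory and skips the per-file work for unchanged ones.

```python
import os

def changed_directories(root, cached_mtimes):
    """Return directories under root whose mtime differs from the cache.

    cached_mtimes is a hypothetical dict mapping directory path -> mtime (ns).
    It is updated in place, so an immediate second call reports nothing.
    """
    changed = []
    for dirpath, _dirnames, _filenames in os.walk(root):
        mtime = os.stat(dirpath).st_mtime_ns
        if cached_mtimes.get(dirpath) != mtime:
            changed.append(dirpath)          # needs re-indexing
            cached_mtimes[dirpath] = mtime   # remember for the next scan
    return changed
```

A full rebuild corresponds to calling this with an empty cache, which makes every directory count as changed.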

Real-Time Monitoring

Automatic periodic scanning: configurable interval (default 5 seconds)
Filesystem watcher: real-time detection of file changes for instant indexing
Each can be toggled per library (monitor_enabled, watcher_enabled)

Books

Format Support

CBZ (ZIP-based comic archives)
CBR (RAR-based comic archives)
PDF
EPUB
Automatic format detection from file extension and magic bytes

Metadata Extraction

Title: derived from filename or external metadata
Series: derived from directory structure (first directory level under library root)
Volume: extracted from filename with pattern detection:
- T## (Tome) — most common for French comics
- Vol.##, Vol ##, Volume ##
- ### (standalone number)
- -## (dash-separated)
Author(s): supports both a single scalar value and an array of authors
Page count: extracted from archive analysis
Language, kind (ebook, comic, bd)
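
The volume patterns listed above could be approximated with a small ordered list of regular expressions. This is a minimal sketch: the exact expressions, their priority order, and the name `extract_volume` are assumptions made for illustration.

```python
import os
import re

# Tried in order; both the order and the exact regexes are assumptions.
_VOLUME_PATTERNS = [
    re.compile(r"\bT(\d+)\b", re.IGNORECASE),                  # T05 (Tome)
    re.compile(r"\bVol(?:ume)?\.?\s*(\d+)\b", re.IGNORECASE),  # Vol.12, Vol 12, Volume 12
    re.compile(r"-\s*(\d+)\b"),                                # Title - 02
    re.compile(r"\b(\d+)\b"),                                  # standalone number
]

def extract_volume(filename):
    """Return the volume number found in filename, or None."""
    stem, _ext = os.path.splitext(filename)  # drop the file extension
    for pattern in _VOLUME_PATTERNS:
        match = pattern.search(stem)
        if match:
            return int(match.group(1))
    return None
```

Putting the most specific patterns first keeps a title such as `One Piece Vol.12` from being caught by the generic standalone-number rule.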

Thumbnails

Generated from the first page of each archive
Output format configurable: WebP (default), JPEG, PNG
Configurable dimensions (default 300×400)
Lazy generation: created on first access if missing
Bulk operations: rebuild missing or regenerate all

CBR to CBZ Conversion

Convert RAR archives to ZIP format
Tracked as a background job with progress reporting

Series

Automatic Aggregation

Series derived from directory structure during scanning
Books without series grouped as "unclassified"

Series Metadata

Description, publisher, start year, status (ongoing, ended, completed, on_hold, hiatus)
Total volume count (from external providers)
Authors (aggregated from books or metadata)

Filtering & Discovery

Filter by: series name (partial match), reading status, series status, metadata provider linkage
Sort by: name, reading status, book count
Missing books detection: identifies gaps in volume numbering within a series
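
Gap detection can be pictured as a set difference between the expected and the present volume numbers. In the sketch below, numbering is assumed to start at 1, and the optional `total` argument stands in for a total volume count from an external provider; both are assumptions, as is the name `missing_volumes`.

```python
def missing_volumes(volumes, total=None):
    """Return the sorted volume numbers absent from a series.

    volumes: volume numbers of the books on disk (None entries are ignored).
    total:   optional expected volume count, e.g. from a metadata provider.
    """
    present = {v for v in volumes if v is not None}
    if not present:
        return []
    upper = max(max(present), total or 0)
    return [n for n in range(1, upper + 1) if n not in present]
```

For example, `missing_volumes([1, 2, 4, 7])` returns `[3, 5, 6]`.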

Reading Progress

Per-Book Tracking

Three states: unread (default), reading, read
Current page tracking when status is reading
last_read_at timestamp auto-updated

Series-Level Status

Calculated from book statuses:
- All read → series read
- None read → series unread
- Mixed → series reading
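
The three rules above translate directly into a small function. The sketch follows the rules literally, so a series with no `read` book is reported as `unread` even if one book is in the `reading` state; whether the application treats that case the same way is not stated here. The name `series_status` is hypothetical.

```python
def series_status(book_statuses):
    """Derive a series status from the statuses of its books."""
    statuses = list(book_statuses)
    read_count = sum(1 for s in statuses if s == "read")
    if statuses and read_count == len(statuses):
        return "read"      # all books read
    if read_count == 0:
        return "unread"    # no book read (also covers an empty series)
    return "reading"       # mixed
```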

Bulk Operations

Mark entire series as read (updates all books)

Search & Discovery

Full-Text Search

PostgreSQL-based (ILIKE + pg_trgm)
Searches across: book titles, series names, authors (scalar and array fields), series metadata authors
Case-insensitive partial matching
Library-scoped filtering

Results

Book hits: title, authors, series, volume, language, kind
Series hits: name, book count, read count, first book (for linking)
Processing time included in response

Authors

Unique author aggregation from books and series metadata
Per-author book and series count
Searchable by name (partial match)
Sortable by name or book count

External Metadata

Supported Providers

Provider	Focus
Google Books	General books (default fallback)
ComicVine	Comics
BedéThèque	Franco-Belgian comics
AniList	Manga/anime
Open Library	General books

Provider Configuration

Global default provider with library-level override
Fallback provider if primary is unavailable

Matching Workflow

Search: query a provider, get candidates with confidence scores
Match: link a series to an external result (status pending)
Approve: validate and sync metadata to series and books
Reject: discard a match

Batch Processing

Auto-match all series in a library via metadata_batch job
Configurable confidence threshold
Result statuses: auto_matched, no_results, too_many_results, low_confidence, already_linked
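
One plausible way the batch job could map provider candidates to these result statuses is sketched below. The decision order, the default threshold of 0.85, and the meaning given to `too_many_results` (more than one candidate above the threshold) are all assumptions; only the status names come from the list above.

```python
def classify_batch_result(confidences, threshold=0.85, already_linked=False):
    """Map candidate confidence scores to a batch result status.

    confidences: one score in [0, 1] per candidate returned by the provider.
    threshold:   hypothetical default for the configurable threshold.
    """
    if already_linked:
        return "already_linked"
    if not confidences:
        return "no_results"
    confident = [c for c in confidences if c >= threshold]
    if not confident:
        return "low_confidence"
    if len(confident) > 1:
        return "too_many_results"  # ambiguous: several strong candidates
    return "auto_matched"
```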

Metadata Refresh

Update approved links with latest data from providers
Change tracking reports per series/book
Non-destructive: only updates when provider has new data

Field Locking

Individual book fields can be locked to prevent external sync from overwriting manual edits

External Integrations

Komga Sync

Import reading progress from a Komga server
Matches local series/books by name
Detailed sync report: matched, already read, newly marked, unmatched

Prowlarr (Indexer Search)

Search Prowlarr for missing volumes in a series
Volume pattern matching against release titles
Results: title, size, seeders/leechers, download URL, matched missing volumes

qBittorrent

Add torrents directly from Prowlarr search results
Connection test endpoint

Page Rendering & Caching

Page Extraction

Render any page from supported archive formats
1-indexed page numbers

Image Processing

Output formats: original, JPEG, PNG, WebP
Quality parameter (1–100)
Max width parameter (1–2160 px)
Configurable resampling filter: lanczos3, nearest, triangle/bilinear
Concurrent render limit (default 8), enforced with a semaphore

Caching

LRU in-memory cache: 512 entries
Disk cache: SHA256-keyed, two-level directory structure
Cache key = hash(path + page + format + quality + width)
Configurable cache directory and max size
Manual cache clear via settings
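
The disk cache layout can be sketched as follows. The field separator, the use of the first four hex characters for the two directory levels, and the function names are assumptions; the source only states that the key is a SHA256 hash over path, page, format, quality, and width, stored in a two-level directory structure.

```python
import hashlib
from pathlib import Path

def cache_key(path, page, fmt, quality, width):
    """SHA256 hex digest over the render parameters."""
    # A NUL separator keeps ("a", 12) distinct from ("a1", 2).
    raw = "\x00".join(str(part) for part in (path, page, fmt, quality, width))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()

def cache_path(cache_dir, key, fmt):
    """Place the entry under two directory levels derived from the key."""
    return Path(cache_dir) / key[:2] / key[2:4] / f"{key}.{fmt}"
```

Spreading entries over nested directories keeps any single directory from accumulating a very large number of files.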

Background Jobs

Job Types

Type	Description
`rebuild`	Incremental scan
`full_rebuild`	Full filesystem rescan
`rescan`	Deep rescan for new formats
`thumbnail_rebuild`	Generate missing thumbnails
`thumbnail_regenerate`	Clear and regenerate all thumbnails
`cbr_to_cbz`	Convert RAR to ZIP
`metadata_batch`	Auto-match series to metadata
`metadata_refresh`	Update approved metadata links

Job Lifecycle

Status flow: pending → running → success | failed | cancelled
Intermediate statuses: extracting_pages, generating_thumbnails
Real-time progress via Server-Sent Events (SSE)
Per-file error tracking (non-fatal: job continues on errors)
Cancellation support for pending/running jobs
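
The core status flow can be expressed as a table of allowed transitions. The sketch covers only the five main statuses; where the intermediate statuses (`extracting_pages`, `generating_thumbnails`) fit is not specified above, so they are left out. The names `ALLOWED_TRANSITIONS`, `can_transition`, and `is_terminal` are hypothetical.

```python
# Main flow only: pending -> running -> success | failed | cancelled.
ALLOWED_TRANSITIONS = {
    "pending": {"running", "cancelled"},
    "running": {"success", "failed", "cancelled"},
    "success": set(),
    "failed": set(),
    "cancelled": set(),
}

def can_transition(current, new):
    """True if a job may move from status current to status new."""
    return new in ALLOWED_TRANSITIONS.get(current, set())

def is_terminal(status):
    """True for statuses that allow no further transition."""
    return status in ALLOWED_TRANSITIONS and not ALLOWED_TRANSITIONS[status]
```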

Progress Tracking

Percentage (0–100), current file, processed/total counts
Timing: started_at, finished_at, phase2_started_at
Stats JSON blob with job-specific metrics

Authentication & Security

Token System

Bootstrap token: admin token via API_BOOTSTRAP_TOKEN env var
API tokens: create, list, revoke with scopes
Token format: stl_{prefix}_{secret} with Argon2 hashing
Expiration dates, last usage tracking, revocation
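
Splitting a token of the form `stl_{prefix}_{secret}` might look like the sketch below. It assumes the prefix contains no underscore while the secret may, and it leaves out the Argon2 verification step, which needs a third-party library. The function name `parse_token` is hypothetical.

```python
def parse_token(token):
    """Split 'stl_{prefix}_{secret}' into (prefix, secret), or return None."""
    parts = token.split("_", 2)  # at most three pieces; the secret keeps any '_'
    if len(parts) != 3 or parts[0] != "stl" or not parts[1] or not parts[2]:
        return None
    return parts[1], parts[2]
```

In a scheme like this, the prefix would typically serve as the lookup key for the stored hash, and the secret would then be checked against it.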

Access Control

Two scopes: admin (full access) and read (read-only)
Route-level middleware enforcement
Rate limiting: configurable sliding window (default 120 req/s)
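
A sliding-window limiter can be sketched with a queue of request timestamps. The class below is a single-client illustration with hypothetical names; per-client keying and the way the real middleware is wired are not covered here. The defaults mirror the 120 requests per second mentioned above.

```python
from collections import deque

class SlidingWindowLimiter:
    """Allow at most `limit` requests within any `window` seconds."""

    def __init__(self, limit=120, window=1.0):
        self.limit = limit
        self.window = window
        self._hits = deque()  # timestamps of accepted requests, oldest first

    def allow(self, now):
        """Record a request at time `now` (seconds); return True if accepted."""
        # Drop timestamps that have slid out of the window.
        while self._hits and now - self._hits[0] >= self.window:
            self._hits.popleft()
        if len(self._hits) >= self.limit:
            return False
        self._hits.append(now)
        return True
```

Passing the clock value in as `now` keeps the limiter deterministic; a caller would typically supply `time.monotonic()`.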

Backoffice (Web UI)

Dashboard

Statistics cards: books, series, authors, libraries
Donut charts: reading status breakdown, format distribution
Bar charts: books per language
Per-library reading progress bars
Top series by book/page count
Monthly addition timeline
Metadata coverage stats

Interactive Features

Real-time search with suggestions
Metadata search and matching modals
Prowlarr search modal for missing volumes
Folder browser/picker for library paths
Book/series editing forms
Quick reading status toggles
CBR to CBZ conversion trigger

API

Documentation

OpenAPI/Swagger UI available at /swagger-ui
Health check (/health), readiness (/ready), Prometheus metrics (/metrics)

Public Endpoints (no auth)

GET /health, GET /ready, GET /metrics, GET /swagger-ui

Read Endpoints (read scope)

Libraries, books, series, authors listing and detail
Book pages and thumbnails
Reading progress get/update
Full-text search, collection statistics

Admin Endpoints (admin scope)

Library CRUD and configuration
Book metadata editing, CBR conversion
Series metadata editing
Indexing job management (trigger, cancel, stream)
API token management
Metadata operations (search, match, approve, reject, batch, refresh)
External integrations (Prowlarr, qBittorrent, Komga)
Application settings and cache management

Database

Key Design Decisions

PostgreSQL with pg_trgm for full-text search (no external search engine)
All deletions cascade from libraries
Unique constraints: file paths, token prefixes, metadata links (library + series + provider)
Directory mtime caching for incremental scan optimization
Connection pool: 10 (API), 20 (indexer)

Archive Resilience

CBZ: fallback streaming reader if central directory corrupted
CBR: RAR extraction via system unar, fallback to CBZ parsing
PDF: pdfinfo for page count, pdftoppm for rendering
EPUB: ZIP-based extraction
FD exhaustion detection: aborts when too many consecutive I/O errors occur
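
The consecutive-error check can be pictured as a counter that resets on every success. The threshold of 10 and the class name `ConsecutiveErrorGuard` are assumptions for the sketch; the actual limit is not given above.

```python
class ConsecutiveErrorGuard:
    """Signal an abort after too many I/O errors in a row."""

    def __init__(self, max_consecutive=10):  # hypothetical threshold
        self.max_consecutive = max_consecutive
        self.consecutive = 0

    def record_success(self):
        """A successful file resets the streak."""
        self.consecutive = 0

    def record_error(self):
        """Count one error; return True when the caller should abort."""
        self.consecutive += 1
        return self.consecutive >= self.max_consecutive
```

Isolated errors stay non-fatal under this scheme, while an unbroken run of failures, as happens when file descriptors run out, stops the work early.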
