docs: update docs to match current codebase

imSp4rky
2026-05-08 17:54:45 -06:00
parent 4c981e2ffd
commit 7fb88e9a97
8 changed files with 689 additions and 216 deletions

@@ -13,6 +13,7 @@ unshackle serve # Default: localhost:8786
unshackle serve -h 0.0.0.0 -p 8888 # Listen on all interfaces
unshackle serve --no-key # Disable authentication
unshackle serve --api-only # REST API only, no CDM endpoints
unshackle serve --remote-only # Only expose remote service session endpoints
```
### CLI Options
@@ -28,22 +29,34 @@ unshackle serve --api-only # REST API only, no CDM endpoints
| `--no-key` | `false` | Disable API key authentication (allows all requests) |
| `--debug-api` | `false` | Include tracebacks and stderr in API error responses |
| `--debug` | `false` | Enable debug logging for API operations |
| `--remote-only` | `false` | Only expose remote service session endpoints (health, services, search, session) |
### Configuration
- `api_secret` - Secret key for REST API authentication. Required unless `--no-key` is used. All API requests must include this key via the `X-API-Key` header or `api_key` query parameter.
- `api_secret` - Secret key for REST API authentication. Required unless `--no-key` is used. All API requests must include this key via the `X-Secret-Key` header.
- `compression_level` - Compression level for API payloads (manifests, cache, cookies). `0`=off, `1`=fast, `6`=balanced, `9`=max. Default: `1`.
- `session_ttl` - Session inactivity timeout in seconds. Each request resets the timer. Default: `300`.
- `max_sessions` - Maximum concurrent sessions before the oldest is evicted. Default: `100`.
- `services` - Optional global service allowlist. Only these service tags are exposed. If omitted, all services are available.
- `devices` - List of Widevine device files (.wvd). If not specified, auto-populated from the WVDs directory.
- `playready_devices` - List of PlayReady device files (.prd). If not specified, auto-populated from the PRDs directory.
- `users` - Dictionary mapping user secret keys to their access configuration:
- `devices` - List of Widevine devices this user can access
- `playready_devices` - List of PlayReady devices this user can access
- `username` - Internal logging name for the user (not visible to users)
- `services` - Optional per-user service allowlist. Effective access is the intersection of global and per-user allowlists.
For example,
```yaml
serve:
api_secret: "your-secret-key-here"
compression_level: 1
session_ttl: 300
max_sessions: 100
# services: # global allowlist (optional)
# - EXAMPLE1
# - EXAMPLE2
users:
secret_key_for_jane: # 32bit hex recommended, case-sensitive
devices: # list of allowed Widevine devices for this user
@@ -51,18 +64,18 @@ serve:
playready_devices: # list of allowed PlayReady devices for this user
- my_playready_device
username: jane # only for internal logging, users will not see this name
# services: # per-user allowlist (optional)
# - EXAMPLE1
secret_key_for_james:
devices:
- generic_nexus_4464_l3
username: james
secret_key_for_john:
devices:
- generic_nexus_4464_l3
username: john
# devices can be manually specified by path if you don't want to add it to
# unshackle's WVDs directory for whatever reason
# devices:
# - 'C:\Users\john\Devices\test_devices_001.wvd'
# playready_devices:
# - '/path/to/device.prd'
```
### REST API
@@ -133,3 +146,51 @@ Check for updates from the GitHub repository on startup. Default: `true`.
How often to check for updates, in hours. Default: `24`.
---
## title_cache_enabled (bool)
Enable or disable title metadata caching globally. Default: `true`.
---
## title_cache_time (int)
Title cache duration in seconds. Default: `1800` (30 minutes).
---
## title_cache_max_retention (int)
Maximum cache retention in seconds, used as fallback when the upstream API fails. Default: `86400` (24 hours).
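As a rough illustration of how the three title-cache settings could interact, here is a minimal Python sketch. The function name `cache_decision` and its return values are hypothetical and are not taken from the unshackle codebase; only the defaults (`1800`, `86400`) come from the text above.

```python
def cache_decision(age_seconds: float, upstream_ok: bool,
                   cache_time: int = 1800, max_retention: int = 86400,
                   enabled: bool = True) -> str:
    """Return 'cached', 'refresh', or 'error' for one cached title entry.

    Hypothetical helper: mirrors the documented settings, not the real code.
    """
    if not enabled:
        # Caching disabled: always go upstream, fail if upstream is down.
        return "refresh" if upstream_ok else "error"
    if age_seconds <= cache_time:
        return "cached"      # entry is still fresh
    if upstream_ok:
        return "refresh"     # entry is stale and upstream is reachable
    if age_seconds <= max_retention:
        return "cached"      # stale entry served as fallback
    return "error"           # too old to serve, and upstream failed
```

Under these assumptions, an entry older than `title_cache_time` is only served when the upstream API fails and the entry is still within `title_cache_max_retention`.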
---
## unicode_filenames (bool)
When `false`, replaces non-ASCII characters in output filenames with ASCII equivalents. Default: `false`.
---
## ipinfo_api_key (str)
Optional ipinfo.io token. When set, unshackle uses the ipinfo.io Lite endpoint for IP/geolocation lookups instead of the unauthenticated fallback.
---
## tmdb_api_key (str)
Optional TMDB API key, used for metadata enrichment and IMDb/TMDb tagging.
---
## simkl_client_id (str)
Optional Simkl client ID for metadata lookups.
---
## decrypt_labs_api_key (str)
Optional Decrypt Labs API key, used by services that integrate with Decrypt Labs.
---

@@ -1,6 +1,8 @@
# REST API Documentation
The unshackle REST API allows you to control downloads, search services, and manage jobs remotely. Start the server with `unshackle serve` and access the interactive Swagger UI at `http://localhost:8786/api/docs/`.
The unshackle REST API allows you to control downloads, search services, drive remote downloads from a thin client, and (optionally) co-host the pywidevine/pyplayready CDM. Start the server with `unshackle serve` and access the interactive Swagger UI at `http://localhost:8786/api/docs/`.
The server is built on **aiohttp** (not FastAPI). Implementation lives in `unshackle/commands/serve.py` and `unshackle/core/api/` (`routes.py`, `handlers.py`, `session_store.py`, `input_bridge.py`, `download_manager.py`, `download_worker.py`).
## Quick Start
@@ -8,32 +10,109 @@ The unshackle REST API allows you to control downloads, search services, and man
# Start the server (no authentication)
unshackle serve --no-key
# Start with authentication
unshackle serve # Requires api_secret in unshackle.yaml
# Start with authentication (api_secret in unshackle.yaml)
unshackle serve
# Serve only the REST API (no pywidevine/pyplayready CDM)
unshackle serve --api-only
# Serve only the remote-dl session endpoints (CORS/Cloudflare friendly)
unshackle serve --remote-only
# Disable just one CDM
unshackle serve --no-widevine
unshackle serve --no-playready
# Verbose error responses (tracebacks/stderr in JSON)
unshackle serve --debug-api
```
`serve` flags:
| Flag | Description |
| --- | --- |
| `-h, --host` | Bind host (default `127.0.0.1`) |
| `-p, --port` | Bind port (default `8786`) |
| `--caddy` | Also launch Caddy using `Caddyfile` next to the unshackle config |
| `--api-only` | REST API only; skip the bundled pywidevine/pyplayready CDM endpoints |
| `--no-widevine` | Disable Widevine CDM endpoints |
| `--no-playready` | Disable PlayReady CDM endpoints |
| `--no-key` | Disable API key authentication entirely |
| `--debug-api` | Include tracebacks/stderr in error responses |
| `--debug` | Enable DEBUG-level logging for API operations |
| `--remote-only` | Expose only `/api/health`, `/api/services`, `/api/search`, and `/api/session/*` (implies `--api-only`) |
## Authentication
When `api_secret` is set in `unshackle.yaml`, all API requests require authentication via:
- **Header**: `X-API-Key: your-secret-key-here`
- **Query parameter**: `?api_key=your-secret-key-here`
Use `--no-key` to disable authentication entirely (not recommended for public-facing servers).
When `api_secret` is set in `unshackle.yaml`, all API requests require the **`X-Secret-Key`** header. There is no query-parameter fallback. `/api/health` is always reachable without authentication. `--no-key` disables auth entirely (not recommended for public-facing servers).
```yaml
# unshackle.yaml
serve:
api_secret: "your-secret-key-here"
api_secret: "your-master-secret" # falls back to global users map below
remote_only: false # also toggleable via --remote-only
services: ["EXAMPLE1", "EXAMPLE2"] # optional global service allowlist
users:
user-secret-1:
username: alice
devices: ["my_widevine_l3"] # Widevine WVD names this user may use
playready_devices: ["my_pr_sl2000"] # PlayReady PRD names; defaults to [] (no access)
services: ["EXAMPLE1"] # optional per-user allowlist (intersected with global)
user-secret-2:
username: bob
devices: []
playready_devices: []
```
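For clients written in Python, the header requirement can be sketched with the standard library alone. The helper name `build_request` is hypothetical; the header name `X-Secret-Key` and the unauthenticated `/api/health` route come from the text above.

```python
import urllib.request
from typing import Optional


def build_request(base_url: str, path: str,
                  secret: Optional[str] = None) -> urllib.request.Request:
    """Build a request for the REST API, attaching X-Secret-Key if given."""
    req = urllib.request.Request(base_url.rstrip("/") + path)
    if secret:
        # urllib normalizes the stored name to "X-secret-key"; HTTP header
        # names are case-insensitive, so the server sees the same header.
        req.add_header("X-Secret-Key", secret)
    return req


# /api/health needs no key; every other route does when api_secret is set.
health = build_request("http://localhost:8786", "/api/health")
services = build_request("http://localhost:8786", "/api/services", "your-secret-key-here")
```

Passing either object to `urllib.request.urlopen` would perform the call; that step is left out so the sketch runs without a server.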
### Service allowlists
`config.serve.services` is the global allowlist; `users.<key>.services` further narrows it per key. The effective set is the intersection. Endpoints affected: `/api/services`, `/api/search`, `/api/list-titles`, `/api/list-tracks`, `/api/download`, and all `/api/session/*` routes.
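The intersection rule can be illustrated with a few lines of Python. This is a sketch, not the server's implementation: `effective_services` is a hypothetical name, and it assumes an omitted per-user list means "no additional restriction", mirroring the documented behaviour of the global list.

```python
from typing import Iterable, Optional, Set


def effective_services(available: Iterable[str],
                       global_allow: Optional[Iterable[str]] = None,
                       user_allow: Optional[Iterable[str]] = None) -> Set[str]:
    """Return the service tags a caller may use.

    None means the allowlist was omitted (assumed to impose no restriction).
    """
    allowed = set(available)
    if global_allow is not None:
        allowed &= set(global_allow)   # global allowlist narrows first
    if user_allow is not None:
        allowed &= set(user_allow)     # per-user allowlist narrows further
    return allowed
```

With `available = ["EXAMPLE1", "EXAMPLE2", "EXAMPLE3"]`, a global list of `["EXAMPLE1", "EXAMPLE2"]`, and a per-user list of `["EXAMPLE1"]`, only `EXAMPLE1` remains.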
### CDM access (server-side decryption)
There is no separate "tier" flag. Whether the server can return KID:KEY for a session-mode download depends solely on the device lists configured for the calling user key:
- Empty `devices` and `playready_devices` -> server can only proxy CDM challenges; the client must run its own CDM and parse the license.
- Populated lists -> the client may set `mode: "server_cdm"` on `/api/session/{id}/license` and receive `{ "keys": { "<track_id>": { "<KID>": "<KEY>" } } }` instead of raw license bytes.
Per-service CDM type can be pinned via `config.cdm` (`widevine`/`playready`) or per-service `cdm_type`; otherwise the server picks the type the user has devices for.
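A client could use the rule above to decide which `mode` to request. The following Python sketch is illustrative only; `available_license_modes` is a hypothetical helper, and it encodes just the documented rule that `server_cdm` needs a matching device list for the DRM type.

```python
from typing import Dict, List, Sequence


def available_license_modes(devices: Sequence[str],
                            playready_devices: Sequence[str]) -> Dict[str, List[str]]:
    """Map each DRM type to the license modes a user key can request."""
    # Proxying a client-built challenge never needs a server-side device.
    modes = {"widevine": ["proxy"], "playready": ["proxy"]}
    if devices:
        modes["widevine"].append("server_cdm")
    if playready_devices:
        modes["playready"].append("server_cdm")
    return modes
```

For the `bob` entry in the YAML example (both lists empty), only `proxy` is available for either DRM type.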
---
## Endpoint Map
Standard endpoints. The R column shows whether each route stays available (`ok`) or is suppressed (`hidden`) in `--remote-only` mode:
| Method | Path | R |
| --- | --- | :-: |
| GET | `/api/health` | ok |
| GET | `/api/services` | ok |
| POST | `/api/search` | ok |
| POST | `/api/list-titles` | hidden |
| POST | `/api/list-tracks` | hidden |
| POST | `/api/download` | hidden |
| GET | `/api/download/jobs` | hidden |
| GET | `/api/download/jobs/{job_id}` | hidden |
| DELETE | `/api/download/jobs/{job_id}` | hidden |
| POST | `/api/session/create` | ok |
| GET | `/api/session/{session_id}` | ok |
| DELETE | `/api/session/{session_id}` | ok |
| GET | `/api/session/{session_id}/titles` | ok |
| POST | `/api/session/{session_id}/tracks` | ok |
| POST | `/api/session/{session_id}/segments` | ok |
| POST | `/api/session/{session_id}/license` | ok |
| GET | `/api/session/{session_id}/prompt` | ok |
| POST | `/api/session/{session_id}/prompt` | ok |
CDM endpoints (`/{wvd}/...`, `/playready/{prd}/...`) are exposed unless `--api-only` / `--remote-only` / `--no-widevine` / `--no-playready` is set, and use pywidevine / pyplayready's own auth scheme.
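The `--remote-only` filtering in the table above can be summarized as a small predicate. This is a sketch in Python under the assumption that the rule is exactly "three fixed routes plus everything under `/api/session/`"; the names `REMOTE_ONLY_EXACT` and `exposed_in_remote_only` are hypothetical.

```python
# Routes that stay reachable in --remote-only mode, per the endpoint map.
REMOTE_ONLY_EXACT = {"/api/health", "/api/services", "/api/search"}


def exposed_in_remote_only(path: str) -> bool:
    """Return True if a request path is still served in --remote-only mode."""
    return path in REMOTE_ONLY_EXACT or path.startswith("/api/session/")
```

A thin client can use such a check to fail fast before calling a route like `/api/download` against a remote-only server.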
---
## Endpoints
### GET /api/health
Health check with version and update information.
Health check with version and update information. Always reachable without auth.
```bash
curl http://localhost:8786/api/health
@@ -55,13 +134,13 @@ curl http://localhost:8786/api/health
### GET /api/services
List all available streaming services.
List all available streaming services (filtered by the effective allowlist for the caller).
```bash
curl http://localhost:8786/api/services
curl -H "X-Secret-Key: $KEY" http://localhost:8786/api/services
```
Returns an array of services with `tag`, `aliases`, `geofence`, `title_regex`, `url`, and `help` text.
Returns `{"services": [...]}`. Each entry has `tag`, `aliases`, `geofence`, `title_regex`, `url` (from `cli.short_help`), `help` (full docstring), and `cli_params` describing the service-level Click parameters.
---
@@ -84,8 +163,9 @@ Search for titles from a streaming service.
```bash
curl -X POST http://localhost:8786/api/search \
-H "X-Secret-Key: $KEY" \
-H "Content-Type: application/json" \
-d '{"service": "EXAMPLE", "query": "example show"}'
-d '{"service": "EXAMPLE1", "query": "example show"}'
```
```json
@@ -107,7 +187,7 @@ curl -X POST http://localhost:8786/api/search \
### POST /api/list-titles
Get available titles (seasons/episodes/movies) for a service and title ID.
Get available titles (seasons/episodes/movies) for a service and title ID. Disabled in `--remote-only` mode.
**Required parameters:**
| Parameter | Type | Description |
@@ -117,31 +197,16 @@ Get available titles (seasons/episodes/movies) for a service and title ID.
```bash
curl -X POST http://localhost:8786/api/list-titles \
-H "X-Secret-Key: $KEY" \
-H "Content-Type: application/json" \
-d '{"service": "EXAMPLE", "title_id": "abc123def456"}'
```
```json
{
"titles": [
{
"type": "episode",
"name": "Pilot",
"series_title": "Example Show",
"season": 1,
"number": 1,
"year": 2024,
"id": "abc123def789"
}
]
}
-d '{"service": "EXAMPLE1", "title_id": "abc123def456"}'
```
---
### POST /api/list-tracks
Get video, audio, and subtitle tracks for a title.
Get video, audio, and subtitle tracks for a title. Disabled in `--remote-only` mode.
**Required parameters:**
| Parameter | Type | Description |
@@ -157,23 +222,13 @@ Get video, audio, and subtitle tracks for a title.
| `proxy` | string | `null` | Proxy URI or country code |
| `no_proxy` | boolean | `false` | Disable all proxy use |
```bash
curl -X POST http://localhost:8786/api/list-tracks \
-H "Content-Type: application/json" \
-d '{
"service": "EXAMPLE",
"title_id": "abc123def456",
"wanted": ["S01E01"]
}'
```
Returns video, audio, and subtitle tracks with codec, bitrate, resolution, language, and DRM information.
---
### POST /api/download
Start a download job. Returns immediately with a job ID (HTTP 202).
Start a download job. Returns immediately with a job ID (HTTP 202). Disabled in `--remote-only` mode.
**Required parameters:**
| Parameter | Type | Description |
@@ -185,8 +240,8 @@ Start a download job. Returns immediately with a job ID (HTTP 202).
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `quality` | array[int] | best | Resolution(s) (e.g., `[1080, 2160]`) |
| `vcodec` | string or array | any | Video codec(s): `H264`, `H265`/`HEVC`, `VP9`, `AV1`, `VC1` |
| `acodec` | string or array | any | Audio codec(s): `AAC`, `AC3`, `EC3`, `AC4`, `OPUS`, `FLAC`, `ALAC`, `DTS` |
| `vcodec` | string or array | any | Video codec(s): `H264`, `H265`/`HEVC`, `VP9`, `AV1`, `VC1`, `VP8` |
| `acodec` | string or array | any | Audio codec(s): `AAC`, `AC3`, `EC3`, `AC4`, `OPUS`, `FLAC`, `ALAC`, `DTS`, `OGG` |
| `vbitrate` | int | highest | Video bitrate in kbps |
| `abitrate` | int | highest | Audio bitrate in kbps |
| `range` | array[string] | `["SDR"]` | Color range(s): `SDR`, `HDR10`, `HDR10+`, `HLG`, `DV`, `HYBRID` |
@@ -247,7 +302,7 @@ Start a download job. Returns immediately with a job ID (HTTP 202).
| `no_proxy` | boolean | `false` | Disable all proxy use |
| `workers` | int | `null` | Max threads per track download |
| `downloads` | int | `1` | Concurrent track downloads |
| `slow` | string | `null` | Add delay between titles. `true` for 60-120s, or `MIN-MAX` (e.g. `20-40`) |
| `slow` | boolean | `false` | Add 60-120s delay between titles |
| `best_available` | boolean | `false` | Continue if requested quality unavailable |
| `skip_dl` | boolean | `false` | Skip download, only get decryption keys |
| `export` | boolean | `false` | Export manifest, track URLs, keys, and subtitles to JSON in the exports directory |
@@ -259,9 +314,10 @@ Start a download job. Returns immediately with a job ID (HTTP 202).
```bash
curl -X POST http://localhost:8786/api/download \
-H "X-Secret-Key: $KEY" \
-H "Content-Type: application/json" \
-d '{
"service": "EXAMPLE",
"service": "EXAMPLE1",
"title_id": "abc123def456",
"wanted": ["S01E01"],
"quality": [1080, 2160],
@@ -285,25 +341,18 @@ curl -X POST http://localhost:8786/api/download \
### GET /api/download/jobs
List all download jobs with optional filtering and sorting.
List all download jobs with optional filtering and sorting. Disabled in `--remote-only` mode.
**Query parameters:**
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `status` | string | all | Filter by status: `queued`, `downloading`, `completed`, `failed`, `cancelled` |
| `service` | string | all | Filter by service tag |
| `sort_by` | string | `created_time` | Sort field: `created_time`, `status`, `service` |
| `sort_by` | string | `created_time` | Sort field: `created_time`, `started_time`, `completed_time`, `progress`, `status`, `service` |
| `sort_order` | string | `desc` | Sort order: `asc`, `desc` |
```bash
# List all jobs
curl http://localhost:8786/api/download/jobs
# Filter by status
curl "http://localhost:8786/api/download/jobs?status=completed"
# Filter by service
curl "http://localhost:8786/api/download/jobs?service=EXAMPLE"
curl -H "X-Secret-Key: $KEY" "http://localhost:8786/api/download/jobs?status=completed"
```
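When building job-list queries programmatically, validating the filter values up front avoids a round trip. The sketch below uses only the Python standard library; `jobs_url` is a hypothetical helper, and the allowed values are copied from the query-parameter table above.

```python
from typing import Optional
from urllib.parse import urlencode

STATUSES = {"queued", "downloading", "completed", "failed", "cancelled"}
SORT_FIELDS = {"created_time", "started_time", "completed_time",
               "progress", "status", "service"}


def jobs_url(base: str, status: Optional[str] = None, service: Optional[str] = None,
             sort_by: str = "created_time", sort_order: str = "desc") -> str:
    """Build a GET /api/download/jobs URL, rejecting undocumented values."""
    if status is not None and status not in STATUSES:
        raise ValueError(f"unknown status: {status}")
    if sort_by not in SORT_FIELDS:
        raise ValueError(f"unknown sort field: {sort_by}")
    if sort_order not in {"asc", "desc"}:
        raise ValueError(f"unknown sort order: {sort_order}")
    params = {"status": status, "service": service,
              "sort_by": sort_by, "sort_order": sort_order}
    # Drop filters that were not supplied so the server applies its defaults.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{base.rstrip('/')}/api/download/jobs?{query}"
```

The request itself still needs the `X-Secret-Key` header unless the server runs with `--no-key`.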
---
@@ -312,19 +361,15 @@ curl "http://localhost:8786/api/download/jobs?service=EXAMPLE"
Get detailed information about a specific download job including progress, parameters, and error details.
```bash
curl http://localhost:8786/api/download/jobs/504db959-80b0-446c-a764-7924b761d613
```
```json
{
"job_id": "504db959-80b0-446c-a764-7924b761d613",
"status": "completed",
"created_time": "2026-02-27T18:00:00.000000",
"service": "EXAMPLE",
"service": "EXAMPLE1",
"title_id": "abc123def456",
"progress": 100.0,
"parameters": { ... },
"parameters": { },
"started_time": "2026-02-27T18:00:01.000000",
"completed_time": "2026-02-27T18:00:15.000000",
"output_files": [],
@@ -337,13 +382,339 @@ curl http://localhost:8786/api/download/jobs/504db959-80b0-446c-a764-7924b761d61
### DELETE /api/download/jobs/{job_id}
Cancel a queued or running download job.
Cancel a queued or running download job. Returns 400 if the job has already terminated.
```bash
curl -X DELETE http://localhost:8786/api/download/jobs/504db959-80b0-446c-a764-7924b761d613
---
## Remote Service Sessions
These endpoints back the `RemoteService` adapter in `unshackle/core/remote_service.py`. They let a thin `dl` client (or any consumer) authenticate against a service on the server, fetch titles/tracks/manifests, and either proxy CDM challenges or have the server resolve KID:KEY directly. The `dl` command's `RemoteService` adapter replaces the old `remote_dl` command. These endpoints are the only `/api/*` routes available in `--remote-only` mode (in addition to `health`, `services`, and `search`).
### POST /api/session/create
Authenticate against a service and open a session. Body fields:
| Field | Type | Description |
| --- | --- | --- |
| `service` | string | Service tag (required) |
| `title_id` | string | Title ID/URL (required) |
| `credentials` | object | Auth credentials forwarded to `Service.authenticate` |
| `cookies` | string | Cookie blob (Netscape or JSON) |
| `proxy` | string | Proxy URI or country code |
| `no_proxy` | bool | Force-disable proxies |
| `profile` | string | Profile name |
| `cache` | object | Optional pre-warmed title cache payload |
If the service requires interactive input during authentication, poll `GET /api/session/{id}/prompt` and submit responses via `POST /api/session/{id}/prompt` until status is `authenticated`.
**Request:**
```json
{
"service": "EXAMPLE1",
"title_id": "abc123def456",
"credentials": {"username": "alice", "password": "hunter2"},
"cookies": "# Netscape HTTP Cookie File\n...",
"proxy": "us",
"no_proxy": false,
"profile": "default",
"cache": {}
}
```
Returns confirmation on success, or an error if the job has already completed or been cancelled.
**Response (202-style; auth runs asynchronously):**
```json
{
"session_id": "f1c4a8b2-9c7e-4d2a-bf91-2d3e4f5a6b7c",
"service": "EXAMPLE1",
"status": "authenticating"
}
```
### GET /api/session/{session_id}
Returns session metadata. 404 if expired or unknown.
```json
{
"session_id": "f1c4a8b2-9c7e-4d2a-bf91-2d3e4f5a6b7c",
"service": "EXAMPLE1",
"valid": true,
"expires_in": 3600,
"track_count": 0,
"title_count": 0
}
```
### DELETE /api/session/{session_id}
Tears down the session, cancels any pending prompts, and returns any updated per-session cache files (base64-encoded, zlib-compressed) so the client can re-warm next time.
```json
{
"status": "ok",
"cache": {
"tokens": "eJzLSM3JyVcozy/KSVGo5AIAGgQEvQ=="
}
}
```
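On the client side, the returned cache entries need to be decoded before they can be stored. The following Python sketch assumes each value is standard base64 wrapping a plain zlib stream, as the description above suggests; the helper names `decode_cache` and `encode_cache` are hypothetical and the exact payload layout should be checked against the server code.

```python
import base64
import zlib
from typing import Dict


def decode_cache(cache: Dict[str, str]) -> Dict[str, bytes]:
    """Turn {name: base64(zlib(data))} into {name: data}."""
    return {name: zlib.decompress(base64.b64decode(blob))
            for name, blob in cache.items()}


def encode_cache(files: Dict[str, bytes], level: int = 1) -> Dict[str, str]:
    """Inverse of decode_cache; level mirrors the compression_level setting."""
    return {name: base64.b64encode(zlib.compress(data, level)).decode("ascii")
            for name, data in files.items()}
```

`encode_cache` produces a payload in the same assumed shape, which could then be supplied as the `cache` field of a later `POST /api/session/create`.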
### GET /api/session/{session_id}/titles
Returns the resolved titles list.
```json
{
"session_id": "f1c4a8b2-9c7e-4d2a-bf91-2d3e4f5a6b7c",
"titles": [
{
"type": "episode",
"name": "Pilot",
"series_title": "Example Show",
"season": 1,
"number": 1,
"year": 2024,
"id": "ep-0001",
"language": "en"
},
{
"type": "movie",
"name": "Example Movie",
"year": 2024,
"id": "mov-0001",
"language": "en"
}
]
}
```
### POST /api/session/{session_id}/tracks
**Request:**
```json
{"title_id": "ep-0001"}
```
**Response:**
```json
{
"title": {
"type": "episode",
"name": "Pilot",
"series_title": "Example Show",
"season": 1,
"number": 1,
"year": 2024,
"id": "ep-0001",
"language": "en"
},
"video": [
{
"id": "v-1080p-h264",
"codec": "H264",
"codec_display": "H.264",
"bitrate": 6000,
"width": 1920,
"height": 1080,
"resolution": "1920x1080",
"fps": "23.976",
"range": "SDR",
"range_display": "SDR",
"language": "en",
"drm": [
{
"type": "widevine",
"pssh": "AAAAW3Bzc2gAAAAA7e+...",
"kids": ["abcdef0123456789abcdef0123456789"],
"license_url": "https://license.example.com/widevine"
}
],
"descriptor": "DASH",
"url": "https://cdn.example.com/manifest.mpd"
}
],
"audio": [
{
"id": "a-en-eac3",
"codec": "EC3",
"codec_display": "Dolby Digital Plus",
"bitrate": 640,
"channels": "5.1",
"language": "en",
"atmos": false,
"descriptive": false,
"drm": null,
"descriptor": "DASH",
"url": "https://cdn.example.com/manifest.mpd"
}
],
"subtitles": [
{
"id": "s-en-vtt",
"codec": "WebVTT",
"language": "en",
"forced": false,
"sdh": false,
"cc": false,
"descriptor": "DASH",
"url": "https://cdn.example.com/subs/en.vtt"
}
],
"chapters": [
{"timestamp": "00:00:00.000", "name": "Chapter 1"}
],
"attachments": [],
"manifests": [
{
"type": "dash",
"url": "https://cdn.example.com/manifest.mpd",
"data": "eJzNVk1v2zAM/Ss..."
}
],
"session_headers": {
"User-Agent": "Mozilla/5.0 ..."
},
"session_cookies": {
"session": "abc123"
},
"server_cdm_type": "widevine"
}
```
### POST /api/session/{session_id}/segments
**Request:**
```json
{"track_ids": ["v-1080p-h264", "a-en-eac3"]}
```
**Response:**
```json
{
"tracks": {
"v-1080p-h264": {
"descriptor": "DASH",
"url": "https://cdn.example.com/manifest.mpd",
"drm": [
{
"type": "widevine",
"pssh": "AAAAW3Bzc2gAAAAA7e+...",
"kids": ["abcdef0123456789abcdef0123456789"],
"license_url": "https://license.example.com/widevine"
}
],
"headers": {"User-Agent": "Mozilla/5.0 ..."},
"cookies": {"session": "abc123"},
"data": {}
},
"a-en-eac3": {
"descriptor": "DASH",
"url": "https://cdn.example.com/manifest.mpd",
"drm": null,
"headers": {"User-Agent": "Mozilla/5.0 ..."},
"cookies": {"session": "abc123"},
"data": {}
}
}
}
```
### POST /api/session/{session_id}/license
Two modes, selected by the `mode` field.
**`mode: "proxy"` (default)** -- forward a client-built CDM challenge to the service's license endpoint.
Request:
```json
{
"mode": "proxy",
"track_id": "v-1080p-h264",
"challenge": "CAESxQEK...",
"drm_type": "widevine",
"pssh": "AAAAW3Bzc2gAAAAA7e+..."
}
```
Response:
```json
{"license": "CAIS3wIK..."}
```
**`mode: "server_cdm"`** -- the server uses its own CDM to license the track and extract keys. Single-track form takes `track_id`; batch form takes `track_ids`. Requires the calling user key to have a matching device (`devices` for Widevine, `playready_devices` for PlayReady) in `unshackle.yaml`.
Request (batch):
```json
{
"mode": "server_cdm",
"track_ids": ["v-1080p-h264", "a-en-eac3"],
"drm_type": "widevine"
}
```
Response:
```json
{
"keys": {
"v-1080p-h264": {
"abcdef0123456789abcdef0123456789": "00112233445566778899aabbccddeeff"
},
"a-en-eac3": {
"abcdef0123456789abcdef0123456789": "00112233445566778899aabbccddeeff"
}
},
"drm_type": "widevine"
}
```
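Decryption tools usually expect a flat list of `KID:KEY` pairs rather than a per-track mapping. A minimal Python sketch of that conversion, assuming the response shape shown above (`kid_key_pairs` is a hypothetical helper name):

```python
from typing import Dict, List


def kid_key_pairs(response: Dict) -> List[str]:
    """Flatten a server_cdm response into unique 'KID:KEY' strings."""
    unique: Dict[str, str] = {}
    for track_keys in response.get("keys", {}).values():
        for kid, key in track_keys.items():
            unique[kid] = key   # tracks often share a KID; keep one entry
    return [f"{kid}:{key}" for kid, key in unique.items()]
```

Because both tracks in the example response share one KID, the flattened list contains a single pair.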
### GET /api/session/{session_id}/prompt
Polled by the client during interactive authentication (OTP, PIN, device codes). Backed by the `InputBridge` in `unshackle/core/api/input_bridge.py`; `Service.request_input()` blocks server-side until the client posts a response.
Pending input:
```json
{"status": "pending_input", "prompt": "Enter OTP code: "}
```
Other states:
```json
{"status": "authenticating"}
```
```json
{"status": "authenticated"}
```
```json
{"status": "failed", "error": "Invalid credentials"}
```
### POST /api/session/{session_id}/prompt
Unblocks the server-side `request_input()` call.
Request:
```json
{"response": "123456"}
```
Response:
```json
{"status": "accepted"}
```
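The two prompt endpoints together form a simple polling loop. The Python sketch below shows one way a client might drive it; transport is injected so the logic stays independent of any HTTP library. All names here (`wait_for_auth`, `get_prompt`, `post_response`, `ask`) are hypothetical: `get_prompt` stands for a `GET .../prompt` call returning the parsed JSON, and `post_response` for a `POST .../prompt` call taking the JSON body.

```python
import time
from typing import Callable, Dict


def wait_for_auth(get_prompt: Callable[[], Dict],
                  post_response: Callable[[Dict], object],
                  ask: Callable[[str], str],
                  interval: float = 1.0, max_polls: int = 120,
                  sleep: Callable[[float], None] = time.sleep) -> None:
    """Poll the prompt endpoint until the session is authenticated."""
    for _ in range(max_polls):
        state = get_prompt()
        status = state.get("status")
        if status == "authenticated":
            return
        if status == "failed":
            raise RuntimeError(state.get("error", "authentication failed"))
        if status == "pending_input":
            # Ask the user (OTP, PIN, ...) and unblock the server side.
            post_response({"response": ask(state.get("prompt", ""))})
        else:
            sleep(interval)   # still authenticating; wait before re-polling
    raise TimeoutError("session did not authenticate in time")
```

In an interactive client, `ask` could simply be the built-in `input`.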
---
@@ -357,22 +728,25 @@ All endpoints return consistent error responses:
"error_code": "INVALID_PARAMETERS",
"message": "Invalid vcodec: XYZ. Must be one of: H264, H265, VP9, AV1, VC1, VP8",
"timestamp": "2026-02-27T18:00:00.000000+00:00",
"details": { ... }
"details": { }
}
```
Common error codes:
- `INVALID_INPUT` - Malformed request body
- `INVALID_PARAMETERS` - Invalid parameter values
- `MISSING_SERVICE` - Service tag not provided
- `INVALID_SERVICE` - Service not found
- `SERVICE_ERROR` - Service initialization or runtime error
- `AUTH_FAILED` - Authentication failure
- `NOT_FOUND` - Job or resource not found
- `INTERNAL_ERROR` - Unexpected server error
- `INVALID_INPUT` -- malformed request body
- `INVALID_PARAMETERS` -- invalid parameter values
- `MISSING_SERVICE` -- service tag not provided
- `INVALID_SERVICE` -- service not found or not in the caller's allowlist
- `SERVICE_ERROR` -- service initialization or runtime error
- `AUTH_FAILED` -- authentication failure
- `NOT_FOUND` / `TRACK_NOT_FOUND` / session not found -- job/session/track/title missing
- `INTERNAL_ERROR` -- unexpected server error
When `--debug-api` is enabled, error responses include additional `debug_info` with tracebacks and stderr output.
Authentication errors from the auth middleware are returned as `{"status": 401, "message": "..."}` (not the standard error envelope).
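Since the auth middleware and the handlers use different error shapes, a client may want one function that handles both. A hedged Python sketch follows; `describe_error` is a hypothetical name, and the field names are the ones shown in this section.

```python
from typing import Dict


def describe_error(body: Dict) -> str:
    """Summarize either the standard error envelope or an auth-middleware error."""
    if "error_code" in body:
        # Standard envelope: error_code, message, timestamp, details.
        return f"{body['error_code']}: {body.get('message', '')}"
    if "status" in body and "message" in body:
        # Auth middleware shape: {"status": 401, "message": "..."}.
        return f"HTTP {body['status']}: {body['message']}"
    return "unknown error"
```

Checking for `error_code` first keeps the function correct even if a standard envelope later gains a `status` field.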
---
## Download Job Lifecycle
@@ -385,3 +759,5 @@ downloading -> cancelled
```
Jobs are retained for 24 hours after completion. The server supports up to 2 concurrent downloads by default.
Remote sessions are managed by `SessionStore` (`unshackle/core/api/session_store.py`); idle sessions and their `InputBridge` instances are cleaned up by a background loop started/stopped with the app lifecycle.
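To make the `session_ttl` / `max_sessions` behaviour concrete, here is a toy Python model of a sliding-TTL store. It is not the real `SessionStore`: the class name `TinySessionStore` and its methods are hypothetical, and it assumes "oldest" means the least recently active session.

```python
import time
from collections import OrderedDict
from typing import Callable, List


class TinySessionStore:
    """Toy model: sliding inactivity timeout plus a cap on live sessions."""

    def __init__(self, ttl: float = 300, max_sessions: int = 100,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl
        self.max_sessions = max_sessions
        self._clock = clock
        self._last_seen: "OrderedDict[str, float]" = OrderedDict()

    def touch(self, session_id: str) -> None:
        """Record activity; each request resets the session's timer."""
        self._last_seen[session_id] = self._clock()
        self._last_seen.move_to_end(session_id)
        while len(self._last_seen) > self.max_sessions:
            self._last_seen.popitem(last=False)   # evict least recently active

    def is_alive(self, session_id: str) -> bool:
        seen = self._last_seen.get(session_id)
        return seen is not None and self._clock() - seen <= self.ttl

    def sweep(self) -> List[str]:
        """Drop idle sessions; a background loop would call this periodically."""
        now = self._clock()
        expired = [sid for sid, seen in self._last_seen.items()
                   if now - seen > self.ttl]
        for sid in expired:
            del self._last_seen[sid]
        return expired
```

Injecting `clock` makes the expiry behaviour easy to exercise without waiting for real time to pass.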

@@ -2,103 +2,28 @@
This document covers configuration options related to downloading and processing media content.
## aria2c (dict)
## downloader
- `max_concurrent_downloads`
Maximum number of parallel downloads. Default: `min(32,(cpu_count+4))`
Note: Overrides the `max_workers` parameter of the aria2(c) downloader function.
- `max_connection_per_server`
Maximum number of connections to one server for each download. Default: `1`
- `split`
Split a file into N chunks and download each chunk on its own connection. Default: `5`
- `file_allocation`
Specify file allocation method. Default: `"prealloc"`
unshackle ships a single unified downloader at `unshackle/core/downloaders/requests.py`. The legacy
`aria2c`, `curl_impersonate`, and `n_m3u8dl_re` backends have been removed; their config blocks no
longer have any effect.
- `"none"` doesn't pre-allocate file space.
- `"prealloc"` pre-allocates file space before download begins. This may take some time depending on the size of the
file.
- `"falloc"` is your best choice if you are using newer file systems such as ext4 (with extents support), btrfs, xfs
or NTFS (MinGW build only). It allocates large (few GiB) files almost instantly. Don't use falloc with legacy file
systems such as ext3 and FAT32 because it takes almost the same time as prealloc, and it blocks aria2 entirely until
allocation finishes. falloc may not be available if your system doesn't have posix_fallocate(3) function.
- `"trunc"` uses ftruncate(2) system call or platform-specific counterpart to truncate a file to a specified length.
The unified downloader:
---
- Works with both a standard `requests.Session` and `RnetSession` (rnet/BoringSSL TLS impersonation,
which replaces the previous `curl_cffi` backend). When a service exposes its own session via
`self.session`, TLS fingerprinting is preserved on every segment.
- Uses adaptive chunk sizing between **512 KB and 4 MB**, picked from the response `Content-Length`.
- Spawns **up to `min(16, cpu_count + 4)` worker threads** by default for segmented downloads
(override via `--workers` / `dl.workers`).
- Resumes interrupted downloads via HTTP `Range` requests (a sibling `<file>.!dev` control file
marks an in-progress download).
- Has a single-URL fast path: if the server supports byte ranges and the file is at least 64 MB,
the file is split into 16 MB parts and downloaded in parallel into a pre-allocated file.
- Is selected per-track via `track.downloader`, which defaults to this unified `requests` downloader.
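The worker default and the single-URL fast path described in the list above can be sketched in Python as follows. This is an illustration, not the code in `unshackle/core/downloaders/requests.py`: the function names are hypothetical, and "MB" is assumed to mean binary megabytes (MiB).

```python
import os
from typing import List

PART_SIZE = 16 * 1024 * 1024       # 16 MB parts (assumed to be MiB)
FAST_PATH_MIN = 64 * 1024 * 1024   # fast path needs at least 64 MB


def default_workers() -> int:
    """Default worker count: min(16, cpu_count + 4)."""
    return min(16, (os.cpu_count() or 1) + 4)


def plan_ranges(size: int, accepts_ranges: bool) -> List[str]:
    """Return HTTP Range header values for a parallel single-URL download.

    An empty list means the fast path does not apply and the file would be
    fetched as one sequential stream.
    """
    if not accepts_ranges or size < FAST_PATH_MIN:
        return []
    return [f"bytes={start}-{min(start + PART_SIZE, size) - 1}"
            for start in range(0, size, PART_SIZE)]
```

Each returned value is an inclusive byte range, so a worker writing into a pre-allocated file can seek to the range start and write its part independently.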
## curl_impersonate (dict)
- `browser` - The browser to impersonate. Available browsers and versions are listed here:
<https://github.com/yifeikong/curl_cffi#sessions>
Default: `"chrome124"`
For example,
```yaml
curl_impersonate:
browser: "chrome120"
```
---
## downloader (str | dict)
Choose what software to use to download data throughout unshackle where needed.
You may provide a single downloader globally or a mapping of service tags to
downloaders.
Options:
- `requests` (default) - <https://github.com/psf/requests>
- `aria2c` - <https://github.com/aria2/aria2>
- `curl_impersonate` - <https://github.com/yifeikong/curl-impersonate> (via <https://github.com/yifeikong/curl_cffi>)
- `n_m3u8dl_re` - <https://github.com/nilaoda/N_m3u8DL-RE>
Note that aria2c can reach the highest speeds as it utilizes threading and more connections than the other downloaders. However, aria2c can also be one of the more unstable downloaders. It will work one day, then not another day. It also does not support HTTP(S) proxies natively (non-HTTP proxies are bridged via pproxy).
Note that `n_m3u8dl_re` will automatically fall back to `requests` for track types it does not support, specifically: direct URL downloads, Subtitle tracks, and Attachment tracks.
Example mapping:
```yaml
downloader:
EXAMPLE: requests
EXAMPLE2: n_m3u8dl_re
EXAMPLE3: n_m3u8dl_re
default: requests
```
The `default` entry is optional. If omitted, `requests` will be used for services not listed.
---
## n_m3u8dl_re (dict)
Configuration for N_m3u8DL-RE downloader. This downloader supports HLS, DASH, and ISM (Smooth Streaming) manifests.
It will automatically fall back to the `requests` downloader for unsupported track types (direct URLs, subtitles, attachments).
- `thread_count`
Number of threads to use for downloading. Default: Uses the same value as max_workers from the command
(which defaults to `min(32,(cpu_count+4))`).
- `ad_keyword`
Keyword to identify and potentially skip advertisement segments. Default: `None`
- `use_proxy`
Whether to use proxy when downloading. Default: `true`
- `retry_count`
Number of times to retry failed downloads. Default: `10`
N_m3u8DL-RE also respects the `decryption` config setting. When content keys are provided, it will use
the configured decryption engine (`shaka` or `mp4decrypt`) and automatically locate the corresponding binary.
For example,
```yaml
n_m3u8dl_re:
thread_count: 16
ad_keyword: "advertisement"
use_proxy: true
retry_count: 10
```
There is no `downloader:` config key to set anymore. Setting one to a legacy value will emit a
`DeprecationWarning` and otherwise be ignored.
---
@@ -159,6 +84,8 @@ to a CLI option on the `dl` command. CLI arguments always take priority over con
| `acodec` | str or list | any | Audio codec(s): `AAC`, `AC3`, `EC3`, `AC4`, `OPUS`, `FLAC`, `ALAC`, `DTS` |
| `vbitrate` | int | highest | Video bitrate in kbps |
| `abitrate` | int | highest | Audio bitrate in kbps |
| `vbitrate_range` | str | none | Video bitrate window in kbps, format `MIN-MAX` (e.g., `6000-7000`) |
| `abitrate_range` | str | none | Audio bitrate window in kbps, format `MIN-MAX` |
| `range_` | str or list | `SDR` | Color range(s): `SDR`, `HDR10`, `HDR10+`, `HLG`, `DV`, `HYBRID` |
| `channels` | float | any | Audio channels (e.g., `5.1`, `7.1`) |
| `worst` | bool | `false` | Select the lowest bitrate track within the specified quality. Requires `quality` |
@@ -202,6 +129,7 @@ to a CLI option on the `dl` command. CLI arguments always take priority over con
| `no_source` | bool | `false` | Remove source tag from filename |
| `no_mux` | bool | `false` | Do not mux tracks into a container file |
| `split_audio` | bool | `false` | Create separate output files per audio codec |
| `export` | bool | `false` | Write a JSON sidecar with manifest URLs, subtitles, per-track KID:KEY, codec/track info |
**Metadata enrichment:**
@@ -217,8 +145,8 @@ to a CLI option on the `dl` command. CLI arguments always take priority over con
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `downloads` | int | `1` | Concurrent track downloads |
| `workers` | int | auto | Max threads per track download |
| `slow` | bool | `false` | Add 60-120s delay between titles |
| `workers` | int | `min(16, cpu_count + 4)` | Max threads per track download (segments / ranged parts) |
| `slow` | bool or `MIN-MAX` | `false` | Randomized delay between titles. `true` uses 60-120s; pass `MIN-MAX` (e.g., `20-40`) for a custom range |
| `skip_dl` | bool | `false` | Skip download, only get decryption keys |
| `cdm_only` | bool | `null` | Only use CDM (`true`) or only vaults (`false`) |
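
The `slow` key accepts either a boolean or a `MIN-MAX` string. The sketch below shows one way such a value could be normalized into a delay range. `parse_slow` and `pick_delay` are hypothetical names used for illustration only and are not taken from the unshackle codebase.

```python
import random
from typing import Optional, Tuple, Union

DEFAULT_SLOW_RANGE = (60, 120)  # seconds, the documented range when slow is `true`


def parse_slow(value: Union[bool, str, None]) -> Optional[Tuple[int, int]]:
    """Hypothetical helper: return (min, max) seconds, or None when delays are disabled."""
    if value is None or value is False:
        return None
    if value is True:
        return DEFAULT_SLOW_RANGE
    low_text, sep, high_text = str(value).partition("-")
    if not sep:
        raise ValueError(f"expected MIN-MAX, got {value!r}")
    low, high = int(low_text), int(high_text)
    if low < 0 or low > high:
        raise ValueError(f"invalid delay range: {value!r}")
    return low, high


def pick_delay(value: Union[bool, str, None]) -> float:
    """Pick a randomized delay in seconds; 0.0 when slow mode is off."""
    bounds = parse_slow(value)
    return 0.0 if bounds is None else random.uniform(*bounds)
```

With this sketch, `parse_slow("20-40")` yields `(20, 40)` and `parse_slow(True)` yields the default `(60, 120)`.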


@@ -153,8 +153,16 @@ providers:
### Specific Server Selection
Use `--proxy gluetun:nordvpn:us1239` for specific server selection. Unshackle builds the hostname
automatically based on the provider (e.g., `us1239.nordvpn.com` for NordVPN).
Use a `<country><number>` region (e.g. `us1239`) to target a specific server. Unshackle builds the
hostname automatically per provider:
| Provider | Hostname format |
|----------|-----------------|
| NordVPN | `us1239.nordvpn.com` |
| Surfshark | `us-1239.prod.surfshark.com` |
| ExpressVPN | `us-1239.expressvpn.com` |
| CyberGhost | `us-s1239.cg-dialup.net` |
| Other | `us1239` (passed as-is to `SERVER_HOSTNAMES`) |
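
The table above can be read as a simple lookup from provider to hostname pattern. A minimal sketch follows, assuming a two-letter country code followed by digits. `build_hostname` and `HOSTNAME_FORMATS` are hypothetical names and do not reflect the actual implementation.

```python
import re
from typing import Optional

# Patterns copied from the table above: {cc} is the country code, {num} the server number.
HOSTNAME_FORMATS = {
    "nordvpn": "{cc}{num}.nordvpn.com",
    "surfshark": "{cc}-{num}.prod.surfshark.com",
    "expressvpn": "{cc}-{num}.expressvpn.com",
    "cyberghost": "{cc}-s{num}.cg-dialup.net",
}

# Assumption: a region such as "us1239" is two letters followed by one or more digits.
_REGION = re.compile(r"^([a-z]{2})(\d+)$")


def build_hostname(provider: str, region: str) -> Optional[str]:
    """Return a server hostname for `<country><number>` regions, else None."""
    match = _REGION.match(region.lower())
    if match is None:
        return None  # plain country/region, not a specific server
    cc, num = match.groups()
    # Unknown providers fall back to the raw `<country><number>` value.
    pattern = HOSTNAME_FORMATS.get(provider.lower(), "{cc}{num}")
    return pattern.format(cc=cc, num=num)
```

For example, `build_hostname("surfshark", "us1239")` produces `us-1239.prod.surfshark.com`, while a plain region such as `us` returns `None`.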
### Extra Environment Variables
@@ -187,12 +195,14 @@ proxy_providers:
## Features
- **Container Reuse**: First request takes 10-30s; subsequent requests are instant. Containers from other sessions are also detected and reused.
- **IP Verification**: Automatically verifies VPN exit IP matches requested region (configurable via `verify_ip`)
- **Concurrent Sessions**: Multiple downloads share the same container
- **Specific Servers**: Use `--proxy gluetun:nordvpn:us1239` for specific server selection
- **Automatic Image Pull**: The Gluetun Docker image (`qmcgaw/gluetun:latest`) is pulled automatically on first use
- **Secure Credentials**: Credentials are passed via temporary env files (mode 0600) rather than command-line arguments
- **Container Reuse**: First request takes 10-30s; subsequent requests are instant. Containers created by other unshackle processes are auto-detected via `docker inspect` and reused.
- **Ready Detection**: Waits up to 60s for both the HTTP proxy to listen (`[http proxy] listening`) and the VPN tunnel to come up (`initialization sequence completed` or `public ip address is`) before returning the proxy URI. Bails early on `fatal` or `invalid credentials` log lines.
- **IP Verification**: When `verify_ip: true` (default), looks up the exit IP via `ipinfo.io` through the proxy and compares country code to the requested region. Retries 3 times with exponential backoff (1s, 2s, 4s).
- **Concurrent Sessions**: Multiple downloads share the same container; ports are allocated thread-safely starting at `base_port`.
- **Specific Servers**: Use `--proxy gluetun:nordvpn:us1239` for specific server selection (see table above).
- **Automatic Image Pull**: The Gluetun Docker image (`qmcgaw/gluetun:latest`) is pulled automatically on first use (5 min timeout).
- **Secure Credentials**: Credentials are passed via temporary env files (mode 0600), then zero-overwritten and unlinked after `docker run`. They never appear in process listings.
- **Auto Cleanup**: Containers are removed via `atexit` (Ctrl+C still works normally). Disable with `auto_cleanup: false` to leave them stopped instead.
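
The IP verification retry schedule (1s, 2s, 4s) is ordinary exponential backoff. The sketch below illustrates the pattern in isolation. It assumes one initial attempt followed by up to three retries, which the text above does not state explicitly, and `verify_with_backoff` is a hypothetical name.

```python
import time
from typing import Callable


def verify_with_backoff(
    check: Callable[[], bool],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run `check`, then retry with delays of base_delay * 2**n (1s, 2s, 4s by default)."""
    if check():
        return True
    for attempt in range(retries):
        sleep(base_delay * 2 ** attempt)  # wait before the next retry
        if check():
            return True
    return False
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. A real check would perform the exit IP lookup through the proxy and compare country codes.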
## Container Management


@@ -138,8 +138,8 @@ All requests will use these unless changed explicitly or implicitly via a Server
These should be sane defaults and anything that would only be useful for some Services should not
be put here.
Avoid headers like 'Accept-Encoding' as that would be a compatibility header that curl_cffi will
set for you.
Avoid headers like 'Accept-Encoding' as that would be a compatibility header that the underlying
HTTP backend (rnet) will set for you as part of its browser impersonation profile.
I recommend using,
@@ -149,3 +149,30 @@ User-Agent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML
```
---
## HTTP Session Backend
unshackle uses [`rnet`](https://github.com/0x676e67/rnet) (Rust + BoringSSL) for HTTP with TLS
fingerprinting. `RnetSession` is a drop-in `requests.Session` replacement and is what
`self.session` exposes to services. It supports:
- Browser/app impersonation via named `rnet.Impersonate` presets (Chrome, Edge, Firefox, Safari,
OkHttp, etc.) — picks JA3, ALPN, HTTP/2 SETTINGS and header order to match the chosen client.
- Native rnet proxy support (HTTP, HTTPS, SOCKS5) — used by all proxy providers below.
- Cookie-jar and `requests`-style `data=` / `json=` / `headers=` kwargs for compatibility.
The legacy `curl_cffi` backend has been removed. The config key is still spelled
`curl_impersonate` for backward compatibility, but its value now selects an rnet preset.
### curl_impersonate (dict)
```yaml
curl_impersonate:
browser: Chrome131 # exact rnet.Impersonate preset name
```
`browser` must be an exact `rnet.Impersonate` preset name (e.g. `Chrome131`, `Chrome124`,
`Edge101`, `Firefox133`, `Safari18`, `OkHttp4_12`). See the rnet README for the full list.
Default when unset: `Chrome131`.
---


@@ -38,7 +38,7 @@ This is **required** in your `unshackle.yaml` — a warning is shown if not conf
Available variables: `{title}`, `{year}`, `{season}`, `{episode}`, `{season_episode}`, `{episode_name}`,
`{quality}`, `{resolution}`, `{source}`, `{audio}`, `{audio_channels}`, `{audio_full}`,
`{video}`, `{hdr}`, `{hfr}`, `{atmos}`, `{dual}`, `{multi}`, `{tag}`, `{edition}`, `{repack}`,
`{lang_tag}`
`{lang_tag}`, `{track_number}`, `{artist}`, `{album}`, `{disc}`
Add `?` suffix to make a variable conditional (omitted when empty): `{year?}`, `{hdr?}`, `{repack?}`
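
To illustrate the `?` suffix, here is a minimal sketch of conditional template rendering. It is not unshackle's renderer: `render` is a hypothetical function, and collapsing repeated dots after an omitted variable is an assumption about how separators are cleaned up.

```python
import re
from typing import Mapping

# Matches {name} and {name?}; the optional group captures the trailing "?".
_VAR = re.compile(r"\{(\w+)(\?)?\}")


def render(template: str, values: Mapping[str, object]) -> str:
    """Hypothetical renderer: drop `{var?}` when empty, require plain `{var}`."""

    def substitute(match: re.Match) -> str:
        name, optional = match.group(1), match.group(2)
        value = values.get(name)
        if value is None or value == "":
            if optional:
                return ""  # conditional variable omitted when empty
            raise KeyError(f"missing required template variable: {name}")
        return str(value)

    rendered = _VAR.sub(substitute, template)
    # Assumption: leftover doubled separators are collapsed and trimmed.
    return re.sub(r"\.{2,}", ".", rendered).strip(".")
```

Under these assumptions, `render("{title}.{year?}.{quality}", {"title": "Movie", "quality": "1080p"})` gives `Movie.1080p`, and supplying `year` restores the middle segment.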
@@ -70,17 +70,26 @@ If not configured, the default folder naming is used:
- Series: Derived from the `series` template with episode-specific variables removed
- Songs: `Artist - Album (Year)`
`folder` accepts either a single string (applies to all title kinds) or a mapping with per-kind
templates keyed by `movies`, `series`, and/or `songs`. Unknown keys are warned about and ignored.
```yaml
output_template:
movies: '{title}.{year}.{repack?}.{edition?}.{quality}.{source}.WEB-DL.{dual?}.{multi?}.{audio_full}.{atmos?}.{hdr?}.{hfr?}.{video}-{tag}'
series: '{title}.{year?}.{season_episode}.{episode_name?}.{repack?}.{edition?}.{quality}.{source}.WEB-DL.{dual?}.{multi?}.{audio_full}.{atmos?}.{hdr?}.{hfr?}.{video}-{tag}'
songs: '{track_number}.{title}.{repack?}.{edition?}.{source?}.WEB-DL.{audio_full}.{atmos?}-{tag}'
# Scene-style folder
# Scene-style folder (single template, applies to all kinds)
folder: '{title}.{year?}.{repack?}.{edition?}.{lang_tag?}.{quality}.{source}.WEB-DL.{dual?}.{multi?}.{audio_full}.{atmos?}.{hdr?}.{hfr?}.{video}-{tag}'
# Plex-friendly folder
# folder: '{title} ({year?})'
# Per-kind folder templates
# folder:
# movies: '{title} ({year})'
# series: '{title} ({year?})'
# songs: '{artist} - {album} ({year?})'
```
Example outputs:
@@ -230,6 +239,7 @@ The following directories are available and may be overridden,
- `cache` - Expiring data like Authorization tokens, or other misc data.
- `cookies` - Expiring Cookie data.
- `logs` - Logs.
- `exports` - JSON sidecar exports written when `--export` is used on `dl`.
- `wvds` - Widevine Devices.
- `prds` - PlayReady Devices.
- `dcsl` - Device Certificate Status List.


@@ -26,8 +26,8 @@ EXAMPLE:
### Per-Service Configuration Overrides
You can override many global configuration options on a per-service basis by nesting them under the
service tag in the `services` section. Supported override keys include: `dl`, `aria2c`, `n_m3u8dl_re`,
`curl_impersonate`, `subtitle`, `muxing`, `headers`, and more.
service tag in the `services` section. Supported override keys include: `dl`, `subtitle`, `muxing`,
`headers`, `proxy_map`, and more.
Overrides are merged with global config (not replaced) -- only specified keys are overridden, others
use global defaults. CLI arguments always take priority over service-specific config.
@@ -40,12 +40,52 @@ services:
dl:
downloads: 2 # Limit concurrent track downloads
workers: 4 # Reduce workers to avoid rate limits
n_m3u8dl_re:
thread_count: 4 # Very low thread count
aria2c:
max_concurrent_downloads: 1
headers:
User-Agent: "..." # Service-specific UA override
```
Note: unshackle uses a single unified `requests`-based downloader. The legacy `aria2c`,
`n_m3u8dl_re`, and `curl_impersonate` override sections have been removed.
### Service Class Conventions
Each service directory under `unshackle/services/` exports a class extending
`unshackle.core.service.Service`. The class name must match the directory name (the service tag).
Key class variables (defined on `Service` or by service-level idiom):
- `ALIASES: tuple[str, ...]` — alternative tags accepted on the CLI. Empty by default.
- `GEOFENCE: tuple[str, ...]` — ISO country codes the service is available in. Empty == no geofence.
- `TITLE_RE: str` — regex (with named groups, e.g. `(?P<id>...)`, `(?P<type>...)`) used by the
service to parse the CLI title argument. Service-level idiom, not declared on the base class.
- `NO_SUBTITLES: bool` — service-level idiom indicating the service has no subtitle tracks.
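
As an illustration of the `TITLE_RE` idiom, the sketch below parses either a bare ID or a full URL using named groups. The pattern, the `example.com` URL shape, and the `parse_title` helper are all hypothetical and do not belong to any real service.

```python
import re

# Hypothetical pattern for an imaginary service: accepts a bare ID or a full URL.
TITLE_RE = r"^(?:https?://(?:www\.)?example\.com/(?P<type>movie|series)/)?(?P<id>[a-z0-9-]+)$"


def parse_title(title: str) -> dict:
    """Return the named groups from TITLE_RE, or raise if the title is not recognized."""
    match = re.match(TITLE_RE, title)
    if not match:
        raise ValueError(f"Could not parse title: {title!r}")
    return match.groupdict()
```

When only a bare ID is given, the `type` group is `None`, so a service using this idiom would need its own default for that case.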
`self.*` helpers available after `super().__init__(ctx)`:
- `self.session` — pre-configured HTTP session (`requests.Session`, or `RnetSession` when TLS
impersonation is active). Cookies, headers, proxies pre-applied.
- `self.config` — merged service config (per-service `config.yaml` plus the `services.<TAG>` block
from `unshackle.yaml`).
- `self.log` — `logging.Logger` named for the service class.
- `self.cache` — generic `Cacher` for arbitrary key/value persistence.
- `self.title_cache` — specialized `TitleCacher` for title metadata.
- `self.track_request` — `TrackRequest` built from CLI flags. Fields: `codecs: list[Video.Codec]`,
`ranges: list[Video.Range]` (defaults to `[SDR]`), `best_available: bool`. Services may
read or rewrite these (e.g. force HEVC for HDR ranges).
- `self.credential` — set during `authenticate()`; `None` if cookies-only.
- `self.current_region` — lowercase ISO country code from proxy/geolocation, or `None`.
- `self.request_input(prompt: str) -> str` — interactive prompt. Falls through to `input()`
locally; under `serve`, the attached `InputBridge` relays the prompt to the remote client.
Driving CLI flags (parsed into `self.track_request`):
- `-v` / `--vcodec` — comma-separated `Video.Codec` list (e.g. `H264,H265`).
- `-a` / `--acodec` — comma-separated audio codec list.
- `-r` / `--range` — comma-separated `Video.Range` list (`SDR`, `HDR10`, `HDR10+`, `DV`,
`HYBRID`). Defaults to `[SDR]`.
- `-q` / `--quality` — resolution list.
- `--vbitrate-range` / `--abitrate-range` — `MIN-MAX` kbps windows.
---
## credentials (dict[str, str|list|dict])


@@ -1,39 +1,41 @@
# Subtitle Processing Configuration
This document covers subtitle processing and formatting options.
This document covers subtitle processing and formatting options under the top-level `subtitle:` key in `unshackle.yaml`.
For the canonical example, see `unshackle/unshackle-example.yaml`.
## subtitle (dict)
Control subtitle conversion and SDH (hearing-impaired) stripping behavior.
Control subtitle conversion, SDH (hearing-impaired) stripping, formatting preservation, and output behavior.
- `conversion_method`: How to convert subtitles between formats. Default: `auto`.
- `auto`: Smart routing - use subby for WebVTT/SAMI, pycaption for others.
- `subby`: Always use subby with CommonIssuesFixer.
- `subtitleedit`: Prefer SubtitleEdit when available; otherwise fallback to standard conversion.
- `pycaption`: Use only the pycaption library (no SubtitleEdit, no subby).
- `pysubs2`: Use pysubs2 library (supports SRT, SSA, ASS, WebVTT, TTML, SAMI, MicroDVD, MPL2, TMP formats).
- `auto`: Smart routing - subby for WebVTT/fVTT/SAMI; for SSA/ASS/MicroDVD/MPL2/TMP use SubtitleEdit when available, otherwise pysubs2; standard pycaption/SubtitleEdit pipeline for everything else.
- `subby`: Always use subby with `CommonIssuesFixer` (falls back to standard if the source codec isn't supported by subby).
- `subtitleedit`: Prefer SubtitleEdit when available; otherwise fall back to the standard pycaption pipeline.
- `pycaption`: Use only the pycaption library (no SubtitleEdit, no subby). Limited to SRT, TTML, and WebVTT outputs.
- `pysubs2`: Use pysubs2 (supports SRT, SSA, ASS, WebVTT, TTML, SAMI, MicroDVD, MPL2, TMP).
- `sdh_method`: How to strip SDH cues. Default: `auto`.
- `auto`: Try subby for SRT first, then SubtitleEdit, then filter-subs.
- `subby`: Use subby's SDHStripper. **Note:** Only works with SRT files; other formats will fall back to alternative methods.
- `subtitleedit`: Use SubtitleEdit's RemoveTextForHI when available.
- `filter-subs`: Use the subtitle-filter library.
- `auto`: Try subby for SRT first, then SubtitleEdit (when `conversion_method` is `auto`/`subtitleedit` and the binary is available), then subtitle-filter as the final fallback.
- `subby`: Use subby's `SDHStripper`. **Only operates on SRT**; for other codecs the call returns without stripping.
- `subtitleedit`: Use SubtitleEdit's `/RemoveTextForHI` when the binary is available; otherwise falls through to subtitle-filter.
- `filter-subs`: Use the `subtitle-filter` library directly (`rm_fonts`, `rm_ast`, `rm_music`, `rm_effects`, `rm_names`, `rm_author`).
- `strip_sdh`: Enable/disable automatic SDH (hearing-impaired) cue stripping. Default: `true`.
- `strip_sdh`: Enable/disable automatic SDH stripping for tracks flagged as SDH. Default: `true`.
- `convert_before_strip`: When using `filter-subs` SDH method, automatically convert subtitles to SRT format first for better compatibility. Default: `true`.
- `convert_before_strip`: When falling through to the subtitle-filter path, auto-convert non-SRT subtitles to SRT first for better compatibility. Default: `true`. Has no effect when SubtitleEdit handles stripping directly.
- `preserve_formatting`: Keep original subtitle tags and positioning during conversion. When true, skips pycaption processing for WebVTT files to keep tags like `<i>`, `<b>`, and positioning intact. Default: `true`.
- `preserve_formatting`: Keep original subtitle tags and positioning during WebVTT processing. When `true`, sanitized WebVTT is written back without round-tripping through pycaption, preserving tags like `<i>`, `<b>`, and `line:` positioning. Default: `true`.
- `output_mode`: Controls how subtitles are included in the output. Default: `mux`.
- `mux`: Embed subtitles in the MKV container only.
- `sidecar`: Save subtitles as separate files only (not muxed into the container).
- `both`: Embed subtitles in the MKV container and save as sidecar files.
- `sidecar`: Save subtitles as separate files only (not muxed).
- `both`: Embed in the MKV container and save as sidecar files.
- `sidecar_format`: Format for sidecar subtitle files (used when `output_mode` is `sidecar` or `both`). Default: `srt`.
- `srt`: SubRip format.
- `vtt`: WebVTT format.
- `ass`: Advanced SubStation Alpha format.
- `srt`: SubRip.
- `vtt`: WebVTT.
- `ass`: Advanced SubStation Alpha.
- `original`: Keep the subtitle in its current format without conversion.
Example:
@@ -49,4 +51,23 @@ subtitle:
sidecar_format: srt
```
## WebVTT Sanitization (automatic, not configurable)
After download, WebVTT and segmented WebVTT (`fVTT`/`WVTT`) tracks pass through a fixed sanitization pipeline before any conversion or muxing:
1. **Segment merge** — segmented DASH/HLS WebVTT is stitched via `merge_segmented_webvtt` (uses pysubs2 for lenient parsing when `conversion_method` is `auto` or `pysubs2`, otherwise pycaption directly).
2. **Negative timestamps** — `sanitize_webvtt_timestamps` rewrites `-HH:MM:SS.mmm` cues to `00:00:00.000`.
3. **Cue identifiers** — `sanitize_webvtt_cue_identifiers` strips letter+digit IDs (e.g. `Q0`, `S12`) on their own line before a timing line, which otherwise confuse parsers like pysubs2.
4. **Overlapping cues** — `merge_overlapping_webvtt_cues` collapses cues with start times within 50 ms and matching end times into a single multi-line cue, ordered by `line:` percentage (lower % = higher on screen = first line).
5. **Fallback hardening** — when `preserve_formatting` is `false` and the first pycaption parse fails, `sanitize_webvtt` retries with a `WEBVTT` header guard, hour-padded timings, and another negative-timestamp pass; if that still fails, the sanitized text is written as-is.
`sanitize_broken_webvtt` and `space_webvtt_headers` additionally run inside `Subtitle.parse()` to drop malformed `-->` lines and reflow merged-segment headers. `merge_same_cues` and `filter_unwanted_cues` (drops `&nbsp;`/whitespace-only cues) run only on the pycaption path.
These behaviors are intentional and have no config knobs — they apply to every WebVTT track regardless of `conversion_method`.
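
The negative-timestamp step can be pictured as a single regex substitution. The sketch below is an illustration only and is not the body of `sanitize_webvtt_timestamps`. The function name `clamp_negative_timestamps` and the exact pattern are assumptions.

```python
import re

# A minus sign attached directly to a WebVTT timestamp (hours part optional).
# The "-->" arrow is never matched because its dashes are not followed by a digit.
_NEGATIVE_TS = re.compile(r"-(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}")


def clamp_negative_timestamps(text: str) -> str:
    """Replace every negative timestamp with 00:00:00.000, leaving other text intact."""
    return _NEGATIVE_TS.sub("00:00:00.000", text)
```

Applied to a cue line such as `-00:00:01.500 --> 00:00:02.000`, this sketch rewrites only the start time and leaves the arrow and end time untouched.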
## Related
- Filename sanitization (e.g. parenthesis handling, unidecode bracket artifacts from PR #105) lives in `unshackle/core/utilities.py::sanitize_filename` and is governed by `output_template`, not the `subtitle:` config block.
- Subtitle codec support and the conversion matrix are defined in `unshackle/core/tracks/subtitle.py`.
---