Update Unshackle to v2.1.0 #3

Open
unshackle-dl wants to merge 0 commits from update-unshackle into main
Owner
No description provided.
unshackle-dl added 90 commits 2025-12-19 14:58:08 +00:00
Very early dev work, more changes will be active in this branch.

- Implement download queue management and worker system
- Add OpenAPI/Swagger documentation
- Include download progress tracking and status endpoints
- Add API authentication and error handling
- Update core components to support API integration
- Update lxml dependency to allow version 6.x (required by subby 0.3.23)
- Fix pyplayready exception import path (moved to misc.exceptions in 0.6.3)

fixes #17
Implements a complete structured logging system for troubleshooting and service development.

Features:
- Binary toggle via --debug flag or debug: true in config
- JSON Lines (.jsonl) format for easy parsing and analysis
- Comprehensive logging of all operations:
  * Session info (version, platform, Python version)
  * CLI parameters and service configuration
  * CDM details (Widevine/PlayReady, security levels)
  * Authentication status
  * Title and track metadata
  * DRM operations (PSSH, KIDs, license requests)
  * Vault queries with key retrieval
  * Full error traces with context

- Configurable key logging via debug_keys option
- Smart redaction (passwords, tokens, cookies always redacted)
- Error logging for all critical operations:
  * Authentication failures
  * Title fetching errors
  * Track retrieval errors
  * License request failures (Widevine & PlayReady)
  * Vault operation errors

- Removed old text logging system
Add new CustomRemoteCDM class to support custom CDM API providers with maximum configurability through YAML configuration alone. This addresses GitHub issue #26 by enabling integration with third-party CDM APIs.
Add WindscribeVPN as a new proxy provider option, following the same pattern as NordVPN and SurfsharkVPN implementations.

Fixes: #29
Adds a new CLI option `-le, --latest-episode` that automatically selects and downloads only the single most recent episode from a series, regardless of which season it's in.

Fixes #28
Add support for per-service configuration overrides allowing fine-tuned control of downloader and command options on a service-by-service basis.

Fixes #13
Simkl now requires a client_id from https://simkl.com/settings/developer/
Refactor code to search for binaries either in root of binary folder or in a subfolder named after the binary.
Fixes #23
Add validation to check that both HDR10 and DV tracks are available when HYBRID mode is requested. This prevents wasted downloads when the hybrid processing would fail due to missing tracks.
Add support for downloading audio description tracks via the --audio-description/-ad flag. Previously, descriptive audio tracks were always filtered out. Users can now optionally include them.

Fixes #33
- Add missing download parameters (latest_episode, exact_lang, audio_description, no_mux)
- Expand OpenAPI schema with comprehensive documentation for all 40+ download parameters
- Add robust parameter validation with clear error messages
- Implement job filtering by status/service and sorting capabilities
Fixes 'charmap' codec can't decode byte error that occurs on Windows
when mp4decrypt outputs non-ASCII characters. Without explicit encoding,
Add default parameter system to API server that matches CLI behavior, eliminating errors from missing optional parameters.
Fixes #34
HDR10/PQ detection now includes:
- PQ (most common)
- SMPTE ST 2084 (CICP value 16)
- BT.2100
- BT.2020-10
- smpte2084 (lowercase variant)

HLG detection now includes:
- HLG
- Hybrid Log-Gamma
- ARIB STD-B67 (CICP value 18)
- arib-std-b67 (lowercase variant)

Hybrid DV+HDR10 detection:
- Now checks full hdr_format field for both "Dolby Vision" AND
  ("HDR10" OR "SMPTE ST 2086")
- Properly generates filenames like "Movie.2160p.DV HDR H.265.mkv"
- MediaInfo reports: "Dolby Vision / SMPTE ST 2086, HDR10 compatible"

Also adds null safety for transfer characteristics to prevent errors when the field is None.
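
The detection rules listed above can be sketched roughly as follows. This is an illustrative sketch only: the function names `classify_transfer` and `is_hybrid_dv_hdr10` are hypothetical and not taken from the codebase, and case-insensitive substring matching is an assumption about how the indicators are compared.

```python
from typing import Optional

# Indicator strings copied from the lists above; compared case-insensitively.
PQ_INDICATORS = ("pq", "smpte st 2084", "bt.2100", "bt.2020-10", "smpte2084")
HLG_INDICATORS = ("hlg", "hybrid log-gamma", "arib std-b67", "arib-std-b67")


def classify_transfer(transfer: Optional[str]) -> Optional[str]:
    """Return "HDR10", "HLG", or None for a transfer characteristics string."""
    if not transfer:  # null safety: the field may be None or empty
        return None
    value = transfer.lower()
    # HLG is checked first so a string naming both is not reported as PQ.
    if any(token in value for token in HLG_INDICATORS):
        return "HLG"
    if any(token in value for token in PQ_INDICATORS):
        return "HDR10"
    return None


def is_hybrid_dv_hdr10(hdr_format: Optional[str]) -> bool:
    """True when the field names Dolby Vision AND an HDR10 indicator."""
    if not hdr_format:
        return False
    return "Dolby Vision" in hdr_format and (
        "HDR10" in hdr_format or "SMPTE ST 2086" in hdr_format
    )
```

With the MediaInfo string quoted above, `is_hybrid_dv_hdr10("Dolby Vision / SMPTE ST 2086, HDR10 compatible")` evaluates to `True`, while a `None` transfer value simply yields `None` instead of raising.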
Fix off-by-one error in SegmentTemplate segment enumeration when startNumber is 0. Previously, the code would request one extra segment beyond what exists, causing 404 errors on the final segment.

The issue was that end_number was calculated as a segment count via math.ceil(), but then used incorrectly with range(start_number, end_number + 1), treating it as both a count and an inclusive endpoint.

Changed to explicitly calculate segment_count first, then derive end_number as: start_number + segment_count - 1

Example:
- Duration: 3540.996s, segment duration: 4s
- Before: segments 0-886 (887 segments) - segment 886 doesn't exist
- After: segments 0-885 (886 segments) - correct
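
The corrected arithmetic can be sketched as below. The helper name `segment_numbers` is hypothetical; only the `segment_count` / `end_number` relationship comes from the description above.

```python
import math


def segment_numbers(duration: float, segment_duration: float, start_number: int) -> range:
    """Enumerate SegmentTemplate numbers without requesting an extra segment."""
    # Number of segments needed to cover the duration.
    segment_count = math.ceil(duration / segment_duration)
    # Inclusive last segment number, derived from the count.
    end_number = start_number + segment_count - 1
    return range(start_number, end_number + 1)


numbers = segment_numbers(3540.996, 4, start_number=0)
print(numbers[0], numbers[-1], len(numbers))
```

For the example values this enumerates 886 segments, numbered 0 through 885. The same function with `start_number=1` yields 1 through 886, still 886 segments.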
Add support for custom TLS/HTTP fingerprints to session() function, enabling services to impersonate Android/OkHttp clients instead of just browsers.
Remove extension 21 (TLS padding) from okhttp4 and okhttp5 JA3 strings to resolve SSL/TLS handshake failures.
Implement comprehensive per-service config override system that allows any configuration section (dl, n_m3u8dl_re, aria2c, subtitle, etc.) to be customized on a per-service basis.

Fixes #13
Implements cross-platform font discovery and intelligent fallback system for ASS/SSA subtitle rendering on Linux/macOS systems.

Windows support has not been tested
Add preserve_formatting config option to prevent automatic subtitle processing that strips formatting tags and styling. When enabled (default: true), WebVTT files skip pycaption read/write cycle to preserve tags like <i>, <b>, positioning, and other formatting.
The --cdm-only flag was only preventing vault queries during DRM operations, but vaults were still being loaded.
When decrypt-labs returns cached keys that don't cover all required KIDs, the CDM now properly stores them in session["cached_keys"] instead of session["keys"]. This allows parse_license() to correctly combine vault_keys + cached_keys + license_keys, fixing downloads that previously failed when mixing cached and fresh licenses.
Apply the same partial cached keys fix from decrypt_labs_remote_cdm to custom_remote_cdm. When cached keys don't cover all required KIDs, store them in session["cached_keys"] instead of session["keys"] to allow parse_license() to properly combine vault_keys + cached_keys + license_keys.
The track_selection function was using findall() to search for lang child elements, but in DASH manifests lang is an XML attribute on AdaptationSet. This caused language selection to fail for region-specific codes like es-419.
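
A small illustration of the difference, using the standard library's ElementTree rather than whatever XML library the project uses (an assumption made to keep the sketch self-contained); the manifest fragment is made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical DASH fragment: lang is an attribute, not a child element.
adaptation_set = ET.fromstring(
    '<AdaptationSet lang="es-419" contentType="audio">'
    '<Representation id="0"/>'
    "</AdaptationSet>"
)

# Searching for <lang> child elements finds nothing.
child_matches = adaptation_set.findall("lang")

# Reading the attribute returns the region-specific code.
language = adaptation_set.get("lang")

print(child_matches, language)
```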
* feat: Add 'from_file', 'downloader_args' to Track

* feat: Add loading HLS playlist from file

* refactor: Improve track selection, args for n_m3u8dl_re
Fixed a Python late binding closure issue in the SDH subtitle duplication logic that prevented strip_hearing_impaired() from being called correctly.
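
For readers unfamiliar with the pitfall, here is a generic illustration of late binding in closures. It is not the project's code; binding the loop variable through a default argument is one common fix, and the commit may have used a different one.

```python
labels = ["en", "fr", "de"]

# Late binding: every lambda looks up `label` when called, after the loop ended.
late_bound = [lambda: label for label in labels]

# Early binding: the default argument captures the value at definition time.
early_bound = [lambda label=label: label for label in labels]

print([f() for f in late_bound])
print([f() for f in early_bound])
```

The first list returns the last label three times; the second returns each label once, in order.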
PR #38 refactored n_m3u8dl_re track selection to support DASH/ISM subtitle tracks, but this broke subtitle downloads for services that use direct URL downloads (Descriptor.URL) for subtitles, which n_m3u8dl_re does not support.
Add new -nv/--no-video CLI flag that allows users to download audio, subtitles, attachments, and chapters without downloading video tracks.

Fixes #39
Fixed TypeError in calculate_byte_range where range_offset was a string instead of int. The byte_range.split("-")[0] returns a string, but the calculate_byte_range method expects fallback_offset parameter to be int.
- Add Primaries.Unspecified (value 2) per user request and H.273 spec
- Rename Primaries value 0 from Unspecified to Reserved for spec accuracy
- Rename Transfer value 0 from Unspecified to Reserved for consistency
- Simplify Transfer value 2 from Unspecified_Image to Unspecified
- Update condition check to use enum values instead of numeric tuple
- Enhance docstring with detailed sources and rationale for changes

All CICP values verified against ITU-T H.273, ISO/IEC 23091-2, H.264/H.265 specifications, and FFmpeg AVColorPrimaries/AVColorTransferCharacteristic enums.
Enable quality-based CDM selection during runtime DRM switching by passing track quality to get_cdm() calls. This allows different CDMs to be used for different video quality levels within the same download session.

Example configuration:
  cdm:
    SERVICE:
      "<=1080": wv_l3_local     # Widevine L3 for SD/HD
      ">1080": pr_sl3_remote    # PlayReady SL3 for 4K
Pre-process space-hyphen-space patterns (e.g., "Title - Episode") before other character replacements to prevent creating problematic dot-hyphen-dot (.-.) patterns in filenames.

This addresses PR #44 by fixing the root cause rather than post-processing the problematic pattern. The fix ensures that titles like "Show - S01E01" become "Show.S01E01"
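
A minimal sketch of the ordering described above. The helper name `dotted_name` is hypothetical, and the real code applies more character replacements than shown here.

```python
def dotted_name(title: str) -> str:
    """Collapse " - " first, then turn remaining whitespace into dots."""
    # Pre-process the space-hyphen-space separator before anything else.
    title = title.replace(" - ", " ")
    return ".".join(title.split())


naive = ".".join("Show - S01E01".split())  # what happens without pre-processing
fixed = dotted_name("Show - S01E01")
print(naive, fixed)
```

Without the pre-processing step the hyphen survives as its own token and produces `Show.-.S01E01`.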
Add support for BaseURL elements at the AdaptationSet level per DASH spec. The URL resolution chain now properly follows: MPD → Period → AdaptationSet → Representation.
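
The resolution chain can be sketched with `urllib.parse.urljoin`. The function name `resolve_base_url` and the example URLs are hypothetical; each level's BaseURL is resolved against the result of the level above it, and missing levels are skipped.

```python
from typing import Iterable, Optional
from urllib.parse import urljoin


def resolve_base_url(manifest_url: str, base_urls: Iterable[Optional[str]]) -> str:
    """Resolve BaseURL values in MPD, Period, AdaptationSet, Representation order."""
    resolved = manifest_url
    for base_url in base_urls:
        if base_url:  # a level without a BaseURL inherits the parent's value
            resolved = urljoin(resolved, base_url)
    return resolved


print(
    resolve_base_url(
        "https://cdn.example.com/show/manifest.mpd",
        [None, "period1/", "audio/", "rep0/"],  # MPD, Period, AdaptationSet, Representation
    )
)
```

An absolute BaseURL at any level replaces everything resolved so far, which `urljoin` handles without special casing.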
Attachments (screenshots, fonts) were being dropped when title.tracks was rebuilt from kept_tracks, causing image files to remain in temp directory after muxing. The cleanup code iterated over an empty attachments list since they were orphaned during track filtering.
unshackle-dl added 126 commits 2026-02-28 14:19:25 +00:00
- Add Gluetun dynamic VPN-to-HTTP proxy provider
- Add remote services and authentication system
- Add country code utilities
- Add Docker binary detection
- Update proxy providers
Hybrid DV+HDR10 files were named "DV.H.265" instead of "DV.HDR.H.265" because the HDR10 detection only checked hdr_format_full which contains "Dolby Vision / SMPTE ST 2094". The "HDR10" indicator is in hdr_format_commercial, not hdr_format_full.

Now checks both fields for HDR10 compatibility indicators.
This reverts commit 7e7bc7aecf.
Add PlayReady PSSH/KID extraction from track and init data with CDM-aware ordering. When PlayReady CDM is selected, tries PlayReady first then falls back to Widevine. When Widevine CDM is selected (default), tries Widevine first then falls back to PlayReady.
Remove erroneous `.bytes` accessor from PSSH.SYSTEM_ID comparisons in from_track() and from_init_data() methods. The pyplayready PSSH.SYSTEM_ID is already the correct type for comparison with parsed PSSH box system_ID values.
- Add CENC namespace support for kid/default_KID attributes
- Detect and replace placeholder/test KIDs in Widevine PSSH:
  - All zeros (key rotation default)
  - Sequential 0x00-0x0f pattern
  - Shaka Packager test pattern
- Change DRM init condition from `not track.drm` to `init_data` to ensure DRM is always re-initialized from init segments

Fixes issue where Widevine PSSH contains placeholder KIDs while the real KID is only in ContentProtection default_KID attributes.
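
A rough sketch of placeholder detection for the first two patterns listed above. The name `is_placeholder_kid` is hypothetical, and the Shaka Packager test pattern is deliberately left out because its value is not given in this description.

```python
from uuid import UUID

ALL_ZERO_KID = UUID(int=0)                      # key rotation default
SEQUENTIAL_KID = UUID(bytes=bytes(range(16)))   # bytes 0x00 through 0x0f


def is_placeholder_kid(kid: UUID) -> bool:
    """True when the KID is a known placeholder rather than a real key ID."""
    return kid in (ALL_ZERO_KID, SEQUENTIAL_KID)


print(is_placeholder_kid(UUID("00010203-0405-0607-0809-0a0b0c0d0e0f")))
```

When a placeholder is detected, the real KID would be taken from the ContentProtection `default_KID` attribute instead, as described above.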
- Add skip_merge flag for N_m3u8DL-RE to prevent duplicate init data
- Pass content_keys to N_m3u8DL-RE for internal decryption handling
- Use shutil.move() instead of manual merge when skip_merge is True
- Skip manual decryption when N_m3u8DL-RE handles it internally

Fixes audio corruption ("Box 'OG 2' size is too large") when using N_m3u8DL-RE with DASH manifests that have SegmentBase init data. The init segment was being written twice: once by N_m3u8DL-RE during its internal merge, and again by dash.py during post-processing.
Use removeprefix instead of removesuffix and add strip() to handle ASS subtitle files that have spaces after commas in Style definitions.

Fixes #57
Add config option to disable ASCII transliteration in filenames, allowing preservation of Korean, Japanese, Chinese, and other native language characters instead of converting them via unidecode.

Closes #49
The previous regex only matched negative size values when they were the entire quoted attribute (e.g., "-5%"). This failed for multi-value attributes like tts:extent="-5% 7.5%" causing pycaption parse errors.

The new pattern matches negative values anywhere in the text and preserves the unit during replacement.

Closes #47
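
One way to match negative sizes anywhere in the text while keeping the unit is sketched below. The exact pattern used in the commit is not shown here, so the regex, the unit list, and the choice to replace the value with 0 are all assumptions.

```python
import re

# A negative number followed by a TTML length unit, anywhere in the text.
NEGATIVE_LENGTH = re.compile(r"-\d+(?:\.\d+)?(%|px|em|c)(?![A-Za-z])")


def clamp_negative_lengths(text: str) -> str:
    """Replace negative lengths with zero, preserving the captured unit."""
    return NEGATIVE_LENGTH.sub(r"0\1", text)


print(clamp_negative_lengths('tts:extent="-5% 7.5%"'))
```

Because the pattern is not anchored to the surrounding quotes, multi-value attributes are handled value by value and positive values are left untouched.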
The previous sorting approach crashed with KeyError when unsupported DRM systems were present in the init segment. Now uses direct filtering
Some services use WebVTT files with:
- Cue identifiers (Q0, Q1, etc.) before timing lines that pysubs2/pycaption incorrectly parse as subtitle text
- Multi-line subtitles split into separate cues with 1ms offset times and different line: positions (e.g., line:77% for top, line:84% for bottom)

Added detection and sanitization functions:
- has_webvtt_cue_identifiers(): detects cue identifiers before timing
- sanitize_webvtt_cue_identifiers(): removes problematic cue identifiers
- has_overlapping_webvtt_cues(): detects overlapping cues needing merge
- merge_overlapping_webvtt_cues(): merges cues sorted by line position
- Use lowercase format names (subrip, webvtt, advancedsubstationalpha) to match SubtitleEdit 4.x CLI requirements
- Change /Convert to /convert for consistency with CLI docs
- Convert Path objects to strings explicitly for subprocess calls
- Respect conversion_method config in SDH stripping - skip SubtitleEdit when user has set pysubs2/pycaption/subby as their preferred method
- Add stderr suppression to SubtitleEdit calls
When DASH manifests have multiple audio AdaptationSets with the same representation IDs (e.g., both English and Japanese having id="0"), N_m3u8DL-RE would download the same track twice.

Now includes the language alongside the ID in selection args to properly disambiguate tracks across adaptation sets.
Session keys from master playlists often contain PSSHs with multiple KIDs covering all tracks, causing licensing to return keys for wrong KIDs.

Changes:
- Unified DRM licensing logic for all downloaders
- Prefer media playlist EXT-X-KEY tags which contain track-specific KIDs
- Add filter_keys_for_cdm() to select keys matching configured CDM type
- Add get_track_kid_from_init() to extract KID from init segment with fallback to drm.kid from PSSH
- Track initial_drm_key to prevent double-licensing on first segment
- Simplify n_m3u8dl_re block to reuse common licensing flow
- Use strict PlayReady keyformat matching via PR_PSSH.SYSTEM_ID URN instead of loose substring match
- Fix PlayReady keyformat comparisons that incorrectly compared strings to PlayReadyCdm class
- Fix byterange header format in get_track_kid_from_init() to use HLS.calculate_byte_range()

Also fixes PlayReady keyformat matching in:
- unshackle/core/tracks/track.py
- unshackle/core/drm/playready.py

Fixes download failures where track_kid was null or mismatched, causing wrong content keys to be obtained during PlayReady/Widevine licensing.
- urllib3: 2.5.0 -> 2.6.3 (CVE-2025-66418, CVE-2025-66471, CVE-2026-21441)
- aiohttp: 3.13.2 -> 3.13.3 (8 CVEs including CVE-2025-69223, CVE-2025-69227)
- fonttools: 4.60.1 -> 4.61.1 (CVE-2025-66034)
- filelock: 3.19.1 -> 3.20.3 (CVE-2025-68146, CVE-2026-22701)
- virtualenv: 20.34.0 -> 20.36.1 (CVE-2026-22702)
pywidevine's serve module expects users to be a dict mapping secret keys to user objects with devices and username, not a simple list.
This was causing TypeError when accessing CDM endpoints.
* chore(deps): update subby to 0.3.27

* fix(subs): add WVTT (WebVTT in MP4) subtitle converter for subby

- Use WVTTConverter if self.codec == Subtitle.Codec.fVTT
- Read or save files as Paths instead of strings to avoid AttributeErrors
- Add CommonIssuesFixer when stripping SDH
- Set Subtitle.Codec.fVTT to use subby if conversion_method == "auto"
- Silence the underlying srt library logging used by subby to avoid info messages being printed to console
Merging after code review - fixes binary path handling
Merging dash-naming feature - adds optional dash separator format for filenames
The API returns error details with capital "Error" key, but the code was checking for lowercase "error", causing error details to be ignored.
Allow services to set a custom `source` attribute on tracks, which will be used in the filename instead of the service class name.
- Use singleton _Aria2Manager to reuse one aria2c process via RPC
- Add downloads via aria2.addUri instead of stdin input file
- Track per-GID byte-level progress (completedLength/totalLength)
- Add thread-safe operations with threading.Lock
- Enable graceful cancellation by removing individual downloads via RPC
The replace("Bearer ", "") approach returned the full Authorization header value when the prefix was not present, incorrectly treating other auth schemes (e.g., "Basic xyz") as API keys.
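
The difference between the two approaches can be shown with a short sketch. The function name `extract_bearer_token` is hypothetical; returning `None` for non-Bearer schemes is an assumption about the desired behaviour.

```python
from typing import Optional


def extract_bearer_token(authorization: Optional[str]) -> Optional[str]:
    """Return the token only when the header really uses the Bearer scheme."""
    prefix = "Bearer "
    if not authorization or not authorization.startswith(prefix):
        return None
    token = authorization[len(prefix):].strip()
    return token or None


# The old approach leaves other schemes untouched and treats them as keys.
old_result = "Basic xyz".replace("Bearer ", "")
print(old_result, extract_bearer_token("Basic xyz"), extract_bearer_token("Bearer abc123"))
```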
The session was created with headers but never used. The class saves sessions locally via the cache rather than uploading to a remote server.
Add comprehensive debug logging to diagnose N_m3u8DL-RE download failures where the process exits successfully but produces no output files.
Major features:
- Native Docker-based Gluetun VPN proxy provider with multi-provider support
  (NordVPN, Windscribe, Surfshark, ExpressVPN, and 50+ more)
- Stateless remote service architecture with local session caching
- Client-side authentication for remote services (browser, 2FA, OAuth support)

Key changes:
- core/proxies/windscribevpn.py: Enhanced proxy handling
- core/crypto.py: Cryptographic utilities
- docs/VPN_PROXY_SETUP.md: Comprehensive VPN/proxy documentation
Temporarily removes client-side remote service discovery and authentication until the implementation is more fleshed out and working.
HLS: Filter segment keys by CDM type during aria2c merge phase to prevent incorrect Widevine selection when using PlayReady-only CDMs. The merge phase now uses filter_keys_for_cdm() before get_supported_key(), matching the pattern used in initial licensing.

DASH: Extend PlayReady CDM detection to include remote CDMs with is_playready attribute, not just native PlayReadyCdm instances. This ensures correct DRM extraction order from init_data when using remote PlayReady CDMs.
fix(proxies): Fixes WindscribeVPN server authentication
This reverts commit 55bc2b16ee, reversing
changes made to 8c8c9368ba.
OpenVPN credentials now work reliably on all regions, not just US, AU, and NZ. Remove the supported_regions check that was blocking other country codes.
Previously get_random_server only collected servers from the first location matching a country code. Now it collects from all matching locations before selecting randomly.
Use segmented=True when downloading multiple URLs to prevent inner downloads from overriding the total segment count, which caused the progress bar to always appear green (finished state).

This is still WIP so will continue to monitor.
- Add MonaLisaCDM class wrapping wasmtime for key extraction
- Add MonaLisa DRM class with decrypt_segment() for per-segment decryption
- Display Content ID and keys in download output (matching Widevine/PlayReady)
- Add wasmtime dependency for WASM module execution
Allow binaries to be found in subdirectories of the binaries folder.
When a DASH manifest has a high startNumber (common in DVR/catch-up content from live streams), the segment range calculation would produce an empty range because end_number was set to len(segment_durations) rather than being offset by startNumber.
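
A sketch of the offset calculation, assuming `segment_durations` holds one entry per segment; the helper name `timeline_segment_numbers` and the sample values are hypothetical.

```python
from typing import Sequence


def timeline_segment_numbers(segment_durations: Sequence[float], start_number: int) -> range:
    """Offset the last segment number by startNumber instead of using the raw length."""
    end_number = start_number + len(segment_durations) - 1
    return range(start_number, end_number + 1)


durations = [4.0] * 10
broken = range(50000, len(durations) + 1)        # end not offset: empty range
fixed = timeline_segment_numbers(durations, 50000)
print(len(broken), fixed[0], fixed[-1])
```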
Allow binaries to be found in subdirectories of the binaries folder.
Some MPD manifests use the cenc: namespace prefix for PSSH elements (e.g., <cenc:pssh>) instead of non-namespaced <pssh>. This caused DRM extraction to fail for services.

- Add {urn:mpeg:cenc:2013}pssh fallback for Widevine PSSH extraction
- Add {urn:mpeg:cenc:2013}pssh fallback for PlayReady PSSH extraction
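
The fallback can be sketched with the standard library's ElementTree (the project may use a different XML library). The helper name `find_pssh_text` and the PSSH payloads are placeholders, not real data.

```python
import xml.etree.ElementTree as ET
from typing import Optional

CENC_PSSH = "{urn:mpeg:cenc:2013}pssh"


def find_pssh_text(content_protection: ET.Element) -> Optional[str]:
    """Look for a plain <pssh> first, then fall back to <cenc:pssh>."""
    for tag in ("pssh", CENC_PSSH):
        element = content_protection.find(tag)
        # Compare against None explicitly: a childless Element is falsy.
        if element is not None and element.text:
            return element.text.strip()
    return None


namespaced = ET.fromstring(
    '<ContentProtection xmlns:cenc="urn:mpeg:cenc:2013">'
    "<cenc:pssh>AAAAplaceholder</cenc:pssh>"
    "</ContentProtection>"
)
print(find_pssh_text(namespaced))
```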
The init_data DRM extraction was unconditionally overwriting DRM already extracted from MPD ContentProtection elements. This caused failures when init segments contain malformed PSSH data while the MPD has valid PSSH.

Now only falls back to init_data extraction when no DRM was found from the manifest, matching the behavior in version 2.1.0.
Use original subtitle files for sidecar output while keeping muxed conversion behavior.

Fixes #59
Set DOWNLOAD_LICENCE_ONLY earlier in the download command so services build tracks in license-only mode.

Update Attachment URL handling to avoid eager downloads in license-only mode while keeping metadata, stable IDs, and safe cleanup behavior.
Fixed HDR Vivid showing the file name Range as 'None'
Fix Missing HLS Curl Session Processing
When --audio-description is set, keep standard selections and include descriptive tracks for requested languages, including --a-lang with orig and best selection paths.

Fixes #72
Add the hybrid track to the track processing list so that the hybrid-processed HEVC file no longer remains in the temp folder.
Remove hybrid HEVC temp file
- Add a small helper to move N_m3u8DL-RE final outputs into the expected temp path (preserve actual suffix) and keep subtitle codec consistent with the produced file.
- Skip generic HLS segment merging when N_m3u8DL-RE is in use to avoid mixing in sidecar files and reduce Windows file-lock issues.
- Harden segmented WebVTT merging to avoid IndexError when caption segment indexes exceed the provided duration list.
merge commit filtering, deduplication, granular chore parsing, and regenerate CHANGELOG.md using git-cliff.
# Conflicts:
#	CHANGELOG.md
Use safe get() fallbacks for RemoteCdm config keys and default security_level to 3000 to avoid KeyError when snake_case is used.
When ensure_started() is called while aria2c is already running, it now compares the requested proxy/max_workers against the values the process was started with and logs a warning if they differ (since the running process cannot be reconfigured in-place). Startup no longer uses a fixed sleep; instead it probes the JSON-RPC endpoint with a bounded retry loop (aria2.getVersion) and only proceeds once RPC is responsive, terminating the subprocess and raising on timeout.
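
The bounded readiness probe could look roughly like this. It is a generic sketch: `wait_until_ready` and its parameters are hypothetical, and the `probe` callable stands in for a JSON-RPC `aria2.getVersion` request. Terminating the subprocess on timeout is left to the caller here.

```python
import time
from typing import Callable


def wait_until_ready(probe: Callable[[], bool], attempts: int = 20, delay: float = 0.1) -> None:
    """Call probe() until it succeeds, giving up after a fixed number of attempts."""
    for _ in range(attempts):
        try:
            if probe():
                return
        except OSError:
            pass  # e.g. connection refused while the process is still starting
        time.sleep(delay)
    raise TimeoutError(f"RPC endpoint not responsive after {attempts} attempts")
```

Compared with a fixed sleep, this returns as soon as the endpoint answers and fails loudly when it never does.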
Ensure playready_config['users'] and API-only config always use a dict, even under --no-key, to avoid type mismatches.

Also stop implicitly granting PlayReady access by defaulting per-user 'playready_devices' to all devices; missing 'playready_devices' now defaults to an empty list and logs a warning including the user key.

BREAKING CHANGE: users without an explicit 'playready_devices' list no longer get access to all PlayReady devices by default.
- Switch docker run to use a temporary --env-file instead of per-var -e flags
- Ensure temp env file is always removed (best-effort overwrite + unlink)
- Tighten _is_container_running to exact-name matching via anchored docker filter
- Close requests.Session used for IP verification to release connections
- Redact more secret-like env keys in debug logs
- Pass ML-Worker key via env/stdin instead of argv to reduce exposure in process listings/logs.

- Add a hard timeout to the ML-Worker subprocess call and convert timeouts into DecryptionFailed errors.

- Make ticket bytes decoding defensive: try UTF-8, fall back to ASCII (base64), otherwise raise a descriptive ValueError.
When --a-lang is specific (not best/all) and multiple codecs are requested via -a/--acodec, select only the best-bitrate track per codec per language (plus descriptive if --audio-description).

Blame: regression introduced by 939ca25 (fix(dl): keep descriptive and standard audio for requested langs).
- Parse init section byterange offset as int to avoid string arithmetic bugs

- Wrap MonaLisa licensing in the same progress + error handling flow as Widevine/PlayReady
Ensure dynamic-range tokens use safe fallback when not in DYNAMIC_RANGE_MAP and append exactly one space-separated token without trailing/double spaces.
Title filenames now include resolution/service/WEB-DL/codecs/HDR tokens in both modes; scene_naming only changes the spacer ('.' vs ' ').

Also avoid overwriting muxed outputs by disambiguating on collision (append codec suffix when needed, then a numeric suffix).
Proxy URIs may contain embedded userinfo (username/password). Add a small sanitizer helper and use it for proxy mapping and proxy selection logs to avoid leaking credentials.
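
A sanitizer along these lines could be written with `urllib.parse`. The helper name `redact_proxy_uri` and the `REDACTED` placeholder are assumptions; the actual helper in the codebase may format its output differently.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_proxy_uri(uri: str) -> str:
    """Replace embedded userinfo in a proxy URI so it is safe to log."""
    parts = urlsplit(uri)
    if parts.username is None and parts.password is None:
        return uri  # nothing to hide
    host = parts.hostname or ""
    if ":" in host:  # IPv6 literal needs its brackets restored
        host = f"[{host}]"
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, f"REDACTED@{host}", parts.path, parts.query, parts.fragment))


print(redact_proxy_uri("http://user:pass@proxy.example.com:8080"))
```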
Add unshackle.core.cdm.detect helpers to classify CDMs consistently across local and remote backends.

- Add is_playready_cdm/is_widevine_cdm for DRM selection across pyplayready, pywidevine, and wrappers

- Add is_remote_cdm/is_local_cdm/cdm_location so services can branch on CDM execution location

- Switch core DASH/HLS parsing, track DRM selection, and dl CDM switching away from brittle isinstance/DecryptLabs-only checks

- Make unshackle.core.cdm import-light via lazy __getattr__ so optional CDM deps are only imported when needed
- Validate _monalisa_context_alloc return and cleanup on init failure
- Derive deterministic KID when DCID missing to avoid collisions
- Ensure stackRestore always runs via try/finally in _ccall
- Log base64 decode failures without leaking license contents
- Add bounds/alignment checks for i32 memory writes
Reverts the env/stdin key passing change introduced in 6c83790, since ML-Worker builds in use expect the key as argv[1].
Remove unreachable fallback to all devices; if a user has no explicit playready_devices configured, the PlayReady subapp receives an empty list (secure-by-default).
BREAKING CHANGE: PlayReady users without explicit playready_devices no longer get access to all devices by default.

Key changes:
- feat(drm): add MonaLisa DRM support to core infrastructure
- feat(cdm): add remote PlayReady CDM support via pyplayready RemoteCdm
- feat(serve): add PlayReady CDM support alongside Widevine
- feat(gluetun): Gluetun VPN integration and Windscribe support
- feat(audio): codec lists and split muxing
- feat(tracks): prioritize Atmos audio tracks over higher bitrate non-Atmos
- feat(video): detect interlaced scan type from MPD manifests
- feat(cdm): normalize CDM detection for local and remote implementations
- fix(serve)!: make PlayReady users config consistently a mapping
- 50+ additional bug fixes across HLS/DASH, proxies, subtitles, and more
This pull request has changes conflicting with the target branch.
  • CHANGELOG.md
  • pyproject.toml
  • unshackle/commands/dl.py
  • unshackle/commands/serve.py
  • unshackle/core/__init__.py
  • unshackle/core/__main__.py
  • unshackle/core/api/download_manager.py
  • unshackle/core/api/handlers.py
  • unshackle/core/api/routes.py
  • unshackle/core/binaries.py

Checkout

From your project repository, check out a new branch and test the changes.
git fetch -u origin update-unshackle:update-unshackle
git checkout update-unshackle
Reference: unshackle-dl/unshackle#3