feat(subtitle): data-driven conversion registry + SubtitleEdit 5 support

Replace the hardcoded conversion if/elif in Subtitle.convert with a capability-matrix backend registry (subtitle_convert.py): each backend declares the source->target pairs it supports plus a rank, and run_conversion tries them in order as a real fallback chain. conversion_method pins a backend but still falls back (pin-then-fallback).

- Detect the cross-platform SubtitleEdit 5+ CLI (seconv) and use its --flag syntax for convert, SDH stripping, and reverse-RTL
- Protect styled ASS/SSA from automatic SRT downconversion; honor an explicit --sub-format / sidecar_format
- Read segmented fVTT (wvtt) and fTTML (stpp) directly from fragmented MP4
- Improve ASS/SSA font detection: inline \fn overrides, Format-located Fontname column, @-prefix strip, case-insensitive de-dup; covers SSA too
- Update SUBTITLE_CONFIG.md, example yaml, README; add regression tests and a backend benchmark script
2026-06-10 03:02:09 +00:00 · 2026-06-07 22:21:25 -06:00
parent e6613e8ed8
commit 29232925d5
11 changed files with 871 additions and 357 deletions
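The pin-then-fallback behaviour described in the commit message can be sketched roughly as follows. This is an illustration only: `Backend`, `resolve`, `run`, and the sample `BACKENDS` list are hypothetical names and values, not the contents of `subtitle_convert.py`, and the converters are trivial stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional, Tuple

Pair = Tuple[str, str]  # (source codec, target codec)


@dataclass(frozen=True)
class Backend:  # hypothetical stand-in, not the project's class
    name: str
    rank: int  # lower rank is tried earlier
    pairs: FrozenSet[Pair]  # the capability matrix row for this backend
    convert: Callable[[str], str]
    available: bool = True


def resolve(backends: List[Backend], source: str, target: str, pin: Optional[str] = None) -> List[Backend]:
    """Available backends that support the pair: the pinned one first, the rest by rank."""
    usable = sorted(
        (b for b in backends if b.available and (source, target) in b.pairs),
        key=lambda b: b.rank,
    )
    return [b for b in usable if b.name == pin] + [b for b in usable if b.name != pin]


def run(backends: List[Backend], text: str, source: str, target: str, pin: Optional[str] = None) -> Tuple[str, str]:
    """Try each backend in the chain; return (backend name, output) from the first success."""
    chain = resolve(backends, source, target, pin)
    if not chain:
        raise NotImplementedError(f"no backend converts {source} -> {target}")
    errors = []
    for backend in chain:
        try:
            return backend.name, backend.convert(text)
        except Exception as exc:  # any backend failure moves on to the next one
            errors.append(f"{backend.name}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


def _fail(text: str) -> str:
    raise RuntimeError("backend exploded")


# Illustrative registry; ranks and pairs are made up for the example.
BACKENDS = [
    Backend("pycaption", 30, frozenset({("VTT", "SRT")}), str.upper),
    Backend("subby", 10, frozenset({("VTT", "SRT")}), _fail),
    Backend("pysubs2", 20, frozenset({("VTT", "SRT"), ("ASS", "SRT")}), str.lower),
]
```

With this sample registry, `run(BACKENDS, "Hi", "VTT", "SRT")` tries `subby` first, catches its failure, and returns the result from `pysubs2`. Pinning an unavailable or unknown backend simply leaves the ranked chain unchanged.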
--- a/.gitignore
+++ b/.gitignore
@@ -24,6 +24,8 @@ device_private_key
 device_vmp_blob
 unshackle/binaries/*
 !unshackle/binaries/placehere.txt
+# test fixtures (binary subtitle samples) must be tracked despite the *.mp4 rule above
+!tests/tracks/fixtures/*.mp4
 unshackle/cache/
 unshackle/cookies/
 unshackle/certs/
--- a/README.md
+++ b/README.md
@@ -47,6 +47,10 @@ External tools on your `PATH` (recommended versions):
 - [Bento4](https://github.com/axiomatic-systems/Bento4) - ≥ 1.6.0-639
 - [dovi_tool](https://github.com/quietvoid/dovi_tool) - ≥ 2.1
+Optional:
+- [SubtitleEdit](https://github.com/SubtitleEdit/subtitleedit/releases) - ≥ 5.0 (`SeConv` CLI)
 ## License
 [GPL-3.0](LICENSE). Do not use unshackle for content you lack the rights to. Keep the core free and open; keep service code private. Be kind.
--- a/docs/SUBTITLE_CONFIG.md
+++ b/docs/SUBTITLE_CONFIG.md
@@ -8,17 +8,44 @@ For the canonical example, see `unshackle/unshackle-example.yaml`.
 Control subtitle conversion, SDH (hearing-impaired) stripping, formatting preservation, and output behavior.
-- `conversion_method`: How to convert subtitles between formats. Default: `auto`.
-  - `auto`: Smart routing - subby for WebVTT/fVTT/SAMI; for SSA/ASS/MicroDVD/MPL2/TMP use SubtitleEdit when available, otherwise pysubs2; standard pycaption/SubtitleEdit pipeline for everything else.
-  - `subby`: Always use subby with `CommonIssuesFixer` (falls back to standard if the source codec isn't supported by subby).
-  - `subtitleedit`: Prefer SubtitleEdit when available; otherwise fall back to the standard pycaption pipeline.
-  - `pycaption`: Use only the pycaption library (no SubtitleEdit, no subby). Limited to SRT, TTML, and WebVTT outputs.
-  - `pysubs2`: Use pysubs2 (supports SRT, SSA, ASS, WebVTT, TTML, SAMI, MicroDVD, MPL2, TMP).
+- `conversion_method`: Which backend to convert subtitles with. Default: `auto`.
+
+  Routing is data-driven (`unshackle/core/tracks/subtitle_convert.py`): a registry of backends each
+  declares the source→target codec pairs it supports plus a preference rank. For a conversion, the
+  available backends that support the pair are tried in rank order, forming a real fallback chain. A
+  non-`auto` value **pins** that backend first, then still falls back through the chain if it can't
+  handle the pair or errors (pin-then-fallback). A service may also set `preferred_conversion_method`
+  on its tracks; an explicit `conversion_method` in config always wins.
+  - `auto`: Best available backend by rank: SubtitleEdit (if installed) for highest fidelity;
+    otherwise subby for WebVTT/fVTT/SAMI→SRT (adds `CommonIssuesFixer` cleanup), pysubs2 for SSA/ASS
+    and the broad format set, pycaption as last resort.
+  - `subby`: Prefer subby (`CommonIssuesFixer`); reads WebVTT/fVTT/SAMI, writes SRT (and TTML/VTT via
+    an SRT intermediate).
+  - `subtitleedit`: Prefer SubtitleEdit / `seconv`. Highest fidelity: preserves positioning/italics.
+  - `pycaption`: Prefer pycaption. **Flattens positioning/italics**, writes only SRT/TTML/WebVTT.
+  - `pysubs2`: Prefer pysubs2 (SRT, SSA, ASS, WebVTT, TTML, SAMI, MicroDVD, MPL2, TMP). The only
+    pure-Python backend that reads ASS/SSA, so it is the default for styled SubStation sources.
+  **Styled-subtitle protection**: ASS/SSA are never *automatically* downconverted to SRT (the
+  conversion is skipped and the original kept), because SRT cannot carry their positioning/colours/styling.
+  This applies to the default muxed track only; explicit requests still convert: a per-download
+  `--sub-format srt` for the muxed track, or `sidecar_format: srt` for sidecars. To keep raw styled
+  sidecars, set `sidecar_format: original`.
+  **Segmented subtitles** (`fVTT`/WVTT and `fTTML`/STPP from DASH/HLS, e.g. HBO Max) are read directly
+  from the fragmented MP4: fVTT via subby's `WVTTConverter`, fTTML via pycaption's box parsing. They
+  can be converted *from* but not *to*.
+  **SubtitleEdit on Linux/macOS**: install the SubtitleEdit 5+ CLI (`SeConv` / `seconv`, the
+  self-contained cross-platform build from the SubtitleEdit releases) onto `PATH` or into
+  `unshackle/binaries/`. unshackle targets the SubtitleEdit **5+** command syntax. The Windows
+  `SubtitleEdit.exe` is the GUI app; use the `SeConv` CLI binary for headless conversion.
 - `sdh_method`: How to strip SDH cues. Default: `auto`.
  - `auto`: Try subby for SRT first, then SubtitleEdit (when `conversion_method` is `auto`/`subtitleedit` and the binary is available), then subtitle-filter as the final fallback.
  - `subby`: Use subby's `SDHStripper`. **Only operates on SRT**; for other codecs the call returns without stripping.
-  - `subtitleedit`: Use SubtitleEdit's `/RemoveTextForHI` when the binary is available; otherwise falls through to subtitle-filter.
+  - `subtitleedit`: Use SubtitleEdit's `--remove-text-for-hi` (SE5 CLI) when the binary is available; otherwise falls through to subtitle-filter.
  - `filter-subs`: Use the `subtitle-filter` library directly (`rm_fonts`, `rm_ast`, `rm_music`, `rm_effects`, `rm_names`, `rm_author`).
 - `strip_sdh`: Enable/disable automatic SDH stripping for tracks flagged as SDH. Default: `true`.
@@ -68,6 +95,7 @@ These behaviors are intentional and have no config knobs — they apply to every
 ## Related
 - Filename sanitization (e.g. parenthesis handling, unidecode bracket artifacts from PR #105) lives in `unshackle/core/utilities.py::sanitize_filename` and is governed by `output_template`, not the `subtitle:` config block.
- Subtitle codec support and the conversion matrix are defined in `unshackle/core/tracks/subtitle.py`.
+- Subtitle codec support is defined in `unshackle/core/tracks/subtitle.py`; the conversion backend
+  registry, capability matrix, and ranks live in `unshackle/core/tracks/subtitle_convert.py`.
 ---
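The styled-subtitle protection rule documented above amounts to a small guard in front of the conversion. A minimal sketch, assuming codecs are plain strings and using the hypothetical helper name `should_convert` (the real check lives in the project's conversion code and may be structured differently):

```python
STYLED = frozenset({"ASS", "SSA"})  # formats whose styling SRT cannot carry


def should_convert(source: str, target: str, forced: bool = False) -> bool:
    """Return False when the conversion should be skipped and the original kept."""
    if source == target:
        return False  # same codec: nothing to do
    if source in STYLED and target == "SRT" and not forced:
        return False  # automatic lossy downconvert is refused
    return True
```

Here `forced=True` models an explicit request such as `--sub-format srt` or `sidecar_format: srt`; without it, an ASS or SSA source headed for SRT is left untouched.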
--- a/scripts/bench_subtitle_backends.py
+++ b/scripts/bench_subtitle_backends.py
@@ -0,0 +1,98 @@
 #!/usr/bin/env python3
 """
 Benchmark subtitle conversion backends to (re-)tune the preference ranks in
 ``unshackle/core/tracks/subtitle_convert.py``.
 Runs every backend that can read each input file, converting to a target format (default
 SRT), and reports cue count, leaked ASS override tags, and output size — so you can compare
 fidelity per (source, target) pair on real files. Read-only: copies inputs to a temp dir.
 Usage:
    uv run python scripts/bench_subtitle_backends.py <file-or-dir> [<file-or-dir> ...] [--target SRT]
 Example:
    uv run python scripts/bench_subtitle_backends.py downloads/
 """
 from __future__ import annotations
 import argparse
 import re
 import shutil
 import tempfile
 from pathlib import Path
 from unshackle.core.tracks import subtitle_convert as sc
 from unshackle.core.tracks.subtitle import Subtitle
 Codec = Subtitle.Codec
 EXT_TO_CODEC = {
    ".srt": Codec.SubRip,
    ".vtt": Codec.WebVTT,
    ".ass": Codec.SubStationAlphav4,
    ".ssa": Codec.SubStationAlpha,
    ".ttml": Codec.TimedTextMarkupLang,
    ".smi": Codec.SAMI,
    ".sami": Codec.SAMI,
 }
 def gather(paths: list[str]) -> list[Path]:
    files: list[Path] = []
    for p in paths:
        path = Path(p)
        if path.is_dir():
            files.extend(f for f in path.rglob("*") if f.suffix.lower() in EXT_TO_CODEC)
        elif path.suffix.lower() in EXT_TO_CODEC:
            files.append(path)
    return sorted(files)
 def metrics(text: str) -> tuple[int, int, int]:
    cues = len(re.findall(r"-->", text))
    ass_residue = len(re.findall(r"\{\\", text))
    return cues, ass_residue, len(text)
 def main() -> None:
    ap = argparse.ArgumentParser()
    ap.add_argument("paths", nargs="+", help="subtitle files or directories")
    ap.add_argument("--target", default="SRT", help="target codec value (SRT, VTT, ASS, ...)")
    args = ap.parse_args()
    target = Codec(args.target.upper())
    files = gather(args.paths)
    if not files:
        print("No subtitle files found.")
        return
    tmp = Path(tempfile.mkdtemp(prefix="subbench_"))
    print(f"{'file':40} {'source':10} {'backend':12} {'ok':3} {'cues':>5} {'resid':>5} {'bytes':>7}")
    for f in files:
        source = EXT_TO_CODEC[f.suffix.lower()]
        if source == target:
            continue
        for backend in sc.REGISTRY:
            if not (backend.is_available() and backend.can_convert(source, target)):
                continue
            work = tmp / f"{f.stem}.{backend.name}{f.suffix}"
            shutil.copy2(f, work)
            sub = Subtitle(url="x", language="en", codec=source)
            sub.path = work
            try:
                # Call the backend directly so each row reflects only that backend (no fallback).
                out = work.with_suffix(f".{target.value.lower()}")
                backend.convert(sub, target, out)
                cues, resid, size = metrics(out.read_text("utf8", errors="replace"))
                print(
                    f"{f.name[:40]:40} {source.name[:10]:10} {backend.name:12} {'Y':3} {cues:>5} {resid:>5} {size:>7}"
                )
            except Exception as e:  # noqa: BLE001 - benchmark reports failures, does not raise
                print(
                    f"{f.name[:40]:40} {source.name[:10]:10} {backend.name:12} {'N':3} {'-':>5} {'-':>5} {'-':>7}  {type(e).__name__}"
                )
 if __name__ == "__main__":
    main()
--- a/tests/tracks/fixtures/segmented.wvtt.mp4
+++ b/tests/tracks/fixtures/segmented.wvtt.mp4
--- a/tests/tracks/test_subtitle_convert.py
+++ b/tests/tracks/test_subtitle_convert.py
@@ -0,0 +1,314 @@
 """Tests for the data-driven subtitle conversion registry (``tracks/subtitle_convert.py``).
 Covers three things the refactor must guarantee:
 - the capability matrix resolves the right backend chain per (source, target) and env
  (SubtitleEdit present or not),
 - ``conversion_method`` pins a backend but still falls back (pin-then-fallback),
 - styled SubStation (ASS/SSA) is never auto-downconverted to SRT unless explicitly forced.
 Backends pysubs2/subby/pycaption are hard deps so the conversion paths run in CI without
 SubtitleEdit; SubtitleEdit availability is simulated by patching ``binaries.SubtitleEdit``.
 """
 from __future__ import annotations
 import pathlib
 import re
 import struct
 import pytest
 from unshackle.core import binaries
 from unshackle.core.tracks import subtitle_convert as sc
 from unshackle.core.tracks.subtitle import Subtitle
 Codec = Subtitle.Codec
 VTT_SAMPLE = """WEBVTT
 1
 00:00:01.000 --> 00:00:02.000
 Hello
 2
 00:00:03.000 --> 00:00:04.000
 World
 """
 ASS_SAMPLE = """[Script Info]
 ScriptType: v4.00+
 [V4+ Styles]
 Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
 Style: Default,Arial,20,&H00FFFFFF,&H000000FF,&H00000000,&H00000000,0,0,0,0,100,100,0,0,1,2,1,2,10,10,18,1
 [Events]
 Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
 Dialogue: 0,0:00:01.00,0:00:02.00,Default,,0,0,0,,{\\i1}Hello{\\i0}
 Dialogue: 0,0:00:03.00,0:00:04.00,Default,,0,0,0,,World
 """
@pytest.fixture(autouse=True)
 def _no_subtitleedit(monkeypatch):
    """Default every test to a SubtitleEdit-less environment; tests opt in when needed."""
    monkeypatch.setattr(binaries, "SubtitleEdit", None)
 def make_sub(tmp_path, name: str, text: str, codec: Codec) -> Subtitle:
    path = tmp_path / name
    path.write_text(text, encoding="utf8")
    sub = Subtitle(url="https://example.test/x", language="en", codec=codec)
    sub.path = path
    return sub
 def cue_count(path) -> int:
    return len(re.findall(r"-->", path.read_text("utf8")))
 # --- capability matrix / resolver -------------------------------------------------------
 def test_resolve_webvtt_to_srt_order():
    chain = [b.name for b in sc.resolve_backends(Codec.WebVTT, Codec.SubRip)]
    assert chain == ["subby", "pysubs2", "pycaption"]
 def test_resolve_ass_to_srt_only_pysubs2_without_subtitleedit():
    # subby and pycaption cannot read ASS, so only pysubs2 remains.
    chain = [b.name for b in sc.resolve_backends(Codec.SubStationAlphav4, Codec.SubRip)]
    assert chain == ["pysubs2"]
 def test_subtitleedit_ranks_first_when_available(monkeypatch):
    monkeypatch.setattr(binaries, "SubtitleEdit", "/usr/bin/seconv")
    chain = [b.name for b in sc.resolve_backends(Codec.WebVTT, Codec.SubRip)]
    assert chain[0] == "subtitleedit"
 def test_pin_then_fallback_orders_pin_first():
    chain = [b.name for b in sc.resolve_backends(Codec.WebVTT, Codec.SubRip, pin="pysubs2")]
    assert chain[0] == "pysubs2"
    assert "subby" in chain  # fallbacks remain after the pin
 def test_pin_unavailable_falls_back_to_ranked_chain():
    # subtitleedit pinned but not installed -> just the ranked available backends.
    chain = [b.name for b in sc.resolve_backends(Codec.WebVTT, Codec.SubRip, pin="subtitleedit")]
    assert chain == ["subby", "pysubs2", "pycaption"]
 def test_fallback_runs_when_first_backend_fails(tmp_path, monkeypatch):
    monkeypatch.setattr("unshackle.core.config.config.subtitle", {}, raising=False)
    def boom(self, source, src, target, out):
        raise RuntimeError("backend exploded")
    # WebVTT->SRT chain is [subby, pysubs2, pycaption]; kill subby, expect pysubs2 to finish.
    monkeypatch.setattr(sc.SubbyBackend, "convert", boom)
    sub = make_sub(tmp_path, "x.vtt", VTT_SAMPLE, Codec.WebVTT)
    out = sub.convert(Codec.SubRip, forced=True)
    assert sub.codec == Codec.SubRip
    assert cue_count(out) == 2
 def test_no_backend_for_unsupported_target_raises(tmp_path):
    sub = make_sub(tmp_path, "x.ass", ASS_SAMPLE, Codec.SubStationAlphav4)
    with pytest.raises(NotImplementedError):
        sub.convert(Codec.fVTT, forced=True)  # no backend writes segmented fVTT
 # --- styled-ASS protection --------------------------------------------------------------
 def test_ass_to_srt_kept_as_is_when_not_forced(tmp_path, monkeypatch):
    monkeypatch.setattr("unshackle.core.config.config.subtitle", {}, raising=False)
    sub = make_sub(tmp_path, "x.ass", ASS_SAMPLE, Codec.SubStationAlphav4)
    out = sub.convert(Codec.SubRip, forced=False)
    assert sub.codec == Codec.SubStationAlphav4  # unchanged
    assert out == sub.path
    assert out.suffix == ".ass"
 def test_ass_to_srt_converts_when_forced(tmp_path, monkeypatch):
    monkeypatch.setattr("unshackle.core.config.config.subtitle", {}, raising=False)
    sub = make_sub(tmp_path, "x.ass", ASS_SAMPLE, Codec.SubStationAlphav4)
    out = sub.convert(Codec.SubRip, forced=True)
    assert sub.codec == Codec.SubRip
    assert out.suffix == ".srt"
    assert cue_count(out) == 2
    assert "{\\" not in out.read_text("utf8")  # override tags stripped
 # --- conversion paths -------------------------------------------------------------------
 def test_webvtt_to_srt_conversion(tmp_path, monkeypatch):
    monkeypatch.setattr("unshackle.core.config.config.subtitle", {}, raising=False)
    sub = make_sub(tmp_path, "x.vtt", VTT_SAMPLE, Codec.WebVTT)
    out = sub.convert(Codec.SubRip, forced=True)
    assert sub.codec == Codec.SubRip
    assert cue_count(out) == 2
 def test_same_codec_is_noop(tmp_path):
    sub = make_sub(tmp_path, "x.srt", "1\n00:00:01,000 --> 00:00:02,000\nHi\n", Codec.SubRip)
    assert sub.convert(Codec.SubRip) == sub.path
    assert sub.codec == Codec.SubRip
 # --- ASS/SSA font detection ------------------------------------------------------------
 FONT_ASS = """[Script Info]
 ScriptType: v4.00+
 [V4+ Styles]
 Format: Name, Fontname, Fontsize, PrimaryColour, Bold, Italic, Alignment, MarginV, Encoding
 Style: Default,Trebuchet MS,24,&H00FFFFFF,0,0,2,18,1
 Style: sign,@Arial Unicode MS,20,&H00FFFFFF,0,0,8,10,1
 [Events]
 Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
 Dialogue: 0,0:00:01.00,0:00:02.00,Default,,0,0,0,,{\\fnTimes New Roman}A sign
 Dialogue: 0,0:00:03.00,0:00:04.00,Default,,0,0,0,,{\\fntimes new roman}lower case
 Dialogue: 0,0:00:05.00,0:00:06.00,Default,,0,0,0,,{\\fnGeorgia\\b1}bold note
 """
 def test_extract_fonts_styles_and_inline_overrides():
    fonts = Subtitle.extract_fonts(FONT_ASS)
    # Style fontnames (column located via Format line, @-prefix stripped) + inline \fn overrides
    assert fonts == {"Trebuchet MS", "Arial Unicode MS", "Times New Roman", "Georgia"}
    # case-insensitive de-dup keeps the mixed-case spelling, not "times new roman"
    assert "times new roman" not in fonts
 def test_extract_fonts_handles_non_default_column_order():
    ass = (
        "[V4+ Styles]\n"
        "Format: Name, Fontsize, Fontname, Bold\n"  # Fontname not in the usual position
        "Style: Main,28,Verdana,0\n"
    )
    assert Subtitle.extract_fonts(ass) == {"Verdana"}
 # --- non-Latin scripts (RTL / CJK) preserved through conversion ------------------------
 CJK_RTL_VTT = """WEBVTT
 1
 00:00:01.000 --> 00:00:02.000
 مرحبا بالعالم
 2
 00:00:03.000 --> 00:00:04.000
 안녕하세요
 3
 00:00:05.000 --> 00:00:06.000
 你好世界
 """
@pytest.mark.parametrize(
    "pattern",
    [r"[؀-ۿ]", r"[가-힣]", r"[一-鿿]"],  # Arabic, Hangul, CJK
 )
 def test_non_latin_scripts_survive_vtt_to_srt(tmp_path, monkeypatch, pattern):
    monkeypatch.setattr("unshackle.core.config.config.subtitle", {}, raising=False)
    sub = make_sub(tmp_path, "x.vtt", CJK_RTL_VTT, Codec.WebVTT)
    out = sub.convert(Codec.SubRip, forced=True)
    text = out.read_text("utf8")
    assert cue_count(out) == 3
    assert re.search(pattern, text)  # script survived the conversion, no mojibake
 # --- SDH stripping ----------------------------------------------------------------------
 SDH_SRT = """1
 00:00:01,000 --> 00:00:02,000
 [door creaks]
 2
 00:00:03,000 --> 00:00:04,000
 Hello there.
 3
 00:00:05,000 --> 00:00:06,000
 ♪ upbeat music ♪
 """
 def test_sdh_stripping_removes_effects_keeps_dialogue(tmp_path, monkeypatch):
    # subby's SDHStripper runs on SRT without SubtitleEdit installed.
    monkeypatch.setattr("unshackle.core.config.config.subtitle", {"sdh_method": "subby"}, raising=False)
    sub = make_sub(tmp_path, "x.srt", SDH_SRT, Codec.SubRip)
    sub.strip_hearing_impaired()
    out = sub.path.read_text("utf8")
    assert "Hello there." in out  # real dialogue kept
    assert "door creaks" not in out  # bracketed effect removed (subby SDHStripper)
 # --- segmented (box-encapsulated) formats: fVTT (wvtt) / fTTML (stpp) --------------------
 # These ship from DASH/HLS as fragmented MP4 (e.g. HBO Max). The downloader concatenates
 # init + media segments into one file; parse() reads the MP4 boxes directly.
 FIXTURES = pathlib.Path(__file__).parent / "fixtures"
 def caption_total(caption_set) -> int:
    return sum(len(caption_set.get_captions(lang)) for lang in caption_set.get_languages())
 def build_stpp_mp4(*ttml_fragments: str) -> bytes:
    """A minimal stpp-style MP4: ftyp + one mdat per TTML fragment (what fTTML.parse reads)."""
    def box(box_type: bytes, payload: bytes) -> bytes:
        return struct.pack(">I", 8 + len(payload)) + box_type + payload
    data = box(b"ftyp", b"isom" + struct.pack(">I", 0) + b"isomiso6")
    for frag in ttml_fragments:
        data += box(b"mdat", frag.encode("utf8"))
    return data
 def test_segmented_fvtt_parses_and_converts(tmp_path, monkeypatch):
    monkeypatch.setattr("unshackle.core.config.config.subtitle", {}, raising=False)
    data = (FIXTURES / "segmented.wvtt.mp4").read_bytes()
    caption_set = Subtitle.parse(data, Codec.fVTT)
    assert caption_total(caption_set) == 2
    seg = tmp_path / "seg.wvtt"
    seg.write_bytes(data)
    sub = Subtitle(url="https://example.test/x", language="en", codec=Codec.fVTT)
    sub.path = seg
    # download() converts fVTT -> WebVTT (not "forced"); chain is subby then pycaption.
    out = sub.convert(Codec.WebVTT)
    assert sub.codec == Codec.WebVTT
    assert cue_count(out) == 2
 def test_segmented_fttml_parses_and_converts(tmp_path, monkeypatch):
    monkeypatch.setattr("unshackle.core.config.config.subtitle", {}, raising=False)
    frag = (
        '<?xml version="1.0" encoding="utf-8"?>'
        '<tt xmlns="http://www.w3.org/ns/ttml" xml:lang="en"><body><div>'
        '<p begin="00:00:0{a}.000" end="00:00:0{b}.000">Line {a}</p>'
        "</div></body></tt>"
    )
    data = build_stpp_mp4(frag.format(a=1, b=2), frag.format(a=3, b=4))
    caption_set = Subtitle.parse(data, Codec.fTTML)
    assert caption_total(caption_set) == 2
    seg = tmp_path / "seg.stpp"
    seg.write_bytes(data)
    sub = Subtitle(url="https://example.test/x", language="en", codec=Codec.fTTML)
    sub.path = seg
    # download() converts fTTML -> TTML (only pycaption can read fTTML); then -> SRT.
    sub.convert(Codec.TimedTextMarkupLang)
    assert sub.codec == Codec.TimedTextMarkupLang
    out = sub.convert(Codec.SubRip, forced=True)
    assert cue_count(out) == 2
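For readers unfamiliar with the box layout that `build_stpp_mp4` produces in the tests above, the following sketch walks the top-level boxes of such a file and collects the `mdat` payloads. It is not the project's parser: `iter_boxes` and `mdat_payloads` are hypothetical helpers, and the sketch assumes each `mdat` holds one complete UTF-8 TTML fragment, which real streams are not guaranteed to satisfy.

```python
from __future__ import annotations

import struct
from typing import Iterator


def box(box_type: bytes, payload: bytes) -> bytes:
    """Serialise one box: 32-bit big-endian size, 4-byte type, then the payload."""
    return struct.pack(">I", 8 + len(payload)) + box_type + payload


def iter_boxes(data: bytes) -> Iterator[tuple[bytes, bytes]]:
    """Yield (type, payload) for each top-level box; stop at truncated or malformed data."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:  # a 64-bit "largesize" follows the type field
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:  # box runs to the end of the data
            size = len(data) - offset
        if size < header or offset + size > len(data):
            break
        yield box_type, data[offset + header : offset + size]
        offset += size


def mdat_payloads(data: bytes) -> list[str]:
    """Decode the payload of every top-level mdat box as UTF-8 text."""
    return [payload.decode("utf8") for box_type, payload in iter_boxes(data) if box_type == b"mdat"]
```

Feeding it an `ftyp` box followed by two `mdat` boxes returns the two fragments in order; a file cut off mid-box yields only the boxes that are complete.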
--- a/unshackle/commands/dl.py
+++ b/unshackle/commands/dl.py
@@ -311,7 +311,7 @@ class dl:
                )
                temp_sub.path = temp_path
                try:
-                    temp_sub.convert(target_codec)
+                    temp_sub.convert(target_codec, forced=True)
                    if temp_sub.path and temp_sub.path.exists():
                        shutil.copy2(temp_sub.path, sidecar_path)
                finally:
@@ -528,9 +528,13 @@ class dl:
    @click.option("-S", "--subs-only", is_flag=True, default=False, help="Only download subtitle tracks.")
    @click.option("-C", "--chapters-only", is_flag=True, default=False, help="Only download chapter markers.")
    @click.option("-ns", "--no-subs", is_flag=True, default=False, help="Do not download subtitle tracks.")
-    @click.option("--skip-subtitle-errors", is_flag=True, default=False,
+    @click.option(
+        "--skip-subtitle-errors",
+        is_flag=True,
+        default=False,
+        help="If a subtitle track fails to download, skip it and continue instead of "
-                       "aborting the whole title (video/audio failures stay fatal).")
+        "aborting the whole title (video/audio failures stay fatal).",
+    )
    @click.option("-na", "--no-audio", is_flag=True, default=False, help="Do not download audio tracks.")
    @click.option("-nc", "--no-chapters", is_flag=True, default=False, help="Do not download chapter markers.")
    @click.option("-nv", "--no-video", is_flag=True, default=False, help="Do not download video tracks.")
@@ -2367,18 +2371,16 @@ class dl:
                    for subtitle in title.tracks.subtitles:
                        if sub_format:
                            if subtitle.codec != sub_format:
-                                subtitle.convert(sub_format)
+                                subtitle.convert(sub_format, forced=True)
                        elif subtitle.codec == Subtitle.Codec.TimedTextMarkupLang:
                            # MKV does not support TTML, VTT is the next best option
                            subtitle.convert(Subtitle.Codec.WebVTT)
                with console.status("Checking Subtitles for Fonts..."):
-                    font_names = []
+                    font_names: list[str] = []
                    for subtitle in title.tracks.subtitles:
-                        if subtitle.codec == Subtitle.Codec.SubStationAlphav4:
-                            for line in subtitle.path.read_text("utf8").splitlines():
-                                if line.startswith("Style: "):
-                                    font_names.append(line.removeprefix("Style: ").split(",")[1].strip())
+                        if subtitle.codec in (Subtitle.Codec.SubStationAlpha, Subtitle.Codec.SubStationAlphav4):
+                            font_names.extend(Subtitle.extract_fonts(subtitle.path.read_text("utf8")))
                    font_count, missing_fonts = self.attach_subtitle_fonts(font_names, title, temp_font_files)
--- a/unshackle/core/binaries.py
+++ b/unshackle/core/binaries.py
@@ -37,7 +37,7 @@ def find(*names: str) -> Optional[Path]:
 FFMPEG = find("ffmpeg")
 FFProbe = find("ffprobe")
 FFPlay = find("ffplay")
-SubtitleEdit = find("SubtitleEdit")
+SubtitleEdit = find("SubtitleEdit", "seconv")  # seconv = cross-platform subtitleedit-cli (.NET 8)
 ShakaPackager = find(
    "shaka-packager",
    "packager",
--- a/unshackle/core/tracks/subtitle.py
+++ b/unshackle/core/tracks/subtitle.py
@@ -11,13 +11,12 @@ from pathlib import Path
 from typing import Any, Callable, Iterable, Optional, Union
 import pycaption
 import pysubs2
 import requests
 from construct import Container
 from pycaption import Caption, CaptionList, CaptionNode, WebVTTReader
 from pycaption.geometry import Layout
 from pymp4.parser import MP4
-from subby import CommonIssuesFixer, SAMIConverter, SDHStripper, WebVTTConverter, WVTTConverter
+from subby import CommonIssuesFixer, SAMIConverter, SDHStripper
 from subtitle_filter import Subtitles
 from unshackle.core import binaries
@@ -600,306 +599,73 @@ class Subtitle(Track):
        return "\n".join(sanitized_lines)
-    def convert_with_subby(self, codec: Subtitle.Codec) -> Path:
+    def convert(self, codec: Subtitle.Codec, *, forced: bool = False) -> Path:
        """
-        Convert subtitle using subby library for better format support and processing.
+        Convert this Subtitle to another format.
-        This method leverages subby's advanced subtitle processing capabilities
+        Backend selection is data-driven (see ``tracks/subtitle_convert.py``): the best
-        including better WebVTT handling, SDH stripping, and common issue fixing.
+        available backend that supports source->target is used, falling back through the
        capability chain on failure. The backend can be pinned via the ``conversion_method``
        config key (``auto`` | ``subby`` | ``pysubs2`` | ``subtitleedit`` | ``pycaption``),
        or nudged per-service via ``preferred_conversion_method``; an explicit config value
        always wins.
        ``forced`` marks an explicit user request (``--sub-format``). Lossy downconverts of
        styled formats (SSA/ASS -> SRT) are skipped unless ``forced`` is True.
        """
        from unshackle.core.tracks.subtitle_convert import run_conversion
        if not self.path or not self.path.exists():
            raise ValueError("You must download the subtitle track first.")
-        if self.codec == codec:
+        method = (
-            return self.path
+            config.subtitle.get("conversion_method") or getattr(self, "preferred_conversion_method", None) or "auto"
        )
        pin = None if method == "auto" else method
        return run_conversion(self, codec, pin=pin, forced=forced)
-        output_path = self.path.with_suffix(f".{codec.value.lower()}")
+    @staticmethod
-        original_path = self.path
+    def extract_fonts(text: str) -> set[str]:
        try:
            # Convert to SRT using subby first
            srt_subtitles = None
            if self.codec == Subtitle.Codec.WebVTT:
                converter = WebVTTConverter()
                srt_subtitles = converter.from_file(self.path)
            if self.codec == Subtitle.Codec.fVTT:
                converter = WVTTConverter()
                srt_subtitles = converter.from_file(self.path)
            elif self.codec == Subtitle.Codec.SAMI:
                converter = SAMIConverter()
                srt_subtitles = converter.from_file(self.path)
            if srt_subtitles is not None:
                # Apply common fixes
                fixer = CommonIssuesFixer()
                fixed_srt, _ = fixer.from_srt(srt_subtitles)
                # If target is SRT, we're done
                if codec == Subtitle.Codec.SubRip:
                    fixed_srt.save(output_path, encoding="utf8")
                else:
                    # Convert from SRT to target format using existing pycaption logic
                    temp_srt_path = self.path.with_suffix(".temp.srt")
                    fixed_srt.save(temp_srt_path, encoding="utf8")
                    # Parse the SRT and convert to target format
                    caption_set = self.parse(temp_srt_path.read_bytes(), Subtitle.Codec.SubRip)
                    self.merge_same_cues(caption_set)
                    writer = {
                        Subtitle.Codec.TimedTextMarkupLang: pycaption.DFXPWriter,
                        Subtitle.Codec.WebVTT: pycaption.WebVTTWriter,
                    }.get(codec)
                    if writer:
                        subtitle_text = writer().write(caption_set)
                        output_path.write_text(subtitle_text, encoding="utf8")
                    else:
                        # Fall back to existing conversion method
                        temp_srt_path.unlink()
                        return self._convert_standard(codec)
                    temp_srt_path.unlink()
                if original_path.exists() and original_path != output_path:
                    original_path.unlink()
                self.path = output_path
                self.codec = codec
                if callable(self.OnConverted):
                    self.OnConverted(codec)
                return output_path
            else:
                # Fall back to existing conversion method
                return self._convert_standard(codec)
        except Exception:
            # Fall back to existing conversion method on any error
            return self._convert_standard(codec)
    def convert_with_pysubs2(self, codec: Subtitle.Codec) -> Path:
        """
-        Convert subtitle using pysubs2 library for broad format support.
-        pysubs2 is a pure-Python library supporting SubRip (SRT), SubStation Alpha
-        (SSA/ASS), WebVTT, TTML, SAMI, MicroDVD, MPL2, and TMP formats.
+        Font names referenced by an ASS/SSA subtitle.
+        Covers both sources that need attaching for correct rendering:
+        - the ``Fontname`` column of every ``Style:`` line in ``[V4+ Styles]``/``[V4 Styles]``
+          (column located from the section's ``Format:`` line, not assumed by index), and
+        - inline ``\\fn`` font overrides inside ``Dialogue`` override blocks.
+        Leading ``@`` (vertical-writing prefix) is stripped and names are de-duplicated
+        case-insensitively, preferring a mixed-case spelling over an all-lowercase one.
        """
-        if not self.path or not self.path.exists():
-            raise ValueError("You must download the subtitle track first.")
+        names: set[str] = set()
+        name_index = 1  # ASS default Style order: Name, Fontname, ...
+        in_styles = False
+        for line in text.splitlines():
+            stripped = line.strip()
+            if stripped.startswith("["):
+                in_styles = stripped.lower() in ("[v4+ styles]", "[v4 styles]")
+                continue
+            if not in_styles:
+                continue
+            if stripped.lower().startswith("format:"):
+                columns = [c.strip().lower() for c in stripped.split(":", 1)[1].split(",")]
+                if "fontname" in columns:
+                    name_index = columns.index("fontname")
+            elif stripped.lower().startswith("style:"):
+                fields = stripped.split(":", 1)[1].split(",")
+                if len(fields) > name_index:
+                    names.add(fields[name_index].strip())
-        if self.codec == codec:
-            return self.path
+        names.update(match.strip() for match in re.findall(r"\\fn([^\\}]+)", text))
-        output_path = self.path.with_suffix(f".{codec.value.lower()}")
-        original_path = self.path
-
-        codec_to_pysubs2_format = {
-            Subtitle.Codec.SubRip: "srt",
-            Subtitle.Codec.SubStationAlpha: "ssa",
-            Subtitle.Codec.SubStationAlphav4: "ass",
-            Subtitle.Codec.WebVTT: "vtt",
-            Subtitle.Codec.TimedTextMarkupLang: "ttml",
-            Subtitle.Codec.SAMI: "sami",
-            Subtitle.Codec.MicroDVD: "microdvd",
-            Subtitle.Codec.MPL2: "mpl2",
-            Subtitle.Codec.TMP: "tmp",
-        }
-        pysubs2_output_format = codec_to_pysubs2_format.get(codec)
-        if pysubs2_output_format is None:
-            return self._convert_standard(codec)
-        try:
-            subs = pysubs2.load(str(self.path), encoding="utf-8")
-            subs.save(str(output_path), format_=pysubs2_output_format, encoding="utf-8")
-            if original_path.exists() and original_path != output_path:
-                original_path.unlink()
-            self.path = output_path
-            self.codec = codec
-            if callable(self.OnConverted):
-                self.OnConverted(codec)
-            return output_path
-        except Exception:
-            return self._convert_standard(codec)
+        canonical: dict[str, str] = {}
+        for name in (raw.lstrip("@").strip() for raw in names):
+            if not name:
+                continue
+            key = name.lower()
+            if key not in canonical or (name != name.lower() and canonical[key] == canonical[key].lower()):
+                canonical[key] = name
+        return set(canonical.values())
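The font-name detection added earlier in this hunk can be tried in isolation. The sketch below copies the added lines into a standalone function; the name `ass_font_names` and the sample script are assumptions made for illustration, since the real method signature is not visible in this hunk.

```python
import re


def ass_font_names(text: str) -> set[str]:
    """Standalone copy of the added font detection (function name is hypothetical)."""
    names: set[str] = set()
    name_index = 1  # ASS default Style order: Name, Fontname, ...
    in_styles = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            in_styles = stripped.lower() in ("[v4+ styles]", "[v4 styles]")
            continue
        if not in_styles:
            continue
        if stripped.lower().startswith("format:"):
            # Locate the Fontname column from the Format line instead of assuming index 1.
            columns = [c.strip().lower() for c in stripped.split(":", 1)[1].split(",")]
            if "fontname" in columns:
                name_index = columns.index("fontname")
        elif stripped.lower().startswith("style:"):
            fields = stripped.split(":", 1)[1].split(",")
            if len(fields) > name_index:
                names.add(fields[name_index].strip())
    # Inline \fn overrides anywhere in the script.
    names.update(m.strip() for m in re.findall(r"\\fn([^\\}]+)", text))
    canonical: dict[str, str] = {}
    for name in (raw.lstrip("@").strip() for raw in names):
        if not name:
            continue
        key = name.lower()
        # Prefer a mixed-case spelling over an all-lowercase duplicate.
        if key not in canonical or (name != name.lower() and canonical[key] == canonical[key].lower()):
            canonical[key] = name
    return set(canonical.values())


# Made-up sample script: one plain style, one vertical (@) style, two inline overrides.
SAMPLE = r"""[V4+ Styles]
Format: Name, Fontname, Fontsize
Style: Default,Arial,48
Style: Vertical,@Noto Sans JP,48

[Events]
Format: Layer, Start, End, Style, Text
Dialogue: 0,0:00:01.00,0:00:02.00,Default,{\fnarial\b1}Hello
Dialogue: 0,0:00:03.00,0:00:04.00,Default,{\fnComic Sans MS}Hi
"""

fonts = ass_font_names(SAMPLE)
print(sorted(fonts))
```

For this sample the `@` prefix is stripped from the vertical style, and the lowercase inline `\fnarial` collapses into the `Arial` already declared in the style table.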
    def convert(self, codec: Subtitle.Codec) -> Path:
        """
        Convert this Subtitle to another Format.
        The conversion method is determined by the 'conversion_method' setting in config:
        - 'auto' (default): Uses subby for WebVTT/fVTT/SAMI; for SSA/ASS/MicroDVD/MPL2/TMP
          uses SubtitleEdit if available, otherwise pysubs2; standard for others
        - 'subby': Always uses subby with CommonIssuesFixer
        - 'subtitleedit': Uses SubtitleEdit when available, falls back to pycaption
        - 'pycaption': Uses only pycaption library
        - 'pysubs2': Uses pysubs2 library
        """
        # Check configuration for conversion method
        conversion_method = config.subtitle.get("conversion_method", "auto")
        if conversion_method == "subby":
            return self.convert_with_subby(codec)
        elif conversion_method == "subtitleedit":
            return self._convert_standard(codec)
        elif conversion_method == "pycaption":
            return self._convert_pycaption_only(codec)
        elif conversion_method == "pysubs2":
            return self.convert_with_pysubs2(codec)
        elif conversion_method == "auto":
            if self.codec in (Subtitle.Codec.WebVTT, Subtitle.Codec.fVTT, Subtitle.Codec.SAMI):
                return self.convert_with_subby(codec)
            elif self.codec in (
                Subtitle.Codec.SubStationAlpha,
                Subtitle.Codec.SubStationAlphav4,
                Subtitle.Codec.MicroDVD,
                Subtitle.Codec.MPL2,
                Subtitle.Codec.TMP,
            ):
                if binaries.SubtitleEdit:
                    return self._convert_standard(codec)
                else:
                    return self.convert_with_pysubs2(codec)
            else:
                return self._convert_standard(codec)
        else:
            return self._convert_standard(codec)
    def _convert_pycaption_only(self, codec: Subtitle.Codec) -> Path:
        """
        Convert subtitle using only pycaption library (no SubtitleEdit, no subby).
        This is the original conversion method that only uses pycaption.
        """
        if not self.path or not self.path.exists():
            raise ValueError("You must download the subtitle track first.")
        if self.codec == codec:
            return self.path
        output_path = self.path.with_suffix(f".{codec.value.lower()}")
        original_path = self.path
        # Use only pycaption for conversion
        writer = {
            Subtitle.Codec.SubRip: pycaption.SRTWriter,
            Subtitle.Codec.TimedTextMarkupLang: pycaption.DFXPWriter,
            Subtitle.Codec.WebVTT: pycaption.WebVTTWriter,
        }.get(codec)
        if writer is None:
            raise NotImplementedError(f"Cannot convert {self.codec.name} to {codec.name} using pycaption only.")
        caption_set = self.parse(self.path.read_bytes(), self.codec)
        Subtitle.merge_same_cues(caption_set)
        if codec == Subtitle.Codec.WebVTT:
            Subtitle.filter_unwanted_cues(caption_set)
        subtitle_text = writer().write(caption_set)
        output_path.write_text(subtitle_text, encoding="utf8")
        if original_path.exists() and original_path != output_path:
            original_path.unlink()
        self.path = output_path
        self.codec = codec
        if callable(self.OnConverted):
            self.OnConverted(codec)
        return output_path
    def _convert_standard(self, codec: Subtitle.Codec) -> Path:
        """
        Convert this Subtitle to another Format.
        The file path location of the Subtitle data will be kept at the same
        location but the file extension will be changed appropriately.
        Supported formats:
        - SubRip - SubtitleEdit or pycaption.SRTWriter
        - TimedTextMarkupLang - SubtitleEdit or pycaption.DFXPWriter
        - WebVTT - SubtitleEdit or pycaption.WebVTTWriter
        - SubStationAlphav4 - SubtitleEdit
        - SAMI - subby.SAMIConverter (when available)
        - fTTML* - custom code using some pycaption functions
        - fVTT* - custom code using some pycaption functions
        *: Can read from format, but cannot convert to format
        Note: It currently prioritizes using SubtitleEdit over PyCaption as
        I have personally noticed more oddities with PyCaption parsing over
        SubtitleEdit. Especially when working with TTML/DFXP where it would
        often have timecodes and stuff mixed in/duplicated.
        Returns the new file path of the Subtitle.
        """
        if not self.path or not self.path.exists():
            raise ValueError("You must download the subtitle track first.")
        if self.codec == codec:
            return self.path
        output_path = self.path.with_suffix(f".{codec.value.lower()}")
        original_path = self.path
        if binaries.SubtitleEdit and self.codec not in (Subtitle.Codec.fTTML, Subtitle.Codec.fVTT):
            sub_edit_format = {
                Subtitle.Codec.SubRip: "subrip",
                Subtitle.Codec.SubStationAlpha: "substationalpha",
                Subtitle.Codec.SubStationAlphav4: "advancedsubstationalpha",
                Subtitle.Codec.TimedTextMarkupLang: "timedtext1.0",
                Subtitle.Codec.WebVTT: "webvtt",
                Subtitle.Codec.SAMI: "sami",
                Subtitle.Codec.MicroDVD: "microdvd",
            }.get(codec, codec.name.lower())
            sub_edit_args = [
                str(binaries.SubtitleEdit),
                "/convert",
                str(self.path),
                sub_edit_format,
                f"/outputfilename:{output_path.name}",
                "/encoding:utf8",
            ]
            if codec == Subtitle.Codec.SubRip:
                sub_edit_args.append("/ConvertColorsToDialog")
            subprocess.run(sub_edit_args, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        else:
            writer = {
                # pycaption generally only supports these subtitle formats
                Subtitle.Codec.SubRip: pycaption.SRTWriter,
                Subtitle.Codec.TimedTextMarkupLang: pycaption.DFXPWriter,
                Subtitle.Codec.WebVTT: pycaption.WebVTTWriter,
            }.get(codec)
            if writer is None:
                raise NotImplementedError(f"Cannot yet convert {self.codec.name} to {codec.name}.")
            caption_set = self.parse(self.path.read_bytes(), self.codec)
            Subtitle.merge_same_cues(caption_set)
            if codec == Subtitle.Codec.WebVTT:
                Subtitle.filter_unwanted_cues(caption_set)
            subtitle_text = writer().write(caption_set)
            output_path.write_text(subtitle_text, encoding="utf8")
        if original_path.exists() and original_path != output_path:
            original_path.unlink()
        self.path = output_path
        self.codec = codec
        if callable(self.OnConverted):
            self.OnConverted(codec)
        return output_path
    @staticmethod
    def parse(data: bytes, codec: Subtitle.Codec) -> pycaption.CaptionSet:
@@ -1267,25 +1033,13 @@ class Subtitle(Track):
        )
        if binaries.SubtitleEdit and use_subtitleedit:
-            output_format = {
-                Subtitle.Codec.SubRip: "subrip",
-                Subtitle.Codec.SubStationAlpha: "substationalpha",
-                Subtitle.Codec.SubStationAlphav4: "advancedsubstationalpha",
-                Subtitle.Codec.TimedTextMarkupLang: "timedtext1.0",
-                Subtitle.Codec.WebVTT: "webvtt",
-                Subtitle.Codec.SAMI: "sami",
-                Subtitle.Codec.MicroDVD: "microdvd",
-            }.get(self.codec, self.codec.name.lower())
+            from unshackle.core.tracks.subtitle_convert import SUBTITLE_EDIT_FORMATS, subtitleedit_args
+
+            output_format = SUBTITLE_EDIT_FORMATS.get(self.codec, self.codec.name.lower())
            subprocess.run(
-                [
-                    str(binaries.SubtitleEdit),
-                    "/convert",
-                    str(self.path),
-                    output_format,
-                    "/encoding:utf8",
-                    "/overwrite",
-                    "/RemoveTextForHI",
-                ],
+                subtitleedit_args(
+                    binaries.SubtitleEdit, self.path, output_format, output_folder=self.path.parent, remove_hi=True
+                ),
                check=True,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
@@ -1330,26 +1084,14 @@ class Subtitle(Track):
        if not binaries.SubtitleEdit:
            raise EnvironmentError("SubtitleEdit executable not found...")
-        output_format = {
-            Subtitle.Codec.SubRip: "subrip",
-            Subtitle.Codec.SubStationAlpha: "substationalpha",
-            Subtitle.Codec.SubStationAlphav4: "advancedsubstationalpha",
-            Subtitle.Codec.TimedTextMarkupLang: "timedtext1.0",
-            Subtitle.Codec.WebVTT: "webvtt",
-            Subtitle.Codec.SAMI: "sami",
-            Subtitle.Codec.MicroDVD: "microdvd",
-        }.get(self.codec, self.codec.name.lower())
+        from unshackle.core.tracks.subtitle_convert import SUBTITLE_EDIT_FORMATS, subtitleedit_args
+
+        output_format = SUBTITLE_EDIT_FORMATS.get(self.codec, self.codec.name.lower())
        subprocess.run(
-            [
-                str(binaries.SubtitleEdit),
-                "/convert",
-                str(self.path),
-                output_format,
-                "/ReverseRtlStartEnd",
-                "/encoding:utf8",
-                "/overwrite",
-            ],
+            subtitleedit_args(
+                binaries.SubtitleEdit, self.path, output_format, output_folder=self.path.parent, reverse_rtl=True
+            ),
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
--- a/unshackle/core/tracks/subtitle_convert.py
+++ b/unshackle/core/tracks/subtitle_convert.py
@@ -0,0 +1,313 @@
 """
 Subtitle conversion backend registry.
 Routing is data-driven: each backend declares which (source -> target) codec pairs it can
 read/write, whether it is available in the current environment, and a preference rank.
 ``resolve_backends`` filters the registry to the available backends that support the
 requested pair and orders them by rank; ``run_conversion`` tries each in turn (a real
 fallback chain) until one succeeds.
 The public entry point stays ``Subtitle.convert`` / ``Subtitle.strip_hearing_impaired`` in
 subtitle.py — this module only holds the selection + conversion logic so subtitle.py keeps
 the codec enum, ``parse``, sanitizers and cue helpers (the collaborators backends reuse).
 """
 from __future__ import annotations
 import logging
 import subprocess
 from pathlib import Path
 from typing import Optional, Protocol
 import pycaption
 import pysubs2
 from subby import CommonIssuesFixer, SAMIConverter, WebVTTConverter, WVTTConverter
 from unshackle.core import binaries
 from unshackle.core.tracks.subtitle import Subtitle
 log = logging.getLogger("subtitle")
 Codec = Subtitle.Codec
 # Format names accepted by the SubtitleEdit / seconv CLI (the positional <format> argument).
 # Shared by SubtitleEditBackend, strip_hearing_impaired and reverse_rtl so the map lives once.
 SUBTITLE_EDIT_FORMATS: dict[Codec, str] = {
    Codec.SubRip: "subrip",
    Codec.SubStationAlpha: "substationalpha",
    Codec.SubStationAlphav4: "advancedsubstationalpha",
    Codec.TimedTextMarkupLang: "timedtext1.0",
    Codec.WebVTT: "webvtt",
    Codec.SAMI: "sami",
    Codec.MicroDVD: "microdvd",
 }
 # pycaption can only WRITE these three formats.
 PYCAPTION_WRITERS = {
    Codec.SubRip: pycaption.SRTWriter,
    Codec.TimedTextMarkupLang: pycaption.DFXPWriter,
    Codec.WebVTT: pycaption.WebVTTWriter,
 }
 # pysubs2 format identifiers per codec.
 PYSUBS2_FORMATS: dict[Codec, str] = {
    Codec.SubRip: "srt",
    Codec.SubStationAlpha: "ssa",
    Codec.SubStationAlphav4: "ass",
    Codec.WebVTT: "vtt",
    Codec.TimedTextMarkupLang: "ttml",
    Codec.SAMI: "sami",
    Codec.MicroDVD: "microdvd",
    Codec.MPL2: "mpl2",
    Codec.TMP: "tmp",
 }
 def subtitleedit_args(
    binary: object,
    src: Path,
    fmt: str,
    *,
    output_folder: Optional[Path] = None,
    convert_colors: bool = False,
    remove_hi: bool = False,
    reverse_rtl: bool = False,
 ) -> list[str]:
    """
    Build a SubtitleEdit batch-convert command.
    Targets the SubtitleEdit 5+ CLI (``SeConv`` / ``seconv`` on every platform), which takes
    ``--flags`` with a positional ``<pattern> <format>`` (no legacy ``/convert`` verb). The
    SE5 converter names the output ``<input-stem>.<format-ext>``; pass ``output_folder`` to
    place it next to a chosen path (a bare ``--output-filename`` resolves against the *cwd*,
    not the input dir, so we always steer with ``--output-folder``). ``--overwrite`` is always
    set so re-runs and in-place transforms (SDH/RTL) don't fail on an existing file.
    """
    args = [str(binary), str(src), fmt, "--encoding:utf-8", "--overwrite"]
    if output_folder is not None:
        args.append(f"--output-folder:{output_folder}")
    if convert_colors:
        args.append("--convert-colors-to-dialog")
    if remove_hi:
        args.append("--remove-text-for-hi")
    if reverse_rtl:
        args.append("--reverse-rtl-start-end")
    return args
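As a quick illustration of the command line this helper builds, the sketch below repeats the function body from the diff so it runs standalone and prints the argv for an SDH strip. The binary name `seconv` and the sample path are assumptions; in the project the binary comes from `binaries.SubtitleEdit`.

```python
from pathlib import Path
from typing import Optional


def subtitleedit_args(
    binary: object,
    src: Path,
    fmt: str,
    *,
    output_folder: Optional[Path] = None,
    convert_colors: bool = False,
    remove_hi: bool = False,
    reverse_rtl: bool = False,
) -> list[str]:
    # Same body as the helper in this diff, copied so the example is self-contained.
    args = [str(binary), str(src), fmt, "--encoding:utf-8", "--overwrite"]
    if output_folder is not None:
        args.append(f"--output-folder:{output_folder}")
    if convert_colors:
        args.append("--convert-colors-to-dialog")
    if remove_hi:
        args.append("--remove-text-for-hi")
    if reverse_rtl:
        args.append("--reverse-rtl-start-end")
    return args


# Hypothetical input file; "seconv" stands in for the resolved binary path.
src = Path("subs") / "episode.en.srt"
sdh_args = subtitleedit_args("seconv", src, "subrip", output_folder=src.parent, remove_hi=True)
print(sdh_args)
```

The positional `<pattern> <format>` pair always comes right after the binary, and the optional flags are appended in a fixed order, which keeps the three call sites (convert, SDH strip, reverse-RTL) consistent.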
 # Styled SubStation formats flattened to SRT lose positioning/colours/italics.
 # Never performed automatically — only when the user explicitly forces a target format.
 LOSSY_DOWNCONVERTS: frozenset[tuple[Codec, Codec]] = frozenset(
    {
        (Codec.SubStationAlpha, Codec.SubRip),
        (Codec.SubStationAlphav4, Codec.SubRip),
    }
 )
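A minimal sketch of how this set gates automatic conversion, assuming a stand-in `Codec` enum (its values here are placeholders, not the real `Subtitle.Codec` values) and a hypothetical helper `should_skip`; the real check lives inline in `run_conversion`.

```python
from enum import Enum


class Codec(Enum):
    # Stand-in for Subtitle.Codec; the values are placeholders for this sketch.
    SubRip = "SRT"
    SubStationAlpha = "SSA"
    SubStationAlphav4 = "ASS"
    WebVTT = "VTT"


LOSSY_DOWNCONVERTS = frozenset(
    {
        (Codec.SubStationAlpha, Codec.SubRip),
        (Codec.SubStationAlphav4, Codec.SubRip),
    }
)


def should_skip(source: Codec, target: Codec, *, forced: bool = False) -> bool:
    """True when the pair is a lossy downconvert that was not explicitly requested."""
    return (source, target) in LOSSY_DOWNCONVERTS and not forced


print(should_skip(Codec.SubStationAlphav4, Codec.SubRip))               # automatic: kept as-is
print(should_skip(Codec.SubStationAlphav4, Codec.SubRip, forced=True))  # explicit request: converted
print(should_skip(Codec.SubStationAlphav4, Codec.WebVTT))               # not in the lossy set
```

Keying the set on ordered `(source, target)` tuples means only the ASS/SSA to SRT direction is protected; the reverse direction and every other pair pass straight through.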
 class SubtitleBackend(Protocol):
    name: str
    def is_available(self) -> bool: ...
    def can_convert(self, source: Codec, target: Codec) -> bool: ...
    def rank(self, source: Codec, target: Codec) -> int: ...
    def convert(self, source: Codec, src: Path, target: Codec, out: Path) -> None:
        """Convert ``src`` (a ``source`` file) to ``target``, writing to ``out``. Raise on failure."""
        ...
 class SubtitleEditBackend:
    """SubtitleEdit / seconv CLI. Highest fidelity (keeps positioning + italics) when present."""
    name = "subtitleedit"
    reads = frozenset(SUBTITLE_EDIT_FORMATS)
    writes = frozenset(SUBTITLE_EDIT_FORMATS)
    def is_available(self) -> bool:
        return bool(binaries.SubtitleEdit)
    def can_convert(self, source: Codec, target: Codec) -> bool:
        # Segmented box formats cannot be read by SubtitleEdit.
        if source in (Codec.fTTML, Codec.fVTT):
            return False
        return source in self.reads and target in self.writes
    def rank(self, source: Codec, target: Codec) -> int:
        return 0
    def convert(self, source: Codec, src: Path, target: Codec, out: Path) -> None:
        args = subtitleedit_args(
            binaries.SubtitleEdit,
            src,
            SUBTITLE_EDIT_FORMATS[target],
            output_folder=out.parent,
            convert_colors=(target == Codec.SubRip),
        )
        subprocess.run(args, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        # SE5 names the output <input-stem>.<format-ext>, which may differ from our target
        # suffix (e.g. timedtext1.0 -> .ttml). Normalise it onto `out`.
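        if not out.exists():
            # Compare stems directly instead of globbing, so file names containing glob
            # metacharacters such as "[" or "]" still resolve.
            produced = next((p for p in out.parent.iterdir() if p.stem == src.stem and p not in (src, out)), None)
            if produced is None:
                raise FileNotFoundError(f"SubtitleEdit produced no output for {src.name} -> {target.name}")
            produced.replace(out)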
 class Pysubs2Backend:
    """pysubs2 — pure Python, broad format support, best fidelity for SSA/ASS (native style model)."""
    name = "pysubs2"
    formats = frozenset(PYSUBS2_FORMATS)
    def is_available(self) -> bool:
        return True
    def can_convert(self, source: Codec, target: Codec) -> bool:
        return source in self.formats and target in self.formats
    def rank(self, source: Codec, target: Codec) -> int:
        # Preferred reader for styled SubStation sources; solid general fallback otherwise.
        return 1 if source in (Codec.SubStationAlpha, Codec.SubStationAlphav4) else 2
    def convert(self, source: Codec, src: Path, target: Codec, out: Path) -> None:
        subs = pysubs2.load(str(src), encoding="utf-8")
        subs.save(str(out), format_=PYSUBS2_FORMATS[target], encoding="utf-8")
 class SubbyBackend:
    """subby — purpose-built for streaming subs. WebVTT/fVTT/SAMI -> SRT + CommonIssuesFixer cleanup."""
    name = "subby"
    reads = frozenset({Codec.WebVTT, Codec.fVTT, Codec.SAMI})
    # Native SRT output; non-SRT targets re-encoded from the SRT intermediate via pycaption.
    writes = frozenset({Codec.SubRip, Codec.TimedTextMarkupLang, Codec.WebVTT})
    converters = {
        Codec.WebVTT: WebVTTConverter,
        Codec.fVTT: WVTTConverter,
        Codec.SAMI: SAMIConverter,
    }
    def is_available(self) -> bool:
        return True
    def can_convert(self, source: Codec, target: Codec) -> bool:
        return source in self.reads and target in self.writes
    def rank(self, source: Codec, target: Codec) -> int:
        # Great for *->SRT (adds cleanup); the SRT intermediate is lossy for other targets.
        return 1 if target == Codec.SubRip else 5
    def convert(self, source: Codec, src: Path, target: Codec, out: Path) -> None:
        srt_subtitles = self.converters[source]().from_file(src)
        fixed_srt, _ = CommonIssuesFixer().from_srt(srt_subtitles)
        if target == Codec.SubRip:
            fixed_srt.save(out, encoding="utf8")
            return
        temp_srt = src.with_suffix(".temp.srt")
        fixed_srt.save(temp_srt, encoding="utf8")
        try:
            caption_set = Subtitle.parse(temp_srt.read_bytes(), Codec.SubRip)
            Subtitle.merge_same_cues(caption_set)
            out.write_text(PYCAPTION_WRITERS[target]().write(caption_set), encoding="utf8")
        finally:
            temp_srt.unlink(missing_ok=True)
 class PycaptionBackend:
    """pycaption — last resort. Note: flattens positioning/italics (devine #39), so ranked last."""
    name = "pycaption"
    reads = frozenset({Codec.SubRip, Codec.TimedTextMarkupLang, Codec.WebVTT, Codec.SAMI, Codec.fTTML, Codec.fVTT})
    writes = frozenset(PYCAPTION_WRITERS)
    def is_available(self) -> bool:
        return True
    def can_convert(self, source: Codec, target: Codec) -> bool:
        return source in self.reads and target in self.writes
    def rank(self, source: Codec, target: Codec) -> int:
        return 9
    def convert(self, source: Codec, src: Path, target: Codec, out: Path) -> None:
        caption_set = Subtitle.parse(src.read_bytes(), source)
        Subtitle.merge_same_cues(caption_set)
        if target == Codec.WebVTT:
            Subtitle.filter_unwanted_cues(caption_set)
        out.write_text(PYCAPTION_WRITERS[target]().write(caption_set), encoding="utf8")
 REGISTRY: list[SubtitleBackend] = [
    SubtitleEditBackend(),
    SubbyBackend(),
    Pysubs2Backend(),
    PycaptionBackend(),
 ]
 def resolve_backends(source: Codec, target: Codec, *, pin: Optional[str] = None) -> list[SubtitleBackend]:
    """Available backends that support source->target, ordered by rank. A pin is tried first."""
    available = [b for b in REGISTRY if b.is_available() and b.can_convert(source, target)]
    if pin:
        pinned = [b for b in available if b.name == pin]
        rest = sorted((b for b in available if b.name != pin), key=lambda b: b.rank(source, target))
        return pinned + rest
    return sorted(available, key=lambda b: b.rank(source, target))
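The pin-then-fallback ordering can be sketched with stub backends. `StubBackend` and `resolve` are hypothetical stand-ins that mirror the filtering and sorting above; the ranks are the ones this diff assigns for a WebVTT to SRT conversion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StubBackend:
    # Hypothetical stand-in: a fixed rank and availability instead of per-pair methods.
    name: str
    rank: int
    available: bool = True


def resolve(backends: list[StubBackend], pin: Optional[str] = None) -> list[str]:
    """Mirror of resolve_backends: pinned backend first, the rest ordered by rank."""
    usable = [b for b in backends if b.available]
    pinned = [b for b in usable if b.name == pin]
    rest = sorted((b for b in usable if b.name != pin), key=lambda b: b.rank)
    return [b.name for b in pinned + rest]


# Ranks for WebVTT -> SRT as declared by the backends in this diff.
webvtt_to_srt = [
    StubBackend("subtitleedit", 0),
    StubBackend("subby", 1),
    StubBackend("pysubs2", 2),
    StubBackend("pycaption", 9),
]

auto_order = resolve(webvtt_to_srt)
pinned_order = resolve(webvtt_to_srt, pin="pysubs2")

# A pinned backend that is unavailable simply drops out; the rest still run by rank.
without_se = [StubBackend("subtitleedit", 0, available=False), *webvtt_to_srt[1:]]
missing_pin_order = resolve(without_se, pin="subtitleedit")

print(auto_order)
print(pinned_order)
print(missing_pin_order)
```

Because the pinned backend is only moved to the front rather than used exclusively, a pin never reduces the set of backends that get a chance to run.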
 def finalize(sub: Subtitle, target: Codec, out: Path) -> Path:
    """Swap the track onto the converted file and fire the OnConverted callback."""
    original = sub.path
    if original and original.exists() and original != out:
        original.unlink()
    sub.path = out
    sub.codec = target
    if callable(sub.OnConverted):
        sub.OnConverted(target)
    return out
 def run_conversion(sub: Subtitle, target: Codec, *, pin: Optional[str] = None, forced: bool = False) -> Path:
    """
    Convert ``sub`` to ``target`` using the best available backend, falling back through the
    capability chain on failure.
    ``forced`` is True only for explicit user requests (``--sub-format``); lossy downconverts
    (styled SubStation -> SRT) are skipped unless forced.
    """
    if sub.path is None or not sub.path.exists():
        raise ValueError("You must download the subtitle track first.")
    if sub.codec is None:
        raise ValueError("Subtitle has no codec to convert from.")
    source, src = sub.codec, sub.path
    if source == target:
        return src
    if (source, target) in LOSSY_DOWNCONVERTS and not forced:
        log.info(
            f"Keeping {source.name} subtitle as-is "
            f"(skipping lossy auto-conversion to {target.name}; pass --sub-format to force)"
        )
        return src
    chain = resolve_backends(source, target, pin=pin)
    if not chain:
        raise NotImplementedError(f"Cannot convert {source.name} to {target.name}.")
    out = src.with_suffix(f".{target.value.lower()}")
    last_exc: Optional[Exception] = None
    for backend in chain:
        try:
            backend.convert(source, src, target, out)
        except Exception as e:
            last_exc = e
            log.debug(f"Subtitle backend {backend.name} failed ({source.name}->{target.name}): {e}")
            continue
        log.debug(f"Converted subtitle {source.name}->{target.name} via {backend.name}")
        return finalize(sub, target, out)
    raise RuntimeError(f"All subtitle backends failed for {source.name}->{target.name}") from last_exc
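The fallback loop in `run_conversion` reduces to the pattern below. This is a simplified sketch: `run_chain`, `broken` and `working` are hypothetical names, and the backends are plain callables rather than objects writing files.

```python
from typing import Callable, Optional


def run_chain(chain: list[tuple[str, Callable[[], str]]]) -> tuple[str, str]:
    """Try each (name, convert) pair in order; return the first success."""
    last_exc: Optional[Exception] = None
    for name, convert in chain:
        try:
            result = convert()
        except Exception as exc:
            # Remember the failure and move on to the next backend.
            last_exc = exc
            continue
        return name, result
    # Every backend failed: surface the last error as the cause.
    raise RuntimeError("All subtitle backends failed") from last_exc


def broken() -> str:
    raise ValueError("unsupported cue")


def working() -> str:
    return "converted"


used, result = run_chain([("subtitleedit", broken), ("pysubs2", working)])
print(used, result)

try:
    run_chain([("subtitleedit", broken)])
except RuntimeError as exc:
    cause = exc.__cause__
    print(type(cause).__name__)
```

Chaining with `raise ... from last_exc` keeps the final backend's error attached to the `RuntimeError`, so the debug log and the traceback point at the same failure.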
--- a/unshackle/unshackle-example.yaml
+++ b/unshackle/unshackle-example.yaml
@@ -448,30 +448,41 @@ filenames:
 # - pysubs2: Use pysubs2 library (supports SRT/SSA/ASS/WebVTT/TTML/SAMI/MicroDVD/MPL2/TMP)
 subtitle:
  conversion_method: auto
-  # sdh_method: Method to use for SDH (hearing impaired) stripping
+  # Which backend converts subtitles (data-driven registry, pin-then-fallback)
  # - auto (default): best available by rank (SubtitleEdit > subby/pysubs2 > pycaption)
  # - subby | pysubs2 | subtitleedit | pycaption: pin that backend first, still falls back
  # Styled ASS/SSA are never auto-downconverted to SRT (kept as-is); --sub-format srt overrides.
  # SubtitleEdit on Linux/macOS = install the SE5 "SeConv" (seconv) CLI on PATH or unshackle/binaries/.
  sdh_method: auto
  # Method to use for SDH (hearing impaired) stripping
  # - auto (default): Try subby (SRT only), then SubtitleEdit (if available), then subtitle-filter
  # - subby: Use subby library (SRT only)
-  # - subtitleedit: Use SubtitleEdit tool (Windows only, falls back to subtitle-filter)
+  # - subtitleedit: Use SubtitleEdit / seconv (SE5 CLI, cross-platform), falls back to subtitle-filter
  # - filter-subs: Use subtitle-filter library directly
-  sdh_method: auto
+
  # strip_sdh: Automatically create stripped (non-SDH) versions of SDH subtitles
  # Set to false to disable automatic SDH stripping entirely (default: true)
  strip_sdh: true
-  # convert_before_strip: Auto-convert VTT/other formats to SRT before using subtitle-filter
+  # Automatically create stripped (non-SDH) versions of SDH subtitles
-  # This ensures compatibility when subtitle-filter is used as fallback (default: true)
+  # Set to false to disable automatic SDH stripping entirely (default: true)
  convert_before_strip: true
-  # preserve_formatting: Preserve original subtitle formatting (tags, positioning, styling)
+  # Auto-convert VTT/other formats to SRT before using subtitle-filter
  # This ensures compatibility when subtitle-filter is used as fallback (default: true)
  preserve_formatting: true
  # Preserve original subtitle formatting (tags, positioning, styling)
  # When true, skips pycaption processing for WebVTT files to keep tags like <i>, <b>, positioning intact
  # Combined with no sub_format setting, ensures subtitles remain in their original format (default: true)
-  preserve_formatting: true
+
-  # output_mode: Output mode for subtitles
+  output_mode: mux
  # Output mode for subtitles
  # - mux: Embed subtitles in MKV container only (default)
  # - sidecar: Save subtitles as separate files only
  # - both: Embed in MKV AND save as sidecar files
-  output_mode: mux
+
  # sidecar_format: Format for sidecar subtitle files
  # Options: srt, vtt, ass, original (keep current format)
  sidecar_format: srt
  # Format for sidecar subtitle files
  # Options: srt, vtt, ass, original (keep current format)
 # Configuration for pywidevine and pyplayready's serve functionality
 # Also used for remote services (unshackle serve)