[
  {
    "path": ".cursor/rules/testing-conventions.mdc",
    "content": "---\ndescription: Writing tests\nglobs: \nalwaysApply: false\n---\n# Testing Conventions\n\nThis document describes testing conventions used in the Podly project.\n\n## Fixtures and Dependency Injection\n\nThe project uses pytest fixtures for dependency injection and test setup. Common fixtures are defined in [src/tests/conftest.py](mdc:src/tests/conftest.py).\n\nKey fixtures include:\n- `app` - Flask application context for testing\n- `test_config` - Configuration loaded from config.yml\n- `mock_db_session` - Mock database session\n- Mock classes for core components (TranscriptionManager, AdClassifier, etc.)\n\n## SQLAlchemy Model Mocking\n\nWhen testing code that uses SQLAlchemy models, prefer creating custom mock classes over using `MagicMock(spec=ModelClass)` to avoid Flask context issues:\n\n```python\n# Example from test_podcast_downloader.py\nclass MockPost:\n    \"\"\"A mock Post class that doesn't require Flask context.\"\"\"\n    def __init__(self, id=1, title=\"Test Episode\", download_url=\"https://example.com/podcast.mp3\"):\n        self.id = id\n        self.title = title\n        self.download_url = download_url\n```\n\nSee [src/tests/test_podcast_downloader.py](mdc:src/tests/test_podcast_downloader.py) for a complete example.\n\n## Dependency Injection\n\nPrefer injecting dependencies via the contstructor rather than patching. See [src/tests/test_podcast_processor.py](mdc:src/tests/test_podcast_processor.py) for examples of:\n- Creating test fixtures with mock dependencies\n- Testing error handling with failing components\n- Using Flask app context when needed\n\n## Improving Coverage\n\nWhen writing tests to improve coverage:\n1. Focus on one module at a time\n2. Create mock objects for dependencies\n3. Test successful and error paths \n4. Use `monkeypatch` to replace functions that access external resources\n5. 
Use `tmp_path` fixture for file operations\n\nSee [src/tests/test_feeds.py](mdc:src/tests/test_feeds.py) for comprehensive examples of these patterns.\n"
  },
  {
    "path": ".dockerignore",
    "content": "# Python cache files\n__pycache__/\n*.py[cod]\n*$py.class\n.pytest_cache/\n.mypy_cache/\n\n# Git\n.git/\n.github/\n.gitignore\n\n# Editor files\n.vscode/\n.idea/\n*.swp\n*.swo\n\n# Virtual environments\nvenv/\n.env/\n.venv/\nenv/\nENV/\n\n# Build artifacts\n*.so\n*.egg-info/\ndist/\nbuild/\n\n# Input/Output directories (these can be mounted as volumes instead)\nin/\nprocessing/\n\n# App instance data\nsrc/app/instance/\nsrc/instance/\n\n# Logs\n*.log\n\n# Database files\n*.db\n*.sqlite\n*.sqlite3\n\n# Local configuration files\n.env\n.env.*\n!.env.example\n\n# Node / JS\nnode_modules/\n.DS_Store\n*.DS_Store\n\n# Frontend specific\nfrontend/node_modules/\nfrontend/dist/\nfrontend/.vite/\nfrontend/coverage/\nfrontend/.nyc_output/\nfrontend/.eslintcache\n\n# Documentation\ndocs/\n*.md\n!README.md\n\n# Coverage / lint caches\n.coverage\ncoverage.xml\nhtmlcov/\n.ruff_cache/\n"
  },
  {
    "path": ".github/FUNDING.yml",
    "content": "# These are supported funding model platforms\n\ngithub: jdrbc\n"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/bug_report.md",
    "content": "---\nname: Bug report\nabout: Report a problem or regression\ntitle: \"[Bug]: \"\nlabels: bug\nassignees: \"\"\n---\n\n## Summary\n- \n\n## Steps to reproduce\n1. \n\n## Expected behavior\n- \n\n## Actual behavior\n- \n\n## Environment\n- App version/commit: \n- OS: \n- Deployment: local / docker / other\n\n## Logs or screenshots\n- \n\n## Additional context\n- \n"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/feature_request.md",
    "content": "---\nname: Feature request\nabout: Suggest an idea or enhancement\ntitle: \"[Feature]: \"\nlabels: enhancement\nassignees: \"\"\n---\n\n## Summary\n- \n\n## Problem to solve\n- \n\n## Proposed solution\n- \n\n## Alternatives considered\n- \n\n## Additional context\n- \n"
  },
  {
    "path": ".github/pull_request_template.md",
    "content": "## Summary\n- \n\n## Type of change\n- [ ] Bug fix\n- [ ] New feature\n- [ ] Refactor\n- [ ] Docs\n- [ ] Other\n\n## Testing\n- [ ] `scripts/ci.sh`\n- [ ] Not run (explain below)\n\n## Docs\n- [ ] Not needed\n- [ ] Updated (details below)\n\n## Related issues\n- \n\n## Notes\n- \n\n## Checklist\n- [ ] Target branch is `Preview`\n- [ ] Docs updated if needed\n- [ ] Tests run or explicitly skipped with reasoning\n- [ ] If merging to main, at least one commit in this PR follows Conventional Commits (e.g., `feat:`, `fix:`, `chore:`) Please refer to https://www.conventionalcommits.org/en/v1.0.0/#summary for more details.\n"
  },
  {
    "path": ".github/workflows/conventional-commit-check.yml",
    "content": "name: Conventional Commit Check\n\non:\n  pull_request:\n    branches:\n      - main\n\npermissions:\n  contents: read\n\njobs:\n  conventional-commit:\n    runs-on: ubuntu-latest\n    steps:\n      - name: Checkout repository\n        uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Ensure at least one Conventional Commit\n        env:\n          BASE_SHA: ${{ github.event.pull_request.base.sha }}\n          HEAD_SHA: ${{ github.event.pull_request.head.sha }}\n        run: |\n          set -euo pipefail\n          echo \"Checking commit subjects between $BASE_SHA and $HEAD_SHA\"\n          subjects=$(git log --format=%s \"$BASE_SHA..$HEAD_SHA\")\n          if [ -z \"$subjects\" ]; then\n            echo \"No commits found in range.\"\n            exit 1\n          fi\n\n          if echo \"$subjects\" | grep -Eq '^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)(\\([^)]+\\))?(!)?: .+'; then\n            echo \"Conventional Commit found.\"\n          else\n            echo \"No Conventional Commit found in this PR.\"\n            echo \"Add at least one commit like: feat: ..., fix(scope): ..., chore: ...\"\n            echo \"Please refer to https://www.conventionalcommits.org/en/v1.0.0/#summary for more details.\"\n            exit 1\n          fi\n"
  },
  {
    "path": ".github/workflows/docker-publish.yml",
    "content": "name: Build and Publish Docker Images\n\non:\n  push:\n    branches: [main]\n    tags: [\"v*\"]\n  pull_request:\n    branches: [main]\n  release:\n    types: [published]\n\npermissions:\n  contents: read\n  packages: write\n  \nenv:\n  REGISTRY: ghcr.io\n  IMAGE_NAME: ${{ github.repository_owner }}/podly-pure-podcasts\n\njobs:\n  changes:\n    runs-on: ubuntu-latest\n    outputs:\n      skip: ${{ steps.check_files.outputs.skip }}\n    steps:\n      - name: Checkout repository\n        uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Check for documentation-only changes\n        id: check_files\n        run: |\n          # For PRs, compare against the base branch. For pushes, compare against the previous commit.\n          if [ \"${{ github.event_name }}\" = \"pull_request\" ]; then\n            BASE_REF=\"${{ github.event.pull_request.base.ref }}\"\n            echo \"Fetching base branch origin/$BASE_REF\"\n            git fetch --no-tags origin \"$BASE_REF\"\n            BASE_SHA=$(git rev-parse \"origin/$BASE_REF\")\n            HEAD_SHA=$(git rev-parse \"${{ github.sha }}\")\n            echo \"Comparing PR commits: $BASE_SHA...$HEAD_SHA\"\n            files_changed=$(git diff --name-only \"$BASE_SHA\"...\"$HEAD_SHA\")\n          elif [ \"${{ github.event_name }}\" = \"release\" ]; then\n            echo \"Release event detected; building images for release tag\"\n            TARGET_REF=\"${{ github.event.release.target_commitish }}\"\n            echo \"Fetching release target origin/$TARGET_REF\"\n            git fetch --no-tags origin \"$TARGET_REF\" || true\n            HEAD_SHA=$(git rev-parse \"${{ github.sha }}\")\n            BASE_SHA=$(git rev-parse \"origin/$TARGET_REF\" 2>/dev/null || git rev-parse \"$TARGET_REF\" 2>/dev/null || echo \"$HEAD_SHA\")\n            files_changed=$(git diff --name-only \"$BASE_SHA\"...\"$HEAD_SHA\" 2>/dev/null || echo \"release-trigger\")\n          else\n            
echo \"Comparing push commits: HEAD~1...HEAD\"\n            if git rev-parse HEAD~1 >/dev/null 2>&1; then\n              files_changed=$(git diff --name-only HEAD~1 HEAD)\n            else\n              echo \"Single commit push detected; using initial commit diff\"\n              files_changed=$(git diff-tree --no-commit-id --name-only -r HEAD)\n            fi\n          fi\n\n          echo \"Files changed:\"\n          echo \"$files_changed\"\n\n          # If no files are documentation, then we should continue\n          non_doc_files=$(echo \"$files_changed\" | grep -v -E '(\\.md$|^docs/|LICENCE)')\n\n          if [ \"${{ github.event_name }}\" = \"release\" ]; then\n            echo \"Release build detected. Skipping documentation-only shortcut.\"\n            echo \"skip=false\" >> $GITHUB_OUTPUT\n          elif [ -z \"$non_doc_files\" ]; then\n            echo \"Only documentation files were changed. Skipping build and publish.\"\n            echo \"skip=true\" >> $GITHUB_OUTPUT\n          else\n            echo \"Code files were changed. 
Proceeding with build and publish.\"\n            echo \"skip=false\" >> $GITHUB_OUTPUT\n          fi\n        shell: bash\n\n  ## test if build is successful, but don't run every permutation on PRs\n  build-amd64-pr-lite:\n    needs: changes\n    if: ${{ needs.changes.outputs.skip == 'false' && github.event_name == 'pull_request' }}\n    runs-on: ubuntu-latest\n    strategy:\n      matrix:\n        variant:\n          - name: \"lite\"\n            base: \"python:3.11-slim\"\n            gpu: \"false\"\n            gpu_nvidia: \"false\"\n            gpu_amd: \"false\"\n            lite_build: \"true\"\n    env:\n      ARCH: amd64\n    steps:\n      - name: Checkout repository\n        uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Free up disk space\n        if: ${{ matrix.variant.gpu == 'true' || matrix.variant.gpu_nvidia == 'true' || matrix.variant.gpu_amd == 'true' }}\n        run: |\n          echo \"Available disk space before cleanup:\"\n          df -h\n          sudo rm -rf /usr/share/dotnet /usr/local/lib/android /opt/ghc /usr/local/share/boost\n          sudo rm -rf /opt/microsoft/msedge /opt/microsoft/powershell /opt/pipx /usr/lib/mono\n          sudo rm -rf /usr/local/.ghcup /usr/share/swift\n          docker system prune -af\n          echo \"Available disk space after cleanup:\"\n          df -h\n\n      - name: Set up Docker Buildx\n        uses: docker/setup-buildx-action@v3\n        with:\n          driver-opts: |\n            image=moby/buildkit:v0.12.0\n\n      - name: Log in to Container Registry\n        uses: docker/login-action@v3\n        with:\n          registry: ${{ env.REGISTRY }}\n          username: ${{ github.actor }}\n          password: ${{ secrets.GITHUB_TOKEN }}\n\n      - name: Extract metadata\n        id: meta\n        uses: docker/metadata-action@v5\n        with:\n          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}\n          tags: |\n            type=ref,event=branch,suffix=-${{ 
matrix.variant.name }}-${{ env.ARCH }}\n            type=ref,event=pr,suffix=-${{ matrix.variant.name }}-${{ env.ARCH }}\n            type=semver,pattern={{version}},suffix=-${{ matrix.variant.name }}-${{ env.ARCH }}\n            type=raw,value=${{ matrix.variant.name }}-${{ env.ARCH }},enable={{is_default_branch}}\n\n      - name: Build and push\n        uses: docker/build-push-action@v5\n        with:\n          context: .\n          file: ./Dockerfile\n          push: true\n          platforms: linux/${{ env.ARCH }}\n          tags: ${{ steps.meta.outputs.tags }}\n          labels: ${{ steps.meta.outputs.labels }}\n          build-args: |\n            BASE_IMAGE=${{ matrix.variant.base }}\n            USE_GPU=${{ matrix.variant.gpu }}\n            USE_GPU_NVIDIA=${{ matrix.variant.gpu_nvidia }}\n            USE_GPU_AMD=${{ matrix.variant.gpu_amd }}\n            LITE_BUILD=${{ matrix.variant.lite_build }}\n          # Temporarily disabled due to GitHub Actions Cache service outage\n          # cache-from: type=gha\n          # cache-to: type=gha,mode=max\n\n  build-amd64:\n    needs: changes\n    if: ${{ needs.changes.outputs.skip == 'false' && github.event_name != 'pull_request' }}\n    runs-on: ubuntu-latest\n    strategy:\n      matrix:\n        variant:\n          - name: \"latest\"\n            base: \"python:3.11-slim\"\n            gpu: \"false\"\n            gpu_nvidia: \"false\"\n            gpu_amd: \"false\"\n            lite_build: \"false\"\n          - name: \"lite\"\n            base: \"python:3.11-slim\"\n            gpu: \"false\"\n            gpu_nvidia: \"false\"\n            gpu_amd: \"false\"\n            lite_build: \"true\"\n          - name: \"gpu-nvidia\"\n            base: \"nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04\"\n            gpu: \"true\"\n            gpu_nvidia: \"true\"\n            gpu_amd: \"false\"\n            lite_build: \"false\"\n          - name: \"gpu-amd\"\n            base: \"rocm/dev-ubuntu-22.04:6.4-complete\"\n     
       gpu: \"false\"\n            gpu_nvidia: \"false\"\n            gpu_amd: \"true\"\n            lite_build: \"false\"\n    env:\n      ARCH: amd64\n    steps:\n      - name: Checkout repository\n        uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Free up disk space\n        if: ${{ matrix.variant.gpu == 'true' || matrix.variant.gpu_nvidia == 'true' || matrix.variant.gpu_amd == 'true' }}\n        run: |\n          echo \"Available disk space before cleanup:\"\n          df -h\n          sudo rm -rf /usr/share/dotnet /usr/local/lib/android /opt/ghc /usr/local/share/boost\n          sudo rm -rf /opt/microsoft/msedge /opt/microsoft/powershell /opt/pipx /usr/lib/mono\n          sudo rm -rf /usr/local/.ghcup /usr/share/swift\n          docker system prune -af\n          echo \"Available disk space after cleanup:\"\n          df -h\n\n      - name: Set up Docker Buildx\n        uses: docker/setup-buildx-action@v3\n        with:\n          driver-opts: |\n            image=moby/buildkit:v0.12.0\n\n      - name: Log in to Container Registry\n        uses: docker/login-action@v3\n        with:\n          registry: ${{ env.REGISTRY }}\n          username: ${{ github.actor }}\n          password: ${{ secrets.GITHUB_TOKEN }}\n\n      - name: Extract metadata\n        id: meta\n        uses: docker/metadata-action@v5\n        with:\n          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}\n          tags: |\n            type=ref,event=branch,suffix=-${{ matrix.variant.name }}-${{ env.ARCH }}\n            type=ref,event=pr,suffix=-${{ matrix.variant.name }}-${{ env.ARCH }}\n            type=semver,pattern={{version}},suffix=-${{ matrix.variant.name }}-${{ env.ARCH }}\n            type=raw,value=${{ matrix.variant.name }}-${{ env.ARCH }},enable={{is_default_branch}}\n\n      - name: Build and push\n        uses: docker/build-push-action@v5\n        with:\n          context: .\n          file: ./Dockerfile\n          push: true\n      
    platforms: linux/${{ env.ARCH }}\n          tags: ${{ steps.meta.outputs.tags }}\n          labels: ${{ steps.meta.outputs.labels }}\n          build-args: |\n            BASE_IMAGE=${{ matrix.variant.base }}\n            USE_GPU=${{ matrix.variant.gpu }}\n            USE_GPU_NVIDIA=${{ matrix.variant.gpu_nvidia }}\n            USE_GPU_AMD=${{ matrix.variant.gpu_amd }}\n            LITE_BUILD=${{ matrix.variant.lite_build }}\n          # Temporarily disabled due to GitHub Actions Cache service outage\n          # cache-from: type=gha\n          # cache-to: type=gha,mode=max\n\n  build-arm64:\n    needs: changes\n    if: ${{ needs.changes.outputs.skip == 'false' && github.event_name != 'pull_request' }}\n    runs-on: ubuntu-latest\n    strategy:\n      matrix:\n        variant:\n          - {\n              name: \"latest\",\n              base: \"python:3.11-slim\",\n              gpu: \"false\",\n              gpu_nvidia: \"false\",\n              gpu_amd: \"false\",\n              lite_build: \"false\",\n            }\n          - {\n              name: \"lite\",\n              base: \"python:3.11-slim\",\n              gpu: \"false\",\n              gpu_nvidia: \"false\",\n              gpu_amd: \"false\",\n              lite_build: \"true\",\n            }\n    env:\n      ARCH: arm64\n    steps:\n      - name: Checkout repository\n        uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Free up disk space\n        if: ${{ matrix.variant.gpu == 'true' || matrix.variant.gpu_nvidia == 'true' || matrix.variant.gpu_amd == 'true' }}\n        run: |\n          echo \"Available disk space before cleanup:\"\n          df -h\n          sudo rm -rf /usr/share/dotnet /usr/local/lib/android /opt/ghc /usr/local/share/boost\n          sudo rm -rf /opt/microsoft/msedge /opt/microsoft/powershell /opt/pipx /usr/lib/mono\n          sudo rm -rf /usr/local/.ghcup /usr/share/swift\n          docker system prune -af\n          echo \"Available 
disk space after cleanup:\"\n          df -h\n\n      - name: Set up QEMU\n        uses: docker/setup-qemu-action@v3\n\n      - name: Set up Docker Buildx\n        uses: docker/setup-buildx-action@v3\n        with:\n          driver-opts: |\n            image=moby/buildkit:v0.12.0\n\n      - name: Log in to Container Registry\n        uses: docker/login-action@v3\n        with:\n          registry: ${{ env.REGISTRY }}\n          username: ${{ github.actor }}\n          password: ${{ secrets.GITHUB_TOKEN }}\n\n      - name: Extract metadata\n        id: meta\n        uses: docker/metadata-action@v5\n        with:\n          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}\n          tags: |\n            type=ref,event=branch,suffix=-${{ matrix.variant.name }}-${{ env.ARCH }}\n            type=ref,event=pr,suffix=-${{ matrix.variant.name }}-${{ env.ARCH }}\n            type=semver,pattern={{version}},suffix=-${{ matrix.variant.name }}-${{ env.ARCH }}\n            type=raw,value=${{ matrix.variant.name }}-${{ env.ARCH }},enable={{is_default_branch}}\n\n      - name: Build and push\n        uses: docker/build-push-action@v5\n        with:\n          context: .\n          file: ./Dockerfile\n          push: true\n          platforms: linux/${{ env.ARCH }}\n          tags: ${{ steps.meta.outputs.tags }}\n          labels: ${{ steps.meta.outputs.labels }}\n          build-args: |\n            BASE_IMAGE=${{ matrix.variant.base }}\n            USE_GPU=${{ matrix.variant.gpu }}\n            USE_GPU_NVIDIA=${{ matrix.variant.gpu_nvidia }}\n            USE_GPU_AMD=${{ matrix.variant.gpu_amd }}\n            LITE_BUILD=${{ matrix.variant.lite_build }}\n          # Temporarily disabled due to GitHub Actions Cache service outage\n          # cache-from: type=gha\n          # cache-to: type=gha,mode=max\n\n  manifest:\n    needs: [changes, build-amd64, build-arm64]\n    if: ${{ needs.changes.outputs.skip == 'false' }}\n    runs-on: ubuntu-latest\n    strategy:\n      matrix:\n    
    variant:\n          - \"latest\"\n          - \"lite\"\n    steps:\n      - name: Checkout repository\n        uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Log in to Container Registry\n        uses: docker/login-action@v3\n        with:\n          registry: ${{ env.REGISTRY }}\n          username: ${{ github.actor }}\n          password: ${{ secrets.GITHUB_TOKEN }}\n\n      - name: Extract metadata (manifest)\n        id: meta\n        uses: docker/metadata-action@v5\n        with:\n          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}\n          tags: |\n            type=ref,event=branch,suffix=-${{ matrix.variant }}\n            type=ref,event=pr,suffix=-${{ matrix.variant }}\n            type=semver,pattern={{version}},suffix=-${{ matrix.variant }}\n            type=raw,value=${{ matrix.variant }},enable={{is_default_branch}}\n\n      - name: Create and push manifest\n        run: |\n          set -euo pipefail\n          tags=\"${{ steps.meta.outputs.tags }}\"\n          while IFS= read -r tag; do\n            [ -z \"$tag\" ] && continue\n            echo \"Creating manifest for ${tag}\"\n            docker buildx imagetools create \\\n              -t \"${tag}\" \\\n              \"${tag}-amd64\" \\\n              \"${tag}-arm64\"\n          done <<< \"$tags\"\n"
  },
  {
    "path": ".github/workflows/lint-and-format.yml",
    "content": "name: Python Linting, Formatting, and Testing\n\non:\n  push:\n    branches:\n      - main\n  pull_request:\n    branches:\n      - main\n\njobs:\n  lint-format-test:\n    runs-on: ubuntu-latest\n    env:\n      PIPENV_VENV_IN_PROJECT: \"1\"\n      PIP_DISABLE_PIP_VERSION_CHECK: \"1\"\n\n    steps:\n      - name: Checkout code\n        uses: actions/checkout@v4\n\n      - name: Set up Python\n        id: python\n        uses: actions/setup-python@v5\n        with:\n          python-version: \"3.11\"\n          cache: \"pip\"\n          cache-dependency-path: \"Pipfile.lock\"\n\n      - name: Install ffmpeg\n        run: sudo apt-get update -y && sudo apt-get install -y --no-install-recommends ffmpeg\n\n      - name: Install pipenv\n        run: pip install pipenv\n\n      - name: Cache pipenv virtualenv\n        uses: actions/cache@v4\n        with:\n          path: .venv\n          key: ${{ runner.os }}-venv-${{ steps.python.outputs.python-version }}-${{ hashFiles('Pipfile.lock') }}\n          restore-keys: |\n            ${{ runner.os }}-venv-${{ steps.python.outputs.python-version }}-\n\n      - name: Cache mypy\n        uses: actions/cache@v4\n        with:\n          path: .mypy_cache\n          key: ${{ runner.os }}-mypy-${{ steps.python.outputs.python-version }}-${{ hashFiles('Pipfile.lock') }}\n          restore-keys: |\n            ${{ runner.os }}-mypy-${{ steps.python.outputs.python-version }}-\n\n      - name: Install dependencies\n        run: pipenv install --dev --deploy\n\n      - name: Install dependencies for mypy\n        run: pipenv run mypy . 
--install-types --non-interactive --explicit-package-bases --exclude 'migrations' --exclude 'build' --exclude 'scripts' --exclude 'src/tests' --exclude 'src/tests/test_routes.py' --exclude 'src/app/routes.py'\n\n      - name: Run pylint\n        run: pipenv run pylint src --ignore=migrations,tests\n\n      - name: Run black\n        run: pipenv run black --check src\n\n      - name: Run isort\n        run: pipenv run isort --check-only src\n\n      - name: Run pytest\n        run: pipenv run pytest --disable-warnings\n"
  },
  {
    "path": ".github/workflows/release.yml",
    "content": "name: Release\n\non:\n  push:\n    branches:\n      - main\n  workflow_dispatch:\n\npermissions:\n  contents: write\n  issues: write\n  pull-requests: write\n\njobs:\n  release:\n    runs-on: ubuntu-latest\n    steps:\n      - name: Checkout repository\n        uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Set up Node.js\n        uses: actions/setup-node@v4\n        with:\n          node-version: \"20\"\n\n      - name: Run semantic-release\n        env:\n          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}\n        run: >\n          npx --yes\n          -p semantic-release\n          -p @semantic-release/changelog\n          -p @semantic-release/git\n          semantic-release\n"
  },
  {
    "path": ".gitignore",
    "content": ".worktrees/*\n\n__pycache__/\n*.py[cod]\n*$py.class\n*.so\n.Python\nbuild/\ndevelop-eggs/\ndist/\ndownloads/\neggs/\n.eggs/\nlib/\nlib64/\nparts/\nsdist/\nvar/\nwheels/\n*.egg-info/\n.installed.cfg\n*.egg\n*.manifest\n*.spec\npip-log.txt\npip-delete-this-directory.txt\nhtmlcov/\n.tox/\n.coverage\n.coverage.*\n.cache\nnosetests.xml\ncoverage.xml\n*.cover\n.hypothesis/\n.pytest_cache/\n*.mo\n*.pot\n*.log\nout/*\nprocessing/*\nconfig/app.log\n.vscode/*\nin/**/*.mp3\nsrv/**/*.mp3\n*.pickle\n.env\n.env.local\nconfig/config.yml\n*.db\n*.sqlite\n**/sqlite3.db-*\n**/*.sqlite-*\n.DS_Store\nsrc/instance/data/*\n\n# Frontend build logs\nfrontend-build.log\n\n# Claude Code local notes (not committed)\n.claude-notes/\nCLAUDE_NOTES.md\n"
  },
  {
    "path": ".pylintrc",
    "content": "[MASTER]\nignore=frontend,migrations,scripts\nignore-paths=^src/(migrations|tests)/\ndisable=\n    C0114, # missing-module-docstring\n    C0115, # missing-class-docstring\n    C0116, # missing-function-docstring\n    R0913, # too-many-arguments\n    R0914, # too-many-locals\n    R0903, # too-few-public-methods\n    W1203, # logging-fstring-interpolation\n    W1514, # using-constant-test\n    E0401, # import-error\n    C0301, # line-too-long\n    R0911, # too-many-return-statements\n\n[DESIGN]\n# Allow more statements per function to accommodate complex processing routines\nmax-statements=100\n\n[MASTER:src/tests/*.py]\ndisable=\n    W0621, # redefined-outer-name\n    W0212, # protected-access\n    W0613, # Unused argument\n    C0415, # Import outside toplevel\n    W0622,\n    R0902\n\n\n[MASTER:scripts/*.py]\ndisable=\n    R0917, \n    W0718\n\n\n[SIMILARITIES]\n\n# Minimum lines number of a similarity.\nmin-similarity-lines=10\n\n# Ignore comments when computing similarities.\nignore-comments=yes\n\n# Ignore docstrings when computing similarities.\nignore-docstrings=yes\n\n# Ignore imports when computing similarities.\nignore-imports=no\n\n"
  },
  {
    "path": ".releaserc.cjs",
    "content": "const { execSync } = require(\"node:child_process\");\n\nconst resolveRepositoryUrl = () => {\n  if (process.env.GITHUB_REPOSITORY) {\n    return `https://github.com/${process.env.GITHUB_REPOSITORY}.git`;\n  }\n\n  try {\n    return execSync(\"git remote get-url origin\", { stdio: \"pipe\" })\n      .toString()\n      .trim();\n  } catch {\n    return undefined;\n  }\n};\n\nmodule.exports = {\n  branches: [\"main\"],\n  repositoryUrl: resolveRepositoryUrl(),\n  tagFormat: \"v${version}\",\n  plugins: [\n    \"@semantic-release/commit-analyzer\",\n    \"@semantic-release/release-notes-generator\",\n    [\"@semantic-release/changelog\", { changelogFile: \"CHANGELOG.md\" }],\n    [\n      \"@semantic-release/git\",\n      {\n        assets: [\"CHANGELOG.md\"],\n        message:\n          \"chore(release): ${nextRelease.version} [skip ci]\\n\\n${nextRelease.notes}\",\n      },\n    ],\n    \"@semantic-release/github\",\n  ],\n};\n"
  },
  {
    "path": ".worktrees/.gitignore",
    "content": "*\n!.gitignore"
  },
  {
    "path": "AGENTS.md",
    "content": "Project-specific rules:\n- Do not create Alembic migrations yourself; request the user to generate migrations after model changes.\n- Only use ./scripts/ci.sh to run tests & lints - do not attempt to run directly\n- use pipenv\n- All database writes must go through the `writer` service. Do not use `db.session.commit()` directly in application code. Use `writer_client.action()` instead.\n"
  },
  {
    "path": "Dockerfile",
    "content": "# Multi-stage build for combined frontend and backend\nARG BASE_IMAGE=python:3.11-slim\nFROM node:18-alpine AS frontend-build\n\nWORKDIR /app\n\n# Copy frontend package files\nCOPY frontend/package*.json ./\nRUN npm ci\n\n# Copy frontend source code\nCOPY frontend/ ./\n\n# Build frontend assets with explicit error handling\nRUN set -e && \\\n    npm run build && \\\n    test -d dist && \\\n    echo \"Frontend build successful - dist directory created\"\n\n# Backend stage\nFROM ${BASE_IMAGE} AS backend\n\n# Environment variables\nENV PYTHONDONTWRITEBYTECODE=1\nENV PYTHONUNBUFFERED=1\nARG CUDA_VERSION=12.4.1\nARG ROCM_VERSION=6.4\nARG USE_GPU=false\nARG USE_GPU_NVIDIA=${USE_GPU}\nARG USE_GPU_AMD=false\nARG LITE_BUILD=false\n\nWORKDIR /app\n\n# Install dependencies based on base image\nRUN if [ -f /etc/debian_version ]; then \\\n    apt-get update && \\\n    apt-get install -y ca-certificates && \\\n    # Determine if we need to install Python 3.11\n    INSTALL_PYTHON=true && \\\n    if command -v python3 >/dev/null 2>&1; then \\\n        if python3 --version 2>&1 | grep -q \"3.11\"; then \\\n            INSTALL_PYTHON=false; \\\n        fi; \\\n    fi && \\\n    if [ \"$INSTALL_PYTHON\" = \"true\" ]; then \\\n        apt-get install -y software-properties-common && \\\n        if ! 
apt-cache show python3.11 > /dev/null 2>&1; then \\\n            add-apt-repository ppa:deadsnakes/ppa -y && \\\n            apt-get update; \\\n        fi && \\\n        DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \\\n        python3.11 \\\n        python3.11-distutils \\\n        python3.11-dev \\\n        python3-pip && \\\n        update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.11 1 && \\\n        update-alternatives --set python3 /usr/bin/python3.11; \\\n    fi && \\\n    # Install other dependencies\n    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \\\n    ffmpeg \\\n    sqlite3 \\\n    libsqlite3-dev \\\n    build-essential \\\n    gosu && \\\n    apt-get clean && \\\n    rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* ; \\\n    fi\n\n# Install python3-tomli if Python version is less than 3.11 (separate step for ARM compatibility)\nRUN if [ -f /etc/debian_version ]; then \\\n    PYTHON_MINOR=$(python3 --version 2>&1 | grep -o 'Python 3\\.[0-9]*' | cut -d '.' 
-f2) && \\\n    if [ \"$PYTHON_MINOR\" -lt 11 ]; then \\\n    apt-get update && \\\n    apt-get install -y python3-tomli && \\\n    apt-get clean && \\\n    rm -rf /var/lib/apt/lists/* ; \\\n    fi ; \\\n    fi\n\n# Copy all Pipfiles/lock files\nCOPY Pipfile Pipfile.lock Pipfile.lite Pipfile.lite.lock ./\n\n# Remove problematic distutils-installed packages that may conflict\nRUN if [ -f /etc/debian_version ]; then \\\n    apt-get remove -y python3-blinker 2>/dev/null || true; \\\n    fi\n\n# Install pipenv and dependencies\nRUN if command -v pip >/dev/null 2>&1; then \\\n    pip install --no-cache-dir pipenv; \\\n    elif command -v pip3 >/dev/null 2>&1; then \\\n    pip3 install --no-cache-dir pipenv; \\\n    else \\\n    python3 -m pip install --no-cache-dir pipenv; \\\n    fi\n\n# Set pip timeout and retries for better reliability\nENV PIP_DEFAULT_TIMEOUT=1000\nENV PIP_RETRIES=3\nENV PIP_DISABLE_PIP_VERSION_CHECK=1\nENV PIP_NO_CACHE_DIR=1\n\n# Set pipenv configuration for better CI reliability\nENV PIPENV_VENV_IN_PROJECT=1\nENV PIPENV_TIMEOUT=1200\n\n# Install dependencies conditionally based on LITE_BUILD\nRUN set -e && \\\n    if [ \"${LITE_BUILD}\" = \"true\" ]; then \\\n    echo \"Installing lite dependencies (without Whisper)\"; \\\n    echo \"Using lite Pipfile:\" && \\\n    PIPENV_PIPFILE=Pipfile.lite pipenv install --deploy --system; \\\n    else \\\n    echo \"Installing full dependencies (including Whisper)\"; \\\n    echo \"Using full Pipfile:\" && \\\n    PIPENV_PIPFILE=Pipfile pipenv install --deploy --system; \\\n    fi\n\n# Install PyTorch with CUDA support if using NVIDIA image (skip if LITE_BUILD)\nRUN if [ \"${LITE_BUILD}\" = \"true\" ]; then \\\n    echo \"Skipping PyTorch installation in lite mode\"; \\\n    elif [ \"${USE_GPU}\" = \"true\" ] || [ \"${USE_GPU_NVIDIA}\" = \"true\" ]; then \\\n    if command -v pip >/dev/null 2>&1; then \\\n    pip install --no-cache-dir nvidia-cudnn-cu12 torch; \\\n    elif command -v pip3 >/dev/null 2>&1; 
then \\\n    pip3 install --no-cache-dir nvidia-cudnn-cu12 torch; \\\n    else \\\n    python3 -m pip install --no-cache-dir nvidia-cudnn-cu12 torch; \\\n    fi; \\\n    elif [ \"${USE_GPU_AMD}\" = \"true\" ]; then \\\n    if command -v pip >/dev/null 2>&1; then \\\n    pip install --no-cache-dir torch --index-url https://download.pytorch.org/whl/rocm${ROCM_VERSION}; \\\n    elif command -v pip3 >/dev/null 2>&1; then \\\n    pip3 install --no-cache-dir torch --index-url https://download.pytorch.org/whl/rocm${ROCM_VERSION}; \\\n    else \\\n    python3 -m pip install --no-cache-dir torch --index-url https://download.pytorch.org/whl/rocm${ROCM_VERSION}; \\\n    fi; \\\n    else \\\n    if command -v pip >/dev/null 2>&1; then \\\n    pip install --no-cache-dir torch --index-url https://download.pytorch.org/whl/cpu; \\\n    elif command -v pip3 >/dev/null 2>&1; then \\\n    pip3 install --no-cache-dir torch --index-url https://download.pytorch.org/whl/cpu; \\\n    else \\\n    python3 -m pip install --no-cache-dir torch --index-url https://download.pytorch.org/whl/cpu; \\\n    fi; \\\n    fi\n\n# Copy application code\nCOPY src/ ./src/\nRUN rm -rf ./src/instance\nCOPY scripts/ ./scripts/\nRUN chmod +x scripts/start_services.sh\n\n# Copy built frontend assets to Flask static folder\nCOPY --from=frontend-build /app/dist ./src/app/static\n\n# Create non-root user for running the application\nRUN groupadd -r appuser && \\\n    useradd --no-log-init -r -g appuser -d /home/appuser appuser && \\\n    mkdir -p /home/appuser && \\\n    chown -R appuser:appuser /home/appuser\n\n# Create necessary directories and set permissions\nRUN mkdir -p /app/processing /app/src/instance /app/src/instance/data /app/src/instance/data/in /app/src/instance/data/srv /app/src/instance/config /app/src/instance/db && \\\n    chown -R appuser:appuser /app\n\n# Copy entrypoint script\nCOPY docker-entrypoint.sh /docker-entrypoint.sh\nRUN chmod 755 /docker-entrypoint.sh\n\nEXPOSE 5001\n\n# Run the 
application through the entrypoint script\nENTRYPOINT [\"/docker-entrypoint.sh\"]\nCMD [\"./scripts/start_services.sh\"]\n"
  },
  {
    "path": "LICENCE",
    "content": "\nMIT License\n\nCopyright (c) 2024 John Rogers\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "Pipfile",
    "content": "[[source]]\nurl = \"https://pypi.org/simple\"\nverify_ssl = true\nname = \"pypi\"\n\n[packages]\nspeechrecognition = \"*\"\nopenai = \"*\"\npython-dotenv = \"*\"\njinja2 = \"*\"\nflask = \"*\"\npyrss2gen = \"*\"\nfeedparser = \"*\"\ncertifi = \"*\"\ncd = \"*\"\npyyaml = \"*\"\nprompt-toolkit = \"*\"\npypodcastparser = \"*\"\nwerkzeug = \"*\"\nexceptiongroup = \"*\"\nzeroconf = \"*\"\nwaitress = \"*\"\nvalidators = \"*\"\nbeartype = \"*\"\nopenai-whisper = \"*\"\nflask-sqlalchemy = \"*\"\nflask-migrate = \"*\"\nFlask-APScheduler = \"*\"\nffmpeg-python = \"*\"\nlitellm = \"*\"  # Pin to avoid fastuuid dependency\nbleach = \"*\"\ntypes-bleach = \"*\"\ngroq = \"*\"\nasync_timeout = \"*\"\npytest-cov = \"*\"\nflask-cors = \"*\"\nbcrypt = \"*\"\nhttpx-aiohttp = \"*\"\nstripe = \"*\"\n\n[dev-packages]\nblack = \"*\"\nmypy = \"*\"\ntypes-pyyaml = \"*\"\ntypes-requests = \"*\"\ntypes-waitress = \"*\"\npylint = \"*\"\npytest = \"*\"\ndill = \"*\"\nisort = \"*\"\ntypes-flask-migrate = \"*\"\npytest-mock = \"*\"\nwatchdog = \"*\"\nrequests = \"*\"\ntypes-flask-cors = \"*\"\n\n[requires]\npython_version = \"3.11\"\n"
  },
  {
    "path": "Pipfile.lite",
    "content": "[[source]]\nurl = \"https://pypi.org/simple\"\nverify_ssl = true\nname = \"pypi\"\n\n[packages]\nspeechrecognition = \"*\"\nopenai = \"*\"\npython-dotenv = \"*\"\njinja2 = \"*\"\nflask = \"*\"\npyrss2gen = \"*\"\nfeedparser = \"*\"\ncertifi = \"*\"\ncd = \"*\"\npyyaml = \"*\"\nprompt-toolkit = \"*\"\npypodcastparser = \"*\"\nwerkzeug = \"*\"\nexceptiongroup = \"*\"\nzeroconf = \"*\"\nwaitress = \"*\"\nvalidators = \"*\"\nbeartype = \"*\"\nflask-sqlalchemy = \"*\"\nflask-migrate = \"*\"\nFlask-APScheduler = \"*\"\nffmpeg-python = \"*\"\nlitellm = \">=1.59.8,<1.75.0\"  # Pin to avoid fastuuid dependency\nbleach = \"*\"\ntypes-bleach = \"*\"\ngroq = \"*\"\nasync_timeout = \"*\"\npytest-cov = \"*\"\nflask-cors = \"*\"\nbcrypt = \"*\"\nstripe = \"*\"\n\n[dev-packages]\nblack = \"*\"\nmypy = \"*\"\ntypes-pyyaml = \"*\"\ntypes-requests = \"*\"\ntypes-waitress = \"*\"\npylint = \"*\"\npytest = \"*\"\ndill = \"*\"\nisort = \"*\"\ntypes-flask-migrate = \"*\"\npytest-mock = \"*\"\nwatchdog = \"*\"\nrequests = \"*\"\ntypes-flask-cors = \"*\"\n\n[requires]\npython_version = \"3.11\"\n"
  },
  {
    "path": "README.md",
    "content": "<h2 align=\"center\">\n<img width=\"50%\" src=\"src/app/static/images/logos/logo_with_text.png\" />\n\n</h2>\n\n<p align=\"center\">\n<p align=\"center\">Ad-block for podcasts. Create an ad-free RSS feed.</p>\n<p align=\"center\">\n  <a href=\"https://discord.gg/FRB98GtF6N\" target=\"_blank\">\n      <img src=\"https://img.shields.io/badge/discord-join-blue.svg?logo=discord&logoColor=white\" alt=\"Discord\">\n  </a>\n</p>\n\n## Overview\n\nPodly uses Whisper and Chat GPT to remove ads from podcasts.\n\n<img width=\"100%\" src=\"docs/images/screenshot.png\" />\n\n## How To Run\n\nYou have a few options to get started:\n\n- [![Deploy on Railway](https://railway.com/button.svg)](https://railway.com/deploy/podly?referralCode=NMdeg5&utm_medium=integration&utm_source=template&utm_campaign=generic)\n   - quick and easy setup in the cloud, follow our [Railway deployment guide](docs/how_to_run_railway.md). \n   - Use this if you want to share your Podly server with others.\n- **Run Locally**: \n   - For local development and customization, \n   - see our [beginner's guide for running locally](docs/how_to_run_beginners.md). 
\n   - Use this for the most cost-optimal & private setup.\n- **[Join The Preview Server](https://podly.up.railway.app/)**: \n   - pay what you want (limited sign ups available)\n\n\n## How it works:\n\n- You request an episode\n- Podly downloads the requested episode\n- Whisper transcribes the episode\n- LLM labels ad segments\n- Podly removes the ad segments\n- Podly delivers the ad-free version of the podcast to you\n\n### Cost Breakdown\n*Monthly cost breakdown for 5 podcasts*\n\n| Cost    | Hosting  | Transcription | LLM    |\n|---------|----------|---------------|--------|\n| **free**| local    | local         | local  |\n| **$2**  | local    | local         | remote |\n| **$5**  | local    | remote        | remote |\n| **$10** | public (railway)  | remote        | remote |\n| **Pay What You Want** | [preview server](https://podly.up.railway.app/)    | n/a         | n/a  |\n| **$5.99/mo** | https://zeroads.ai/ | production fork of podly | |\n\n\n## Contributing\n\nSee [contributing guide](docs/contributors.md) for local setup & contribution instructions.\n"
  },
  {
    "path": "SECURITY.md",
    "content": "# Security Policy\n\n## Supported Versions\n\nWe only support the latest on main & preview.\n\n## Reporting a Vulnerability\n\nPlease use the Private Vulnerability Reporting feature on GitHub:\n\n- Navigate to the Security tab of this repository.\n- Select \"Vulnerability reporting\" from the left-hand sidebar.\n- Click \"Report a vulnerability\" to open a private advisory.\n\nInclude as much detail as possible:\n\n- Steps to reproduce.\n- Potential impact.\n- Any suggested fixes.\n\nThis allows us to collaborate with you on a fix in a private workspace before the issue is made public.\n"
  },
  {
    "path": "compose.dev.cpu.yml",
    "content": "services:\n  podly:\n    container_name: podly-pure-podcasts\n    image: podly-pure-podcasts\n    volumes:\n      - ./src/instance:/app/src/instance\n    env_file:\n      - ./.env.local\n    build:\n      context: .\n      dockerfile: Dockerfile\n      args:\n        - BASE_IMAGE=${BASE_IMAGE:-python:3.11-slim}\n        - CUDA_VERSION=${CUDA_VERSION:-12.4.1}\n        - USE_GPU=${USE_GPU:-false}\n        - USE_GPU_NVIDIA=${USE_GPU_NVIDIA:-false}\n        - USE_GPU_AMD=${USE_GPU_AMD:-false}\n        - LITE_BUILD=${LITE_BUILD:-false}\n    ports:\n      - \"5001:5001\"\n    environment:\n      - PUID=${PUID:-1000}\n      - PGID=${PGID:-1000}\n      - CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:--1}\n      - SERVER_THREADS=${SERVER_THREADS:-1}\n    restart: unless-stopped\n    healthcheck:\n      test:\n        [\n          \"CMD\",\n          \"python3\",\n          \"-c\",\n          \"import urllib.request; urllib.request.urlopen('http://127.0.0.1:5001/')\",\n        ]\n      interval: 30s\n      timeout: 10s\n      retries: 3\n      start_period: 10s\n\nnetworks:\n  default:\n    name: podly-pure-podcasts-network\n"
  },
  {
    "path": "compose.dev.nvidia.yml",
    "content": "services:\n  podly:\n    extends:\n      file: compose.dev.cpu.yml\n      service: podly\n    env_file:\n      - ./.env.local\n    environment:\n      - PUID=${PUID:-1000}\n      - PGID=${PGID:-1000}\n      - CUDA_VISIBLE_DEVICES=0\n      - CORS_ORIGINS=*\n      - SERVER_THREADS=${SERVER_THREADS:-1}\n    deploy:\n      resources:\n        reservations:\n          devices:\n            - driver: nvidia\n              count: 1\n              capabilities: [gpu]\n\nnetworks:\n  default:\n    name: podly-pure-podcasts-network\n"
  },
  {
    "path": "compose.dev.rocm.yml",
    "content": "services:\n  podly:\n    extends:\n      file: compose.dev.cpu.yml\n      service: podly\n    env_file:\n      - ./.env.local\n    devices:\n      - /dev/kfd\n      - /dev/dri\n    environment:\n      - PUID=${PUID:-1000}\n      - PGID=${PGID:-1000}\n      - CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:--1}\n      - CORS_ORIGINS=*\n      - SERVER_THREADS=${SERVER_THREADS:-1}\n      # Don't ask me why this is needed for ROCM. See\n      # https://github.com/openai/whisper/discussions/55#discussioncomment-3714528\n      - HSA_OVERRIDE_GFX_VERSION=10.3.0\n    security_opt:\n      - seccomp=unconfined\n\nnetworks:\n  default:\n    name: podly-pure-podcasts-network\n# This would be ideal. Not currently supported, apparently. Or I just wasn't able to figure out the driver arg.\n# Tried: amdgpu, amd, rocm\n#    deploy:\n#      resources:\n#        reservations:\n#          devices:\n#            - capabilities: [gpu]\n#              driver: \"amdgpu\"\n#              count: 1\n"
  },
  {
    "path": "compose.yml",
    "content": "services:\n  podly:\n    container_name: podly-pure-podcasts\n    ports:\n      - \"5001:5001\"\n    image: ghcr.io/podly-pure-podcasts/podly-pure-podcasts:${BRANCH:-main-latest}\n    volumes:\n      - ./src/instance:/app/src/instance\n    env_file:\n      - ./.env.local\n    environment:\n      - PUID=${PUID:-1000}\n      - PGID=${PGID:-1000}\n      - CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:--1}\n      - SERVER_THREADS=${SERVER_THREADS:-1}\n    restart: unless-stopped\n    healthcheck:\n      test:\n        [\n          \"CMD\",\n          \"python3\",\n          \"-c\",\n          \"import urllib.request; urllib.request.urlopen('http://127.0.0.1:5001/')\",\n        ]\n      interval: 30s\n      timeout: 10s\n      retries: 3\n      start_period: 10s\n\nnetworks:\n  default:\n    name: podly-pure-podcasts-network\n"
  },
  {
    "path": "docker-entrypoint.sh",
    "content": "#!/bin/bash\nset -e\n\n# Check if PUID/PGID env variables are set\nif [ -n \"${PUID}\" ] && [ -n \"${PGID}\" ] && [ \"$(id -u)\" = \"0\" ]; then\n    echo \"Using custom UID:GID = ${PUID}:${PGID}\"\n    \n    # Update user/group IDs if needed\n    usermod -o -u \"$PUID\" appuser\n    groupmod -o -g \"$PGID\" appuser\n    \n    # Ensure required directories exist\n    mkdir -p /app/src/instance /app/src/instance/data /app/src/instance/data/in /app/src/instance/data/srv /app/src/instance/config /app/src/instance/db /app/src/instance/logs\n    \n    # Set permissions for all application directories\n    APP_DIRS=\"/home/appuser /app/processing /app/src/instance /app/src/instance/data /app/src/instance/config /app/src/instance/db /app/src/instance/logs /app/scripts\"\n    chown -R appuser:appuser $APP_DIRS 2>/dev/null || true\n    \n    # Ensure log file exists and has correct permissions in new location\n    touch /app/src/instance/logs/app.log\n    chmod 664 /app/src/instance/logs/app.log\n    chown appuser:appuser /app/src/instance/logs/app.log\n\n    # Run as appuser\n    export HOME=/home/appuser\n    exec gosu appuser \"$@\"\nelse\n    # Run as current user (but don't assume it's appuser)\n    exec \"$@\"\nfi "
  },
  {
    "path": "docs/contributors.md",
    "content": "# Contributor Guide\n\n### Quick Start (Docker - recommended for local setup)\n\n1. Make the script executable and run:\n\n```bash\nchmod +x run_podly_docker.sh\n./run_podly_docker.sh --build\n./run_podly_docker.sh # foreground with logs \n./run_podly_docker.sh -d # or detached\n```\n\nThis automatically detects NVIDIA GPUs and uses them if available.\n\nAfter the server starts:\n\n- Open `http://localhost:5001` in your browser\n- Configure settings at `http://localhost:5001/config`\n- Add podcast feeds and start processing\n\n## Usage\n\nOnce the server is running:\n\n1. Open `http://localhost:5001`\n2. Configure settings in the Config page at `http://localhost:5001/config`\n3. Add podcast RSS feeds through the web interface\n4. Open your podcast app and subscribe to the Podly endpoint (e.g., `http://localhost:5001/feed/1`)\n5. Select an episode and download\n\n## Transcription Options\n\nPodly supports multiple options for audio transcription:\n\n1. **Local Whisper (Default)**\n   - Slower but self-contained\n2. **OpenAI Hosted Whisper**\n   - Fast and accurate; billed per-feed via Stripe\n3. 
**Groq Hosted Whisper**\n   - Fast and cost-effective\n\nSelect your preferred method in the Config page (`/config`).\n\n## Remote Setup\n\nPodly automatically detects reverse proxies and generates appropriate URLs via request headers.\n\n### Reverse Proxy Examples\n\n**Nginx:**\n\n```nginx\nserver {\n    listen 443 ssl;\n    server_name your-domain.com;\n\n    location / {\n        proxy_pass http://localhost:5001;\n        proxy_set_header Host $host;\n        proxy_set_header X-Real-IP $remote_addr;\n        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;\n        proxy_set_header X-Forwarded-Proto $scheme;\n        proxy_set_header X-Forwarded-Host $host;\n    }\n}\n```\n\n**Traefik (docker-compose.yml):**\n\n```yaml\nlabels:\n  - \"traefik.enable=true\"\n  - \"traefik.http.routers.podly.rule=Host(`your-domain.com`)\"\n  - \"traefik.http.routers.podly.tls.certresolver=letsencrypt\"\n  - \"traefik.http.services.podly.loadbalancer.server.port=5001\"\n```\n\n> **Note**: Most modern reverse proxies automatically set the required headers. No manual configuration is needed in most cases.\n\n### Built-in Authentication\n\nPodly ships with built-in authentication so you can secure feeds without relying on a reverse proxy.\n\n- Set `REQUIRE_AUTH=true` to enable protection. By default it is `false`, preserving existing behaviour.\n- When auth is enabled, Podly fails fast on startup unless `PODLY_ADMIN_PASSWORD` is supplied and meets the strength policy (≥12 characters with upper, lower, digit, symbol). Override the initial username with `PODLY_ADMIN_USERNAME` (default `podly_admin`).\n- Provide a long, random `PODLY_SECRET_KEY` so Flask sessions remain valid across restarts. If you omit it, the app generates a new key on each boot and all users are signed out.\n- On first boot with an empty database, Podly seeds an admin user using the supplied credentials. 
**If you are enabling auth on an existing install, start from a fresh data volume.**\n- After signing in, open the Config page to rotate your password and manage additional users. When you change the admin password, update the corresponding environment variable in your deployment platform so restarts continue to succeed.\n- Use the \"Copy protected feed\" button to generate feed-specific access tokens that are embedded in subscription URLs so podcast clients can authenticate without your primary password. Rate limiting is still applied to repeated authentication failures.\n\n## Ubuntu Service\n\nAdd a service file to /etc/systemd/system/podly.service\n\n```\n[Unit]\nDescription=Podly Podcast Service\nAfter=network.target\n\n[Service]\nUser=yourusername\nGroup=yourusername\nWorkingDirectory=/path/to/your/app\nExecStart=/usr/bin/pipenv run python src/main.py\nRestart=always\n\n[Install]\nWantedBy=multi-user.target\n```\n\nenable the service\n\n```\nsudo systemctl daemon-reload\nsudo systemctl enable podly.service\n```\n\n## Database Update\n\nThe database auto-migrates on launch.\n\nTo add a migration after data model change:\n\n```bash\npipenv run flask --app ./src/main.py db migrate -m \"[change description]\"\n```\n\nOn next launch, the database updates automatically.\n\n## Releases and Commit Messages\n\nThis repo uses `semantic-release` to automate versioning and GitHub releases. 
It relies on\nConventional Commits to determine the next version.\n\nFor pull requests, include **at least one** commit that follows the Conventional Commit format:\n\n- `feat: add new episode filter`\n- `fix(api): handle empty feed`\n- `chore: update dependencies`\n\nIf no Conventional Commit is present, the release pipeline will have nothing to publish.\n\n## Docker Support\n\nPodly can be run in Docker with support for both NVIDIA GPU and non-NVIDIA environments.\n\n### Docker Options\n\n```bash\n./run_podly_docker.sh --dev          # rebuild containers for local changes\n./run_podly_docker.sh --production   # use published images\n./run_podly_docker.sh --lite         # smaller image without local Whisper\n./run_podly_docker.sh --cpu          # force CPU mode\n./run_podly_docker.sh --gpu          # force GPU mode\n./run_podly_docker.sh --build        # build only\n./run_podly_docker.sh --test-build   # test build\n./run_podly_docker.sh -d             # detached\n```\n\n### Development vs Production Modes\n\n**Development Mode** (default):\n\n- Uses local Docker builds\n- Requires rebuilding after code changes: `./run_podly_docker.sh --dev`\n- Mounts essential directories (config, input/output, database) and live code for development\n- Good for: development, testing, customization\n\n**Production Mode**:\n\n- Uses pre-built images from GitHub Container Registry\n- No building required - images are pulled automatically\n- Same volume mounts as development\n- Good for: deployment, quick setup, consistent environments\n\n```bash\n# Start with existing local container\n./run_podly_docker.sh\n\n# Rebuild and start after making code changes\n./run_podly_docker.sh --dev\n\n# Use published images (no local building required)\n./run_podly_docker.sh --production\n```\n\n### Docker Environment Configuration\n\n**Environment Variables**:\n\n- `PUID`/`PGID`: User/group IDs for file permissions (automatically set by run script)\n- `CUDA_VISIBLE_DEVICES`: GPU device selection 
for CUDA acceleration\n- `CORS_ORIGINS`: Backend CORS configuration (defaults to accept requests from any origin)\n\n## FAQ\n\nQ: What does \"whitelisted\" mean in the UI?\n\nA: It means an episode is eligible for download and ad removal. By default, new episodes are automatically whitelisted (`automatically_whitelist_new_episodes`), and only a limited number of old episodes are auto-whitelisted (`number_of_episodes_to_whitelist_from_archive_of_new_feed`). Adjust these settings in the Config page (/config).\n\nQ: How can I enable whisper GPU acceleration?\n\nA: There are two ways to enable GPU acceleration:\n\n1. **Using Docker**:\n\n   - Use the provided Docker setup with `run_podly_docker.sh` which automatically detects and uses NVIDIA GPUs if available\n   - You can force GPU mode with `./run_podly_docker.sh --gpu` or force CPU mode with `./run_podly_docker.sh --cpu`\n\n2. **In a local environment**:\n   - Install the CUDA version of PyTorch to your virtual environment:\n   ```bash\n   pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118\n   ```\n\n## Contributing\n\nWe welcome contributions to Podly! Here's how you can help:\n\n### Development Setup\n\n1. Fork the repository\n2. Clone your fork:\n   ```bash\n   git clone https://github.com/yourusername/podly.git\n   ```\n3. Create a new branch for your feature:\n   ```bash\n   git checkout -b feature/your-feature-name\n   ```\n4. 
Create a pull request with a target branch of Preview\n\n#### Application Ports\n\nBoth local and Docker deployments provide a consistent experience:\n\n- **Application**: Runs on port 5001 (configurable via web UI at `/config`)\n  - Serves both the web interface and API endpoints\n  - Frontend is built as static assets and served by the backend\n- **Development**: `run_podly_docker.sh` serves everything on port 5001\n  - Local script builds frontend to static assets (like Docker)\n  - Restart `./run_podly_docker.sh` after frontend changes to rebuild assets\n\n#### Development Modes\n\nBoth scripts provide equivalent core functionality with some unique features:\n\n**Common Options (work in both scripts)**:\n\n- `-b/--background` or `-d/--detach`: Run in background mode\n- `-h/--help`: Show help information\n\n**Local Development**\n\n**Docker Development** (`./run_podly_docker.sh`):\n\n- **Development mode**: `./run_podly_docker.sh --dev` - rebuilds containers with code changes\n- **Production mode**: `./run_podly_docker.sh --production` - uses pre-built images\n- **Docker-specific options**: `--build`, `--test-build`, `--gpu`, `--cpu`, `--cuda=VERSION`, `--rocm=VERSION`, `--branch=BRANCH`\n\n**Functional Equivalence**:\nBoth scripts provide the same core user experience:\n\n- Application runs on port 5001 (configurable)\n- Frontend served as static assets by Flask backend\n- Same web interface and API endpoints\n- Compatible background/detached modes\n\n### Running Tests\n\nBefore submitting a pull request, you can run the same tests that run in CI:\n\nTo prep your pipenv environment to run this script, you will need to first run:\n\n```bash\npipenv install --dev\n```\n\nThen, to run the checks,\n\n```bash\nscripts/ci.sh\n```\n\nThis will run all the necessary checks including:\n\n- Type checking with mypy\n- Code formatting checks\n- Unit tests\n- Linting\n\n### Pull Request Process\n\n1. Ensure all tests pass locally\n2. Update the documentation if needed\n3. 
Create a Pull Request with a clear description of the changes\n4. Link any related issues\n\n### Code Style\n\n- We use black for code formatting\n- Type hints are required for all new code\n- Follow existing patterns in the codebase\n"
  },
  {
    "path": "docs/how_to_run_beginners.md",
    "content": "# How To Run: Ultimate Beginner's Guide\n\nThis guide will walk you through setting up Podly from scratch using Docker. Podly creates ad-free RSS feeds for podcasts by automatically detecting and removing advertisement segments.\n\n## Highly Recommend!\n\nWant an expert to guide you through the setup? Download an AI powered IDE like cursor https://www.cursor.com/ or windsurf https://windsurf.com/\n\nMost IDEs have a free tier you can use to get started. Alternatively, you can use your own [LLM API key in Cursor](https://docs.cursor.com/settings/api-keys) (you'll need a key for Podly anyways).\n\nOpen the AI chat in the IDE. Enable 'Agent' mode if available, which will allow the IDE to help you run commands, view the output, and debug or take corrective steps if necessary.\n\nPaste one of the prompts below into the chat box.\n\nIf you don't have the repo downloaded:\n\n```\nHelp me install docker and run Podly https://github.com/podly-pure-podcasts/podly_pure_podcasts\nAfter the project is cloned, help me:\n- install docker & docker compose\n- run `./run_podly_docker.sh --build` then `./run_podly_docker.sh -d`\n- configure the app via the web UI at http://localhost:5001/config\nBe sure to check if a dependency is already installed before downloading.\nWe recommend Docker because installing ffmpeg & local whisper can be difficult.\nThe Docker image has both ffmpeg & local whisper preconfigured.\nPodly works with many different LLMs, it does not require an OpenAI key.\nCheck your work by retrieving the index page from localhost:5001 at the end.\n```\n\nIf you do have the repo pulled, open this file and prompt:\n\n```\nReview this project, follow this guide and start Podly on my computer.\nBriefly, help me:\n- install docker & docker compose\n- run `./run_podly_docker.sh --build` and then `./run_podly_docker.sh -d`\n- configure the app via the web UI at http://localhost:5001/config\nBe sure to check if a dependency is already installed before 
downloading.\nWe recommend docker because installing ffmpeg & local whisper can be difficult.\nThe docker image has both ffmpeg & local whisper preconfigured.\nPodly works with many different LLMs; it does not need to work with OpenAI.\nCheck your work by retrieving the index page from localhost:5001 at the end.\n```\n\n## Prerequisites\n\n### Install Docker and Docker Compose\n\n#### On Windows:\n\n1. Download and install [Docker Desktop for Windows](https://docs.docker.com/desktop/install/windows-install/)\n2. During installation, make sure \"Use WSL 2 instead of Hyper-V\" is checked\n3. Restart your computer when prompted\n4. Open Docker Desktop and wait for it to start completely\n\n#### On macOS:\n\n1. Download and install [Docker Desktop for Mac](https://docs.docker.com/desktop/install/mac-install/)\n2. Drag Docker to your Applications folder\n3. Launch Docker Desktop from Applications\n4. Follow the setup assistant\n\n#### On Linux (Ubuntu/Debian):\n\n```bash\n# Update package index\nsudo apt update\n\n# Install Docker\nsudo apt install docker.io docker-compose-v2\n\n# Add your user to the docker group\nsudo usermod -aG docker $USER\n\n# Log out and log back in for group changes to take effect\n```\n\n#### Verify Installation:\n\nOpen a terminal/command prompt and run:\n\n```bash\ndocker --version\ndocker compose version\n```\n\nYou should see version information for both commands.\n\n### 2. Get an OpenAI API Key\n\n1. Go to [OpenAI's API platform](https://platform.openai.com/)\n2. Sign up for an account or log in if you already have one\n3. Navigate to the [API Keys section](https://platform.openai.com/api-keys)\n4. Click \"Create new secret key\"\n5. Give it a name (e.g., \"Podly\")\n6. **Important**: Copy the key immediately and save it somewhere safe - you won't be able to see it again!\n7. Your API key will look something like: `sk-proj-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx`\n\n> **Note**: OpenAI API usage requires payment. 
Make sure to set up billing and usage limits in your OpenAI account to avoid unexpected charges.\n\n## Setup Podly\n\n### Download the Project\n\n```bash\ngit clone https://github.com/normand1/podly_pure_podcasts.git\ncd podly_pure_podcasts\n```\n\n## Running Podly\n\n### Run the Application via Docker\n\n```bash\nchmod +x run_podly_docker.sh\n./run_podly_docker.sh --build\n./run_podly_docker.sh            # foreground\n./run_podly_docker.sh -d         # detached\n```\n\n### Optional: Enable Authentication\n\nThe Docker image reads environment variables from `.env` files or your shell. To require login:\n\n1. Export the variables before running Podly, or add them to `config/.env`:\n\n```bash\nexport REQUIRE_AUTH=true\nexport PODLY_ADMIN_USERNAME='podly_admin'\nexport PODLY_ADMIN_PASSWORD='SuperSecurePass!2024'\nexport PODLY_SECRET_KEY='replace-with-a-strong-64-char-secret'\n```\n\n2. Start Podly as usual. On first boot with auth enabled and an empty database, the admin account is created automatically. If you are turning auth on for an existing volume, clear the `sqlite3.db` file so the bootstrap can succeed.\n\n3. Sign in at `http://localhost:5001`, then visit the Config page to change your password, add users, and copy RSS URLs with the \"Copy protected feed\" button. Podly generates feed-specific access tokens and embeds them in the link so podcast players can subscribe without exposing your main password. Remember to update your environment variables whenever you rotate the admin password.\n\n### First Run\n\n1. Docker will download and build the necessary image (this may take 5-15 minutes)\n2. Look for \"Running on http://0.0.0.0:5001\"\n3. Open your browser to `http://localhost:5001`\n4. Configure settings at `http://localhost:5001/config`\n   - Alternatively, set secrets via Docker env file `.env.local` in the project root and restart the container. 
See .env.local.example\n\n## Advanced Options\n\n```bash\n# Force CPU-only processing (if you have GPU issues)\n./run_podly_docker.sh --cpu\n\n# Force GPU processing\n./run_podly_docker.sh --gpu\n\n# Just build the container without running\n./run_podly_docker.sh --build\n\n# Test build from scratch (useful for troubleshooting)\n./run_podly_docker.sh --test-build\n```\n\n## Using Podly\n\n### Adding Your First Podcast\n\n1. In the web interface, look for an \"Add Podcast\" or similar button\n2. Paste the RSS feed URL of your podcast\n3. Podly will start processing new episodes automatically\n4. Processed episodes will have advertisements removed\n\n### Getting Your Ad-Free RSS Feed\n\n1. After adding a podcast, Podly will generate a new RSS feed URL\n2. Use this new URL in your podcast app instead of the original\n3. Your podcast app will now download ad-free versions!\n\n## Troubleshooting\n\n### \"Docker command not found\"\n\n- Make sure Docker Desktop is running\n- On Windows, restart your terminal after installing Docker\n- On Linux, make sure you logged out and back in after adding yourself to the docker group\n\n### Cannot connect to the Docker daemon. 
Is the docker daemon running?\n\n- If using docker desktop, open up the app, otherwise start the daemon\n\n### \"Permission denied\" errors\n\n- On macOS/Linux, make sure the script is executable: `chmod +x run_podly_docker.sh`\n- On Windows, try running Command Prompt as Administrator\n\n### OpenAI API errors\n\n- Double-check your API key in the Config page at `/config`\n- Make sure you have billing set up in your OpenAI account\n- Check your usage limits haven't been exceeded\n\n### Port 5001 already in use\n\n- Another application is using port 5001\n- **Docker users**: Either stop that application or modify the port in `compose.dev.cpu.yml` and `compose.yml`\n- **Native users**: Change the port in the Config page under App settings\n- To kill processes on that port run `lsof -i :5001 | grep LISTEN | awk '{print $2}' | xargs kill -9`\n\n### Out of memory errors\n\n- Close other applications to free up RAM\n- Consider using `--cpu` flag if you have limited memory\n\n## Stopping Podly\n\nTo stop the application:\n\nIf you have launched it in the foreground by omitting the `-d` parameter:\n1. In the terminal where Podly is running, press `Ctrl+C`\n2. Wait for the container to stop gracefully\n\nIf you have launched it in the background using the `-d` parameter:\n1. In the terminal where Podly is running, execute `docker compose down`\n2. Wait for the container to stop gracefully\n\nIn both cases this output should appear to indicate that it has stopped:\n\n```sh\n[+] Running 2/2\n ✔ Container podly-pure-podcasts        Removed\n ✔ Network podly-pure-podcasts-network  Removed\n```\n\n## Upgrading Podly\n\nTo upgrade the application while you are in the terminal where it is running:\n1. [Stop it](#stopping-podly)\n2. Execute `git pull`\n3. 
[Run it again](#running-podly)\n\n## Getting Help\n\nIf you encounter issues ask in our discord, we're friendly!\n\nhttps://discord.gg/FRB98GtF6N\n\n## What's Next?\n\nOnce you have Podly running:\n\n- Explore the web interface to add more podcasts\n- Configure settings in the Config page\n- Consider setting up automatic background processing\n- Enjoy your ad-free podcasts!\n"
  },
  {
    "path": "docs/how_to_run_railway.md",
    "content": "# How to Run on Railway\n\nThis guide will walk you through deploying Podly on Railway using the one-click template.\n\n## 0. Important! Set Budgets\n\nBoth Railway and Groq allow you to set budgets on your processing. Set a $10 (minimum possible, expect smaller bill) budget on Railway. Set a $5 budget for Groq.\n\n## 1. Get Free Groq API Key\n\nPodly uses Groq to transcribe podcasts quickly and for free.\n\n1.  Go to [https://console.groq.com/keys](https://console.groq.com/keys).\n2.  Sign up for a free account.\n3.  Create a new API key.\n4.  Copy the key and paste it into the `GROQ_API_KEY` field during the Railway deployment.\n\n## 2. Deploy Railway Template\n\nClick the button below to deploy Podly to Railway. This is a sponsored link that supports the project!\n\n[![Deploy on Railway](https://railway.com/button.svg)](https://railway.com/deploy/podly?referralCode=NMdeg5&utm_medium=integration&utm_source=template&utm_campaign=generic)\n\nIf you want to be a beta-tester, you can deploy the preview branch instead:\n\n[![Deploy on Railway](https://railway.com/button.svg)](https://railway.com/deploy/podly-preview?referralCode=NMdeg5&utm_medium=integration&utm_source=template&utm_campaign=generic)\n\n## 3. Configure Networking\n\nAfter the deployment is complete, you need to expose the service to the internet.\n\n1.  Click on the new deployment in your Railway dashboard.\n2.  Go to the **Settings** tab.\n3.  Under **Networking**, find the **Public Networking** section and click **Generate Domain**.\n4.  You can now access Podly at the generated URL.\n5.  (Optional) To change the domain name, click **Edit** and enter a new name.\n\n![Setting up Railway Networking](images/setting_up_railway_networking.png)\n\n## 4. 
Set Budgets & Expected Pricing\n\nSet a $10 budget on Railway and a $5 budget on Groq (or use the free tier for Groq which will slow processing).\n\nPodly is designed to run efficiently on Railway's hobby plan.\n\nIf you process a large volume of podcasts, you can check the **Config** page in your Podly deployment for estimated monthly costs based on your usage.\n\n## 5. Secure Your Deployment\n\nPodly now uses secure session cookies for the web dashboard while keeping HTTP Basic authentication for RSS feeds and audio downloads. Before inviting listeners, secure the app:\n\n1. In the Railway dashboard, open your Podly service and head to **Variables**.\n2. Add `REQUIRE_AUTH` with value `true`.\n3. Add a strong `PODLY_ADMIN_PASSWORD` (minimum 12 characters including uppercase, lowercase, digit, and symbol). Optionally set `PODLY_ADMIN_USERNAME`.\n4. Provide a long, random `PODLY_SECRET_KEY` so session cookies survive restarts. (If you omit it, Podly will generate a new key each deploy and sign everyone out.)\n5. Redeploy the service. On first boot Podly seeds the admin user and requires those credentials on every request.\n\n> **Important:** Enabling auth on an existing deployment requires a fresh data volume. Create a new Railway deployment or wipe the existing storage so the initial admin can be seeded.\n\nAfter signing in, use the Config page to change your password, add additional users, and copy RSS links via the \"Copy protected feed\" button. Podly issues feed-specific access tokens and embeds them in each URL so listeners can subscribe without knowing your main password. When you rotate passwords, update the corresponding Railway variables so restarts succeed.\n\n## 6. Using Podly\n\n1.  Open your new Podly URL in a browser.\n2.  Navigate to the **Feeds** page.\n3.  Add the RSS feed URL of a podcast you want to process.\n4.  
Go to your favorite podcast client and subscribe to the new feed URL provided by Podly (e.g., `https://your-podly-app.up.railway.app/feed/1`).\n5.  Download and enjoy ad-free episodes!\n"
  },
  {
    "path": "docs/todo.txt",
    "content": "- config audit & testing (advanced and basic)\n- move host/port/threads to docker config\n- reaudit security + testing\n- ci.sh\n- test railway\n- login for public facing\n- podcast rss search\n\n'basic' config page - just put in groq api key + test + save on populate\nalso show if api key is set or blank\n\ntest hide 'local' whisper in lite build"
  },
  {
    "path": "frontend/.gitignore",
    "content": "# Logs\nlogs\n*.log\nnpm-debug.log*\nyarn-debug.log*\nyarn-error.log*\npnpm-debug.log*\nlerna-debug.log*\n\nnode_modules\ndist\ndist-ssr\n*.local\n\n# Editor directories and files\n.vscode/*\n!.vscode/extensions.json\n.idea\n.DS_Store\n*.suo\n*.ntvs*\n*.njsproj\n*.sln\n*.sw?\n"
  },
  {
    "path": "frontend/README.md",
    "content": "# Podly Frontend\n\nThis is the React + TypeScript + Vite frontend for Podly. The frontend is built and served as part of the main Podly application.\n\n## Development\n\nThe frontend is integrated into the main Podly application and served as static assets by the Flask backend on port 5001.\n\n### Development Workflows\n\n1. **Docker (recommended)**: The Docker build compiles the frontend during image creation and serves static assets from Flask.\n\n2. **Direct Frontend Development**: You can run the frontend development server separately for advanced frontend work:\n\n   ```bash\n   cd frontend\n   npm install\n   npm run dev\n   ```\n\n   This starts the Vite development server on port 5173 with hot reloading and proxies API calls to the backend on port 5001.\n\n### Build Process\n\n- **Direct Development** (`npm run dev`): Vite dev server serves files with hot reloading on port 5173 and proxies API calls to backend on port 5001\n- **Docker**: Multi-stage build compiles frontend assets during image creation and copies them to the Flask static directory\n\n## Technology Stack\n\n- **React 18+** with TypeScript\n- **Vite** for build tooling and development server\n- **Tailwind CSS** for styling\n- **React Router** for client-side routing\n- **Tanstack Query** for data fetching\n\n## Configuration\n\nThe frontend configuration is handled through:\n\n- **Environment Variables**: Set via Vite's environment variable system\n- **Vite Config**: `vite.config.ts` for build and development settings\n  - Development server runs on port 5173\n  - Proxies API calls to backend on port 5001 (configurable via `BACKEND_TARGET`)\n- **Tailwind Config**: `tailwind.config.js` for styling configuration\n"
  },
  {
    "path": "frontend/eslint.config.js",
    "content": "import js from '@eslint/js'\nimport globals from 'globals'\nimport reactHooks from 'eslint-plugin-react-hooks'\nimport reactRefresh from 'eslint-plugin-react-refresh'\nimport tseslint from 'typescript-eslint'\n\nexport default tseslint.config(\n  { ignores: ['dist'] },\n  {\n    extends: [js.configs.recommended, ...tseslint.configs.recommended],\n    files: ['**/*.{ts,tsx}'],\n    languageOptions: {\n      ecmaVersion: 2020,\n      globals: globals.browser,\n    },\n    plugins: {\n      'react-hooks': reactHooks,\n      'react-refresh': reactRefresh,\n    },\n    rules: {\n      ...reactHooks.configs.recommended.rules,\n      'react-refresh/only-export-components': [\n        'warn',\n        { allowConstantExport: true },\n      ],\n    },\n  },\n)\n"
  },
  {
    "path": "frontend/index.html",
    "content": "<!doctype html>\n<html lang=\"en\">\n  <head>\n    <meta charset=\"UTF-8\" />\n    <link rel=\"icon\" type=\"image/x-icon\" href=\"/favicon.ico\" />\n    <meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\" />\n    <title>Podly</title>\n  </head>\n  <body>\n    <div id=\"root\"></div>\n    <script type=\"module\" src=\"/src/main.tsx\"></script>\n  </body>\n</html>\n"
  },
  {
    "path": "frontend/package.json",
    "content": "{\n  \"name\": \"frontend\",\n  \"private\": true,\n  \"version\": \"0.0.0\",\n  \"type\": \"module\",\n  \"scripts\": {\n    \"dev\": \"vite\",\n    \"build\": \"tsc -b && vite build\",\n    \"lint\": \"eslint .\",\n    \"preview\": \"vite preview\"\n  },\n  \"dependencies\": {\n    \"@tailwindcss/line-clamp\": \"^0.4.4\",\n    \"@tanstack/react-query\": \"^5.77.0\",\n    \"axios\": \"^1.9.0\",\n    \"clsx\": \"^2.1.1\",\n    \"react\": \"^19.1.0\",\n    \"react-dom\": \"^19.1.0\",\n    \"react-hot-toast\": \"^2.6.0\",\n    \"react-router-dom\": \"^7.6.1\",\n    \"tailwind-merge\": \"^3.3.0\"\n  },\n  \"devDependencies\": {\n    \"@eslint/js\": \"^9.25.0\",\n    \"@types/react\": \"^19.1.2\",\n    \"@types/react-dom\": \"^19.1.2\",\n    \"@vitejs/plugin-react\": \"^4.4.1\",\n    \"autoprefixer\": \"^10.4.21\",\n    \"eslint\": \"^9.25.0\",\n    \"eslint-plugin-react-hooks\": \"^5.2.0\",\n    \"eslint-plugin-react-refresh\": \"^0.4.19\",\n    \"globals\": \"^16.0.0\",\n    \"postcss\": \"^8.5.3\",\n    \"tailwindcss\": \"^3.4.17\",\n    \"typescript\": \"~5.8.3\",\n    \"typescript-eslint\": \"^8.30.1\",\n    \"vite\": \"^6.3.5\"\n  }\n}\n"
  },
  {
    "path": "frontend/postcss.config.js",
    "content": "export default {\n  plugins: {\n    tailwindcss: {},\n    autoprefixer: {},\n  },\n}\n"
  },
  {
    "path": "frontend/src/App.css",
    "content": "html, body {\n  margin: 0 !important;\n  padding: 0 !important;\n  height: 100% !important;\n  overflow: hidden !important;\n}\n\n#root {\n  height: 100vh !important;\n  overflow: hidden !important;\n  max-width: none !important;\n  margin: 0 !important;\n  padding: 0 !important;\n}\n\n.logo {\n  height: 6em;\n  padding: 1.5em;\n  will-change: filter;\n  transition: filter 300ms;\n}\n.logo:hover {\n  filter: drop-shadow(0 0 2em #646cffaa);\n}\n.logo.react:hover {\n  filter: drop-shadow(0 0 2em #61dafbaa);\n}\n\n@keyframes logo-spin {\n  from {\n    transform: rotate(0deg);\n  }\n  to {\n    transform: rotate(360deg);\n  }\n}\n\n@media (prefers-reduced-motion: no-preference) {\n  .logo {\n    animation: logo-spin infinite 20s linear;\n  }\n}\n\n.card {\n  padding: 2em;\n}\n\n.read-the-docs {\n  color: #888;\n}\n\n/* Audio Player Styles */\n.audio-player-progress {\n  transition: all 0.1s ease;\n}\n\n.audio-player-progress:hover {\n  height: 6px;\n}\n\n.audio-player-progress-thumb {\n  transition: all 0.2s ease;\n  transform: scale(0);\n}\n\n.audio-player-progress:hover .audio-player-progress-thumb {\n  transform: scale(1);\n}\n\n.audio-player-volume-slider {\n  transition: all 0.2s ease;\n}\n\n/* Custom scrollbar for better UX */\n::-webkit-scrollbar {\n  width: 6px;\n}\n\n::-webkit-scrollbar-track {\n  background: #f1f1f1;\n}\n\n::-webkit-scrollbar-thumb {\n  background: #c1c1c1;\n  border-radius: 3px;\n}\n\n::-webkit-scrollbar-thumb:hover {\n  background: #a8a8a8;\n}\n"
  },
  {
    "path": "frontend/src/App.tsx",
    "content": "import { QueryClient, QueryClientProvider } from '@tanstack/react-query';\nimport { Toaster } from 'react-hot-toast';\nimport { BrowserRouter as Router, Routes, Route, Link, Navigate, useLocation } from 'react-router-dom';\nimport { AudioPlayerProvider } from './contexts/AudioPlayerContext';\nimport { AuthProvider, useAuth } from './contexts/AuthContext';\nimport { useQuery } from '@tanstack/react-query';\nimport { useState, useEffect, useRef } from 'react';\nimport HomePage from './pages/HomePage';\nimport JobsPage from './pages/JobsPage';\nimport ConfigPage from './pages/ConfigPage';\nimport LoginPage from './pages/LoginPage';\nimport LandingPage from './pages/LandingPage';\nimport BillingPage from './pages/BillingPage';\nimport AudioPlayer from './components/AudioPlayer';\nimport { billingApi } from './services/api';\nimport { DiagnosticsProvider, useDiagnostics } from './contexts/DiagnosticsContext';\nimport DiagnosticsModal from './components/DiagnosticsModal';\nimport './App.css';\n\nconst queryClient = new QueryClient({\n  defaultOptions: {\n    queries: {\n      staleTime: 0,\n      gcTime: 0,\n      refetchOnMount: 'always',\n      refetchOnWindowFocus: 'always',\n      refetchOnReconnect: 'always',\n    },\n  },\n});\n\nfunction AppShell() {\n  const { status, requireAuth, isAuthenticated, user, logout, landingPageEnabled } = useAuth();\n  const { open: openDiagnostics } = useDiagnostics();\n  const [mobileMenuOpen, setMobileMenuOpen] = useState(false);\n  const mobileMenuRef = useRef<HTMLDivElement>(null);\n  const location = useLocation();\n  const { data: billingSummary } = useQuery({\n    queryKey: ['billing', 'summary'],\n    queryFn: billingApi.getSummary,\n    enabled: !!user && requireAuth && isAuthenticated,\n    retry: false,\n  });\n\n  // Close mobile menu on route change\n  useEffect(() => {\n    setMobileMenuOpen(false);\n  }, [location.pathname]);\n\n  // Close mobile menu when clicking outside\n  useEffect(() => {\n    
function handleClickOutside(event: MouseEvent) {\n      if (mobileMenuRef.current && !mobileMenuRef.current.contains(event.target as Node)) {\n        setMobileMenuOpen(false);\n      }\n    }\n    if (mobileMenuOpen) {\n      document.addEventListener('mousedown', handleClickOutside);\n      return () => document.removeEventListener('mousedown', handleClickOutside);\n    }\n  }, [mobileMenuOpen]);\n\n  if (status === 'loading') {\n    return (\n      <div className=\"h-screen flex items-center justify-center bg-gray-50\">\n        <div className=\"flex flex-col items-center gap-4\">\n          <div className=\"animate-spin rounded-full h-10 w-10 border-b-2 border-blue-600\" />\n          <p className=\"text-sm text-gray-600\">Loading authentication…</p>\n        </div>\n      </div>\n    );\n  }\n\n  // Show landing page for unauthenticated users when auth is required\n  // But allow access to /login route\n  if (requireAuth && !isAuthenticated) {\n    return (\n      <Routes>\n        <Route path=\"/login\" element={<LoginPage />} />\n        {landingPageEnabled ? 
(\n          <Route path=\"*\" element={<LandingPage />} />\n        ) : (\n          <>\n            <Route path=\"/\" element={<Navigate to=\"/login\" replace />} />\n            <Route path=\"*\" element={<Navigate to=\"/login\" replace />} />\n          </>\n        )}\n      </Routes>\n    );\n  }\n\n  const isAdmin = !requireAuth || user?.role === 'admin';\n  const showConfigLink = !requireAuth || isAdmin;\n  const showJobsLink = !requireAuth || isAdmin;\n  const showBillingLink = requireAuth && !isAdmin;\n\n  return (\n    <div className=\"h-screen bg-gray-50 flex flex-col overflow-hidden\">\n      <header className=\"bg-white shadow-sm border-b flex-shrink-0\">\n        <div className=\"px-2 sm:px-4 lg:px-6\">\n          <div className=\"flex items-center justify-between h-12\">\n            <div className=\"flex items-center\">\n              <Link to=\"/\" className=\"flex items-center\">\n                <img \n                  src=\"/images/logos/logo.webp\" \n                  alt=\"Podly\" \n                  className=\"h-6 w-auto\"\n                />\n                <h1 className=\"ml-2 text-lg font-semibold text-gray-900\">\n                  Podly\n                </h1>\n              </Link>\n            </div>\n\n            {/* Desktop Navigation */}\n            <nav className=\"hidden md:flex items-center space-x-4\">\n              <Link to=\"/\" className=\"text-sm font-medium text-gray-700 hover:text-gray-900\">\n                Home\n              </Link>\n              {showBillingLink && (\n                <Link to=\"/billing\" className=\"text-sm font-medium text-gray-700 hover:text-gray-900\">\n                  Billing\n                </Link>\n              )}\n              {showJobsLink && (\n                <Link to=\"/jobs\" className=\"text-sm font-medium text-gray-700 hover:text-gray-900\">\n                  Jobs\n                </Link>\n              )}\n              {showConfigLink && (\n                <Link 
to=\"/config\" className=\"text-sm font-medium text-gray-700 hover:text-gray-900\">\n                  Config\n                </Link>\n              )}\n              <button\n                type=\"button\"\n                onClick={() => openDiagnostics()}\n                className=\"text-sm font-medium text-gray-700 hover:text-gray-900\"\n              >\n                Report issue\n              </button>\n              {requireAuth && user && (\n                <div className=\"flex items-center gap-3 text-sm text-gray-600 flex-shrink-0\">\n                  {billingSummary && !isAdmin && (\n                    <>\n                      <div\n                        className=\"px-2 py-1 rounded-md border border-blue-200 text-blue-700 bg-blue-50 text-xs whitespace-nowrap\"\n                        title=\"Feeds included in your plan\"\n                      >\n                        Feeds {billingSummary.feeds_in_use}/{billingSummary.feed_allowance}\n                      </div>\n                      <Link\n                        to=\"/billing\"\n                        className=\"px-2 py-1 rounded-md border border-blue-200 text-blue-700 bg-white hover:bg-blue-50 text-xs whitespace-nowrap transition-colors\"\n                      >\n                        Change plan\n                      </Link>\n                    </>\n                  )}\n                  <span className=\"hidden sm:inline whitespace-nowrap\">{user.username}</span>\n                  <button\n                    onClick={logout}\n                    className=\"px-3 py-1 border border-gray-200 rounded-md hover:bg-gray-100 transition-colors whitespace-nowrap\"\n                  >\n                    Logout\n                  </button>\n                </div>\n              )}\n            </nav>\n\n            {/* Mobile: Credits + Hamburger */}\n            <div className=\"md:hidden flex items-center gap-2\">\n              {requireAuth && user && billingSummary && !isAdmin 
&& (\n                <>\n                  <div\n                    className=\"px-2 py-1 rounded-md border border-blue-200 text-blue-700 bg-blue-50 text-xs whitespace-nowrap\"\n                    title=\"Feeds included in your plan\"\n                  >\n                    Feeds {billingSummary.feeds_in_use}/{billingSummary.feed_allowance}\n                  </div>\n                  <Link\n                    to=\"/billing\"\n                    className=\"px-2 py-1 rounded-md border border-blue-200 text-blue-700 bg-white text-xs whitespace-nowrap\"\n                  >\n                    Change plan\n                  </Link>\n                </>\n              )}\n\n              {/* Hamburger Button */}\n              <div className=\"relative\" ref={mobileMenuRef}>\n                <button\n                  onClick={() => setMobileMenuOpen(!mobileMenuOpen)}\n                  className=\"p-2 rounded-md text-gray-600 hover:text-gray-900 hover:bg-gray-100 transition-colors\"\n                  aria-label=\"Toggle menu\"\n                >\n                  {mobileMenuOpen ? 
(\n                    <svg className=\"h-6 w-6\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\">\n                      <path strokeLinecap=\"round\" strokeLinejoin=\"round\" strokeWidth={2} d=\"M6 18L18 6M6 6l12 12\" />\n                    </svg>\n                  ) : (\n                    <svg className=\"h-6 w-6\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\">\n                      <path strokeLinecap=\"round\" strokeLinejoin=\"round\" strokeWidth={2} d=\"M4 6h16M4 12h16M4 18h16\" />\n                    </svg>\n                  )}\n                </button>\n\n                {/* Mobile Menu Dropdown */}\n                {mobileMenuOpen && (\n                  <div className=\"absolute right-0 top-full mt-2 w-56 bg-white rounded-lg shadow-lg border border-gray-200 py-2 z-50\">\n                    <Link\n                      to=\"/\"\n                      className=\"block px-4 py-2 text-sm text-gray-700 hover:bg-gray-100\"\n                    >\n                      Home\n                    </Link>\n                    {showBillingLink && (\n                      <Link\n                        to=\"/billing\"\n                        className=\"block px-4 py-2 text-sm text-gray-700 hover:bg-gray-100\"\n                      >\n                        Billing\n                      </Link>\n                    )}\n                    {showJobsLink && (\n                      <Link\n                        to=\"/jobs\"\n                        className=\"block px-4 py-2 text-sm text-gray-700 hover:bg-gray-100\"\n                      >\n                        Jobs\n                      </Link>\n                    )}\n                    {showConfigLink && (\n                      <Link\n                        to=\"/config\"\n                        className=\"block px-4 py-2 text-sm text-gray-700 hover:bg-gray-100\"\n                      >\n                        Config\n                      </Link>\n               
     )}\n                    <button\n                      type=\"button\"\n                      onClick={() => {\n                        openDiagnostics();\n                        setMobileMenuOpen(false);\n                      }}\n                      className=\"block w-full text-left px-4 py-2 text-sm text-gray-700 hover:bg-gray-100\"\n                    >\n                      Report issue\n                    </button>\n                    {requireAuth && user && (\n                      <>\n                        <div className=\"border-t border-gray-100 my-2\" />\n                        <div className=\"px-4 py-2 text-sm text-gray-500\">\n                          {user.username}\n                        </div>\n                        <button\n                          onClick={() => {\n                            logout();\n                            setMobileMenuOpen(false);\n                          }}\n                          className=\"block w-full text-left px-4 py-2 text-sm text-gray-700 hover:bg-gray-100\"\n                        >\n                          Logout\n                        </button>\n                      </>\n                    )}\n                  </div>\n                )}\n              </div>\n            </div>\n          </div>\n        </div>\n      </header>\n\n      <main className=\"flex-1 px-2 sm:px-4 lg:px-6 py-4 overflow-auto\">\n        <Routes>\n          <Route path=\"/\" element={<HomePage />} />\n          {showBillingLink && <Route path=\"/billing\" element={<BillingPage />} />}\n          {showJobsLink && <Route path=\"/jobs\" element={<JobsPage />} />}\n          {showConfigLink && <Route path=\"/config\" element={<ConfigPage />} />}\n          <Route path=\"*\" element={<Navigate to=\"/\" replace />} />\n        </Routes>\n      </main>\n\n      <AudioPlayer />\n      <DiagnosticsModal />\n      <Toaster position=\"top-center\" toastOptions={{ duration: 3000 }} />\n    </div>\n  
);\n}\n\nfunction App() {\n  return (\n    <QueryClientProvider client={queryClient}>\n      <AuthProvider>\n        <AudioPlayerProvider>\n          <DiagnosticsProvider>\n            <Router>\n              <AppShell />\n            </Router>\n          </DiagnosticsProvider>\n        </AudioPlayerProvider>\n      </AuthProvider>\n    </QueryClientProvider>\n  );\n}\n\nexport default App;\n"
  },
  {
    "path": "frontend/src/components/AddFeedForm.tsx",
    "content": "import { useState, useEffect, useCallback } from 'react';\nimport { feedsApi } from '../services/api';\nimport type { PodcastSearchResult } from '../types';\nimport { diagnostics, emitDiagnosticError } from '../utils/diagnostics';\nimport { getHttpErrorInfo } from '../utils/httpError';\n\ninterface AddFeedFormProps {\n  onSuccess: () => void;\n  onUpgradePlan?: () => void;\n  planLimitReached?: boolean;\n}\n\ntype AddMode = 'url' | 'search';\n\nconst PAGE_SIZE = 10;\n\nexport default function AddFeedForm({ onSuccess, onUpgradePlan, planLimitReached }: AddFeedFormProps) {\n  const [url, setUrl] = useState('');\n  const [activeMode, setActiveMode] = useState<AddMode>('search');\n  const [isSubmitting, setIsSubmitting] = useState(false);\n  const [error, setError] = useState('');\n  const [addingFeedUrl, setAddingFeedUrl] = useState<string | null>(null);\n  const [upgradePrompt, setUpgradePrompt] = useState<string | null>(null);\n\n  const [searchTerm, setSearchTerm] = useState('');\n  const [searchResults, setSearchResults] = useState<PodcastSearchResult[]>([]);\n  const [searchError, setSearchError] = useState('');\n  const [isSearching, setIsSearching] = useState(false);\n  const [searchPage, setSearchPage] = useState(1);\n  const [totalResults, setTotalResults] = useState(0);\n  const [hasSearched, setHasSearched] = useState(false);\n\n  const resetSearchState = () => {\n    setSearchResults([]);\n    setSearchError('');\n    setSearchPage(1);\n    setTotalResults(0);\n    setHasSearched(false);\n  };\n\n  const handleSubmitManual = async (e: React.FormEvent) => {\n    e.preventDefault();\n    if (!url.trim()) return;\n\n    diagnostics.add('info', 'Add feed (manual) submitted', { via: 'url', hasUrl: true });\n    setError('');\n    await addFeed(url.trim(), 'url');\n  };\n\n  const addFeed = async (feedUrl: string, source: AddMode) => {\n    if (planLimitReached) {\n      setUpgradePrompt('Your plan is full. 
Increase your feed allowance to add more.');\n      return;\n    }\n    setIsSubmitting(true);\n    setAddingFeedUrl(source === 'url' ? 'manual' : feedUrl);\n    setError('');\n    setUpgradePrompt(null);\n\n    try {\n      diagnostics.add('info', 'Add feed request', { source, hasUrl: !!feedUrl });\n      await feedsApi.addFeed(feedUrl);\n      if (source === 'url') {\n        setUrl('');\n      }\n      diagnostics.add('info', 'Add feed success', { source });\n      onSuccess();\n    } catch (err) {\n      console.error('Failed to add feed:', err);\n      const { status, data, message } = getHttpErrorInfo(err);\n      const code = data && typeof data === 'object' ? (data as { error?: unknown }).error : undefined;\n      const errorCode = typeof code === 'string' ? code : undefined;\n\n      emitDiagnosticError({\n        title: 'Failed to add feed',\n        message,\n        kind: status ? 'http' : 'network',\n        details: {\n          source,\n          feedUrl,\n          status,\n          response: data,\n        },\n      });\n\n      if (errorCode === 'FEED_LIMIT_REACHED') {\n        setUpgradePrompt(message || 'Plan limit reached. Increase your feeds to add more.');\n      } else {\n        setError(message || 'Failed to add feed. Please check the URL and try again.');\n      }\n    } finally {\n      setIsSubmitting(false);\n      setAddingFeedUrl(null);\n    }\n  };\n\n  const performSearch = useCallback(async (term: string) => {\n    if (!term.trim()) {\n      setSearchResults([]);\n      setTotalResults(0);\n      setHasSearched(false);\n      setSearchError('');\n      return;\n    }\n\n    setIsSearching(true);\n    setSearchError('');\n\n    try {\n      diagnostics.add('info', 'Search podcasts request', { term: term.trim() });\n      const response = await feedsApi.searchFeeds(term.trim());\n      setSearchResults(response.results);\n      setTotalResults(response.total ?? 
response.results.length);\n      setSearchPage(1);\n      setHasSearched(true);\n      diagnostics.add('info', 'Search podcasts success', {\n        term: term.trim(),\n        total: response.total ?? response.results.length,\n      });\n    } catch (err) {\n      console.error('Podcast search failed:', err);\n      diagnostics.add('error', 'Search podcasts failed', { term: term.trim() });\n      setSearchError('Failed to search podcasts. Please try again.');\n      setSearchResults([]);\n    } finally {\n      setIsSearching(false);\n    }\n  }, []);\n\n  useEffect(() => {\n    const delayDebounceFn = setTimeout(() => {\n      if (searchTerm.trim()) {\n        performSearch(searchTerm);\n      } else {\n        setSearchResults([]);\n        setTotalResults(0);\n        setHasSearched(false);\n      }\n    }, 500);\n\n    return () => clearTimeout(delayDebounceFn);\n  }, [searchTerm, performSearch]);\n\n  const handleSearchSubmit = async (e: React.FormEvent) => {\n    e.preventDefault();\n    await performSearch(searchTerm);\n  };\n\n  const handleAddFromSearch = async (result: PodcastSearchResult) => {\n    await addFeed(result.feedUrl, 'search');\n  };\n\n  const totalPages =\n    totalResults === 0 ? 1 : Math.max(1, Math.ceil(totalResults / PAGE_SIZE));\n  const startIndex =\n    totalResults === 0 ? 0 : (searchPage - 1) * PAGE_SIZE + 1;\n  const endIndex =\n    totalResults === 0\n      ? 0\n      : Math.min(searchPage * PAGE_SIZE, totalResults);\n  const displayedResults = searchResults.slice(\n    (searchPage - 1) * PAGE_SIZE,\n    (searchPage - 1) * PAGE_SIZE + PAGE_SIZE\n  );\n\n  return (\n    <div className=\"bg-white rounded-xl border border-gray-200 shadow-sm p-4 sm:p-6\">\n      <h3 className=\"text-lg font-medium text-gray-900 mb-4\">Add New Podcast Feed</h3>\n      {planLimitReached && (\n        <div className=\"mb-3 text-sm text-amber-800 bg-amber-50 border border-amber-200 rounded-md px-3 py-2\">\n          Your plan is full. 
Increase your feed allowance to add more.\n        </div>\n      )}\n\n      <div className=\"flex flex-col sm:flex-row gap-2 mb-4\">\n        <button\n          type=\"button\"\n          onClick={() => {\n            setActiveMode('url');\n          }}\n          className={`flex-1 px-3 py-2 rounded-md border transition-colors ${\n            activeMode === 'url'\n              ? 'bg-blue-50 border-blue-500 text-blue-700'\n              : 'border-gray-200 text-gray-600 hover:bg-gray-100'\n          }`}\n        >\n          Enter RSS URL\n        </button>\n        <button\n          type=\"button\"\n          onClick={() => {\n            setActiveMode('search');\n            setError('');\n            resetSearchState();\n          }}\n          className={`flex-1 px-3 py-2 rounded-md border ${\n            activeMode === 'search'\n              ? 'bg-blue-50 border-blue-500 text-blue-700'\n              : 'border-gray-200 text-gray-600 hover:bg-gray-100'\n          }`}\n        >\n          Search Podcasts\n        </button>\n      </div>\n\n      {activeMode === 'url' && (\n        <form onSubmit={handleSubmitManual} className=\"space-y-4\">\n          <div>\n            <label htmlFor=\"feed-url\" className=\"block text-sm font-medium text-gray-700 mb-1\">\n              RSS Feed URL\n            </label>\n            <input\n              type=\"url\"\n              id=\"feed-url\"\n              value={url}\n              onChange={(e) => setUrl(e.target.value)}\n              placeholder=\"https://example.com/podcast/feed.xml\"\n              className=\"w-full px-3 py-2 border border-gray-300 rounded-md focus:outline-none focus:ring-2 focus:ring-blue-500 focus:border-transparent\"\n              required\n              disabled={!!planLimitReached}\n            />\n          </div>\n\n      {error && (\n        <div className=\"text-red-600 text-sm\">{error}</div>\n      )}\n      {upgradePrompt && (\n        <div className=\"flex flex-col sm:flex-row 
sm:items-center gap-2 p-3 border border-amber-200 bg-amber-50 rounded-md text-sm text-amber-800\">\n          <span>{upgradePrompt}</span>\n          {onUpgradePlan && (\n            <button\n              type=\"button\"\n              onClick={onUpgradePlan}\n              className=\"inline-flex items-center justify-center px-3 py-2 rounded-md bg-blue-600 text-white text-xs font-medium hover:bg-blue-700\"\n            >\n              Increase plan\n            </button>\n          )}\n        </div>\n      )}\n\n        <div className=\"flex flex-col sm:flex-row sm:justify-end gap-3\">\n          <button\n            type=\"submit\"\n            disabled={isSubmitting || !url.trim() || !!planLimitReached}\n            className=\"bg-blue-600 hover:bg-blue-700 disabled:bg-gray-400 text-white px-4 py-2 rounded-md font-medium transition-colors sm:w-auto w-full\"\n          >\n            {isSubmitting && addingFeedUrl === 'manual' ? 'Adding...' : 'Add Feed'}\n          </button>\n        </div>\n        </form>\n      )}\n\n      {activeMode === 'search' && (\n        <div className=\"space-y-4\">\n          <form onSubmit={handleSearchSubmit} className=\"flex flex-col md:flex-row gap-3\">\n            <div className=\"flex-1\">\n              <label htmlFor=\"search-term\" className=\"block text-sm font-medium text-gray-700 mb-1\">\n                Search keyword\n              </label>\n              <input\n                type=\"text\"\n                id=\"search-term\"\n                value={searchTerm}\n                onChange={(e) => setSearchTerm(e.target.value)}\n                placeholder=\"e.g. 
history, space, entrepreneurship\"\n                className=\"w-full px-3 py-2 border border-gray-300 rounded-md focus:outline-none focus:ring-2 focus:ring-blue-500 focus:border-transparent\"\n                disabled={!!planLimitReached}\n              />\n            </div>\n\n            <div className=\"flex items-end\">\n              <button\n                type=\"submit\"\n                disabled={isSearching || !!planLimitReached}\n                className=\"bg-blue-600 hover:bg-blue-700 disabled:bg-gray-400 text-white px-4 py-2 rounded-md font-medium transition-colors w-full md:w-auto\"\n              >\n                {isSearching ? 'Searching...' : 'Search'}\n              </button>\n            </div>\n          </form>\n\n          {searchError && (\n            <div className=\"text-red-600 text-sm\">{searchError}</div>\n          )}\n\n          {isSearching && searchResults.length === 0 && (\n            <div className=\"text-sm text-gray-600\">Searching for podcasts...</div>\n          )}\n\n          {!isSearching && searchResults.length === 0 && totalResults === 0 && hasSearched && !searchError && (\n            <div className=\"text-sm text-gray-600\">No podcasts found. 
Try a different search term.</div>\n          )}\n\n          {searchResults.length > 0 && (\n            <div className=\"space-y-3\">\n              <div className=\"flex justify-between items-center text-sm text-gray-500\">\n                <span>\n                  Showing {startIndex}-{endIndex} of {totalResults} results\n                </span>\n                <div className=\"flex gap-2\">\n                  <button\n                    type=\"button\"\n                    onClick={() =>\n                      setSearchPage((prev) => Math.max(prev - 1, 1))\n                    }\n                    disabled={isSearching || searchPage <= 1}\n                    className=\"px-3 py-1 border border-gray-200 rounded-md disabled:text-gray-400 disabled:border-gray-200 hover:bg-gray-100 transition-colors\"\n                  >\n                    Previous\n                  </button>\n                  <button\n                    type=\"button\"\n                    onClick={() =>\n                      setSearchPage((prev) => Math.min(prev + 1, totalPages))\n                    }\n                    disabled={isSearching || searchPage >= totalPages}\n                    className=\"px-3 py-1 border border-gray-200 rounded-md disabled:text-gray-400 disabled:border-gray-200 hover:bg-gray-100 transition-colors\"\n                  >\n                    Next\n                  </button>\n                </div>\n              </div>\n\n              <ul className=\"space-y-3 max-h-[45vh] sm:max-h-80 overflow-y-auto pr-2\">\n                {displayedResults.map((result) => (\n                  <li\n                    key={result.feedUrl}\n                    className=\"flex gap-3 p-3 border border-gray-200 rounded-md bg-gray-50\"\n                  >\n                    {result.artworkUrl ? 
(\n                      <img\n                        src={result.artworkUrl}\n                        alt={result.title}\n                        className=\"w-16 h-16 rounded-md object-cover\"\n                      />\n                    ) : (\n                      <div className=\"w-16 h-16 rounded-md bg-gray-200 flex items-center justify-center text-gray-500 text-xs\">\n                        No Image\n                      </div>\n                    )}\n                    <div className=\"flex-1\">\n                      <h4 className=\"font-medium text-gray-900\">{result.title}</h4>\n                      {result.author && (\n                        <p className=\"text-sm text-gray-600\">{result.author}</p>\n                      )}\n                      {result.genres.length > 0 && (\n                        <p className=\"text-xs text-gray-500 mt-1\">\n                          {result.genres.join(' · ')}\n                        </p>\n                      )}\n                      <p className=\"text-xs text-gray-500 break-all mt-2\">{result.feedUrl}</p>\n                    </div>\n                    <div className=\"flex items-center\">\n                      <button\n                        type=\"button\"\n                        onClick={() => handleAddFromSearch(result)}\n                        disabled={planLimitReached || (isSubmitting && addingFeedUrl === result.feedUrl)}\n                        className=\"bg-blue-600 hover:bg-blue-700 disabled:bg-gray-400 text-white px-3 py-2 rounded-md text-sm transition-colors\"\n                      >\n                        {isSubmitting && addingFeedUrl === result.feedUrl ? 'Adding...' : 'Add'}\n                      </button>\n                    </div>\n                  </li>\n                ))}\n              </ul>\n            </div>\n          )}\n        </div>\n      )}\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/components/AudioPlayer.tsx",
    "content": "import React, { useState, useRef, useEffect } from 'react';\nimport { useAudioPlayer } from '../contexts/AudioPlayerContext';\n\n// Simple SVG icons to replace Heroicons\nconst PlayIcon = ({ className }: { className: string }) => (\n  <svg className={className} fill=\"currentColor\" viewBox=\"0 0 24 24\">\n    <path d=\"M8 5v14l11-7z\"/>\n  </svg>\n);\n\nconst PauseIcon = ({ className }: { className: string }) => (\n  <svg className={className} fill=\"currentColor\" viewBox=\"0 0 24 24\">\n    <path d=\"M6 19h4V5H6v14zm8-14v14h4V5h-4z\"/>\n  </svg>\n);\n\nconst SpeakerWaveIcon = ({ className }: { className: string }) => (\n  <svg className={className} fill=\"currentColor\" viewBox=\"0 0 24 24\">\n    <path d=\"M3 9v6h4l5 5V4L7 9H3zm13.5 3c0-1.77-1.02-3.29-2.5-4.03v8.05c1.48-.73 2.5-2.25 2.5-4.02zM14 3.23v2.06c2.89.86 5 3.54 5 6.71s-2.11 5.85-5 6.71v2.06c4.01-.91 7-4.49 7-8.77s-2.99-7.86-7-8.77z\"/>\n  </svg>\n);\n\nconst SpeakerXMarkIcon = ({ className }: { className: string }) => (\n  <svg className={className} fill=\"currentColor\" viewBox=\"0 0 24 24\">\n    <path d=\"M16.5 12c0-1.77-1.02-3.29-2.5-4.03v2.21l2.45 2.45c.03-.2.05-.41.05-.63zm2.5 0c0 .94-.2 1.82-.54 2.64l1.51 1.51C20.63 14.91 21 13.5 21 12c0-4.28-2.99-7.86-7-8.77v2.06c2.89.86 5 3.54 5 6.71zM4.27 3L3 4.27 7.73 9H3v6h4l5 5v-6.73l4.25 4.25c-.67.52-1.42.93-2.25 1.18v2.06c1.38-.31 2.63-.95 3.69-1.81L19.73 21 21 19.73l-9-9L4.27 3zM12 4L9.91 6.09 12 8.18V4z\"/>\n  </svg>\n);\n\nconst XMarkIcon = ({ className }: { className: string }) => (\n  <svg className={className} fill=\"currentColor\" viewBox=\"0 0 24 24\">\n    <path d=\"M6 18L18 6M6 6l12 12\" stroke=\"currentColor\" strokeWidth=\"2\" strokeLinecap=\"round\" strokeLinejoin=\"round\" fill=\"none\"/>\n  </svg>\n);\n\nexport default function AudioPlayer() {\n  const {\n    currentEpisode,\n    isPlaying,\n    currentTime,\n    duration,\n    volume,\n    isLoading,\n    error,\n    togglePlayPause,\n    seekTo,\n    setVolume\n  } = 
useAudioPlayer();\n\n  const [isDragging, setIsDragging] = useState(false);\n  const [dragTime, setDragTime] = useState(0);\n  const [showVolumeSlider, setShowVolumeSlider] = useState(false);\n  const [showKeyboardShortcuts, setShowKeyboardShortcuts] = useState(false);\n  const [dismissedError, setDismissedError] = useState<string | null>(null);\n  const progressBarRef = useRef<HTMLDivElement>(null);\n  const volumeSliderRef = useRef<HTMLDivElement>(null);\n\n  // Reset dismissed error when a new error occurs\n  useEffect(() => {\n    if (error && error !== dismissedError) {\n      setDismissedError(null);\n    }\n  }, [error, dismissedError]);\n\n  // Close volume slider when clicking outside\n  useEffect(() => {\n    const handleClickOutside = (event: MouseEvent) => {\n      if (volumeSliderRef.current && !volumeSliderRef.current.contains(event.target as Node)) {\n        setShowVolumeSlider(false);\n      }\n    };\n\n    if (showVolumeSlider) {\n      document.addEventListener('mousedown', handleClickOutside);\n      return () => document.removeEventListener('mousedown', handleClickOutside);\n    }\n  }, [showVolumeSlider]);\n\n  // Don't render if no episode is loaded\n  if (!currentEpisode) {\n    return null;\n  }\n\n  const formatTime = (seconds: number) => {\n    if (isNaN(seconds)) return '0:00';\n    const hours = Math.floor(seconds / 3600);\n    const minutes = Math.floor((seconds % 3600) / 60);\n    const remainingSeconds = Math.floor(seconds % 60);\n    \n    if (hours > 0) {\n      return `${hours}:${minutes.toString().padStart(2, '0')}:${remainingSeconds.toString().padStart(2, '0')}`;\n    }\n    return `${minutes}:${remainingSeconds.toString().padStart(2, '0')}`;\n  };\n\n  const handleProgressClick = (e: React.MouseEvent<HTMLDivElement>) => {\n    if (!progressBarRef.current || !duration) 
return;\n    \n    const rect = progressBarRef.current.getBoundingClientRect();\n    const clickX = e.clientX - rect.left;\n    const newTime = (clickX / rect.width) * duration;\n    seekTo(newTime);\n  };\n\n  const handleProgressMouseDown = (e: React.MouseEvent<HTMLDivElement>) => {\n    if (!progressBarRef.current || !duration) return;\n    \n    const rect = progressBarRef.current.getBoundingClientRect();\n    const clickX = e.clientX - rect.left;\n    const newTime = Math.max(0, Math.min((clickX / rect.width) * duration, duration));\n    // Seed dragTime so a mouseup without any movement doesn't seek to a stale position\n    setDragTime(newTime);\n    setIsDragging(true);\n    seekTo(newTime);\n  };\n\n  const handleProgressMouseMove = (e: React.MouseEvent<HTMLDivElement>) => {\n    if (!isDragging || !progressBarRef.current || !duration) return;\n    \n    const rect = progressBarRef.current.getBoundingClientRect();\n    const clickX = e.clientX - rect.left;\n    const newTime = Math.max(0, Math.min((clickX / rect.width) * duration, duration));\n    setDragTime(newTime);\n  };\n\n  const handleProgressMouseUp = () => {\n    if (isDragging) {\n      seekTo(dragTime);\n      setIsDragging(false);\n    }\n  };\n\n  const handleVolumeChange = (e: React.MouseEvent<HTMLDivElement>) => {\n    if (!volumeSliderRef.current) return;\n    \n    const rect = volumeSliderRef.current.getBoundingClientRect();\n    const clickX = e.clientX - rect.left;\n    const newVolume = Math.max(0, Math.min(clickX / rect.width, 1));\n    setVolume(newVolume);\n  };\n\n  const toggleMute = () => {\n    setVolume(volume > 0 ? 0 : 1);\n  };\n\n  const dismissError = () => {\n    setDismissedError(error);\n  };\n\n  const displayTime = isDragging ? dragTime : currentTime;\n  const progressPercentage = duration > 0 ? 
(displayTime / duration) * 100 : 0;\n  const shouldShowError = error && error !== dismissedError;\n\n  return (\n    <div className=\"fixed bottom-0 left-0 right-0 bg-white border-t border-gray-200 shadow-lg z-50\">\n      <div className=\"max-w-7xl mx-auto px-4 py-3\">\n        {shouldShowError && (\n          <div className=\"mb-2 p-2 bg-red-100 border border-red-300 rounded text-red-700 text-sm flex items-center justify-between\">\n            <span>{error}</span>\n            <button\n              onClick={dismissError}\n              className=\"ml-2 p-1 hover:bg-red-200 rounded transition-colors\"\n              aria-label=\"Dismiss error\"\n            >\n              <XMarkIcon className=\"w-4 h-4\" />\n            </button>\n          </div>\n        )}\n        \n        <div className=\"flex items-center space-x-4\">\n          {/* Episode Info */}\n          <div className=\"flex-1 min-w-0\">\n            <div className=\"flex items-center space-x-3\">\n              <div className=\"w-12 h-12 bg-gray-200 rounded flex-shrink-0 flex items-center justify-center\">\n                <span className=\"text-gray-500 text-xs\">🎵</span>\n              </div>\n              <div className=\"min-w-0 flex-1\">\n                <h4 className=\"text-sm font-medium text-gray-900 truncate\">\n                  {currentEpisode.title}\n                </h4>\n                <p className=\"text-xs text-gray-500 truncate\">\n                  Episode • {formatTime(duration)}\n                </p>\n              </div>\n            </div>\n          </div>\n\n          {/* Player Controls */}\n          <div className=\"flex-1 max-w-2xl\">\n            {/* Control Buttons */}\n            <div \n              className=\"flex items-center justify-center space-x-4 mb-2 relative\"\n              onMouseEnter={() => setShowKeyboardShortcuts(true)}\n              onMouseLeave={() => setShowKeyboardShortcuts(false)}\n            >\n              <button\n                
onClick={togglePlayPause}\n                disabled={isLoading}\n                className=\"p-2 bg-gray-900 text-white rounded-full hover:bg-gray-800 transition-colors disabled:opacity-50 disabled:cursor-not-allowed\"\n              >\n                {isLoading ? (\n                  <div className=\"w-6 h-6 border-2 border-white border-t-transparent rounded-full animate-spin\" />\n                ) : isPlaying ? (\n                  <PauseIcon className=\"w-6 h-6\" />\n                ) : (\n                  <PlayIcon className=\"w-6 h-6\" />\n                )}\n              </button>\n              \n              {/* Keyboard Shortcuts Tooltip */}\n              {showKeyboardShortcuts && (\n                <div className=\"absolute bottom-full mb-2 left-1/2 transform -translate-x-1/2 bg-gray-900 text-white text-xs rounded py-2 px-3 whitespace-nowrap z-10\">\n                  <div className=\"space-y-1\">\n                    <div>Space: Play/Pause</div>\n                    <div>← →: Seek ±10s</div>\n                    <div>↑ ↓: Volume ±10%</div>\n                  </div>\n                  <div className=\"absolute top-full left-1/2 transform -translate-x-1/2 border-4 border-transparent border-t-gray-900\"></div>\n                </div>\n              )}\n            </div>\n\n            {/* Progress Bar */}\n            <div className=\"flex items-center space-x-2 text-xs text-gray-500\">\n              <span className=\"w-10 text-right\">{formatTime(displayTime)}</span>\n              <div\n                ref={progressBarRef}\n                className=\"flex-1 h-1 bg-gray-200 rounded-full cursor-pointer relative group audio-player-progress\"\n                onMouseDown={handleProgressMouseDown}\n                onMouseMove={handleProgressMouseMove}\n                onMouseUp={handleProgressMouseUp}\n                onMouseLeave={handleProgressMouseUp}\n                onClick={handleProgressClick}\n              >\n                <div\n            
      className=\"h-full bg-gray-900 rounded-full relative\"\n                  style={{ width: `${progressPercentage}%` }}\n                >\n                  <div className=\"absolute right-0 top-1/2 transform -translate-y-1/2 w-3 h-3 bg-gray-900 rounded-full audio-player-progress-thumb\" />\n                </div>\n              </div>\n              <span className=\"w-10\">{formatTime(duration)}</span>\n            </div>\n          </div>\n\n          {/* Volume Control */}\n          <div className=\"flex items-center space-x-2 relative\">\n            <button\n              onClick={toggleMute}\n              onMouseEnter={() => setShowVolumeSlider(true)}\n              className=\"p-1 text-gray-600 hover:text-gray-900 transition-colors\"\n            >\n              {volume === 0 ? (\n                <SpeakerXMarkIcon className=\"w-5 h-5\" />\n              ) : (\n                <SpeakerWaveIcon className=\"w-5 h-5\" />\n              )}\n            </button>\n            \n            {showVolumeSlider && (\n              <div\n                ref={volumeSliderRef}\n                className=\"absolute bottom-full right-0 mb-2 p-2 bg-white border border-gray-200 rounded shadow-lg audio-player-volume-slider\"\n                onMouseEnter={() => setShowVolumeSlider(true)}\n              >\n                <div\n                  className=\"w-20 h-1 bg-gray-200 rounded-full cursor-pointer relative group\"\n                  onClick={handleVolumeChange}\n                >\n                  <div\n                    className=\"h-full bg-gray-900 rounded-full relative\"\n                    style={{ width: `${volume * 100}%` }}\n                  >\n                    <div className=\"absolute right-0 top-1/2 transform -translate-y-1/2 w-3 h-3 bg-gray-900 rounded-full opacity-0 group-hover:opacity-100 transition-opacity\" />\n                  </div>\n                </div>\n              </div>\n            )}\n          </div>\n        </div>\n      
</div>\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/components/DiagnosticsModal.tsx",
    "content": "import { useEffect, useMemo, useState } from 'react';\nimport { useDiagnostics } from '../contexts/DiagnosticsContext';\nimport { DIAGNOSTIC_UPDATED_EVENT, diagnostics, type DiagnosticsEntry } from '../utils/diagnostics';\n\nconst GITHUB_NEW_ISSUE_URL = 'https://github.com/podly-pure-podcasts/podly_pure_podcasts/issues/new';\n\nconst buildIssueUrl = (title: string, body: string) => {\n  const url = new URL(GITHUB_NEW_ISSUE_URL);\n  url.searchParams.set('title', title);\n  url.searchParams.set('body', body);\n  return url.toString();\n};\n\nconst formatTs = (ts: number) => {\n  try {\n    return new Date(ts).toISOString();\n  } catch {\n    return String(ts);\n  }\n};\n\nexport default function DiagnosticsModal() {\n  const { isOpen, close, clear, getEntries, currentError } = useDiagnostics();\n  const [entries, setEntries] = useState<DiagnosticsEntry[]>(() => getEntries());\n\n  useEffect(() => {\n    if (!isOpen) return;\n\n    // Refresh immediately when opened\n    setEntries(getEntries());\n\n    const handler = () => {\n      setEntries(getEntries());\n    };\n\n    window.addEventListener(DIAGNOSTIC_UPDATED_EVENT, handler);\n    return () => window.removeEventListener(DIAGNOSTIC_UPDATED_EVENT, handler);\n  }, [getEntries, isOpen]);\n\n  const recentEntries = useMemo(() => entries.slice(-80), [entries]);\n\n  const issueTitle = currentError?.title\n    ? `[FE] ${currentError.title}`\n    : '[FE] Troubleshooting info';\n\n  const issueBody = useMemo(() => {\n    const env = {\n      userAgent: typeof navigator !== 'undefined' ? navigator.userAgent : null,\n      url: typeof window !== 'undefined' ? 
window.location.href : null,\n      time: new Date().toISOString(),\n    };\n\n    const payload = {\n      error: currentError,\n      env,\n      logs: recentEntries,\n    };\n\n    const json = JSON.stringify(diagnostics.sanitize(payload), null, 2);\n\n    return [\n      '## What happened',\n      '(Describe what you clicked / expected / saw)',\n      '',\n      '## Diagnostics (auto-collected)',\n      '```json',\n      json,\n      '```',\n    ].join('\\n');\n  }, [currentError, recentEntries]);\n\n  const issueUrl = useMemo(() => buildIssueUrl(issueTitle, issueBody), [issueTitle, issueBody]);\n\n  if (!isOpen) return null;\n\n  return (\n    <div className=\"fixed inset-0 z-[60] flex items-center justify-center p-4\">\n      <div className=\"absolute inset-0 bg-black/40\" onClick={close} />\n\n      <div className=\"relative w-full max-w-3xl bg-white rounded-xl border border-gray-200 shadow-lg overflow-hidden\">\n        <div className=\"flex items-start justify-between gap-4 px-5 py-4 border-b border-gray-200\">\n          <div>\n            <h2 className=\"text-base font-semibold text-gray-900\">Troubleshooting</h2>\n            <p className=\"text-sm text-gray-600\">\n              {currentError\n                ? 'An error occurred. 
You can report it with logs.'\n                : 'Use this to collect logs for a bug report.'}\n            </p>\n          </div>\n          <button\n            type=\"button\"\n            onClick={close}\n            className=\"px-3 py-1.5 text-sm border border-gray-200 rounded-md hover:bg-gray-100\"\n          >\n            Dismiss\n          </button>\n        </div>\n\n        {currentError && (\n          <div className=\"px-5 py-4 border-b border-gray-200 bg-red-50\">\n            <div className=\"text-sm font-medium text-red-900\">{currentError.title}</div>\n            <div className=\"text-sm text-red-800 mt-1\">{currentError.message}</div>\n          </div>\n        )}\n\n        <div className=\"px-5 py-4\">\n          <div className=\"flex flex-col sm:flex-row gap-2 sm:items-center sm:justify-between mb-3\">\n            <div className=\"text-sm text-gray-700\">\n              Showing last {recentEntries.length} log entries (session only).\n            </div>\n            <div className=\"flex gap-2\">\n              <a\n                href={issueUrl}\n                target=\"_blank\"\n                rel=\"noreferrer\"\n                className=\"inline-flex items-center justify-center px-3 py-2 rounded-md bg-blue-600 text-white text-sm font-medium hover:bg-blue-700\"\n              >\n                Report on GitHub\n              </a>\n              <button\n                type=\"button\"\n                onClick={() => {\n                  try {\n                    // writeText returns a promise; swallow async rejections too\n                    void navigator.clipboard.writeText(issueBody).catch(() => {});\n                  } catch {\n                    // ignore (clipboard unavailable, e.g. insecure context)\n                  }\n                }}\n                className=\"inline-flex items-center justify-center px-3 py-2 rounded-md border border-gray-200 text-sm font-medium hover:bg-gray-100\"\n              >\n                Copy logs\n              </button>\n              <button\n                type=\"button\"\n                onClick={() => {\n                  
clear();\n                }}\n                className=\"inline-flex items-center justify-center px-3 py-2 rounded-md border border-gray-200 text-sm font-medium hover:bg-gray-100\"\n              >\n                Clear\n              </button>\n            </div>\n          </div>\n\n          <div className=\"border border-gray-200 rounded-md bg-gray-50 overflow-hidden\">\n            <div className=\"max-h-[45vh] overflow-auto\">\n              <pre className=\"text-xs text-gray-800 p-3 whitespace-pre-wrap break-words\">\n{recentEntries\n  .map((e) => {\n    const base = `[${formatTs(e.ts)}] ${e.level.toUpperCase()}: ${e.message}`;\n    if (e.data === undefined) return base;\n    try {\n      return base + `\\n  ${JSON.stringify(e.data)}`;\n    } catch {\n      return base;\n    }\n  })\n  .join('\\n')}\n              </pre>\n            </div>\n          </div>\n\n          <div className=\"text-xs text-gray-500 mt-2\">\n            Sensitive fields like tokens/cookies are redacted.\n          </div>\n        </div>\n      </div>\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/components/DownloadButton.tsx",
    "content": "import { useState } from 'react';\nimport { useQueryClient } from '@tanstack/react-query';\nimport axios from 'axios';\nimport { feedsApi } from '../services/api';\nimport ReprocessButton from './ReprocessButton';\nimport { configApi } from '../services/api';\nimport { toast } from 'react-hot-toast';\nimport { useEpisodeStatus } from '../hooks/useEpisodeStatus';\n\ninterface DownloadButtonProps {\n  episodeGuid: string;\n  isWhitelisted: boolean;\n  hasProcessedAudio: boolean;\n  feedId?: number;\n  canModifyEpisodes?: boolean;\n  className?: string;\n}\n\nexport default function DownloadButton({\n  episodeGuid,\n  isWhitelisted,\n  hasProcessedAudio,\n  feedId,\n  canModifyEpisodes = true,\n  className = ''\n}: DownloadButtonProps) {\n  const [error, setError] = useState<string | null>(null);\n  const queryClient = useQueryClient();\n  \n  const { data: status } = useEpisodeStatus(episodeGuid, isWhitelisted, hasProcessedAudio, feedId);\n  \n  const isProcessing = status?.status === 'pending' || status?.status === 'running' || status?.status === 'starting';\n  const isCompleted = hasProcessedAudio || status?.status === 'completed';\n  const downloadUrl = status?.download_url || (hasProcessedAudio ? `/api/posts/${episodeGuid}/download` : undefined);\n\n  const handleDownloadClick = async () => {\n    if (!isWhitelisted) {\n      setError('Post must be whitelisted before processing');\n      return;\n    }\n\n    // Guard when LLM API key is not configured - use fresh server check\n    try {\n      const { configured } = await configApi.isConfigured();\n      if (!configured) {\n        toast.error('Add an API key in Config before processing.');\n        return;\n      }\n    } catch (err) {\n      if (!(axios.isAxiosError(err) && err.response?.status === 403)) {\n        toast.error('Unable to verify configuration. 
Please try again.');\n        return;\n      }\n    }\n\n    if (isCompleted && downloadUrl) {\n      // Already processed, download directly\n      try {\n        await feedsApi.downloadPost(episodeGuid);\n      } catch (err) {\n        console.error('Error downloading file:', err);\n        setError('Failed to download file');\n      }\n      return;\n    }\n\n    try {\n      setError(null);\n      // Optimistically update status to show processing state immediately\n      queryClient.setQueryData(['episode-status', episodeGuid], {\n        status: 'starting',\n        step: 0,\n        step_name: 'Starting',\n        total_steps: 4,\n        message: 'Requesting processing...'\n      });\n\n      const response = await feedsApi.processPost(episodeGuid);\n      \n      // Invalidate to trigger polling in the hook\n      queryClient.invalidateQueries({ queryKey: ['episode-status', episodeGuid] });\n\n      if (response.status === 'not_started') {\n          setError('No processing job found');\n      }\n    } catch (err: unknown) {\n      console.error('Error starting processing:', err);\n      const errorMessage = err && typeof err === 'object' && 'response' in err\n        ? 
(err as { response?: { data?: { error?: string; message?: string } } }).response?.data?.message \n          || (err as { response?: { data?: { error?: string } } }).response?.data?.error \n          || 'Failed to start processing'\n        : 'Failed to start processing';\n      setError(errorMessage);\n      // Invalidate to clear optimistic update if failed\n      queryClient.invalidateQueries({ queryKey: ['episode-status', episodeGuid] });\n    }\n  };\n\n  // Show completed state with download button only\n  if (isCompleted && downloadUrl) {\n    return (\n      <div className={`${className}`}>\n        <div className=\"flex gap-2\">\n          <button\n            onClick={handleDownloadClick}\n            className=\"px-3 py-1 text-xs rounded font-medium transition-colors bg-blue-600 text-white hover:bg-blue-700\"\n            title=\"Download processed episode\"\n          >\n            Download\n          </button>\n          <ReprocessButton\n            episodeGuid={episodeGuid}\n            isWhitelisted={isWhitelisted}\n            feedId={feedId}\n            canModifyEpisodes={canModifyEpisodes}\n            onReprocessStart={() => {\n              queryClient.invalidateQueries({ queryKey: ['episode-status', episodeGuid] });\n            }}\n          />\n        </div>\n        {error && (\n          <div className=\"text-xs text-red-600 mt-1\">\n            {error}\n          </div>\n        )}\n      </div>\n    );\n  }\n\n  // If user can't modify episodes, don't show the Process button\n  if (!canModifyEpisodes) {\n    return null;\n  }\n\n  // If processing, hide the button (EpisodeProcessingStatus will show progress)\n  if (isProcessing) {\n    return null;\n  }\n\n  return (\n    <div className={`space-y-2 ${className}`}>\n      <button\n        onClick={handleDownloadClick}\n        className=\"px-3 py-1 text-xs rounded font-medium transition-colors border bg-white text-gray-700 border-gray-300 hover:bg-gray-50 hover:border-gray-400 
hover:text-gray-900\"\n        title=\"Start processing episode\"\n      >\n        Process\n      </button>\n\n      {/* Error message */}\n      {error && (\n        <div className=\"text-xs text-red-600 text-center\">\n          {error}\n        </div>\n      )}\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/components/EpisodeProcessingStatus.tsx",
    "content": "import { useEpisodeStatus } from '../hooks/useEpisodeStatus';\n\ninterface EpisodeProcessingStatusProps {\n  episodeGuid: string;\n  isWhitelisted: boolean;\n  hasProcessedAudio: boolean;\n  feedId?: number;\n  className?: string;\n}\n\nexport default function EpisodeProcessingStatus({\n  episodeGuid,\n  isWhitelisted,\n  hasProcessedAudio,\n  feedId,\n  className = ''\n}: EpisodeProcessingStatusProps) {\n  const { data: status } = useEpisodeStatus(episodeGuid, isWhitelisted, hasProcessedAudio, feedId);\n\n  if (!status) return null;\n\n  // Don't show anything if completed (DownloadButton handles this) or not started\n  if (status.status === 'completed' || status.status === 'not_started') {\n    return null;\n  }\n\n  const getProgressPercentage = () => {\n    if (!status) return 0;\n    return (status.step / status.total_steps) * 100;\n  };\n\n  const getStepIcon = (stepNumber: number) => {\n    if (!status) return '○';\n\n    if (status.step > stepNumber) {\n      return '✓'; // Completed\n    } else if (status.step === stepNumber) {\n      return '●'; // Current\n    } else {\n      return '○'; // Not started\n    }\n  };\n\n  return (\n    <div className={`space-y-2 min-w-[200px] ${className}`}>\n      {/* Progress indicator */}\n      <div className=\"space-y-1\">\n        {/* Progress bar */}\n        <div className=\"w-full bg-gray-200 rounded-full h-1.5\">\n          <div\n            className={`h-1.5 rounded-full transition-all duration-300 ${\n              status.status === 'error' || status.status === 'failed' ? 
'bg-red-500' : 'bg-blue-500'\n            }`}\n            style={{ width: `${getProgressPercentage()}%` }}\n          />\n        </div>\n\n        {/* Step indicators */}\n        <div className=\"flex justify-between text-xs text-gray-600\">\n          {[1, 2, 3, 4].map((stepNumber) => (\n            <div\n              key={stepNumber}\n              className={`flex flex-col items-center ${\n                status.step === stepNumber ? 'text-blue-600 font-medium' : ''\n              } ${\n                status.step > stepNumber ? 'text-green-600' : ''\n              }`}\n            >\n              <span className=\"text-xs\">{getStepIcon(stepNumber)}</span>\n              <span className=\"text-xs\">{stepNumber}/4</span>\n            </div>\n          ))}\n        </div>\n\n        {/* Current step name */}\n        <div className=\"text-xs text-center text-gray-600\">\n          {status.step_name}\n        </div>\n      </div>\n\n      {/* Error message */}\n      {(status.error || status.status === 'failed' || status.status === 'error') && (\n        <div className=\"text-xs text-red-600 text-center\">\n          {status.error || 'Processing failed'}\n        </div>\n      )}\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/components/FeedDetail.tsx",
    "content": "import { useQuery, useMutation, useQueryClient } from '@tanstack/react-query';\nimport { useState, useEffect, useRef, useMemo } from 'react';\nimport { toast } from 'react-hot-toast';\nimport type { Feed, Episode, PagedResult, ConfigResponse } from '../types';\nimport { feedsApi, configApi } from '../services/api';\nimport DownloadButton from './DownloadButton';\nimport PlayButton from './PlayButton';\nimport ProcessingStatsButton from './ProcessingStatsButton';\nimport EpisodeProcessingStatus from './EpisodeProcessingStatus';\nimport { useAuth } from '../contexts/AuthContext';\nimport { copyToClipboard } from '../utils/clipboard';\nimport { emitDiagnosticError } from '../utils/diagnostics';\nimport { getHttpErrorInfo } from '../utils/httpError';\n\ninterface FeedDetailProps {\n  feed: Feed;\n  onClose?: () => void;\n  onFeedDeleted?: () => void;\n}\n\ntype SortOption = 'newest' | 'oldest' | 'title';\n\ninterface ProcessingEstimate {\n  post_guid: string;\n  estimated_minutes: number;\n  can_process: boolean;\n  reason: string | null;\n}\n\nconst EPISODES_PAGE_SIZE = 25;\n\nexport default function FeedDetail({ feed, onClose, onFeedDeleted }: FeedDetailProps) {\n  const { requireAuth, isAuthenticated, user } = useAuth();\n  const [sortBy, setSortBy] = useState<SortOption>('newest');\n  const [showStickyHeader, setShowStickyHeader] = useState(false);\n  const [showHelp, setShowHelp] = useState(false);\n  const [showMenu, setShowMenu] = useState(false);\n  const queryClient = useQueryClient();\n  const scrollContainerRef = useRef<HTMLDivElement>(null);\n  const feedHeaderRef = useRef<HTMLDivElement>(null);\n  const [currentFeed, setCurrentFeed] = useState(feed);\n  const [pendingEpisode, setPendingEpisode] = useState<Episode | null>(null);\n  const [showProcessingModal, setShowProcessingModal] = useState(false);\n  const [processingEstimate, setProcessingEstimate] = useState<ProcessingEstimate | null>(null);\n  const [isEstimating, setIsEstimating] = 
useState(false);\n  const [estimateError, setEstimateError] = useState<string | null>(null);\n  const [page, setPage] = useState(1);\n\n  const isAdmin = !requireAuth || user?.role === 'admin';\n  const whitelistedOnly = requireAuth && !isAdmin;\n\n  const { data: configResponse } = useQuery<ConfigResponse>({\n    queryKey: ['config'],\n    queryFn: configApi.getConfig,\n    enabled: isAdmin,\n  });\n\n  const {\n    data: episodesPage,\n    isLoading,\n    isFetching,\n    error,\n  } = useQuery<PagedResult<Episode>, Error, PagedResult<Episode>, [string, number, number, boolean]>({\n    queryKey: ['episodes', currentFeed.id, page, whitelistedOnly],\n    queryFn: () =>\n      feedsApi.getFeedPosts(currentFeed.id, {\n        page,\n        pageSize: EPISODES_PAGE_SIZE,\n        whitelistedOnly,\n      }),\n    placeholderData: (previousData) => previousData,\n  });\n\n  const whitelistMutation = useMutation({\n    mutationFn: ({ guid, whitelisted, triggerProcessing }: { guid: string; whitelisted: boolean; triggerProcessing?: boolean }) =>\n      feedsApi.togglePostWhitelist(guid, whitelisted, triggerProcessing),\n    onSuccess: () => {\n      queryClient.invalidateQueries({ queryKey: ['episodes', currentFeed.id] });\n    },\n    onError: (err) => {\n      const { status, data, message } = getHttpErrorInfo(err);\n      emitDiagnosticError({\n        title: 'Failed to update whitelist status',\n        message,\n        kind: status ? 
'http' : 'network',\n        details: {\n          status,\n          response: data,\n        },\n      });\n    },\n  });\n\n  const bulkWhitelistMutation = useMutation({\n    mutationFn: () => feedsApi.toggleAllPostsWhitelist(currentFeed.id),\n    onSuccess: () => {\n      queryClient.invalidateQueries({ queryKey: ['episodes', currentFeed.id] });\n    },\n  });\n\n  const refreshFeedMutation = useMutation({\n    mutationFn: () => feedsApi.refreshFeed(currentFeed.id),\n    onSuccess: (data) => {\n      queryClient.invalidateQueries({ queryKey: ['feeds'] });\n      queryClient.invalidateQueries({ queryKey: ['episodes', currentFeed.id] });\n      toast.success(data?.message ?? 'Feed refreshed');\n    },\n    onError: (err) => {\n      console.error('Failed to refresh feed', err);\n      const { status, data, message } = getHttpErrorInfo(err);\n      emitDiagnosticError({\n        title: 'Failed to refresh feed',\n        message,\n        kind: status ? 'http' : 'network',\n        details: {\n          status,\n          response: data,\n          feedId: currentFeed.id,\n        },\n      });\n    },\n  });\n\n  const updateFeedSettingsMutation = useMutation({\n    mutationFn: (override: boolean | null) =>\n      feedsApi.updateFeedSettings(currentFeed.id, {\n        auto_whitelist_new_episodes_override: override,\n      }),\n    onSuccess: (data) => {\n      setCurrentFeed(data);\n      queryClient.invalidateQueries({ queryKey: ['feeds'] });\n      toast.success('Feed settings updated');\n    },\n    onError: (err) => {\n      const { status, data, message } = getHttpErrorInfo(err);\n      emitDiagnosticError({\n        title: 'Failed to update feed settings',\n        message,\n        kind: status ? 
'http' : 'network',\n        details: {\n          status,\n          response: data,\n          feedId: currentFeed.id,\n        },\n      });\n      toast.error('Failed to update feed settings');\n    },\n  });\n\n  const deleteFeedMutation = useMutation({\n    mutationFn: () => feedsApi.deleteFeed(currentFeed.id),\n    onSuccess: () => {\n      queryClient.invalidateQueries({ queryKey: ['feeds'] });\n      if (onFeedDeleted) {\n        onFeedDeleted();\n      }\n    },\n    onError: (err) => {\n      console.error('Failed to delete feed', err);\n      const { status, data, message } = getHttpErrorInfo(err);\n      emitDiagnosticError({\n        title: 'Failed to delete feed',\n        message,\n        kind: status ? 'http' : 'network',\n        details: {\n          status,\n          response: data,\n          feedId: currentFeed.id,\n        },\n      });\n    },\n  });\n\n  const joinFeedMutation = useMutation({\n    mutationFn: () => feedsApi.joinFeed(currentFeed.id),\n    onSuccess: (data) => {\n      toast.success('Joined feed');\n      setCurrentFeed(data);\n      queryClient.invalidateQueries({ queryKey: ['feeds'] });\n    },\n    onError: (err) => {\n      console.error('Failed to join feed', err);\n      const { status, data, message } = getHttpErrorInfo(err);\n      emitDiagnosticError({\n        title: 'Failed to join feed',\n        message,\n        kind: status ? 'http' : 'network',\n        details: {\n          status,\n          response: data,\n          feedId: currentFeed.id,\n        },\n      });\n    },\n  });\n\n  const leaveFeedMutation = useMutation({\n    mutationFn: () => feedsApi.leaveFeed(currentFeed.id),\n    onSuccess: () => {\n      toast.success('Removed from your feeds');\n      setCurrentFeed((prev) => (prev ? 
{ ...prev, is_member: false, is_active_subscription: false } : prev));\n      queryClient.invalidateQueries({ queryKey: ['feeds'] });\n      if (onFeedDeleted && !isAdmin) {\n        onFeedDeleted();\n      }\n    },\n    onError: (err) => {\n      console.error('Failed to leave feed', err);\n      const { status, data, message } = getHttpErrorInfo(err);\n      emitDiagnosticError({\n        title: 'Failed to remove feed',\n        message,\n        kind: status ? 'http' : 'network',\n        details: {\n          status,\n          response: data,\n          feedId: currentFeed.id,\n        },\n      });\n    },\n  });\n\n  useEffect(() => {\n    setCurrentFeed(feed);\n  }, [feed]);\n\n  useEffect(() => {\n    setPage(1);\n  }, [feed.id, whitelistedOnly]);\n\n  // Handle scroll to show/hide sticky header\n  useEffect(() => {\n    const scrollContainer = scrollContainerRef.current;\n    const feedHeader = feedHeaderRef.current;\n\n    if (!scrollContainer || !feedHeader) return;\n\n    const handleScroll = () => {\n      const scrollTop = scrollContainer.scrollTop;\n      const feedHeaderHeight = feedHeader.offsetHeight;\n\n      // Show sticky header when scrolled past the feed header\n      setShowStickyHeader(scrollTop > feedHeaderHeight - 100);\n    };\n\n    scrollContainer.addEventListener('scroll', handleScroll);\n    return () => scrollContainer.removeEventListener('scroll', handleScroll);\n  }, []);\n\n  // Handle click outside to close menu\n  useEffect(() => {\n    const handleClickOutside = (event: MouseEvent) => {\n      if (showMenu && !(event.target as Element).closest('.menu-container')) {\n        setShowMenu(false);\n      }\n    };\n\n    document.addEventListener('mousedown', handleClickOutside);\n    return () => document.removeEventListener('mousedown', handleClickOutside);\n  }, [showMenu]);\n\n  const handleWhitelistToggle = (episode: Episode) => {\n    if (!episode.whitelisted) {\n      setPendingEpisode(episode);\n      
setShowProcessingModal(true);\n      setProcessingEstimate(null);\n      setEstimateError(null);\n      setIsEstimating(true);\n      feedsApi\n        .getProcessingEstimate(episode.guid)\n        .then((estimate) => {\n          setProcessingEstimate(estimate);\n        })\n        .catch((err) => {\n          console.error('Failed to load processing estimate', err);\n          const { status, data, message } = getHttpErrorInfo(err);\n          emitDiagnosticError({\n            title: 'Failed to load processing estimate',\n            message,\n            kind: status ? 'http' : 'network',\n            details: {\n              status,\n              response: data,\n              postGuid: episode.guid,\n            },\n          });\n          setEstimateError(message ?? 'Unable to estimate processing time');\n        })\n        .finally(() => setIsEstimating(false));\n      return;\n    }\n\n    whitelistMutation.mutate({\n      guid: episode.guid,\n      whitelisted: false,\n    });\n  };\n\n  const handleConfirmProcessing = () => {\n    if (!pendingEpisode) return;\n    whitelistMutation.mutate(\n      {\n        guid: pendingEpisode.guid,\n        whitelisted: true,\n        triggerProcessing: true,\n      },\n      {\n        onSuccess: () => {\n          setShowProcessingModal(false);\n          setPendingEpisode(null);\n          setProcessingEstimate(null);\n        },\n      }\n    );\n  };\n\n  const handleCancelProcessing = () => {\n    setShowProcessingModal(false);\n    setPendingEpisode(null);\n    setProcessingEstimate(null);\n    setEstimateError(null);\n  };\n\n  const handleAutoWhitelistOverrideChange = (value: string) => {\n    const override =\n      value === 'inherit' ? 
null : value === 'on';\n    updateFeedSettingsMutation.mutate(override);\n  };\n\n  const isMember = Boolean(currentFeed.is_member);\n  const isActiveSubscription = currentFeed.is_active_subscription !== false;\n\n  // Admins can manage everything; regular users are read-only.\n  const canDeleteFeed = isAdmin; // only admins can delete feeds\n  const canModifyEpisodes = !requireAuth ? true : Boolean(isAdmin);\n  const canBulkModifyEpisodes = !requireAuth ? true : Boolean(isAdmin);\n  const canSubscribe = !requireAuth || isMember;\n  const showPodlyRssButton = !(requireAuth && isAdmin && !isMember);\n  const showWhitelistUi = canModifyEpisodes && isAdmin;\n  const appAutoWhitelistDefault =\n    configResponse?.config?.app?.automatically_whitelist_new_episodes;\n  const autoWhitelistDefaultLabel =\n    appAutoWhitelistDefault === undefined\n      ? 'Unknown'\n      : appAutoWhitelistDefault\n        ? 'On'\n        : 'Off';\n  const autoWhitelistOverrideValue =\n    currentFeed.auto_whitelist_new_episodes_override ?? null;\n  const autoWhitelistSelectValue =\n    autoWhitelistOverrideValue === true\n      ? 'on'\n      : autoWhitelistOverrideValue === false\n        ? 'off'\n        : 'inherit';\n\n  const episodes = episodesPage?.items ?? [];\n  const totalCount = episodesPage?.total ?? 0;\n  const whitelistedCount =\n    episodesPage?.whitelisted_total ?? episodes.filter((ep: Episode) => ep.whitelisted).length;\n  const totalPages = Math.max(\n    1,\n    episodesPage?.total_pages ?? Math.ceil(totalCount / EPISODES_PAGE_SIZE)\n  );\n  const hasEpisodes = totalCount > 0;\n  const visibleStart = hasEpisodes ? (page - 1) * EPISODES_PAGE_SIZE + 1 : 0;\n  const visibleEnd = hasEpisodes ? 
Math.min(totalCount, page * EPISODES_PAGE_SIZE) : 0;\n\n  useEffect(() => {\n    if (page > totalPages && totalPages > 0) {\n      setPage(totalPages);\n    }\n  }, [page, totalPages]);\n\n  const handleBulkWhitelistToggle = () => {\n    if (requireAuth && !isAdmin) {\n      toast.error('Only admins can bulk toggle whitelist status.');\n      return;\n    }\n    bulkWhitelistMutation.mutate();\n  };\n\n  const handleDeleteFeed = () => {\n    if (confirm(`Are you sure you want to delete \"${currentFeed.title}\"? This action cannot be undone.`)) {\n      deleteFeedMutation.mutate();\n    }\n  };\n\n  const episodesToShow = useMemo(() => episodes, [episodes]);\n\n  const sortedEpisodes = useMemo(() => {\n    const list = [...episodesToShow];\n    return list.sort((a, b) => {\n      switch (sortBy) {\n        case 'newest':\n          return new Date(b.release_date || 0).getTime() - new Date(a.release_date || 0).getTime();\n        case 'oldest':\n          return new Date(a.release_date || 0).getTime() - new Date(b.release_date || 0).getTime();\n        case 'title':\n          return a.title.localeCompare(b.title);\n        default:\n          return 0;\n      }\n    });\n  }, [episodesToShow, sortBy]);\n\n  // Calculate whitelist status for bulk button\n  const allWhitelisted = totalCount > 0 && whitelistedCount === totalCount;\n\n  const formatDate = (dateString: string | null) => {\n    if (!dateString) return 'Unknown date';\n    return new Date(dateString).toLocaleDateString('en-US', {\n      year: 'numeric',\n      month: 'short',\n      day: 'numeric'\n    });\n  };\n\n  const formatDuration = (seconds: number | null) => {\n    if (!seconds) return '';\n    const hours = Math.floor(seconds / 3600);\n    const minutes = Math.floor((seconds % 3600) / 60);\n    if (hours > 0) {\n      return `${hours}h ${minutes}m`;\n    }\n    return `${minutes}m`;\n  };\n\n  const handleCopyRssToClipboard = async () => {\n    if (requireAuth && !isAuthenticated) {\n      
toast.error('Please sign in to copy a protected RSS URL.');\n      return;\n    }\n\n    try {\n      let rssUrl: string;\n      if (requireAuth) {\n        const response = await feedsApi.createProtectedFeedShareLink(currentFeed.id);\n        rssUrl = response.url;\n      } else {\n        rssUrl = new URL(`/feed/${currentFeed.id}`, window.location.origin).toString();\n      }\n\n      await copyToClipboard(rssUrl, 'Copy the Feed RSS URL:', 'Feed URL copied to clipboard!');\n    } catch (err) {\n      console.error('Failed to copy feed URL', err);\n      toast.error('Failed to copy feed URL');\n    }\n  };\n\n  const handleCopyOriginalRssToClipboard = async () => {\n    try {\n      const rssUrl = currentFeed.rss_url || '';\n      if (!rssUrl) throw new Error('No RSS URL');\n\n      await copyToClipboard(rssUrl, 'Copy the Original RSS URL:', 'Original RSS URL copied to clipboard');\n    } catch (err) {\n      console.error('Failed to copy original RSS URL', err);\n      toast.error('Failed to copy original RSS URL');\n    }\n  };\n\n  return (\n    <div className=\"h-full flex flex-col bg-white relative\">\n      {/* Mobile Header */}\n      <div className=\"flex items-center justify-between p-4 border-b lg:hidden\">\n        <h2 className=\"text-lg font-semibold text-gray-900\">Podcast Details</h2>\n        {onClose && (\n          <button\n            onClick={onClose}\n            className=\"p-2 text-gray-400 hover:text-gray-600\"\n          >\n            <svg className=\"w-6 h-6\" fill=\"none\" stroke=\"currentColor\" viewBox=\"0 0 24 24\">\n              <path strokeLinecap=\"round\" strokeLinejoin=\"round\" strokeWidth={2} d=\"M6 18L18 6M6 6l12 12\" />\n            </svg>\n          </button>\n        )}\n      </div>\n\n      {/* Sticky Header - appears when scrolling */}\n      <div className={`absolute top-16 lg:top-0 left-0 right-0 z-10 bg-white border-b transition-all duration-300 ${\n        showStickyHeader ? 
'opacity-100 translate-y-0' : 'opacity-0 -translate-y-full pointer-events-none'\n      }`}>\n        <div className=\"p-4\">\n          <div className=\"flex items-center gap-3\">\n            {currentFeed.image_url && (\n              <img\n                src={currentFeed.image_url}\n                alt={currentFeed.title}\n                className=\"w-10 h-10 rounded-lg object-cover\"\n              />\n            )}\n            <div className=\"flex-1 min-w-0\">\n              <h2 className=\"font-semibold text-gray-900 truncate\">{currentFeed.title}</h2>\n              {currentFeed.author && (\n                <p className=\"text-sm text-gray-600 truncate\">by {currentFeed.author}</p>\n              )}\n            </div>\n            <select\n              value={sortBy}\n              onChange={(e) => setSortBy(e.target.value as SortOption)}\n              className=\"text-sm border border-gray-300 rounded-md px-3 py-1 bg-white\"\n            >\n              <option value=\"newest\">Newest First</option>\n              <option value=\"oldest\">Oldest First</option>\n              <option value=\"title\">Title A-Z</option>\n            </select>\n\n            {/* do not add additional controls to the sticky header */}\n          </div>\n        </div>\n      </div>\n\n      {/* Scrollable Content */}\n      <div ref={scrollContainerRef} className=\"flex-1 overflow-y-auto\">\n        {/* Feed Info Header */}\n        <div ref={feedHeaderRef} className=\"p-6 border-b\">\n          <div className=\"flex flex-col gap-6\">\n            {/* Top Section: Image and Title */}\n            <div className=\"flex items-end gap-6\">\n              {/* Podcast Image */}\n              <div className=\"flex-shrink-0\">\n                {currentFeed.image_url ? 
(\n                  <img\n                    src={currentFeed.image_url}\n                    alt={currentFeed.title}\n                    className=\"w-32 h-32 sm:w-40 sm:h-40 rounded-lg object-cover shadow-lg\"\n                  />\n                ) : (\n                  <div className=\"w-32 h-32 sm:w-40 sm:h-40 rounded-lg bg-gray-200 flex items-center justify-center shadow-lg\">\n                    <svg className=\"w-16 h-16 text-gray-400\" fill=\"none\" stroke=\"currentColor\" viewBox=\"0 0 24 24\">\n                      <path strokeLinecap=\"round\" strokeLinejoin=\"round\" strokeWidth={2} d=\"M9 19V6l12-3v13M9 19c0 1.105-1.343 2-3 2s-3-.895-3-2 1.343-2 3-2 3 .895 3 2zm12-3c0 1.105-1.343 2-3 2s-3-.895-3-2 1.343-2 3-2 3 .895 3 2zM9 10l12-3\" />\n                    </svg>\n                  </div>\n                )}\n              </div>\n\n              {/* Title aligned to bottom-left of image */}\n              <div className=\"flex-1 min-w-0 pb-2\">\n                <h1 className=\"text-2xl font-bold text-gray-900 mb-1\">{currentFeed.title}</h1>\n                {currentFeed.author && (\n                  <p className=\"text-lg text-gray-600\">by {currentFeed.author}</p>\n                )}\n                <div className=\"mt-2 text-sm text-gray-500\">\n                  <span>{totalCount} episodes visible</span>\n                </div>\n                {requireAuth && isAdmin && (\n                  <div className=\"mt-2 flex items-center gap-2 flex-wrap text-sm\">\n                    <span\n                      className={`px-2 py-1 rounded-full text-xs font-medium border ${\n                        isMember\n                          ? 'bg-green-50 text-green-700 border-green-200'\n                          : 'bg-gray-100 text-gray-600 border-gray-200'\n                      }`}\n                    >\n                      {isMember ? 
'Joined' : 'Not joined'}\n                    </span>\n                    {isMember && !isActiveSubscription && (\n                      <span className=\"px-2 py-1 rounded-full text-xs font-medium border bg-amber-50 text-amber-700 border-amber-200\">\n                        Paused\n                      </span>\n                    )}\n                  </div>\n                )}\n              </div>\n            </div>\n\n            {/* RSS Button and Menu */}\n            <div className=\"flex items-center gap-3\">\n              {/* Podly RSS Subscribe Button */}\n              {showPodlyRssButton && (\n                <button\n                  onClick={handleCopyRssToClipboard}\n                  title=\"Copy Podly RSS feed URL\"\n                  className={`flex items-center gap-3 px-5 py-2 bg-black hover:bg-gray-900 text-white rounded-lg font-medium transition-colors ${\n                    !canSubscribe ? 'opacity-60 cursor-not-allowed' : ''\n                  }`}\n                  disabled={!canSubscribe}\n                >\n                  <img\n                    src=\"/rss-round-color-icon.svg\"\n                    alt=\"Podly RSS\"\n                    className=\"w-6 h-6\"\n                    aria-hidden=\"true\"\n                  />\n                  <span className=\"text-white\">\n                    {canSubscribe ? 'Subscribe to Podly RSS' : 'Join feed to subscribe'}\n                  </span>\n                </button>\n              )}\n\n              {requireAuth && isAdmin && !isMember && (\n                <button\n                  onClick={() => joinFeedMutation.mutate()}\n                  disabled={joinFeedMutation.isPending}\n                  className={`flex items-center gap-2 px-4 py-2 rounded-lg font-medium transition-colors ${\n                    joinFeedMutation.isPending\n                      ? 
'bg-blue-100 text-blue-300 cursor-not-allowed'\n                      : 'bg-blue-600 text-white hover:bg-blue-700'\n                  }`}\n                >\n                  <svg className=\"w-4 h-4\" fill=\"none\" stroke=\"currentColor\" viewBox=\"0 0 24 24\">\n                    <path strokeLinecap=\"round\" strokeLinejoin=\"round\" strokeWidth={2} d=\"M12 4v16m8-8H4\" />\n                  </svg>\n                  Join feed\n                </button>\n              )}\n\n              {canModifyEpisodes && (\n                <button\n                  onClick={() => refreshFeedMutation.mutate()}\n                  disabled={refreshFeedMutation.isPending}\n                  title=\"Refresh feed from source\"\n                  className={`flex items-center gap-2 px-4 py-2 rounded-lg font-medium transition-colors ${\n                    refreshFeedMutation.isPending\n                      ? 'bg-gray-200 text-gray-500 cursor-not-allowed'\n                      : 'bg-gray-100 text-gray-700 hover:bg-gray-200'\n                  }`}\n                >\n                  <img\n                    className={`w-4 h-4 ${refreshFeedMutation.isPending ? 
'animate-spin' : ''}`}\n                    src=\"/reload-icon.svg\"\n                    alt=\"Refresh feed\"\n                    aria-hidden=\"true\"\n                  />\n                  <span>Refresh Feed</span>\n                </button>\n              )}\n\n              {/* Ellipsis Menu */}\n              <div className=\"relative menu-container\">\n                <button\n                  onClick={() => setShowMenu(!showMenu)}\n                  className=\"w-10 h-10 rounded-lg bg-gray-100 hover:bg-gray-200 flex items-center justify-center text-gray-600 hover:text-gray-800 transition-colors\"\n                >\n                  <svg className=\"w-5 h-5\" fill=\"currentColor\" viewBox=\"0 0 24 24\">\n                    <path d=\"M12 8c1.1 0 2-.9 2-2s-.9-2-2-2-2 .9-2 2 .9 2 2 2zm0 2c-1.1 0-2 .9-2 2s.9 2 2 2 2-.9 2-2-.9-2-2-2zm0 6c-1.1 0-2 .9-2 2s.9 2 2 2 2-.9 2-2-.9-2-2-2z\"/>\n                  </svg>\n                </button>\n\n                {/* Dropdown Menu */}\n                {showMenu && (\n                  <div className=\"absolute top-full right-0 mt-1 w-56 bg-white rounded-lg shadow-lg border border-gray-200 py-1 z-20 max-w-[calc(100vw-2rem)]\">\n                    {canBulkModifyEpisodes && (\n                      <>\n                        <button\n                          onClick={() => {\n                            if (!allWhitelisted) {\n                              handleBulkWhitelistToggle();\n                            }\n                            setShowMenu(false);\n                          }}\n                          disabled={bulkWhitelistMutation.isPending || totalCount === 0 || allWhitelisted}\n                          className=\"w-full px-4 py-2 text-left text-sm text-gray-700 hover:bg-gray-50 flex items-center gap-3 disabled:opacity-50 disabled:cursor-not-allowed\"\n                        >\n                          <span className=\"text-green-600\">✓</span>\n                          Enable all episodes\n                        </button>\n\n                        <button\n                          onClick={() => {\n                            if (allWhitelisted) {\n                              handleBulkWhitelistToggle();\n                            }\n                            setShowMenu(false);\n                          }}\n                          disabled={bulkWhitelistMutation.isPending || totalCount === 0 || !allWhitelisted}\n                          className=\"w-full px-4 py-2 text-left text-sm text-gray-700 hover:bg-gray-50 flex items-center gap-3 disabled:opacity-50 disabled:cursor-not-allowed\"\n                        >\n                          <span className=\"text-red-600\">⛔</span>\n                          Disable all episodes\n                        </button>\n                      </>\n                    )}\n\n                    {isAdmin && (\n                      <button\n                        onClick={() => {\n                          setShowHelp(!showHelp);\n                          setShowMenu(false);\n                        }}\n                        className=\"w-full px-4 py-2 text-left text-sm text-gray-700 hover:bg-gray-50 flex items-center gap-3\"\n                      >\n                        <span className=\"text-blue-600\">ℹ️</span>\n                        Explain whitelist\n                      </button>\n                    )}\n\n                    <button\n                      onClick={() => {\n                        handleCopyOriginalRssToClipboard();\n                        setShowMenu(false);\n                      }}\n                      className=\"w-full px-4 py-2 text-left text-sm text-gray-700 hover:bg-gray-50 flex items-center gap-3\"\n                    >\n                      <img src=\"/rss-round-color-icon.svg\" alt=\"Original RSS\" className=\"w-4 h-4\" />\n                      Original RSS feed\n                    </button>\n\n             
       {requireAuth && isAdmin && isMember && (\n                      <>\n                        <div className=\"border-t border-gray-100 my-1\"></div>\n                        <button\n                          onClick={() => {\n                            leaveFeedMutation.mutate();\n                            setShowMenu(false);\n                          }}\n                          disabled={leaveFeedMutation.isPending}\n                          className=\"w-full px-4 py-2 text-left text-sm text-gray-700 hover:bg-gray-50 flex items-center gap-3 disabled:opacity-50 disabled:cursor-not-allowed\"\n                        >\n                          <svg className=\"w-4 h-4\" fill=\"none\" stroke=\"currentColor\" viewBox=\"0 0 24 24\">\n                            <path strokeLinecap=\"round\" strokeLinejoin=\"round\" strokeWidth={2} d=\"M17 16l4-4m0 0l-4-4m4 4H7m6 4v1a3 3 0 01-3 3H6a3 3 0 01-3-3V7a3 3 0 013-3h4a3 3 0 013 3v1\" />\n                          </svg>\n                          Leave feed\n                        </button>\n                      </>\n                    )}\n\n                    {canDeleteFeed && (\n                      <>\n                        <div className=\"border-t border-gray-100 my-1\"></div>\n\n                        <button\n                          onClick={() => {\n                            handleDeleteFeed();\n                            setShowMenu(false);\n                          }}\n                          disabled={deleteFeedMutation.isPending}\n                          className=\"w-full px-4 py-2 text-left text-sm text-red-600 hover:bg-red-50 flex items-center gap-3 disabled:opacity-50 disabled:cursor-not-allowed\"\n                        >\n                          <svg className=\"w-4 h-4\" fill=\"none\" stroke=\"currentColor\" viewBox=\"0 0 24 24\">\n                            <path strokeLinecap=\"round\" strokeLinejoin=\"round\" strokeWidth={2} d=\"M19 7l-.867 12.142A2 2 0 0116.138 
21H7.862a2 2 0 01-1.995-1.858L5 7m5 4v6m4-6v6m1-10V4a1 1 0 00-1-1h-4a1 1 0 00-1 1v3M4 7h16\" />\n                          </svg>\n                          Delete feed\n                        </button>\n                      </>\n                    )}\n                  </div>\n                )}\n              </div>\n            </div>\n\n            {/* Feed Description */}\n            {currentFeed.description && (\n              <div className=\"text-gray-700 leading-relaxed\">\n                <p>{currentFeed.description.replace(/<[^>]*>/g, '')}</p>\n              </div>\n            )}\n\n            {isAdmin && (\n              <div className=\"rounded-lg border border-gray-200 bg-gray-50 p-4\">\n                <div className=\"flex flex-col gap-2\">\n                  <div>\n                    <label className=\"text-sm font-medium text-gray-900\">\n                      Auto-whitelist new episodes\n                    </label>\n                    <p className=\"text-xs text-gray-600\">\n                      Overrides the global setting. Global default: {autoWhitelistDefaultLabel}.\n                    </p>\n                  </div>\n                  <select\n                    value={autoWhitelistSelectValue}\n                    onChange={(e) => handleAutoWhitelistOverrideChange(e.target.value)}\n                    disabled={updateFeedSettingsMutation.isPending}\n                    className={`text-sm border border-gray-300 rounded-md px-3 py-2 bg-white ${\n                      updateFeedSettingsMutation.isPending\n                        ? 
'opacity-60 cursor-not-allowed'\n                        : ''\n                    }`}\n                  >\n                    <option value=\"inherit\">Use global setting ({autoWhitelistDefaultLabel})</option>\n                    <option value=\"on\">On</option>\n                    <option value=\"off\">Off</option>\n                  </select>\n                </div>\n              </div>\n            )}\n          </div>\n        </div>\n\n        {/* Inactive Subscription Warning */}\n        {currentFeed.is_member && currentFeed.is_active_subscription === false && (\n          <div className=\"bg-amber-50 border border-amber-200 rounded-lg p-4 flex items-start gap-3\">\n            <svg className=\"w-5 h-5 text-amber-600 mt-0.5 flex-shrink-0\" fill=\"none\" stroke=\"currentColor\" viewBox=\"0 0 24 24\">\n              <path strokeLinecap=\"round\" strokeLinejoin=\"round\" strokeWidth={2} d=\"M12 9v2m0 4h.01m-6.938 4h13.856c1.54 0 2.502-1.667 1.732-3L13.732 4c-.77-1.333-2.694-1.333-3.464 0L3.34 16c-.77 1.333.192 3 1.732 3z\" />\n            </svg>\n            <div>\n              <h3 className=\"text-sm font-medium text-amber-800\">Processing Paused</h3>\n              <p className=\"text-sm text-amber-700 mt-1\">\n                This feed exceeds your plan's allowance. 
New episodes will not be processed automatically until you upgrade your plan or leave other feeds.\n              </p>\n            </div>\n          </div>\n        )}\n\n        {/* Episodes Header with Sort Only */}\n        <div className=\"p-4 border-b bg-gray-50\">\n          <div className=\"flex items-center justify-between\">\n            <h3 className=\"text-lg font-semibold text-gray-900\">Episodes</h3>\n            <select\n              value={sortBy}\n              onChange={(e) => setSortBy(e.target.value as SortOption)}\n              className=\"text-sm border border-gray-300 rounded-md px-3 py-1 bg-white\"\n            >\n              <option value=\"newest\">Newest First</option>\n              <option value=\"oldest\">Oldest First</option>\n              <option value=\"title\">Title A-Z</option>\n            </select>\n          </div>\n        </div>\n\n        {/* Help Explainer (admins only) */}\n        {showHelp && isAdmin && (\n          <div className=\"bg-blue-50 border-b border-blue-200 p-4\">\n            <div className=\"max-w-2xl\">\n              <h4 className=\"font-semibold text-blue-900 mb-2\">About Enabling & Disabling Ad Removal</h4>\n              <div className=\"text-sm text-blue-800 space-y-2 text-left\">\n                <p>\n                  <strong>Enabled episodes</strong> are processed by Podly to automatically detect and remove advertisements,\n                  giving you a clean, ad-free listening experience.\n                </p>\n                <p>\n                  <strong>Disabled episodes</strong> are not processed and won't be available for download through Podly.\n                  This is useful for episodes you don't want to listen to.\n                </p>\n                <p>\n                  <strong>Why whitelist episodes?</strong> Processing takes time and computational resources.\n                  Enable only the episodes you want to hear to keep your feed focused. 
This is useful when adding a new feed with a large back catalog.\n                </p>\n              </div>\n              <button\n                onClick={() => setShowHelp(false)}\n                className=\"mt-3 text-xs text-blue-600 hover:text-blue-800 font-medium\"\n              >\n                Got it, hide this explanation\n              </button>\n            </div>\n          </div>\n        )}\n\n        {/* Episodes List */}\n        <div>\n          {isLoading ? (\n            <div className=\"p-6\">\n              <div className=\"animate-pulse space-y-4\">\n                {[...Array(5)].map((_, i) => (\n                  <div key={i} className=\"h-20 bg-gray-200 rounded\"></div>\n                ))}\n              </div>\n            </div>\n          ) : error ? (\n            <div className=\"p-6\">\n              <p className=\"text-red-600\">Failed to load episodes</p>\n            </div>\n          ) : sortedEpisodes.length === 0 ? (\n            <div className=\"p-6 text-center\">\n              <p className=\"text-gray-500\">No episodes found</p>\n            </div>\n          ) : (\n            <div className=\"divide-y divide-gray-200\">\n              {sortedEpisodes.map((episode) => (\n                <div key={episode.id} className=\"p-4 hover:bg-gray-50\">\n                  <div className={`flex flex-col ${episode.whitelisted ? 'gap-3' : 'gap-2'}`}>\n                    {/* Top Section: Thumbnail and Title */}\n                    <div className=\"flex items-start gap-3\">\n                      {/* Episode/Podcast Thumbnail */}\n                      <div className=\"flex-shrink-0\">\n                        {(episode.image_url || currentFeed.image_url) ? 
(\n                          <img\n                            src={episode.image_url || currentFeed.image_url}\n                            alt={episode.title}\n                            className=\"w-16 h-16 rounded-lg object-cover\"\n                          />\n                        ) : (\n                          <div className=\"w-16 h-16 rounded-lg bg-gray-200 flex items-center justify-center\">\n                            <svg className=\"w-8 h-8 text-gray-400\" fill=\"none\" stroke=\"currentColor\" viewBox=\"0 0 24 24\">\n                              <path strokeLinecap=\"round\" strokeLinejoin=\"round\" strokeWidth={2} d=\"M9 19V6l12-3v13M9 19c0 1.105-1.343 2-3 2s-3-.895-3-2 1.343-2 3-2 3 .895 3 2zm12-3c0 1.105-1.343 2-3 2s-3-.895-3-2 1.343-2 3-2 3 .895 3 2zM9 10l12-3\" />\n                            </svg>\n                          </div>\n                        )}\n                      </div>\n\n                      {/* Title and Feed Name */}\n                      <div className=\"flex-1 min-w-0 text-left\">\n                        <h4 className=\"font-medium text-gray-900 mb-1 line-clamp-2 text-left\">\n                          {episode.title}\n                        </h4>\n                        <p className=\"text-sm text-gray-600 text-left\">\n                          {currentFeed.title}\n                        </p>\n                      </div>\n                    </div>\n\n                    {/* Episode Description */}\n                    {episode.description && (\n                      <div className=\"text-left\">\n                        <p className=\"text-sm text-gray-500 line-clamp-3\">\n                          {(() => {\n                            // Strip HTML tags; only append an ellipsis when the text is actually truncated\n                            const plainText = episode.description.replace(/<[^>]*>/g, '');\n                            return plainText.length > 300 ? `${plainText.slice(0, 300)}...` : plainText;\n                          })()}\n                        </p>\n                      </div>\n                    )}\n\n                    {/* Metadata: Status, Date and Duration */}\n                    <div className=\"flex items-center gap-2 text-sm text-gray-500\">\n
                 {showWhitelistUi && (\n                        <>\n                          <button\n                            onClick={() => handleWhitelistToggle(episode)}\n                            disabled={whitelistMutation.isPending}\n                            className={`px-2 py-1 text-xs font-medium rounded-full transition-colors flex items-center justify-center gap-1 ${\n                              episode.whitelisted\n                                ? 'bg-green-100 text-green-800 hover:bg-green-200'\n                                : 'bg-gray-100 text-gray-800 hover:bg-gray-200'\n                            } ${whitelistMutation.isPending ? 'opacity-50 cursor-not-allowed' : ''}`}\n                          >\n                            {whitelistMutation.isPending ? (\n                              <>\n                                <svg className=\"w-3 h-3 animate-spin\" fill=\"none\" stroke=\"currentColor\" viewBox=\"0 0 24 24\">\n                                  <path strokeLinecap=\"round\" strokeLinejoin=\"round\" strokeWidth={2} d=\"M4 4v5h.582m15.356 2A8.001 8.001 0 004.582 9m0 0H9m11 11v-5h-.581m0 0a8.003 8.003 0 01-15.357-2m15.357 2H15\" />\n                                </svg>\n                                <span>...</span>\n                              </>\n                            ) : episode.whitelisted ? 
(\n                              <>\n                                <span>✅</span>\n                                <span>Enabled</span>\n                              </>\n                            ) : (\n                              <>\n                                <span>⛔</span>\n                                <span>Disabled</span>\n                              </>\n                            )}\n                          </button>\n                          <span>•</span>\n                        </>\n                      )}\n                      <span>{formatDate(episode.release_date)}</span>\n                      {episode.duration && (\n                        <>\n                          <span>•</span>\n                          <span>{formatDuration(episode.duration)}</span>\n                        </>\n                      )}\n                      <>\n                        <span>•</span>\n                        <span>\n                          {episode.download_count ? episode.download_count : 0} {episode.download_count === 1 ? 
'download' : 'downloads'}\n                        </span>\n                      </>\n                    </div>\n\n                    {/* Bottom Controls - only show if episode is whitelisted */}\n                    {episode.whitelisted && (\n                      <div className=\"flex items-center justify-between\">\n                        {/* Left side: Download buttons */}\n                        <div className=\"flex items-center gap-2\">\n                          <DownloadButton\n                            episodeGuid={episode.guid}\n                            isWhitelisted={episode.whitelisted}\n                            hasProcessedAudio={episode.has_processed_audio}\n                            feedId={currentFeed.id}\n                            canModifyEpisodes={canModifyEpisodes}\n                            className=\"min-w-[100px]\"\n                          />\n\n                          <EpisodeProcessingStatus\n                            episodeGuid={episode.guid}\n                            isWhitelisted={episode.whitelisted}\n                            hasProcessedAudio={episode.has_processed_audio}\n                            feedId={currentFeed.id}\n                          />\n\n                          <ProcessingStatsButton\n                            episodeGuid={episode.guid}\n                            hasProcessedAudio={episode.has_processed_audio}\n                          />\n                        </div>\n\n                        {/* Right side: Play button */}\n                        <div className=\"flex-shrink-0 w-12 flex justify-end\">\n                          {episode.has_processed_audio && (\n                            <PlayButton\n                              episode={episode}\n                              className=\"ml-2\"\n                            />\n                          )}\n                        </div>\n                      </div>\n                    )}\n                  </div>\n 
               </div>\n              ))}\n            </div>\n          )}\n        </div>\n\n        {totalCount > 0 && (\n          <div className=\"flex items-center justify-between px-4 py-3 border-t bg-white\">\n            <div className=\"text-sm text-gray-600\">\n              Showing {visibleStart}-{visibleEnd} of {totalCount} episodes\n            </div>\n            <div className=\"flex items-center gap-2\">\n              <button\n                onClick={() => setPage((prev) => Math.max(1, prev - 1))}\n                disabled={page === 1 || isLoading || isFetching}\n                className={`px-3 py-1 text-sm rounded-md border transition-colors ${\n                  page === 1 || isLoading || isFetching\n                    ? 'bg-gray-100 text-gray-400 border-gray-200 cursor-not-allowed'\n                    : 'bg-white text-gray-700 border-gray-300 hover:bg-gray-50'\n                }`}\n              >\n                Previous\n              </button>\n              <span className=\"text-sm text-gray-700\">\n                Page {page} of {totalPages}\n              </span>\n              <button\n                onClick={() => setPage((prev) => Math.min(totalPages, prev + 1))}\n                disabled={page >= totalPages || isLoading || isFetching}\n                className={`px-3 py-1 text-sm rounded-md border transition-colors ${\n                  page >= totalPages || isLoading || isFetching\n                    ? 
'bg-gray-100 text-gray-400 border-gray-200 cursor-not-allowed'\n                    : 'bg-white text-gray-700 border-gray-300 hover:bg-gray-50'\n                }`}\n              >\n                Next\n              </button>\n            </div>\n          </div>\n        )}\n      </div>\n\n      {showProcessingModal && pendingEpisode && (\n        <div className=\"fixed inset-0 z-50 flex items-center justify-center bg-black/60 p-4\" onClick={handleCancelProcessing}>\n          <div\n            className=\"bg-white rounded-xl shadow-2xl max-w-lg w-full p-6 space-y-4\"\n            onClick={(event) => event.stopPropagation()}\n          >\n            <h3 className=\"text-lg font-semibold text-gray-900\">Enable episode</h3>\n            <p className=\"text-sm text-gray-600\">{pendingEpisode.title}</p>\n            {isEstimating && (\n              <div className=\"flex items-center gap-2 text-sm text-gray-500\">\n                <div className=\"w-4 h-4 border-2 border-gray-300 border-t-gray-600 rounded-full animate-spin\" />\n                Estimating processing time…\n              </div>\n            )}\n            {!isEstimating && estimateError && (\n              <p className=\"text-sm text-red-600\">{estimateError}</p>\n            )}\n            {!isEstimating && processingEstimate && (\n              <div className=\"bg-gray-50 rounded-lg p-4 text-sm text-gray-700 space-y-1\">\n                <p><strong>Estimated minutes:</strong> {processingEstimate.estimated_minutes.toFixed(2)}</p>\n                {!processingEstimate.can_process && (\n                  <p className=\"text-red-600 font-medium\">Processing not available for this episode.</p>\n                )}\n              </div>\n            )}\n            <div className=\"flex justify-end gap-3\">\n              <button\n                onClick={handleCancelProcessing}\n                className=\"px-4 py-2 text-sm font-medium text-gray-600 hover:text-gray-800\"\n              >\n           
     Cancel\n              </button>\n              <button\n                onClick={handleConfirmProcessing}\n                disabled={\n                  whitelistMutation.isPending ||\n                  isEstimating ||\n                  !processingEstimate?.can_process\n                }\n                className={`px-4 py-2 rounded-lg text-sm font-medium ${\n                  whitelistMutation.isPending || isEstimating || !processingEstimate?.can_process\n                    ? 'bg-gray-200 text-gray-500 cursor-not-allowed'\n                    : 'bg-blue-600 text-white hover:bg-blue-700'\n                }`}\n              >\n                {whitelistMutation.isPending ? 'Starting…' : 'Confirm & process'}\n              </button>\n            </div>\n          </div>\n        </div>\n      )}\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/components/FeedList.tsx",
    "content": "import { useMemo, useState } from 'react';\nimport { useAuth } from '../contexts/AuthContext';\nimport type { Feed } from '../types';\n\ninterface FeedListProps {\n  feeds: Feed[];\n  onFeedDeleted: () => void;\n  onFeedSelected: (feed: Feed) => void;\n  selectedFeedId?: number;\n}\n\nexport default function FeedList({ feeds, onFeedDeleted: _onFeedDeleted, onFeedSelected, selectedFeedId }: FeedListProps) {\n  const [searchTerm, setSearchTerm] = useState('');\n  const { requireAuth, user } = useAuth();\n  const showMembership = Boolean(requireAuth && user?.role === 'admin');\n\n  // Ensure feeds is an array\n  const feedsArray = Array.isArray(feeds) ? feeds : [];\n\n  const filteredFeeds = useMemo(() => {\n    const term = searchTerm.trim().toLowerCase();\n    if (!term) {\n      return feedsArray;\n    }\n    return feedsArray.filter((feed) => {\n      const title = feed.title?.toLowerCase() ?? '';\n      const author = feed.author?.toLowerCase() ?? '';\n      return title.includes(term) || author.includes(term);\n    });\n  }, [feedsArray, searchTerm]);\n\n  if (feedsArray.length === 0) {\n    return (\n      <div className=\"text-center py-12\">\n        <p className=\"text-gray-500 text-lg\">No podcast feeds added yet.</p>\n        <p className=\"text-gray-400 mt-2\">Click \"Add Feed\" to get started.</p>\n      </div>\n    );\n  }\n\n  return (\n    <div className=\"flex flex-col h-full\">\n      <div className=\"mb-3\">\n        <label htmlFor=\"feed-search\" className=\"sr-only\">\n          Search feeds\n        </label>\n        <input\n          id=\"feed-search\"\n          type=\"search\"\n          placeholder=\"Search feeds\"\n          value={searchTerm}\n          onChange={(event) => setSearchTerm(event.target.value)}\n          className=\"w-full rounded-lg border border-gray-300 bg-white px-3 py-2 text-sm text-gray-900 placeholder:text-gray-500 focus:border-blue-500 focus:outline-none focus:ring-2 focus:ring-blue-200\"\n        
/>\n      </div>\n      <div className=\"space-y-2 overflow-y-auto h-full pb-20\">\n        {filteredFeeds.length === 0 ? (\n          <div className=\"flex h-full items-center justify-center rounded-lg border border-dashed border-gray-300 bg-gray-50 px-4 py-8 text-center\">\n            <p className=\"text-sm text-gray-500\">\n              No podcasts match &quot;{searchTerm}&quot;\n            </p>\n          </div>\n        ) : (\n          filteredFeeds.map((feed) => (\n            <div \n              key={feed.id} \n              className={`bg-white rounded-lg shadow border cursor-pointer transition-all hover:shadow-md group ${\n                selectedFeedId === feed.id ? 'ring-2 ring-blue-500 border-blue-200' : ''\n              }`}\n              onClick={() => onFeedSelected(feed)}\n            >\n              <div className=\"p-4\">\n                <div className=\"flex items-start gap-3\">\n                  {/* Podcast Image */}\n                  <div className=\"flex-shrink-0\">\n                    {feed.image_url ? 
(\n                      <img\n                        src={feed.image_url}\n                        alt={feed.title}\n                        className=\"w-12 h-12 rounded-lg object-cover\"\n                      />\n                    ) : (\n                      <div className=\"w-12 h-12 rounded-lg bg-gray-200 flex items-center justify-center\">\n                        <svg className=\"w-6 h-6 text-gray-400\" fill=\"none\" stroke=\"currentColor\" viewBox=\"0 0 24 24\">\n                          <path strokeLinecap=\"round\" strokeLinejoin=\"round\" strokeWidth={2} d=\"M9 19V6l12-3v13M9 19c0 1.105-1.343 2-3 2s-3-.895-3-2 1.343-2 3-2 3 .895 3 2zm12-3c0 1.105-1.343 2-3 2s-3-.895-3-2 1.343-2 3-2 3 .895 3 2zM9 10l12-3\" />\n                        </svg>\n                      </div>\n                    )}\n                  </div>\n\n                  {/* Feed Info */}\n                  <div className=\"flex-1 min-w-0\">\n                    <h3 className=\"font-medium text-gray-900 line-clamp-2\">{feed.title}</h3>\n                    {feed.author && (\n                      <p className=\"text-sm text-gray-600 mt-1\">by {feed.author}</p>\n                    )}\n                    <div className=\"flex items-center justify-between mt-2\">\n                      <span className=\"text-xs text-gray-500\">{feed.posts_count} episodes</span>\n                      {showMembership && (\n                        <div className=\"flex items-center gap-2\">\n                          <span\n                            className={`px-2 py-0.5 rounded-full text-[11px] font-medium ${\n                              feed.is_member\n                                ? 'bg-green-100 text-green-700 border border-green-200'\n                                : 'bg-gray-100 text-gray-600 border border-gray-200'\n                            }`}\n                          >\n                            {feed.is_member ? 
'Joined' : 'Not joined'}\n                          </span>\n                          {feed.is_member && feed.is_active_subscription === false && (\n                            <span className=\"px-2 py-0.5 rounded-full text-[11px] font-medium bg-amber-100 text-amber-700 border border-amber-200\">\n                              Paused\n                            </span>\n                          )}\n                        </div>\n                      )}\n                    </div>\n                  </div>\n                </div>\n              </div>\n            </div>\n          ))\n        )}\n      </div>\n    </div>\n  );\n} \n"
  },
  {
    "path": "frontend/src/components/PlayButton.tsx",
    "content": "import { useAudioPlayer } from '../contexts/AudioPlayerContext';\nimport type { Episode } from '../types';\n\ninterface PlayButtonProps {\n  episode: Episode;\n  className?: string;\n}\n\nconst PlayIcon = ({ className }: { className: string }) => (\n  <svg className={className} fill=\"currentColor\" viewBox=\"0 0 24 24\">\n    <path d=\"M8 5v14l11-7z\"/>\n  </svg>\n);\n\nconst PauseIcon = ({ className }: { className: string }) => (\n  <svg className={className} fill=\"currentColor\" viewBox=\"0 0 24 24\">\n    <path d=\"M6 19h4V5H6v14zm8-14v14h4V5h-4z\"/>\n  </svg>\n);\n\nexport default function PlayButton({ episode, className = '' }: PlayButtonProps) {\n  const { currentEpisode, isPlaying, isLoading, playEpisode, togglePlayPause } = useAudioPlayer();\n  \n  const isCurrentEpisode = currentEpisode?.id === episode.id;\n  const canPlay = episode.has_processed_audio;\n\n  console.log(`PlayButton for \"${episode.title}\":`, {\n    has_processed_audio: episode.has_processed_audio,\n    whitelisted: episode.whitelisted,\n    canPlay\n  });\n\n  const getDisabledReason = () => {\n    if (!episode.has_processed_audio) {\n      return 'Episode not processed yet';\n    }\n    return '';\n  };\n\n  const handleClick = () => {\n    console.log('PlayButton clicked for episode:', episode.title);\n    console.log('canPlay:', canPlay);\n    console.log('isCurrentEpisode:', isCurrentEpisode);\n    \n    if (!canPlay) return;\n    \n    if (isCurrentEpisode) {\n      console.log('Toggling play/pause for current episode');\n      togglePlayPause();\n    } else {\n      console.log('Playing new episode');\n      playEpisode(episode);\n    }\n  };\n\n  const isDisabled = !canPlay || (isLoading && isCurrentEpisode);\n  const disabledReason = getDisabledReason();\n  const title = isDisabled && disabledReason \n    ? disabledReason \n    : isCurrentEpisode \n      ? (isPlaying ? 
'Pause' : 'Play') \n      : 'Play episode';\n\n  return (\n    <button\n      onClick={handleClick}\n      disabled={isDisabled}\n      className={`p-2 rounded-full transition-colors ${\n        isDisabled \n          ? 'bg-gray-300 text-gray-500 cursor-not-allowed' \n          : 'bg-blue-600 text-white hover:bg-blue-700'\n      } ${className}`}\n      title={title}\n    >\n      {isLoading && isCurrentEpisode ? (\n        <div className=\"w-4 h-4 border-2 border-current border-t-transparent rounded-full animate-spin\" />\n      ) : isCurrentEpisode && isPlaying ? (\n        <PauseIcon className=\"w-4 h-4\" />\n      ) : (\n        <PlayIcon className=\"w-4 h-4\" />\n      )}\n    </button>\n  );\n} "
  },
  {
    "path": "frontend/src/components/ProcessingStatsButton.tsx",
    "content": "import { useState } from 'react';\nimport { useQuery } from '@tanstack/react-query';\nimport { feedsApi } from '../services/api';\n\ninterface ProcessingStatsButtonProps {\n  episodeGuid: string;\n  hasProcessedAudio: boolean;\n  className?: string;\n}\n\nexport default function ProcessingStatsButton({\n  episodeGuid,\n  hasProcessedAudio,\n  className = ''\n}: ProcessingStatsButtonProps) {\n  const [showModal, setShowModal] = useState(false);\n  const [activeTab, setActiveTab] = useState<'overview' | 'model-calls' | 'transcript' | 'identifications'>('overview');\n  const [expandedModelCalls, setExpandedModelCalls] = useState<Set<number>>(new Set());\n\n  const { data: stats, isLoading, error } = useQuery({\n    queryKey: ['episode-stats', episodeGuid],\n    queryFn: () => feedsApi.getPostStats(episodeGuid),\n    enabled: showModal && hasProcessedAudio, // Only fetch when modal is open and episode is processed\n  });\n\n  const formatDuration = (seconds: number) => {\n    const hours = Math.floor(seconds / 3600);\n    const minutes = Math.floor((seconds % 3600) / 60);\n    const secs = Math.round(seconds % 60); // Round to nearest whole second\n\n    if (hours > 0) {\n      return `${hours}h ${minutes}m ${secs}s`;\n    }\n    return `${minutes}m ${secs}s`;\n  };\n\n  const formatTimestamp = (timestamp: string | null) => {\n    if (!timestamp) return 'N/A';\n    return new Date(timestamp).toLocaleString();\n  };\n\n  const toggleModelCallDetails = (callId: number) => {\n    const newExpanded = new Set(expandedModelCalls);\n    if (newExpanded.has(callId)) {\n      newExpanded.delete(callId);\n    } else {\n      newExpanded.add(callId);\n    }\n    setExpandedModelCalls(newExpanded);\n  };\n\n  if (!hasProcessedAudio) {\n    return null;\n  }\n\n  return (\n    <>\n      <button\n        onClick={() => setShowModal(true)}\n        className={`px-3 py-1 text-xs rounded font-medium transition-colors border bg-white text-gray-700 border-gray-300 
hover:bg-gray-50 hover:border-gray-400 hover:text-gray-900 flex items-center gap-1 ${className}`}\n      >\n        Stats\n      </button>\n\n      {/* Modal */}\n      {showModal && (\n        <div className=\"fixed inset-0 bg-black bg-opacity-50 flex items-center justify-center z-50 p-4\">\n          <div className=\"bg-white rounded-lg max-w-6xl w-full max-h-[90vh] overflow-hidden\">\n            {/* Header */}\n            <div className=\"flex items-center justify-between p-6 border-b\">\n              <h2 className=\"text-xl font-bold text-gray-900 text-left\">Processing Statistics & Debug</h2>\n              <button\n                onClick={() => setShowModal(false)}\n                className=\"p-2 text-gray-400 hover:text-gray-600 rounded-lg hover:bg-gray-100\"\n              >\n                <svg className=\"w-6 h-6\" fill=\"none\" stroke=\"currentColor\" viewBox=\"0 0 24 24\">\n                  <path strokeLinecap=\"round\" strokeLinejoin=\"round\" strokeWidth={2} d=\"M6 18L18 6M6 6l12 12\" />\n                </svg>\n              </button>\n            </div>\n\n            {/* Tabs */}\n            <div className=\"border-b\">\n              <nav className=\"flex space-x-8 px-6\">\n                {[\n                  { id: 'overview', label: 'Overview' },\n                  { id: 'model-calls', label: 'Model Calls' },\n                  { id: 'transcript', label: 'Transcript Segments' },\n                  { id: 'identifications', label: 'Identifications' }\n                ].map((tab) => (\n                  <button\n                    key={tab.id}\n                    onClick={() => setActiveTab(tab.id as 'overview' | 'model-calls' | 'transcript' | 'identifications')}\n                    className={`py-4 px-1 border-b-2 font-medium text-sm ${\n                      activeTab === tab.id\n                        ? 
'border-blue-500 text-blue-600'\n                        : 'border-transparent text-gray-500 hover:text-gray-700 hover:border-gray-300'\n                    }`}\n                  >\n                    {tab.label}\n                    {stats && tab.id === 'model-calls' && stats.model_calls && ` (${stats.model_calls.length})`}\n                    {stats && tab.id === 'transcript' && stats.transcript_segments && ` (${stats.transcript_segments.length})`}\n                    {stats && tab.id === 'identifications' && stats.identifications && ` (${stats.identifications.length})`}\n                  </button>\n                ))}\n              </nav>\n            </div>\n\n            {/* Content */}\n            <div className=\"p-6 overflow-y-auto max-h-[calc(90vh-200px)]\">\n              {isLoading ? (\n                <div className=\"flex items-center justify-center py-12\">\n                  <div className=\"animate-spin rounded-full h-8 w-8 border-b-2 border-blue-600\"></div>\n                  <span className=\"ml-3 text-gray-600\">Loading stats...</span>\n                </div>\n              ) : error ? (\n                <div className=\"text-center py-12\">\n                  <p className=\"text-red-600\">Failed to load processing statistics</p>\n                </div>\n              ) : stats ? 
(\n                <>\n                  {/* Overview Tab */}\n                  {activeTab === 'overview' && (\n                    <div className=\"space-y-6\">\n                      {/* Episode Info */}\n                      <div className=\"bg-gray-50 rounded-lg p-4\">\n                        <h3 className=\"font-semibold text-gray-900 mb-2 text-left\">Episode Information</h3>\n                        <div className=\"grid grid-cols-1 md:grid-cols-2 gap-4 text-sm\">\n                          <div className=\"text-left\">\n                            <span className=\"font-medium text-gray-700\">Title:</span>\n                            <span className=\"ml-2 text-gray-600\">{stats.post?.title || 'Unknown'}</span>\n                          </div>\n                          <div className=\"text-left\">\n                            <span className=\"font-medium text-gray-700\">Duration:</span>\n                            <span className=\"ml-2 text-gray-600\">\n                              {stats.post?.duration ? 
formatDuration(stats.post.duration) : 'Unknown'}\n                            </span>\n                          </div>\n                        </div>\n                      </div>\n\n                      {/* Key Metrics */}\n                      <div>\n                        <h3 className=\"font-semibold text-gray-900 mb-4 text-left\">Key Metrics</h3>\n                        <div className=\"grid grid-cols-2 md:grid-cols-4 gap-4\">\n                          <div className=\"bg-gradient-to-br from-blue-50 to-blue-100 rounded-lg p-4 text-center\">\n                            <div className=\"text-2xl font-bold text-blue-600\">\n                              {stats.processing_stats?.total_segments || 0}\n                            </div>\n                            <div className=\"text-sm text-blue-800\">Transcript Segments</div>\n                          </div>\n\n                          <div className=\"bg-gradient-to-br from-green-50 to-green-100 rounded-lg p-4 text-center\">\n                            <div className=\"text-2xl font-bold text-green-600\">\n                              {stats.processing_stats?.content_segments || 0}\n                            </div>\n                            <div className=\"text-sm text-green-800\">Content Segments</div>\n                          </div>\n\n                          <div className=\"bg-gradient-to-br from-red-50 to-red-100 rounded-lg p-4 text-center\">\n                            <div className=\"text-2xl font-bold text-red-600\">\n                              {stats.processing_stats?.ad_segments_count || 0}\n                            </div>\n                            <div className=\"text-sm text-red-800\">Ad Segments Removed</div>\n                          </div>\n                        </div>\n                      </div>\n\n                      {/* Model Performance */}\n                      <div>\n                        <h3 className=\"font-semibold text-gray-900 mb-4 
text-left\">AI Model Performance</h3>\n                        <div className=\"grid grid-cols-1 md:grid-cols-2 gap-6\">\n                          {/* Model Call Status */}\n                          <div className=\"bg-white border rounded-lg p-4\">\n                            <h4 className=\"font-medium text-gray-900 mb-3 text-left\">Processing Status</h4>\n                            <div className=\"space-y-2\">\n                              {Object.entries(stats.processing_stats?.model_call_statuses || {}).map(([status, count]) => (\n                                <div key={status} className=\"flex justify-between items-center\">\n                                  <span className=\"text-sm text-gray-600 capitalize\">{status}</span>\n                                  <span className={`px-2 py-1 rounded-full text-xs font-medium ${\n                                    status === 'success' ? 'bg-green-100 text-green-800' :\n                                    status === 'failed' ? 'bg-red-100 text-red-800' :\n                                    'bg-gray-100 text-gray-800'\n                                  }`}>\n                                    {count}\n                                  </span>\n                                </div>\n                              ))}\n                            </div>\n                          </div>\n\n                          {/* Model Types */}\n                          <div className=\"bg-white border rounded-lg p-4\">\n                            <h4 className=\"font-medium text-gray-900 mb-3 text-left\">Models Used</h4>\n                            <div className=\"space-y-2\">\n                              {Object.entries(stats.processing_stats?.model_types || {}).map(([model, count]) => (\n                                <div key={model} className=\"flex justify-between items-center\">\n                                  <span className=\"text-sm text-gray-600\">{model}</span>\n                                  
<span className=\"px-2 py-1 bg-blue-100 text-blue-800 rounded-full text-xs font-medium\">\n                                    {count} calls\n                                  </span>\n                                </div>\n                              ))}\n                            </div>\n                          </div>\n                        </div>\n                      </div>\n                    </div>\n                  )}\n\n                  {/* Model Calls Tab */}\n                  {activeTab === 'model-calls' && (\n                    <div>\n                      <h3 className=\"font-semibold text-gray-900 mb-4 text-left\">Model Calls ({stats.model_calls?.length || 0})</h3>\n                      <div className=\"bg-white border rounded-lg overflow-hidden\">\n                        <div className=\"overflow-x-auto\">\n                          <table className=\"min-w-full divide-y divide-gray-200\">\n                            <thead className=\"bg-gray-50\">\n                              <tr>\n                                <th className=\"px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider\">ID</th>\n                                <th className=\"px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider\">Model</th>\n                                <th className=\"px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider\">Segment Range</th>\n                                <th className=\"px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider\">Status</th>\n                                <th className=\"px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider\">Timestamp</th>\n                                <th className=\"px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider\">Retries</th>\n                                <th className=\"px-4 py-3 text-left text-xs font-medium text-gray-500 
uppercase tracking-wider\">Actions</th>\n                              </tr>\n                            </thead>\n                            <tbody className=\"bg-white divide-y divide-gray-200\">\n                              {(stats.model_calls || []).map((call) => (\n                                <>\n                                  <tr key={call.id} className=\"hover:bg-gray-50\">\n                                    <td className=\"px-4 py-3 text-sm text-gray-900\">{call.id}</td>\n                                    <td className=\"px-4 py-3 text-sm text-gray-900\">{call.model_name}</td>\n                                    <td className=\"px-4 py-3 text-sm text-gray-600\">{call.segment_range}</td>\n                                    <td className=\"px-4 py-3\">\n                                      <span className={`inline-flex px-2 py-1 text-xs font-medium rounded-full ${\n                                        call.status === 'success' ? 'bg-green-100 text-green-800' :\n                                        call.status === 'failed' ? 
'bg-red-100 text-red-800' :\n                                        'bg-yellow-100 text-yellow-800'\n                                      }`}>\n                                        {call.status}\n                                      </span>\n                                    </td>\n                                    <td className=\"px-4 py-3 text-sm text-gray-600\">{formatTimestamp(call.timestamp)}</td>\n                                    <td className=\"px-4 py-3 text-sm text-gray-600\">{call.retry_attempts}</td>\n                                    <td className=\"px-4 py-3\">\n                                      <button\n                                        onClick={() => toggleModelCallDetails(call.id)}\n                                        className=\"text-blue-600 hover:text-blue-800 text-sm font-medium\"\n                                      >\n                                        {expandedModelCalls.has(call.id) ? 'Hide' : 'Details'}\n                                      </button>\n                                    </td>\n                                  </tr>\n                                  {expandedModelCalls.has(call.id) && (\n                                    <tr className=\"bg-gray-50\">\n                                      <td colSpan={7} className=\"px-4 py-4\">\n                                        <div className=\"space-y-4\">\n                                          {call.prompt && (\n                                            <div>\n                                              <h5 className=\"font-medium text-gray-900 mb-2 text-left\">Prompt:</h5>\n                                              <div className=\"bg-gray-100 p-3 rounded text-sm font-mono whitespace-pre-wrap max-h-40 overflow-y-auto text-left\">\n                                                {call.prompt}\n                                              </div>\n                                            </div>\n                                 
         )}\n                                          {call.error_message && (\n                                            <div>\n                                              <h5 className=\"font-medium text-red-900 mb-2 text-left\">Error Message:</h5>\n                                              <div className=\"bg-red-50 p-3 rounded text-sm font-mono whitespace-pre-wrap text-left\">\n                                                {call.error_message}\n                                              </div>\n                                            </div>\n                                          )}\n                                          {call.response && (\n                                            <div>\n                                              <h5 className=\"font-medium text-gray-900 mb-2 text-left\">Response:</h5>\n                                              <div className=\"bg-gray-100 p-3 rounded text-sm font-mono whitespace-pre-wrap max-h-40 overflow-y-auto text-left\">\n                                                {call.response}\n                                              </div>\n                                            </div>\n                                          )}\n                                        </div>\n                                      </td>\n                                    </tr>\n                                  )}\n                                </>\n                              ))}\n                            </tbody>\n                          </table>\n                        </div>\n                      </div>\n                    </div>\n                  )}\n\n                  {/* Transcript Segments Tab */}\n                  {activeTab === 'transcript' && (\n                    <div>\n                      <h3 className=\"font-semibold text-gray-900 mb-4 text-left\">Transcript Segments ({stats.transcript_segments?.length || 0})</h3>\n                      <div className=\"bg-white 
border rounded-lg overflow-hidden\">\n                        <div className=\"overflow-x-auto\">\n                          <table className=\"min-w-full divide-y divide-gray-200\">\n                            <thead className=\"bg-gray-50\">\n                              <tr>\n                                <th className=\"px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider\">Seq #</th>\n                                <th className=\"px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider\">Time Range</th>\n                                <th className=\"px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider\">Label</th>\n                                <th className=\"px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider\">Text</th>\n                              </tr>\n                            </thead>\n                            <tbody className=\"bg-white divide-y divide-gray-200\">\n                              {(stats.transcript_segments || []).map((segment) => (\n                                <tr key={segment.id} className={`hover:bg-gray-50 ${\n                                  segment.primary_label === 'ad' ? 'bg-red-50' : ''\n                                }`}>\n                                  <td className=\"px-4 py-3 text-sm text-gray-900\">{segment.sequence_num}</td>\n                                  <td className=\"px-4 py-3 text-sm text-gray-600\">\n                                    {segment.start_time}s - {segment.end_time}s\n                                  </td>\n                                  <td className=\"px-4 py-3\">\n                                    <span className={`inline-flex px-2 py-1 text-xs font-medium rounded-full ${\n                                      segment.primary_label === 'ad'\n                                        ? 
'bg-red-100 text-red-800'\n                                        : 'bg-green-100 text-green-800'\n                                    }`}>\n                                      {segment.primary_label === 'ad'\n                                        ? (segment.mixed ? 'Ad (mixed)' : 'Ad')\n                                        : 'Content'}\n                                    </span>\n                                  </td>\n                                  <td className=\"px-4 py-3 text-sm text-gray-900 max-w-md\">\n                                    <div className=\"truncate text-left\" title={segment.text}>\n                                      {segment.text}\n                                    </div>\n                                  </td>\n                                </tr>\n                              ))}\n                            </tbody>\n                          </table>\n                        </div>\n                      </div>\n                    </div>\n                  )}\n\n                  {/* Identifications Tab */}\n                  {activeTab === 'identifications' && (\n                    <div>\n                      <h3 className=\"font-semibold text-gray-900 mb-4 text-left\">Identifications ({stats.identifications?.length || 0})</h3>\n                      <div className=\"bg-white border rounded-lg overflow-hidden\">\n                        <div className=\"overflow-x-auto\">\n                          <table className=\"min-w-full divide-y divide-gray-200\">\n                            <thead className=\"bg-gray-50\">\n                              <tr>\n                                <th className=\"px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider\">ID</th>\n                                <th className=\"px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider\">Segment ID</th>\n                                <th className=\"px-4 py-3 text-left text-xs 
font-medium text-gray-500 uppercase tracking-wider\">Time Range</th>\n                                <th className=\"px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider\">Label</th>\n                                <th className=\"px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider\">Confidence</th>\n                                <th className=\"px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider\">Model Call</th>\n                                <th className=\"px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider\">Text</th>\n                              </tr>\n                            </thead>\n                            <tbody className=\"bg-white divide-y divide-gray-200\">\n                              {(stats.identifications || []).map((identification) => (\n                                <tr key={identification.id} className={`hover:bg-gray-50 ${\n                                  identification.label === 'ad' ? 'bg-red-50' : ''\n                                }`}>\n                                  <td className=\"px-4 py-3 text-sm text-gray-900\">{identification.id}</td>\n                                  <td className=\"px-4 py-3 text-sm text-gray-600\">{identification.transcript_segment_id}</td>\n                                  <td className=\"px-4 py-3 text-sm text-gray-600\">\n                                    {identification.segment_start_time}s - {identification.segment_end_time}s\n                                  </td>\n                                  <td className=\"px-4 py-3\">\n                                    <span className={`inline-flex px-2 py-1 text-xs font-medium rounded-full ${\n                                      identification.label === 'ad'\n                                        ? 
'bg-red-100 text-red-800'\n                                        : 'bg-green-100 text-green-800'\n                                    }`}>\n                                      {identification.label === 'ad'\n                                        ? (identification.mixed ? 'ad (mixed)' : 'ad')\n                                        : identification.label}\n                                    </span>\n                                  </td>\n                                  <td className=\"px-4 py-3 text-sm text-gray-600\">\n                                    {identification.confidence != null ? identification.confidence.toFixed(2) : 'N/A'}\n                                  </td>\n                                  <td className=\"px-4 py-3 text-sm text-gray-600\">{identification.model_call_id}</td>\n                                  <td className=\"px-4 py-3 text-sm text-gray-900 max-w-md\">\n                                    <div className=\"truncate text-left\" title={identification.segment_text}>\n                                      {identification.segment_text}\n                                    </div>\n                                  </td>\n                                </tr>\n                              ))}\n                            </tbody>\n                          </table>\n                        </div>\n                      </div>\n                    </div>\n                  )}\n                </>\n              ) : null}\n            </div>\n          </div>\n        </div>\n      )}\n    </>\n  );\n}\n"
  },
  {
    "path": "frontend/src/components/ReprocessButton.tsx",
    "content": "import { useState } from 'react';\nimport { useQueryClient } from '@tanstack/react-query';\nimport { feedsApi } from '../services/api';\n\ninterface ReprocessButtonProps {\n  episodeGuid: string;\n  isWhitelisted: boolean;\n  feedId?: number;\n  canModifyEpisodes?: boolean;\n  className?: string;\n  onReprocessStart?: () => void;\n}\n\nexport default function ReprocessButton({\n  episodeGuid,\n  isWhitelisted,\n  feedId,\n  canModifyEpisodes = true,\n  className = '',\n  onReprocessStart\n}: ReprocessButtonProps) {\n  const [isReprocessing, setIsReprocessing] = useState(false);\n  const [error, setError] = useState<string | null>(null);\n  const [showModal, setShowModal] = useState(false);\n  const queryClient = useQueryClient();\n\n  const handleReprocessClick = async () => {\n    if (!isWhitelisted) {\n      setError('Post must be whitelisted before reprocessing');\n      return;\n    }\n\n    setShowModal(true);\n  };\n\n  const handleConfirmReprocess = async () => {\n    setShowModal(false);\n    setIsReprocessing(true);\n    setError(null);\n\n    try {\n      const response = await feedsApi.reprocessPost(episodeGuid);\n\n      if (response.status === 'started') {\n        // Notify parent component that reprocessing started\n        onReprocessStart?.();\n\n        // Invalidate queries to refresh the UI\n        if (feedId) {\n          queryClient.invalidateQueries({ queryKey: ['episodes', feedId] });\n        }\n        queryClient.invalidateQueries({ queryKey: ['episode-stats', episodeGuid] });\n      } else {\n        setError(response.message || 'Failed to start reprocessing');\n      }\n    } catch (err: unknown) {\n      console.error('Error starting reprocessing:', err);\n      const errorMessage = err && typeof err === 'object' && 'response' in err\n        ? 
(err as { response?: { data?: { message?: string } } }).response?.data?.message || 'Failed to start reprocessing'\n        : 'Failed to start reprocessing';\n      setError(errorMessage);\n    } finally {\n      setIsReprocessing(false);\n    }\n  };\n\n  if (!isWhitelisted || !canModifyEpisodes) {\n    return null;\n  }\n\n  return (\n    <div className={`${className}`}>\n      <button\n        onClick={handleReprocessClick}\n        disabled={isReprocessing}\n        className={`px-3 py-1 text-xs rounded font-medium transition-colors border ${\n          isReprocessing\n            ? 'bg-gray-500 text-white cursor-wait border-gray-500'\n            : 'bg-white text-gray-700 border-gray-300 hover:bg-gray-50 hover:border-gray-400 hover:text-gray-900'\n        }`}\n        title={\n          isReprocessing\n            ? 'Clearing data and reprocessing...'\n            : 'Clear all processing data and start fresh processing'\n        }\n      >\n        {isReprocessing ? (\n          '⏳ Reprocessing...'\n        ) : (\n          'Reprocess'\n        )}\n      </button>\n\n      {error && (\n        <div className=\"text-xs text-red-600 mt-1\">\n          {error}\n        </div>\n      )}\n\n      {/* Confirmation Modal */}\n      {showModal && (\n        <div className=\"fixed inset-0 bg-black bg-opacity-50 flex items-center justify-center z-50 p-4\">\n          <div className=\"bg-white rounded-lg max-w-md w-full overflow-hidden\">\n            {/* Header */}\n            <div className=\"flex items-center justify-between p-6 border-b\">\n              <h2 className=\"text-xl font-bold text-gray-900\">Confirm Reprocess</h2>\n              <button\n                onClick={() => setShowModal(false)}\n                className=\"p-2 text-gray-400 hover:text-gray-600 rounded-lg hover:bg-gray-100\"\n              >\n                <svg className=\"w-6 h-6\" fill=\"none\" stroke=\"currentColor\" viewBox=\"0 0 24 24\">\n                  <path strokeLinecap=\"round\" 
strokeLinejoin=\"round\" strokeWidth={2} d=\"M6 18L18 6M6 6l12 12\" />\n                </svg>\n              </button>\n            </div>\n\n            {/* Content */}\n            <div className=\"p-6\">\n              <p className=\"text-gray-700 mb-6\">\n                Are you sure you want to reprocess this episode? This will delete the existing processed data and start fresh processing.\n              </p>\n\n              {/* Action Buttons */}\n              <div className=\"flex gap-3 justify-end\">\n                <button\n                  onClick={() => setShowModal(false)}\n                  className=\"px-4 py-2 text-sm font-medium text-gray-700 bg-white border border-gray-300 rounded-md hover:bg-gray-50 hover:border-gray-400 transition-colors\"\n                >\n                  Cancel\n                </button>\n                <button\n                  onClick={handleConfirmReprocess}\n                  className=\"px-4 py-2 text-sm font-medium text-white bg-orange-600 rounded-md hover:bg-orange-700 transition-colors\"\n                >\n                  Reprocess Episode\n                </button>\n              </div>\n            </div>\n          </div>\n        </div>\n      )}\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/components/config/ConfigContext.tsx",
    "content": "import { createContext, useContext } from 'react';\nimport type { UseConfigStateReturn } from '../../hooks/useConfigState';\n\nexport type ConfigTabId = 'default' | 'advanced' | 'users' | 'discord';\nexport type AdvancedSubtab = 'llm' | 'whisper' | 'processing' | 'output' | 'app';\n\nexport interface ConfigContextValue extends UseConfigStateReturn {\n  activeTab: ConfigTabId;\n  setActiveTab: (tab: ConfigTabId) => void;\n  activeSubtab: AdvancedSubtab;\n  setActiveSubtab: (subtab: AdvancedSubtab) => void;\n  isAdmin: boolean;\n  showSecurityControls: boolean;\n}\n\nexport const ConfigContext = createContext<ConfigContextValue | null>(null);\n\nexport function useConfigContext(): ConfigContextValue {\n  const context = useContext(ConfigContext);\n  if (!context) {\n    throw new Error('useConfigContext must be used within ConfigProvider');\n  }\n  return context;\n}\n"
  },
  {
    "path": "frontend/src/components/config/ConfigTabs.tsx",
    "content": "import { useMemo, useEffect, useCallback } from 'react';\nimport { useSearchParams } from 'react-router-dom';\nimport { useAuth } from '../../contexts/AuthContext';\nimport useConfigState from '../../hooks/useConfigState';\nimport { ConfigContext, type ConfigTabId, type AdvancedSubtab } from './ConfigContext';\nimport { EnvOverrideWarningModal } from './shared';\nimport DefaultTab from './tabs/DefaultTab';\nimport AdvancedTab from './tabs/AdvancedTab';\nimport UserManagementTab from './tabs/UserManagementTab';\nimport DiscordTab from './tabs/DiscordTab';\n\nconst TABS: { id: ConfigTabId; label: string; adminOnly?: boolean }[] = [\n  { id: 'default', label: 'Default' },\n  { id: 'advanced', label: 'Advanced' },\n  { id: 'users', label: 'User Management', adminOnly: true },\n  { id: 'discord', label: 'Discord', adminOnly: true },\n];\n\nexport default function ConfigTabs() {\n  const [searchParams, setSearchParams] = useSearchParams();\n  const { user, requireAuth } = useAuth();\n  const configState = useConfigState();\n\n  const showSecurityControls = requireAuth && !!user;\n  const isAdmin = !requireAuth || (showSecurityControls && user?.role === 'admin');\n\n  // Get tab from URL or default\n  const activeTab = useMemo<ConfigTabId>(() => {\n    const urlTab = searchParams.get('tab') as ConfigTabId | null;\n    if (urlTab && TABS.some((t) => t.id === urlTab)) {\n      // Check admin-only tabs\n      const tab = TABS.find((t) => t.id === urlTab);\n      if (tab?.adminOnly && !isAdmin) {\n        return 'default';\n      }\n      if (urlTab === 'users' && !requireAuth) {\n        return 'default';\n      }\n      return urlTab;\n    }\n    return 'default';\n  }, [searchParams, isAdmin, requireAuth]);\n\n  const activeSubtab = useMemo<AdvancedSubtab>(() => {\n    const urlSubtab = searchParams.get('section') as AdvancedSubtab | null;\n    if (urlSubtab && ['llm', 'whisper', 'processing', 'output', 'app'].includes(urlSubtab)) {\n      return 
urlSubtab;\n    }\n    return 'llm';\n  }, [searchParams]);\n\n  const setActiveTab = useCallback((tab: ConfigTabId) => {\n    setSearchParams((prev) => {\n      const newParams = new URLSearchParams(prev);\n      newParams.set('tab', tab);\n      if (tab !== 'advanced') {\n        newParams.delete('section');\n      }\n      return newParams;\n    }, { replace: true });\n  }, [setSearchParams]);\n\n  const setActiveSubtab = useCallback((subtab: AdvancedSubtab) => {\n    setSearchParams((prev) => {\n      const newParams = new URLSearchParams(prev);\n      newParams.set('section', subtab);\n      return newParams;\n    }, { replace: true });\n  }, [setSearchParams]);\n\n  // Redirect if on admin-only tab without permission\n  useEffect(() => {\n    const tab = TABS.find((t) => t.id === activeTab);\n    if (tab?.adminOnly && !isAdmin) {\n      setActiveTab('default');\n    }\n  }, [isAdmin, activeTab, setActiveTab]);\n\n  const contextValue = useMemo(\n    () => ({\n      ...configState,\n      activeTab,\n      setActiveTab,\n      activeSubtab,\n      setActiveSubtab,\n      isAdmin,\n      showSecurityControls,\n    }),\n    [configState, activeTab, setActiveTab, activeSubtab, setActiveSubtab, isAdmin, showSecurityControls]\n  );\n\n  const visibleTabs = TABS.filter((tab) => {\n    if (tab.id === 'users' && !requireAuth) return false;\n    return !tab.adminOnly || isAdmin;\n  });\n\n  if (configState.isLoading || !configState.pending) {\n    return <div className=\"text-sm text-gray-700\">Loading configuration...</div>;\n  }\n\n  return (\n    <ConfigContext.Provider value={contextValue}>\n      <div className=\"space-y-6\">\n        <div className=\"flex items-center justify-between\">\n          <h2 className=\"text-lg font-semibold text-gray-900\">Configuration</h2>\n        </div>\n\n        {/* Tab Navigation */}\n        <div className=\"border-b border-gray-200 overflow-x-auto\">\n          <nav className=\"flex space-x-8 min-w-max\" aria-label=\"Config 
tabs\">\n            {visibleTabs.map((tab) => (\n              <button\n                key={tab.id}\n                onClick={() => setActiveTab(tab.id)}\n                className={`py-3 px-1 border-b-2 font-medium text-sm whitespace-nowrap ${\n                  activeTab === tab.id\n                    ? 'border-indigo-500 text-indigo-600'\n                    : 'border-transparent text-gray-500 hover:text-gray-700 hover:border-gray-300'\n                }`}\n              >\n                {tab.label}\n              </button>\n            ))}\n          </nav>\n        </div>\n\n        {/* Tab Content */}\n        <div className=\"mt-4\">\n          {activeTab === 'default' && <DefaultTab />}\n          {activeTab === 'advanced' && <AdvancedTab />}\n          {activeTab === 'users' && isAdmin && <UserManagementTab />}\n          {activeTab === 'discord' && isAdmin && <DiscordTab />}\n        </div>\n\n        {/* Env Warning Modal */}\n        {configState.showEnvWarning && configState.envWarningPaths.length > 0 && (\n          <EnvOverrideWarningModal\n            paths={configState.envWarningPaths}\n            overrides={configState.envOverrides}\n            onCancel={configState.handleDismissEnvWarning}\n            onConfirm={configState.handleConfirmEnvWarning}\n          />\n        )}\n\n        {/* Extra padding to prevent audio player overlay from obscuring bottom settings */}\n        <div className=\"h-24\"></div>\n      </div>\n    </ConfigContext.Provider>\n  );\n}\n"
  },
  {
    "path": "frontend/src/components/config/index.ts",
    "content": "export { default as ConfigTabs } from './ConfigTabs';\nexport { ConfigContext, useConfigContext } from './ConfigContext';\nexport type { ConfigTabId, AdvancedSubtab, ConfigContextValue } from './ConfigContext';\n\n// Re-export tabs\nexport * from './tabs';\n\n// Re-export sections\nexport * from './sections';\n\n// Re-export shared components\nexport * from './shared';\n"
  },
  {
    "path": "frontend/src/components/config/sections/AppSection.tsx",
    "content": "import { useConfigContext } from '../ConfigContext';\nimport { Section, Field, SaveButton } from '../shared';\n\nexport default function AppSection() {\n  const { pending, setField, handleSave, isSaving } = useConfigContext();\n\n  if (!pending) return null;\n\n  return (\n    <div className=\"space-y-6\">\n      <Section title=\"App\">\n        <div className=\"grid grid-cols-1 md:grid-cols-2 gap-3\">\n          <Field label=\"Feed Refresh Background Interval (min)\">\n            <input\n              className=\"input\"\n              type=\"number\"\n              value={pending?.app?.background_update_interval_minute ?? ''}\n              onChange={(e) =>\n                setField(\n                  ['app', 'background_update_interval_minute'],\n                  e.target.value === '' ? null : Number(e.target.value)\n                )\n              }\n            />\n          </Field>\n          <Field label=\"Cleanup Retention (days)\">\n            <input\n              className=\"input\"\n              type=\"number\"\n              min={0}\n              value={pending?.app?.post_cleanup_retention_days ?? ''}\n              onChange={(e) =>\n                setField(\n                  ['app', 'post_cleanup_retention_days'],\n                  e.target.value === '' ? 
null : Number(e.target.value)\n                )\n              }\n            />\n          </Field>\n          <Field label=\"Auto-whitelist new episodes\">\n            <input\n              type=\"checkbox\"\n              checked={!!pending?.app?.automatically_whitelist_new_episodes}\n              onChange={(e) =>\n                setField(['app', 'automatically_whitelist_new_episodes'], e.target.checked)\n              }\n            />\n          </Field>\n          <Field label=\"List all episodes in RSS and queue processing on download attempt if not previously whitelisted\">\n            <label className=\"flex items-center gap-2 text-sm text-gray-700\">\n              <input\n                type=\"checkbox\"\n                checked={!!pending?.app?.autoprocess_on_download}\n                onChange={(e) => setField(['app', 'autoprocess_on_download'], e.target.checked)}\n              />\n            </label>\n          </Field>\n          <Field label=\"Number of episodes to whitelist from new feed archive\">\n            <input\n              className=\"input\"\n              type=\"number\"\n              value={pending?.app?.number_of_episodes_to_whitelist_from_archive_of_new_feed ?? 
1}\n              onChange={(e) =>\n                setField(\n                  ['app', 'number_of_episodes_to_whitelist_from_archive_of_new_feed'],\n                  Number(e.target.value)\n                )\n              }\n            />\n          </Field>\n          <div className=\"col-span-1 md:col-span-2 flex items-center gap-3\">\n            <label className=\"flex items-center gap-2 text-sm text-gray-700 font-medium\">\n              <input\n                type=\"checkbox\"\n                checked={!!pending?.app?.enable_public_landing_page}\n                onChange={(e) => setField(['app', 'enable_public_landing_page'], e.target.checked)}\n              />\n              Enable the public landing page\n            </label>\n          </div>\n        </div>\n      </Section>\n\n      <SaveButton onSave={handleSave} isPending={isSaving} />\n\n      <style>{`.input{width:100%;padding:0.5rem;border:1px solid #e5e7eb;border-radius:0.375rem;font-size:0.875rem}`}</style>\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/components/config/sections/LLMSection.tsx",
    "content": "import { useState } from 'react';\nimport { toast } from 'react-hot-toast';\nimport { configApi } from '../../../services/api';\nimport { useConfigContext } from '../ConfigContext';\nimport { Section, Field, SaveButton, TestButton } from '../shared';\nimport type { LLMConfig } from '../../../types';\n\nconst LLM_MODEL_ALIASES: string[] = [\n  'openai/gpt-4',\n  'openai/gpt-4o',\n  'anthropic/claude-3.5-sonnet',\n  'anthropic/claude-3.5-haiku',\n  'gemini/gemini-3-flash-preview',\n  'gemini/gemini-2.0-flash',\n  'gemini/gemini-1.5-pro',\n  'gemini/gemini-1.5-flash',\n  'groq/openai/gpt-oss-120b',\n];\n\nexport default function LLMSection() {\n  const { pending, setField, getEnvHint, handleSave, isSaving } = useConfigContext();\n  const [showBaseUrlInfo, setShowBaseUrlInfo] = useState(false);\n\n  if (!pending) return null;\n\n  const handleTestLLM = () => {\n    toast.promise(configApi.testLLM({ llm: pending.llm as LLMConfig }), {\n      loading: 'Testing LLM connection...',\n      success: (res: { ok: boolean; message?: string }) => res?.message || 'LLM connection OK',\n      error: (err: unknown) => {\n        const e = err as {\n          response?: { data?: { error?: string; message?: string } };\n          message?: string;\n        };\n        return (\n          e?.response?.data?.error ||\n          e?.response?.data?.message ||\n          e?.message ||\n          'LLM connection failed'\n        );\n      },\n    });\n  };\n\n  return (\n    <div className=\"space-y-6\">\n      <Section title=\"LLM\">\n        <Field label=\"API Key\" envMeta={getEnvHint('llm.llm_api_key')}>\n          <input\n            className=\"input\"\n            type=\"text\"\n            placeholder={pending?.llm?.llm_api_key_preview || ''}\n            value={pending?.llm?.llm_api_key || ''}\n            onChange={(e) => setField(['llm', 'llm_api_key'], e.target.value)}\n          />\n        </Field>\n\n        <label className=\"flex items-start justify-between 
gap-3\">\n          <div className=\"w-60\">\n            <div className=\"flex items-center gap-2\">\n              <span className=\"text-sm text-gray-700\">OpenAI Base URL</span>\n              <button\n                type=\"button\"\n                className=\"px-2 py-1 text-xs border border-gray-300 rounded hover:bg-gray-50\"\n                onClick={() => setShowBaseUrlInfo((v) => !v)}\n                title=\"When is this used?\"\n              >\n                ⓘ\n              </button>\n            </div>\n            {getEnvHint('llm.openai_base_url')?.env_var && (\n              <code className=\"mt-1 block text-xs text-gray-500 font-mono\">\n                {getEnvHint('llm.openai_base_url')?.env_var}\n              </code>\n            )}\n          </div>\n          <div className=\"flex-1 space-y-2\">\n            <input\n              className=\"input\"\n              type=\"text\"\n              placeholder=\"https://api.openai.com/v1\"\n              value={pending?.llm?.openai_base_url || ''}\n              onChange={(e) => setField(['llm', 'openai_base_url'], e.target.value)}\n            />\n            {showBaseUrlInfo && <BaseUrlInfoBox />}\n          </div>\n        </label>\n\n        <div className=\"grid grid-cols-1 md:grid-cols-2 gap-3\">\n          <Field label=\"Model\" envMeta={getEnvHint('llm.llm_model')}>\n            <div className=\"relative\">\n              <input\n                list=\"llm-model-datalist\"\n                className=\"input\"\n                type=\"text\"\n                value={pending?.llm?.llm_model ?? ''}\n                onChange={(e) => setField(['llm', 'llm_model'], e.target.value)}\n                placeholder=\"e.g. groq/openai/gpt-oss-120b\"\n              />\n            </div>\n          </Field>\n          <Field label=\"OpenAI Timeout (sec)\">\n            <input\n              className=\"input\"\n              type=\"number\"\n              value={pending?.llm?.openai_timeout ?? 
300}\n              onChange={(e) => setField(['llm', 'openai_timeout'], Number(e.target.value))}\n            />\n          </Field>\n          <Field label=\"OpenAI Max Tokens\">\n            <input\n              className=\"input\"\n              type=\"number\"\n              value={pending?.llm?.openai_max_tokens ?? 4096}\n              onChange={(e) => setField(['llm', 'openai_max_tokens'], Number(e.target.value))}\n            />\n          </Field>\n          <Field label=\"Max Concurrent LLM Calls\">\n            <input\n              className=\"input\"\n              type=\"number\"\n              value={pending?.llm?.llm_max_concurrent_calls ?? 3}\n              onChange={(e) => setField(['llm', 'llm_max_concurrent_calls'], Number(e.target.value))}\n            />\n          </Field>\n          <Field label=\"Max Retry Attempts\">\n            <input\n              className=\"input\"\n              type=\"number\"\n              value={pending?.llm?.llm_max_retry_attempts ?? 5}\n              onChange={(e) => setField(['llm', 'llm_max_retry_attempts'], Number(e.target.value))}\n            />\n          </Field>\n          <Field label=\"Enable Token Rate Limiting\">\n            <input\n              type=\"checkbox\"\n              checked={!!pending?.llm?.llm_enable_token_rate_limiting}\n              onChange={(e) => setField(['llm', 'llm_enable_token_rate_limiting'], e.target.checked)}\n            />\n          </Field>\n          <Field label=\"Enable Boundary Refinement\" hint=\"LLM-based ad boundary refinement for improved precision\">\n            <input\n              type=\"checkbox\"\n              checked={pending?.llm?.enable_boundary_refinement ?? 
true}\n              onChange={(e) => setField(['llm', 'enable_boundary_refinement'], e.target.checked)}\n            />\n          </Field>\n          <Field\n            label=\"Enable Word-Level Boundary Refiner\"\n            hint=\"Uses a word-position heuristic to estimate the ad start time within a transcript segment\"\n          >\n            <input\n              type=\"checkbox\"\n              checked={!!pending?.llm?.enable_word_level_boundary_refinder}\n              onChange={(e) =>\n                setField(['llm', 'enable_word_level_boundary_refinder'], e.target.checked)\n              }\n            />\n          </Field>\n          <Field label=\"Max Input Tokens Per Call (optional)\">\n            <input\n              className=\"input\"\n              type=\"number\"\n              value={pending?.llm?.llm_max_input_tokens_per_call ?? ''}\n              onChange={(e) =>\n                setField(\n                  ['llm', 'llm_max_input_tokens_per_call'],\n                  e.target.value === '' ? null : Number(e.target.value)\n                )\n              }\n            />\n          </Field>\n          <Field label=\"Max Input Tokens Per Minute (optional)\">\n            <input\n              className=\"input\"\n              type=\"number\"\n              value={pending?.llm?.llm_max_input_tokens_per_minute ?? ''}\n              onChange={(e) =>\n                setField(\n                  ['llm', 'llm_max_input_tokens_per_minute'],\n                  e.target.value === '' ? 
null : Number(e.target.value)\n                )\n              }\n            />\n          </Field>\n        </div>\n\n        <TestButton onClick={handleTestLLM} label=\"Test LLM\" />\n      </Section>\n\n      <SaveButton onSave={handleSave} isPending={isSaving} />\n\n      {/* Datalist for model suggestions */}\n      <datalist id=\"llm-model-datalist\">\n        {LLM_MODEL_ALIASES.map((m) => (\n          <option key={m} value={m} />\n        ))}\n      </datalist>\n\n      <style>{`.input{width:100%;padding:0.5rem;border:1px solid #e5e7eb;border-radius:0.375rem;font-size:0.875rem}`}</style>\n    </div>\n  );\n}\n\nfunction BaseUrlInfoBox() {\n  return (\n    <div className=\"text-xs text-gray-700 bg-blue-50 border border-blue-200 rounded p-3 space-y-2\">\n      <p className=\"font-semibold\">When is Base URL used?</p>\n      <p>\n        The Base URL is <strong>only used for models without a provider prefix</strong>. LiteLLM\n        automatically routes provider-prefixed models to their respective APIs.\n      </p>\n      <div className=\"space-y-1\">\n        <p className=\"font-medium\">✅ Base URL is IGNORED for:</p>\n        <ul className=\"list-disc pl-5 space-y-0.5\">\n          <li>\n            <code className=\"bg-white px-1 rounded\">groq/openai/gpt-oss-120b</code> → Groq API\n          </li>\n          <li>\n            <code className=\"bg-white px-1 rounded\">anthropic/claude-3.5-sonnet</code> → Anthropic\n            API\n          </li>\n          <li>\n            <code className=\"bg-white px-1 rounded\">gemini/gemini-3-flash-preview</code> → Google API\n          </li>\n          <li>\n            <code className=\"bg-white px-1 rounded\">gemini/gemini-2.0-flash</code> → Google API\n          </li>\n        </ul>\n      </div>\n      <div className=\"space-y-1\">\n        <p className=\"font-medium\">⚙️ Base URL is USED for:</p>\n        <ul className=\"list-disc pl-5 space-y-0.5\">\n          <li>\n            Unprefixed models like <code 
className=\"bg-white px-1 rounded\">gpt-4o</code>\n          </li>\n          <li>Self-hosted OpenAI-compatible endpoints</li>\n          <li>LiteLLM proxy servers or local LLMs</li>\n        </ul>\n      </div>\n      <p className=\"italic text-gray-600\">For the default Groq setup, you don't need to set this.</p>\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/components/config/sections/OutputSection.tsx",
    "content": "import { useConfigContext } from '../ConfigContext';\nimport { Section, Field, SaveButton } from '../shared';\n\nexport default function OutputSection() {\n  const { pending, setField, handleSave, isSaving } = useConfigContext();\n\n  if (!pending) return null;\n\n  return (\n    <div className=\"space-y-6\">\n      <Section title=\"Output\">\n        <div className=\"grid grid-cols-1 md:grid-cols-2 gap-3\">\n          <Field label=\"Fade (ms)\">\n            <input\n              className=\"input\"\n              type=\"number\"\n              value={pending?.output?.fade_ms ?? 3000}\n              onChange={(e) => setField(['output', 'fade_ms'], Number(e.target.value))}\n            />\n          </Field>\n          <Field label=\"Min Segment Separation (sec)\">\n            <input\n              className=\"input\"\n              type=\"number\"\n              value={pending?.output?.min_ad_segement_separation_seconds ?? 60}\n              onChange={(e) =>\n                setField(['output', 'min_ad_segement_separation_seconds'], Number(e.target.value))\n              }\n            />\n          </Field>\n          <Field label=\"Min Segment Length (sec)\">\n            <input\n              className=\"input\"\n              type=\"number\"\n              value={pending?.output?.min_ad_segment_length_seconds ?? 14}\n              onChange={(e) =>\n                setField(['output', 'min_ad_segment_length_seconds'], Number(e.target.value))\n              }\n            />\n          </Field>\n          <Field label=\"Min Confidence\">\n            <input\n              className=\"input\"\n              type=\"number\"\n              step=\"0.01\"\n              value={pending?.output?.min_confidence ?? 
0.8}\n              onChange={(e) => setField(['output', 'min_confidence'], Number(e.target.value))}\n            />\n          </Field>\n        </div>\n      </Section>\n\n      <SaveButton onSave={handleSave} isPending={isSaving} />\n\n      <style>{`.input{width:100%;padding:0.5rem;border:1px solid #e5e7eb;border-radius:0.375rem;font-size:0.875rem}`}</style>\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/components/config/sections/ProcessingSection.tsx",
    "content": "import { useConfigContext } from '../ConfigContext';\nimport { Section, Field, SaveButton } from '../shared';\n\nexport default function ProcessingSection() {\n  const { pending, setField, handleSave, isSaving } = useConfigContext();\n\n  if (!pending) return null;\n\n  return (\n    <div className=\"space-y-6\">\n      <Section title=\"Processing\">\n        <Field label=\"Number of Segments per Prompt\">\n          <input\n            className=\"input\"\n            type=\"number\"\n            value={pending?.processing?.num_segments_to_input_to_prompt ?? 30}\n            onChange={(e) =>\n              setField(['processing', 'num_segments_to_input_to_prompt'], Number(e.target.value))\n            }\n          />\n        </Field>\n      </Section>\n\n      <SaveButton onSave={handleSave} isPending={isSaving} />\n\n      <style>{`.input{width:100%;padding:0.5rem;border:1px solid #e5e7eb;border-radius:0.375rem;font-size:0.875rem}`}</style>\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/components/config/sections/WhisperSection.tsx",
    "content": "import { useMemo } from 'react';\nimport { toast } from 'react-hot-toast';\nimport { configApi } from '../../../services/api';\nimport { useConfigContext } from '../ConfigContext';\nimport { Section, Field, SaveButton, TestButton } from '../shared';\nimport type { WhisperConfig } from '../../../types';\n\nexport default function WhisperSection() {\n  const {\n    pending,\n    setField,\n    getEnvHint,\n    handleSave,\n    isSaving,\n    localWhisperAvailable,\n    handleWhisperTypeChange,\n    getWhisperApiKey,\n    envOverrides,\n  } = useConfigContext();\n\n  const whisperApiKeyPreview =\n    pending?.whisper?.whisper_type === 'remote' || pending?.whisper?.whisper_type === 'groq'\n      ? (pending.whisper as { api_key_preview?: string }).api_key_preview\n      : undefined;\n\n  const whisperApiKeyPlaceholder = useMemo(() => {\n    if (pending?.whisper?.whisper_type === 'remote' || pending?.whisper?.whisper_type === 'groq') {\n      if (whisperApiKeyPreview) {\n        return whisperApiKeyPreview;\n      }\n      const override = envOverrides['whisper.api_key'];\n      if (override) {\n        return override.value_preview || override.value || '';\n      }\n    }\n    return '';\n  }, [whisperApiKeyPreview, pending?.whisper?.whisper_type, envOverrides]);\n\n  if (!pending) return null;\n\n  const handleTestWhisper = () => {\n    toast.promise(configApi.testWhisper({ whisper: pending.whisper as WhisperConfig }), {\n      loading: 'Testing Whisper...',\n      success: (res: { ok: boolean; message?: string }) => res?.message || 'Whisper OK',\n      error: (err: unknown) => {\n        const e = err as {\n          response?: { data?: { error?: string; message?: string } };\n          message?: string;\n        };\n        return (\n          e?.response?.data?.error ||\n          e?.response?.data?.message ||\n          e?.message ||\n          'Whisper test failed'\n        );\n      },\n    });\n  };\n\n  const whisperType = 
pending?.whisper?.whisper_type ?? (localWhisperAvailable === false ? 'remote' : 'local');\n\n  return (\n    <div className=\"space-y-6\">\n      <Section title=\"Whisper\">\n        <Field label=\"Type\" envMeta={getEnvHint('whisper.whisper_type')}>\n          <select\n            className=\"input\"\n            value={whisperType}\n            onChange={(e) => handleWhisperTypeChange(e.target.value as 'local' | 'remote' | 'groq')}\n          >\n            {localWhisperAvailable !== false && <option value=\"local\">local</option>}\n            <option value=\"remote\">remote</option>\n            <option value=\"groq\">groq</option>\n          </select>\n        </Field>\n\n        {/* Local Whisper Options */}\n        {pending?.whisper?.whisper_type === 'local' && (\n          <Field\n            label=\"Local Model\"\n            envMeta={getEnvHint('whisper.model', { env_var: 'WHISPER_LOCAL_MODEL' })}\n          >\n            <input\n              className=\"input\"\n              type=\"text\"\n              value={(pending?.whisper as { model?: string })?.model || 'base'}\n              onChange={(e) => setField(['whisper', 'model'], e.target.value)}\n            />\n          </Field>\n        )}\n\n        {/* Remote Whisper Options */}\n        {pending?.whisper?.whisper_type === 'remote' && (\n          <div className=\"grid grid-cols-1 md:grid-cols-2 gap-3\">\n            <Field\n              label=\"API Key\"\n              envMeta={getEnvHint('whisper.api_key', { env_var: 'WHISPER_REMOTE_API_KEY' })}\n            >\n              <input\n                className=\"input\"\n                type=\"text\"\n                placeholder={whisperApiKeyPlaceholder}\n                value={getWhisperApiKey(pending?.whisper)}\n                onChange={(e) => setField(['whisper', 'api_key'], e.target.value)}\n              />\n            </Field>\n            <Field\n              label=\"Remote Model\"\n              envMeta={getEnvHint('whisper.model', 
{ env_var: 'WHISPER_REMOTE_MODEL' })}\n            >\n              <input\n                className=\"input\"\n                type=\"text\"\n                value={(pending?.whisper as { model?: string })?.model || 'whisper-1'}\n                onChange={(e) => setField(['whisper', 'model'], e.target.value)}\n              />\n            </Field>\n            <Field label=\"Base URL\" envMeta={getEnvHint('whisper.base_url')}>\n              <input\n                className=\"input\"\n                type=\"text\"\n                placeholder=\"https://api.openai.com/v1\"\n                value={(pending?.whisper as { base_url?: string })?.base_url || ''}\n                onChange={(e) => setField(['whisper', 'base_url'], e.target.value)}\n              />\n            </Field>\n            <Field label=\"Language\">\n              <input\n                className=\"input\"\n                type=\"text\"\n                value={(pending?.whisper as { language?: string })?.language || 'en'}\n                onChange={(e) => setField(['whisper', 'language'], e.target.value)}\n              />\n            </Field>\n            <Field label=\"Timeout (sec)\" envMeta={getEnvHint('whisper.timeout_sec')}>\n              <input\n                className=\"input\"\n                type=\"number\"\n                value={(pending?.whisper as { timeout_sec?: number })?.timeout_sec ?? 600}\n                onChange={(e) => setField(['whisper', 'timeout_sec'], Number(e.target.value))}\n              />\n            </Field>\n            <Field label=\"Chunk Size (MB)\" envMeta={getEnvHint('whisper.chunksize_mb')}>\n              <input\n                className=\"input\"\n                type=\"number\"\n                value={(pending?.whisper as { chunksize_mb?: number })?.chunksize_mb ?? 
24}\n                onChange={(e) => setField(['whisper', 'chunksize_mb'], Number(e.target.value))}\n              />\n            </Field>\n          </div>\n        )}\n\n        {/* Groq Whisper Options */}\n        {pending?.whisper?.whisper_type === 'groq' && (\n          <div className=\"grid grid-cols-1 md:grid-cols-2 gap-3\">\n            <Field\n              label=\"API Key\"\n              envMeta={getEnvHint('whisper.api_key', { env_var: 'GROQ_API_KEY' })}\n            >\n              <input\n                className=\"input\"\n                type=\"text\"\n                placeholder={whisperApiKeyPlaceholder}\n                value={getWhisperApiKey(pending?.whisper)}\n                onChange={(e) => setField(['whisper', 'api_key'], e.target.value)}\n              />\n            </Field>\n            <Field\n              label=\"Model\"\n              envMeta={getEnvHint('whisper.model', { env_var: 'GROQ_WHISPER_MODEL' })}\n            >\n              <input\n                className=\"input\"\n                type=\"text\"\n                value={(pending?.whisper as { model?: string })?.model || 'whisper-large-v3-turbo'}\n                onChange={(e) => setField(['whisper', 'model'], e.target.value)}\n              />\n            </Field>\n            <Field label=\"Language\">\n              <input\n                className=\"input\"\n                type=\"text\"\n                value={(pending?.whisper as { language?: string })?.language || 'en'}\n                onChange={(e) => setField(['whisper', 'language'], e.target.value)}\n              />\n            </Field>\n            <Field label=\"Max Retries\" envMeta={getEnvHint('whisper.max_retries')}>\n              <input\n                className=\"input\"\n                type=\"number\"\n                value={(pending?.whisper as { max_retries?: number })?.max_retries ?? 
3}\n                onChange={(e) => setField(['whisper', 'max_retries'], Number(e.target.value))}\n              />\n            </Field>\n          </div>\n        )}\n\n        <TestButton onClick={handleTestWhisper} label=\"Test Whisper\" />\n      </Section>\n\n      <SaveButton onSave={handleSave} isPending={isSaving} />\n\n      <style>{`.input{width:100%;padding:0.5rem;border:1px solid #e5e7eb;border-radius:0.375rem;font-size:0.875rem}`}</style>\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/components/config/sections/index.ts",
    "content": "export { default as LLMSection } from './LLMSection';\nexport { default as WhisperSection } from './WhisperSection';\nexport { default as ProcessingSection } from './ProcessingSection';\nexport { default as OutputSection } from './OutputSection';\nexport { default as AppSection } from './AppSection';\n"
  },
  {
    "path": "frontend/src/components/config/shared/ConnectionStatusCard.tsx",
    "content": "interface ConnectionStatusCardProps {\n  title: string;\n  status: 'loading' | 'ok' | 'error';\n  message: string;\n  error?: string;\n  onRetry: () => void;\n}\n\nexport default function ConnectionStatusCard({\n  title,\n  status,\n  message,\n  error,\n  onRetry,\n}: ConnectionStatusCardProps) {\n  const statusColor =\n    status === 'ok'\n      ? 'text-green-700'\n      : status === 'error'\n      ? 'text-red-700'\n      : 'text-gray-600';\n\n  const displayMessage =\n    status === 'loading'\n      ? 'Testing...'\n      : status === 'ok'\n      ? message || `${title} connection OK`\n      : error || `${title} connection failed`;\n\n  return (\n    <div className=\"flex items-start justify-between border rounded p-3\">\n      <div>\n        <div className=\"text-sm font-medium text-gray-900\">{title}</div>\n        <div className={`text-xs ${statusColor}`}>{displayMessage}</div>\n      </div>\n      <button\n        type=\"button\"\n        className=\"text-xs text-indigo-600 hover:underline\"\n        onClick={onRetry}\n      >\n        Retry\n      </button>\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/components/config/shared/EnvOverrideWarningModal.tsx",
    "content": "import type { EnvOverrideMap } from '../../../types';\nimport { ENV_FIELD_LABELS } from './constants';\n\ninterface EnvOverrideWarningModalProps {\n  paths: string[];\n  overrides: EnvOverrideMap;\n  onConfirm: () => void;\n  onCancel: () => void;\n}\n\nexport default function EnvOverrideWarningModal({\n  paths,\n  overrides,\n  onConfirm,\n  onCancel,\n}: EnvOverrideWarningModalProps) {\n  if (!paths.length) {\n    return null;\n  }\n\n  return (\n    <div className=\"fixed inset-0 z-50 flex items-center justify-center bg-black/40 px-4 py-6\">\n      <div className=\"w-full max-w-lg space-y-4 rounded-lg bg-white p-5 shadow-xl\">\n        <div>\n          <h3 className=\"text-base font-semibold text-gray-900\">Environment-managed settings</h3>\n          <p className=\"text-sm text-gray-600\">\n            These fields are controlled by environment variables. Update the referenced variables in your\n            <code className=\"mx-1 font-mono text-xs\">.env</code>\n            (or deployment secrets) to make the change persistent. Your manual change will be saved, but will be overwritten if you modify your environment variables in the future.\n          </p>\n        </div>\n        <ul className=\"space-y-3 text-sm\">\n          {paths.map((path) => {\n            const meta = overrides[path];\n            const label = ENV_FIELD_LABELS[path] ?? path;\n            return (\n              <li key={path} className=\"rounded border border-amber-200 bg-amber-50 p-3\">\n                <div className=\"font-medium text-gray-900\">{label}</div>\n                {meta?.env_var ? 
(\n                  <p className=\"mt-1 text-xs text-gray-700\">\n                    Managed by <code className=\"font-mono\">{meta.env_var}</code>\n                    {meta?.value_preview && (\n                      <span className=\"ml-1 text-gray-600\">({meta.value_preview})</span>\n                    )}\n                    {!meta?.value_preview && meta?.value && (\n                      <span className=\"ml-1 text-gray-600\">({meta.value})</span>\n                    )}\n                  </p>\n                ) : (\n                  <p className=\"mt-1 text-xs text-gray-700\">Managed by deployment environment</p>\n                )}\n              </li>\n            );\n          })}\n        </ul>\n        <div className=\"flex justify-end gap-2\">\n          <button\n            type=\"button\"\n            onClick={onCancel}\n            className=\"rounded border border-gray-300 px-3 py-2 text-sm text-gray-700 hover:bg-gray-50\"\n          >\n            Go back\n          </button>\n          <button\n            type=\"button\"\n            onClick={onConfirm}\n            className=\"rounded bg-indigo-600 px-3 py-2 text-sm font-semibold text-white hover:bg-indigo-700\"\n          >\n            Save anyway\n          </button>\n        </div>\n      </div>\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/components/config/shared/EnvVarHint.tsx",
    "content": "import type { EnvOverrideEntry } from '../../../types';\n\ninterface EnvVarHintProps {\n  meta?: EnvOverrideEntry;\n}\n\nexport default function EnvVarHint({ meta }: EnvVarHintProps) {\n  if (!meta?.env_var) {\n    return null;\n  }\n\n  return (\n    <code className=\"mt-1 block text-xs text-gray-500 font-mono\">{meta.env_var}</code>\n  );\n}\n"
  },
  {
    "path": "frontend/src/components/config/shared/Field.tsx",
    "content": "import type { ReactNode } from 'react';\nimport type { EnvOverrideEntry } from '../../../types';\nimport EnvVarHint from './EnvVarHint';\n\ninterface FieldProps {\n  label: string;\n  children: ReactNode;\n  envMeta?: EnvOverrideEntry;\n  labelWidth?: string;\n  hint?: string;\n}\n\nexport default function Field({\n  label,\n  children,\n  envMeta,\n  labelWidth = 'w-60',\n  hint,\n}: FieldProps) {\n  return (\n    <label className=\"flex items-start justify-between gap-3\">\n      <div className={labelWidth}>\n        <span className=\"block text-sm text-gray-700\">{label}</span>\n        {hint ? <span className=\"block text-xs text-gray-500\">{hint}</span> : null}\n        <EnvVarHint meta={envMeta} />\n      </div>\n      <div className=\"flex-1\">{children}</div>\n    </label>\n  );\n}\n"
  },
  {
    "path": "frontend/src/components/config/shared/SaveButton.tsx",
    "content": "interface SaveButtonProps {\n  onSave: () => void;\n  isPending: boolean;\n  className?: string;\n}\n\nexport default function SaveButton({ onSave, isPending, className = '' }: SaveButtonProps) {\n  return (\n    <div className={`flex items-center justify-end ${className}`}>\n      <button\n        onClick={onSave}\n        className=\"px-3 py-2 text-sm rounded bg-indigo-600 text-white hover:bg-indigo-700 disabled:opacity-60\"\n        disabled={isPending}\n      >\n        {isPending ? 'Saving...' : 'Save Changes'}\n      </button>\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/components/config/shared/Section.tsx",
    "content": "import type { ReactNode } from 'react';\n\ninterface SectionProps {\n  title: string;\n  children: ReactNode;\n  className?: string;\n}\n\nexport default function Section({ title, children, className = '' }: SectionProps) {\n  return (\n    <div className={`bg-white rounded border p-4 ${className}`}>\n      <h3 className=\"text-sm font-semibold text-gray-900 mb-3\">{title}</h3>\n      <div className=\"space-y-3\">{children}</div>\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/components/config/shared/TestButton.tsx",
    "content": "interface TestButtonProps {\n  onClick: () => void;\n  label: string;\n  className?: string;\n}\n\nexport default function TestButton({ onClick, label, className = '' }: TestButtonProps) {\n  return (\n    <div className={`flex justify-center ${className}`}>\n      <button\n        onClick={onClick}\n        className=\"mt-2 px-3 py-2 text-sm rounded bg-indigo-600 text-white hover:bg-indigo-700\"\n      >\n        {label}\n      </button>\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/components/config/shared/constants.ts",
    "content": "export const ENV_FIELD_LABELS: Record<string, string> = {\n  'groq.api_key': 'Groq API Key',\n  'llm.llm_api_key': 'LLM API Key',\n  'llm.llm_model': 'LLM Model',\n  'llm.openai_base_url': 'LLM Base URL',\n  'whisper.whisper_type': 'Whisper Mode',\n  'whisper.api_key': 'Whisper API Key',\n  'whisper.model': 'Whisper Model',\n  'whisper.base_url': 'Whisper Base URL',\n  'whisper.timeout_sec': 'Whisper Timeout (sec)',\n  'whisper.chunksize_mb': 'Whisper Chunk Size (MB)',\n  'whisper.max_retries': 'Whisper Max Retries',\n};\n"
  },
  {
    "path": "frontend/src/components/config/shared/index.ts",
    "content": "export { default as Section } from './Section';\nexport { default as Field } from './Field';\nexport { default as EnvVarHint } from './EnvVarHint';\nexport { default as EnvOverrideWarningModal } from './EnvOverrideWarningModal';\nexport { default as ConnectionStatusCard } from './ConnectionStatusCard';\nexport { default as SaveButton } from './SaveButton';\nexport { default as TestButton } from './TestButton';\nexport { ENV_FIELD_LABELS } from './constants';\n"
  },
  {
    "path": "frontend/src/components/config/tabs/AdvancedTab.tsx",
    "content": "import { useConfigContext, type AdvancedSubtab } from '../ConfigContext';\nimport {\n  LLMSection,\n  WhisperSection,\n  ProcessingSection,\n  OutputSection,\n  AppSection,\n} from '../sections';\n\nconst SUBTABS: { id: AdvancedSubtab; label: string }[] = [\n  { id: 'llm', label: 'LLM' },\n  { id: 'whisper', label: 'Whisper' },\n  { id: 'processing', label: 'Processing' },\n  { id: 'output', label: 'Output' },\n  { id: 'app', label: 'App' },\n];\n\nexport default function AdvancedTab() {\n  const { activeSubtab, setActiveSubtab } = useConfigContext();\n\n  return (\n    <div className=\"space-y-6\">\n      {/* Subtab Navigation */}\n      <div className=\"flex space-x-2 flex-wrap gap-y-2\">\n        {SUBTABS.map((subtab) => (\n          <button\n            key={subtab.id}\n            onClick={() => setActiveSubtab(subtab.id)}\n            className={`px-3 py-1.5 text-sm rounded-md font-medium transition-colors ${\n              activeSubtab === subtab.id\n                ? 'bg-indigo-100 text-indigo-700'\n                : 'text-gray-600 hover:bg-gray-100 hover:text-gray-900'\n            }`}\n          >\n            {subtab.label}\n          </button>\n        ))}\n      </div>\n\n      {/* Subtab Content */}\n      <div>\n        {activeSubtab === 'llm' && <LLMSection />}\n        {activeSubtab === 'whisper' && <WhisperSection />}\n        {activeSubtab === 'processing' && <ProcessingSection />}\n        {activeSubtab === 'output' && <OutputSection />}\n        {activeSubtab === 'app' && <AppSection />}\n      </div>\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/components/config/tabs/DefaultTab.tsx",
    "content": "import { useState } from 'react';\nimport { useConfigContext } from '../ConfigContext';\nimport { Section, Field, ConnectionStatusCard } from '../shared';\nimport type { WhisperConfig, LLMConfig } from '../../../types';\n\nexport default function DefaultTab() {\n  const {\n    pending,\n    updatePending,\n    llmStatus,\n    whisperStatus,\n    probeConnections,\n    getEnvHint,\n    getWhisperApiKey,\n    groqRecommendedModel,\n    groqRecommendedWhisper,\n    applyGroqKey,\n  } = useConfigContext();\n\n  const [showGroqHelp, setShowGroqHelp] = useState(false);\n  const [showGroqPricing, setShowGroqPricing] = useState(false);\n\n  if (!pending) return null;\n\n  const handleGroqKeyChange = (val: string) => {\n    updatePending((prevConfig) => {\n      return {\n        ...prevConfig,\n        llm: {\n          ...(prevConfig.llm as LLMConfig),\n          llm_api_key: val,\n          llm_model: groqRecommendedModel,\n        },\n        whisper: {\n          whisper_type: 'groq',\n          api_key: val,\n          model: groqRecommendedWhisper,\n          language: 'en',\n          max_retries: 3,\n        } as WhisperConfig,\n      };\n    });\n  };\n\n  const handleGroqKeyApply = (key: string) => {\n    if (!key.trim()) return;\n    void applyGroqKey(key.trim());\n  };\n\n  const currentGroqKey =\n    pending?.whisper?.whisper_type === 'groq'\n      ? getWhisperApiKey(pending?.whisper)\n      : pending?.llm?.llm_api_key || '';\n\n  const groqKeyPlaceholder =\n    pending?.whisper?.whisper_type === 'groq'\n      ? 
pending?.whisper?.api_key_preview || ''\n      : pending?.llm?.llm_api_key_preview || '';\n\n  return (\n    <div className=\"space-y-6\">\n      <Section title=\"Connection Status\">\n        <div className=\"grid grid-cols-1 md:grid-cols-2 gap-3\">\n          <ConnectionStatusCard\n            title=\"LLM\"\n            status={llmStatus.status}\n            message={llmStatus.message}\n            error={llmStatus.error}\n            onRetry={() => void probeConnections()}\n          />\n          <ConnectionStatusCard\n            title=\"Whisper\"\n            status={whisperStatus.status}\n            message={whisperStatus.message}\n            error={whisperStatus.error}\n            onRetry={() => void probeConnections()}\n          />\n        </div>\n      </Section>\n\n      <Section title=\"Quick Setup\">\n        <div className=\"text-sm text-gray-700 mb-2 flex items-center gap-2 flex-wrap\">\n          <span>Enter your Groq API key to use the recommended setup.</span>\n          <button\n            type=\"button\"\n            className=\"text-indigo-600 hover:underline\"\n            onClick={() => setShowGroqHelp((v) => !v)}\n          >\n            {showGroqHelp ? 'Hide help' : '(need help getting a key?)'}\n          </button>\n          <button\n            type=\"button\"\n            className=\"text-indigo-600 hover:underline\"\n            onClick={() => setShowGroqPricing((v) => !v)}\n          >\n            {showGroqPricing ? 
'Hide pricing' : '(pricing guide)'}\n          </button>\n        </div>\n\n        {showGroqHelp && <GroqHelpBox />}\n        {showGroqPricing && <GroqPricingBox />}\n\n        <Field label=\"Groq API Key\" envMeta={getEnvHint('groq.api_key')}>\n          <div className=\"flex gap-2\">\n            <input\n              className=\"input\"\n              type=\"text\"\n              placeholder={groqKeyPlaceholder}\n              value={currentGroqKey}\n              onChange={(e) => handleGroqKeyChange(e.target.value)}\n              onBlur={(e) => handleGroqKeyApply(e.target.value)}\n              onPaste={(e) => {\n                const text = e.clipboardData.getData('text').trim();\n                if (text) handleGroqKeyApply(text);\n              }}\n            />\n          </div>\n        </Field>\n      </Section>\n\n      {/* Input styling */}\n      <style>{`.input{width:100%;padding:0.5rem;border:1px solid #e5e7eb;border-radius:0.375rem;font-size:0.875rem}`}</style>\n    </div>\n  );\n}\n\nfunction GroqHelpBox() {\n  return (\n    <div className=\"text-sm text-gray-700 mb-2 bg-indigo-50 border border-indigo-200 rounded p-3 space-y-2\">\n      <ol className=\"list-decimal pl-5 space-y-1\">\n        <li>\n          Visit the{' '}\n          <a\n            className=\"text-indigo-700 underline\"\n            href=\"https://console.groq.com/keys\"\n            target=\"_blank\"\n            rel=\"noreferrer\"\n          >\n            Groq Console\n          </a>{' '}\n          and sign in or create an account.\n        </li>\n        <li>Open the Keys page and click \"Create API Key\".</li>\n        <li>\n          Copy the key (it starts with <code>gsk_</code>) and paste it below.\n        </li>\n        <li>\n          <strong>Recommended:</strong> Set a billing limit at{' '}\n          <a\n            className=\"text-indigo-700 underline\"\n            href=\"https://console.groq.com/settings/billing\"\n            target=\"_blank\"\n            
rel=\"noreferrer\"\n          >\n            Settings → Billing → Limits\n          </a>{' '}\n          to control costs and receive usage alerts.\n        </li>\n      </ol>\n    </div>\n  );\n}\n\nfunction GroqPricingBox() {\n  return (\n    <div className=\"text-sm text-gray-700 mb-2 bg-green-50 border border-green-200 rounded p-3 space-y-3\">\n      <div>\n        <h4 className=\"font-semibold text-green-800 mb-2\">Groq Pricing Guide</h4>\n        <p className=\"text-green-700 mb-3\">\n          Based on the recommended models: <code>whisper-large-v3-turbo</code> and{' '}\n          <code>llama-3.3-70b-versatile</code>\n        </p>\n      </div>\n\n      <div className=\"grid grid-cols-1 md:grid-cols-2 gap-4\">\n        <div className=\"bg-white border border-green-300 rounded p-3\">\n          <h5 className=\"font-medium text-green-800 mb-2\">Whisper (Transcription)</h5>\n          <ul className=\"space-y-1 text-green-700\">\n            <li>\n              • <strong>whisper-large-v3-turbo:</strong> $0.04/hour\n            </li>\n            <li>• Speed: 216x real-time</li>\n            <li>• Minimum charge: 10 seconds per request</li>\n          </ul>\n        </div>\n\n        <div className=\"bg-white border border-green-300 rounded p-3\">\n          <h5 className=\"font-medium text-green-800 mb-2\">LLM (Ad Detection)</h5>\n          <ul className=\"space-y-1 text-green-700\">\n            <li>\n              • <strong>llama-3.3-70b-versatile:</strong>\n            </li>\n            <li>• Input: $0.59/1M tokens</li>\n            <li>• Output: $0.79/1M tokens</li>\n            <li>• ~1M tokens ≈ 750,000 words</li>\n          </ul>\n        </div>\n      </div>\n\n      <div className=\"bg-white border border-green-300 rounded p-3\">\n        <h5 className=\"font-medium text-green-800 mb-2\">\n          Estimated Monthly Cost (6 podcasts, 6 hours/week)\n        </h5>\n        <div className=\"grid grid-cols-1 md:grid-cols-3 gap-3 text-green-700\">\n        
  <div>\n            <strong>Transcription:</strong>\n            <br />\n            24 hours/month × $0.04 = <span className=\"font-semibold\">$0.96/month</span>\n          </div>\n          <div>\n            <strong>Ad Detection:</strong>\n            <br />\n            ~2M tokens × $0.69 avg = <span className=\"font-semibold\">$1.38/month</span>\n          </div>\n          <div className=\"md:col-span-1\">\n            <strong>Total Estimate:</strong>\n            <br />\n            <span className=\"font-semibold text-lg\">~$2.34/month</span>\n          </div>\n        </div>\n        <p className=\"text-xs text-green-600 mt-2\">\n          * Actual costs may vary based on podcast length, complexity, and token usage. Consider\n          setting a $5-10/month billing limit for safety.\n        </p>\n      </div>\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/components/config/tabs/DiscordTab.tsx",
    "content": "import { useState, useEffect } from 'react';\nimport { useQuery, useMutation, useQueryClient } from '@tanstack/react-query';\nimport { toast } from 'react-hot-toast';\nimport { discordApi } from '../../../services/api';\nimport { Section } from '../shared';\n\nexport default function DiscordTab() {\n  const queryClient = useQueryClient();\n  \n  const { data, isLoading, error } = useQuery({\n    queryKey: ['discord-config'],\n    queryFn: discordApi.getConfig,\n  });\n\n  const [form, setForm] = useState({\n    client_id: '',\n    client_secret: '',\n    redirect_uri: '',\n    guild_ids: '',\n    allow_registration: true,\n  });\n\n  const [hasSecretChange, setHasSecretChange] = useState(false);\n\n  // Initialize form when data loads\n  useEffect(() => {\n    if (data?.config) {\n      setForm({\n        client_id: data.config.client_id || '',\n        client_secret: '', // Don't prefill secret\n        redirect_uri: data.config.redirect_uri || '',\n        guild_ids: data.config.guild_ids || '',\n        allow_registration: data.config.allow_registration,\n      });\n      setHasSecretChange(false);\n    }\n  }, [data]);\n\n  const mutation = useMutation({\n    mutationFn: discordApi.updateConfig,\n    onSuccess: () => {\n      toast.success('Discord settings saved');\n      queryClient.invalidateQueries({ queryKey: ['discord-config'] });\n      queryClient.invalidateQueries({ queryKey: ['discord-status'] });\n      setHasSecretChange(false);\n    },\n    onError: (err: Error) => {\n      toast.error(`Failed to save: ${err.message}`);\n    },\n  });\n\n  const handleSubmit = (e: React.FormEvent) => {\n    e.preventDefault();\n    \n    const payload: Record<string, unknown> = {\n      client_id: form.client_id,\n      redirect_uri: form.redirect_uri,\n      guild_ids: form.guild_ids,\n      allow_registration: form.allow_registration,\n    };\n\n    // Only include secret if it was changed\n    if (hasSecretChange && form.client_secret) {\n      
payload.client_secret = form.client_secret;\n    }\n\n    mutation.mutate(payload);\n  };\n\n  const envOverrides = data?.env_overrides || {};\n\n  if (isLoading) {\n    return <div className=\"text-sm text-gray-600\">Loading Discord configuration...</div>;\n  }\n\n  if (error) {\n    return <div className=\"text-sm text-red-600\">Failed to load Discord configuration</div>;\n  }\n\n  return (\n    <div className=\"space-y-6\">\n      <Section title=\"Discord SSO Configuration\">\n        <StatusIndicator enabled={data?.config.enabled ?? false} />\n        \n        <form onSubmit={handleSubmit} className=\"mt-6 space-y-4 max-w-xl\">\n          <div>\n            <label className=\"block text-sm font-medium text-gray-700 mb-1\">\n              Client ID\n              {envOverrides.client_id && (\n                <span className=\"ml-2 text-xs text-amber-600\">\n                  (Overridden by {envOverrides.client_id.env_var})\n                </span>\n              )}\n            </label>\n            <input\n              type=\"text\"\n              className=\"input\"\n              value={form.client_id}\n              onChange={(e) => setForm({ ...form, client_id: e.target.value })}\n              placeholder=\"Your Discord application Client ID\"\n              disabled={!!envOverrides.client_id}\n            />\n          </div>\n\n          <div>\n            <label className=\"block text-sm font-medium text-gray-700 mb-1\">\n              Client Secret\n              {envOverrides.client_secret ? (\n                <span className=\"ml-2 text-xs text-amber-600\">\n                  (Overridden by {envOverrides.client_secret.env_var})\n                </span>\n              ) : data?.config.client_secret_preview ? 
(\n                <span className=\"ml-2 text-xs text-gray-500\">\n                  (Current: {data.config.client_secret_preview})\n                </span>\n              ) : null}\n            </label>\n            <input\n              type=\"password\"\n              className=\"input\"\n              value={form.client_secret}\n              onChange={(e) => {\n                setForm({ ...form, client_secret: e.target.value });\n                setHasSecretChange(true);\n              }}\n              placeholder={data?.config.client_secret_preview ? '••••••••' : 'Your Discord application Client Secret'}\n              disabled={!!envOverrides.client_secret}\n            />\n          </div>\n\n          <div>\n            <label className=\"block text-sm font-medium text-gray-700 mb-1\">\n              Redirect URI\n              {envOverrides.redirect_uri && (\n                <span className=\"ml-2 text-xs text-amber-600\">\n                  (Overridden by {envOverrides.redirect_uri.env_var})\n                </span>\n              )}\n            </label>\n            <input\n              type=\"url\"\n              className=\"input\"\n              value={form.redirect_uri}\n              onChange={(e) => setForm({ ...form, redirect_uri: e.target.value })}\n              placeholder=\"https://your-domain.com/api/auth/discord/callback\"\n              disabled={!!envOverrides.redirect_uri}\n            />\n            <p className=\"text-xs text-gray-500 mt-1\">\n              Must match the URI configured in Discord Developer Portal\n            </p>\n          </div>\n\n          <div>\n            <label className=\"block text-sm font-medium text-gray-700 mb-1\">\n              Guild IDs (optional)\n              {envOverrides.guild_ids && (\n                <span className=\"ml-2 text-xs text-amber-600\">\n                  (Overridden by {envOverrides.guild_ids.env_var})\n                </span>\n              )}\n            </label>\n          
  <input\n              type=\"text\"\n              className=\"input\"\n              value={form.guild_ids}\n              onChange={(e) => setForm({ ...form, guild_ids: e.target.value })}\n              placeholder=\"123456789,987654321\"\n              disabled={!!envOverrides.guild_ids}\n            />\n            <p className=\"text-xs text-gray-500 mt-1\">\n              Comma-separated Discord server IDs to restrict access\n            </p>\n          </div>\n\n          <div>\n            <label className=\"flex items-center gap-2\">\n              <input\n                type=\"checkbox\"\n                checked={form.allow_registration}\n                onChange={(e) => setForm({ ...form, allow_registration: e.target.checked })}\n                disabled={!!envOverrides.allow_registration}\n                className=\"h-4 w-4 rounded border-gray-300 text-indigo-600 focus:ring-indigo-500\"\n              />\n              <span className=\"text-sm text-gray-700\">\n                Allow new users to register via Discord\n              </span>\n            </label>\n            {envOverrides.allow_registration && (\n              <p className=\"text-xs text-amber-600 mt-1 ml-6\">\n                Overridden by {envOverrides.allow_registration.env_var}\n              </p>\n            )}\n          </div>\n\n          <div className=\"pt-4\">\n            <button\n              type=\"submit\"\n              disabled={mutation.isPending}\n              className=\"px-4 py-2 rounded bg-indigo-600 text-white text-sm font-medium hover:bg-indigo-700 disabled:opacity-60\"\n            >\n              {mutation.isPending ? 'Saving...' 
: 'Save Discord Settings'}\n            </button>\n          </div>\n        </form>\n      </Section>\n\n      <Section title=\"Setup Instructions\">\n        <SetupInstructions />\n      </Section>\n      \n      <style>{`.input{width:100%;padding:0.5rem;border:1px solid #e5e7eb;border-radius:0.375rem;font-size:0.875rem}`}</style>\n    </div>\n  );\n}\n\nfunction StatusIndicator({ enabled }: { enabled: boolean }) {\n  return (\n    <div className=\"flex items-center gap-3\">\n      <div\n        className={`w-3 h-3 rounded-full ${\n          enabled ? 'bg-green-500' : 'bg-gray-300'\n        }`}\n      />\n      <span className=\"text-sm font-medium text-gray-900\">\n        {enabled ? 'Discord SSO is enabled' : 'Discord SSO is not configured'}\n      </span>\n    </div>\n  );\n}\n\nfunction SetupInstructions() {\n  return (\n    <div className=\"bg-gray-50 border border-gray-200 rounded-lg p-4 space-y-4\">\n      <h4 className=\"text-sm font-medium text-gray-900\">\n        Discord Developer Portal Setup\n      </h4>\n      <ol className=\"text-sm text-gray-600 list-decimal list-inside space-y-2\">\n        <li>\n          Go to{' '}\n          <a\n            href=\"https://discord.com/developers/applications\"\n            target=\"_blank\"\n            rel=\"noopener noreferrer\"\n            className=\"text-indigo-600 hover:text-indigo-800 underline\"\n          >\n            Discord Developer Portal\n          </a>\n        </li>\n        <li>Create a new application or select an existing one</li>\n        <li>Navigate to <strong>OAuth2 → General</strong></li>\n        <li>Copy the <strong>Client ID</strong> and <strong>Client Secret</strong></li>\n        <li>Add your redirect URI to the list of allowed redirects</li>\n        <li>The redirect URI should be: <code className=\"bg-gray-100 px-1 rounded text-xs\">https://your-domain.com/api/auth/discord/callback</code></li>\n      </ol>\n      \n      <div className=\"pt-2 border-t border-gray-200\">\n        <p className=\"text-xs text-gray-500\">\n          <strong>Note:</strong> Environment variables (DISCORD_CLIENT_ID, DISCORD_CLIENT_SECRET, etc.) \n          take precedence over values configured here.\n        </p>\n      </div>\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/components/config/tabs/UserManagementTab.tsx",
    "content": "import { useMemo, useState } from 'react';\nimport type { FormEvent } from 'react';\nimport { useQuery } from '@tanstack/react-query';\nimport { toast } from 'react-hot-toast';\nimport { authApi } from '../../../services/api';\nimport { useAuth } from '../../../contexts/AuthContext';\nimport { useConfigContext } from '../ConfigContext';\nimport { Section, Field, SaveButton } from '../shared';\nimport type { ManagedUser } from '../../../types';\n\nexport default function UserManagementTab() {\n  const { changePassword, refreshUser, user, logout } = useAuth();\n  const { pending, setField, handleSave, isSaving } = useConfigContext();\n\n  const {\n    data: managedUsers,\n    isLoading: usersLoading,\n    refetch: refetchUsers,\n  } = useQuery<ManagedUser[]>({\n    queryKey: ['auth-users'],\n    queryFn: async () => {\n      const response = await authApi.listUsers();\n      return response.users;\n    },\n  });\n\n  const totalUsers = useMemo(() => managedUsers?.length ?? 0, [managedUsers]);\n  const limitValue = pending?.app?.user_limit_total ?? null;\n\n  return (\n    <div className=\"space-y-6\">\n      <AccountSecuritySection\n        changePassword={changePassword}\n        refreshUser={refreshUser}\n      />\n      {pending && (\n        <UserLimitSection\n          currentUsers={totalUsers}\n          userLimit={limitValue}\n          onChangeLimit={(value) =>\n            setField(\n              ['app', 'user_limit_total'],\n              value === '' ? 
null : Number(value)\n            )\n          }\n          onSave={handleSave}\n          isSaving={isSaving}\n          isLoadingUsers={usersLoading}\n        />\n      )}\n      <UserManagementSection\n        currentUser={user}\n        refreshUser={refreshUser}\n        logout={logout}\n        managedUsers={managedUsers}\n        usersLoading={usersLoading}\n        refetchUsers={refetchUsers}\n      />\n    </div>\n  );\n}\n\n// --- Account Security Section ---\ninterface AccountSecurityProps {\n  changePassword: (current: string, next: string) => Promise<void>;\n  refreshUser: () => Promise<void>;\n}\n\nfunction AccountSecuritySection({ changePassword, refreshUser }: AccountSecurityProps) {\n  const [passwordForm, setPasswordForm] = useState({ current: '', next: '', confirm: '' });\n  const [passwordSubmitting, setPasswordSubmitting] = useState(false);\n\n  const handlePasswordSubmit = async (event: FormEvent<HTMLFormElement>) => {\n    event.preventDefault();\n    if (passwordForm.next !== passwordForm.confirm) {\n      toast.error('New passwords do not match.');\n      return;\n    }\n\n    setPasswordSubmitting(true);\n    try {\n      await changePassword(passwordForm.current, passwordForm.next);\n      toast.success('Password updated. 
Update PODLY_ADMIN_PASSWORD to match.');\n      setPasswordForm({ current: '', next: '', confirm: '' });\n      await refreshUser();\n    } catch (error) {\n      toast.error(getErrorMessage(error, 'Failed to update password.'));\n    } finally {\n      setPasswordSubmitting(false);\n    }\n  };\n\n  return (\n    <Section title=\"Account Security\">\n      <form className=\"grid gap-3 max-w-md\" onSubmit={handlePasswordSubmit}>\n        <Field label=\"Current password\">\n          <input\n            className=\"input\"\n            type=\"password\"\n            autoComplete=\"current-password\"\n            value={passwordForm.current}\n            onChange={(event) =>\n              setPasswordForm((prev) => ({ ...prev, current: event.target.value }))\n            }\n            required\n          />\n        </Field>\n        <Field label=\"New password\">\n          <input\n            className=\"input\"\n            type=\"password\"\n            autoComplete=\"new-password\"\n            value={passwordForm.next}\n            onChange={(event) =>\n              setPasswordForm((prev) => ({ ...prev, next: event.target.value }))\n            }\n            required\n          />\n        </Field>\n        <Field label=\"Confirm new password\">\n          <input\n            className=\"input\"\n            type=\"password\"\n            autoComplete=\"new-password\"\n            value={passwordForm.confirm}\n            onChange={(event) =>\n              setPasswordForm((prev) => ({ ...prev, confirm: event.target.value }))\n            }\n            required\n          />\n        </Field>\n        <div className=\"flex items-center gap-3\">\n          <button\n            type=\"submit\"\n            className=\"px-4 py-2 rounded bg-indigo-600 text-white text-sm font-medium hover:bg-indigo-700 disabled:opacity-60\"\n            disabled={passwordSubmitting}\n          >\n            {passwordSubmitting ? 
'Updating…' : 'Update password'}\n          </button>\n          <p className=\"text-xs text-gray-500\">\n            After updating, rotate <code className=\"font-mono\">PODLY_ADMIN_PASSWORD</code> to match.\n          </p>\n        </div>\n      </form>\n      <style>{`.input{width:100%;padding:0.5rem;border:1px solid #e5e7eb;border-radius:0.375rem;font-size:0.875rem}`}</style>\n    </Section>\n  );\n}\n\n// --- User Limit Section ---\ninterface UserLimitSectionProps {\n  currentUsers: number;\n  userLimit: number | null;\n  onChangeLimit: (value: string) => void;\n  onSave: () => void;\n  isSaving: boolean;\n  isLoadingUsers: boolean;\n}\n\nfunction UserLimitSection({ currentUsers, userLimit, onChangeLimit, onSave, isSaving, isLoadingUsers }: UserLimitSectionProps) {\n  return (\n    <Section title=\"User Limits\">\n      <div className=\"grid gap-3 md:grid-cols-2 md:items-end\">\n        <Field label=\"Total users allowed\">\n          <input\n            className=\"input\"\n            type=\"number\"\n            min={0}\n            value={userLimit ?? ''}\n            onChange={(event) => onChangeLimit(event.target.value)}\n            placeholder=\"Unlimited\"\n          />\n          <p className=\"text-xs text-gray-500 mt-1\">\n            Leave blank for unlimited; set to 0 to block new user creation. Applies only when authentication is enabled.\n          </p>\n        </Field>\n        <div className=\"text-sm text-gray-700 space-y-1\">\n          <div className=\"font-semibold\">Current users</div>\n          <div>{isLoadingUsers ? 'Loading…' : currentUsers}</div>\n          {userLimit !== null && userLimit > 0 && currentUsers >= userLimit ? (\n            <div className=\"text-xs text-red-600\">\n              Limit reached. 
New users are blocked until the total drops below {userLimit}.\n            </div>\n          ) : (\n            <div className=\"text-xs text-gray-500\">\n              New user creation is blocked once the limit is reached.\n            </div>\n          )}\n        </div>\n      </div>\n      <div className=\"mt-3\">\n        <SaveButton onSave={onSave} isPending={isSaving} />\n      </div>\n      <style>{`.input{width:100%;padding:0.5rem;border:1px solid #e5e7eb;border-radius:0.375rem;font-size:0.875rem}`}</style>\n    </Section>\n  );\n}\n\n// --- User Management Section ---\ninterface UserManagementProps {\n  currentUser: { id: number; username: string; role: string } | null;\n  refreshUser: () => Promise<void>;\n  logout: () => void;\n  managedUsers: ManagedUser[] | undefined;\n  usersLoading: boolean;\n  refetchUsers: () => Promise<unknown>;\n}\n\nfunction UserManagementSection({ currentUser, refreshUser, logout, managedUsers, usersLoading, refetchUsers }: UserManagementProps) {\n  const [newUser, setNewUser] = useState({ username: '', password: '', confirm: '', role: 'user' });\n  const [activeResetUser, setActiveResetUser] = useState<string | null>(null);\n  const [resetPassword, setResetPassword] = useState('');\n  const [resetConfirm, setResetConfirm] = useState('');\n\n  const sortedUsers = useMemo(() => {\n    if (!managedUsers) {\n      return [];\n    }\n    return [...managedUsers].sort(\n      (a, b) => new Date(b.created_at).getTime() - new Date(a.created_at).getTime()\n    );\n  }, [managedUsers]);\n  const adminCount = useMemo(\n    () => sortedUsers.filter((u) => u.role === 'admin').length,\n    [sortedUsers]\n  );\n\n  const handleCreateUser = async (event: FormEvent<HTMLFormElement>) => {\n    event.preventDefault();\n    const username = newUser.username.trim();\n    if (!username) {\n      toast.error('Username is required.');\n      return;\n    }\n    if (newUser.password !== newUser.confirm) {\n      toast.error('Passwords do not 
match.');\n      return;\n    }\n\n    try {\n      await authApi.createUser({\n        username,\n        password: newUser.password,\n        role: newUser.role,\n      });\n      toast.success(`User '${username}' created.`);\n      setNewUser({ username: '', password: '', confirm: '', role: 'user' });\n      await refetchUsers();\n    } catch (error) {\n      toast.error(getErrorMessage(error, 'Failed to create user.'));\n    }\n  };\n\n  const handleRoleChange = async (username: string, role: string) => {\n    try {\n      await authApi.updateUser(username, { role });\n      toast.success(`Updated role for ${username}.`);\n      await refetchUsers();\n      if (currentUser && currentUser.username === username) {\n        await refreshUser();\n      }\n    } catch (error) {\n      toast.error(getErrorMessage(error, 'Failed to update role.'));\n    }\n  };\n\n  const handleAllowanceChange = async (username: string, allowance: string) => {\n    const val = allowance === '' ? null : parseInt(allowance, 10);\n    if (val !== null && isNaN(val)) return;\n\n    try {\n      await authApi.updateUser(username, { manual_feed_allowance: val });\n      toast.success(`Updated allowance for ${username}.`);\n      await refetchUsers();\n    } catch (error) {\n      toast.error(getErrorMessage(error, 'Failed to update allowance.'));\n    }\n  };\n\n  const handleResetPassword = async (event: FormEvent<HTMLFormElement>) => {\n    event.preventDefault();\n    if (!activeResetUser) {\n      return;\n    }\n    if (resetPassword !== resetConfirm) {\n      toast.error('Passwords do not match.');\n      return;\n    }\n\n    try {\n      await authApi.updateUser(activeResetUser, { password: resetPassword });\n      toast.success(`Password updated for ${activeResetUser}.`);\n      setActiveResetUser(null);\n      setResetPassword('');\n      setResetConfirm('');\n      await refetchUsers();\n    } catch (error) {\n      toast.error(getErrorMessage(error, 'Failed to update 
password.'));\n    }\n  };\n\n  const handleDeleteUser = async (username: string) => {\n    const confirmed = window.confirm(`Delete user '${username}'? This action cannot be undone.`);\n    if (!confirmed) {\n      return;\n    }\n    try {\n      await authApi.deleteUser(username);\n      toast.success(`Deleted user '${username}'.`);\n      await refetchUsers();\n      if (currentUser && currentUser.username === username) {\n        logout();\n      }\n    } catch (error) {\n      toast.error(getErrorMessage(error, 'Failed to delete user.'));\n    }\n  };\n\n  return (\n    <Section title=\"User Management\">\n      <div className=\"space-y-4\">\n        {/* Create User Form */}\n        <form className=\"grid gap-3 md:grid-cols-2\" onSubmit={handleCreateUser}>\n          <div className=\"md:col-span-2\">\n            <label className=\"block text-sm font-medium text-gray-700 mb-1\">Username</label>\n            <input\n              className=\"input\"\n              type=\"text\"\n              value={newUser.username}\n              onChange={(event) => setNewUser((prev) => ({ ...prev, username: event.target.value }))}\n              placeholder=\"new_user\"\n              required\n            />\n          </div>\n          <div>\n            <label className=\"block text-sm font-medium text-gray-700 mb-1\">Password</label>\n            <input\n              className=\"input\"\n              type=\"password\"\n              value={newUser.password}\n              onChange={(event) => setNewUser((prev) => ({ ...prev, password: event.target.value }))}\n              required\n            />\n          </div>\n          <div>\n            <label className=\"block text-sm font-medium text-gray-700 mb-1\">Confirm password</label>\n            <input\n              className=\"input\"\n              type=\"password\"\n              value={newUser.confirm}\n              onChange={(event) => setNewUser((prev) => ({ ...prev, confirm: event.target.value }))}\n       
       required\n            />\n          </div>\n          <div>\n            <label className=\"block text-sm font-medium text-gray-700 mb-1\">Role</label>\n            <select\n              className=\"input\"\n              value={newUser.role}\n              onChange={(event) => setNewUser((prev) => ({ ...prev, role: event.target.value }))}\n            >\n              <option value=\"user\">user</option>\n              <option value=\"admin\">admin</option>\n            </select>\n          </div>\n          <div className=\"md:col-span-2 flex items-center justify-start\">\n            <button\n              type=\"submit\"\n              className=\"px-4 py-2 rounded bg-green-600 text-white text-sm font-medium hover:bg-green-700\"\n            >\n              Add user\n            </button>\n          </div>\n        </form>\n\n        {/* User List */}\n        <div className=\"space-y-3\">\n          {usersLoading && <div className=\"text-sm text-gray-600\">Loading users…</div>}\n          {!usersLoading && (!managedUsers || managedUsers.length === 0) && (\n            <div className=\"text-sm text-gray-600\">No additional users configured.</div>\n          )}\n          {!usersLoading && managedUsers && managedUsers.length > 0 && (\n            <div className=\"space-y-3\">\n              {sortedUsers.map((managed) => {\n                const disableDemotion = managed.role === 'admin' && adminCount <= 1;\n                const disableDelete = disableDemotion;\n                const isActive = activeResetUser === managed.username;\n                const allowance = managed.feed_allowance ?? 0;\n                const subscriptionStatus = managed.feed_subscription_status ?? 
'inactive';\n\n                return (\n                  <div\n                    key={managed.id}\n                    className=\"border border-gray-200 rounded-lg p-3 space-y-3 bg-white\"\n                  >\n                    <div className=\"flex flex-col gap-2 md:flex-row md:items-center md:justify-between\">\n                      <div>\n                        <div className=\"text-sm font-semibold text-gray-900\">{managed.username}</div>\n                        <div className=\"text-xs text-gray-500\">\n                          Added {new Date(managed.created_at).toLocaleString()} • Role {managed.role} • Feeds {allowance} • Status {subscriptionStatus}\n                          {managed.last_active && (\n                            <> • Last Active {new Date(managed.last_active).toLocaleString()}</>\n                          )}\n                        </div>\n                      </div>\n                      <div className=\"flex flex-wrap items-center gap-2\">\n                        <div className=\"flex items-center gap-1\" title=\"Override feed allowance\">\n                          <span className=\"text-xs text-gray-500\">Feed Allowance Override:</span>\n                          <input\n                            className=\"input text-sm w-20 py-1\"\n                            type=\"number\"\n                            min=\"0\"\n                            placeholder=\"None\"\n                            defaultValue={managed.manual_feed_allowance ?? ''}\n                            onBlur={(e) => {\n                              const val = e.target.value;\n                              const current = managed.manual_feed_allowance?.toString() ?? 
'';\n                              if (val !== current) {\n                                void handleAllowanceChange(managed.username, val);\n                              }\n                            }}\n                            onKeyDown={(e) => {\n                              if (e.key === 'Enter') {\n                                e.currentTarget.blur();\n                              }\n                            }}\n                          />\n                        </div>\n                        <select\n                          className=\"input text-sm\"\n                          value={managed.role}\n                          onChange={(event) => {\n                            const nextRole = event.target.value;\n                            if (nextRole !== managed.role) {\n                              void handleRoleChange(managed.username, nextRole);\n                            }\n                          }}\n                          disabled={disableDemotion}\n                        >\n                          <option value=\"user\">user</option>\n                          <option value=\"admin\">admin</option>\n                        </select>\n                        <button\n                          type=\"button\"\n                          className=\"px-3 py-1 border border-gray-300 rounded-md text-sm hover:bg-gray-50\"\n                          onClick={() => {\n                            if (isActive) {\n                              setActiveResetUser(null);\n                              setResetPassword('');\n                              setResetConfirm('');\n                            } else {\n                              setActiveResetUser(managed.username);\n                              setResetPassword('');\n                              setResetConfirm('');\n                            }\n                          }}\n                        >\n                          
{isActive ? 'Cancel' : 'Set password'}\n                        </button>\n                        <button\n                          type=\"button\"\n                          className=\"px-3 py-1 border border-red-300 text-red-600 rounded-md text-sm hover:bg-red-50 disabled:opacity-50\"\n                          onClick={() => void handleDeleteUser(managed.username)}\n                          disabled={disableDelete}\n                        >\n                          Delete\n                        </button>\n                      </div>\n                    </div>\n\n                    {isActive && (\n                      <form className=\"grid gap-2 md:grid-cols-3\" onSubmit={handleResetPassword}>\n                        <div className=\"md:col-span-1\">\n                          <label className=\"block text-xs font-medium text-gray-600 mb-1\">\n                            New password\n                          </label>\n                          <input\n                            className=\"input\"\n                            type=\"password\"\n                            value={resetPassword}\n                            onChange={(event) => setResetPassword(event.target.value)}\n                            required\n                          />\n                        </div>\n                        <div className=\"md:col-span-1\">\n                          <label className=\"block text-xs font-medium text-gray-600 mb-1\">\n                            Confirm password\n                          </label>\n                          <input\n                            className=\"input\"\n                            type=\"password\"\n                            value={resetConfirm}\n                            onChange={(event) => setResetConfirm(event.target.value)}\n                            required\n                          />\n                        </div>\n                        <div className=\"md:col-span-1 flex items-end 
gap-2\">\n                          <button\n                            type=\"submit\"\n                            className=\"px-4 py-2 rounded bg-indigo-600 text-white text-sm hover:bg-indigo-700\"\n                          >\n                            Update\n                          </button>\n                          <p className=\"text-xs text-gray-500\">Share new credentials securely.</p>\n                        </div>\n                      </form>\n                    )}\n                  </div>\n                );\n              })}\n            </div>\n          )}\n        </div>\n      </div>\n      <style>{`.input{width:100%;padding:0.5rem;border:1px solid #e5e7eb;border-radius:0.375rem;font-size:0.875rem}`}</style>\n    </Section>\n  );\n}\n\n// Helper function\nfunction getErrorMessage(error: unknown, fallback = 'Request failed.') {\n  if (error && typeof error === 'object') {\n    const err = error as {\n      response?: { data?: { error?: string; message?: string } };\n      message?: string;\n    };\n    return err.response?.data?.error || err.response?.data?.message || err.message || fallback;\n  }\n  if (error instanceof Error) {\n    return error.message;\n  }\n  return fallback;\n}\n"
  },
  {
    "path": "frontend/src/components/config/tabs/index.ts",
    "content": "export { default as DefaultTab } from './DefaultTab';\nexport { default as AdvancedTab } from './AdvancedTab';\nexport { default as UserManagementTab } from './UserManagementTab';\nexport { default as DiscordTab } from './DiscordTab';\n"
  },
  {
    "path": "frontend/src/contexts/AudioPlayerContext.tsx",
    "content": "import React, { createContext, useContext, useReducer, useRef, useEffect, useCallback } from 'react';\nimport type { Episode } from '../types';\nimport { feedsApi } from '../services/api';\n\ninterface AudioPlayerState {\n  currentEpisode: Episode | null;\n  isPlaying: boolean;\n  currentTime: number;\n  duration: number;\n  volume: number;\n  isLoading: boolean;\n  error: string | null;\n}\n\ninterface AudioPlayerContextType extends AudioPlayerState {\n  playEpisode: (episode: Episode) => void;\n  togglePlayPause: () => void;\n  seekTo: (time: number) => void;\n  setVolume: (volume: number) => void;\n  audioRef: React.RefObject<HTMLAudioElement | null>;\n}\n\ntype AudioPlayerAction =\n  | { type: 'SET_EPISODE'; payload: Episode }\n  | { type: 'SET_PLAYING'; payload: boolean }\n  | { type: 'SET_CURRENT_TIME'; payload: number }\n  | { type: 'SET_DURATION'; payload: number }\n  | { type: 'SET_VOLUME'; payload: number }\n  | { type: 'SET_LOADING'; payload: boolean }\n  | { type: 'SET_ERROR'; payload: string | null };\n\nconst initialState: AudioPlayerState = {\n  currentEpisode: null,\n  isPlaying: false,\n  currentTime: 0,\n  duration: 0,\n  volume: 1,\n  isLoading: false,\n  error: null,\n};\n\nfunction audioPlayerReducer(state: AudioPlayerState, action: AudioPlayerAction): AudioPlayerState {\n  switch (action.type) {\n    case 'SET_EPISODE':\n      return { ...state, currentEpisode: action.payload, currentTime: 0, error: null };\n    case 'SET_PLAYING':\n      return { ...state, isPlaying: action.payload };\n    case 'SET_CURRENT_TIME':\n      return { ...state, currentTime: action.payload };\n    case 'SET_DURATION':\n      return { ...state, duration: action.payload };\n    case 'SET_VOLUME':\n      return { ...state, volume: action.payload };\n    case 'SET_LOADING':\n      return { ...state, isLoading: action.payload };\n    case 'SET_ERROR':\n      return { ...state, error: action.payload, isLoading: false };\n    default:\n      return 
state;\n  }\n}\n\nconst AudioPlayerContext = createContext<AudioPlayerContextType | undefined>(undefined);\n\nexport function AudioPlayerProvider({ children }: { children: React.ReactNode }) {\n  const [state, dispatch] = useReducer(audioPlayerReducer, initialState);\n  const audioRef = useRef<HTMLAudioElement>(null);\n\n  const playEpisode = (episode: Episode) => {\n    console.log('playEpisode called with:', episode);\n    console.log('Episode audio flags:', {\n      has_processed_audio: episode.has_processed_audio,\n      has_unprocessed_audio: episode.has_unprocessed_audio,\n      download_url: episode.download_url\n    });\n\n    if (!episode.has_processed_audio) {\n      console.log('No processed audio available for episode');\n      dispatch({ type: 'SET_ERROR', payload: 'Post needs to be processed first' });\n      return;\n    }\n\n    console.log('Setting episode and loading state');\n    dispatch({ type: 'SET_EPISODE', payload: episode });\n    dispatch({ type: 'SET_LOADING', payload: true });\n    \n    if (audioRef.current) {\n      // Use the new API endpoint for audio\n      const audioUrl = feedsApi.getPostAudioUrl(episode.guid);\n      console.log('Using API audio URL:', audioUrl);\n      \n      audioRef.current.src = audioUrl;\n      audioRef.current.load();\n    } else {\n      console.log('audioRef.current is null');\n    }\n  };\n\n  const togglePlayPause = useCallback(() => {\n    if (!audioRef.current || !state.currentEpisode) return;\n\n    if (state.isPlaying) {\n      audioRef.current.pause();\n    } else {\n      audioRef.current.play().catch((error) => {\n        dispatch({ type: 'SET_ERROR', payload: 'Failed to play audio' });\n        console.error('Audio play error:', error);\n      });\n    }\n  }, [state.isPlaying, state.currentEpisode]);\n\n  const seekTo = useCallback((time: number) => {\n    if (audioRef.current) {\n      audioRef.current.currentTime = time;\n      dispatch({ type: 'SET_CURRENT_TIME', payload: time });\n    }\n  
}, []);\n\n  const setVolume = useCallback((volume: number) => {\n    if (audioRef.current) {\n      audioRef.current.volume = volume;\n      dispatch({ type: 'SET_VOLUME', payload: volume });\n    }\n  }, []);\n\n  // Audio event handlers\n  useEffect(() => {\n    const audio = audioRef.current;\n    if (!audio) return;\n\n    const handleLoadedData = () => {\n      dispatch({ type: 'SET_DURATION', payload: audio.duration });\n      dispatch({ type: 'SET_LOADING', payload: false });\n    };\n\n    const handleTimeUpdate = () => {\n      dispatch({ type: 'SET_CURRENT_TIME', payload: audio.currentTime });\n    };\n\n    const handlePlay = () => {\n      dispatch({ type: 'SET_PLAYING', payload: true });\n    };\n\n    const handlePause = () => {\n      dispatch({ type: 'SET_PLAYING', payload: false });\n    };\n\n    const handleEnded = () => {\n      dispatch({ type: 'SET_PLAYING', payload: false });\n      dispatch({ type: 'SET_CURRENT_TIME', payload: 0 });\n    };\n\n    const handleError = () => {\n      const audio = audioRef.current;\n      if (!audio) return;\n\n      // Get more specific error information\n      let errorMessage = 'Failed to load audio';\n      \n      if (audio.error) {\n        switch (audio.error.code) {\n          case MediaError.MEDIA_ERR_ABORTED:\n            errorMessage = 'Audio loading was aborted';\n            break;\n          case MediaError.MEDIA_ERR_NETWORK:\n            errorMessage = 'Network error while loading audio';\n            break;\n          case MediaError.MEDIA_ERR_DECODE:\n            errorMessage = 'Audio file is corrupted or unsupported';\n            break;\n          case MediaError.MEDIA_ERR_SRC_NOT_SUPPORTED:\n            errorMessage = 'Audio format not supported or file not found';\n            break;\n          default:\n            errorMessage = 'Unknown audio error';\n        }\n      }\n\n      // Check if it's a network error that might indicate specific HTTP status\n      if (audio.error?.code === 
MediaError.MEDIA_ERR_NETWORK || \n          audio.error?.code === MediaError.MEDIA_ERR_SRC_NOT_SUPPORTED) {\n        // For network errors, provide more helpful messages\n        if (state.currentEpisode) {\n          if (!state.currentEpisode.has_processed_audio) {\n            errorMessage = 'Post needs to be processed first';\n          } else if (!state.currentEpisode.whitelisted) {\n            errorMessage = 'Post is not whitelisted';\n          } else {\n            errorMessage = 'Audio file not available - try processing the post again';\n          }\n        }\n      }\n\n      console.error('Audio error:', audio.error, 'Message:', errorMessage);\n      dispatch({ type: 'SET_ERROR', payload: errorMessage });\n    };\n\n    const handleCanPlay = () => {\n      dispatch({ type: 'SET_LOADING', payload: false });\n    };\n\n    audio.addEventListener('loadeddata', handleLoadedData);\n    audio.addEventListener('timeupdate', handleTimeUpdate);\n    audio.addEventListener('play', handlePlay);\n    audio.addEventListener('pause', handlePause);\n    audio.addEventListener('ended', handleEnded);\n    audio.addEventListener('error', handleError);\n    audio.addEventListener('canplay', handleCanPlay);\n\n    return () => {\n      audio.removeEventListener('loadeddata', handleLoadedData);\n      audio.removeEventListener('timeupdate', handleTimeUpdate);\n      audio.removeEventListener('play', handlePlay);\n      audio.removeEventListener('pause', handlePause);\n      audio.removeEventListener('ended', handleEnded);\n      audio.removeEventListener('error', handleError);\n      audio.removeEventListener('canplay', handleCanPlay);\n    };\n    // Re-bind handlers when the episode changes so handleError reads the\n    // current episode instead of a stale closure over the initial (null) state.\n  }, [state.currentEpisode]);\n\n  // Keyboard shortcuts\n  useEffect(() => {\n    const handleKeyDown = (event: KeyboardEvent) => {\n      // Only handle shortcuts when there's a current episode and not typing in an input\n      if (!state.currentEpisode || \n          event.target instanceof HTMLInputElement || \n          event.target instanceof 
HTMLTextAreaElement) {\n        return;\n      }\n\n      switch (event.code) {\n        case 'Space':\n          event.preventDefault();\n          togglePlayPause();\n          break;\n        case 'ArrowLeft':\n          event.preventDefault();\n          seekTo(Math.max(0, state.currentTime - 10)); // Seek back 10 seconds\n          break;\n        case 'ArrowRight':\n          event.preventDefault();\n          seekTo(Math.min(state.duration, state.currentTime + 10)); // Seek forward 10 seconds\n          break;\n        case 'ArrowUp':\n          event.preventDefault();\n          setVolume(Math.min(1, state.volume + 0.1)); // Volume up\n          break;\n        case 'ArrowDown':\n          event.preventDefault();\n          setVolume(Math.max(0, state.volume - 0.1)); // Volume down\n          break;\n      }\n    };\n\n    document.addEventListener('keydown', handleKeyDown);\n    return () => document.removeEventListener('keydown', handleKeyDown);\n  }, [state.currentEpisode, state.currentTime, state.duration, state.volume, togglePlayPause, seekTo, setVolume]);\n\n  const contextValue: AudioPlayerContextType = {\n    ...state,\n    playEpisode,\n    togglePlayPause,\n    seekTo,\n    setVolume,\n    audioRef,\n  };\n\n  return (\n    <AudioPlayerContext.Provider value={contextValue}>\n      {children}\n      <audio ref={audioRef} preload=\"metadata\" />\n    </AudioPlayerContext.Provider>\n  );\n}\n\nexport function useAudioPlayer() {\n  const context = useContext(AudioPlayerContext);\n  if (context === undefined) {\n    throw new Error('useAudioPlayer must be used within an AudioPlayerProvider');\n  }\n  return context;\n} "
  },
  {
    "path": "frontend/src/contexts/AuthContext.tsx",
    "content": "import { createContext, useCallback, useContext, useEffect, useMemo, useState } from 'react';\nimport type { ReactNode } from 'react';\nimport { authApi } from '../services/api';\nimport type { AuthUser } from '../types';\n\ntype AuthStatus = 'loading' | 'ready';\n\ninterface AuthContextValue {\n  status: AuthStatus;\n  requireAuth: boolean;\n  isAuthenticated: boolean;\n  user: AuthUser | null;\n  landingPageEnabled: boolean;\n  login: (username: string, password: string) => Promise<void>;\n  logout: () => void;\n  changePassword: (currentPassword: string, newPassword: string) => Promise<void>;\n  refreshUser: () => Promise<void>;\n}\n\nconst AuthContext = createContext<AuthContextValue | undefined>(undefined);\n\ninterface InternalState {\n  status: AuthStatus;\n  requireAuth: boolean;\n  user: AuthUser | null;\n  landingPageEnabled: boolean;\n}\n\nexport function AuthProvider({ children }: { children: ReactNode }) {\n  const [state, setState] = useState<InternalState>({\n    status: 'loading',\n    requireAuth: false,\n    user: null,\n    landingPageEnabled: false,\n  });\n\n  const bootstrapAuth = useCallback(async () => {\n    try {\n      const statusResponse = await authApi.getStatus();\n      const requireAuth = Boolean(statusResponse.require_auth);\n      const landingPageEnabled = Boolean(statusResponse.landing_page_enabled);\n\n      if (!requireAuth) {\n        setState({\n          status: 'ready',\n          requireAuth: false,\n          user: null,\n          landingPageEnabled,\n        });\n        return;\n      }\n\n      try {\n        const me = await authApi.getCurrentUser();\n        setState({\n          status: 'ready',\n          requireAuth: true,\n          user: me.user,\n          landingPageEnabled,\n        });\n      } catch (error) {\n        setState({\n          status: 'ready',\n          requireAuth: true,\n          user: null,\n          landingPageEnabled,\n        });\n      }\n    } catch (error) {\n      
console.error('Failed to initialize auth state', error);\n      setState({\n        status: 'ready',\n        requireAuth: false,\n        user: null,\n        landingPageEnabled: false,\n      });\n    }\n  }, []);\n\n  useEffect(() => {\n    void bootstrapAuth();\n  }, [bootstrapAuth]);\n\n  const login = useCallback(async (username: string, password: string) => {\n    const trimmedUsername = username.trim();\n    if (!trimmedUsername) {\n      throw new Error('Username is required.');\n    }\n\n    const response = await authApi.login(trimmedUsername, password);\n    setState((prev) => ({\n      status: 'ready',\n      requireAuth: true,\n      user: response.user,\n      landingPageEnabled: prev.landingPageEnabled,\n    }));\n  }, []);\n\n  const logout = useCallback(() => {\n    void authApi.logout().catch((error) => {\n      console.warn('Failed to log out cleanly', error);\n    });\n    setState((prev) => ({\n      status: 'ready',\n      requireAuth: prev.requireAuth,\n      user: prev.requireAuth ? 
null : prev.user,\n      landingPageEnabled: prev.landingPageEnabled,\n    }));\n  }, []);\n\n  const changePassword = useCallback(\n    async (currentPassword: string, newPassword: string) => {\n      await authApi.changePassword({\n        current_password: currentPassword,\n        new_password: newPassword,\n      });\n    },\n    [],\n  );\n\n  const refreshUser = useCallback(async () => {\n    if (!state.requireAuth) {\n      return;\n    }\n    try {\n      const me = await authApi.getCurrentUser();\n      setState((prev) => ({\n        ...prev,\n        user: me.user,\n      }));\n    } catch (error) {\n      console.warn('Session expired while refreshing user', error);\n      setState((prev) => ({\n        ...prev,\n        user: null,\n      }));\n    }\n  }, [state.requireAuth]);\n\n  const value = useMemo<AuthContextValue>(() => {\n    const isAuthenticated = !state.requireAuth || Boolean(state.user);\n    return {\n      status: state.status,\n      requireAuth: state.requireAuth,\n      isAuthenticated,\n      user: state.user,\n      landingPageEnabled: state.landingPageEnabled,\n      login,\n      logout,\n      changePassword,\n      refreshUser,\n    };\n  }, [changePassword, login, logout, refreshUser, state.requireAuth, state.status, state.user]);\n\n  return <AuthContext.Provider value={value}>{children}</AuthContext.Provider>;\n}\n\nexport const useAuth = (): AuthContextValue => {\n  const context = useContext(AuthContext);\n  if (!context) {\n    throw new Error('useAuth must be used within an AuthProvider');\n  }\n  return context;\n};\n"
  },
  {
    "path": "frontend/src/contexts/DiagnosticsContext.tsx",
    "content": "/* eslint-disable react-refresh/only-export-components */\n\nimport { createContext, useCallback, useContext, useEffect, useMemo, useRef, useState, type ReactNode } from 'react';\nimport { DIAGNOSTIC_ERROR_EVENT, diagnostics, type DiagnosticErrorPayload, type DiagnosticsEntry } from '../utils/diagnostics';\n\nexport type DiagnosticsContextValue = {\n  isOpen: boolean;\n  open: (payload?: DiagnosticErrorPayload) => void;\n  close: () => void;\n  clear: () => void;\n  getEntries: () => DiagnosticsEntry[];\n  currentError: DiagnosticErrorPayload | null;\n};\n\nconst DiagnosticsContext = createContext<DiagnosticsContextValue | null>(null);\n\nconst signatureFor = (payload: DiagnosticErrorPayload): string => {\n  const base = {\n    title: payload.title,\n    message: payload.message,\n    kind: payload.kind,\n  };\n  try {\n    return JSON.stringify(base);\n  } catch {\n    return `${payload.title}:${payload.message}`;\n  }\n};\n\nexport function DiagnosticsProvider({ children }: { children: ReactNode }) {\n  const [isOpen, setIsOpen] = useState(false);\n  const [currentError, setCurrentError] = useState<DiagnosticErrorPayload | null>(null);\n  const lastShownRef = useRef<{ sig: string; ts: number } | null>(null);\n\n  const open = useCallback((payload?: DiagnosticErrorPayload) => {\n    if (payload) {\n      setCurrentError(payload);\n    } else {\n      setCurrentError(null);\n    }\n    setIsOpen(true);\n  }, []);\n\n  const close = useCallback(() => {\n    setIsOpen(false);\n  }, []);\n\n  const clear = useCallback(() => {\n    diagnostics.clear();\n  }, []);\n\n  const getEntries = useCallback(() => diagnostics.getEntries(), []);\n\n  useEffect(() => {\n    const handler = (event: Event) => {\n      const detail = (event as CustomEvent).detail as DiagnosticErrorPayload | undefined;\n      if (!detail) return;\n\n      // Deduplicate noisy errors (same signature within 5s)\n      const sig = signatureFor(detail);\n      const now = Date.now();\n     
 const last = lastShownRef.current;\n      if (last && last.sig === sig && now - last.ts < 5000) {\n        return;\n      }\n      lastShownRef.current = { sig, ts: now };\n\n      setCurrentError(detail);\n      setIsOpen(true);\n    };\n\n    window.addEventListener(DIAGNOSTIC_ERROR_EVENT, handler as EventListener);\n    return () => window.removeEventListener(DIAGNOSTIC_ERROR_EVENT, handler as EventListener);\n  }, []);\n\n  const value = useMemo<DiagnosticsContextValue>(\n    () => ({\n      isOpen,\n      open,\n      close,\n      clear,\n      getEntries,\n      currentError,\n    }),\n    [close, clear, currentError, getEntries, isOpen, open]\n  );\n\n  return <DiagnosticsContext.Provider value={value}>{children}</DiagnosticsContext.Provider>;\n}\n\nexport const useDiagnostics = (): DiagnosticsContextValue => {\n  const ctx = useContext(DiagnosticsContext);\n  if (!ctx) {\n    throw new Error('useDiagnostics must be used within DiagnosticsProvider');\n  }\n  return ctx;\n};\n"
  },
  {
    "path": "frontend/src/hooks/useConfigState.ts",
    "content": "import { useCallback, useEffect, useMemo, useRef, useState } from 'react';\nimport { useMutation, useQuery } from '@tanstack/react-query';\nimport { configApi } from '../services/api';\nimport { toast } from 'react-hot-toast';\nimport type {\n  CombinedConfig,\n  ConfigResponse,\n  EnvOverrideEntry,\n  EnvOverrideMap,\n  LLMConfig,\n  WhisperConfig,\n} from '../types';\n\nconst DEFAULT_ENV_HINTS: Record<string, EnvOverrideEntry> = {\n  'groq.api_key': { env_var: 'GROQ_API_KEY' },\n  'llm.llm_api_key': { env_var: 'LLM_API_KEY' },\n  'llm.llm_model': { env_var: 'LLM_MODEL' },\n  'llm.openai_base_url': { env_var: 'OPENAI_BASE_URL' },\n  'whisper.whisper_type': { env_var: 'WHISPER_TYPE' },\n  'whisper.api_key': { env_var: 'WHISPER_REMOTE_API_KEY' },\n  'whisper.base_url': { env_var: 'WHISPER_REMOTE_BASE_URL' },\n  'whisper.model': { env_var: 'WHISPER_REMOTE_MODEL' },\n  'whisper.timeout_sec': { env_var: 'WHISPER_REMOTE_TIMEOUT_SEC' },\n  'whisper.chunksize_mb': { env_var: 'WHISPER_REMOTE_CHUNKSIZE_MB' },\n  'whisper.max_retries': { env_var: 'GROQ_MAX_RETRIES' },\n};\n\nconst getValueAtPath = (obj: unknown, path: string): unknown => {\n  if (!obj || typeof obj !== 'object') {\n    return undefined;\n  }\n  return path.split('.').reduce<unknown>((acc, key) => {\n    if (!acc || typeof acc !== 'object') {\n      return undefined;\n    }\n    return (acc as Record<string, unknown>)[key];\n  }, obj);\n};\n\nconst valuesDiffer = (a: unknown, b: unknown): boolean => {\n  if (a === b) {\n    return false;\n  }\n  const aEmpty = a === null || a === undefined || a === '';\n  const bEmpty = b === null || b === undefined || b === '';\n  if (aEmpty && bEmpty) {\n    return false;\n  }\n  return true;\n};\n\nexport interface ConnectionStatus {\n  status: 'loading' | 'ok' | 'error';\n  message: string;\n  error: string;\n}\n\nexport interface UseConfigStateReturn {\n  // Data\n  pending: CombinedConfig | null;\n  configData: CombinedConfig | undefined;\n  
envOverrides: EnvOverrideMap;\n  isLoading: boolean;\n\n  // Status\n  llmStatus: ConnectionStatus;\n  whisperStatus: ConnectionStatus;\n  hasEdits: boolean;\n  localWhisperAvailable: boolean | null;\n  isSaving: boolean;\n\n  // Actions\n  setField: (path: string[], value: unknown) => void;\n  updatePending: (\n    transform: (prevConfig: CombinedConfig) => CombinedConfig,\n    markDirty?: boolean\n  ) => void;\n  probeConnections: () => Promise<void>;\n  handleSave: () => void;\n  refetch: () => void;\n  setHasEdits: (value: boolean) => void;\n\n  // Helpers\n  getEnvHint: (path: string, fallback?: EnvOverrideEntry) => EnvOverrideEntry | undefined;\n  getWhisperApiKey: (w: WhisperConfig | undefined) => string;\n\n  // Recommended defaults\n  groqRecommendedModel: string;\n  groqRecommendedWhisper: string;\n\n  // Env warning modal\n  envWarningPaths: string[];\n  showEnvWarning: boolean;\n  handleConfirmEnvWarning: () => void;\n  handleDismissEnvWarning: () => void;\n\n  // Whisper type change handler\n  handleWhisperTypeChange: (nextType: 'local' | 'remote' | 'groq') => void;\n\n  // Groq quick setup mutation\n  applyGroqKey: (key: string) => Promise<void>;\n  isApplyingGroqKey: boolean;\n}\n\nexport function useConfigState(): UseConfigStateReturn {\n  const { data, isLoading, refetch } = useQuery<ConfigResponse>({\n    queryKey: ['config'],\n    queryFn: configApi.getConfig,\n    staleTime: Infinity,\n    refetchOnWindowFocus: false,\n    refetchOnReconnect: false,\n  });\n\n  const configData = data?.config;\n  const envOverrides = useMemo<EnvOverrideMap>(() => data?.env_overrides ?? {}, [data]);\n\n  const getEnvHint = useCallback(\n    (path: string, fallback?: EnvOverrideEntry) =>\n      envOverrides[path] ?? fallback ?? 
DEFAULT_ENV_HINTS[path],\n    [envOverrides]\n  );\n\n  const [pending, setPending] = useState<CombinedConfig | null>(null);\n  const [hasEdits, setHasEdits] = useState(false);\n  const [localWhisperAvailable, setLocalWhisperAvailable] = useState<boolean | null>(null);\n\n  // Connection statuses\n  const [llmStatus, setLlmStatus] = useState<ConnectionStatus>({\n    status: 'loading',\n    message: '',\n    error: '',\n  });\n  const [whisperStatus, setWhisperStatus] = useState<ConnectionStatus>({\n    status: 'loading',\n    message: '',\n    error: '',\n  });\n\n  // Env warning modal state\n  const [envWarningPaths, setEnvWarningPaths] = useState<string[]>([]);\n  const [showEnvWarning, setShowEnvWarning] = useState(false);\n\n  const initialProbeDone = useRef(false);\n  const groqRecommendedModel = useMemo(() => 'groq/openai/gpt-oss-120b', []);\n  const groqRecommendedWhisper = useMemo(() => 'whisper-large-v3-turbo', []);\n\n  const getWhisperApiKey = (w: WhisperConfig | undefined): string => {\n    if (!w) return '';\n    if (w.whisper_type === 'remote') return w.api_key ?? '';\n    if (w.whisper_type === 'groq') return w.api_key ?? 
'';\n    return '';\n  };\n\n  const updatePending = useCallback(\n    (transform: (prevConfig: CombinedConfig) => CombinedConfig, markDirty: boolean = true) => {\n      let updated = false;\n      setPending((prevConfig) => {\n        if (!prevConfig) {\n          return prevConfig;\n        }\n        const nextConfig = transform(prevConfig);\n        if (nextConfig === prevConfig) {\n          return prevConfig;\n        }\n        updated = true;\n        return nextConfig;\n      });\n\n      if (updated && markDirty) {\n        setHasEdits(true);\n      }\n    },\n    []\n  );\n\n  const setField = useCallback(\n    (path: string[], value: unknown) => {\n      updatePending((prevConfig) => {\n        const prevRecord = prevConfig as unknown as Record<string, unknown>;\n        const lastIndex = path.length - 1;\n\n        let existingParent: Record<string, unknown> | null = prevRecord;\n        for (let i = 0; i < lastIndex; i++) {\n          const key = path[i];\n          const rawNext: unknown = existingParent?.[key];\n          const nextParent: Record<string, unknown> | null =\n            rawNext && typeof rawNext === 'object'\n              ? (rawNext as Record<string, unknown>)\n              : null;\n          if (!nextParent) {\n            existingParent = null;\n            break;\n          }\n          existingParent = nextParent;\n        }\n\n        if (existingParent) {\n          const currentValue = existingParent[path[lastIndex]];\n          if (Object.is(currentValue, value)) {\n            return prevConfig;\n          }\n        }\n\n        const next: Record<string, unknown> = { ...prevRecord };\n\n        let cursor: Record<string, unknown> = next;\n        let sourceCursor: Record<string, unknown> = prevRecord;\n\n        for (let i = 0; i < lastIndex; i++) {\n          const key = path[i];\n          const currentSource = (sourceCursor?.[key] as Record<string, unknown>) ?? 
{};\n          const clonedChild: Record<string, unknown> = { ...currentSource };\n          cursor[key] = clonedChild;\n          cursor = clonedChild;\n          sourceCursor = currentSource;\n        }\n\n        cursor[path[lastIndex]] = value;\n\n        return next as unknown as CombinedConfig;\n      });\n    },\n    [updatePending]\n  );\n\n  // Initialize pending from config data\n  useEffect(() => {\n    if (!configData) {\n      return;\n    }\n    setPending((prev) => {\n      if (prev === null) {\n        return configData;\n      }\n      if (hasEdits) {\n        return prev;\n      }\n      return configData;\n    });\n  }, [configData, hasEdits]);\n\n  // Probe connections\n  const probeConnections = async () => {\n    if (!pending) return;\n    setLlmStatus({ status: 'loading', message: '', error: '' });\n    setWhisperStatus({ status: 'loading', message: '', error: '' });\n\n    try {\n      const [llmRes, whisperRes] = await Promise.all([\n        configApi.testLLM({ llm: pending.llm as LLMConfig }),\n        configApi.testWhisper({ whisper: pending.whisper as WhisperConfig }),\n      ]);\n\n      if (llmRes?.ok) {\n        setLlmStatus({\n          status: 'ok',\n          message: llmRes.message || 'LLM connection OK',\n          error: '',\n        });\n      } else {\n        setLlmStatus({\n          status: 'error',\n          message: '',\n          error: llmRes?.error || 'LLM connection failed',\n        });\n      }\n\n      if (whisperRes?.ok) {\n        setWhisperStatus({\n          status: 'ok',\n          message: whisperRes.message || 'Whisper connection OK',\n          error: '',\n        });\n      } else {\n        setWhisperStatus({\n          status: 'error',\n          message: '',\n          error: whisperRes?.error || 'Whisper test failed',\n        });\n      }\n    } catch (err: unknown) {\n      const e = err as {\n        response?: { data?: { error?: string; message?: string } };\n        message?: string;\n      };\n  
    const msg =\n        e?.response?.data?.error ||\n        e?.response?.data?.message ||\n        e?.message ||\n        'Connection test failed';\n      setLlmStatus({ status: 'error', message: '', error: msg });\n      setWhisperStatus({ status: 'error', message: '', error: msg });\n    }\n  };\n\n  // Initial probe\n  useEffect(() => {\n    if (!pending || initialProbeDone.current) return;\n    initialProbeDone.current = true;\n    void probeConnections();\n    // eslint-disable-next-line react-hooks/exhaustive-deps\n  }, [pending]);\n\n  // Probe whisper capabilities\n  useEffect(() => {\n    let cancelled = false;\n    configApi\n      .getWhisperCapabilities()\n      .then((res) => {\n        if (!cancelled) setLocalWhisperAvailable(!!res.local_available);\n      })\n      .catch(() => {\n        if (!cancelled) setLocalWhisperAvailable(false);\n      });\n    return () => {\n      cancelled = true;\n    };\n  }, []);\n\n  // If local is unavailable but selected, switch to safe default\n  useEffect(() => {\n    if (!pending || localWhisperAvailable !== false) return;\n    const currentType = pending.whisper.whisper_type;\n    if (currentType === 'local') {\n      setField(['whisper', 'whisper_type'], 'remote');\n    }\n  }, [localWhisperAvailable, pending, setField]);\n\n  // Save mutation\n  const saveMutation = useMutation({\n    mutationFn: async () => {\n      return configApi.updateConfig((pending ?? 
{}) as Partial<CombinedConfig>);\n    },\n    onSuccess: () => {\n      setHasEdits(false);\n      refetch();\n    },\n  });\n\n  const saveToastMessages = {\n    loading: 'Saving changes...',\n    success: 'Configuration saved',\n    error: (err: unknown) => {\n      if (typeof err === 'object' && err !== null) {\n        const e = err as {\n          response?: { data?: { error?: string; details?: string; message?: string } };\n          message?: string;\n        };\n        return (\n          e.response?.data?.message ||\n          e.response?.data?.error ||\n          e.response?.data?.details ||\n          e.message ||\n          'Failed to save configuration'\n        );\n      }\n      return 'Failed to save configuration';\n    },\n  } as const;\n\n  const getEnvManagedConflicts = (): string[] => {\n    if (!pending || !configData) {\n      return [];\n    }\n    return Object.keys(envOverrides).filter((path) => {\n      const baseline = getValueAtPath(configData, path);\n      const current = getValueAtPath(pending, path);\n      return valuesDiffer(current, baseline);\n    });\n  };\n\n  const triggerSaveMutation = () => {\n    toast.promise(saveMutation.mutateAsync(), saveToastMessages);\n  };\n\n  const handleSave = () => {\n    if (saveMutation.isPending) {\n      return;\n    }\n    const envConflicts = getEnvManagedConflicts();\n    if (envConflicts.length > 0) {\n      setEnvWarningPaths(envConflicts);\n      setShowEnvWarning(true);\n      return;\n    }\n    triggerSaveMutation();\n  };\n\n  const handleConfirmEnvWarning = () => {\n    setShowEnvWarning(false);\n    triggerSaveMutation();\n  };\n\n  const handleDismissEnvWarning = () => {\n    setShowEnvWarning(false);\n    setEnvWarningPaths([]);\n  };\n\n  // Whisper type change handler\n  const handleWhisperTypeChange = (nextType: 'local' | 'remote' | 'groq') => {\n    updatePending((prevConfig) => {\n      const prevWhisper = {\n        ...(prevConfig.whisper as unknown as Record<string, 
unknown>),\n      };\n      const prevModelRaw = (prevWhisper?.model as string | undefined) ?? '';\n      const prevModel = String(prevModelRaw).toLowerCase();\n\n      const isNonGroqDefault =\n        prevModel === 'base' || prevModel === 'base.en' || prevModel === 'whisper-1';\n      const isDeprecatedGroq = prevModel === 'distil-whisper-large-v3-en';\n\n      let nextModel: string | undefined = prevWhisper?.model as string | undefined;\n\n      if (nextType === 'groq') {\n        if (!nextModel || isNonGroqDefault || isDeprecatedGroq) {\n          nextModel = 'whisper-large-v3-turbo';\n        }\n      } else if (nextType === 'remote') {\n        if (!nextModel || prevModel === 'base' || prevModel === 'base.en') {\n          nextModel = 'whisper-1';\n        }\n      } else if (nextType === 'local') {\n        if (!nextModel || prevModel === 'whisper-1' || prevModel.startsWith('whisper-large')) {\n          nextModel = 'base.en';\n        }\n      }\n\n      const nextWhisper: Record<string, unknown> = {\n        ...prevWhisper,\n        whisper_type: nextType,\n      };\n\n      if (nextType === 'groq') {\n        nextWhisper.model = nextModel ?? 'whisper-large-v3-turbo';\n        nextWhisper.language = (prevWhisper.language as string | undefined) || 'en';\n        delete nextWhisper.base_url;\n        delete nextWhisper.timeout_sec;\n        delete nextWhisper.chunksize_mb;\n      } else if (nextType === 'remote') {\n        nextWhisper.model = nextModel ?? 'whisper-1';\n        nextWhisper.language = (prevWhisper.language as string | undefined) || 'en';\n      } else if (nextType === 'local') {\n        nextWhisper.model = nextModel ?? 
'base.en';\n        delete nextWhisper.api_key;\n      }\n\n      return {\n        ...prevConfig,\n        whisper: nextWhisper as unknown as WhisperConfig,\n      } as CombinedConfig;\n    });\n  };\n\n  // Groq key mutation\n  const applyGroqKeyMutation = useMutation({\n    mutationFn: async (key: string) => {\n      const next = {\n        llm: {\n          ...(pending?.llm as LLMConfig),\n          llm_api_key: key,\n          llm_model: groqRecommendedModel,\n        },\n        whisper: {\n          whisper_type: 'groq',\n          api_key: key,\n          model: groqRecommendedWhisper,\n          language: 'en',\n          max_retries: 3,\n        },\n      } as Partial<CombinedConfig>;\n\n      updatePending((prevConfig) => ({\n        ...prevConfig,\n        llm: next.llm as LLMConfig,\n        whisper: next.whisper as WhisperConfig,\n      }));\n\n      const [llmRes, whisperRes] = await Promise.all([\n        configApi.testLLM({ llm: next.llm as LLMConfig }),\n        configApi.testWhisper({ whisper: next.whisper as WhisperConfig }),\n      ]);\n      if (!llmRes?.ok) throw new Error(llmRes?.error || 'LLM test failed');\n      if (!whisperRes?.ok) throw new Error(whisperRes?.error || 'Whisper test failed');\n\n      return await configApi.updateConfig(next);\n    },\n    onSuccess: () => {\n      setHasEdits(false);\n      refetch();\n      toast.success('Groq key verified and saved. 
Defaults applied.');\n      setLlmStatus({ status: 'ok', message: 'LLM connection OK', error: '' });\n      setWhisperStatus({ status: 'ok', message: 'Whisper connection OK', error: '' });\n    },\n  });\n\n  const applyGroqKey = async (key: string) => {\n    await toast.promise(applyGroqKeyMutation.mutateAsync(key), {\n      loading: 'Verifying Groq key and applying defaults...',\n      success: 'Groq configured successfully',\n      error: (err: unknown) => {\n        const e = err as {\n          response?: { data?: { error?: string; message?: string } };\n          message?: string;\n        };\n        return (\n          e?.response?.data?.error ||\n          e?.response?.data?.message ||\n          e?.message ||\n          'Failed to configure Groq'\n        );\n      },\n    });\n  };\n\n  return {\n    // Data\n    pending,\n    configData,\n    envOverrides,\n    isLoading,\n\n    // Status\n    llmStatus,\n    whisperStatus,\n    hasEdits,\n    localWhisperAvailable,\n    isSaving: saveMutation.isPending,\n\n    // Actions\n    setField,\n    updatePending,\n    probeConnections,\n    handleSave,\n    refetch,\n    setHasEdits,\n\n    // Helpers\n    getEnvHint,\n    getWhisperApiKey,\n\n    // Recommended defaults\n    groqRecommendedModel,\n    groqRecommendedWhisper,\n\n    // Env warning modal\n    envWarningPaths,\n    showEnvWarning,\n    handleConfirmEnvWarning,\n    handleDismissEnvWarning,\n\n    // Whisper type change\n    handleWhisperTypeChange,\n\n    // Groq quick setup\n    applyGroqKey,\n    isApplyingGroqKey: applyGroqKeyMutation.isPending,\n  };\n}\n\nexport default useConfigState;\n"
  },
  {
    "path": "frontend/src/hooks/useEpisodeStatus.ts",
    "content": "import { useQuery, useQueryClient } from '@tanstack/react-query';\nimport { useEffect } from 'react';\nimport { feedsApi } from '../services/api';\n\nexport function useEpisodeStatus(episodeGuid: string, isWhitelisted: boolean, hasProcessedAudio: boolean, feedId?: number) {\n  const queryClient = useQueryClient();\n\n  const query = useQuery({\n    queryKey: ['episode-status', episodeGuid],\n    queryFn: () => feedsApi.getPostStatus(episodeGuid),\n    enabled: isWhitelisted && !hasProcessedAudio,\n    refetchOnWindowFocus: false,\n    refetchInterval: (query) => {\n      const status = query.state.data?.status;\n      if (status === 'pending' || status === 'running' || status === 'starting' || status === 'processing') {\n        return 3000;\n      }\n      return false;\n    },\n  });\n\n  useEffect(() => {\n    if (query.data?.status === 'completed' && feedId) {\n      // Invalidate episodes list to refresh UI (show Play button)\n      queryClient.invalidateQueries({ queryKey: ['episodes', feedId] });\n    }\n  }, [query.data?.status, feedId, queryClient]);\n\n  return query;\n}\n"
  },
  {
    "path": "frontend/src/index.css",
    "content": "@tailwind base;\n@tailwind components;\n@tailwind utilities;\n"
  },
  {
    "path": "frontend/src/main.tsx",
    "content": "import { StrictMode } from 'react'\nimport { createRoot } from 'react-dom/client'\nimport './index.css'\nimport './App.css'\nimport App from './App.tsx'\nimport { initFrontendDiagnostics } from './utils/diagnostics'\n\ninitFrontendDiagnostics()\n\ncreateRoot(document.getElementById('root')!).render(\n  <StrictMode>\n    <App />\n  </StrictMode>,\n)\n"
  },
  {
    "path": "frontend/src/pages/BillingPage.tsx",
    "content": "import { useEffect, useState } from 'react';\nimport { useQuery, useMutation } from '@tanstack/react-query';\nimport { billingApi } from '../services/api';\nimport { toast } from 'react-hot-toast';\nimport { useAuth } from '../contexts/AuthContext';\nimport { Navigate } from 'react-router-dom';\n\nexport default function BillingPage() {\n  const { user } = useAuth();\n  if (user?.role === 'admin') {\n    return <Navigate to=\"/\" replace />;\n  }\n  const { data, refetch, isLoading } = useQuery({\n    queryKey: ['billing', 'summary'],\n    queryFn: billingApi.getSummary,\n  });\n  \n  // Amount in dollars\n  const [amount, setAmount] = useState<number>(5);\n\n  useEffect(() => {\n    if (data?.current_amount) {\n      setAmount(data.current_amount / 100);\n    }\n  }, [data]);\n\n  const updateSubscription = useMutation({\n    mutationFn: (amt: number) =>\n      billingApi.updateSubscription(Math.round(amt * 100), {\n        subscriptionId: data?.stripe_subscription_id ?? 
null,\n      }),\n    onSuccess: (res) => {\n      if (res.checkout_url) {\n        window.location.href = res.checkout_url;\n        return;\n      }\n      toast.success('Plan updated');\n      if (res.current_amount) {\n          setAmount(res.current_amount / 100);\n      }\n      refetch();\n    },\n    onError: (err) => {\n      console.error('Failed to update plan', err);\n      toast.error('Could not update plan');\n    },\n  });\n\n  const portalSession = useMutation({\n    mutationFn: () => billingApi.createPortalSession(),\n    onSuccess: (res) => {\n      if (res.url) {\n        window.location.href = res.url;\n      }\n    },\n    onError: (err) => {\n      console.error('Failed to open billing portal', err);\n      toast.error('Unable to open billing portal');\n    },\n  });\n\n  if (isLoading || !data) {\n    return (\n      <div className=\"p-6\">\n        <div className=\"text-sm text-gray-600\">Loading billing…</div>\n      </div>\n    );\n  }\n\n  const isSubscribed = data.subscription_status === 'active' || data.subscription_status === 'trialing';\n  const currentAmountDollars = data.current_amount ? data.current_amount / 100 : 0;\n  const atCurrentAmount = amount === currentAmountDollars && isSubscribed;\n  const planLimitInfo = `${data.feeds_in_use}/${data.feed_allowance} feeds active`;\n  const minAmountCents = data.min_amount_cents ?? 
100;\n  const minAmountDollars = minAmountCents / 100;\n\n  return (\n    <div className=\"p-6 max-w-3xl mx-auto space-y-6\">\n      <div>\n        <h1 className=\"text-2xl font-bold text-gray-900\">Billing</h1>\n        <p className=\"text-sm text-gray-600 mt-1\">\n          Pay what you want for the Starter Bundle (10 feeds).\n        </p>\n      </div>\n\n      <div className=\"bg-white border border-gray-200 rounded-xl shadow-sm p-5 space-y-4\">\n        <div className=\"flex flex-wrap gap-3 items-center justify-between\">\n          <div>\n            <div className=\"text-sm text-gray-600\">Current plan</div>\n            <div className=\"text-lg font-semibold text-gray-900\">\n              {isSubscribed ? 'Starter Bundle (10 Feeds)' : 'Free Tier'}\n            </div>\n            <div className=\"text-xs text-gray-500\">\n              {planLimitInfo}\n            </div>\n          </div>\n          <div className=\"text-right\">\n            <div className=\"text-sm text-gray-600\">Monthly payment</div>\n            <div className=\"text-2xl font-bold text-gray-900\">\n                {isSubscribed ? `$${currentAmountDollars.toFixed(2)}` : '$0.00'}\n            </div>\n            <div className=\"text-xs text-gray-500\">\n              Subscription status: {data.subscription_status || 'inactive'}\n            </div>\n          </div>\n        </div>\n\n        <div className=\"space-y-3 pt-4 border-t border-gray-100\">\n          <div className=\"text-sm text-gray-700 font-medium\">\n            {isSubscribed ? 'Update your price' : 'Subscribe to Starter Bundle'}\n          </div>\n          <p className=\"text-sm text-gray-600\">\n            Get 10 feeds for a monthly price of your choice (min ${minAmountDollars.toFixed(2)}).\n          </p>\n          \n          <div className=\"text-xs text-amber-800 bg-amber-50 p-3 rounded-md border border-amber-200\">\n            <strong>Note:</strong> We suggest paying ~$1 per feed you use. 
If revenue doesn't cover server costs, we may have to shut down the service.\n          </div>\n          \n          <div className=\"flex flex-col sm:flex-row sm:items-center gap-3\">\n            <div className=\"relative rounded-md shadow-sm w-32\">\n              <div className=\"pointer-events-none absolute inset-y-0 left-0 flex items-center pl-3\">\n                <span className=\"text-gray-500 sm:text-sm\">$</span>\n              </div>\n              <input\n                type=\"number\"\n                min={minAmountDollars}\n                step={0.5}\n                value={amount}\n                onChange={(e) => setAmount(Math.max(0, Number(e.target.value)))}\n                className=\"block w-full rounded-md border-gray-300 pl-7 pr-3 py-2 focus:border-blue-500 focus:ring-blue-500 sm:text-sm border\"\n                placeholder=\"5.00\"\n              />\n            </div>\n            \n            <div className=\"flex items-center gap-2 text-xs text-gray-600\">\n              <span>Suggested:</span>\n              {[3, 5, 10, 15].map((preset) => (\n                <button\n                  key={preset}\n                  type=\"button\"\n                  onClick={() => setAmount(preset)}\n                  className={`px-2 py-1 rounded-md border text-xs transition-colors ${\n                    amount === preset\n                      ? 
'border-blue-200 bg-blue-50 text-blue-700'\n                      : 'border-gray-200 bg-white text-gray-700 hover:bg-gray-50'\n                  }`}\n                  disabled={updateSubscription.isPending}\n                >\n                  ${preset}\n                </button>\n              ))}\n            </div>\n          </div>\n          \n          <div className=\"flex flex-col sm:flex-row gap-2 sm:items-center pt-2\">\n            <button\n              onClick={() => updateSubscription.mutate(amount)}\n              disabled={updateSubscription.isPending || atCurrentAmount || amount < minAmountDollars}\n              className=\"px-4 py-2 rounded-md bg-blue-600 text-white text-sm font-medium hover:bg-blue-700 disabled:opacity-50 disabled:cursor-not-allowed\"\n            >\n              {updateSubscription.isPending \n                ? 'Processing…' \n                : isSubscribed \n                  ? (atCurrentAmount ? 'Current Price' : 'Update Price') \n                  : 'Subscribe'}\n            </button>\n            {amount < minAmountDollars && (\n                <span className=\"text-xs text-red-500\">Minimum amount is ${minAmountDollars.toFixed(2)}</span>\n            )}\n          </div>\n        </div>\n\n        <div className=\"flex flex-col sm:flex-row sm:items-center sm:justify-between gap-2 text-sm pt-4 border-t border-gray-100\">\n          <div className=\"text-gray-500 text-xs\">\n             Payments are securely processed by Stripe. You can cancel anytime.\n          </div>\n          <button\n            onClick={() => portalSession.mutate()}\n            disabled={portalSession.isPending || !data.stripe_customer_id}\n            className=\"inline-flex items-center justify-center px-3 py-2 rounded-md border border-gray-200 text-gray-700 hover:bg-gray-100 disabled:opacity-50 text-sm\"\n          >\n            {portalSession.isPending ? 
'Opening…' : 'Manage Billing'}\n          </button>\n        </div>\n      </div>\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/pages/ConfigPage.tsx",
    "content": "import ConfigTabs from '../components/config/ConfigTabs';\n\nexport default function ConfigPage() {\n  return <ConfigTabs />;\n}\n"
  },
  {
    "path": "frontend/src/pages/HomePage.tsx",
    "content": "import { useMutation, useQuery } from '@tanstack/react-query';\nimport { useEffect, useState } from 'react';\nimport { feedsApi, configApi, billingApi } from '../services/api';\nimport FeedList from '../components/FeedList';\nimport FeedDetail from '../components/FeedDetail';\nimport AddFeedForm from '../components/AddFeedForm';\nimport type { Feed, ConfigResponse } from '../types';\nimport { toast } from 'react-hot-toast';\nimport { useAuth } from '../contexts/AuthContext';\nimport { useNavigate } from 'react-router-dom';\nimport { copyToClipboard } from '../utils/clipboard';\nimport { emitDiagnosticError } from '../utils/diagnostics';\nimport { getHttpErrorInfo } from '../utils/httpError';\n\nexport default function HomePage() {\n  const navigate = useNavigate();\n  const [showAddForm, setShowAddForm] = useState(false);\n  const [selectedFeed, setSelectedFeed] = useState<Feed | null>(null);\n  const { requireAuth, user } = useAuth();\n\n  const { data: feeds, isLoading, error, refetch } = useQuery({\n    queryKey: ['feeds'],\n    queryFn: feedsApi.getFeeds,\n  });\n\n  const { data: billingSummary, refetch: refetchBilling } = useQuery({\n    queryKey: ['billing', 'summary'],\n    queryFn: billingApi.getSummary,\n    enabled: requireAuth && !!user,\n  });\n\n  useQuery<ConfigResponse>({\n    queryKey: ['config'],\n    queryFn: configApi.getConfig,\n    enabled: !requireAuth || user?.role === 'admin',\n  });\n  const canRefreshAll = !requireAuth || user?.role === 'admin';\n  const refreshAllMutation = useMutation({\n    mutationFn: () => feedsApi.refreshAllFeeds(),\n    onSuccess: (data) => {\n      toast.success(\n        `Refreshed ${data.feeds_refreshed} feeds and enqueued ${data.jobs_enqueued} jobs`\n      );\n      refetch();\n    },\n    onError: (err) => {\n      console.error('Failed to refresh all feeds', err);\n      const { status, data, message } = getHttpErrorInfo(err);\n      emitDiagnosticError({\n        title: 'Failed to refresh all 
feeds',\n        message,\n        kind: status ? 'http' : 'network',\n        details: {\n          status,\n          response: data,\n        },\n      });\n    },\n  });\n\n  useEffect(() => {\n    if (!showAddForm || typeof document === 'undefined') {\n      return;\n    }\n\n    const originalOverflow = document.body.style.overflow;\n    document.body.style.overflow = 'hidden';\n    return () => {\n      document.body.style.overflow = originalOverflow;\n    };\n  }, [showAddForm]);\n\n  if (isLoading) {\n    return (\n      <div className=\"flex justify-center items-center h-64\">\n        <div className=\"animate-spin rounded-full h-8 w-8 border-b-2 border-blue-600\"></div>\n      </div>\n    );\n  }\n\n  if (error) {\n    return (\n      <div className=\"bg-red-50 border border-red-200 rounded-md p-4\">\n        <p className=\"text-red-800\">Error loading feeds. Please try again.</p>\n      </div>\n    );\n  }\n\n  const planLimitReached =\n    !!billingSummary &&\n    billingSummary.feeds_in_use >= billingSummary.feed_allowance &&\n    user?.role !== 'admin';\n\n  const handleChangePlan = () => {\n    navigate('/billing');\n  };\n\n\n  const handleCopyAggregateLink = async () => {\n    try {\n      const { url } = await feedsApi.getAggregateFeedLink();\n      await copyToClipboard(url, 'Copy the Aggregate RSS URL:', 'Aggregate feed URL copied to clipboard!');\n    } catch (err) {\n      console.error('Failed to get aggregate link', err);\n      toast.error('Failed to get aggregate feed link');\n    }\n  };\n\n  return (\n    <div className=\"h-full flex flex-col lg:flex-row gap-6\">\n      {/* Left Panel - Feed List (hidden on mobile when feed is selected) */}\n      <div className={`flex-1 lg:max-w-md xl:max-w-lg flex flex-col ${\n        selectedFeed ? 
'hidden lg:flex' : 'flex'\n      }`}>\n        <div className=\"flex justify-between items-center mb-6 gap-3\">\n          <h2 className=\"text-2xl font-bold text-gray-900\">Podcast Feeds</h2>\n          <div className=\"flex items-center gap-2\">\n            {canRefreshAll && (\n              <button\n                onClick={() => refreshAllMutation.mutate()}\n                disabled={refreshAllMutation.isPending}\n                title=\"Refresh all feeds\"\n                className={`flex items-center justify-center px-3 py-2 rounded-md border transition-colors ${\n                  refreshAllMutation.isPending\n                    ? 'border-gray-200 text-gray-400 cursor-not-allowed'\n                    : 'border-gray-200 text-gray-600 hover:bg-gray-100'\n                }`}\n              >\n                <img\n                  src=\"/reload-icon.svg\"\n                  alt=\"Refresh all\"\n                  className={`w-4 h-4 ${refreshAllMutation.isPending ? 'animate-spin' : ''}`}\n                />\n              </button>\n            )}\n            <button\n              onClick={handleCopyAggregateLink}\n              className=\"flex items-center justify-center px-3 py-2 rounded-md border border-gray-200 text-gray-600 hover:bg-gray-100 transition-colors\"\n              title=\"Copy your aggregate feed URL (last 3 episodes from each feed)\"\n            >\n              <svg className=\"w-4 h-4\" fill=\"none\" stroke=\"currentColor\" viewBox=\"0 0 24 24\">\n                <path strokeLinecap=\"round\" strokeLinejoin=\"round\" strokeWidth={2} d=\"M13.828 10.172a4 4 0 00-5.656 0l-4 4a4 4 0 105.656 5.656l1.102-1.101m-.758-4.899a4 4 0 005.656 0l4-4a4 4 0 00-5.656-5.656l-1.1 1.1\" />\n              </svg>\n            </button>\n            <button\n              onClick={() => {\n                if (planLimitReached) {\n                  navigate('/billing');\n                } else {\n                  setShowAddForm((prev) => !prev);\n          
      }\n              }}\n              className={`px-4 py-2 rounded-md font-medium transition-colors ${\n                planLimitReached\n                  ? 'bg-amber-600 hover:bg-amber-700 text-white'\n                  : 'bg-blue-600 hover:bg-blue-700 text-white'\n              }`}\n              title={planLimitReached ? 'Your plan is full. Click to upgrade.' : undefined}\n            >\n              {planLimitReached ? 'Plan full' : showAddForm ? 'Close' : 'Add Feed'}\n            </button>\n          </div>\n        </div>\n\n        <div className=\"flex-1 min-h-0 overflow-hidden\">\n          <FeedList \n            feeds={feeds || []} \n            onFeedDeleted={refetch}\n            onFeedSelected={setSelectedFeed}\n            selectedFeedId={selectedFeed?.id}\n          />\n        </div>\n      </div>\n\n      {/* Right Panel - Feed Detail */}\n      {selectedFeed && (\n        <div className={`flex-1 lg:flex-[2] ${\n          selectedFeed ? 'flex' : 'hidden lg:flex'\n        } flex-col bg-white rounded-lg shadow border overflow-hidden`}>\n          <FeedDetail \n            feed={selectedFeed} \n            onClose={() => setSelectedFeed(null)}\n            onFeedDeleted={() => {\n              setSelectedFeed(null);\n              refetch();\n            }}\n          />\n        </div>\n      )}\n\n      {/* Empty State for Desktop */}\n      {!selectedFeed && (\n        <div className=\"hidden lg:flex flex-[2] items-center justify-center bg-gray-50 rounded-lg border-2 border-dashed border-gray-300\">\n          <div className=\"text-center\">\n            <svg className=\"mx-auto h-12 w-12 text-gray-400\" fill=\"none\" stroke=\"currentColor\" viewBox=\"0 0 24 24\">\n              <path strokeLinecap=\"round\" strokeLinejoin=\"round\" strokeWidth={2} d=\"M9 19V6l12-3v13M9 19c0 1.105-1.343 2-3 2s-3-.895-3-2 1.343-2 3-2 3 .895 3 2zm12-3c0 1.105-1.343 2-3 2s-3-.895-3-2 1.343-2 3-2 3 .895 3 2zM9 10l12-3\" />\n            </svg>\n            <h3 
className=\"mt-2 text-sm font-medium text-gray-900\">No podcast selected</h3>\n            <p className=\"mt-1 text-sm text-gray-500\">Select a podcast from the list to view details and episodes.</p>\n          </div>\n        </div>\n      )}\n\n      {showAddForm && (\n        <div\n          className=\"fixed inset-0 z-50 flex items-start sm:items-center justify-center bg-black/60 backdrop-blur-sm p-4 sm:p-6\"\n          onClick={() => setShowAddForm(false)}\n        >\n          <div\n            className=\"w-full max-w-3xl bg-white rounded-2xl shadow-2xl border border-gray-200 flex flex-col max-h-[90vh]\"\n            onClick={(event) => event.stopPropagation()}\n          >\n            <div className=\"flex items-center justify-between border-b border-gray-200 px-4 sm:px-6 py-4\">\n              <div>\n                <h2 className=\"text-xl sm:text-2xl font-semibold text-gray-900\">Add a Podcast Feed</h2>\n                <p className=\"text-sm text-gray-500 mt-1\">\n                  Paste an RSS URL or search the catalog to find shows to follow.\n                </p>\n              </div>\n              <button\n                onClick={() => setShowAddForm(false)}\n                className=\"p-2 text-gray-400 hover:text-gray-600 rounded-lg hover:bg-gray-100 transition-colors\"\n                aria-label=\"Close add feed modal\"\n              >\n                <svg className=\"w-6 h-6\" fill=\"none\" stroke=\"currentColor\" viewBox=\"0 0 24 24\">\n                  <path strokeLinecap=\"round\" strokeLinejoin=\"round\" strokeWidth={2} d=\"M6 18L18 6M6 6l12 12\" />\n                </svg>\n              </button>\n            </div>\n\n            <div className=\"overflow-y-auto px-4 sm:px-6 py-4\">\n              <AddFeedForm\n                onSuccess={() => {\n                  setShowAddForm(false);\n                  refetch();\n                  refetchBilling();\n                }}\n                onUpgradePlan={handleChangePlan}\n            
    planLimitReached={planLimitReached}\n              />\n            </div>\n          </div>\n        </div>\n      )}\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/pages/JobsPage.tsx",
    "content": "import { useCallback, useEffect, useRef, useState } from 'react';\nimport { jobsApi } from '../services/api';\nimport type { CleanupPreview, Job, JobManagerRun, JobManagerStatus } from '../types';\n\nfunction getStatusColor(status: string) {\n  switch (status) {\n    case 'running':\n      return 'bg-green-100 text-green-800';\n    case 'pending':\n      return 'bg-yellow-100 text-yellow-800';\n    case 'failed':\n      return 'bg-red-100 text-red-800';\n    case 'completed':\n      return 'bg-blue-100 text-blue-800';\n    case 'skipped':\n      return 'bg-purple-100 text-purple-800';\n    case 'cancelled':\n      return 'bg-gray-100 text-gray-800';\n    default:\n      return 'bg-gray-100 text-gray-800';\n  }\n}\n\nfunction StatusBadge({ status }: { status: string }) {\n  const color = getStatusColor(status);\n  return (\n    <span className={`inline-flex items-center px-2 py-0.5 rounded text-xs font-medium ${color}`}>\n      {status}\n    </span>\n  );\n}\n\nfunction ProgressBar({ value }: { value: number }) {\n  const clamped = Math.max(0, Math.min(100, Math.round(value)));\n  return (\n    <div className=\"w-full bg-gray-200 rounded h-2\">\n      <div\n        className=\"bg-indigo-600 h-2 rounded\"\n        style={{ width: `${clamped}%` }}\n      />\n    </div>\n  );\n}\n\nfunction RunStat({ label, value }: { label: string; value: number }) {\n  return (\n    <div>\n      <div className=\"text-xs uppercase tracking-wide text-gray-500\">{label}</div>\n      <div className=\"mt-1 text-lg font-semibold text-gray-900\">{value}</div>\n    </div>\n  );\n}\n\nfunction formatDateTime(value: string | null): string {\n  if (!value) {\n    return '—';\n  }\n  try {\n    return new Date(value).toLocaleString();\n  } catch (err) {\n    console.error('Failed to format date', err);\n    return value;\n  }\n}\n\nexport default function JobsPage() {\n  const [jobs, setJobs] = useState<Job[]>([]);\n  const [managerStatus, setManagerStatus] = 
useState<JobManagerStatus | null>(null);\n  const [statusError, setStatusError] = useState<string | null>(null);\n  const [loading, setLoading] = useState(false);\n  const [error, setError] = useState<string | null>(null);\n  const [mode, setMode] = useState<'active' | 'all'>('active');\n  const [cancellingJobs, setCancellingJobs] = useState<Set<string>>(new Set());\n  const previousHasActiveWork = useRef<boolean>(false);\n  const [cleanupPreview, setCleanupPreview] = useState<CleanupPreview | null>(null);\n  const [cleanupLoading, setCleanupLoading] = useState(false);\n  const [cleanupError, setCleanupError] = useState<string | null>(null);\n  const [cleanupRunning, setCleanupRunning] = useState(false);\n  const [cleanupMessage, setCleanupMessage] = useState<string | null>(null);\n\n  const loadStatus = useCallback(async () => {\n    try {\n      const data = await jobsApi.getJobManagerStatus();\n      setManagerStatus(data);\n      setStatusError(null);\n    } catch (e) {\n      console.error('Failed to load job manager status:', e);\n      setStatusError('Failed to load manager status');\n    }\n  }, []);\n\n  const loadActive = useCallback(async () => {\n    setLoading(true);\n    setError(null);\n    try {\n      const data = await jobsApi.getActiveJobs(100);\n      setJobs(data);\n    } catch (e) {\n      console.error('Failed to load active jobs:', e);\n      setError('Failed to load jobs');\n    } finally {\n      setLoading(false);\n    }\n  }, []);\n\n  const loadAll = useCallback(async () => {\n    setLoading(true);\n    setError(null);\n    try {\n      const data = await jobsApi.getAllJobs(200);\n      setJobs(data);\n    } catch (e) {\n      console.error('Failed to load all jobs:', e);\n      setError('Failed to load jobs');\n    } finally {\n      setLoading(false);\n    }\n  }, []);\n\n  const loadCleanupPreview = useCallback(async () => {\n    setCleanupLoading(true);\n    try {\n      const data = await jobsApi.getCleanupPreview();\n      
setCleanupPreview(data);\n      setCleanupError(null);\n    } catch (e) {\n      console.error('Failed to load cleanup preview:', e);\n      setCleanupError('Failed to load cleanup preview');\n    } finally {\n      setCleanupLoading(false);\n    }\n  }, []);\n\n  const refresh = useCallback(async () => {\n    await loadStatus();\n    if (mode === 'active') {\n      await loadActive();\n    } else {\n      await loadAll();\n    }\n    await loadCleanupPreview();\n  }, [mode, loadActive, loadAll, loadStatus, loadCleanupPreview]);\n\n  const cancelJob = useCallback(\n    async (jobId: string) => {\n      setCancellingJobs(prev => new Set(prev).add(jobId));\n      try {\n        await jobsApi.cancelJob(jobId);\n        await refresh();\n      } catch (e) {\n        setError(`Failed to cancel job: ${e instanceof Error ? e.message : 'Unknown error'}`);\n      } finally {\n        setCancellingJobs(prev => {\n          const newSet = new Set(prev);\n          newSet.delete(jobId);\n          return newSet;\n        });\n      }\n    },\n    [refresh]\n  );\n\n  const runCleanupNow = useCallback(async () => {\n    setCleanupRunning(true);\n    setCleanupError(null);\n    setCleanupMessage(null);\n    try {\n      const result = await jobsApi.runCleanupJob();\n      if (result.status === 'disabled') {\n        setCleanupMessage(result.message ?? 'Cleanup is disabled.');\n        return;\n      }\n      if (result.status !== 'ok') {\n        setCleanupError(result.message ?? 'Cleanup job failed');\n        return;\n      }\n      const removed = result.removed_posts ?? 0;\n      const remaining = result.remaining_candidates ?? 0;\n      const removedText = `Cleanup removed ${removed} episode${removed === 1 ? '' : 's'}.`;\n      const remainingText =\n        remaining > 0\n          ? ` ${remaining} episode${remaining === 1 ? 
'' : 's'} still eligible.`\n          : '';\n      setCleanupMessage(`${removedText}${remainingText}`);\n      await refresh();\n    } catch (e) {\n      console.error('Failed to run cleanup job:', e);\n      setCleanupError('Failed to run cleanup job');\n    } finally {\n      setCleanupRunning(false);\n    }\n  }, [refresh]);\n\n  useEffect(() => {\n    void loadStatus();\n    void loadActive();\n    void loadCleanupPreview();\n  }, [loadActive, loadStatus, loadCleanupPreview]);\n\n  useEffect(() => {\n    const queued = managerStatus?.run?.queued_jobs ?? 0;\n    const running = managerStatus?.run?.running_jobs ?? 0;\n    const hasActiveWork = queued + running > 0;\n    if (!hasActiveWork) {\n      return undefined;\n    }\n\n    // Poll every 15 seconds when jobs are active to reduce database contention\n    const interval = setInterval(() => {\n      void loadStatus();\n    }, 15000);\n\n    return () => clearInterval(interval);\n  }, [managerStatus?.run?.queued_jobs, managerStatus?.run?.running_jobs, loadStatus]);\n\n  useEffect(() => {\n    const queued = managerStatus?.run?.queued_jobs ?? 0;\n    const running = managerStatus?.run?.running_jobs ?? 0;\n    const hasActiveWork = queued + running > 0;\n    if (!hasActiveWork && previousHasActiveWork.current) {\n      void refresh();\n    }\n    previousHasActiveWork.current = hasActiveWork;\n  }, [managerStatus?.run?.queued_jobs, managerStatus?.run?.running_jobs, refresh]);\n\n  const run: JobManagerRun | null = managerStatus?.run ?? null;\n  const hasActiveWork = run ? run.queued_jobs + run.running_jobs > 0 : false;\n  const retentionDays = cleanupPreview?.retention_days ?? null;\n  const cleanupDisabled = retentionDays === null || retentionDays <= 0;\n  const cleanupEligibleCount = cleanupPreview?.count ?? 
0;\n\n  return (\n    <div className=\"space-y-4\">\n      <div className=\"rounded border border-gray-200 bg-white p-4 shadow-sm\">\n        <div className=\"flex flex-wrap items-center justify-between gap-3\">\n          <div>\n            <h2 className=\"text-base font-semibold text-gray-900\">Jobs Manager</h2>\n            <p className=\"text-xs text-gray-600\">\n              {run\n                ? hasActiveWork\n                  ? `Processing · Last update ${formatDateTime(run.updated_at)}`\n                  : `Idle · Last activity ${formatDateTime(run.updated_at)}`\n                : 'Jobs Manager has not started yet.'}\n            </p>\n          </div>\n          {run ? (\n            <StatusBadge status={run.status} />\n          ) : (\n            <span className=\"inline-flex items-center rounded px-2 py-0.5 text-xs font-medium bg-gray-100 text-gray-800\">\n              idle\n            </span>\n          )}\n        </div>\n\n        {statusError && (\n          <div className=\"mt-2 text-xs text-red-600\">{statusError}</div>\n        )}\n\n        {run ? 
(\n          <>\n            <div className=\"mt-4 grid grid-cols-2 gap-3 sm:grid-cols-5\">\n              <RunStat label=\"Queued\" value={run.queued_jobs} />\n              <RunStat label=\"Running\" value={run.running_jobs} />\n              <RunStat label=\"Completed\" value={run.completed_jobs} />\n              <RunStat label=\"Skipped\" value={run.skipped_jobs} />\n              <RunStat label=\"Failed\" value={run.failed_jobs} />\n            </div>\n            <div className=\"mt-4 space-y-1\">\n              <ProgressBar value={run.progress_percentage} />\n              <div className=\"text-xs text-gray-500\">\n                {run.completed_jobs} completed · {run.skipped_jobs} skipped · {run.failed_jobs} failed of {run.total_jobs} jobs\n              </div>\n            </div>\n            <div className=\"mt-3 text-xs text-gray-500\">\n              Trigger: <span className=\"font-medium text-gray-700\">{run.trigger}</span>\n            </div>\n            {run.counters_reset_at ? (\n              <div className=\"mt-1 text-xs text-gray-500\">\n                Stats since {formatDateTime(run.counters_reset_at)}\n              </div>\n            ) : null}\n          </>\n        ) : null}\n      </div>\n\n      <div className=\"rounded border border-gray-200 bg-white p-4 shadow-sm\">\n        <div className=\"flex flex-wrap items-start justify-between gap-3\">\n          <div>\n            <h3 className=\"text-base font-semibold text-gray-900\">Post Cleanup</h3>\n            <p className=\"text-xs text-gray-600\">\n              {cleanupDisabled\n                ? 'Cleanup is disabled while retention days are unset or zero.'\n                : `Episodes older than ${retentionDays} day${retentionDays === 1 ? 
'' : 's'} will be removed.`}\n            </p>\n          </div>\n          <div className=\"text-right\">\n            <div className=\"text-xs uppercase tracking-wide text-gray-500\">Eligible</div>\n            <div className=\"text-lg font-semibold text-gray-900\">\n              {cleanupLoading ? '…' : cleanupEligibleCount}\n            </div>\n          </div>\n        </div>\n\n        {cleanupError && (\n          <div className=\"mt-2 text-xs text-red-600\">{cleanupError}</div>\n        )}\n        {cleanupMessage && (\n          <div className=\"mt-2 text-xs text-green-700\">{cleanupMessage}</div>\n        )}\n\n        <div className=\"mt-4 grid grid-cols-1 gap-3 sm:grid-cols-3\">\n          <div>\n            <div className=\"text-xs uppercase tracking-wide text-gray-500\">Retention</div>\n            <div className=\"text-sm font-medium text-gray-900\">\n              {cleanupDisabled ? 'Disabled' : `${retentionDays} day${retentionDays === 1 ? '' : 's'}`}\n            </div>\n          </div>\n          <div>\n            <div className=\"text-xs uppercase tracking-wide text-gray-500\">Eligible episodes</div>\n            <div className=\"text-sm font-medium text-gray-900\">\n              {cleanupLoading ? 'Loading…' : cleanupEligibleCount}\n            </div>\n          </div>\n          <div>\n            <div className=\"text-xs uppercase tracking-wide text-gray-500\">Cutoff date</div>\n            <div className=\"text-sm font-medium text-gray-900\">\n              {cleanupPreview?.cutoff_utc ? 
formatDateTime(cleanupPreview.cutoff_utc) : '—'}\n            </div>\n          </div>\n        </div>\n\n        <div className=\"mt-4 flex flex-wrap items-center justify-between gap-3\">\n          <div className=\"text-xs text-gray-500\">\n            Includes completed jobs and non-whitelisted episodes with release dates older than the retention window.\n          </div>\n          <button\n            onClick={() => { void runCleanupNow(); }}\n            disabled={cleanupRunning || cleanupDisabled || cleanupLoading}\n            className=\"inline-flex items-center rounded-md bg-indigo-600 px-3 py-1.5 text-sm font-medium text-white hover:bg-indigo-700 focus:outline-none focus:ring-2 focus:ring-indigo-500 disabled:bg-gray-300 disabled:text-gray-500 disabled:cursor-not-allowed\"\n          >\n            {cleanupRunning ? 'Running cleanup…' : 'Run cleanup now'}\n          </button>\n        </div>\n      </div>\n\n      <div className=\"flex items-center justify-between\">\n        <div>\n          <h3 className=\"text-xl font-semibold text-gray-900\">{mode === 'active' ? 'Active Jobs' : 'All Jobs'}</h3>\n          <p className=\"text-sm text-gray-600\">\n            {mode === 'active'\n              ? 'Queued and running jobs, ordered by priority.'\n              : 'All jobs ordered by priority (running/pending first).'}\n          </p>\n        </div>\n        <div className=\"flex items-center gap-2\">\n          <button\n            onClick={() => { void refresh(); }}\n            className=\"inline-flex items-center rounded-md bg-indigo-600 px-3 py-1.5 text-sm font-medium text-white hover:bg-indigo-700 focus:outline-none focus:ring-2 focus:ring-indigo-500\"\n            disabled={loading}\n          >\n            {loading ? 'Refreshing…' : 'Refresh'}\n          </button>\n          {mode === 'active' ? 
(\n            <button\n              onClick={async () => { setMode('all'); await loadStatus(); await loadAll(); await loadCleanupPreview(); }}\n              className=\"inline-flex items-center rounded-md bg-gray-200 px-3 py-1.5 text-sm font-medium text-gray-800 hover:bg-gray-300 focus:outline-none focus:ring-2 focus:ring-gray-400\"\n              disabled={loading}\n            >\n              Load all jobs\n            </button>\n          ) : (\n            <button\n              onClick={async () => { setMode('active'); await loadStatus(); await loadActive(); await loadCleanupPreview(); }}\n              className=\"inline-flex items-center rounded-md bg-gray-200 px-3 py-1.5 text-sm font-medium text-gray-800 hover:bg-gray-300 focus:outline-none focus:ring-2 focus:ring-gray-400\"\n              disabled={loading}\n            >\n              Show active only\n            </button>\n          )}\n        </div>\n      </div>\n\n      {error && (\n        <div className=\"rounded border border-red-200 bg-red-50 p-3 text-sm text-red-800\">{error}</div>\n      )}\n\n      {jobs.length === 0 && !loading ? 
(\n        <div className=\"text-sm text-gray-600\">No jobs to display.</div>\n      ) : null}\n\n      <div className=\"grid grid-cols-1 sm:grid-cols-2 lg:grid-cols-3 gap-4\">\n        {jobs.map((job) => (\n          <div key={job.job_id} className=\"bg-white border rounded shadow-sm p-4 space-y-3\">\n            <div className=\"flex items-center justify-between\">\n              <div className=\"text-sm font-medium text-gray-900 truncate\">\n                {job.post_title || 'Untitled episode'}\n              </div>\n              <StatusBadge status={job.status} />\n            </div>\n            <div className=\"text-xs text-gray-600 truncate\">{job.feed_title || 'Unknown feed'}</div>\n\n            <div className=\"space-y-2\">\n              <div className=\"flex items-center justify-between text-xs text-gray-700\">\n                <span>Priority</span>\n                <span className=\"font-medium\">{job.priority}</span>\n              </div>\n              <div className=\"flex items-center justify-between text-xs text-gray-700\">\n                <span>Step</span>\n                <span className=\"font-medium\">{job.step}/{job.total_steps} {job.step_name ? 
`· ${job.step_name}` : ''}</span>\n              </div>\n              <div className=\"space-y-1\">\n                <div className=\"flex items-center justify-between text-xs text-gray-700\">\n                  <span>Progress</span>\n                  <span className=\"font-medium\">{Math.round(job.progress_percentage)}%</span>\n                </div>\n                <ProgressBar value={job.progress_percentage} />\n              </div>\n            </div>\n\n            <div className=\"grid grid-cols-2 gap-2 text-xs text-gray-600\">\n              <div>\n                <div className=\"text-gray-500\">Job ID</div>\n                <div className=\"truncate\" title={job.job_id}>{job.job_id}</div>\n              </div>\n              <div>\n                <div className=\"text-gray-500\">Post GUID</div>\n                <div className=\"truncate\" title={job.post_guid}>{job.post_guid}</div>\n              </div>\n              <div>\n                <div className=\"text-gray-500\">Created</div>\n                <div>{job.created_at ? formatDateTime(job.created_at) : '—'}</div>\n              </div>\n              <div>\n                <div className=\"text-gray-500\">Started</div>\n                <div>{job.started_at ? formatDateTime(job.started_at) : '—'}</div>\n              </div>\n              {job.error_message ? 
(\n                <div className=\"col-span-2\">\n                  <div className=\"text-gray-500\">Message</div>\n                  <div className=\"text-red-700 truncate\" title={job.error_message}>{job.error_message}</div>\n                </div>\n              ) : null}\n            </div>\n\n            {(job.status === 'pending' || job.status === 'running') && (\n              <div className=\"mt-3 pt-3 border-t border-gray-200\">\n                <button\n                  onClick={() => { void cancelJob(job.job_id); }}\n                  disabled={cancellingJobs.has(job.job_id)}\n                  className=\"w-full inline-flex items-center justify-center rounded-md bg-red-600 px-3 py-2 text-sm font-medium text-white hover:bg-red-700 focus:outline-none focus:ring-2 focus:ring-red-500 disabled:bg-gray-400 disabled:cursor-not-allowed\"\n                >\n                  {cancellingJobs.has(job.job_id) ? 'Cancelling...' : 'Cancel Job'}\n                </button>\n              </div>\n            )}\n          </div>\n        ))}\n      </div>\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/pages/LandingPage.tsx",
    "content": "import { Link } from 'react-router-dom';\nimport { useQuery } from '@tanstack/react-query';\nimport { landingApi } from '../services/api';\n\nexport default function LandingPage() {\n  const { data: status } = useQuery({\n    queryKey: ['landing-status'],\n    queryFn: landingApi.getStatus,\n    refetchInterval: 30000, // refresh every 30s\n  });\n\n  const userCount = status?.user_count ?? 0;\n  const userLimit = status?.user_limit_total;\n  const slotsRemaining = status?.slots_remaining;\n\n  return (\n    <div className=\"min-h-screen bg-gradient-to-b from-gray-50 to-white overflow-y-auto overflow-x-hidden fixed inset-0\">\n      {/* Header */}\n      <header className=\"fixed top-0 left-0 right-0 bg-white/80 backdrop-blur-md border-b border-gray-100 z-50\">\n        <div className=\"max-w-6xl mx-auto px-4 sm:px-6 lg:px-8\">\n          <div className=\"flex items-center justify-between h-16\">\n            <div className=\"flex items-center gap-2\">\n              <img src=\"/images/logos/logo.webp\" alt=\"Podly\" className=\"h-8 w-auto\" />\n              <span className=\"text-xl font-bold text-gray-900\">Podly</span>\n            </div>\n            <nav className=\"hidden md:flex items-center gap-8\">\n              <a href=\"#how-it-works\" className=\"text-sm font-medium text-gray-600 hover:text-gray-900 transition-colors\">\n                How it works\n              </a>\n              <a\n                href=\"https://github.com/podly-pure-podcasts/podly-pure-podcasts\"\n                target=\"_blank\"\n                rel=\"noreferrer\"\n                className=\"text-sm font-medium text-gray-600 hover:text-gray-900 transition-colors\"\n              >\n                GitHub\n              </a>\n            </nav>\n            <Link to=\"/login\" className=\"bg-blue-600 hover:bg-blue-700 text-white px-5 py-2 rounded-lg font-medium transition-colors shadow-sm\">\n              Sign In\n            </Link>\n          </div>\n       
 </div>\n      </header>\n\n      {/* Hero */}\n      <section className=\"pt-32 pb-12 px-4 sm:px-6 lg:px-8\">\n        <div className=\"max-w-4xl mx-auto text-center\">\n          <h1 className=\"text-4xl sm:text-5xl font-bold text-gray-900 leading-tight mb-6\">\n            Join the Podly test group\n          </h1>\n          <p className=\"text-lg text-gray-600 mb-8 max-w-3xl mx-auto\">\n            We're testing a self-hosted podcast ad removal system. Podly transcribes episodes, detects sponsor reads with an LLM, and generates clean RSS feeds that work in any podcast app.\n          </p>\n\n          {/* Live user count */}\n          <div className=\"inline-flex items-center gap-3 bg-white border border-gray-200 rounded-xl px-6 py-4 shadow-sm mb-8\">\n            <div className=\"flex items-center gap-2\">\n              <div className=\"h-2 w-2 rounded-full bg-green-500 animate-pulse\" />\n              <span className=\"text-sm font-medium text-gray-700\">\n                {userLimit !== null && userLimit !== undefined && userLimit > 0 ? (\n                  <>\n                    <strong className=\"text-gray-900\">{userCount}</strong> / {userLimit} testers\n                    {slotsRemaining !== null && slotsRemaining !== undefined && slotsRemaining > 0 && (\n                      <span className=\"ml-2 text-gray-500\">\n                        ({slotsRemaining} {slotsRemaining === 1 ? 
'slot' : 'slots'} remaining)\n                      </span>\n                    )}\n                  </>\n                ) : (\n                  <>\n                    <strong className=\"text-gray-900\">{userCount}</strong> active testers\n                  </>\n                )}\n              </span>\n            </div>\n          </div>\n\n          <div className=\"flex flex-col sm:flex-row items-center justify-center gap-4 mb-8\">\n            <Link\n              to=\"/login\"\n              className=\"w-full sm:w-auto bg-blue-600 hover:bg-blue-700 text-white px-8 py-4 rounded-xl font-semibold text-lg transition-colors shadow-lg hover:shadow-xl\"\n            >\n              Sign Up!\n            </Link>\n            <a\n              href=\"https://discord.gg/FRB98GtF6N\"\n              target=\"_blank\"\n              rel=\"noopener noreferrer\"\n              className=\"w-full sm:w-auto flex items-center justify-center gap-2 bg-[#5865F2] hover:bg-[#4752C4] text-white px-8 py-4 rounded-xl font-semibold text-lg transition-colors\"\n            >\n              <svg className=\"h-5 w-5\" viewBox=\"0 0 24 24\" fill=\"currentColor\">\n                <path d=\"M20.317 4.37a19.791 19.791 0 0 0-4.885-1.515.074.074 0 0 0-.079.037c-.21.375-.444.864-.608 1.25a18.27 18.27 0 0 0-5.487 0 12.64 12.64 0 0 0-.617-1.25.077.077 0 0 0-.079-.037A19.736 19.736 0 0 0 3.677 4.37a.07.07 0 0 0-.032.027C.533 9.046-.32 13.58.099 18.057a.082.082 0 0 0 .031.057 19.9 19.9 0 0 0 5.993 3.03.078.078 0 0 0 .084-.028 14.09 14.09 0 0 0 1.226-1.994.076.076 0 0 0-.041-.106 13.107 13.107 0 0 1-1.872-.892.077.077 0 0 1-.008-.128 10.2 10.2 0 0 0 .372-.292.074.074 0 0 1 .077-.01c3.928 1.793 8.18 1.793 12.062 0a.074.074 0 0 1 .078.01c.12.098.246.198.373.292a.077.077 0 0 1-.006.127 12.299 12.299 0 0 1-1.873.892.077.077 0 0 0-.041.107c.36.698.772 1.362 1.225 1.993a.076.076 0 0 0 .084.028 19.839 19.839 0 0 0 6.002-3.03.077.077 0 0 0 .032-.054c.5-5.177-.838-9.674-3.549-13.66a.061.061 0 0 
0-.031-.03zM8.02 15.33c-1.183 0-2.157-1.085-2.157-2.419 0-1.333.956-2.419 2.157-2.419 1.21 0 2.176 1.096 2.157 2.42 0 1.333-.956 2.418-2.157 2.418zm7.975 0c-1.183 0-2.157-1.085-2.157-2.419 0-1.333.955-2.419 2.157-2.419 1.21 0 2.176 1.096 2.157 2.42 0 1.333-.946 2.418-2.157 2.418z\"/>\n              </svg>\n              Join Discord\n            </a>\n          </div>\n\n          {slotsRemaining !== null && slotsRemaining === 0 && (\n            <div className=\"max-w-2xl mx-auto bg-amber-50 border border-amber-200 rounded-xl p-4 text-sm text-amber-900\">\n              <strong>Test group full.</strong> Join the Discord to hear when more slots open up.\n            </div>\n          )}\n        </div>\n      </section>\n\n      {/* How it works */}\n      <section id=\"how-it-works\" className=\"pt-12 pb-16 px-4 sm:px-6 lg:px-8\">\n        <div className=\"max-w-6xl mx-auto space-y-12\">\n          <div className=\"text-center\">\n            <h2 className=\"text-3xl sm:text-4xl font-bold text-gray-900 mb-4\">How it works</h2>\n            <p className=\"text-lg text-gray-600 max-w-3xl mx-auto\">Podly grabs the feed, finds sponsorship blocks, and gives you a private RSS link so your own players stream the ad-free version.</p>\n          </div>\n          <div className=\"grid gap-6 lg:grid-cols-2\">\n            <div className=\"rounded-2xl border border-gray-100 bg-white p-6\">\n              <p className=\"text-sm font-semibold uppercase tracking-wide text-gray-500 mb-4\">Listen anywhere</p>\n              <ul className=\"space-y-3 text-sm text-gray-600\">\n                <li><strong className=\"text-gray-900\">Apple Podcasts:</strong> Library → Edit → Add Show by URL → paste the Podly link.</li>\n                <li><strong className=\"text-gray-900\">Overcast:</strong> Tap + → Add URL → paste → done.</li>\n                <li><strong className=\"text-gray-900\">Pocket Casts:</strong> Discover → Paste RSS Link → Subscribe.</li>\n                <li><strong 
className=\"text-gray-900\">Other players:</strong> Podcast Addict, AntennaPod, Castro, etc. all support \"add via URL.\"</li>\n              </ul>\n              <div className=\"mt-4 rounded-xl border border-amber-200 bg-amber-50/70 p-3 text-sm text-amber-900\">\n                Spotify blocks custom RSS feeds, so switch to any other podcast app when you use Podly links.\n              </div>\n            </div>\n            <div className=\"rounded-2xl border border-blue-100 bg-blue-50/60 p-6\">\n              <p className=\"text-sm font-semibold uppercase tracking-wide text-blue-900 mb-4\">Getting started</p>\n              <ol className=\"space-y-3 text-sm text-blue-900/90\">\n                <li><strong>1.</strong> Sign up, choose number of podcasts ($1/pod/month)</li>\n                <li><strong>2.</strong> Search for a podcast and add it to your personal feed list.</li>\n                <li><strong>3.</strong> Copy your unique Podly RSS link for that feed.</li>\n                <li><strong>4.</strong> Paste the link into your podcast app to start listening ad-free.</li>\n                <li>\n                  <strong>Need help?</strong> Ask questions in{' '}\n                  <a href=\"https://discord.gg/FRB98GtF6N\" className=\"underline font-semibold\" target=\"_blank\" rel=\"noopener noreferrer\">\n                    Discord\n                  </a>\n                  .\n                </li>\n              </ol>\n            </div>\n          </div>\n        </div>\n      </section>\n\n      {/* CTA */}\n      <section className=\"py-10 sm:py-14 px-4 sm:px-6 lg:px-8\">\n        <div className=\"max-w-4xl mx-auto text-center\">\n          <div className=\"flex flex-col sm:flex-row items-center justify-center gap-4\">\n            <Link\n              to=\"/login\"\n              className=\"w-full sm:w-auto bg-blue-600 hover:bg-blue-700 text-white px-8 py-4 rounded-xl font-semibold text-lg transition-colors shadow-lg hover:shadow-xl\"\n            >\n 
             Sign Up!\n            </Link>\n            <a\n              href=\"https://discord.gg/FRB98GtF6N\"\n              target=\"_blank\"\n              rel=\"noopener noreferrer\"\n              className=\"w-full sm:w-auto flex items-center justify-center gap-2 bg-[#5865F2] hover:bg-[#4752C4] text-white px-8 py-4 rounded-xl font-semibold text-lg transition-colors\"\n            >\n              <svg className=\"h-5 w-5\" viewBox=\"0 0 24 24\" fill=\"currentColor\">\n                <path d=\"M20.317 4.37a19.791 19.791 0 0 0-4.885-1.515.074.074 0 0 0-.079.037c-.21.375-.444.864-.608 1.25a18.27 18.27 0 0 0-5.487 0 12.64 12.64 0 0 0-.617-1.25.077.077 0 0 0-.079-.037A19.736 19.736 0 0 0 3.677 4.37a.07.07 0 0 0-.032.027C.533 9.046-.32 13.58.099 18.057a.082.082 0 0 0 .031.057 19.9 19.9 0 0 0 5.993 3.03.078.078 0 0 0 .084-.028 14.09 14.09 0 0 0 1.226-1.994.076.076 0 0 0-.041-.106 13.107 13.107 0 0 1-1.872-.892.077.077 0 0 1-.008-.128 10.2 10.2 0 0 0 .372-.292.074.074 0 0 1 .077-.01c3.928 1.793 8.18 1.793 12.062 0a.074.074 0 0 1 .078.01c.12.098.246.198.373.292a.077.077 0 0 1-.006.127 12.299 12.299 0 0 1-1.873.892.077.077 0 0 0-.041.107c.36.698.772 1.362 1.225 1.993a.076.076 0 0 0 .084.028 19.839 19.839 0 0 0 6.002-3.03.077.077 0 0 0 .032-.054c.5-5.177-.838-9.674-3.549-13.66a.061.061 0 0 0-.031-.03zM8.02 15.33c-1.183 0-2.157-1.085-2.157-2.419 0-1.333.956-2.419 2.157-2.419 1.21 0 2.176 1.096 2.157 2.42 0 1.333-.956 2.418-2.157 2.418zm7.975 0c-1.183 0-2.157-1.085-2.157-2.419 0-1.333.955-2.419 2.157-2.419 1.21 0 2.176 1.096 2.157 2.42 0 1.333-.946 2.418-2.157 2.418z\"/>\n              </svg>\n              Join Discord\n            </a>\n          </div>\n        </div>\n      </section>\n\n      {/* Footer */}\n      <footer className=\"py-12 px-4 sm:px-6 lg:px-8 border-t border-gray-200\">\n        <div className=\"max-w-6xl mx-auto\">\n          <div className=\"flex flex-col md:flex-row items-center justify-between gap-6\">\n            <div className=\"flex 
items-center gap-2\">\n              <img src=\"/images/logos/logo.webp\" alt=\"Podly\" className=\"h-6 w-auto\" />\n              <span className=\"font-semibold text-gray-900\">Podly</span>\n            </div>\n            <p className=\"text-sm text-gray-500\">\n              Open source podcast ad remover.\n            </p>\n            <div className=\"flex items-center gap-4\">\n              <a\n                href=\"https://github.com/podly-pure-podcasts/podly-pure-podcasts\"\n                target=\"_blank\"\n                rel=\"noopener noreferrer\"\n                className=\"text-gray-400 hover:text-gray-600 transition-colors\"\n              >\n                <svg className=\"h-6 w-6\" fill=\"currentColor\" viewBox=\"0 0 24 24\">\n                  <path fillRule=\"evenodd\" d=\"M12 2C6.477 2 2 6.484 2 12.017c0 4.425 2.865 8.18 6.839 9.504.5.092.682-.217.682-.483 0-.237-.008-.868-.013-1.703-2.782.605-3.369-1.343-3.369-1.343-.454-1.158-1.11-1.466-1.11-1.466-.908-.62.069-.608.069-.608 1.003.07 1.531 1.032 1.531 1.032.892 1.53 2.341 1.088 2.91.832.092-.647.35-1.088.636-1.338-2.22-.253-4.555-1.113-4.555-4.951 0-1.093.39-1.988 1.029-2.688-.103-.253-.446-1.272.098-2.65 0 0 .84-.27 2.75 1.026A9.564 9.564 0 0112 6.844c.85.004 1.705.115 2.504.337 1.909-1.296 2.747-1.027 2.747-1.027.546 1.379.202 2.398.1 2.651.64.7 1.028 1.595 1.028 2.688 0 3.848-2.339 4.695-4.566 4.943.359.309.678.92.678 1.855 0 1.338-.012 2.419-.012 2.747 0 .268.18.58.688.482A10.019 10.019 0 0022 12.017C22 6.484 17.522 2 12 2z\" clipRule=\"evenodd\" />\n                </svg>\n              </a>\n              <a\n                href=\"https://discord.gg/FRB98GtF6N\"\n                target=\"_blank\"\n                rel=\"noopener noreferrer\"\n                className=\"text-gray-400 hover:text-gray-600 transition-colors\"\n              >\n                <svg className=\"h-6 w-6\" fill=\"currentColor\" viewBox=\"0 0 24 24\">\n                  <path d=\"M20.317 4.37a19.791 19.791 
0 0 0-4.885-1.515.074.074 0 0 0-.079.037c-.21.375-.444.864-.608 1.25a18.27 18.27 0 0 0-5.487 0 12.64 12.64 0 0 0-.617-1.25.077.077 0 0 0-.079-.037A19.736 19.736 0 0 0 3.677 4.37a.07.07 0 0 0-.032.027C.533 9.046-.32 13.58.099 18.057a.082.082 0 0 0 .031.057 19.9 19.9 0 0 0 5.993 3.03.078.078 0 0 0 .084-.028 14.09 14.09 0 0 0 1.226-1.994.076.076 0 0 0-.041-.106 13.107 13.107 0 0 1-1.872-.892.077.077 0 0 1-.008-.128 10.2 10.2 0 0 0 .372-.292.074.074 0 0 1 .077-.01c3.928 1.793 8.18 1.793 12.062 0a.074.074 0 0 1 .078.01c.12.098.246.198.373.292a.077.077 0 0 1-.006.127 12.299 12.299 0 0 1-1.873.892.077.077 0 0 0-.041.107c.36.698.772 1.362 1.225 1.993a.076.076 0 0 0 .084.028 19.839 19.839 0 0 0 6.002-3.03.077.077 0 0 0 .032-.054c.5-5.177-.838-9.674-3.549-13.66a.061.061 0 0 0-.031-.03zM8.02 15.33c-1.183 0-2.157-1.085-2.157-2.419 0-1.333.956-2.419 2.157-2.419 1.21 0 2.176 1.096 2.157 2.42 0 1.333-.956 2.418-2.157 2.418zm7.975 0c-1.183 0-2.157-1.085-2.157-2.419 0-1.333.955-2.419 2.157-2.419 1.21 0 2.176 1.096 2.157 2.42 0 1.333-.946 2.418-2.157 2.418z\"/>\n                </svg>\n              </a>\n            </div>\n          </div>\n        </div>\n      </footer>\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/pages/LoginPage.tsx",
    "content": "import type { FormEvent } from 'react';\nimport { useState, useEffect } from 'react';\nimport axios from 'axios';\nimport { Link } from 'react-router-dom';\nimport { useAuth } from '../contexts/AuthContext';\nimport { discordApi } from '../services/api';\n\nexport default function LoginPage() {\n  const { login, landingPageEnabled } = useAuth();\n  const [username, setUsername] = useState('');\n  const [password, setPassword] = useState('');\n  const [submitting, setSubmitting] = useState(false);\n  const [error, setError] = useState<string | null>(null);\n  const [discordEnabled, setDiscordEnabled] = useState(false);\n  const [discordLoading, setDiscordLoading] = useState(false);\n  const [showPasswordLogin, setShowPasswordLogin] = useState(false);\n\n  // Check for OAuth callback errors in URL\n  useEffect(() => {\n    const params = new URLSearchParams(window.location.search);\n    const urlError = params.get('error');\n    if (urlError) {\n      const messages: Record<string, string> = {\n        'guild_requirement_not_met': 'You must be a member of the required Discord server.',\n        'registration_disabled': 'Self-registration is currently disabled.',\n        'auth_failed': 'Discord authentication failed. Please try again.',\n        'invalid_state': 'Invalid session state. 
Please try again.',\n        'access_denied': 'Discord access was denied.',\n        'discord_not_configured': 'Discord SSO is not configured.',\n        'missing_code': 'Missing authorization code from Discord.',\n      };\n      setError(messages[urlError] || 'An error occurred during login.');\n      // Clean URL\n      window.history.replaceState({}, '', window.location.pathname);\n    }\n  }, []);\n\n  // Check if Discord SSO is enabled\n  useEffect(() => {\n    discordApi.getStatus()\n      .then((status) => {\n        setDiscordEnabled(status.enabled);\n        setShowPasswordLogin(!status.enabled);\n      })\n      .catch(() => {\n        setDiscordEnabled(false);\n        setShowPasswordLogin(true);\n      });\n  }, []);\n\n  const handleSubmit = async (event: FormEvent<HTMLFormElement>) => {\n    event.preventDefault();\n    setError(null);\n    setSubmitting(true);\n\n    try {\n      await login(username, password);\n      setUsername('');\n      setPassword('');\n    } catch (err) {\n      if (axios.isAxiosError(err)) {\n        const message = err.response?.data?.error ?? 'Invalid username or password.';\n        setError(message);\n      } else if (err instanceof Error) {\n        setError(err.message);\n      } else {\n        setError('Login failed. Please try again.');\n      }\n    } finally {\n      setSubmitting(false);\n    }\n  };\n\n  const handleDiscordLogin = async () => {\n    setError(null);\n    setDiscordLoading(true);\n    try {\n      const { authorization_url } = await discordApi.getLoginUrl();\n      window.location.href = authorization_url;\n    } catch {\n      setError('Failed to start Discord login. 
Please try again.');\n      setDiscordLoading(false);\n    }\n  };\n\n  return (\n    <div className=\"min-h-screen bg-gray-50 flex items-center justify-center px-4\">\n      <div className=\"w-full max-w-md bg-white shadow-lg rounded-xl border border-gray-200 p-6\">\n        <div className=\"flex flex-col items-center gap-2 mb-6\">\n          <Link to=\"/\" className=\"flex items-center gap-2 hover:opacity-80 transition-opacity\">\n            <img src=\"/images/logos/logo.webp\" alt=\"Podly\" className=\"h-10 w-auto\" />\n          </Link>\n          <h1 className=\"text-xl font-semibold text-gray-900\">Sign in to Podly</h1>\n        </div>\n\n        {error && (\n          <div className=\"rounded-md bg-red-50 border border-red-200 px-3 py-2 text-sm text-red-700 mb-4\">\n            {error}\n          </div>\n        )}\n\n        {discordEnabled && (\n          <div className=\"space-y-3 mb-4\">\n            <button\n              type=\"button\"\n              onClick={handleDiscordLogin}\n              disabled={discordLoading}\n              className=\"w-full flex justify-center items-center gap-2 rounded-md bg-[#5865F2] px-4 py-3 text-white font-semibold shadow hover:bg-[#4752C4] transition-colors disabled:opacity-60 disabled:cursor-not-allowed\"\n            >\n              {discordLoading ? 
(\n                <span className=\"animate-spin h-4 w-4 border-2 border-white border-t-transparent rounded-full\" />\n              ) : (\n                <svg className=\"h-5 w-5\" viewBox=\"0 0 24 24\" fill=\"currentColor\">\n                  <path d=\"M20.317 4.37a19.791 19.791 0 0 0-4.885-1.515.074.074 0 0 0-.079.037c-.21.375-.444.864-.608 1.25a18.27 18.27 0 0 0-5.487 0 12.64 12.64 0 0 0-.617-1.25.077.077 0 0 0-.079-.037A19.736 19.736 0 0 0 3.677 4.37a.07.07 0 0 0-.032.027C.533 9.046-.32 13.58.099 18.057a.082.082 0 0 0 .031.057 19.9 19.9 0 0 0 5.993 3.03.078.078 0 0 0 .084-.028 14.09 14.09 0 0 0 1.226-1.994.076.076 0 0 0-.041-.106 13.107 13.107 0 0 1-1.872-.892.077.077 0 0 1-.008-.128 10.2 10.2 0 0 0 .372-.292.074.074 0 0 1 .077-.01c3.928 1.793 8.18 1.793 12.062 0a.074.074 0 0 1 .078.01c.12.098.246.198.373.292a.077.077 0 0 1-.006.127 12.299 12.299 0 0 1-1.873.892.077.077 0 0 0-.041.107c.36.698.772 1.362 1.225 1.993a.076.076 0 0 0 .084.028 19.839 19.839 0 0 0 6.002-3.03.077.077 0 0 0 .032-.054c.5-5.177-.838-9.674-3.549-13.66a.061.061 0 0 0-.031-.03zM8.02 15.33c-1.183 0-2.157-1.085-2.157-2.419 0-1.333.956-2.419 2.157-2.419 1.21 0 2.176 1.096 2.157 2.42 0 1.333-.956 2.418-2.157 2.418zm7.975 0c-1.183 0-2.157-1.085-2.157-2.419 0-1.333.955-2.419 2.157-2.419 1.21 0 2.176 1.096 2.157 2.42 0 1.333-.946 2.418-2.157 2.418z\"/>\n                </svg>\n              )}\n              {discordLoading ? 
'Redirecting…' : 'Continue with Discord'}\n            </button>\n            {!showPasswordLogin && (\n              <button\n                type=\"button\"\n                onClick={() => setShowPasswordLogin(true)}\n                className=\"w-full text-sm font-medium text-blue-700 hover:text-blue-800 hover:underline\"\n              >\n                Use username / password\n              </button>\n            )}\n          </div>\n        )}\n\n        {(!discordEnabled || showPasswordLogin) && (\n          <form\n            className={`space-y-4 ${discordEnabled ? 'pt-4 border-t border-gray-200' : ''}`}\n            onSubmit={handleSubmit}\n          >\n            <div>\n              <label htmlFor=\"username\" className=\"block text-sm font-medium text-gray-700\">\n                Username\n              </label>\n              <input\n                id=\"username\"\n                name=\"username\"\n                type=\"text\"\n                autoComplete=\"username\"\n                value={username}\n                onChange={(event) => setUsername(event.target.value)}\n                className=\"mt-1 block w-full rounded-md border border-gray-300 px-3 py-2 shadow-sm focus:border-blue-500 focus:outline-none focus:ring-1 focus:ring-blue-500\"\n                disabled={submitting}\n                required\n              />\n            </div>\n\n            <div>\n              <label htmlFor=\"password\" className=\"block text-sm font-medium text-gray-700\">\n                Password\n              </label>\n              <input\n                id=\"password\"\n                name=\"password\"\n                type=\"password\"\n                autoComplete=\"current-password\"\n                value={password}\n                onChange={(event) => setPassword(event.target.value)}\n                className=\"mt-1 block w-full rounded-md border border-gray-300 px-3 py-2 shadow-sm focus:border-blue-500 focus:outline-none focus:ring-1 
focus:ring-blue-500\"\n                disabled={submitting}\n                required\n              />\n            </div>\n\n            <button\n              type=\"submit\"\n              disabled={submitting}\n              className=\"w-full flex justify-center items-center gap-2 rounded-md bg-blue-600 px-4 py-2 text-white font-medium hover:bg-blue-700 transition-colors disabled:opacity-60 disabled:cursor-not-allowed\"\n            >\n              {submitting && <span className=\"animate-spin h-4 w-4 border-2 border-white border-t-transparent rounded-full\" />}\n              {submitting ? 'Signing in…' : 'Sign in'}\n            </button>\n          </form>\n        )}\n\n        <div className=\"mt-4 flex flex-col items-center gap-3\">\n          <a href=\"https://discord.gg/FRB98GtF6N\" target=\"_blank\" rel=\"noopener noreferrer\">\n            <img src=\"https://img.shields.io/badge/discord-join-blue.svg?logo=discord&logoColor=white\" alt=\"Discord\" />\n          </a>\n          {landingPageEnabled && (\n            <Link to=\"/\" className=\"text-sm text-gray-500 hover:text-gray-700 transition-colors\">\n              ← Back to home\n            </Link>\n          )}\n        </div>\n      </div>\n    </div>\n  );\n}\n"
  },
  {
    "path": "frontend/src/services/api.ts",
    "content": "import axios from 'axios';\nimport { diagnostics } from '../utils/diagnostics';\nimport type {\n  Feed,\n  Episode,\n  Job,\n  JobManagerStatus,\n  CleanupPreview,\n  CleanupRunResult,\n  CombinedConfig,\n  LLMConfig,\n  WhisperConfig,\n  PodcastSearchResult,\n  ConfigResponse,\n  BillingSummary,\n  LandingStatus,\n  PagedResult,\n} from '../types';\n\nconst API_BASE_URL = '';\n\nconst api = axios.create({\n  baseURL: API_BASE_URL,\n  withCredentials: true,\n});\n\napi.interceptors.response.use(\n  (response) => response,\n  (error) => {\n    try {\n      const cfg = error?.config;\n      const method = (cfg?.method ?? 'GET').toUpperCase();\n      const url = cfg?.url ?? '(unknown url)';\n      const status = error?.response?.status as number | undefined;\n      const responseData = error?.response?.data;\n\n      const details = {\n        method,\n        url,\n        status,\n        response: responseData,\n      };\n\n      diagnostics.add('error', `HTTP error ${status ?? 
'NETWORK'} ${method} ${url}`, details);\n    } catch {\n      // ignore\n    }\n\n    return Promise.reject(error);\n  }\n);\n\nconst buildAbsoluteUrl = (path: string): string => {\n  if (/^https?:\\/\\//i.test(path)) {\n    return path;\n  }\n\n  const origin = API_BASE_URL || window.location.origin;\n  if (path.startsWith('/')) {\n    return `${origin}${path}`;\n  }\n  return `${origin}/${path}`;\n};\n\nexport const feedsApi = {\n  getFeeds: async (): Promise<Feed[]> => {\n    const response = await api.get('/feeds');\n    return response.data;\n  },\n\n  getFeedPosts: async (\n    feedId: number,\n    options?: { page?: number; pageSize?: number; whitelistedOnly?: boolean }\n  ): Promise<PagedResult<Episode>> => {\n    const response = await api.get(`/api/feeds/${feedId}/posts`, {\n      params: {\n        page: options?.page,\n        page_size: options?.pageSize,\n        whitelisted_only: options?.whitelistedOnly,\n      },\n    });\n    return response.data;\n  },\n\n  addFeed: async (url: string): Promise<void> => {\n    const formData = new FormData();\n    formData.append('url', url);\n    await api.post('/feed', formData);\n  },\n\n  deleteFeed: async (feedId: number): Promise<void> => {\n    await api.delete(`/feed/${feedId}`);\n  },\n\n  refreshFeed: async (\n    feedId: number\n  ): Promise<{ status: string; message?: string }> => {\n    const response = await api.post(`/api/feeds/${feedId}/refresh`);\n    return response.data;\n  },\n\n  refreshAllFeeds: async (): Promise<{\n    status: string;\n    feeds_refreshed: number;\n    jobs_enqueued: number;\n  }> => {\n    const response = await api.post('/api/feeds/refresh-all');\n    return response.data;\n  },\n\n  togglePostWhitelist: async (\n    guid: string,\n    whitelisted: boolean,\n    triggerProcessing = false\n  ): Promise<{ processing_job?: { status: string; job_id?: string; message?: string } }> => {\n    const response = await api.post(`/api/posts/${guid}/whitelist`, {\n      whitelisted,\n 
     trigger_processing: triggerProcessing,\n    });\n    return response.data;\n  },\n\n  toggleAllPostsWhitelist: async (feedId: number): Promise<{ message: string; whitelisted_count: number; total_count: number; all_whitelisted: boolean }> => {\n    const response = await api.post(`/api/feeds/${feedId}/toggle-whitelist-all`);\n    return response.data;\n  },\n\n  joinFeed: async (feedId: number): Promise<Feed> => {\n    const response = await api.post(`/api/feeds/${feedId}/join`);\n    return response.data;\n  },\n\n  exitFeed: async (feedId: number): Promise<Feed> => {\n    const response = await api.post(`/api/feeds/${feedId}/exit`);\n    return response.data;\n  },\n\n  leaveFeed: async (feedId: number): Promise<{ status: string; feed_id: number }> => {\n    const response = await api.post(`/api/feeds/${feedId}/leave`);\n    return response.data;\n  },\n\n  updateFeedSettings: async (\n    feedId: number,\n    settings: { auto_whitelist_new_episodes_override: boolean | null }\n  ): Promise<Feed> => {\n    const response = await api.patch(`/api/feeds/${feedId}/settings`, settings);\n    return response.data;\n  },\n\n  getProcessingEstimate: async (guid: string): Promise<{\n    post_guid: string;\n    estimated_minutes: number;\n    can_process: boolean;\n    reason: string | null;\n  }> => {\n    const response = await api.get(`/api/posts/${guid}/processing-estimate`);\n    return response.data;\n  },\n\n  searchFeeds: async (\n    term: string\n  ): Promise<{\n    results: PodcastSearchResult[];\n    total: number;\n  }> => {\n    const response = await api.get('/api/feeds/search', {\n      params: { term },\n    });\n    return response.data;\n  },\n\n  // New post processing methods\n  processPost: async (guid: string): Promise<{ status: string; job_id?: string; message: string; download_url?: string }> => {\n    const response = await api.post(`/api/posts/${guid}/process`);\n    return response.data;\n  },\n\n  reprocessPost: async (guid: string): 
Promise<{ status: string; job_id?: string; message: string; download_url?: string }> => {\n    const response = await api.post(`/api/posts/${guid}/reprocess`);\n    return response.data;\n  },\n\n  getPostStatus: async (guid: string): Promise<{\n    status: string;\n    step: number;\n    step_name: string;\n    total_steps: number;\n    message: string;\n    download_url?: string;\n    error?: string;\n  }> => {\n    const response = await api.get(`/api/posts/${guid}/status`);\n    return response.data;\n  },\n\n  // Get audio URL for post\n  getPostAudioUrl: (guid: string): string => {\n    return buildAbsoluteUrl(`/api/posts/${guid}/audio`);\n  },\n\n  // Get download URL for processed post\n  getPostDownloadUrl: (guid: string): string => {\n    return buildAbsoluteUrl(`/api/posts/${guid}/download`);\n  },\n\n  // Get download URL for original post\n  getPostOriginalDownloadUrl: (guid: string): string => {\n    return buildAbsoluteUrl(`/api/posts/${guid}/download/original`);\n  },\n\n  // Download processed post\n  downloadPost: async (guid: string): Promise<void> => {\n    const response = await api.get(`/api/posts/${guid}/download`, {\n      responseType: 'blob',\n    });\n\n    const blob = new Blob([response.data], { type: 'audio/mpeg' });\n    const url = window.URL.createObjectURL(blob);\n    const link = document.createElement('a');\n    link.href = url;\n    link.download = `${guid}.mp3`;\n    document.body.appendChild(link);\n    link.click();\n    document.body.removeChild(link);\n    window.URL.revokeObjectURL(url);\n  },\n\n  // Download original post\n  downloadOriginalPost: async (guid: string): Promise<void> => {\n    const response = await api.get(`/api/posts/${guid}/download/original`, {\n      responseType: 'blob',\n    });\n\n    const blob = new Blob([response.data], { type: 'audio/mpeg' });\n    const url = window.URL.createObjectURL(blob);\n    const link = document.createElement('a');\n    link.href = url;\n    link.download = 
`${guid}_original.mp3`;\n    document.body.appendChild(link);\n    link.click();\n    document.body.removeChild(link);\n    window.URL.revokeObjectURL(url);\n  },\n\n  createProtectedFeedShareLink: async (\n    feedId: number\n  ): Promise<{ url: string; feed_token: string; feed_secret: string; feed_id: number }> => {\n    const response = await api.post(`/api/feeds/${feedId}/share-link`);\n    return response.data;\n  },\n\n  // Get processing stats for post\n  getPostStats: async (guid: string): Promise<{\n    post: {\n      guid: string;\n      title: string;\n      duration: number | null;\n      release_date: string | null;\n      whitelisted: boolean;\n      has_processed_audio: boolean;\n    };\n    processing_stats: {\n      total_segments: number;\n      total_model_calls: number;\n      total_identifications: number;\n      content_segments: number;\n      ad_segments_count: number;\n      ad_percentage: number;\n      estimated_ad_time_seconds: number;\n      model_call_statuses: Record<string, number>;\n      model_types: Record<string, number>;\n    };\n    model_calls: Array<{\n      id: number;\n      model_name: string;\n      status: string;\n      segment_range: string;\n      first_segment_sequence_num: number;\n      last_segment_sequence_num: number;\n      timestamp: string | null;\n      retry_attempts: number;\n      error_message: string | null;\n      prompt: string | null;\n      response: string | null;\n    }>;\n    transcript_segments: Array<{\n      id: number;\n      sequence_num: number;\n      start_time: number;\n      end_time: number;\n      text: string;\n      primary_label: 'ad' | 'content';\n      mixed: boolean;\n      identifications: Array<{\n        id: number;\n        label: string;\n        confidence: number | null;\n        model_call_id: number;\n      }>;\n    }>;\n    identifications: Array<{\n      id: number;\n      transcript_segment_id: number;\n      label: string;\n      confidence: number | null;\n      
model_call_id: number;\n      segment_sequence_num: number;\n      segment_start_time: number;\n      segment_end_time: number;\n      segment_text: string;\n      mixed: boolean;\n    }>;\n  }> => {\n    const response = await api.get(`/api/posts/${guid}/stats`);\n    return response.data;\n  },\n\n  // Legacy aliases for backward compatibility\n  getFeedEpisodes: async (\n    feedId: number,\n    options?: { page?: number; pageSize?: number; whitelistedOnly?: boolean }\n  ): Promise<PagedResult<Episode>> => {\n    return feedsApi.getFeedPosts(feedId, options);\n  },\n\n  toggleEpisodeWhitelist: async (guid: string, whitelisted: boolean): Promise<{ processing_job?: { status: string; job_id?: string; message?: string } }> => {\n    return feedsApi.togglePostWhitelist(guid, whitelisted);\n  },\n\n  toggleAllEpisodesWhitelist: async (feedId: number): Promise<{ message: string; whitelisted_count: number; total_count: number; all_whitelisted: boolean }> => {\n    return feedsApi.toggleAllPostsWhitelist(feedId);\n  },\n\n  processEpisode: async (guid: string): Promise<{ status: string; job_id?: string; message: string; download_url?: string }> => {\n    return feedsApi.processPost(guid);\n  },\n\n  getEpisodeStatus: async (guid: string): Promise<{\n    status: string;\n    step: number;\n    step_name: string;\n    total_steps: number;\n    message: string;\n    download_url?: string;\n    error?: string;\n  }> => {\n    return feedsApi.getPostStatus(guid);\n  },\n\n  getEpisodeAudioUrl: (guid: string): string => {\n    return feedsApi.getPostAudioUrl(guid);\n  },\n\n  getEpisodeStats: async (guid: string): Promise<{\n    post: {\n      guid: string;\n      title: string;\n      duration: number | null;\n      release_date: string | null;\n      whitelisted: boolean;\n      has_processed_audio: boolean;\n    };\n    processing_stats: {\n      total_segments: number;\n      total_model_calls: number;\n      total_identifications: number;\n      content_segments: 
number;\n      ad_segments_count: number;\n      ad_percentage: number;\n      estimated_ad_time_seconds: number;\n      model_call_statuses: Record<string, number>;\n      model_types: Record<string, number>;\n    };\n    model_calls: Array<{\n      id: number;\n      model_name: string;\n      status: string;\n      segment_range: string;\n      first_segment_sequence_num: number;\n      last_segment_sequence_num: number;\n      timestamp: string | null;\n      retry_attempts: number;\n      error_message: string | null;\n      prompt: string | null;\n      response: string | null;\n    }>;\n    transcript_segments: Array<{\n      id: number;\n      sequence_num: number;\n      start_time: number;\n      end_time: number;\n      text: string;\n      primary_label: 'ad' | 'content';\n      mixed: boolean;\n      identifications: Array<{\n        id: number;\n        label: string;\n        confidence: number | null;\n        model_call_id: number;\n      }>;\n    }>;\n    identifications: Array<{\n      id: number;\n      transcript_segment_id: number;\n      label: string;\n      confidence: number | null;\n      model_call_id: number;\n      segment_sequence_num: number;\n      segment_start_time: number;\n      segment_end_time: number;\n      segment_text: string;\n      mixed: boolean;\n    }>;\n  }> => {\n    return feedsApi.getPostStats(guid);\n  },\n\n  // Legacy download aliases\n  downloadEpisode: async (guid: string): Promise<void> => {\n    return feedsApi.downloadPost(guid);\n  },\n\n  downloadOriginalEpisode: async (guid: string): Promise<void> => {\n    return feedsApi.downloadOriginalPost(guid);\n  },\n\n  getEpisodeDownloadUrl: (guid: string): string => {\n    return feedsApi.getPostDownloadUrl(guid);\n  },\n\n  getEpisodeOriginalDownloadUrl: (guid: string): string => {\n    return feedsApi.getPostOriginalDownloadUrl(guid);\n  },\n\n  getAggregateFeedLink: async (): Promise<{ url: string }> => {\n    const response = await 
api.post('/api/user/aggregate-link');\n    return response.data;\n  },\n};\n\nexport const authApi = {\n  getStatus: async (): Promise<{ require_auth: boolean; landing_page_enabled?: boolean }> => {\n    const response = await api.get('/api/auth/status');\n    return response.data;\n  },\n\n  login: async (username: string, password: string): Promise<{ user: { id: number; username: string; role: string } }> => {\n    const response = await api.post('/api/auth/login', { username, password });\n    return response.data;\n  },\n\n  logout: async (): Promise<void> => {\n    await api.post('/api/auth/logout');\n  },\n\n  getCurrentUser: async (): Promise<{ user: { id: number; username: string; role: string } }> => {\n    const response = await api.get('/api/auth/me');\n    return response.data;\n  },\n\n  changePassword: async (payload: { current_password: string; new_password: string }): Promise<{ status: string }> => {\n    const response = await api.post('/api/auth/change-password', payload);\n    return response.data;\n  },\n\n  listUsers: async (): Promise<{ users: Array<{ id: number; username: string; role: string; created_at: string; updated_at: string; last_active?: string | null; feed_allowance?: number; feed_subscription_status?: string; manual_feed_allowance?: number | null }> }> => {\n    const response = await api.get('/api/auth/users');\n    return response.data;\n  },\n\n  createUser: async (payload: { username: string; password: string; role: string }): Promise<{ user: { id: number; username: string; role: string; created_at: string; updated_at: string } }> => {\n    const response = await api.post('/api/auth/users', payload);\n    return response.data;\n  },\n\n  updateUser: async (username: string, payload: { password?: string; role?: string; manual_feed_allowance?: number | null }): Promise<{ status: string }> => {\n    const response = await api.patch(`/api/auth/users/${username}`, payload);\n    return response.data;\n  },\n\n  deleteUser: async 
(username: string): Promise<{ status: string }> => {\n    const response = await api.delete(`/api/auth/users/${username}`);\n    return response.data;\n  },\n};\n\nexport const landingApi = {\n  getStatus: async (): Promise<LandingStatus> => {\n    const response = await api.get('/api/landing/status');\n    return response.data;\n  },\n};\n\nexport const discordApi = {\n  getStatus: async (): Promise<{ enabled: boolean }> => {\n    const response = await api.get('/api/auth/discord/status');\n    return response.data;\n  },\n\n  getLoginUrl: async (): Promise<{ authorization_url: string }> => {\n    const response = await api.get('/api/auth/discord/login');\n    return response.data;\n  },\n\n  getConfig: async (): Promise<{\n    config: {\n      enabled: boolean;\n      client_id: string | null;\n      client_secret_preview: string | null;\n      redirect_uri: string | null;\n      guild_ids: string;\n      allow_registration: boolean;\n    };\n    env_overrides: Record<string, { env_var: string; value?: string; is_secret?: boolean }>;\n  }> => {\n    const response = await api.get('/api/auth/discord/config');\n    return response.data;\n  },\n\n  updateConfig: async (payload: {\n    client_id?: string;\n    client_secret?: string;\n    redirect_uri?: string;\n    guild_ids?: string;\n    allow_registration?: boolean;\n  }): Promise<{\n    status: string;\n    config: {\n      enabled: boolean;\n      client_id: string | null;\n      client_secret_preview: string | null;\n      redirect_uri: string | null;\n      guild_ids: string;\n      allow_registration: boolean;\n    };\n  }> => {\n    const response = await api.put('/api/auth/discord/config', payload);\n    return response.data;\n  },\n};\n\nexport const configApi = {\n  getConfig: async (): Promise<ConfigResponse> => {\n    const response = await api.get('/api/config');\n    return response.data;\n  },\n  isConfigured: async (): Promise<{ configured: boolean }> => {\n    const response = await 
api.get('/api/config/api_configured_check');\n    return { configured: !!response.data?.configured };\n  },\n  updateConfig: async (payload: Partial<CombinedConfig>): Promise<CombinedConfig> => {\n    const response = await api.put('/api/config', payload);\n    return response.data;\n  },\n  testLLM: async (\n    payload: Partial<{ llm: LLMConfig }>\n  ): Promise<{ ok: boolean; message?: string; error?: string }> => {\n    const response = await api.post('/api/config/test-llm', payload ?? {});\n    return response.data;\n  },\n  testWhisper: async (\n    payload: Partial<{ whisper: WhisperConfig }>\n  ): Promise<{ ok: boolean; message?: string; error?: string }> => {\n    const response = await api.post('/api/config/test-whisper', payload ?? {});\n    return response.data;\n  },\n  getWhisperCapabilities: async (): Promise<{ local_available: boolean }> => {\n    const response = await api.get('/api/config/whisper-capabilities');\n    const local_available = !!response.data?.local_available;\n    return { local_available };\n  },\n};\n\nexport const billingApi = {\n  getSummary: async (): Promise<BillingSummary> => {\n    const response = await api.get('/api/billing/summary');\n    return response.data;\n  },\n  updateSubscription: async (\n    amount: number,\n    options?: { subscriptionId?: string | null }\n  ): Promise<\n    BillingSummary & {\n      message?: string;\n      checkout_url?: string;\n      requires_stripe_checkout?: boolean;\n    }\n  > => {\n    const response = await api.post('/api/billing/subscription', {\n      amount,\n      subscription_id: options?.subscriptionId,\n    });\n    return response.data;\n  },\n  createPortalSession: async (): Promise<{ url: string }> => {\n    const response = await api.post('/api/billing/portal-session');\n    return response.data;\n  },\n};\n\nexport const jobsApi = {\n  getActiveJobs: async (limit: number = 100): Promise<Job[]> => {\n    const response = await api.get('/api/jobs/active', { params: { limit } 
});\n    return response.data;\n  },\n  getAllJobs: async (limit: number = 200): Promise<Job[]> => {\n    const response = await api.get('/api/jobs/all', { params: { limit } });\n    return response.data;\n  },\n  cancelJob: async (jobId: string): Promise<{ status: string; job_id: string; message: string }> => {\n    const response = await api.post(`/api/jobs/${jobId}/cancel`);\n    return response.data;\n  },\n  getJobManagerStatus: async (): Promise<JobManagerStatus> => {\n    const response = await api.get('/api/job-manager/status');\n    return response.data;\n  },\n  getCleanupPreview: async (): Promise<CleanupPreview> => {\n    const response = await api.get('/api/jobs/cleanup/preview');\n    return response.data;\n  },\n  runCleanupJob: async (): Promise<CleanupRunResult> => {\n    const response = await api.post('/api/jobs/cleanup/run');\n    return response.data;\n  }\n};\n"
  },
  {
    "path": "frontend/src/types/index.ts",
    "content": "export interface Feed {\n  id: number;\n  rss_url: string;\n  title: string;\n  description?: string;\n  author?: string;\n  image_url?: string;\n  posts_count: number;\n  member_count?: number;\n  is_member?: boolean;\n  is_active_subscription?: boolean;\n  auto_whitelist_new_episodes_override?: boolean | null;\n}\n\nexport interface Episode {\n  id: number;\n  guid: string;\n  title: string;\n  description: string;\n  release_date: string | null;\n  duration: number | null;\n  whitelisted: boolean;\n  has_processed_audio: boolean;\n  has_unprocessed_audio: boolean;\n  download_url: string;\n  image_url: string | null;\n  download_count: number;\n} \n\nexport interface PagedResult<T> {\n  items: T[];\n  total: number;\n  page: number;\n  page_size: number;\n  total_pages?: number;\n  whitelisted_total?: number;\n}\n\nexport interface Job {\n  job_id: string;\n  post_guid: string;\n  post_title: string | null;\n  feed_title: string | null;\n  status: 'pending' | 'running' | 'completed' | 'failed' | 'cancelled' | 'skipped' | string;\n  priority: number;\n  step: number;\n  step_name: string | null;\n  total_steps: number;\n  progress_percentage: number;\n  created_at: string | null;\n  started_at: string | null;\n  completed_at: string | null;\n  error_message: string | null;\n}\n\nexport interface JobManagerRun {\n  id: string;\n  status: 'pending' | 'running' | 'completed' | 'failed' | string;\n  trigger: string;\n  started_at: string | null;\n  completed_at: string | null;\n  updated_at: string | null;\n  total_jobs: number;\n  queued_jobs: number;\n  running_jobs: number;\n  completed_jobs: number;\n  failed_jobs: number;\n  skipped_jobs: number;\n  context?: Record<string, unknown> | null;\n  counters_reset_at: string | null;\n  progress_percentage: number;\n}\n\nexport interface JobManagerStatus {\n  run: JobManagerRun | null;\n}\n\nexport interface CleanupPreview {\n  count: number;\n  retention_days: number | null;\n  cutoff_utc: string | 
null;\n}\n\nexport interface CleanupRunResult {\n  status: 'ok' | 'disabled' | 'error' | string;\n  removed_posts?: number;\n  remaining_candidates?: number;\n  retention_days?: number | null;\n  cutoff_utc?: string | null;\n  message?: string;\n}\n\n// ----- Configuration Types -----\n\nexport interface LLMConfig {\n  llm_api_key?: string | null;\n  llm_api_key_preview?: string | null;\n  llm_model: string;\n  openai_base_url?: string | null;\n  openai_timeout: number;\n  openai_max_tokens: number;\n  llm_max_concurrent_calls: number;\n  llm_max_retry_attempts: number;\n  llm_max_input_tokens_per_call?: number | null;\n  llm_enable_token_rate_limiting: boolean;\n  llm_max_input_tokens_per_minute?: number | null;\n  enable_boundary_refinement: boolean;\n  enable_word_level_boundary_refinder?: boolean;\n}\n\nexport type WhisperConfig =\n  | { whisper_type: 'local'; model: string }\n  | {\n      whisper_type: 'remote';\n      model: string;\n      api_key?: string | null;\n      api_key_preview?: string | null;\n      base_url?: string;\n      language: string;\n      timeout_sec: number;\n      chunksize_mb: number;\n    }\n  | {\n      whisper_type: 'groq';\n      api_key?: string | null;\n      api_key_preview?: string | null;\n      model: string;\n      language: string;\n      max_retries: number;\n    }\n  | { whisper_type: 'test' };\n\nexport interface ProcessingConfigUI {\n  num_segments_to_input_to_prompt: number;\n}\n\nexport interface OutputConfigUI {\n  fade_ms: number;\n  // Note the intentional spelling to match backend\n  min_ad_segement_separation_seconds: number;\n  min_ad_segment_length_seconds: number;\n  min_confidence: number;\n}\n\nexport interface AppConfigUI {\n  background_update_interval_minute: number | null;\n  automatically_whitelist_new_episodes: boolean;\n  post_cleanup_retention_days: number | null;\n  number_of_episodes_to_whitelist_from_archive_of_new_feed: number;\n  enable_public_landing_page: boolean;\n  user_limit_total: number 
| null;\n  autoprocess_on_download: boolean;\n}\n\nexport interface CombinedConfig {\n  llm: LLMConfig;\n  whisper: WhisperConfig;\n  processing: ProcessingConfigUI;\n  output: OutputConfigUI;\n  app: AppConfigUI;\n}\n\nexport interface EnvOverrideEntry {\n  env_var: string;\n  value?: string;\n  value_preview?: string | null;\n  is_secret?: boolean;\n}\n\nexport type EnvOverrideMap = Record<string, EnvOverrideEntry>;\n\nexport interface ConfigResponse {\n  config: CombinedConfig;\n  env_overrides?: EnvOverrideMap;\n}\n\nexport interface PodcastSearchResult {\n  title: string;\n  author: string;\n  feedUrl: string;\n  artworkUrl: string;\n  description: string;\n  genres: string[];\n}\n\nexport interface AuthUser {\n  id: number;\n  username: string;\n  role: 'admin' | 'user' | string;\n  feed_allowance?: number;\n  feed_subscription_status?: string;\n  manual_feed_allowance?: number | null;\n}\n\nexport interface ManagedUser extends AuthUser {\n  created_at: string;\n  updated_at: string;\n  last_active?: string | null;\n}\n\nexport interface DiscordStatus {\n  enabled: boolean;\n}\n\nexport interface BillingSummary {\n  feed_allowance: number;\n  feeds_in_use: number;\n  remaining: number;\n  current_amount?: number;\n  min_amount_cents?: number;\n  subscription_status: string;\n  stripe_subscription_id?: string | null;\n  stripe_customer_id?: string | null;\n  product_id?: string | null;\n  message?: string;\n}\n\nexport interface LandingStatus {\n  require_auth: boolean;\n  landing_page_enabled: boolean;\n  user_count: number;\n  user_limit_total: number | null;\n  slots_remaining: number | null;\n}\n"
  },
  {
    "path": "frontend/src/utils/clipboard.ts",
    "content": "import { toast } from 'react-hot-toast';\n\nexport async function copyToClipboard(text: string, promptMessage: string = 'Copy to clipboard:', successMessage?: string): Promise<boolean> {\n  // Try Clipboard API first\n  if (navigator.clipboard && navigator.clipboard.writeText) {\n    try {\n      await navigator.clipboard.writeText(text);\n      if (successMessage) toast.success(successMessage);\n      return true;\n    } catch (err) {\n      console.warn('Clipboard API failed, trying fallback', err);\n    }\n  }\n\n  // Fallback for non-secure contexts or if Clipboard API fails\n  try {\n    const textArea = document.createElement('textarea');\n    textArea.value = text;\n    \n    // Ensure it's not visible but part of the DOM\n    textArea.style.position = 'fixed';\n    textArea.style.left = '-9999px';\n    textArea.style.top = '0';\n    document.body.appendChild(textArea);\n    \n    textArea.focus();\n    textArea.select();\n    \n    const successful = document.execCommand('copy');\n    document.body.removeChild(textArea);\n    if (successful) {\n      if (successMessage) toast.success(successMessage);\n      return true;\n    }\n  } catch (err) {\n    console.error('Fallback copy failed', err);\n  }\n\n  // If all else fails, prompt the user\n  window.prompt(promptMessage, text);\n  return false;\n}\n"
  },
  {
    "path": "frontend/src/utils/diagnostics.ts",
    "content": "export type DiagnosticsLevel = 'debug' | 'info' | 'warn' | 'error';\n\nexport type DiagnosticsEntry = {\n  ts: number;\n  level: DiagnosticsLevel;\n  message: string;\n  data?: unknown;\n};\n\nexport type DiagnosticsState = {\n  v: 1;\n  entries: DiagnosticsEntry[];\n};\n\nexport type DiagnosticErrorPayload = {\n  title: string;\n  message: string;\n  kind?: 'network' | 'http' | 'app' | 'unknown';\n  details?: unknown;\n};\n\nconst STORAGE_KEY = 'podly.diagnostics.v1';\nconst MAX_ENTRIES = 200;\nconst MAX_ENTRY_MESSAGE_CHARS = 500;\nconst MAX_JSON_CHARS = 120_000;\n\nconst SENSITIVE_KEY_RE = /(authorization|cookie|set-cookie|token|access[_-]?token|refresh[_-]?token|id[_-]?token|api[_-]?key|secret|password|session)/i;\nconst SENSITIVE_VALUE_REPLACEMENT = '[REDACTED]';\n\nconst redactString = (value: string): string => {\n  let v = value;\n  // Authorization headers / bearer tokens\n  v = v.replace(/\\bBearer\\s+([A-Za-z0-9\\-._~+/]+=*)/gi, 'Bearer [REDACTED]');\n  v = v.replace(/\\bBasic\\s+([A-Za-z0-9+/=]+)\\b/gi, 'Basic [REDACTED]');\n\n  // Common query params\n  v = v.replace(/([?&](?:token|access_token|refresh_token|id_token|api_key|key|password)=)([^&#]+)/gi, '$1[REDACTED]');\n\n  // JSON-ish fields in strings\n  v = v.replace(/(\"(?:access_token|refresh_token|id_token|token|api_key|password)\"\\s*:\\s*\")([^\"]+)(\")/gi, '$1[REDACTED]$3');\n\n  return v;\n};\n\nconst sanitize = (input: unknown, depth = 0): unknown => {\n  if (depth > 6) return '[Truncated]';\n  if (input == null) return input;\n\n  if (typeof input === 'string') return redactString(input);\n  if (typeof input === 'number' || typeof input === 'boolean') return input;\n\n  if (Array.isArray(input)) {\n    return input.slice(0, 50).map((v) => sanitize(v, depth + 1));\n  }\n\n  if (typeof input === 'object') {\n    const obj = input as Record<string, unknown>;\n    const out: Record<string, unknown> = {};\n    const keys = Object.keys(obj).slice(0, 50);\n    for (const key of 
keys) {\n      const value = obj[key];\n      if (SENSITIVE_KEY_RE.test(key)) {\n        out[key] = SENSITIVE_VALUE_REPLACEMENT;\n      } else {\n        out[key] = sanitize(value, depth + 1);\n      }\n    }\n    return out;\n  }\n\n  return String(input);\n};\n\nconst safeJsonStringify = (value: unknown): string => {\n  try {\n    const json = JSON.stringify(value);\n    if (json.length <= MAX_JSON_CHARS) return json;\n    return json.slice(0, MAX_JSON_CHARS) + '\\n...[truncated]';\n  } catch {\n    return '[Unserializable]';\n  }\n};\n\nconst loadState = (): DiagnosticsState => {\n  try {\n    const raw = sessionStorage.getItem(STORAGE_KEY);\n    if (!raw) return { v: 1, entries: [] };\n    const parsed = JSON.parse(raw) as DiagnosticsState;\n    if (parsed?.v !== 1 || !Array.isArray(parsed.entries)) {\n      return { v: 1, entries: [] };\n    }\n    return parsed;\n  } catch {\n    return { v: 1, entries: [] };\n  }\n};\n\nconst saveState = (state: DiagnosticsState) => {\n  try {\n    const raw = safeJsonStringify(state);\n    // Prevent sessionStorage bloat\n    if (raw.length > MAX_JSON_CHARS) {\n      const trimmed = { v: 1 as const, entries: state.entries.slice(-Math.floor(MAX_ENTRIES / 2)) };\n      sessionStorage.setItem(STORAGE_KEY, safeJsonStringify(trimmed));\n      return;\n    }\n    sessionStorage.setItem(STORAGE_KEY, raw);\n  } catch {\n    // ignore\n  }\n};\n\nexport const DIAGNOSTIC_UPDATED_EVENT = 'podly:diagnostic-updated';\n\nexport const diagnostics = {\n  add: (level: DiagnosticsLevel, message: string, data?: unknown) => {\n    const sanitizedMessage = redactString(message).slice(0, MAX_ENTRY_MESSAGE_CHARS);\n    const entry: DiagnosticsEntry = {\n      ts: Date.now(),\n      level,\n      message: sanitizedMessage,\n      data: data === undefined ? 
undefined : sanitize(data),\n    };\n\n    const state = loadState();\n    const next = [...state.entries, entry].slice(-MAX_ENTRIES);\n    saveState({ v: 1, entries: next });\n\n    try {\n      if (typeof window !== 'undefined') {\n        window.dispatchEvent(new Event(DIAGNOSTIC_UPDATED_EVENT));\n      }\n    } catch {\n      // ignore\n    }\n  },\n\n  getEntries: (): DiagnosticsEntry[] => {\n    return loadState().entries;\n  },\n\n  clear: () => {\n    try {\n      sessionStorage.removeItem(STORAGE_KEY);\n    } catch {\n      // ignore\n    }\n  },\n\n  sanitize,\n};\n\nexport const DIAGNOSTIC_ERROR_EVENT = 'podly:diagnostic-error';\n\nexport const emitDiagnosticError = (payload: DiagnosticErrorPayload) => {\n  const safePayload = diagnostics.sanitize(payload) as DiagnosticErrorPayload;\n  diagnostics.add('error', safePayload.title + ': ' + safePayload.message, safePayload);\n  try {\n    window.dispatchEvent(new CustomEvent(DIAGNOSTIC_ERROR_EVENT, { detail: safePayload }));\n  } catch {\n    // ignore\n  }\n};\n\nlet consoleWrapped = false;\n\nexport const initFrontendDiagnostics = () => {\n  if (typeof window === 'undefined') return;\n\n  if (!consoleWrapped) {\n    consoleWrapped = true;\n    const wrap = (level: DiagnosticsLevel, original: (...args: unknown[]) => void) =>\n      (...args: unknown[]) => {\n        try {\n          const msg = args\n            .map((a) => (typeof a === 'string' ? 
a : safeJsonStringify(diagnostics.sanitize(a))))\n            .join(' ');\n          diagnostics.add(level, msg);\n        } catch {\n          // ignore\n        }\n        original(...args);\n      };\n\n    console.log = wrap('info', console.log.bind(console));\n    console.info = wrap('info', console.info.bind(console));\n    console.warn = wrap('warn', console.warn.bind(console));\n    console.error = wrap('error', console.error.bind(console));\n  }\n\n  window.addEventListener('error', (event) => {\n    emitDiagnosticError({\n      title: 'Unhandled error',\n      message: event.message || 'Unknown error',\n      kind: 'app',\n      details: {\n        filename: event.filename,\n        lineno: event.lineno,\n        colno: event.colno,\n      },\n    });\n  });\n\n  window.addEventListener('unhandledrejection', (event) => {\n    const reason = (event as PromiseRejectionEvent).reason;\n    emitDiagnosticError({\n      title: 'Unhandled promise rejection',\n      message: typeof reason === 'string' ? reason : 'Promise rejected',\n      kind: 'app',\n      details: reason,\n    });\n  });\n};\n"
  },
  {
    "path": "frontend/src/utils/httpError.ts",
    "content": "import type { AxiosError } from 'axios';\n\nexport type ApiErrorData = {\n  message?: unknown;\n  error?: unknown;\n  [key: string]: unknown;\n};\n\nexport type HttpErrorInfo = {\n  status?: number;\n  message: string;\n  data?: unknown;\n};\n\nconst asString = (v: unknown): string | null => (typeof v === 'string' ? v : null);\n\nexport const getHttpErrorInfo = (err: unknown): HttpErrorInfo => {\n  const axiosErr = err as AxiosError<ApiErrorData>;\n  const status = axiosErr?.response?.status;\n  const data = axiosErr?.response?.data;\n\n  const messageFromData =\n    data && typeof data === 'object'\n      ? asString((data as ApiErrorData).message) ?? asString((data as ApiErrorData).error)\n      : null;\n\n  return {\n    status,\n    data,\n    message: messageFromData ?? asString((axiosErr as unknown as { message?: unknown })?.message) ?? 'Request failed',\n  };\n};\n"
  },
  {
    "path": "frontend/src/vite-env.d.ts",
    "content": "/// <reference types=\"vite/client\" />\n"
  },
  {
    "path": "frontend/tailwind.config.js",
    "content": "/** @type {import('tailwindcss').Config} */\nmodule.exports = {\n  content: [\"./index.html\", \"./src/**/*.{js,ts,jsx,tsx}\"],\n  theme: {\n    extend: {},\n  },\n  plugins: [],\n};\n"
  },
  {
    "path": "frontend/tsconfig.app.json",
    "content": "{\n  \"compilerOptions\": {\n    \"tsBuildInfoFile\": \"./node_modules/.tmp/tsconfig.app.tsbuildinfo\",\n    \"target\": \"ES2020\",\n    \"useDefineForClassFields\": true,\n    \"lib\": [\"ES2020\", \"DOM\", \"DOM.Iterable\"],\n    \"module\": \"ESNext\",\n    \"skipLibCheck\": true,\n\n    /* Bundler mode */\n    \"moduleResolution\": \"bundler\",\n    \"allowImportingTsExtensions\": true,\n    \"verbatimModuleSyntax\": true,\n    \"moduleDetection\": \"force\",\n    \"noEmit\": true,\n    \"jsx\": \"react-jsx\",\n\n    /* Linting */\n    \"strict\": true,\n    \"noUnusedLocals\": true,\n    \"noUnusedParameters\": true,\n    \"erasableSyntaxOnly\": true,\n    \"noFallthroughCasesInSwitch\": true,\n    \"noUncheckedSideEffectImports\": true\n  },\n  \"include\": [\"src\"],\n  \"exclude\": [\"src/contexts/diagnosticsContext.ts\"]\n}\n"
  },
  {
    "path": "frontend/tsconfig.json",
    "content": "{\n  \"files\": [],\n  \"references\": [\n    { \"path\": \"./tsconfig.app.json\" },\n    { \"path\": \"./tsconfig.node.json\" }\n  ]\n}\n"
  },
  {
    "path": "frontend/tsconfig.node.json",
    "content": "{\n  \"compilerOptions\": {\n    \"tsBuildInfoFile\": \"./node_modules/.tmp/tsconfig.node.tsbuildinfo\",\n    \"target\": \"ES2022\",\n    \"lib\": [\"ES2023\"],\n    \"module\": \"ESNext\",\n    \"skipLibCheck\": true,\n\n    /* Bundler mode */\n    \"moduleResolution\": \"bundler\",\n    \"allowImportingTsExtensions\": true,\n    \"verbatimModuleSyntax\": true,\n    \"moduleDetection\": \"force\",\n    \"noEmit\": true,\n\n    /* Linting */\n    \"strict\": true,\n    \"noUnusedLocals\": true,\n    \"noUnusedParameters\": true,\n    \"erasableSyntaxOnly\": true,\n    \"noFallthroughCasesInSwitch\": true,\n    \"noUncheckedSideEffectImports\": true\n  },\n  \"include\": [\"vite.config.ts\"]\n}\n"
  },
  {
    "path": "frontend/vite.config.ts",
    "content": "import { defineConfig } from 'vite'\nimport react from '@vitejs/plugin-react'\n\n// For development, the frontend development server will proxy to the backend\n// The backend port should match the configured application port\n// This will work with the new port configuration\nconst BACKEND_TARGET = 'http://localhost:5001'\n\n// https://vite.dev/config/\nexport default defineConfig({\n  plugins: [react()],\n  server: {\n    port: 5173,\n    host: true,\n    allowedHosts: true,\n    proxy: {\n      '/api': {\n        target: BACKEND_TARGET,\n        changeOrigin: true,\n        secure: false\n      },\n      // Proxy feed endpoints for backwards compatibility\n      '/feed': {\n        target: BACKEND_TARGET,\n        changeOrigin: true,\n        secure: false\n      },\n      // Proxy legacy post endpoints for backwards compatibility\n      '/post': {\n        target: BACKEND_TARGET,\n        changeOrigin: true,\n        secure: false\n      }\n    }\n  },\n  build: {\n    outDir: 'dist',\n    sourcemap: false\n  }\n})\n"
  },
  {
    "path": "pyproject.toml",
    "content": "[tool.pylint]\ninit-hook = 'import sys; sys.path.append(\"./src\")'\n\ndisable = [\n    \"logging-fstring-interpolation\",\n    \"missing-class-docstring\",\n    \"missing-function-docstring\",\n    \"missing-module-docstring\",\n    \"too-few-public-methods\",\n    \"too-many-arguments\",\n    \"too-many-locals\",\n    \"unspecified-encoding\",\n    \"line-too-long\",\n    \"too-many-return-statements\"\n]\n\n[tool.mypy]\nwarn_unused_ignores = true\nstrict = true\nmypy_path = \"src\"\n\n[tool.pytest.ini_options]\npythonpath = [\"src\"]\n\n[tool.black]\nline-length = 88\n\n[tool.isort]\nprofile = \"black\"\nline_length = 88\nfloat_to_top = true\n"
  },
  {
    "path": "run_podly_docker.sh",
    "content": "#!/bin/bash\n\n# Colors for output\nYELLOW='\\033[1;33m'\nRED='\\033[0;31m'\nGREEN='\\033[0;32m'\nNC='\\033[0m' # No Color\n\n# Central configuration defaults\nCUDA_VERSION=\"12.4.1\"\nROCM_VERSION=\"6.4\"\nCPU_BASE_IMAGE=\"python:3.11-slim\"\nGPU_NVIDIA_BASE_IMAGE=\"nvidia/cuda:${CUDA_VERSION}-cudnn-devel-ubuntu22.04\"\nGPU_ROCM_BASE_IMAGE=\"rocm/dev-ubuntu-22.04:${ROCM_VERSION}-complete\"\n\n# Read server URL from config.yml if it exists\nSERVER_URL=\"\"\n\nif [ -f \"config/config.yml\" ]; then\n    SERVER_URL=$(grep \"^server:\" config/config.yml | cut -d' ' -f2- | tr -d ' ')\n\n    if [ -n \"$SERVER_URL\" ]; then\n        # Remove http:// or https:// prefix to get just the hostname\n        CLEAN_URL=$(echo \"$SERVER_URL\" | sed 's|^https\\?://||')\n        export VITE_API_URL=\"http://${CLEAN_URL}:5001\"\n        echo -e \"${GREEN}Using server URL from config.yml: ${VITE_API_URL}${NC}\"\n    fi\nfi\n\n# Check dependencies\necho -e \"${YELLOW}Checking dependencies...${NC}\"\nif ! command -v docker &> /dev/null; then\n    echo -e \"${RED}Docker not found. Please install Docker first.${NC}\"\n    exit 1\nfi\n\nif ! docker compose version &> /dev/null; then\n    echo -e \"${RED}Docker Compose not found. 
Please install Docker Compose V2.${NC}\"\n    exit 1\nfi\n\n# Default values\nBUILD_ONLY=false\nTEST_BUILD=false\nFORCE_CPU=false\nFORCE_GPU=false\nDETACHED=false\nPRODUCTION_MODE=true\nREBUILD=false\nBRANCH_SUFFIX=\"main\"\nLITE_BUILD=false\n\n# Detect NVIDIA GPU\nNVIDIA_GPU_AVAILABLE=false\nif command -v nvidia-smi &> /dev/null && nvidia-smi &> /dev/null; then\n    NVIDIA_GPU_AVAILABLE=true\n    echo -e \"${GREEN}NVIDIA GPU detected.${NC}\"\nfi\n# Detect ROCM GPU\nAMD_GPU_AVAILABLE=false\nif command -v rocm-smi &> /dev/null && rocm-smi &> /dev/null; then\n    AMD_GPU_AVAILABLE=true\n    echo -e \"${GREEN}ROCM GPU detected.${NC}\"\nfi\n\n# Parse command line arguments\nwhile [[ $# -gt 0 ]]; do\n    case \"$1\" in\n        --build)\n            BUILD_ONLY=true\n            ;;\n        --test-build)\n            TEST_BUILD=true\n            ;;\n        --gpu)\n            FORCE_GPU=true\n            ;;\n        --cpu)\n            FORCE_CPU=true\n            ;;\n        --cuda=*)\n            CUDA_VERSION=\"${1#*=}\"\n            GPU_NVIDIA_BASE_IMAGE=\"nvidia/cuda:${CUDA_VERSION}-cudnn-devel-ubuntu22.04\"\n            ;;\n        --rocm=*)\n            ROCM_VERSION=\"${1#*=}\"\n            GPU_ROCM_BASE_IMAGE=\"rocm/dev-ubuntu-22.04:${ROCM_VERSION}-complete\"\n            ;;\n        -d|--detach|-b|--background)\n            DETACHED=true\n            ;;\n        --dev)\n            REBUILD=true\n            PRODUCTION_MODE=false\n            ;;\n        --rebuild)\n            REBUILD=true\n            ;;\n        --production)\n            PRODUCTION_MODE=true\n            ;;\n        --branch=*)\n            BRANCH_NAME=\"${1#*=}\"\n            BRANCH_SUFFIX=\"${BRANCH_NAME}\"\n            ;;\n        --lite)\n            LITE_BUILD=true\n            ;;\n        -h|--help)\n            echo \"Usage: $0 [OPTIONS]\"\n            echo \"\"\n            echo \"Options:\"\n            echo \"  --build             Build containers only (don't start)\"\n            
echo \"  --test-build        Test build with no cache\"\n            echo \"  --gpu               Force GPU mode\"\n            echo \"  --cpu               Force CPU mode\"\n            echo \"  --cuda=VERSION      Specify CUDA version\"\n            echo \"  --rocm=VERSION      Specify ROCM version\"\n            echo \"  -d, --detach        Run in detached/background mode\"\n            echo \"  -b, --background    Alias for --detach\"\n            echo \"  --dev               Development mode (rebuild containers)\"\n            echo \"  --rebuild           Rebuild containers before starting\"\n            echo \"  --production        Use published images (default)\"\n            echo \"  --branch=BRANCH     Use specific branch images\"\n            echo \"  --lite              Build without Whisper (smaller image, remote transcription only)\"\n            echo \"  -h, --help          Show this help message\"\n            exit 0\n            ;;\n        *)\n            echo \"Unknown argument: $1\"\n            echo \"Usage: $0 [--build] [--test-build] [--gpu] [--cpu] [--cuda=VERSION] [--rocm=VERSION] [-d|--detach] [-b|--background] [--dev] [--rebuild] [--production] [--branch=BRANCH_NAME] [--lite] [-h|--help]\"\n            exit 1\n            ;;\n    esac\n    shift\ndone\n\n# Determine if GPU should be used based on availability and flags\nUSE_GPU=false\nUSE_GPU_NVIDIA=false\nUSE_GPU_AMD=false\nif [ \"$FORCE_CPU\" = true ]; then\n    USE_GPU=false\n    echo -e \"${YELLOW}Forcing CPU mode${NC}\"\nelif [ \"$FORCE_GPU\" = true ]; then\n    if [ \"$NVIDIA_GPU_AVAILABLE\" = true ]; then\n        USE_GPU=true\n        USE_GPU_NVIDIA=true\n        echo -e \"${YELLOW}Forcing GPU mode (NVIDIA detected)${NC}\"\n    elif [ \"$AMD_GPU_AVAILABLE\" = true ]; then\n        USE_GPU=true\n        USE_GPU_AMD=true\n        echo -e \"${YELLOW}Forcing GPU mode (AMD detected)${NC}\"\n    else\n        echo -e \"${RED}Error: GPU requested but no compatible GPU detected. 
Please install NVIDIA or AMD GPU drivers.${NC}\"\n        exit 1\n    fi\nelif [ \"$NVIDIA_GPU_AVAILABLE\" = true ]; then\n    USE_GPU=true\n    USE_GPU_NVIDIA=true\n    echo -e \"${YELLOW}Using GPU mode (auto-detected)${NC}\"\nelif [ \"${AMD_GPU_AVAILABLE}\" = true ]; then\n    USE_GPU=true\n    USE_GPU_AMD=true\n    echo -e \"${YELLOW}Using GPU mode (auto-detected)${NC}\"\nelse\n    echo -e \"${YELLOW}Using CPU mode (no GPU detected)${NC}\"\nfi\n\n# Set base image and CUDA environment\nif [ \"$USE_GPU_NVIDIA\" = true ]; then\n    BASE_IMAGE=\"$GPU_NVIDIA_BASE_IMAGE\"\n    CUDA_VISIBLE_DEVICES=0\nelif [ \"${USE_GPU_AMD}\" = true ]; then\n    BASE_IMAGE=\"${GPU_ROCM_BASE_IMAGE}\"\n    CUDA_VISIBLE_DEVICES=0\nelse\n    BASE_IMAGE=\"$CPU_BASE_IMAGE\"\n    CUDA_VISIBLE_DEVICES=-1\nfi\n\n# Get current user's UID and GID\nexport PUID=$(id -u)\nexport PGID=$(id -g)\nexport BASE_IMAGE\nexport CUDA_VERSION\nexport ROCM_VERSION\nexport CUDA_VISIBLE_DEVICES\nexport USE_GPU\nexport USE_GPU_NVIDIA\nexport USE_GPU_AMD\nexport LITE_BUILD\n\n# Surface authentication/session configuration warnings\nREQUIRE_AUTH_LOWER=$(printf '%s' \"${REQUIRE_AUTH:-false}\" | tr '[:upper:]' '[:lower:]')\nif [ \"$REQUIRE_AUTH_LOWER\" = \"true\" ]; then\n    if [ -z \"${PODLY_SECRET_KEY}\" ]; then\n        echo -e \"${YELLOW}Warning: REQUIRE_AUTH is true but PODLY_SECRET_KEY is not set. Sessions will be reset on every restart.${NC}\"\n    fi\n\nfi\n\n# Setup Docker Compose configuration\nif [ \"$PRODUCTION_MODE\" = true ]; then\n    COMPOSE_FILES=\"-f compose.yml\"\n    # Set branch tag based on GPU detection and branch\n    if [ \"$LITE_BUILD\" = true ] && [ \"$USE_GPU\" = true ]; then\n        echo -e \"${RED}Error: --lite cannot be combined with GPU builds. 
Use --cpu or drop --lite.${NC}\"\n        exit 1\n    fi\n\n    if [ \"$LITE_BUILD\" = true ]; then\n        BRANCH=\"${BRANCH_SUFFIX}-lite\"\n    elif [ \"$USE_GPU_NVIDIA\" = true ]; then\n        BRANCH=\"${BRANCH_SUFFIX}-gpu-nvidia\"\n    elif [ \"$USE_GPU_AMD\" = true ]; then\n        BRANCH=\"${BRANCH_SUFFIX}-gpu-amd\"\n    else\n        BRANCH=\"${BRANCH_SUFFIX}-latest\"\n    fi\n\n    export BRANCH\n\n    echo -e \"${YELLOW}Production mode - using published images${NC}\"\n    echo -e \"${YELLOW}  Branch tag: ${BRANCH}${NC}\"\n    if [ \"$BRANCH_SUFFIX\" != \"main\" ]; then\n        echo -e \"${GREEN}Using custom branch: ${BRANCH_SUFFIX}${NC}\"\n    fi\nelse\n    export DEVELOPER_MODE=true\n    COMPOSE_FILES=\"-f compose.dev.cpu.yml\"\n    if [ \"$USE_GPU_NVIDIA\" = true ]; then\n        COMPOSE_FILES=\"$COMPOSE_FILES -f compose.dev.nvidia.yml\"\n    fi\n    if [ \"$USE_GPU_AMD\" = true ]; then\n        COMPOSE_FILES=\"$COMPOSE_FILES -f compose.dev.rocm.yml\"\n    fi\n    if [ \"$REBUILD\" = true ]; then\n        echo -e \"${YELLOW}Rebuild mode - will rebuild containers before starting${NC}\"\n    fi\n    if [ \"$LITE_BUILD\" = true ]; then\n        echo -e \"${YELLOW}Lite mode - building without Whisper (remote transcription only)${NC}\"\n    fi\nfi\n\n# Execute appropriate Docker Compose command\nif [ \"$BUILD_ONLY\" = true ]; then\n    echo -e \"${YELLOW}Building containers only...${NC}\"\n    if ! docker compose $COMPOSE_FILES build; then\n        echo -e \"${RED}Build failed! Please fix the errors above and try again.${NC}\"\n        exit 1\n    fi\n    echo -e \"${GREEN}Build completed successfully.${NC}\"\nelif [ \"$TEST_BUILD\" = true ]; then\n    echo -e \"${YELLOW}Testing build with no cache...${NC}\"\n    if ! docker compose $COMPOSE_FILES build --no-cache; then\n        echo -e \"${RED}Build failed! 
Please fix the errors above and try again.${NC}\"\n        exit 1\n    fi\n    echo -e \"${GREEN}Test build completed successfully.${NC}\"\nelse\n    # Handle development rebuild\n    if [ \"$REBUILD\" = true ]; then\n        echo -e \"${YELLOW}Rebuilding containers...${NC}\"\n        if ! docker compose $COMPOSE_FILES build; then\n            echo -e \"${RED}Build failed! Please fix the errors above and try again.${NC}\"\n            exit 1\n        fi\n    fi\n\n    if [ \"$DETACHED\" = true ]; then\n        echo -e \"${YELLOW}Starting Podly in detached mode...${NC}\"\n        docker compose $COMPOSE_FILES up -d\n        echo -e \"${GREEN}Podly is running in the background.${NC}\"\n        echo -e \"${GREEN}Application: http://localhost:5001${NC}\"\n    else\n        echo -e \"${YELLOW}Starting Podly...${NC}\"\n        echo -e \"${GREEN}Application will be available at: http://localhost:5001${NC}\"\n        docker compose $COMPOSE_FILES up\n    fi\nfi\n"
  },
  {
    "path": "scripts/ci.sh",
    "content": "#!/bin/bash\n\n# format\necho '============================================================='\necho \"Running 'pipenv run black .'\"\necho '============================================================='\npipenv run black .\necho '============================================================='\necho \"Running 'pipenv run isort .'\"\necho '============================================================='\npipenv run isort .\n\n# lint and type check\necho '============================================================='\necho \"Running 'pipenv run mypy . --install-types --non-interactive'\"\necho '============================================================='\npipenv run mypy . \\\n    --install-types \\\n    --non-interactive \\\n    --explicit-package-bases \\\n    --exclude 'migrations' \\\n    --exclude 'build' \\\n    --exclude 'scripts' \\\n    --exclude 'src/tests' \\\n    --exclude 'src/tests/test_routes.py' \\\n    --exclude 'src/app/routes.py'\n\necho '============================================================='\necho \"Running 'pipenv run pylint src/ --ignore=migrations,tests'\"\necho '============================================================='\npipenv run pylint src/ --ignore=migrations,tests\n\n# run tests\necho '============================================================='\necho \"Running 'pipenv run pytest --disable-warnings'\"\necho '============================================================='\npipenv run pytest --disable-warnings\n"
  },
  {
    "path": "scripts/create_migration.sh",
    "content": "#!/usr/bin/env bash\nset -euo pipefail\n\n# Usage: ./scripts/create_migration.sh \"message\"\n# Creates migrations using the project's local instance directory so the app\n# doesn't attempt to mkdir /app on macOS dev machines.\n\nSCRIPT_DIR=$(cd \"$(dirname \"${BASH_SOURCE[0]}\")\" && pwd)\nREPO_ROOT=$(cd \"$SCRIPT_DIR/..\" && pwd)\n\nMIGRATION_MSG=${1:-\"migration\"}\n\n# Prefer using repo-local src/instance to avoid writing to /app\nexport PODLY_INSTANCE_DIR=\"$REPO_ROOT/src/instance\"\n\necho \"Using PODLY_INSTANCE_DIR=$PODLY_INSTANCE_DIR\"\n\n# Ensure instance and data directories exist\nmkdir -p \"$PODLY_INSTANCE_DIR\"\nmkdir -p \"$PODLY_INSTANCE_DIR/data/in\"\nmkdir -p \"$PODLY_INSTANCE_DIR/data/srv\"\n\necho \"Running flask db migrate with message: $MIGRATION_MSG\"\nexport PYTHONPATH=\"$REPO_ROOT/src\"\npipenv run flask --app app db migrate -m \"$MIGRATION_MSG\"\n\necho \"Applying migration (upgrade)\"\n\nread -r -p \"Apply migration now? [y/N]: \" response\ncase \"$response\" in\n    [yY][eE][sS]|[yY])\n        echo \"Applying migration...\"\n        pipenv run flask --app app db upgrade\n        echo \"Migration applied.\"\n        ;;\n    *)\n        echo \"Upgrade cancelled. Migration files created but not applied.\"\n        ;;\nesac\n"
  },
  {
    "path": "scripts/downgrade_db.sh",
    "content": "#!/usr/bin/env bash\n\nSCRIPT_DIR=$(cd \"$(dirname \"${BASH_SOURCE[0]}\")\" && pwd)\nREPO_ROOT=$(cd \"$SCRIPT_DIR/..\" && pwd)\n\nexport PODLY_INSTANCE_DIR=\"$REPO_ROOT/src/instance\"\nexport PYTHONPATH=\"$REPO_ROOT/src\"\n\n# Default to downgrading one revision if not specified\nREVISION=${1:-\"-1\"}\n\npipenv run flask --app app db downgrade \"$REVISION\"\n"
  },
  {
    "path": "scripts/generate_lockfiles.sh",
    "content": "#!/bin/bash\nset -e\n\n# Generate lock file for the regular Pipfile\necho \"Locking Pipfile...\"\npipenv lock\n\n# Temporarily move Pipfiles to lock Pipfile.lite\necho \"Preparing to lock Pipfile.lite...\"\nmv Pipfile Pipfile.tmp\nmv Pipfile.lite Pipfile\n\n# Generate lock file for Pipfile.lite\necho \"Locking Pipfile.lite...\"\npipenv lock\n\n# Rename the new lock file to Pipfile.lite.lock\necho \"Renaming lockfile for lite version...\"\nmv Pipfile.lock Pipfile.lite.lock\n\n# Restore original Pipfile names\necho \"Restoring original Pipfile names...\"\nmv Pipfile Pipfile.lite\nmv Pipfile.tmp Pipfile\n\necho \"Lockfiles generated successfully!\"\necho \"- Pipfile.lock\"\necho \"- Pipfile.lite.lock\"\n"
  },
  {
    "path": "scripts/manual_publish.sh",
    "content": "#!/bin/bash\n\nset -euo pipefail\n\n# Branch name becomes part of a manual tag (slashes replaced)\nBRANCH=$(git rev-parse --abbrev-ref HEAD | tr '/' '_')\n\n# Allow overriding image/owner/builder via env vars\nIMAGE=${IMAGE:-ghcr.io/podly-pure-podcasts/podly-pure-podcasts}\nBUILDER=${BUILDER:-podly_builder}\n\n# Ensure a docker-container buildx builder for multi-arch builds\ndocker buildx create --name \"${BUILDER}\" --driver docker-container --use >/dev/null 2>&1 || docker buildx use \"${BUILDER}\"\n\n# Ensure binfmt handlers for cross-compilation are installed (no-op if already present)\ndocker run --privileged --rm tonistiigi/binfmt --install all >/dev/null 2>&1 || true\n\n# Optional GHCR login (requires GHCR_TOKEN and optionally OWNER)\nif [[ -n \"${GHCR_TOKEN:-}\" ]]; then\n  OWNER=${OWNER:-$(echo \"${IMAGE}\" | sed -E 's#^ghcr.io/([^/]+)/.*$#\\1#')}\n  echo \"${GHCR_TOKEN}\" | docker login ghcr.io -u \"${OWNER}\" --password-stdin\nfi\n\n# Build and push multi-arch CPU image (lite)\ndocker buildx build \\\n  --platform linux/amd64,linux/arm64 \\\n  -t \"${IMAGE}:${BRANCH}-lite\" \\\n  --build-arg BASE_IMAGE=python:3.11-slim \\\n  --build-arg USE_GPU=false \\\n  --build-arg USE_GPU_NVIDIA=false \\\n  --build-arg USE_GPU_AMD=false \\\n  --build-arg LITE_BUILD=true \\\n  --push .\n"
  },
  {
    "path": "scripts/new_worktree.sh",
    "content": "#!/usr/bin/env bash\nset -euo pipefail\n\nusage() {\n  echo \"Usage: $0 <branch-name> [<start-point>]\" >&2\n  exit 1\n}\n\nif [[ ${1-} == \"\" ]]; then\n  usage\nfi\n\nBRANCH_NAME=\"$1\"\nSTART_POINT=\"${2-}\"\n\nSCRIPT_DIR=\"$(cd \"$(dirname \"${BASH_SOURCE[0]}\")\" && pwd)\"\nREPO_ROOT=\"$(cd \"$SCRIPT_DIR/..\" && pwd)\"\nWORKTREES_ROOT=\"$REPO_ROOT/.worktrees\"\nWORKTREE_PATH=\"$WORKTREES_ROOT/$BRANCH_NAME\"\n\nif git worktree list --porcelain | grep -q \"^worktree $WORKTREE_PATH$\"; then\n  echo \"Worktree already exists at $WORKTREE_PATH\" >&2\n  exit 1\nfi\n\nmkdir -p \"$(dirname \"$WORKTREE_PATH\")\"\n\nif [[ -d \"$WORKTREE_PATH\" ]]; then\n  echo \"Target path $WORKTREE_PATH already exists. Remove it first.\" >&2\n  exit 1\nfi\n\necho \"Creating worktree at $WORKTREE_PATH\" >&2\nif git rev-parse --verify --quiet \"$BRANCH_NAME\" >/dev/null; then\n  git worktree add \"$WORKTREE_PATH\" \"$BRANCH_NAME\"\nelse\n  if [[ -n \"$START_POINT\" ]]; then\n    git worktree add -b \"$BRANCH_NAME\" \"$WORKTREE_PATH\" \"$START_POINT\"\n  else\n    git worktree add -b \"$BRANCH_NAME\" \"$WORKTREE_PATH\"\n  fi\nfi\n\npushd \"$WORKTREE_PATH\" >/dev/null\n\nif command -v pipenv >/dev/null; then\n  echo \"Installing dependencies via pipenv\" >&2\n  pipenv install --dev\nelse\n  echo \"pipenv not found on PATH; skipping dependency installation\" >&2\nfi\n\nENV_SOURCE=\"\"\nif [[ -f \"$REPO_ROOT/.env\" ]]; then\n  ENV_SOURCE=\"$REPO_ROOT/.env\"\nelif [[ -f \"$REPO_ROOT/.env.local\" ]]; then\n  ENV_SOURCE=\"$REPO_ROOT/.env.local\"\nfi\n\nif [[ -n \"$ENV_SOURCE\" ]]; then\n  if [[ -f .env ]]; then\n    echo \"Worktree already has a .env file; leaving existing file in place\" >&2\n  else\n    echo \"Copying $(basename \"$ENV_SOURCE\") into worktree\" >&2\n    cp \"$ENV_SOURCE\" ./.env\n  fi\nelse\n  echo \"No .env or .env.local found in repository root; nothing copied\" >&2\nfi\n\nif command -v code >/dev/null; then\n  echo \"Opening worktree in VS Code\" >&2\n  
code \"$WORKTREE_PATH\"\nelse\n  echo \"VS Code command-line tool 'code' not found; skipping auto-open\" >&2\nfi\n\npopd >/dev/null\n"
  },
  {
    "path": "scripts/start_services.sh",
    "content": "#!/bin/bash\nset -e\n\n# 1. Start Writer Service in background\necho \"Starting Writer Service...\"\nexport PYTHONPATH=\"/app/src${PYTHONPATH:+:$PYTHONPATH}\"\npython3 -u -m app.writer &\nWRITER_PID=$!\n\n# Wait for writer IPC to be ready\necho \"Waiting for writer IPC on 127.0.0.1:50001...\"\nREADY=0\nfor i in {1..120}; do\n\tif python3 - <<'PY'\nimport socket\n\ns = socket.socket(socket.AF_INET, socket.SOCK_STREAM)\ns.settimeout(0.2)\ntry:\n    s.connect((\"127.0.0.1\", 50001))\n    raise SystemExit(0)\nexcept OSError:\n    raise SystemExit(1)\nfinally:\n    try:\n        s.close()\n    except Exception:\n        pass\nPY\n\tthen\n\t\tREADY=1\n\t\tbreak\n\tfi\n\tsleep 0.25\ndone\n\nif [ $READY -ne 1 ]; then\n\techo \"Writer IPC did not become ready in time; exiting.\"\n\texit 1\nfi\n\n# 2. Start Main App (Waitress)\necho \"Starting Main Application...\"\npython3 -u src/main.py &\nAPP_PID=$!\n\n# 3. Monitor processes\n# 'wait -n' waits for the first process to exit.\n# If writer dies, we want to exit so Docker restarts us.\nwait -n\n\n# Exit with status of process that exited first\nexit $?\n"
  },
  {
    "path": "scripts/test_full_workflow.py",
    "content": "import json\nimport sys\nimport time\n\nimport requests\n\nBASE_URL = \"http://localhost:5001\"\n\n\ndef log(msg):\n    print(f\"[TEST] {msg}\")\n\n\ndef check_health():\n    try:\n        # Assuming there's a health check or just checking root\n        # If no explicit health check, we can try listing feeds\n        response = requests.get(f\"{BASE_URL}/feeds\")\n        if response.status_code == 200:\n            log(\"Server is up and running.\")\n            return True\n    except requests.exceptions.ConnectionError:\n        pass\n    return False\n\n\ndef add_feed(url):\n    log(f\"Adding feed: {url}\")\n    response = requests.post(f\"{BASE_URL}/feed\", data={\"url\": url})\n    if response.status_code == 302:  # Redirects to index on success\n        log(\"Feed added successfully (redirected).\")\n        return True\n    elif response.status_code == 200:\n        log(\"Feed added successfully.\")\n        return True\n    else:\n        log(\n            f\"Failed to add feed. Status: {response.status_code}, Body: {response.text}\"\n        )\n        return False\n\n\ndef get_feeds():\n    log(\"Fetching feeds...\")\n    response = requests.get(f\"{BASE_URL}/feeds\")\n    if response.status_code == 200:\n        feeds = response.json()\n        log(f\"Found {len(feeds)} feeds.\")\n        return feeds\n    else:\n        log(f\"Failed to fetch feeds. Status: {response.status_code}\")\n        return []\n\n\ndef get_posts(feed_id):\n    log(f\"Fetching posts for feed {feed_id}...\")\n    response = requests.get(f\"{BASE_URL}/api/feeds/{feed_id}/posts\")\n    if response.status_code == 200:\n        posts = response.json()\n        log(f\"Found {len(posts)} posts.\")\n        return posts\n    else:\n        log(f\"Failed to fetch posts. 
Status: {response.status_code}\")\n        return []\n\n\ndef whitelist_post(guid):\n    log(f\"Whitelisting post {guid}...\")\n    # Assuming admin auth is not strictly enforced for localhost/dev mode or we need to handle it.\n    # The code checks for current_user. If auth is disabled, it might pass.\n    # If auth is enabled, we might need to login first.\n    # For now, let's try without auth headers, assuming dev environment.\n\n    response = requests.post(\n        f\"{BASE_URL}/api/posts/{guid}/whitelist\",\n        json={\"whitelisted\": True, \"trigger_processing\": True},\n    )\n\n    if response.status_code == 200:\n        log(\"Post whitelisted and processing triggered.\")\n        return True\n    else:\n        log(\n            f\"Failed to whitelist post. Status: {response.status_code}, Body: {response.text}\"\n        )\n        return False\n\n\ndef check_status(guid):\n    response = requests.get(f\"{BASE_URL}/api/posts/{guid}/status\")\n    if response.status_code == 200:\n        return response.json()\n    return None\n\n\ndef wait_for_processing(guid, timeout=300):\n    log(f\"Waiting for processing of {guid}...\")\n    start_time = time.time()\n    while time.time() - start_time < timeout:\n        status_data = check_status(guid)\n        if status_data:\n            status = status_data.get(\"status\")\n            progress = status_data.get(\"progress_percentage\", 0)\n            step = status_data.get(\"step_name\", \"unknown\")\n            log(f\"Status: {status}, Step: {step}, Progress: {progress}%\")\n\n            if status == \"completed\":\n                log(\"Processing completed successfully!\")\n                return True\n            elif status == \"failed\":\n                log(f\"Processing failed: {status_data.get('error_message')}\")\n                return False\n            elif status == \"error\":\n                log(f\"Processing error: {status_data.get('message')}\")\n                return False\n\n        
time.sleep(5)\n\n    log(\"Timeout waiting for processing.\")\n    return False\n\n\ndef main():\n    if not check_health():\n        log(\"Server is not reachable. Please start the server first.\")\n        sys.exit(1)\n\n    # 1. Add a test feed\n    # Using a known stable feed or a mock one if available.\n    # Let's use a popular tech podcast that usually works.\n    test_feed_url = \"http://test-feed/1\"  # Developer mode test feed\n\n    # Check if feed already exists\n    feeds = get_feeds()\n    target_feed = None\n    for feed in feeds:\n        if feed[\"rss_url\"] == test_feed_url:\n            target_feed = feed\n            break\n\n    if not target_feed:\n        if add_feed(test_feed_url):\n            # Fetch feeds again to get the ID\n            feeds = get_feeds()\n            for feed in feeds:\n                if feed[\"rss_url\"] == test_feed_url:\n                    target_feed = feed\n                    break\n\n    if not target_feed:\n        log(\"Could not find or add the test feed.\")\n        sys.exit(1)\n\n    log(f\"Working with feed: {target_feed['title']} (ID: {target_feed['id']})\")\n\n    # 2. Get posts\n    posts = get_posts(target_feed[\"id\"])\n    if not posts:\n        log(\"No posts found.\")\n        sys.exit(1)\n\n    # 3. Pick the latest post\n    # Posts are usually sorted by release date desc\n    target_post = posts[0]\n    log(f\"Selected post: {target_post['title']} (GUID: {target_post['guid']})\")\n\n    # 4. Trigger processing (Whitelist + Trigger)\n    if not target_post[\"whitelisted\"]:\n        if not whitelist_post(target_post[\"guid\"]):\n            log(\"Failed to trigger processing.\")\n            sys.exit(1)\n    else:\n        log(\"Post already whitelisted. 
Checking status...\")\n        # If already whitelisted, maybe trigger reprocess or just check status?\n        # Let's try to trigger process explicitly if it's not processed\n        if not target_post[\"has_processed_audio\"]:\n            response = requests.post(\n                f\"{BASE_URL}/api/posts/{target_post['guid']}/process\"\n            )\n            log(f\"Trigger process response: {response.status_code}\")\n\n    # 5. Wait for completion\n    if wait_for_processing(target_post[\"guid\"]):\n        # 6. Verify output\n        log(\"Verifying output...\")\n        # Check if we can get the audio link\n        response = requests.get(\n            f\"{BASE_URL}/api/posts/{target_post['guid']}/audio\", stream=True\n        )\n        if response.status_code == 200:\n            log(\"Audio file is accessible.\")\n        else:\n            log(f\"Failed to access audio file. Status: {response.status_code}\")\n\n        # Check JSON details\n        response = requests.get(f\"{BASE_URL}/post/{target_post['guid']}/json\")\n        if response.status_code == 200:\n            data = response.json()\n            log(\n                f\"Post JSON retrieved. Transcript segments: {data.get('transcript_segment_count')}\"\n            )\n        else:\n            log(\"Failed to retrieve post JSON.\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/upgrade_db.sh",
    "content": "#!/usr/bin/env bash\n\nSCRIPT_DIR=$(cd \"$(dirname \"${BASH_SOURCE[0]}\")\" && pwd)\nREPO_ROOT=$(cd \"$SCRIPT_DIR/..\" && pwd)\n\nexport PODLY_INSTANCE_DIR=\"$REPO_ROOT/src/instance\"\nexport PYTHONPATH=\"$REPO_ROOT/src\"\n\npipenv run flask --app app db upgrade"
  },
  {
    "path": "src/app/__init__.py",
    "content": "import importlib\nimport json\nimport logging\nimport os\nimport secrets\nimport sys\nfrom pathlib import Path\nfrom typing import Any\n\nfrom flask import Flask, current_app, g, has_app_context, request\nfrom flask_cors import CORS\nfrom flask_migrate import upgrade\nfrom sqlalchemy import event\nfrom sqlalchemy.engine import Engine\n\nfrom app import models\nfrom app.auth import AuthSettings, load_auth_settings\nfrom app.auth.bootstrap import bootstrap_admin_user\nfrom app.auth.discord_settings import load_discord_settings\nfrom app.auth.middleware import init_auth_middleware\nfrom app.background import add_background_job, schedule_cleanup_job\nfrom app.config_store import (\n    ensure_defaults_and_hydrate,\n    hydrate_runtime_config_inplace,\n)\nfrom app.extensions import db, migrate, scheduler\nfrom app.jobs_manager import (\n    get_jobs_manager,\n)\nfrom app.logger import setup_logger\nfrom app.processor import (\n    ProcessorSingleton,\n)\nfrom app.routes import register_routes\nfrom app.runtime_config import config, is_test\nfrom app.writer.client import writer_client\nfrom shared.processing_paths import get_in_root, get_srv_root\n\nsetup_logger(\"global_logger\", \"src/instance/logs/app.log\")\nlogger = logging.getLogger(\"global_logger\")\n\n\ndef _env_bool(name: str, default: bool = False) -> bool:\n    raw = os.environ.get(name)\n    if raw is None:\n        return default\n    return raw.strip().lower() in {\"1\", \"true\", \"yes\", \"on\"}\n\n\ndef _get_sqlite_busy_timeout_ms() -> int:\n    # Longer timeout to allow large batch deletes/updates to finish before giving up\n    return 90000\n\n\ndef setup_dirs() -> None:\n    \"\"\"Create data directories. 
Logs a warning and continues if paths are not writable.\"\"\"\n    in_root = get_in_root()\n    srv_root = get_srv_root()\n    try:\n        os.makedirs(in_root, exist_ok=True)\n        os.makedirs(srv_root, exist_ok=True)\n    except OSError as exc:\n        # During CLI commands like migrations, the /app path may not exist\n        logger.warning(\n            \"Could not create data directories (%s, %s): %s. \"\n            \"This is expected during migrations on local dev.\",\n            in_root,\n            srv_root,\n            exc,\n        )\n\n\nclass SchedulerConfig:\n    SCHEDULER_JOBSTORES = {\n        \"default\": {\n            \"type\": \"sqlalchemy\",\n            \"url\": \"sqlite:////tmp/jobs.sqlite\",\n        }\n    }\n    SCHEDULER_EXECUTORS = {\"default\": {\"type\": \"threadpool\", \"max_workers\": 1}}\n    SCHEDULER_JOB_DEFAULTS = {\"coalesce\": False, \"max_instances\": 1}\n\n\n@event.listens_for(Engine, \"connect\", once=False)\ndef _set_sqlite_pragmas(dbapi_connection: Any, connection_record: Any) -> None:\n    module = getattr(dbapi_connection.__class__, \"__module__\", \"\")\n    if not module.startswith((\"sqlite3\", \"pysqlite2\")):\n        return\n\n    cursor = dbapi_connection.cursor()\n    busy_timeout_ms = _get_sqlite_busy_timeout_ms()\n    try:\n        cursor.execute(\"PRAGMA journal_mode=WAL;\")\n        cursor.execute(\"PRAGMA synchronous=NORMAL;\")\n        cursor.execute(f\"PRAGMA busy_timeout={busy_timeout_ms};\")\n        # Limit WAL file size to prevent checkpoint starvation\n        cursor.execute(\"PRAGMA wal_autocheckpoint=1000;\")\n    finally:\n        cursor.close()\n\n\ndef setup_scheduler(app: Flask) -> None:\n    \"\"\"Initialize and start the scheduler.\"\"\"\n    if not is_test:\n        scheduler.init_app(app)\n        scheduler.start()\n\n\ndef create_app() -> Flask:\n    disable_scheduler = _env_bool(\"PODLY_DISABLE_SCHEDULER\", default=False)\n    run_startup = _env_bool(\"PODLY_RUN_STARTUP\", 
default=True)\n    return _create_configured_app(\n        app_role=\"web\",\n        run_startup=run_startup,\n        start_scheduler=not disable_scheduler,\n    )\n\n\ndef create_web_app() -> Flask:\n    \"\"\"Create the web (read-mostly) Flask app.\n\n    This app should not run startup migrations/bootstrapping; DB writes are\n    delegated to the writer service. Scheduler runs here so background processing\n    happens in the web process.\n    \"\"\"\n    return _create_configured_app(\n        app_role=\"web\",\n        run_startup=False,\n        start_scheduler=True,\n    )\n\n\ndef create_writer_app() -> Flask:\n    \"\"\"Create the writer Flask app.\n\n    This app owns startup migrations/bootstrapping.\n    \"\"\"\n    return _create_configured_app(\n        app_role=\"writer\",\n        run_startup=True,\n        start_scheduler=False,\n    )\n\n\ndef _create_configured_app(\n    *,\n    app_role: str,\n    run_startup: bool,\n    start_scheduler: bool,\n) -> Flask:\n    # Setup directories early but only when actually creating the app (not during migrations)\n    if not is_test:\n        setup_dirs()\n\n    app = _create_flask_app()\n    app.config[\"PODLY_APP_ROLE\"] = app_role\n    auth_settings = _load_auth_settings()\n    _apply_auth_settings(app, auth_settings)\n    _configure_session(app, auth_settings)\n    _configure_cors(app)\n    _configure_scheduler(app)\n    _configure_database(app)\n    _configure_external_loggers()\n    _initialize_extensions(app)\n    _register_routes_and_middleware(app)\n\n    app.config[\"developer_mode\"] = config.developer_mode\n\n    with app.app_context():\n        if run_startup:\n            _run_app_startup(auth_settings)\n        else:\n            _hydrate_web_config()\n\n        discord_settings = load_discord_settings()\n        app.config[\"DISCORD_SETTINGS\"] = discord_settings\n\n    app.config[\"AUTH_SETTINGS\"] = auth_settings.without_password()\n\n    if app.config[\"DISCORD_SETTINGS\"].enabled:\n      
  logger.info(\n            \"Discord SSO enabled (guild restriction: %s)\",\n            \"yes\" if app.config[\"DISCORD_SETTINGS\"].guild_ids else \"no\",\n        )\n\n    _validate_env_key_conflicts()\n    if start_scheduler:\n        _start_scheduler_and_jobs(app)\n    return app\n\n\ndef _clear_scheduler_jobstore() -> None:\n    \"\"\"Remove persisted APScheduler jobs so startup adds a clean schedule.\"\"\"\n    jobstore_config = SchedulerConfig.SCHEDULER_JOBSTORES.get(\"default\")\n    if not isinstance(jobstore_config, dict):\n        return\n\n    url = jobstore_config.get(\"url\")\n    if not isinstance(url, str):\n        return\n\n    prefix = \"sqlite:///\"\n    if not url.startswith(prefix):\n        return\n\n    relative_path = url[len(prefix) :]\n    project_root = Path(__file__).resolve().parents[2]\n    jobstore_path = (project_root / Path(relative_path)).resolve()\n    jobstore_path.parent.mkdir(parents=True, exist_ok=True)\n\n    sidecars = [\n        jobstore_path,\n        jobstore_path.with_name(jobstore_path.name + \"-wal\"),\n        jobstore_path.with_name(jobstore_path.name + \"-shm\"),\n    ]\n\n    try:\n        cleared_any = False\n        for path in sidecars:\n            if path.exists():\n                path.unlink()\n                cleared_any = True\n\n        if cleared_any:\n            logger.info(\n                \"Startup: cleared persisted APScheduler jobs at %s\", jobstore_path\n            )\n    except OSError as exc:\n        logger.warning(\n            \"Startup: failed to clear APScheduler jobs at %s: %s\", jobstore_path, exc\n        )\n\n\ndef _validate_env_key_conflicts() -> None:\n    \"\"\"Validate that environment API key variables are not conflicting.\n\n    Rules:\n    - If both LLM_API_KEY and GROQ_API_KEY are set and differ -> error\n    \"\"\"\n    llm_key = os.environ.get(\"LLM_API_KEY\")\n    groq_key = os.environ.get(\"GROQ_API_KEY\")\n\n    conflicts: list[str] = []\n    if llm_key and groq_key and 
llm_key != groq_key:\n        conflicts.append(\n            \"LLM_API_KEY and GROQ_API_KEY are both set but have different values\"\n        )\n\n    if conflicts:\n        details = \"; \".join(conflicts)\n        message = (\n            \"Configuration error: Conflicting environment API keys detected. \"\n            f\"{details}. To use Groq, prefer setting GROQ_API_KEY only; \"\n            \"alternatively, set the variables to the same value.\"\n        )\n        # Crash the process so Docker start fails clearly\n        raise SystemExit(message)\n\n\ndef _create_flask_app() -> Flask:\n    static_folder = os.path.abspath(os.path.join(os.path.dirname(__file__), \"static\"))\n    return Flask(__name__, static_folder=static_folder)\n\n\ndef _load_auth_settings() -> AuthSettings:\n    try:\n        return load_auth_settings()\n    except RuntimeError as exc:\n        logger.critical(\"Authentication configuration error: %s\", exc)\n        raise\n\n\ndef _apply_auth_settings(app: Flask, auth_settings: AuthSettings) -> None:\n    app.config[\"AUTH_SETTINGS\"] = auth_settings\n    app.config[\"REQUIRE_AUTH\"] = auth_settings.require_auth\n    app.config[\"AUTH_ADMIN_USERNAME\"] = auth_settings.admin_username\n\n\ndef _configure_session(app: Flask, auth_settings: AuthSettings) -> None:\n    secret_key = os.environ.get(\"PODLY_SECRET_KEY\")\n    if not secret_key:\n        try:\n            secret_key = secrets.token_urlsafe(64)\n        except Exception as exc:  # pylint: disable=broad-except\n            raise RuntimeError(\"Failed to generate session secret key.\") from exc\n        if auth_settings.require_auth:\n            logger.warning(\n                \"Generated ephemeral session secret key because PODLY_SECRET_KEY is not set; \"\n                \"all sessions will be invalidated on restart.\"\n            )\n\n    app.config[\"SECRET_KEY\"] = secret_key\n    app.config[\"SESSION_COOKIE_NAME\"] = os.environ.get(\n        \"PODLY_SESSION_COOKIE_NAME\", 
\"podly_session\"\n    )\n    app.config[\"SESSION_COOKIE_HTTPONLY\"] = True\n    app.config[\"SESSION_COOKIE_SAMESITE\"] = \"Lax\"\n\n    # We always allow HTTP cookies so self-hosted installs work behind simple HTTP reverse proxies.\n    app.config[\"SESSION_COOKIE_SECURE\"] = False\n\n\ndef _configure_cors(app: Flask) -> None:\n    default_cors = [\n        \"http://localhost:5173\",\n        \"http://127.0.0.1:5173\",\n    ]\n    cors_origins_env = os.environ.get(\"CORS_ORIGINS\")\n    if cors_origins_env:\n        cors_origins = [\n            origin.strip() for origin in cors_origins_env.split(\",\") if origin.strip()\n        ]\n    else:\n        cors_origins = default_cors\n    CORS(\n        app,\n        resources={r\"/*\": {\"origins\": cors_origins}},\n        allow_headers=[\"Content-Type\", \"Authorization\", \"Range\"],\n        methods=[\"GET\", \"POST\", \"PUT\", \"DELETE\", \"OPTIONS\"],\n        supports_credentials=True,\n    )\n\n\ndef _configure_scheduler(app: Flask) -> None:\n    app.config.from_object(SchedulerConfig())\n\n\ndef _configure_database(app: Flask) -> None:\n    def _get_sqlite_connect_timeout() -> int:\n        return 60\n\n    uri_scheme = \"sqlite\"\n    connect_timeout = _get_sqlite_connect_timeout()\n    app.config[\"SQLALCHEMY_DATABASE_URI\"] = (\n        f\"{uri_scheme}:///sqlite3.db?timeout={connect_timeout}\"\n    )\n    engine_options: dict[str, Any] = {\n        \"connect_args\": {\n            \"timeout\": connect_timeout,\n        },\n        # Keep pool small to reduce concurrent SQLite writers\n        \"pool_size\": 5,\n        \"max_overflow\": 5,\n    }\n\n    app.config[\"SQLALCHEMY_ENGINE_OPTIONS\"] = engine_options\n    app.config[\"SQLALCHEMY_TRACK_MODIFICATIONS\"] = False\n\n\ndef _configure_external_loggers() -> None:\n    groq_logger = logging.getLogger(\"groq\")\n    groq_logger.setLevel(logging.INFO)\n\n\ndef _configure_readonly_sessions(app: Flask) -> None:\n    \"\"\"\n    Configure SQLAlchemy 
sessions to be read-only for the web/API app.\n    This prevents Flask from acquiring write locks on the database, which\n    can cause deadlocks with the writer service.\n\n    Only the writer service should perform database writes.\n    \"\"\"\n    from sqlalchemy.orm import Session\n\n    @event.listens_for(Session, \"after_begin\", once=False)\n    def receive_after_begin(\n        session: Session, transaction: Any, connection: Any\n    ) -> None:\n        \"\"\"Set new transactions to read-only by default.\"\"\"\n        # Only apply to sessions created within this app context\n        try:\n            if not has_app_context():\n                return\n            if current_app.config.get(\"PODLY_APP_ROLE\") != \"web\":\n                return\n        except Exception:  # pylint: disable=broad-except\n            return\n\n        # Set isolation level to prevent write locks\n        # For SQLite, this prevents RESERVED/EXCLUSIVE locks\n        connection.connection.isolation_level = \"DEFERRED\"\n\n        # Disable autoflush to prevent accidental writes\n        session.autoflush = False\n\n        # Mark session as read-only to prevent any writes\n        session.info[\"readonly\"] = True\n\n    @event.listens_for(Session, \"before_flush\", once=False)\n    def receive_before_flush(\n        session: Session, flush_context: Any, instances: Any\n    ) -> None:\n        \"\"\"Prevent accidental writes in read-only sessions.\"\"\"\n        try:\n            if not has_app_context():\n                return\n            if current_app.config.get(\"PODLY_APP_ROLE\") != \"web\":\n                return\n        except Exception:  # pylint: disable=broad-except\n            return\n\n        if session.info.get(\"readonly\"):\n            raise RuntimeError(\n                \"Attempted to flush changes in read-only session. 
\"\n                \"All database writes must go through the writer service.\"\n            )\n\n\ndef _initialize_extensions(app: Flask) -> None:\n    db.init_app(app)\n    migrate.init_app(app, db)\n\n    # Configure read-only mode for web/API Flask app to prevent database locks\n    # Only the writer service should acquire write locks\n    if app.config.get(\"PODLY_APP_ROLE\") == \"web\":\n        _configure_readonly_sessions(app)\n\n\ndef _register_routes_and_middleware(app: Flask) -> None:\n    register_routes(app)\n    init_auth_middleware(app)\n\n    _register_api_logging(app)\n\n\ndef _register_api_logging(app: Flask) -> None:\n    @app.after_request\n    def _log_api_request(response: Any) -> Any:\n        try:\n            path = request.path\n        except Exception:  # pragma: no cover  # pylint: disable=broad-except\n            return response\n\n        if not path.startswith(\"/api/\"):\n            return response\n\n        method = request.method\n        status = getattr(response, \"status_code\", None)\n\n        user = getattr(g, \"current_user\", None)\n        user_id = getattr(user, \"id\", None)\n\n        logger.info(\n            \"[API] %s %s status=%s user_id=%s content_type=%s\",\n            method,\n            path,\n            status,\n            user_id,\n            getattr(response, \"content_type\", None),\n        )\n\n        return response\n\n\ndef _run_app_startup(auth_settings: AuthSettings) -> None:\n    upgrade()\n    bootstrap_admin_user(auth_settings)\n    try:\n        ensure_defaults_and_hydrate()\n\n        ProcessorSingleton.reset_instance()\n    except Exception as exc:  # pylint: disable=broad-except\n        logger.error(f\"Failed to initialize settings: {exc}\")\n\n\ndef _hydrate_web_config() -> None:\n    \"\"\"Hydrate runtime config for web app (read-only).\"\"\"\n    hydrate_runtime_config_inplace()\n\n    ProcessorSingleton.reset_instance()\n\n\ndef _start_scheduler_and_jobs(app: Flask) -> None:\n    
_clear_scheduler_jobstore()\n    setup_scheduler(app)\n\n    jobs_manager = get_jobs_manager()\n    clear_result = jobs_manager.clear_all_jobs()\n    if clear_result[\"status\"] == \"success\":\n        logger.info(f\"Startup: {clear_result['message']}\")\n    else:\n        logger.warning(f\"Startup job clearing failed: {clear_result['message']}\")\n\n    add_background_job(\n        10\n        if config.background_update_interval_minute is None\n        else int(config.background_update_interval_minute)\n    )\n    schedule_cleanup_job(getattr(config, \"post_cleanup_retention_days\", None))\n"
  },
  {
    "path": "src/app/auth/__init__.py",
    "content": "\"\"\"\nAuthentication package exposing configuration helpers and utilities.\n\"\"\"\n\nfrom .guards import is_auth_enabled, require_admin\nfrom .settings import AuthSettings, load_auth_settings\n\n__all__ = [\"AuthSettings\", \"load_auth_settings\", \"require_admin\", \"is_auth_enabled\"]\n"
  },
  {
    "path": "src/app/auth/bootstrap.py",
    "content": "from __future__ import annotations\n\nimport logging\n\nfrom flask import current_app\n\nfrom app.db_commit import safe_commit\nfrom app.extensions import db\nfrom app.models import User\nfrom app.writer.client import writer_client\n\nfrom .settings import AuthSettings\n\nlogger = logging.getLogger(\"global_logger\")\n\n\ndef bootstrap_admin_user(auth_settings: AuthSettings) -> None:\n    \"\"\"Ensure an administrator user exists when auth is required.\"\"\"\n    logger.info(\"Bootstrapping admin user...\")\n\n    if not auth_settings.require_auth:\n        return\n\n    # Avoid seeding if users already exist.\n    current_admin = db.session.query(User.id).limit(1).first()\n    if current_admin is not None:\n        logger.info(\"Admin user already exists; skipping bootstrap.\")\n        return\n\n    password = auth_settings.admin_password\n    if not password:\n        logger.error(\n            \"REQUIRE_AUTH=true but PODLY_ADMIN_PASSWORD is missing during bootstrap.\"\n        )\n        raise RuntimeError(\n            \"Authentication bootstrap failed: PODLY_ADMIN_PASSWORD is required.\"\n        )\n\n    username = auth_settings.admin_username\n    role = current_app.config.get(\"PODLY_APP_ROLE\")\n    if role == \"writer\":\n        user = User(username=username, role=\"admin\")\n        user.set_password(password)\n\n        db.session.add(user)\n        safe_commit(\n            db.session,\n            must_succeed=True,\n            context=\"bootstrap_admin_user\",\n            logger_obj=logger,\n        )\n    else:\n        res = writer_client.action(\n            \"create_user\",\n            {\"username\": username, \"password\": password, \"role\": \"admin\"},\n            wait=True,\n        )\n        if not res or not res.success:\n            # If another process created the admin concurrently, treat as success.\n            if \"already exists\" not in str(getattr(res, \"error\", \"\")):\n                raise RuntimeError(\n 
                   getattr(res, \"error\", \"Failed to bootstrap admin user\")\n                )\n\n    logger.info(\n        \"Bootstrapped initial admin user '%s'. Ensure environment secrets are stored securely.\",\n        username,\n    )\n\n    # Clear the password from the Flask config if it was set to avoid lingering plaintext.\n    current_app.config.pop(\"PODLY_ADMIN_PASSWORD\", None)\n"
  },
  {
    "path": "src/app/auth/discord_service.py",
    "content": "from __future__ import annotations\n\nimport logging\nimport secrets\nfrom dataclasses import dataclass\nfrom typing import Any\nfrom urllib.parse import urlencode\n\nimport httpx\n\nfrom app.auth.discord_settings import DiscordSettings\nfrom app.extensions import db\nfrom app.models import User\nfrom app.writer.client import writer_client\n\nlogger = logging.getLogger(\"global_logger\")\n\nDISCORD_API_BASE = \"https://discord.com/api/v10\"\nDISCORD_OAUTH2_AUTHORIZE = \"https://discord.com/oauth2/authorize\"\nDISCORD_OAUTH2_TOKEN = \"https://discord.com/api/oauth2/token\"\n\n\nclass DiscordAuthError(Exception):\n    \"\"\"Base error for Discord auth failures.\"\"\"\n\n\nclass DiscordGuildRequirementError(DiscordAuthError):\n    \"\"\"User is not in required guild(s).\"\"\"\n\n\nclass DiscordRegistrationDisabledError(DiscordAuthError):\n    \"\"\"Self-registration is disabled.\"\"\"\n\n\n@dataclass\nclass DiscordUser:\n    id: str\n    username: str\n\n\ndef generate_oauth_state() -> str:\n    \"\"\"Generate a secure random state parameter for OAuth2.\"\"\"\n    return secrets.token_urlsafe(32)\n\n\ndef build_authorization_url(\n    settings: DiscordSettings, state: str, prompt: str = \"none\"\n) -> str:\n    \"\"\"Build the Discord OAuth2 authorization URL.\"\"\"\n    scopes = [\"identify\"]\n    if settings.guild_ids:\n        scopes.append(\"guilds\")\n\n    params = {\n        \"client_id\": settings.client_id,\n        \"redirect_uri\": settings.redirect_uri,\n        \"response_type\": \"code\",\n        \"scope\": \" \".join(scopes),\n        \"state\": state,\n    }\n    if prompt:\n        params[\"prompt\"] = prompt\n    return f\"{DISCORD_OAUTH2_AUTHORIZE}?{urlencode(params)}\"\n\n\ndef exchange_code_for_token(settings: DiscordSettings, code: str) -> dict[str, Any]:\n    \"\"\"Exchange an authorization code for an access token (synchronous).\"\"\"\n    with httpx.Client() as client:\n        response = client.post(\n            
DISCORD_OAUTH2_TOKEN,\n            data={\n                \"client_id\": settings.client_id,\n                \"client_secret\": settings.client_secret,\n                \"grant_type\": \"authorization_code\",\n                \"code\": code,\n                \"redirect_uri\": settings.redirect_uri,\n            },\n            headers={\"Content-Type\": \"application/x-www-form-urlencoded\"},\n        )\n        response.raise_for_status()\n        result: dict[str, Any] = response.json()\n        return result\n\n\ndef get_discord_user(access_token: str) -> DiscordUser:\n    \"\"\"Fetch Discord user info using an access token (synchronous).\"\"\"\n    with httpx.Client() as client:\n        response = client.get(\n            f\"{DISCORD_API_BASE}/users/@me\",\n            headers={\"Authorization\": f\"Bearer {access_token}\"},\n        )\n        response.raise_for_status()\n        data = response.json()\n        return DiscordUser(\n            id=data[\"id\"],\n            username=data[\"username\"],\n        )\n\n\ndef check_guild_membership(access_token: str, settings: DiscordSettings) -> bool:\n    \"\"\"Check if user is in any of the required guilds (synchronous).\"\"\"\n    if not settings.guild_ids:\n        return True\n\n    with httpx.Client() as client:\n        response = client.get(\n            f\"{DISCORD_API_BASE}/users/@me/guilds\",\n            headers={\"Authorization\": f\"Bearer {access_token}\"},\n        )\n        response.raise_for_status()\n        user_guilds = {g[\"id\"] for g in response.json()}\n\n        return any(gid in user_guilds for gid in settings.guild_ids)\n\n\ndef find_or_create_user_from_discord(\n    discord_user: DiscordUser,\n    settings: DiscordSettings,\n) -> User:\n    \"\"\"Find an existing user by Discord ID or create a new one.\"\"\"\n    result = writer_client.action(\n        \"upsert_discord_user\",\n        {\n            \"discord_id\": discord_user.id,\n            \"discord_username\": 
discord_user.username,\n            \"allow_registration\": settings.allow_registration,\n        },\n        wait=True,\n    )\n    if not result or not result.success or not isinstance(result.data, dict):\n        err = getattr(result, \"error\", \"Failed to upsert Discord user\")\n        if \"disabled\" in str(err).lower():\n            raise DiscordRegistrationDisabledError(str(err))\n        raise DiscordAuthError(str(err))\n\n    user_id = int(result.data[\"user_id\"])\n    user = db.session.get(User, user_id)\n    if user is None:\n        raise DiscordAuthError(\"Discord user upserted but not found\")\n    return user\n"
  },
  {
    "path": "src/app/auth/discord_settings.py",
    "content": "from __future__ import annotations\n\nimport os\nfrom dataclasses import dataclass\nfrom typing import TYPE_CHECKING\n\nif TYPE_CHECKING:\n    from flask import Flask\n\n    from app.models import DiscordSettings as DiscordSettingsModel\n\n\n@dataclass(slots=True, frozen=True)\nclass DiscordSettings:\n    enabled: bool\n    client_id: str | None\n    client_secret: str | None\n    redirect_uri: str | None\n    guild_ids: list[str]\n    allow_registration: bool\n\n\ndef load_discord_settings() -> DiscordSettings:\n    \"\"\"Load Discord OAuth2 settings from environment variables and database.\n\n    Environment variables take precedence over database values.\n    \"\"\"\n    # Try to load from database first\n    db_settings = _load_from_database()\n\n    # Environment variables override database values\n    client_id = os.environ.get(\"DISCORD_CLIENT_ID\") or (\n        db_settings.client_id if db_settings else None\n    )\n    client_secret = os.environ.get(\"DISCORD_CLIENT_SECRET\") or (\n        db_settings.client_secret if db_settings else None\n    )\n    redirect_uri = os.environ.get(\"DISCORD_REDIRECT_URI\") or (\n        db_settings.redirect_uri if db_settings else None\n    )\n\n    enabled = bool(client_id and client_secret and redirect_uri)\n\n    # Guild IDs: env var takes precedence\n    guild_ids_env = os.environ.get(\"DISCORD_GUILD_IDS\", \"\")\n    if guild_ids_env:\n        guild_ids = [g.strip() for g in guild_ids_env.split(\",\") if g.strip()]\n    elif db_settings and db_settings.guild_ids:\n        guild_ids = [g.strip() for g in db_settings.guild_ids.split(\",\") if g.strip()]\n    else:\n        guild_ids = []\n\n    # Allow registration: env var takes precedence\n    allow_reg_env = os.environ.get(\"DISCORD_ALLOW_REGISTRATION\")\n    if allow_reg_env is not None:\n        allow_registration = allow_reg_env.lower() in (\"true\", \"1\", \"yes\")\n    elif db_settings is not None:\n        allow_registration = 
db_settings.allow_registration\n    else:\n        allow_registration = True\n\n    return DiscordSettings(\n        enabled=enabled,\n        client_id=client_id,\n        client_secret=client_secret,\n        redirect_uri=redirect_uri,\n        guild_ids=guild_ids,\n        allow_registration=allow_registration,\n    )\n\n\ndef _load_from_database() -> \"DiscordSettingsModel | None\":\n    \"\"\"Load Discord settings from database, returns None if not available.\"\"\"\n    try:\n        from app.extensions import db\n        from app.models import DiscordSettings as DiscordSettingsModel\n\n        return db.session.get(DiscordSettingsModel, 1)\n    except Exception:\n        # Database not initialized or table doesn't exist yet\n        return None\n\n\ndef reload_discord_settings(app: \"Flask\") -> DiscordSettings:\n    \"\"\"Reload Discord settings and update app config.\"\"\"\n    settings = load_discord_settings()\n    app.config[\"DISCORD_SETTINGS\"] = settings\n    return settings\n"
  },
  {
    "path": "src/app/auth/feed_tokens.py",
    "content": "from __future__ import annotations\n\nimport hashlib\nimport logging\nimport secrets\nfrom dataclasses import dataclass\nfrom typing import Optional\n\nfrom app.auth.service import AuthenticatedUser\nfrom app.extensions import db\nfrom app.models import Feed, FeedAccessToken, Post, User, UserFeed\nfrom app.writer.client import writer_client\n\nlogger = logging.getLogger(\"global_logger\")\n\n\ndef _hash_token(secret_value: str) -> str:\n    return hashlib.sha256(secret_value.encode(\"utf-8\")).hexdigest()\n\n\n@dataclass(slots=True)\nclass FeedTokenAuthResult:\n    user: AuthenticatedUser\n    feed_id: int | None\n    token: FeedAccessToken\n\n\ndef _validate_token_access(token: FeedAccessToken, user: User, path: str) -> bool:\n    # Handle Aggregate Token (feed_id is None)\n    if token.feed_id is None:\n        # 1. If accessing the aggregate feed itself (/feed/user/<uid>)\n        #    Validate that the token belongs to the requested user\n        requested_user_id = _resolve_user_id_from_feed_path(path)\n        if requested_user_id is not None:\n            return bool(requested_user_id == user.id)\n\n        # 2. 
If accessing a specific resource (audio/post), verify subscription\n        resource_feed_id = _resolve_feed_id(path)\n        if resource_feed_id is not None:\n            return _verify_subscription(user, resource_feed_id)\n\n        # The path is neither the aggregate feed nor a resolvable resource, so\n        # there is nothing to scope the aggregate token against here; allow it\n        # and rely on route-level authorization for the specific resource.\n        return True\n\n    # Handle Specific Feed Token\n    feed_id = _resolve_feed_id(path)\n    if feed_id is None or feed_id != token.feed_id:\n        return False\n\n    return _verify_subscription(user, token.feed_id)\n\n\ndef create_feed_access_token(user: User, feed: Feed | None) -> tuple[str, str]:\n    feed_id = feed.id if feed else None\n    result = writer_client.action(\n        \"create_feed_access_token\",\n        {\"user_id\": user.id, \"feed_id\": feed_id},\n        wait=True,\n    )\n    if not result or not result.success or not isinstance(result.data, dict):\n        raise RuntimeError(getattr(result, \"error\", \"Failed to create feed token\"))\n    return str(result.data[\"token_id\"]), str(result.data[\"secret\"])\n\n\ndef authenticate_feed_token(\n    token_id: str, secret: str, path: str\n) -> Optional[FeedTokenAuthResult]:\n    if not token_id:\n        return None\n\n    token = FeedAccessToken.query.filter_by(token_id=token_id, revoked=False).first()\n    if token is None:\n        return None\n\n    expected_hash = _hash_token(secret)\n    if not secrets.compare_digest(token.token_hash, expected_hash):\n        return None\n\n    user = db.session.get(User, token.user_id)\n    if user is None:\n        return None\n\n    if not _validate_token_access(token, user, path):\n        return None\n\n    writer_client.action(\n        \"touch_feed_access_token\",\n        {\"token_id\": token_id, \"secret\": secret},\n        wait=False,\n    )\n\n    return FeedTokenAuthResult(\n        user=AuthenticatedUser(id=user.id, username=user.username, role=user.role),\n        feed_id=token.feed_id,\n        token=token,\n    )\n\n\ndef _verify_subscription(user: User, feed_id: int) -> bool:\n    if user.role == \"admin\":\n        return True\n    # Hack: Always allow Feed 1\n    if feed_id == 1:\n        return True\n\n    membership = UserFeed.query.filter_by(user_id=user.id, feed_id=feed_id).first()\n    if not membership:\n        logger.warning(\n            \"Access denied: User %s has valid token but no active subscription for feed %s\",\n            user.id,\n            feed_id,\n        )\n        return False\n    return True\n\n\ndef _resolve_user_id_from_feed_path(path: str) -> Optional[int]:\n    if path.startswith(\"/feed/user/\"):\n        remainder = path[len(\"/feed/user/\") :]\n        try:\n            return int(remainder.split(\"/\", 1)[0])\n        except ValueError:\n            return None\n    return None\n\n\ndef _resolve_feed_id(path: str) -> Optional[int]:\n    if path.startswith(\"/feed/\"):\n        remainder = path[len(\"/feed/\") :]\n        try:\n            return int(remainder.split(\"/\", 1)[0])\n        except ValueError:\n            return None\n\n    if path.startswith(\"/api/posts/\"):\n        parts = path.split(\"/\")\n        if len(parts) < 4:\n            return None\n        guid = parts[3]\n        post = Post.query.filter_by(guid=guid).first()\n        return post.feed_id if post else None\n\n    if path.startswith(\"/post/\"):\n        remainder = path[len(\"/post/\") :]\n        guid = remainder.split(\"/\", 1)[0]\n        guid = guid.split(\".\", 1)[0]\n        post = Post.query.filter_by(guid=guid).first()\n        return post.feed_id if post else None\n\n    return None\n"
  },
  {
    "path": "src/app/auth/guards.py",
    "content": "\"\"\"Authorization guard utilities for admin and authenticated user checks.\"\"\"\n\nfrom typing import TYPE_CHECKING, Tuple\n\nimport flask\nfrom flask import current_app, g, jsonify\n\nfrom app.extensions import db\n\nif TYPE_CHECKING:\n    from app.models import User\n\n\ndef require_admin(\n    action: str = \"perform this action\",\n) -> Tuple[\"User | None\", flask.Response | None]:\n    \"\"\"Ensure the current user is an admin when auth is enabled.\n\n    When auth is disabled (AUTH_SETTINGS.require_auth == False),\n    returns (None, None) to allow the operation.\n\n    When auth is enabled:\n    - Returns (user, None) if user is authenticated and is admin\n    - Returns (None, error_response) if not authenticated or not admin\n\n    Args:\n        action: Description of the action for error messages.\n\n    Returns:\n        (user, error_response) tuple where only one is non-None.\n    \"\"\"\n    settings = current_app.config.get(\"AUTH_SETTINGS\")\n    if not settings or not settings.require_auth:\n        return None, None\n\n    current = getattr(g, \"current_user\", None)\n    if current is None:\n        return None, flask.make_response(\n            jsonify({\"error\": \"Authentication required.\"}), 401\n        )\n\n    from app.models import User\n\n    user: User | None = db.session.get(User, current.id)\n    if user is None:\n        return None, flask.make_response(jsonify({\"error\": \"User not found.\"}), 404)\n\n    if user.role != \"admin\":\n        return None, flask.make_response(\n            jsonify({\"error\": f\"Only admins can {action}.\"}), 403\n        )\n\n    return user, None\n\n\ndef is_auth_enabled() -> bool:\n    \"\"\"Check if authentication is enabled.\"\"\"\n    settings = current_app.config.get(\"AUTH_SETTINGS\")\n    return bool(settings and settings.require_auth)\n"
  },
  {
    "path": "src/app/auth/middleware.py",
    "content": "from __future__ import annotations\n\nimport re\nfrom typing import Any\n\nfrom flask import Response, current_app, g, jsonify, request, session\n\nfrom app.auth.feed_tokens import FeedTokenAuthResult, authenticate_feed_token\nfrom app.auth.service import AuthenticatedUser\nfrom app.auth.state import failure_rate_limiter\nfrom app.extensions import db\nfrom app.models import User\n\nSESSION_USER_KEY = \"user_id\"\n\n# Paths that remain public even when auth is required.\n_PUBLIC_PATHS: set[str] = {\n    \"/\",\n    \"/health\",\n    \"/robots.txt\",\n    \"/manifest.json\",\n    \"/favicon.ico\",\n    \"/api/auth/login\",\n    \"/api/auth/status\",\n    \"/api/auth/discord/status\",\n    \"/api/auth/discord/login\",\n    \"/api/auth/discord/callback\",\n    \"/api/landing/status\",\n    # Stripe webhooks must bypass auth to allow Stripe to deliver events\n    \"/api/billing/stripe-webhook\",\n}\n\n_PUBLIC_PREFIXES: tuple[str, ...] = (\n    \"/static/\",\n    \"/assets/\",\n    \"/images/\",\n    \"/fonts/\",\n    \"/.well-known/\",\n)\n\n_PUBLIC_EXTENSIONS: tuple[str, ...] = (\n    \".js\",\n    \".css\",\n    \".map\",\n    \".png\",\n    \".jpg\",\n    \".jpeg\",\n    \".gif\",\n    \".svg\",\n    \".ico\",\n    \".webp\",\n    \".txt\",\n)\n\n\n_TOKEN_PROTECTED_PATTERNS: tuple[re.Pattern[str], ...] 
= (\n    re.compile(r\"^/feed/[^/]+$\"),\n    re.compile(r\"^/feed/user/[^/]+$\"),\n    re.compile(r\"^/api/posts/[^/]+/(audio|download(?:/original)?)$\"),\n    re.compile(r\"^/post/[^/]+(?:\\.mp3|/original\\.mp3)$\"),\n)\n\n\ndef init_auth_middleware(app: Any) -> None:\n    \"\"\"Attach the authentication guard to the Flask app.\"\"\"\n\n    @app.before_request  # type: ignore[untyped-decorator]\n    def enforce_authentication() -> Response | None:\n        # pylint: disable=too-many-return-statements\n        if request.method == \"OPTIONS\":\n            return None\n\n        settings = current_app.config.get(\"AUTH_SETTINGS\")\n        if not settings or not settings.require_auth:\n            return None\n\n        if _is_public_request(request.path):\n            return None\n\n        client_identifier = request.remote_addr or \"unknown\"\n\n        session_user = _load_session_user()\n        if session_user is not None:\n            g.current_user = session_user\n            g.feed_token = None\n            failure_rate_limiter.register_success(client_identifier)\n            return None\n\n        if _is_token_protected_endpoint(request.path):\n            retry_after = failure_rate_limiter.retry_after(client_identifier)\n            if retry_after:\n                return _too_many_requests(retry_after)\n\n            token_result = _authenticate_feed_token_from_query()\n            if token_result is None:\n                backoff = failure_rate_limiter.register_failure(client_identifier)\n                response = _token_unauthorized()\n                if backoff:\n                    response.headers[\"Retry-After\"] = str(backoff)\n                return response\n\n            failure_rate_limiter.register_success(client_identifier)\n            g.current_user = token_result.user\n            g.feed_token = token_result\n            return None\n\n        return _json_unauthorized()\n\n\ndef _load_session_user() -> AuthenticatedUser | None:\n    raw_user_id = session.get(SESSION_USER_KEY)\n    if isinstance(raw_user_id, str) and raw_user_id.isdigit():\n        user_id = int(raw_user_id)\n    elif isinstance(raw_user_id, int):\n        user_id = raw_user_id\n    else:\n        return None\n\n    user = db.session.get(User, user_id)\n    if user is None:\n        session.pop(SESSION_USER_KEY, None)\n        return None\n\n    return AuthenticatedUser(id=user.id, username=user.username, role=user.role)\n\n\ndef _is_token_protected_endpoint(path: str) -> bool:\n    return any(pattern.match(path) for pattern in _TOKEN_PROTECTED_PATTERNS)\n\n\ndef _authenticate_feed_token_from_query() -> FeedTokenAuthResult | None:\n    token_id = request.args.get(\"feed_token\")\n    secret = request.args.get(\"feed_secret\")\n    if not token_id or not secret:\n        return None\n\n    return authenticate_feed_token(token_id, secret, request.path)\n\n\ndef _is_public_request(path: str) -> bool:\n    if path in _PUBLIC_PATHS:\n        return True\n\n    if any(path.startswith(prefix) for prefix in _PUBLIC_PREFIXES):\n        return True\n\n    if any(path.endswith(ext) for ext in _PUBLIC_EXTENSIONS):\n        return True\n\n    return False\n\n\ndef _json_unauthorized(message: str = \"Authentication required.\") -> Response:\n    response = jsonify({\"error\": message})\n    response.status_code = 401\n    return response\n\n\ndef _token_unauthorized() -> Response:\n    response = Response(\"Invalid or missing feed token\", status=401)\n    return response\n\n\ndef _too_many_requests(retry_after: int) -> Response:\n    response = Response(\"Too Many Authentication Attempts\", status=429)\n    response.headers[\"Retry-After\"] = str(retry_after)\n    return response\n"
  },
  {
    "path": "src/app/auth/passwords.py",
    "content": "from __future__ import annotations\n\nimport bcrypt\n\n\ndef hash_password(password: str, *, rounds: int = 12) -> str:\n    \"\"\"Hash a password using bcrypt with the provided work factor.\"\"\"\n    salt = bcrypt.gensalt(rounds)\n    hashed = bcrypt.hashpw(password.encode(\"utf-8\"), salt)\n    return hashed.decode(\"utf-8\")\n\n\ndef verify_password(password: str, password_hash: str) -> bool:\n    \"\"\"Verify the provided password against the stored bcrypt hash.\"\"\"\n    try:\n        return bcrypt.checkpw(\n            password.encode(\"utf-8\"),\n            password_hash.encode(\"utf-8\"),\n        )\n    except ValueError:\n        return False\n"
  },
  {
    "path": "src/app/auth/rate_limiter.py",
    "content": "from __future__ import annotations\n\nfrom collections.abc import MutableMapping\nfrom dataclasses import dataclass\nfrom datetime import datetime, timedelta\n\n\n@dataclass\nclass FailureState:\n    attempts: int\n    blocked_until: datetime | None\n    last_attempt: datetime\n\n\nclass FailureRateLimiter:\n    \"\"\"Simple in-memory exponential backoff tracker for authentication failures.\"\"\"\n\n    def __init__(\n        self,\n        *,\n        storage: MutableMapping[str, FailureState] | None = None,\n        max_backoff_seconds: int = 300,\n        warm_up_attempts: int = 3,\n    ) -> None:\n        self._storage = storage if storage is not None else {}\n        self._max_backoff_seconds = max_backoff_seconds\n        self._warm_up_attempts = warm_up_attempts\n\n    def register_failure(self, key: str) -> int:\n        now = datetime.utcnow()\n        state = self._storage.get(key)\n\n        if state is None:\n            state = FailureState(attempts=1, blocked_until=None, last_attempt=now)\n        else:\n            state.attempts += 1\n            state.last_attempt = now\n\n        backoff_seconds = 0\n        if state.attempts > self._warm_up_attempts:\n            exponent = state.attempts - self._warm_up_attempts\n            backoff_seconds = min(2**exponent, self._max_backoff_seconds)\n            state.blocked_until = now + timedelta(seconds=backoff_seconds)\n        else:\n            state.blocked_until = None\n\n        self._storage[key] = state\n        self._prune_stale(now)\n        return backoff_seconds\n\n    def register_success(self, key: str) -> None:\n        if key in self._storage:\n            del self._storage[key]\n\n    def retry_after(self, key: str) -> int | None:\n        state = self._storage.get(key)\n        if state is None or state.blocked_until is None:\n            return None\n\n        now = datetime.utcnow()\n        if state.blocked_until <= now:\n            del self._storage[key]\n            
return None\n\n        remaining = int((state.blocked_until - now).total_seconds())\n        if remaining <= 0:\n            del self._storage[key]\n            return None\n\n        return remaining\n\n    def _prune_stale(self, now: datetime) -> None:\n        stale_keys: list[str] = []\n        for key, state in self._storage.items():\n            if now - state.last_attempt > timedelta(hours=1):\n                stale_keys.append(key)\n\n        for key in stale_keys:\n            del self._storage[key]\n"
  },
  {
    "path": "src/app/auth/service.py",
    "content": "from __future__ import annotations\n\nimport logging\nfrom dataclasses import dataclass\nfrom typing import Sequence, cast\n\nfrom app.extensions import db\nfrom app.models import User\nfrom app.runtime_config import config as runtime_config\nfrom app.writer.client import writer_client\n\nlogger = logging.getLogger(\"global_logger\")\n\n\nclass AuthServiceError(Exception):\n    \"\"\"Base class for authentication domain errors.\"\"\"\n\n\nclass InvalidCredentialsError(AuthServiceError):\n    \"\"\"Raised when provided credentials are invalid.\"\"\"\n\n\nclass PasswordValidationError(AuthServiceError):\n    \"\"\"Raised when a password fails strength validation.\"\"\"\n\n\nclass DuplicateUserError(AuthServiceError):\n    \"\"\"Raised when attempting to create a user with an existing username.\"\"\"\n\n\nclass LastAdminRemovalError(AuthServiceError):\n    \"\"\"Raised when deleting or demoting the final admin user.\"\"\"\n\n\nclass UserLimitExceededError(AuthServiceError):\n    \"\"\"Raised when creating a user would exceed the configured limit.\"\"\"\n\n\nALLOWED_ROLES: set[str] = {\"admin\", \"user\"}\n\n\n@dataclass(slots=True)\nclass AuthenticatedUser:\n    id: int\n    username: str\n    role: str\n\n\ndef _normalize_username(username: str) -> str:\n    return username.strip().lower()\n\n\ndef authenticate(username: str, password: str) -> AuthenticatedUser | None:\n    user = User.query.filter_by(username=_normalize_username(username)).first()\n    if user is None:\n        return None\n    if not user.verify_password(password):\n        return None\n    return AuthenticatedUser(id=user.id, username=user.username, role=user.role)\n\n\ndef list_users() -> Sequence[User]:\n    return cast(\n        Sequence[User],\n        User.query.order_by(User.created_at.desc(), User.id.desc()).all(),\n    )\n\n\ndef create_user(username: str, password: str, role: str = \"user\") -> User:\n    normalized_username = _normalize_username(username)\n    if not 
normalized_username:\n        raise AuthServiceError(\"Username is required.\")\n\n    if role not in ALLOWED_ROLES:\n        raise AuthServiceError(f\"Role must be one of {sorted(ALLOWED_ROLES)}.\")\n\n    if User.query.filter_by(username=normalized_username).first():\n        raise DuplicateUserError(\"A user with that username already exists.\")\n\n    _enforce_user_limit()\n\n    result = writer_client.action(\n        \"create_user\",\n        {\"username\": normalized_username, \"password\": password, \"role\": role},\n        wait=True,\n    )\n    if not result or not result.success or not isinstance(result.data, dict):\n        raise AuthServiceError(getattr(result, \"error\", \"Failed to create user\"))\n\n    user_id = int(result.data[\"user_id\"])\n    user = db.session.get(User, user_id)\n    if user is None:\n        raise AuthServiceError(\"User created but not found\")\n    return user\n\n\ndef change_password(user: User, current_password: str, new_password: str) -> None:\n    if not user.verify_password(current_password):\n        raise InvalidCredentialsError(\"Current password is incorrect.\")\n\n    update_password(user, new_password)\n\n\ndef update_password(user: User, new_password: str) -> None:\n    result = writer_client.action(\n        \"update_user_password\",\n        {\"user_id\": user.id, \"new_password\": new_password},\n        wait=True,\n    )\n    if not result or not result.success:\n        raise AuthServiceError(getattr(result, \"error\", \"Failed to update password\"))\n    db.session.expire(user)\n\n\ndef delete_user(user: User) -> None:\n    if user.role == \"admin\" and _count_admins() <= 1:\n        raise LastAdminRemovalError(\"Cannot remove the last admin user.\")\n\n    result = writer_client.action(\"delete_user\", {\"user_id\": user.id}, wait=True)\n    if not result or not result.success:\n        raise AuthServiceError(getattr(result, \"error\", \"Failed to delete user\"))\n\n\ndef set_role(user: User, role: str) 
-> None:\n    if role not in ALLOWED_ROLES:\n        raise AuthServiceError(f\"Role must be one of {sorted(ALLOWED_ROLES)}.\")\n\n    if user.role == \"admin\" and role != \"admin\" and _count_admins() <= 1:\n        raise LastAdminRemovalError(\"Cannot demote the last admin user.\")\n\n    result = writer_client.action(\n        \"set_user_role\", {\"user_id\": user.id, \"role\": role}, wait=True\n    )\n    if not result or not result.success:\n        raise AuthServiceError(getattr(result, \"error\", \"Failed to set role\"))\n    db.session.expire(user)\n\n\ndef set_manual_feed_allowance(user: User, allowance: int | None) -> None:\n    result = writer_client.action(\n        \"set_manual_feed_allowance\",\n        {\"user_id\": user.id, \"allowance\": allowance},\n        wait=True,\n    )\n    if not result or not result.success:\n        raise AuthServiceError(getattr(result, \"error\", \"Failed to set allowance\"))\n    db.session.expire(user)\n\n\ndef update_user_last_active(user_id: int) -> None:\n    \"\"\"Update the last_active timestamp for a user.\"\"\"\n    writer_client.action(\n        \"update_user_last_active\",\n        {\"user_id\": user_id},\n        wait=False,\n    )\n\n\ndef _count_admins() -> int:\n    return cast(int, User.query.filter_by(role=\"admin\").count())\n\n\ndef _enforce_user_limit() -> None:\n    \"\"\"Prevent creating users beyond the configured total limit.\n\n    Limit applies only when authentication is enabled; a negative or missing\n    limit means unlimited users, while a limit of zero blocks all new user\n    creation.\n    \"\"\"\n\n    try:\n        limit = getattr(runtime_config, \"user_limit_total\", None)\n    except Exception:  # pragma: no cover - defensive\n        limit = None\n\n    if limit is None:\n        return\n\n    try:\n        limit_int = int(limit)\n    except Exception:\n        return\n\n    if limit_int < 0:\n        return\n\n    current_total = cast(int, User.query.count())\n    if limit_int == 0 or current_total >= limit_int:\n        raise UserLimitExceededError(\n            f\"User limit reached ({current_total}/{limit_int}). Delete a user or increase the limit.\"\n        )\n"
  },
  {
    "path": "src/app/auth/settings.py",
    "content": "from __future__ import annotations\n\nimport os\nfrom dataclasses import dataclass, replace\n\n\ndef _str_to_bool(value: str | None, default: bool = False) -> bool:\n    if value is None:\n        return default\n    lowered = value.strip().lower()\n    return lowered in {\"1\", \"true\", \"t\", \"yes\", \"y\", \"on\"}\n\n\n@dataclass(slots=True, frozen=True)\nclass AuthSettings:\n    \"\"\"Runtime authentication configuration derived from environment variables.\"\"\"\n\n    require_auth: bool\n    admin_username: str\n    admin_password: str | None\n\n    @property\n    def admin_password_required(self) -> bool:\n        return self.require_auth\n\n    def without_password(self) -> \"AuthSettings\":\n        \"\"\"Return a copy with the password removed to avoid retaining plaintext.\"\"\"\n        return replace(self, admin_password=None)\n\n\ndef load_auth_settings() -> AuthSettings:\n    \"\"\"Load authentication settings from environment variables.\"\"\"\n    require_auth = _str_to_bool(os.environ.get(\"REQUIRE_AUTH\"), default=False)\n    admin_username = os.environ.get(\"PODLY_ADMIN_USERNAME\", \"podly_admin\").strip()\n    admin_password = os.environ.get(\"PODLY_ADMIN_PASSWORD\")\n\n    if require_auth:\n        if not admin_username:\n            raise RuntimeError(\n                \"PODLY_ADMIN_USERNAME must be set to a non-empty value when \"\n                \"REQUIRE_AUTH=true.\"\n            )\n        if admin_password is None:\n            raise RuntimeError(\n                \"PODLY_ADMIN_PASSWORD must be provided when REQUIRE_AUTH=true.\"\n            )\n\n    return AuthSettings(\n        require_auth=require_auth,\n        admin_username=admin_username or \"podly_admin\",\n        admin_password=admin_password,\n    )\n"
  },
  {
    "path": "src/app/auth/state.py",
    "content": "from __future__ import annotations\n\nfrom .rate_limiter import FailureRateLimiter\n\nfailure_rate_limiter = FailureRateLimiter()\n"
  },
  {
    "path": "src/app/background.py",
    "content": "from datetime import datetime, timedelta\nfrom typing import Optional\n\nfrom app.extensions import scheduler\nfrom app.jobs_manager import (\n    scheduled_refresh_all_feeds,\n)\nfrom app.post_cleanup import scheduled_cleanup_processed_posts\n\n\ndef add_background_job(minutes: int) -> None:\n    \"\"\"Add the recurring background job for refreshing feeds.\n\n    minutes: interval in minutes; must be a positive integer.\n    \"\"\"\n\n    scheduler.add_job(\n        id=\"refresh_all_feeds\",\n        func=scheduled_refresh_all_feeds,\n        trigger=\"interval\",\n        minutes=minutes,\n        replace_existing=True,\n    )\n\n\ndef schedule_cleanup_job(retention_days: Optional[int]) -> None:\n    \"\"\"Ensure the periodic cleanup job is scheduled or disabled as needed.\"\"\"\n    job_id = \"cleanup_processed_posts\"\n    if retention_days is None or retention_days <= 0:\n        try:\n            scheduler.remove_job(job_id)\n        except Exception:\n            # Job may not be scheduled; ignore.\n            pass\n        return\n\n    # Run daily; allow scheduler to coalesce missed runs.\n    scheduler.add_job(\n        id=job_id,\n        func=scheduled_cleanup_processed_posts,\n        trigger=\"interval\",\n        hours=24,\n        next_run_time=datetime.utcnow() + timedelta(minutes=15),\n        replace_existing=True,\n    )\n"
  },
  {
    "path": "src/app/config_store.py",
    "content": "from __future__ import annotations\n\nimport hashlib\nimport logging\nimport os\nfrom typing import Any, Dict, Optional, Tuple\n\nfrom flask import current_app\n\nfrom app.db_commit import safe_commit\nfrom app.extensions import db, scheduler\nfrom app.models import (\n    AppSettings,\n    LLMSettings,\n    OutputSettings,\n    ProcessingSettings,\n    WhisperSettings,\n)\nfrom app.runtime_config import config as runtime_config\nfrom shared import defaults as DEFAULTS\nfrom shared.config import Config as PydanticConfig\nfrom shared.config import (\n    GroqWhisperConfig,\n    LocalWhisperConfig,\n    RemoteWhisperConfig,\n    TestWhisperConfig,\n)\n\n# pylint: disable=too-many-lines\n\n\nlogger = logging.getLogger(\"global_logger\")\n\n\ndef _is_empty(value: Any) -> bool:\n    return value is None or value == \"\"\n\n\ndef _parse_int(val: Any) -> Optional[int]:\n    try:\n        return int(val) if val is not None else None\n    except Exception:\n        return None\n\n\ndef _parse_bool(val: Any) -> Optional[bool]:\n    if val is None:\n        return None\n    s = str(val).strip().lower()\n    if s in {\"1\", \"true\", \"yes\", \"on\"}:\n        return True\n    if s in {\"0\", \"false\", \"no\", \"off\"}:\n        return False\n    return None\n\n\ndef _set_if_empty(obj: Any, attr: str, new_val: Any) -> bool:\n    if _is_empty(new_val):\n        return False\n    if _is_empty(getattr(obj, attr)):\n        setattr(obj, attr, new_val)\n        return True\n    return False\n\n\ndef _set_if_default(obj: Any, attr: str, new_val: Any, default_val: Any) -> bool:\n    if new_val is None:\n        return False\n    if getattr(obj, attr) == default_val:\n        setattr(obj, attr, new_val)\n        return True\n    return False\n\n\ndef _ensure_row(model: type, defaults: Dict[str, Any]) -> Any:\n    row = db.session.get(model, 1)\n    if row is None:\n        role = None\n        try:\n            role = current_app.config.get(\"PODLY_APP_ROLE\")\n       
 except Exception:  # pylint: disable=broad-except\n            role = None\n\n        # Web app should be read-only; only the writer process is allowed to create\n        # missing settings rows.\n        if role == \"writer\":\n            row = model(id=1, **defaults)\n            db.session.add(row)\n            safe_commit(\n                db.session,\n                must_succeed=True,\n                context=\"ensure_settings_row\",\n                logger_obj=logger,\n            )\n        else:\n            logger.warning(\n                \"Settings row %s missing; returning defaults without persisting (role=%s)\",\n                getattr(model, \"__name__\", str(model)),\n                role,\n            )\n            return model(id=1, **defaults)\n    return row\n\n\ndef ensure_defaults() -> None:\n    _ensure_row(\n        LLMSettings,\n        {\n            \"llm_model\": DEFAULTS.LLM_DEFAULT_MODEL,\n            \"openai_timeout\": DEFAULTS.OPENAI_DEFAULT_TIMEOUT_SEC,\n            \"openai_max_tokens\": DEFAULTS.OPENAI_DEFAULT_MAX_TOKENS,\n            \"llm_max_concurrent_calls\": DEFAULTS.LLM_DEFAULT_MAX_CONCURRENT_CALLS,\n            \"llm_max_retry_attempts\": DEFAULTS.LLM_DEFAULT_MAX_RETRY_ATTEMPTS,\n            \"llm_enable_token_rate_limiting\": DEFAULTS.LLM_ENABLE_TOKEN_RATE_LIMITING,\n            \"enable_boundary_refinement\": DEFAULTS.ENABLE_BOUNDARY_REFINEMENT,\n            \"enable_word_level_boundary_refinder\": DEFAULTS.ENABLE_WORD_LEVEL_BOUNDARY_REFINDER,\n        },\n    )\n\n    _ensure_row(\n        WhisperSettings,\n        {\n            \"whisper_type\": DEFAULTS.WHISPER_DEFAULT_TYPE,\n            \"local_model\": DEFAULTS.WHISPER_LOCAL_MODEL,\n            \"remote_model\": DEFAULTS.WHISPER_REMOTE_MODEL,\n            \"remote_base_url\": DEFAULTS.WHISPER_REMOTE_BASE_URL,\n            \"remote_language\": DEFAULTS.WHISPER_REMOTE_LANGUAGE,\n            \"remote_timeout_sec\": DEFAULTS.WHISPER_REMOTE_TIMEOUT_SEC,\n           
 \"remote_chunksize_mb\": DEFAULTS.WHISPER_REMOTE_CHUNKSIZE_MB,\n            \"groq_model\": DEFAULTS.WHISPER_GROQ_MODEL,\n            \"groq_language\": DEFAULTS.WHISPER_GROQ_LANGUAGE,\n            \"groq_max_retries\": DEFAULTS.WHISPER_GROQ_MAX_RETRIES,\n        },\n    )\n\n    _ensure_row(\n        ProcessingSettings,\n        {\n            \"num_segments_to_input_to_prompt\": DEFAULTS.PROCESSING_NUM_SEGMENTS_TO_INPUT_TO_PROMPT,\n        },\n    )\n\n    _ensure_row(\n        OutputSettings,\n        {\n            \"fade_ms\": DEFAULTS.OUTPUT_FADE_MS,\n            \"min_ad_segement_separation_seconds\": DEFAULTS.OUTPUT_MIN_AD_SEGMENT_SEPARATION_SECONDS,\n            \"min_ad_segment_length_seconds\": DEFAULTS.OUTPUT_MIN_AD_SEGMENT_LENGTH_SECONDS,\n            \"min_confidence\": DEFAULTS.OUTPUT_MIN_CONFIDENCE,\n        },\n    )\n\n    _ensure_row(\n        AppSettings,\n        {\n            \"background_update_interval_minute\": DEFAULTS.APP_BACKGROUND_UPDATE_INTERVAL_MINUTE,\n            \"automatically_whitelist_new_episodes\": DEFAULTS.APP_AUTOMATICALLY_WHITELIST_NEW_EPISODES,\n            \"post_cleanup_retention_days\": DEFAULTS.APP_POST_CLEANUP_RETENTION_DAYS,\n            \"number_of_episodes_to_whitelist_from_archive_of_new_feed\": DEFAULTS.APP_NUM_EPISODES_TO_WHITELIST_FROM_ARCHIVE_OF_NEW_FEED,\n            \"enable_public_landing_page\": DEFAULTS.APP_ENABLE_PUBLIC_LANDING_PAGE,\n            \"user_limit_total\": DEFAULTS.APP_USER_LIMIT_TOTAL,\n            \"autoprocess_on_download\": DEFAULTS.APP_AUTOPROCESS_ON_DOWNLOAD,\n        },\n    )\n\n\ndef _apply_llm_env_overrides_to_db(llm: Any) -> bool:\n    \"\"\"Apply LLM-related environment variable overrides to database settings.\n\n    Returns True if any settings were changed.\n    \"\"\"\n    changed = False\n\n    env_llm_key = (\n        os.environ.get(\"LLM_API_KEY\")\n        or os.environ.get(\"OPENAI_API_KEY\")\n        or os.environ.get(\"GROQ_API_KEY\")\n    )\n    changed = 
_set_if_empty(llm, \"llm_api_key\", env_llm_key) or changed\n\n    env_llm_model = os.environ.get(\"LLM_MODEL\")\n    changed = (\n        _set_if_default(llm, \"llm_model\", env_llm_model, DEFAULTS.LLM_DEFAULT_MODEL)\n        or changed\n    )\n\n    env_openai_base_url = os.environ.get(\"OPENAI_BASE_URL\")\n    changed = _set_if_empty(llm, \"openai_base_url\", env_openai_base_url) or changed\n\n    env_openai_timeout = _parse_int(os.environ.get(\"OPENAI_TIMEOUT\"))\n    changed = (\n        _set_if_default(\n            llm,\n            \"openai_timeout\",\n            env_openai_timeout,\n            DEFAULTS.OPENAI_DEFAULT_TIMEOUT_SEC,\n        )\n        or changed\n    )\n\n    env_openai_max_tokens = _parse_int(os.environ.get(\"OPENAI_MAX_TOKENS\"))\n    changed = (\n        _set_if_default(\n            llm,\n            \"openai_max_tokens\",\n            env_openai_max_tokens,\n            DEFAULTS.OPENAI_DEFAULT_MAX_TOKENS,\n        )\n        or changed\n    )\n\n    env_llm_max_concurrent = _parse_int(os.environ.get(\"LLM_MAX_CONCURRENT_CALLS\"))\n    changed = (\n        _set_if_default(\n            llm,\n            \"llm_max_concurrent_calls\",\n            env_llm_max_concurrent,\n            DEFAULTS.LLM_DEFAULT_MAX_CONCURRENT_CALLS,\n        )\n        or changed\n    )\n\n    env_llm_max_retries = _parse_int(os.environ.get(\"LLM_MAX_RETRY_ATTEMPTS\"))\n    changed = (\n        _set_if_default(\n            llm,\n            \"llm_max_retry_attempts\",\n            env_llm_max_retries,\n            DEFAULTS.LLM_DEFAULT_MAX_RETRY_ATTEMPTS,\n        )\n        or changed\n    )\n\n    env_llm_enable_token_rl = _parse_bool(\n        os.environ.get(\"LLM_ENABLE_TOKEN_RATE_LIMITING\")\n    )\n    if (\n        llm.llm_enable_token_rate_limiting == DEFAULTS.LLM_ENABLE_TOKEN_RATE_LIMITING\n        and env_llm_enable_token_rl is not None\n    ):\n        llm.llm_enable_token_rate_limiting = bool(env_llm_enable_token_rl)\n        changed = True\n\n    
env_llm_max_input_tokens_per_call = _parse_int(\n        os.environ.get(\"LLM_MAX_INPUT_TOKENS_PER_CALL\")\n    )\n    if (\n        llm.llm_max_input_tokens_per_call is None\n        and env_llm_max_input_tokens_per_call is not None\n    ):\n        llm.llm_max_input_tokens_per_call = env_llm_max_input_tokens_per_call\n        changed = True\n\n    env_llm_max_input_tokens_per_minute = _parse_int(\n        os.environ.get(\"LLM_MAX_INPUT_TOKENS_PER_MINUTE\")\n    )\n    if (\n        llm.llm_max_input_tokens_per_minute is None\n        and env_llm_max_input_tokens_per_minute is not None\n    ):\n        llm.llm_max_input_tokens_per_minute = env_llm_max_input_tokens_per_minute\n        changed = True\n\n    return changed\n\n\ndef _apply_whisper_env_overrides_to_db(whisper: Any) -> bool:\n    \"\"\"Apply Whisper-related environment variable overrides to database settings.\n\n    Returns True if any settings were changed.\n    \"\"\"\n    changed = False\n\n    # Respect explicit whisper type env if still default\n    env_whisper_type = os.environ.get(\"WHISPER_TYPE\")\n    if env_whisper_type and isinstance(env_whisper_type, str):\n        env_whisper_type_norm = env_whisper_type.strip().lower()\n        if env_whisper_type_norm in {\"local\", \"remote\", \"groq\"}:\n            changed = (\n                _set_if_default(\n                    whisper,\n                    \"whisper_type\",\n                    env_whisper_type_norm,\n                    DEFAULTS.WHISPER_DEFAULT_TYPE,\n                )\n                or changed\n            )\n\n    # If GROQ_API_KEY is provided, seed both LLM key and Groq whisper key if empty\n    groq_key = os.environ.get(\"GROQ_API_KEY\")\n    changed = _set_if_empty(whisper, \"groq_api_key\", groq_key) or changed\n\n    if whisper.whisper_type == \"remote\":\n        remote_key = os.environ.get(\"WHISPER_REMOTE_API_KEY\") or os.environ.get(\n            \"OPENAI_API_KEY\"\n        )\n        changed = _set_if_empty(whisper, 
\"remote_api_key\", remote_key) or changed\n\n        remote_base = os.environ.get(\"WHISPER_REMOTE_BASE_URL\") or os.environ.get(\n            \"OPENAI_BASE_URL\"\n        )\n        changed = (\n            _set_if_default(\n                whisper,\n                \"remote_base_url\",\n                remote_base,\n                DEFAULTS.WHISPER_REMOTE_BASE_URL,\n            )\n            or changed\n        )\n\n        remote_model = os.environ.get(\"WHISPER_REMOTE_MODEL\")\n        changed = (\n            _set_if_default(\n                whisper, \"remote_model\", remote_model, DEFAULTS.WHISPER_REMOTE_MODEL\n            )\n            or changed\n        )\n\n        remote_timeout = _parse_int(os.environ.get(\"WHISPER_REMOTE_TIMEOUT_SEC\"))\n        changed = (\n            _set_if_default(\n                whisper,\n                \"remote_timeout_sec\",\n                remote_timeout,\n                DEFAULTS.WHISPER_REMOTE_TIMEOUT_SEC,\n            )\n            or changed\n        )\n\n        remote_chunksize = _parse_int(os.environ.get(\"WHISPER_REMOTE_CHUNKSIZE_MB\"))\n        changed = (\n            _set_if_default(\n                whisper,\n                \"remote_chunksize_mb\",\n                remote_chunksize,\n                DEFAULTS.WHISPER_REMOTE_CHUNKSIZE_MB,\n            )\n            or changed\n        )\n\n    elif whisper.whisper_type == \"groq\":\n        groq_model_env = os.environ.get(\"GROQ_WHISPER_MODEL\") or os.environ.get(\n            \"WHISPER_GROQ_MODEL\"\n        )\n        changed = (\n            _set_if_default(\n                whisper, \"groq_model\", groq_model_env, DEFAULTS.WHISPER_GROQ_MODEL\n            )\n            or changed\n        )\n\n        groq_max_retries_env = _parse_int(os.environ.get(\"GROQ_MAX_RETRIES\"))\n        changed = (\n            _set_if_default(\n                whisper,\n                \"groq_max_retries\",\n                groq_max_retries_env,\n                
DEFAULTS.WHISPER_GROQ_MAX_RETRIES,\n            )\n            or changed\n        )\n\n    elif whisper.whisper_type == \"local\":\n        local_model_env = os.environ.get(\"WHISPER_LOCAL_MODEL\")\n        changed = (\n            _set_if_default(\n                whisper, \"local_model\", local_model_env, DEFAULTS.WHISPER_LOCAL_MODEL\n            )\n            or changed\n        )\n\n    return changed\n\n\ndef _apply_env_overrides_to_db_first_boot() -> None:\n    \"\"\"Persist environment-provided overrides into the DB on first boot.\n\n    Only updates fields that are at default/empty values so we don't clobber\n    user-changed settings after first start.\n    \"\"\"\n    llm = LLMSettings.query.get(1)\n    whisper = WhisperSettings.query.get(1)\n    processing = ProcessingSettings.query.get(1)\n    output = OutputSettings.query.get(1)\n    app_s = AppSettings.query.get(1)\n\n    assert llm and whisper and processing and output and app_s\n\n    changed = False\n    changed = _apply_llm_env_overrides_to_db(llm) or changed\n    changed = _apply_whisper_env_overrides_to_db(whisper) or changed\n\n    # Future: add processing/output/app env-to-db seeding if envs defined\n\n    if changed:\n        safe_commit(\n            db.session,\n            must_succeed=True,\n            context=\"env_overrides_to_db\",\n            logger_obj=logger,\n        )\n\n\ndef read_combined() -> Dict[str, Any]:\n    ensure_defaults()\n\n    llm = LLMSettings.query.get(1)\n    whisper = WhisperSettings.query.get(1)\n    processing = ProcessingSettings.query.get(1)\n    output = OutputSettings.query.get(1)\n    app_s = AppSettings.query.get(1)\n\n    assert llm and whisper and processing and output and app_s\n\n    whisper_payload: Dict[str, Any] = {\"whisper_type\": whisper.whisper_type}\n    if whisper.whisper_type == \"local\":\n        whisper_payload.update({\"model\": whisper.local_model})\n    elif whisper.whisper_type == \"remote\":\n        whisper_payload.update(\n     
       {\n                \"model\": whisper.remote_model,\n                \"api_key\": whisper.remote_api_key,\n                \"base_url\": whisper.remote_base_url,\n                \"language\": whisper.remote_language,\n                \"timeout_sec\": whisper.remote_timeout_sec,\n                \"chunksize_mb\": whisper.remote_chunksize_mb,\n            }\n        )\n    elif whisper.whisper_type == \"groq\":\n        whisper_payload.update(\n            {\n                \"api_key\": whisper.groq_api_key,\n                \"model\": whisper.groq_model,\n                \"language\": whisper.groq_language,\n                \"max_retries\": whisper.groq_max_retries,\n            }\n        )\n    elif whisper.whisper_type == \"test\":\n        whisper_payload.update({})\n\n    return {\n        \"llm\": {\n            \"llm_api_key\": llm.llm_api_key,\n            \"llm_model\": llm.llm_model,\n            \"openai_base_url\": llm.openai_base_url,\n            \"openai_timeout\": llm.openai_timeout,\n            \"openai_max_tokens\": llm.openai_max_tokens,\n            \"llm_max_concurrent_calls\": llm.llm_max_concurrent_calls,\n            \"llm_max_retry_attempts\": llm.llm_max_retry_attempts,\n            \"llm_max_input_tokens_per_call\": llm.llm_max_input_tokens_per_call,\n            \"llm_enable_token_rate_limiting\": llm.llm_enable_token_rate_limiting,\n            \"llm_max_input_tokens_per_minute\": llm.llm_max_input_tokens_per_minute,\n            \"enable_boundary_refinement\": llm.enable_boundary_refinement,\n            \"enable_word_level_boundary_refinder\": llm.enable_word_level_boundary_refinder,\n        },\n        \"whisper\": whisper_payload,\n        \"processing\": {\n            \"num_segments_to_input_to_prompt\": processing.num_segments_to_input_to_prompt,\n        },\n        \"output\": {\n            \"fade_ms\": output.fade_ms,\n            \"min_ad_segement_separation_seconds\": output.min_ad_segement_separation_seconds,\n   
         \"min_ad_segment_length_seconds\": output.min_ad_segment_length_seconds,\n            \"min_confidence\": output.min_confidence,\n        },\n        \"app\": {\n            \"background_update_interval_minute\": app_s.background_update_interval_minute,\n            \"automatically_whitelist_new_episodes\": app_s.automatically_whitelist_new_episodes,\n            \"post_cleanup_retention_days\": app_s.post_cleanup_retention_days,\n            \"number_of_episodes_to_whitelist_from_archive_of_new_feed\": app_s.number_of_episodes_to_whitelist_from_archive_of_new_feed,\n            \"enable_public_landing_page\": app_s.enable_public_landing_page,\n            \"user_limit_total\": app_s.user_limit_total,\n            \"autoprocess_on_download\": app_s.autoprocess_on_download,\n        },\n    }\n\n\ndef _update_section_llm(data: Dict[str, Any]) -> None:\n    row = LLMSettings.query.get(1)\n    assert row is not None\n    for key in [\n        \"llm_api_key\",\n        \"llm_model\",\n        \"openai_base_url\",\n        \"openai_timeout\",\n        \"openai_max_tokens\",\n        \"llm_max_concurrent_calls\",\n        \"llm_max_retry_attempts\",\n        \"llm_max_input_tokens_per_call\",\n        \"llm_enable_token_rate_limiting\",\n        \"llm_max_input_tokens_per_minute\",\n        \"enable_boundary_refinement\",\n        \"enable_word_level_boundary_refinder\",\n    ]:\n        if key in data:\n            new_val = data[key]\n            if key == \"llm_api_key\" and _is_empty(new_val):\n                continue\n            setattr(row, key, new_val)\n    safe_commit(\n        db.session,\n        must_succeed=True,\n        context=\"update_llm_settings\",\n        logger_obj=logger,\n    )\n\n\ndef _update_section_whisper(data: Dict[str, Any]) -> None:\n    row = WhisperSettings.query.get(1)\n    assert row is not None\n    if \"whisper_type\" in data and data[\"whisper_type\"] in {\n        \"local\",\n        \"remote\",\n        \"groq\",\n      
  \"test\",\n    }:\n        row.whisper_type = data[\"whisper_type\"]\n    if row.whisper_type == \"local\":\n        if \"model\" in data:\n            row.local_model = data[\"model\"]\n    elif row.whisper_type == \"remote\":\n        for key_map in [\n            (\"model\", \"remote_model\"),\n            (\"api_key\", \"remote_api_key\"),\n            (\"base_url\", \"remote_base_url\"),\n            (\"language\", \"remote_language\"),\n            (\"timeout_sec\", \"remote_timeout_sec\"),\n            (\"chunksize_mb\", \"remote_chunksize_mb\"),\n        ]:\n            src, dst = key_map\n            if src in data:\n                new_val = data[src]\n                if src == \"api_key\" and _is_empty(new_val):\n                    continue\n                setattr(row, dst, new_val)\n    elif row.whisper_type == \"groq\":\n        for key_map in [\n            (\"api_key\", \"groq_api_key\"),\n            (\"model\", \"groq_model\"),\n            (\"language\", \"groq_language\"),\n            (\"max_retries\", \"groq_max_retries\"),\n        ]:\n            src, dst = key_map\n            if src in data:\n                new_val = data[src]\n                if src == \"api_key\" and _is_empty(new_val):\n                    continue\n                setattr(row, dst, new_val)\n    else:\n        # test type has no extra fields\n        pass\n    safe_commit(\n        db.session,\n        must_succeed=True,\n        context=\"update_whisper_settings\",\n        logger_obj=logger,\n    )\n\n\ndef _update_section_processing(data: Dict[str, Any]) -> None:\n    row = ProcessingSettings.query.get(1)\n    assert row is not None\n    for key in [\n        \"num_segments_to_input_to_prompt\",\n    ]:\n        if key in data:\n            setattr(row, key, data[key])\n    safe_commit(\n        db.session,\n        must_succeed=True,\n        context=\"update_processing_settings\",\n        logger_obj=logger,\n    )\n\n\ndef _update_section_output(data: 
Dict[str, Any]) -> None:\n    row = OutputSettings.query.get(1)\n    assert row is not None\n    for key in [\n        \"fade_ms\",\n        \"min_ad_segement_separation_seconds\",\n        \"min_ad_segment_length_seconds\",\n        \"min_confidence\",\n    ]:\n        if key in data:\n            setattr(row, key, data[key])\n    safe_commit(\n        db.session,\n        must_succeed=True,\n        context=\"update_output_settings\",\n        logger_obj=logger,\n    )\n\n\ndef _update_section_app(data: Dict[str, Any]) -> Tuple[Optional[int], Optional[int]]:\n    row = AppSettings.query.get(1)\n    assert row is not None\n    old_interval: Optional[int] = row.background_update_interval_minute\n    old_retention: Optional[int] = row.post_cleanup_retention_days\n    for key in [\n        \"background_update_interval_minute\",\n        \"automatically_whitelist_new_episodes\",\n        \"post_cleanup_retention_days\",\n        \"number_of_episodes_to_whitelist_from_archive_of_new_feed\",\n        \"enable_public_landing_page\",\n        \"user_limit_total\",\n        \"autoprocess_on_download\",\n    ]:\n        if key in data:\n            setattr(row, key, data[key])\n    safe_commit(\n        db.session,\n        must_succeed=True,\n        context=\"update_app_settings\",\n        logger_obj=logger,\n    )\n    return old_interval, old_retention\n\n\ndef _maybe_reschedule_refresh_job(\n    old_interval: Optional[int], new_interval: Optional[int]\n) -> None:\n    if old_interval == new_interval:\n        return\n\n    job_id = \"refresh_all_feeds\"\n    job = scheduler.get_job(job_id)\n\n    if new_interval is None:\n        if job:\n            try:\n                scheduler.remove_job(job_id)\n            except Exception:\n                pass\n        return\n\n    if not job:\n        return\n\n    # Avoid importing app.background here (it creates a cycle for pylint).\n    # Use best-effort rescheduling on the underlying APScheduler instance.\n    
scheduler_obj = getattr(scheduler, \"scheduler\", scheduler)\n    reschedule = getattr(scheduler_obj, \"reschedule_job\", None)\n    if callable(reschedule):\n        reschedule(job_id, trigger=\"interval\", minutes=int(new_interval))\n\n\ndef _maybe_disable_cleanup_job(\n    old_retention: Optional[int], new_retention: Optional[int]\n) -> None:\n    if old_retention == new_retention:\n        return\n\n    job_id = \"cleanup_processed_posts\"\n    job = scheduler.get_job(job_id)\n\n    if new_retention is None or new_retention <= 0:\n        if job:\n            try:\n                scheduler.remove_job(job_id)\n            except Exception:\n                pass\n\n\ndef update_combined(payload: Dict[str, Any]) -> Dict[str, Any]:\n    if \"llm\" in payload:\n        _update_section_llm(payload[\"llm\"] or {})\n    if \"whisper\" in payload:\n        _update_section_whisper(payload[\"whisper\"] or {})\n    if \"processing\" in payload:\n        _update_section_processing(payload[\"processing\"] or {})\n    if \"output\" in payload:\n        _update_section_output(payload[\"output\"] or {})\n    if \"app\" in payload:\n        old_interval, old_retention = _update_section_app(payload[\"app\"] or {})\n\n        app_s = AppSettings.query.get(1)\n        if app_s:\n            _maybe_reschedule_refresh_job(\n                old_interval, app_s.background_update_interval_minute\n            )\n            _maybe_disable_cleanup_job(old_retention, app_s.post_cleanup_retention_days)\n\n    return read_combined()\n\n\ndef to_pydantic_config() -> PydanticConfig:\n    data = read_combined()\n    # Map whisper section to discriminated union config\n    whisper_obj: Optional[\n        LocalWhisperConfig | RemoteWhisperConfig | TestWhisperConfig | GroqWhisperConfig\n    ] = None\n    w = data[\"whisper\"]\n    wtype = w.get(\"whisper_type\")\n    if wtype == \"local\":\n        whisper_obj = LocalWhisperConfig(model=w.get(\"model\", \"base.en\"))\n    elif wtype == 
\"remote\":\n        whisper_obj = RemoteWhisperConfig(\n            model=w.get(\"model\", \"whisper-1\"),\n            # Allow boot without a remote API key so the UI can be used to set it\n            api_key=w.get(\"api_key\") or \"\",\n            base_url=w.get(\"base_url\", \"https://api.openai.com/v1\"),\n            language=w.get(\"language\", \"en\"),\n            timeout_sec=w.get(\"timeout_sec\", 600),\n            chunksize_mb=w.get(\"chunksize_mb\", 24),\n        )\n    elif wtype == \"groq\":\n        whisper_obj = GroqWhisperConfig(\n            # Allow boot without a Groq API key so the UI can be used to set it\n            api_key=w.get(\"api_key\") or \"\",\n            model=w.get(\"model\", DEFAULTS.WHISPER_GROQ_MODEL),\n            language=w.get(\"language\", \"en\"),\n            max_retries=w.get(\"max_retries\", 3),\n        )\n    elif wtype == \"test\":\n        whisper_obj = TestWhisperConfig()\n\n    return PydanticConfig(\n        llm_api_key=data[\"llm\"].get(\"llm_api_key\"),\n        llm_model=data[\"llm\"].get(\"llm_model\", DEFAULTS.LLM_DEFAULT_MODEL),\n        openai_base_url=data[\"llm\"].get(\"openai_base_url\"),\n        openai_max_tokens=int(\n            data[\"llm\"].get(\"openai_max_tokens\", DEFAULTS.OPENAI_DEFAULT_MAX_TOKENS)\n            or DEFAULTS.OPENAI_DEFAULT_MAX_TOKENS\n        ),\n        openai_timeout=int(\n            data[\"llm\"].get(\"openai_timeout\", DEFAULTS.OPENAI_DEFAULT_TIMEOUT_SEC)\n            or DEFAULTS.OPENAI_DEFAULT_TIMEOUT_SEC\n        ),\n        llm_max_concurrent_calls=int(\n            data[\"llm\"].get(\n                \"llm_max_concurrent_calls\", DEFAULTS.LLM_DEFAULT_MAX_CONCURRENT_CALLS\n            )\n            or DEFAULTS.LLM_DEFAULT_MAX_CONCURRENT_CALLS\n        ),\n        llm_max_retry_attempts=int(\n            data[\"llm\"].get(\n                \"llm_max_retry_attempts\", DEFAULTS.LLM_DEFAULT_MAX_RETRY_ATTEMPTS\n            )\n            or 
DEFAULTS.LLM_DEFAULT_MAX_RETRY_ATTEMPTS\n        ),\n        llm_max_input_tokens_per_call=data[\"llm\"].get(\"llm_max_input_tokens_per_call\"),\n        llm_enable_token_rate_limiting=bool(\n            data[\"llm\"].get(\n                \"llm_enable_token_rate_limiting\",\n                DEFAULTS.LLM_ENABLE_TOKEN_RATE_LIMITING,\n            )\n        ),\n        llm_max_input_tokens_per_minute=data[\"llm\"].get(\n            \"llm_max_input_tokens_per_minute\"\n        ),\n        enable_boundary_refinement=bool(\n            data[\"llm\"].get(\n                \"enable_boundary_refinement\",\n                DEFAULTS.ENABLE_BOUNDARY_REFINEMENT,\n            )\n        ),\n        enable_word_level_boundary_refinder=bool(\n            data[\"llm\"].get(\n                \"enable_word_level_boundary_refinder\",\n                DEFAULTS.ENABLE_WORD_LEVEL_BOUNDARY_REFINDER,\n            )\n        ),\n        output=data[\"output\"],\n        processing=data[\"processing\"],\n        background_update_interval_minute=data[\"app\"].get(\n            \"background_update_interval_minute\"\n        ),\n        post_cleanup_retention_days=data[\"app\"].get(\"post_cleanup_retention_days\"),\n        whisper=whisper_obj,\n        automatically_whitelist_new_episodes=bool(\n            data[\"app\"].get(\n                \"automatically_whitelist_new_episodes\",\n                DEFAULTS.APP_AUTOMATICALLY_WHITELIST_NEW_EPISODES,\n            )\n        ),\n        number_of_episodes_to_whitelist_from_archive_of_new_feed=int(\n            data[\"app\"].get(\n                \"number_of_episodes_to_whitelist_from_archive_of_new_feed\",\n                DEFAULTS.APP_NUM_EPISODES_TO_WHITELIST_FROM_ARCHIVE_OF_NEW_FEED,\n            )\n            or DEFAULTS.APP_NUM_EPISODES_TO_WHITELIST_FROM_ARCHIVE_OF_NEW_FEED\n        ),\n        enable_public_landing_page=bool(\n            data[\"app\"].get(\n                \"enable_public_landing_page\",\n                
DEFAULTS.APP_ENABLE_PUBLIC_LANDING_PAGE,\n            )\n        ),\n        user_limit_total=data[\"app\"].get(\n            \"user_limit_total\", DEFAULTS.APP_USER_LIMIT_TOTAL\n        ),\n        autoprocess_on_download=bool(\n            data[\"app\"].get(\n                \"autoprocess_on_download\",\n                DEFAULTS.APP_AUTOPROCESS_ON_DOWNLOAD,\n            )\n        ),\n    )\n\n\ndef hydrate_runtime_config_inplace(db_config: Optional[PydanticConfig] = None) -> None:\n    \"\"\"Hydrate the in-process runtime config from DB-backed settings in-place.\n\n    Preserves the identity of the `app.config` Pydantic instance so any modules\n    that imported it by value continue to see updated fields.\n    \"\"\"\n    cfg = db_config or to_pydantic_config()\n\n    _log_initial_snapshot(cfg)\n\n    _apply_top_level_env_overrides(cfg)\n\n    _apply_whisper_env_overrides(cfg)\n\n    _apply_llm_model_override(cfg)\n\n    _apply_whisper_type_override(cfg)\n\n    _commit_runtime_config(cfg)\n    _log_final_snapshot()\n\n\ndef _log_initial_snapshot(cfg: PydanticConfig) -> None:\n    logger.info(\n        \"Config hydration: starting with DB values | whisper_type=%s llm_model=%s openai_base_url=%s llm_api_key_set=%s whisper_api_key_set=%s\",\n        getattr(getattr(cfg, \"whisper\", None), \"whisper_type\", None),\n        getattr(cfg, \"llm_model\", None),\n        getattr(cfg, \"openai_base_url\", None),\n        bool(getattr(cfg, \"llm_api_key\", None)),\n        bool(getattr(getattr(cfg, \"whisper\", None), \"api_key\", None)),\n    )\n\n\ndef _apply_top_level_env_overrides(cfg: PydanticConfig) -> None:\n    env_llm_key = (\n        os.environ.get(\"LLM_API_KEY\")\n        or os.environ.get(\"OPENAI_API_KEY\")\n        or os.environ.get(\"GROQ_API_KEY\")\n    )\n    if env_llm_key:\n        cfg.llm_api_key = env_llm_key\n\n    env_openai_base_url = os.environ.get(\"OPENAI_BASE_URL\")\n    if env_openai_base_url:\n        cfg.openai_base_url = 
env_openai_base_url\n\n\ndef _apply_whisper_env_overrides(cfg: PydanticConfig) -> None:\n    if cfg.whisper is None:\n        return\n    wtype = getattr(cfg.whisper, \"whisper_type\", None)\n    if wtype == \"remote\":\n        remote_key = os.environ.get(\"WHISPER_REMOTE_API_KEY\") or os.environ.get(\n            \"OPENAI_API_KEY\"\n        )\n        remote_base = os.environ.get(\"WHISPER_REMOTE_BASE_URL\") or os.environ.get(\n            \"OPENAI_BASE_URL\"\n        )\n        remote_model = os.environ.get(\"WHISPER_REMOTE_MODEL\")\n        if isinstance(cfg.whisper, RemoteWhisperConfig):\n            if remote_key:\n                cfg.whisper.api_key = remote_key\n            if remote_base:\n                cfg.whisper.base_url = remote_base\n            if remote_model:\n                cfg.whisper.model = remote_model\n    elif wtype == \"groq\":\n        groq_key = os.environ.get(\"GROQ_API_KEY\")\n        groq_model = os.environ.get(\"GROQ_WHISPER_MODEL\") or os.environ.get(\n            \"WHISPER_GROQ_MODEL\"\n        )\n        if isinstance(cfg.whisper, GroqWhisperConfig):\n            if groq_key:\n                cfg.whisper.api_key = groq_key\n            if groq_model:\n                cfg.whisper.model = groq_model\n    elif wtype == \"local\":\n        loc_model = os.environ.get(\"WHISPER_LOCAL_MODEL\")\n        if isinstance(cfg.whisper, LocalWhisperConfig) and loc_model:\n            cfg.whisper.model = loc_model\n\n\ndef _apply_llm_model_override(cfg: PydanticConfig) -> None:\n    env_llm_model = os.environ.get(\"LLM_MODEL\")\n    if env_llm_model:\n        cfg.llm_model = env_llm_model\n\n\ndef _configure_local_whisper(cfg: PydanticConfig) -> None:\n    \"\"\"Configure local whisper type.\"\"\"\n    # Validate that local whisper is available\n    try:\n        import whisper as _  # type: ignore[import-untyped]  # noqa: F401\n    except ImportError as e:\n        error_msg = (\n            f\"WHISPER_TYPE is set to 'local' but whisper 
library is not available. \"\n            f\"Either install whisper with 'pip install openai-whisper' or set WHISPER_TYPE to 'remote' or 'groq'. \"\n            f\"Import error: {e}\"\n        )\n        logger.error(error_msg)\n        raise RuntimeError(error_msg) from e\n\n    existing_model_any = getattr(cfg.whisper, \"model\", \"base.en\")\n    existing_model = (\n        existing_model_any if isinstance(existing_model_any, str) else \"base.en\"\n    )\n    loc_model_env = os.environ.get(\"WHISPER_LOCAL_MODEL\")\n    loc_model: str = (\n        loc_model_env\n        if isinstance(loc_model_env, str) and loc_model_env\n        else existing_model\n    )\n    cfg.whisper = LocalWhisperConfig(model=loc_model)\n\n\ndef _configure_remote_whisper(cfg: PydanticConfig) -> None:\n    \"\"\"Configure remote whisper type.\"\"\"\n    existing_model_any = getattr(cfg.whisper, \"model\", \"whisper-1\")\n    existing_model = (\n        existing_model_any if isinstance(existing_model_any, str) else \"whisper-1\"\n    )\n    rem_model_env = os.environ.get(\"WHISPER_REMOTE_MODEL\")\n    rem_model: str = (\n        rem_model_env\n        if isinstance(rem_model_env, str) and rem_model_env\n        else existing_model\n    )\n\n    existing_key_any = getattr(cfg.whisper, \"api_key\", \"\")\n    existing_key = existing_key_any if isinstance(existing_key_any, str) else \"\"\n    rem_api_key_env = os.environ.get(\"WHISPER_REMOTE_API_KEY\") or os.environ.get(\n        \"OPENAI_API_KEY\"\n    )\n    rem_api_key: str = (\n        rem_api_key_env\n        if isinstance(rem_api_key_env, str) and rem_api_key_env\n        else existing_key\n    )\n\n    existing_base_any = getattr(cfg.whisper, \"base_url\", \"https://api.openai.com/v1\")\n    existing_base = (\n        existing_base_any\n        if isinstance(existing_base_any, str)\n        else \"https://api.openai.com/v1\"\n    )\n    rem_base_env = os.environ.get(\"WHISPER_REMOTE_BASE_URL\") or os.environ.get(\n        
\"OPENAI_BASE_URL\"\n    )\n    rem_base_url: str = (\n        rem_base_env\n        if isinstance(rem_base_env, str) and rem_base_env\n        else existing_base\n    )\n\n    existing_lang_any = getattr(cfg.whisper, \"language\", \"en\")\n    lang: str = existing_lang_any if isinstance(existing_lang_any, str) else \"en\"\n\n    timeout_sec: int = int(\n        os.environ.get(\n            \"WHISPER_REMOTE_TIMEOUT_SEC\",\n            str(getattr(cfg.whisper, \"timeout_sec\", 600)),\n        )\n    )\n    chunksize_mb: int = int(\n        os.environ.get(\n            \"WHISPER_REMOTE_CHUNKSIZE_MB\",\n            str(getattr(cfg.whisper, \"chunksize_mb\", 24)),\n        )\n    )\n\n    cfg.whisper = RemoteWhisperConfig(\n        model=rem_model,\n        api_key=rem_api_key,\n        base_url=rem_base_url,\n        language=lang,\n        timeout_sec=timeout_sec,\n        chunksize_mb=chunksize_mb,\n    )\n\n\ndef _configure_groq_whisper(cfg: PydanticConfig) -> None:\n    \"\"\"Configure groq whisper type.\"\"\"\n    existing_key_any = getattr(cfg.whisper, \"api_key\", \"\")\n    existing_key = existing_key_any if isinstance(existing_key_any, str) else \"\"\n    groq_key_env = os.environ.get(\"GROQ_API_KEY\")\n    groq_api_key: str = (\n        groq_key_env if isinstance(groq_key_env, str) and groq_key_env else existing_key\n    )\n\n    existing_model_any = getattr(cfg.whisper, \"model\", DEFAULTS.WHISPER_GROQ_MODEL)\n    existing_model = (\n        existing_model_any\n        if isinstance(existing_model_any, str)\n        else DEFAULTS.WHISPER_GROQ_MODEL\n    )\n    groq_model_env = os.environ.get(\"GROQ_WHISPER_MODEL\") or os.environ.get(\n        \"WHISPER_GROQ_MODEL\"\n    )\n    groq_model_val: str = (\n        groq_model_env\n        if isinstance(groq_model_env, str) and groq_model_env\n        else existing_model\n    )\n\n    existing_lang_any = getattr(cfg.whisper, \"language\", \"en\")\n    groq_lang: str = existing_lang_any if 
isinstance(existing_lang_any, str) else \"en\"\n\n    max_retries: int = int(\n        os.environ.get(\"GROQ_MAX_RETRIES\", str(getattr(cfg.whisper, \"max_retries\", 3)))\n    )\n\n    cfg.whisper = GroqWhisperConfig(\n        api_key=groq_api_key,\n        model=groq_model_val,\n        language=groq_lang,\n        max_retries=max_retries,\n    )\n\n\ndef _apply_whisper_type_override(cfg: PydanticConfig) -> None:\n    env_whisper_type = os.environ.get(\"WHISPER_TYPE\")\n\n    # Auto-detect whisper type from API key environment variables if not explicitly set\n    if not env_whisper_type:\n        if os.environ.get(\"WHISPER_REMOTE_API_KEY\"):\n            env_whisper_type = \"remote\"\n            logger.info(\n                \"Auto-detected WHISPER_TYPE=remote from WHISPER_REMOTE_API_KEY environment variable\"\n            )\n        elif os.environ.get(\"GROQ_API_KEY\") and not os.environ.get(\"LLM_API_KEY\"):\n            # Only auto-detect groq for whisper if LLM_API_KEY is not set\n            # (to avoid confusion when GROQ_API_KEY is only meant for LLM)\n            env_whisper_type = \"groq\"\n            logger.info(\n                \"Auto-detected WHISPER_TYPE=groq from GROQ_API_KEY environment variable\"\n            )\n\n    if not env_whisper_type:\n        return\n\n    wtype = env_whisper_type.strip().lower()\n    if wtype == \"local\":\n        _configure_local_whisper(cfg)\n    elif wtype == \"remote\":\n        _configure_remote_whisper(cfg)\n    elif wtype == \"groq\":\n        _configure_groq_whisper(cfg)\n    elif wtype == \"test\":\n        cfg.whisper = TestWhisperConfig()\n\n\ndef _commit_runtime_config(cfg: PydanticConfig) -> None:\n    logger.info(\n        \"Config hydration: after env overrides | whisper_type=%s llm_model=%s openai_base_url=%s llm_api_key_set=%s whisper_api_key_set=%s\",\n        getattr(getattr(cfg, \"whisper\", None), \"whisper_type\", None),\n        getattr(cfg, \"llm_model\", None),\n        getattr(cfg, 
\"openai_base_url\", None),\n        bool(getattr(cfg, \"llm_api_key\", None)),\n        bool(getattr(getattr(cfg, \"whisper\", None), \"api_key\", None)),\n    )\n    # Copy values from cfg to runtime_config, preserving Pydantic model instances\n    for key in cfg.model_fields.keys():\n        setattr(runtime_config, key, getattr(cfg, key))\n\n\ndef _log_final_snapshot() -> None:\n    logger.info(\n        \"Config hydration: runtime set | whisper_type=%s llm_model=%s openai_base_url=%s\",\n        getattr(getattr(runtime_config, \"whisper\", None), \"whisper_type\", None),\n        getattr(runtime_config, \"llm_model\", None),\n        getattr(runtime_config, \"openai_base_url\", None),\n    )\n\n\ndef ensure_defaults_and_hydrate() -> None:\n    \"\"\"Ensure default rows exist, then hydrate the runtime config from DB.\"\"\"\n    ensure_defaults()\n\n    # Check if environment variables have changed since last boot\n    _check_and_apply_env_changes()\n\n    _apply_env_overrides_to_db_first_boot()\n    hydrate_runtime_config_inplace()\n\n\ndef _calculate_env_hash() -> str:\n    \"\"\"Calculate a hash of all configuration-related environment variables.\"\"\"\n    keys = [\n        # LLM\n        \"LLM_API_KEY\",\n        \"OPENAI_API_KEY\",\n        \"GROQ_API_KEY\",\n        \"LLM_MODEL\",\n        \"OPENAI_BASE_URL\",\n        \"OPENAI_TIMEOUT\",\n        \"OPENAI_MAX_TOKENS\",\n        \"LLM_MAX_CONCURRENT_CALLS\",\n        \"LLM_MAX_RETRY_ATTEMPTS\",\n        \"LLM_ENABLE_TOKEN_RATE_LIMITING\",\n        \"LLM_MAX_INPUT_TOKENS_PER_CALL\",\n        \"LLM_MAX_INPUT_TOKENS_PER_MINUTE\",\n        # Whisper\n        \"WHISPER_TYPE\",\n        \"WHISPER_LOCAL_MODEL\",\n        \"WHISPER_REMOTE_API_KEY\",\n        \"WHISPER_REMOTE_BASE_URL\",\n        \"WHISPER_REMOTE_MODEL\",\n        \"WHISPER_REMOTE_TIMEOUT_SEC\",\n        \"WHISPER_REMOTE_CHUNKSIZE_MB\",\n        \"GROQ_WHISPER_MODEL\",\n        \"WHISPER_GROQ_MODEL\",\n        \"GROQ_MAX_RETRIES\",\n        # App\n 
       \"PODLY_APP_ROLE\",\n        \"DEVELOPER_MODE\",\n    ]\n\n    # Sort keys to ensure stable hash\n    keys.sort()\n\n    hasher = hashlib.sha256()\n    for key in keys:\n        val = os.environ.get(key, \"\")\n        hasher.update(f\"{key}={val}\".encode(\"utf-8\"))\n\n    return hasher.hexdigest()\n\n\ndef _check_and_apply_env_changes() -> None:\n    \"\"\"Check if env hash changed and force-apply overrides if so.\"\"\"\n    try:\n        app_s = AppSettings.query.get(1)\n        if not app_s:\n            return\n\n        # Check if column exists (handle pre-migration state gracefully)\n        if not hasattr(app_s, \"env_config_hash\"):\n            return\n\n        current_hash = _calculate_env_hash()\n        stored_hash = app_s.env_config_hash\n\n        if stored_hash != current_hash:\n            logger.info(\n                \"Environment configuration changed (hash mismatch). \"\n                \"Applying environment overrides to database settings.\"\n            )\n            _apply_env_overrides_to_db_force()\n\n            app_s.env_config_hash = current_hash\n            safe_commit(\n                db.session,\n                must_succeed=True,\n                context=\"update_env_hash\",\n                logger_obj=logger,\n            )\n\n    except Exception as e:  # pylint: disable=broad-except\n        logger.warning(\"Failed to check/update environment hash: %s\", e)\n\n\ndef _apply_llm_env_overrides(llm: LLMSettings) -> bool:\n    \"\"\"Apply environment overrides to LLM settings.\"\"\"\n    changed = False\n\n    env_llm_key = (\n        os.environ.get(\"LLM_API_KEY\")\n        or os.environ.get(\"OPENAI_API_KEY\")\n        or os.environ.get(\"GROQ_API_KEY\")\n    )\n    if env_llm_key:\n        llm.llm_api_key = env_llm_key\n        changed = True\n\n    env_llm_model = os.environ.get(\"LLM_MODEL\")\n    if env_llm_model:\n        llm.llm_model = env_llm_model\n        changed = True\n\n    env_openai_base_url = os.environ.get(\"OPENAI_BASE_URL\")\n    if 
env_openai_base_url:\n        llm.openai_base_url = env_openai_base_url\n        changed = True\n\n    env_openai_timeout = _parse_int(os.environ.get(\"OPENAI_TIMEOUT\"))\n    if env_openai_timeout is not None:\n        llm.openai_timeout = env_openai_timeout\n        changed = True\n\n    env_openai_max_tokens = _parse_int(os.environ.get(\"OPENAI_MAX_TOKENS\"))\n    if env_openai_max_tokens is not None:\n        llm.openai_max_tokens = env_openai_max_tokens\n        changed = True\n\n    env_llm_max_concurrent = _parse_int(os.environ.get(\"LLM_MAX_CONCURRENT_CALLS\"))\n    if env_llm_max_concurrent is not None:\n        llm.llm_max_concurrent_calls = env_llm_max_concurrent\n        changed = True\n\n    env_llm_max_retries = _parse_int(os.environ.get(\"LLM_MAX_RETRY_ATTEMPTS\"))\n    if env_llm_max_retries is not None:\n        llm.llm_max_retry_attempts = env_llm_max_retries\n        changed = True\n\n    env_llm_enable_token_rl = _parse_bool(\n        os.environ.get(\"LLM_ENABLE_TOKEN_RATE_LIMITING\")\n    )\n    if (\n        llm.llm_enable_token_rate_limiting == DEFAULTS.LLM_ENABLE_TOKEN_RATE_LIMITING\n        and env_llm_enable_token_rl is not None\n    ):\n        llm.llm_enable_token_rate_limiting = bool(env_llm_enable_token_rl)\n        changed = True\n\n    env_llm_max_input_tokens_per_call = _parse_int(\n        os.environ.get(\"LLM_MAX_INPUT_TOKENS_PER_CALL\")\n    )\n    if (\n        llm.llm_max_input_tokens_per_call is None\n        and env_llm_max_input_tokens_per_call is not None\n    ):\n        llm.llm_max_input_tokens_per_call = env_llm_max_input_tokens_per_call\n        changed = True\n\n    env_llm_max_input_tokens_per_minute = _parse_int(\n        os.environ.get(\"LLM_MAX_INPUT_TOKENS_PER_MINUTE\")\n    )\n    if (\n        llm.llm_max_input_tokens_per_minute is None\n        and env_llm_max_input_tokens_per_minute is not None\n    ):\n        llm.llm_max_input_tokens_per_minute = env_llm_max_input_tokens_per_minute\n        changed = 
True\n\n    return changed\n\n\ndef _apply_whisper_remote_overrides(whisper: WhisperSettings) -> bool:\n    \"\"\"Apply environment overrides for Remote Whisper settings.\"\"\"\n    changed = False\n    remote_key = os.environ.get(\"WHISPER_REMOTE_API_KEY\") or os.environ.get(\n        \"OPENAI_API_KEY\"\n    )\n    if remote_key:\n        whisper.remote_api_key = remote_key\n        changed = True\n\n    remote_base = os.environ.get(\"WHISPER_REMOTE_BASE_URL\") or os.environ.get(\n        \"OPENAI_BASE_URL\"\n    )\n    if remote_base:\n        whisper.remote_base_url = remote_base\n        changed = True\n\n    remote_model = os.environ.get(\"WHISPER_REMOTE_MODEL\")\n    if remote_model:\n        whisper.remote_model = remote_model\n        changed = True\n\n    remote_timeout = _parse_int(os.environ.get(\"WHISPER_REMOTE_TIMEOUT_SEC\"))\n    if remote_timeout is not None:\n        whisper.remote_timeout_sec = remote_timeout\n        changed = True\n\n    remote_chunksize = _parse_int(os.environ.get(\"WHISPER_REMOTE_CHUNKSIZE_MB\"))\n    if remote_chunksize is not None:\n        whisper.remote_chunksize_mb = remote_chunksize\n        changed = True\n    return changed\n\n\ndef _apply_whisper_groq_overrides(whisper: WhisperSettings) -> bool:\n    \"\"\"Apply environment overrides for Groq Whisper settings.\"\"\"\n    changed = False\n    groq_model_env = os.environ.get(\"GROQ_WHISPER_MODEL\") or os.environ.get(\n        \"WHISPER_GROQ_MODEL\"\n    )\n    if groq_model_env:\n        whisper.groq_model = groq_model_env\n        changed = True\n\n    groq_max_retries_env = _parse_int(os.environ.get(\"GROQ_MAX_RETRIES\"))\n    if groq_max_retries_env is not None:\n        whisper.groq_max_retries = groq_max_retries_env\n        changed = True\n    return changed\n\n\ndef _apply_whisper_env_overrides_force(whisper: WhisperSettings) -> bool:\n    \"\"\"Apply environment overrides to Whisper settings.\"\"\"\n    changed = False\n\n    env_whisper_type = 
os.environ.get(\"WHISPER_TYPE\")\n    if env_whisper_type:\n        wtype = env_whisper_type.strip().lower()\n        if wtype in {\"local\", \"remote\", \"groq\"}:\n            whisper.whisper_type = wtype\n            changed = True\n\n    # Always update Groq API key if present in env\n    groq_key = os.environ.get(\"GROQ_API_KEY\")\n    if groq_key:\n        whisper.groq_api_key = groq_key\n        changed = True\n\n    if whisper.whisper_type == \"remote\":\n        if _apply_whisper_remote_overrides(whisper):\n            changed = True\n\n    elif whisper.whisper_type == \"groq\":\n        if _apply_whisper_groq_overrides(whisper):\n            changed = True\n\n    elif whisper.whisper_type == \"local\":\n        local_model_env = os.environ.get(\"WHISPER_LOCAL_MODEL\")\n        if local_model_env:\n            whisper.local_model = local_model_env\n            changed = True\n\n    return changed\n\n\ndef _apply_env_overrides_to_db_force() -> None:\n    \"\"\"Force-apply environment overrides to DB, overwriting existing values.\"\"\"\n    llm = LLMSettings.query.get(1)\n    whisper = WhisperSettings.query.get(1)\n\n    if not llm or not whisper:\n        return\n\n    llm_changed = _apply_llm_env_overrides(llm)\n    whisper_changed = _apply_whisper_env_overrides_force(whisper)\n\n    if llm_changed or whisper_changed:\n        safe_commit(\n            db.session,\n            must_succeed=True,\n            context=\"force_env_overrides\",\n            logger_obj=logger,\n        )\n"
  },
  {
    "path": "src/app/db_commit.py",
    "content": "from __future__ import annotations\n\nimport logging\nfrom typing import Any\n\n\ndef safe_commit(\n    session: Any,\n    *,\n    context: str,\n    logger_obj: logging.Logger | None = None,\n    must_succeed: bool = True,\n) -> None:\n    \"\"\"Commit the current transaction and rollback on failure.\n\n    This is a minimal replacement for the old SQLite concurrency helpers.\n    \"\"\"\n    log = logger_obj or logging.getLogger(\"global_logger\")\n    try:\n        session.commit()\n    except Exception as exc:  # pylint: disable=broad-except\n        log.error(\"Commit failed in %s, rolling back: %s\", context, exc, exc_info=True)\n        try:\n            session.rollback()\n        except Exception as rb_exc:  # pylint: disable=broad-except\n            log.error(\"Rollback also failed in %s: %s\", context, rb_exc, exc_info=True)\n        if must_succeed:\n            raise\n"
  },
  {
    "path": "src/app/db_guard.py",
    "content": "\"\"\"Shared helpers to protect long-lived sessions in background threads.\"\"\"\n\nfrom __future__ import annotations\n\nimport logging\nfrom contextlib import contextmanager\nfrom typing import Any, Iterator\n\nfrom sqlalchemy.exc import OperationalError, PendingRollbackError\nfrom sqlalchemy.orm import Session, scoped_session\n\nSessionType = Session | scoped_session[Any]\n\n\ndef reset_session(\n    session: SessionType,\n    logger: logging.Logger,\n    context: str,\n    exc: Exception | None = None,\n) -> None:\n    \"\"\"\n    Roll back and remove a session after a failure to avoid leaving it in a bad state.\n    Safe to call even if the session is already closed/invalid.\n    \"\"\"\n    if exc:\n        logger.warning(\n            \"[SESSION_RESET] context=%s exc=%s; rolling back and removing session\",\n            context,\n            exc,\n        )\n    try:\n        session.rollback()\n    except Exception as rb_exc:  # pylint: disable=broad-except\n        logger.warning(\n            \"[SESSION_RESET] rollback failed in context=%s: %s\", context, rb_exc\n        )\n    try:\n        remove_fn = getattr(session, \"remove\", None)\n        if callable(remove_fn):\n            remove_fn()\n    except Exception as rm_exc:  # pylint: disable=broad-except\n        logger.warning(\n            \"[SESSION_RESET] remove failed in context=%s: %s\", context, rm_exc\n        )\n\n\n@contextmanager\ndef db_guard(\n    context: str, session: SessionType, logger: logging.Logger\n) -> Iterator[None]:\n    \"\"\"\n    Guard a block of DB work so lock/rollback errors always clean the session\n    before propagating.\n    \"\"\"\n    try:\n        yield\n    except (OperationalError, PendingRollbackError) as exc:\n        reset_session(session, logger, context, exc)\n        raise\n"
  },
  {
    "path": "src/app/extensions.py",
    "content": "import os\n\nfrom flask_apscheduler import APScheduler  # type: ignore\nfrom flask_migrate import Migrate\nfrom flask_sqlalchemy import SQLAlchemy\n\n# Unbound singletons; initialized in app factory\ndb = SQLAlchemy()\nscheduler = APScheduler()\n\nbase_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))\nmigrations_dir = os.path.join(base_dir, \"migrations\")\n\nmigrate = Migrate(directory=migrations_dir)\n"
  },
  {
    "path": "src/app/feeds.py",
    "content": "import datetime\nimport logging\nimport uuid\nfrom email.utils import format_datetime, parsedate_to_datetime\nfrom typing import Any, Iterable, Optional, cast\nfrom urllib.parse import parse_qsl, urlencode, urlparse, urlunparse\n\nimport feedparser  # type: ignore[import-untyped]\nimport PyRSS2Gen  # type: ignore[import-untyped]\nfrom flask import current_app, g, request\n\nfrom app.extensions import db\nfrom app.models import Feed, Post, User, UserFeed\nfrom app.runtime_config import config\nfrom app.writer.client import writer_client\nfrom podcast_processor.podcast_downloader import find_audio_link\n\nlogger = logging.getLogger(\"global_logger\")\n\n\ndef is_feed_active_for_user(feed_id: int, user: User) -> bool:\n    \"\"\"Check if the feed is within the user's allowance based on subscription date.\"\"\"\n    if user.role == \"admin\":\n        return True\n\n    # Hack: Always treat Feed 1 as active\n    if feed_id == 1:\n        return True\n\n    # Use manual allowance if set, otherwise fall back to plan allowance\n    manual_allowance = user.manual_feed_allowance\n    if manual_allowance is not None:\n        allowance = int(manual_allowance)\n    else:\n        allowance = int(getattr(user, \"feed_allowance\", 0))\n\n    # Sort user's feeds by creation date to determine priority\n    user_feeds = sorted(user.user_feeds, key=lambda uf: uf.created_at)\n\n    for i, uf in enumerate(user_feeds):\n        if uf.feed_id == feed_id:\n            return i < allowance\n\n    return False\n\n\ndef _should_auto_whitelist_new_posts(feed: Feed, post: Optional[Post] = None) -> bool:\n    \"\"\"Return True when new posts should default to whitelisted for this feed.\"\"\"\n    override = getattr(feed, \"auto_whitelist_new_episodes_override\", None)\n    if override is not None:\n        return bool(override)\n\n    if not getattr(config, \"automatically_whitelist_new_episodes\", False):\n        return False\n\n    from app.auth import is_auth_enabled\n\n   
 # If auth is disabled, we should auto-whitelist if the global setting is on.\n    if not is_auth_enabled():\n        return True\n\n    memberships = getattr(feed, \"user_feeds\", None) or []\n    if not memberships:\n        # No memberships for this feed. If there are no users in the database at all,\n        # still whitelist. This handles fresh installs where no account exists yet.\n        if db.session.query(User.id).first() is None:\n            return True\n        return False\n\n    # Check if at least one member has this feed in their \"active\" list (within allowance)\n    for membership in memberships:\n        user = membership.user\n        if not user:\n            continue\n\n        if is_feed_active_for_user(feed.id, user):\n            return True\n\n    return False\n\n\ndef _get_base_url() -> str:\n    try:\n        # Check various ways HTTP/2 pseudo-headers might be available\n        http2_scheme = (\n            request.headers.get(\":scheme\")\n            or request.headers.get(\"scheme\")\n            or request.environ.get(\"HTTP2_SCHEME\")\n        )\n        http2_authority = (\n            request.headers.get(\":authority\")\n            or request.headers.get(\"authority\")\n            or request.environ.get(\"HTTP2_AUTHORITY\")\n        )\n        host = request.headers.get(\"Host\")\n\n        if http2_scheme and http2_authority:\n            return f\"{http2_scheme}://{http2_authority}\"\n\n        # Fall back to Host header with scheme detection\n        if host:\n            # Check multiple indicators for HTTPS\n            is_https = (\n                request.is_secure\n                or request.headers.get(\"X-Forwarded-Proto\") == \"https\"\n                or request.headers.get(\"Strict-Transport-Security\") is not None\n                or request.headers.get(\"X-Forwarded-Ssl\") == \"on\"\n                or request.environ.get(\"HTTPS\") == \"on\"\n                or request.scheme == \"https\"\n            )\n      
      scheme = \"https\" if is_https else \"http\"\n            return f\"{scheme}://{host}\"\n    except RuntimeError:\n        # Working outside of request context\n        pass\n\n    # Use localhost with main app port\n    return \"http://localhost:5001\"\n\n\ndef fetch_feed(url: str) -> feedparser.FeedParserDict:\n    logger.info(f\"Fetching feed from URL: {url}\")\n    feed_data = feedparser.parse(url)\n    for entry in feed_data.entries:\n        entry.id = get_guid(entry)\n    return feed_data\n\n\ndef refresh_feed(feed: Feed) -> None:\n    logger.info(f\"Refreshing feed with ID: {feed.id}\")\n    feed_data = fetch_feed(feed.rss_url)\n\n    updates = {}\n    image_info = feed_data.feed.get(\"image\")\n    if image_info and \"href\" in image_info:\n        new_image_url = image_info[\"href\"]\n        if feed.image_url != new_image_url:\n            updates[\"image_url\"] = new_image_url\n\n    existing_posts = {post.guid for post in feed.posts}  # type: ignore[attr-defined]\n    oldest_post = min(\n        (post for post in feed.posts if post.release_date),  # type: ignore[attr-defined]\n        key=lambda p: p.release_date,\n        default=None,\n    )\n\n    new_posts = []\n    for entry in feed_data.entries:\n        if entry.id not in existing_posts:\n            logger.debug(\"found new podcast: %s\", entry.title)\n            p = make_post(feed, entry)\n            # do not allow automatic download of any backcatalog added to the feed\n            if (\n                oldest_post is not None\n                and p.release_date\n                and oldest_post.release_date\n                and p.release_date.date() < oldest_post.release_date.date()\n            ):\n                p.whitelisted = False\n                logger.debug(\n                    \"skipping auto-whitelist for backcatalog episode: %s\", entry.title\n                )\n            else:\n                p.whitelisted = 
_should_auto_whitelist_new_posts(feed, p)\n\n            post_data = {\n                \"guid\": p.guid,\n                \"title\": p.title,\n                \"description\": p.description,\n                \"download_url\": p.download_url,\n                \"release_date\": p.release_date.isoformat() if p.release_date else None,\n                \"duration\": p.duration,\n                \"image_url\": p.image_url,\n                \"whitelisted\": p.whitelisted,\n                \"feed_id\": feed.id,\n            }\n            new_posts.append(post_data)\n\n    if updates or new_posts:\n        writer_client.action(\n            \"refresh_feed\",\n            {\"feed_id\": feed.id, \"updates\": updates, \"new_posts\": new_posts},\n            wait=True,\n        )\n\n    logger.info(f\"Feed with ID: {feed.id} refreshed\")\n\n\ndef add_or_refresh_feed(url: str) -> Feed:\n    feed_data = fetch_feed(url)\n    if \"title\" not in feed_data.feed:\n        logger.error(\"Invalid feed URL\")\n        raise ValueError(f\"Invalid feed URL: {url}\")\n\n    feed = Feed.query.filter_by(rss_url=url).first()\n    if feed:\n        refresh_feed(feed)\n    else:\n        feed = add_feed(feed_data)\n    return feed  # type: ignore[no-any-return]\n\n\ndef add_feed(feed_data: feedparser.FeedParserDict) -> Feed:\n    logger.info(f\"Storing feed: {feed_data.feed.title}\")\n    try:\n        feed_dict = {\n            \"title\": feed_data.feed.title,\n            \"description\": feed_data.feed.get(\"description\", \"\"),\n            \"author\": feed_data.feed.get(\"author\", \"\"),\n            \"rss_url\": feed_data.href,\n            \"image_url\": feed_data.feed.image.href,\n        }\n\n        # Create a temporary feed object to use make_post helper\n        temp_feed = Feed(**feed_dict)\n        temp_feed.id = 0  # Dummy ID\n\n        posts_data = []\n        num_posts_added = 0\n        for entry in feed_data.entries:\n            p = make_post(temp_feed, entry)\n          
  if (\n                config.number_of_episodes_to_whitelist_from_archive_of_new_feed\n                is not None\n                and num_posts_added\n                >= config.number_of_episodes_to_whitelist_from_archive_of_new_feed\n            ):\n                p.whitelisted = False\n            else:\n                num_posts_added += 1\n                p.whitelisted = config.automatically_whitelist_new_episodes\n\n            post_data = {\n                \"guid\": p.guid,\n                \"title\": p.title,\n                \"description\": p.description,\n                \"download_url\": p.download_url,\n                \"release_date\": p.release_date.isoformat() if p.release_date else None,\n                \"duration\": p.duration,\n                \"image_url\": p.image_url,\n                \"whitelisted\": p.whitelisted,\n            }\n            posts_data.append(post_data)\n\n        result = writer_client.action(\n            \"add_feed\", {\"feed\": feed_dict, \"posts\": posts_data}, wait=True\n        )\n\n        if result is None or result.data is None:\n            raise RuntimeError(\"Failed to get result from writer action\")\n\n        feed_id = result.data[\"feed_id\"]\n        logger.info(f\"Feed stored with ID: {feed_id}\")\n\n        # Return the feed object\n        feed = db.session.get(Feed, feed_id)\n        if feed is None:\n            raise RuntimeError(f\"Feed {feed_id} not found after creation\")\n        return feed\n\n    except Exception as e:\n        logger.error(f\"Failed to store feed: {e}\")\n        raise e\n\n\nclass ItunesRSSItem(PyRSS2Gen.RSSItem):  # type: ignore[misc]\n    def __init__(\n        self,\n        *,\n        title: str,\n        enclosure: PyRSS2Gen.Enclosure,\n        description: str,\n        guid: str,\n        pubDate: Optional[str],\n        image_url: Optional[str] = None,\n        **kwargs: Any,\n    ) -> None:\n        self.image_url = image_url\n        super().__init__(\n        
    title=title,\n            enclosure=enclosure,\n            description=description,\n            guid=guid,\n            pubDate=pubDate,\n            **kwargs,\n        )\n\n    def publish_extensions(self, handler: Any) -> None:\n        if self.image_url:\n            handler.startElement(\"itunes:image\", {\"href\": self.image_url})\n            handler.endElement(\"itunes:image\")\n        super().publish_extensions(handler)\n\n\ndef feed_item(post: Post, prepend_feed_title: bool = False) -> PyRSS2Gen.RSSItem:\n    \"\"\"\n    Given a post, return the corresponding RSS item. Reference:\n    https://github.com/Podcast-Standards-Project/PSP-1-Podcast-RSS-Specification?tab=readme-ov-file#required-item-elements\n    \"\"\"\n\n    base_url = _get_base_url()\n\n    # Generate URLs that will be proxied by the frontend to the backend\n    audio_url = _append_feed_token_params(f\"{base_url}/api/posts/{post.guid}/download\")\n    post_details_url = _append_feed_token_params(f\"{base_url}/api/posts/{post.guid}\")\n\n    description = (\n        f'{post.description}\\n<p><a href=\"{post_details_url}\">Podly Post Page</a></p>'\n    )\n\n    title = post.title\n    if prepend_feed_title and post.feed:\n        title = f\"[{post.feed.title}] {title}\"\n\n    item = ItunesRSSItem(\n        title=title,\n        enclosure=PyRSS2Gen.Enclosure(\n            url=audio_url,\n            type=\"audio/mpeg\",\n            length=post.audio_len_bytes(),\n        ),\n        description=description,\n        guid=post.guid,\n        pubDate=_format_pub_date(post.release_date),\n        image_url=post.image_url,\n    )\n\n    return item\n\n\ndef generate_feed_xml(feed: Feed) -> Any:\n    logger.info(f\"Generating XML for feed with ID: {feed.id}\")\n\n    include_unprocessed = getattr(config, \"autoprocess_on_download\", True)\n\n    if include_unprocessed:\n        posts = list(cast(Iterable[Post], feed.posts))\n    else:\n        posts = (\n            Post.query.filter(\n       
         Post.feed_id == feed.id,\n                Post.whitelisted.is_(True),\n                Post.processed_audio_path.isnot(None),\n            )\n            .order_by(Post.release_date.desc().nullslast(), Post.id.desc())\n            .all()\n        )\n\n    items = [feed_item(post) for post in posts]\n\n    base_url = _get_base_url()\n    link = _append_feed_token_params(f\"{base_url}/feed/{feed.id}\")\n\n    last_build_date = format_datetime(datetime.datetime.now(datetime.timezone.utc))\n\n    rss_feed = PyRSS2Gen.RSS2(\n        title=\"[podly] \" + feed.title,\n        link=link,\n        description=feed.description,\n        lastBuildDate=last_build_date,\n        image=PyRSS2Gen.Image(url=feed.image_url, title=feed.title, link=link),\n        items=items,\n    )\n\n    rss_feed.rss_attrs[\"xmlns:itunes\"] = \"http://www.itunes.com/dtds/podcast-1.0.dtd\"\n    rss_feed.rss_attrs[\"xmlns:content\"] = \"http://purl.org/rss/1.0/modules/content/\"\n\n    logger.info(f\"XML generated for feed with ID: {feed.id}\")\n    return rss_feed.to_xml(\"utf-8\")\n\n\ndef generate_aggregate_feed_xml(user: Optional[User]) -> Any:\n    \"\"\"Generate RSS XML for a user's aggregate feed (last 3 processed posts per feed).\"\"\"\n    username = user.username if user else \"Public\"\n    user_id = user.id if user else 0\n    logger.info(f\"Generating aggregate feed XML for: {username}\")\n\n    posts = get_user_aggregate_posts(user_id)\n    items = [feed_item(post, prepend_feed_title=True) for post in posts]\n\n    base_url = _get_base_url()\n    link = _append_feed_token_params(f\"{base_url}/feed/user/{user_id}\")\n\n    last_build_date = format_datetime(datetime.datetime.now(datetime.timezone.utc))\n\n    if current_app.config.get(\"REQUIRE_AUTH\") and user:\n        feed_title = f\"Podly Podcasts - {user.username}\"\n        feed_description = f\"Aggregate feed for {user.username} - Last 3 processed episodes from each subscribed feed.\"\n    else:\n        feed_title = 
\"Podly Podcasts\"\n        feed_description = (\n            \"Aggregate feed - Last 3 processed episodes from each subscribed feed.\"\n        )\n\n    rss_feed = PyRSS2Gen.RSS2(\n        title=feed_title,\n        link=link,\n        description=feed_description,\n        lastBuildDate=last_build_date,\n        items=items,\n        image=PyRSS2Gen.Image(\n            url=f\"{base_url}/static/images/logos/manifest-icon-512.maskable.png\",\n            title=feed_title,\n            link=link,\n        ),\n    )\n\n    rss_feed.rss_attrs[\"xmlns:itunes\"] = \"http://www.itunes.com/dtds/podcast-1.0.dtd\"\n    rss_feed.rss_attrs[\"xmlns:content\"] = \"http://purl.org/rss/1.0/modules/content/\"\n\n    logger.info(f\"Aggregate XML generated for: {username}\")\n    return rss_feed.to_xml(\"utf-8\")\n\n\ndef get_user_aggregate_posts(user_id: int, limit_per_feed: int = 3) -> list[Post]:\n    \"\"\"Fetch last N processed posts from each of the user's subscribed feeds.\"\"\"\n    if not current_app.config.get(\"REQUIRE_AUTH\") or user_id == 0:\n        feed_ids = [r[0] for r in Feed.query.with_entities(Feed.id).all()]\n    else:\n        user_feeds = UserFeed.query.filter_by(user_id=user_id).all()\n        feed_ids = [uf.feed_id for uf in user_feeds]\n\n    all_posts = []\n    for feed_id in feed_ids:\n        # Fetch last N processed posts for this feed\n        posts = (\n            Post.query.filter(\n                Post.feed_id == feed_id,\n                Post.whitelisted.is_(True),\n                Post.processed_audio_path.isnot(None),\n            )\n            .order_by(Post.release_date.desc().nullslast(), Post.id.desc())\n            .limit(limit_per_feed)\n            .all()\n        )\n        all_posts.extend(posts)\n\n    # Sort all posts by release date descending\n    all_posts.sort(key=lambda p: p.release_date or datetime.datetime.min, reverse=True)\n\n    return all_posts\n\n\ndef _append_feed_token_params(url: str) -> str:\n    if not 
current_app.config.get(\"REQUIRE_AUTH\"):\n        return url\n\n    try:\n        token_result = getattr(g, \"feed_token\", None)\n        token_id = request.args.get(\"feed_token\")\n        secret = request.args.get(\"feed_secret\")\n    except RuntimeError:\n        return url\n\n    if token_result is not None:\n        token_id = token_id or token_result.token.token_id\n        secret = secret or token_result.token.token_secret\n\n    if not token_id or not secret:\n        return url\n\n    parsed = urlparse(url)\n    query_params = dict(parse_qsl(parsed.query, keep_blank_values=True))\n    query_params[\"feed_token\"] = token_id\n    query_params[\"feed_secret\"] = secret\n    new_query = urlencode(query_params)\n    return urlunparse(parsed._replace(query=new_query))\n\n\ndef make_post(feed: Feed, entry: feedparser.FeedParserDict) -> Post:\n    # Extract episode image URL, fallback to feed image\n    episode_image_url = None\n\n    # Try to get episode-specific image from various RSS fields\n    if hasattr(entry, \"image\") and entry.image:\n        if isinstance(entry.image, dict) and \"href\" in entry.image:\n            episode_image_url = entry.image[\"href\"]\n        elif isinstance(entry.image, str):\n            episode_image_url = entry.image\n\n    # Try iTunes image tag\n    if not episode_image_url and hasattr(entry, \"itunes_image\"):\n        if isinstance(entry.itunes_image, dict) and \"href\" in entry.itunes_image:\n            episode_image_url = entry.itunes_image[\"href\"]\n        elif isinstance(entry.itunes_image, str):\n            episode_image_url = entry.itunes_image\n\n    # Try media:thumbnail or media:content\n    if not episode_image_url and hasattr(entry, \"media_thumbnail\"):\n        if entry.media_thumbnail and len(entry.media_thumbnail) > 0:\n            episode_image_url = entry.media_thumbnail[0].get(\"url\")\n\n    # Fallback to feed image if no episode-specific image found\n    if not episode_image_url:\n        
episode_image_url = feed.image_url\n\n    # Try multiple description fields in order of preference\n    description = entry.get(\"description\", \"\")\n    if not description:\n        description = entry.get(\"summary\", \"\")\n    if not description and hasattr(entry, \"content\") and entry.content:\n        description = entry.content[0].get(\"value\", \"\")\n    if not description:\n        description = entry.get(\"subtitle\", \"\")\n\n    return Post(\n        feed_id=feed.id,\n        guid=get_guid(entry),\n        download_url=find_audio_link(entry),\n        title=entry.title,\n        description=description,\n        release_date=_parse_release_date(entry),\n        duration=get_duration(entry),\n        image_url=episode_image_url,\n    )\n\n\ndef _get_entry_field(entry: feedparser.FeedParserDict, field: str) -> Optional[Any]:\n    value = getattr(entry, field, None)\n    return value if value is not None else entry.get(field)\n\n\ndef _parse_datetime_string(\n    value: Optional[str], field: str\n) -> Optional[datetime.datetime]:\n    if not value:\n        return None\n    try:\n        return parsedate_to_datetime(value)\n    except (TypeError, ValueError):\n        logger.debug(\"Failed to parse %s string for release date\", field)\n        return None\n\n\ndef _parse_struct_time(value: Optional[Any], field: str) -> Optional[datetime.datetime]:\n    if not value:\n        return None\n    try:\n        dt = datetime.datetime(*value[:6])\n    except (TypeError, ValueError):\n        logger.debug(\"Failed to parse %s for release date\", field)\n        return None\n    gmtoff = getattr(value, \"tm_gmtoff\", None)\n    if gmtoff is not None:\n        dt = dt.replace(tzinfo=datetime.timezone(datetime.timedelta(seconds=gmtoff)))\n    return dt\n\n\ndef _normalize_to_utc(dt: Optional[datetime.datetime]) -> Optional[datetime.datetime]:\n    if dt is None:\n        return None\n    if dt.tzinfo is None:\n        dt = 
dt.replace(tzinfo=datetime.timezone.utc)\n    return dt.astimezone(datetime.timezone.utc)\n\n\ndef _parse_release_date(\n    entry: feedparser.FeedParserDict,\n) -> Optional[datetime.datetime]:\n    \"\"\"Parse a release datetime from a feed entry and normalize to UTC.\"\"\"\n    for field in (\"published\", \"updated\"):\n        dt = _parse_datetime_string(_get_entry_field(entry, field), field)\n        normalized = _normalize_to_utc(dt)\n        if normalized:\n            return normalized\n\n    for field in (\"published_parsed\", \"updated_parsed\"):\n        dt = _parse_struct_time(_get_entry_field(entry, field), field)\n        normalized = _normalize_to_utc(dt)\n        if normalized:\n            return normalized\n\n    return None\n\n\ndef _format_pub_date(release_date: Optional[datetime.datetime]) -> Optional[str]:\n    if not release_date:\n        return None\n\n    normalized = release_date\n    if normalized.tzinfo is None:\n        normalized = normalized.replace(tzinfo=datetime.timezone.utc)\n\n    return format_datetime(normalized.astimezone(datetime.timezone.utc))\n\n\n# sometimes feed entry ids are the post url or something else\ndef get_guid(entry: feedparser.FeedParserDict) -> str:\n    try:\n        uuid.UUID(entry.id)\n        return str(entry.id)\n    except ValueError:\n        dlurl = find_audio_link(entry)\n        return str(uuid.uuid5(uuid.NAMESPACE_URL, dlurl))\n\n\ndef get_duration(entry: feedparser.FeedParserDict) -> Optional[int]:\n    raw = entry.get(\"itunes_duration\")\n    try:\n        # itunes:duration may be plain seconds or HH:MM:SS / MM:SS\n        if raw is not None and \":\" in str(raw):\n            seconds = 0\n            for part in str(raw).split(\":\"):\n                seconds = seconds * 60 + int(part)\n            return seconds\n        return int(raw)\n    except (TypeError, ValueError):\n        logger.warning(\"Failed to parse itunes_duration: %r\", raw)\n        return None\n"
  },
  {
    "path": "src/app/ipc.py",
    "content": "import multiprocessing\nimport os\nfrom multiprocessing.managers import BaseManager\nfrom queue import Queue\nfrom typing import Any\n\n\nclass QueueManager(BaseManager):\n    pass\n\n\n# Define the queue globally so it can be registered\n_command_queue: Queue[Any] = Queue()\n\n\ndef _get_default_authkey() -> bytes:\n    # This key is only used for localhost IPC between the web and writer processes.\n    # It must be identical across processes, otherwise Manager proxy calls can fail\n    # with AuthenticationError ('digest sent was rejected').\n    raw = os.environ.get(\"PODLY_IPC_AUTHKEY\", \"podly_secret\")\n    return raw.encode(\"utf-8\")\n\n\ndef _ensure_process_authkey(authkey: bytes) -> None:\n    try:\n        multiprocessing.current_process().authkey = authkey\n    except Exception:\n        # Best-effort: if we can't set it, the explicit authkey passed to the\n        # manager will still be used for direct manager connections.\n        pass\n\n\ndef get_queue() -> Queue[Any]:\n    return _command_queue\n\n\ndef make_server_manager(\n    address: tuple[str, int] = (\"127.0.0.1\", 50001),\n    authkey: bytes | None = None,\n) -> QueueManager:\n    if authkey is None:\n        authkey = _get_default_authkey()\n    _ensure_process_authkey(authkey)\n    QueueManager.register(\"get_command_queue\", callable=get_queue)\n    # Register Queue so we can pass it around for replies\n    QueueManager.register(\"Queue\", callable=Queue)\n    manager = QueueManager(address=address, authkey=authkey)\n    return manager\n\n\ndef make_client_manager(\n    address: tuple[str, int] = (\"127.0.0.1\", 50001),\n    authkey: bytes | None = None,\n) -> QueueManager:\n    if authkey is None:\n        authkey = _get_default_authkey()\n    _ensure_process_authkey(authkey)\n    QueueManager.register(\"get_command_queue\")\n    QueueManager.register(\"Queue\")\n    manager = QueueManager(address=address, authkey=authkey)\n    manager.connect()\n    return manager\n"
  },
  {
    "path": "src/app/job_manager.py",
    "content": "import logging\nimport os\nfrom typing import Any, Dict, Optional, Tuple\n\nfrom app.extensions import db as _db\nfrom app.models import Post, ProcessingJob\nfrom podcast_processor.processing_status_manager import ProcessingStatusManager\n\n\nclass JobManager:\n    \"\"\"Manage the lifecycle guarantees for a single `ProcessingJob` record.\"\"\"\n\n    ACTIVE_STATUSES = {\"pending\", \"running\"}\n\n    def __init__(\n        self,\n        post_guid: str,\n        status_manager: ProcessingStatusManager,\n        logger_obj: logging.Logger,\n        run_id: Optional[str],\n        *,\n        requested_by_user_id: Optional[int] = None,\n        billing_user_id: Optional[int] = None,\n    ) -> None:\n        self.post_guid = post_guid\n        self._status_manager = status_manager\n        self._logger = logger_obj\n        self._run_id = run_id\n        self._requested_by_user_id = requested_by_user_id\n        self._billing_user_id = billing_user_id\n        self.job: Optional[ProcessingJob] = None\n\n    @property\n    def job_id(self) -> Optional[str]:\n        return getattr(self.job, \"id\", None) if self.job else None\n\n    def _reload_job(self) -> Optional[ProcessingJob]:\n        self.job = (\n            ProcessingJob.query.filter_by(post_guid=self.post_guid)\n            .order_by(ProcessingJob.created_at.desc())\n            .first()\n        )\n        return self.job\n\n    def get_active_job(self) -> Optional[ProcessingJob]:\n        job = self.job or self._reload_job()\n        if job and job.status in self.ACTIVE_STATUSES:\n            return job\n        return None\n\n    def ensure_job(self) -> ProcessingJob:\n        job = self.get_active_job()\n        if job:\n            changed = False\n            if self._run_id and job.jobs_manager_run_id != self._run_id:\n                job.jobs_manager_run_id = self._run_id\n                changed = True\n            if self._requested_by_user_id and job.requested_by_user_id is 
None:\n                job.requested_by_user_id = self._requested_by_user_id\n                changed = True\n            if self._billing_user_id is not None and (\n                job.billing_user_id != self._billing_user_id\n            ):\n                job.billing_user_id = self._billing_user_id\n                changed = True\n            if changed:\n                self._status_manager.db_session.flush()\n            return job\n        job_id = self._status_manager.generate_job_id()\n        job = self._status_manager.create_job(\n            self.post_guid,\n            job_id,\n            self._run_id,\n            requested_by_user_id=self._requested_by_user_id,\n            billing_user_id=self._billing_user_id,\n        )\n        self.job = job\n        return job\n\n    def fail(self, message: str, step: int = 0, progress: float = 0.0) -> ProcessingJob:\n        job = self.ensure_job()\n        step = step or job.current_step or 0\n        progress = progress or job.progress_percentage or 0.0\n        self._status_manager.update_job_status(job, \"failed\", step, message, progress)\n        return job\n\n    def complete(self, message: str = \"Processing complete\") -> ProcessingJob:\n        job = self.ensure_job()\n        total_steps = job.total_steps or 4\n        self._status_manager.update_job_status(\n            job, \"completed\", total_steps, message, 100.0\n        )\n        return job\n\n    def skip(\n        self,\n        message: str = \"Processing skipped\",\n        step: Optional[int] = None,\n        progress: Optional[float] = None,\n    ) -> ProcessingJob:\n        job = self.ensure_job()\n        total_steps = job.total_steps or 4\n        resolved_step = step if step is not None else total_steps\n        resolved_progress = progress if progress is not None else 100.0\n        job.error_message = None\n        self._status_manager.update_job_status(\n            job, \"skipped\", resolved_step, message, resolved_progress\n  
      )\n        return job\n\n    def _load_and_validate_post(\n        self,\n    ) -> Tuple[Optional[Post], Optional[Dict[str, Any]]]:\n        \"\"\"Load the post and perform lifecycle validations.\"\"\"\n        post = Post.query.filter_by(guid=self.post_guid).first()\n        if not post:\n            job = self._mark_job_skipped(\"Post no longer exists\")\n            return (\n                None,\n                {\n                    \"status\": \"error\",\n                    \"error_code\": \"NOT_FOUND\",\n                    \"message\": \"Post not found\",\n                    \"job_id\": getattr(job, \"id\", None),\n                },\n            )\n\n        if not post.whitelisted:\n            job = self._mark_job_skipped(\"Post not whitelisted\")\n            return (\n                None,\n                {\n                    \"status\": \"error\",\n                    \"error_code\": \"NOT_WHITELISTED\",\n                    \"message\": \"Post not whitelisted\",\n                    \"job_id\": getattr(job, \"id\", None),\n                },\n            )\n\n        if not post.download_url:\n            self._logger.warning(\n                \"Post %s (%s) is whitelisted but missing download_url; marking job as failed\",\n                post.guid,\n                post.title,\n            )\n            job = self.fail(\"Download URL missing\")\n            return (\n                None,\n                {\n                    \"status\": \"error\",\n                    \"error_code\": \"MISSING_DOWNLOAD_URL\",\n                    \"message\": \"Post is missing a download URL\",\n                    \"job_id\": job.id,\n                },\n            )\n\n        if post.processed_audio_path and os.path.exists(post.processed_audio_path):\n            try:\n                job = self.skip(\"Post already processed\")\n            except Exception as err:  # pylint: disable=broad-exception-caught\n                self._logger.error(\n 
                   \"Failed to mark job as skipped during short-circuit for %s: %s\",\n                    self.post_guid,\n                    err,\n                )\n                job = None\n            return (\n                None,\n                {\n                    \"status\": \"skipped\",\n                    \"message\": \"Post already processed\",\n                    \"job_id\": getattr(job, \"id\", None),\n                    \"download_url\": f\"/api/posts/{self.post_guid}/download\",\n                },\n            )\n\n        return post, None\n\n    def _mark_job_skipped(self, reason: str) -> Optional[ProcessingJob]:\n        job = self.get_active_job()\n        if job and job.status in {\"pending\", \"running\"}:\n            job.error_message = None\n            total_steps = job.total_steps or job.current_step or 4\n            self._status_manager.update_job_status(\n                job,\n                \"skipped\",\n                total_steps,\n                reason,\n                100.0,\n            )\n            return job\n\n        try:\n            return self.skip(reason)\n        except Exception as err:  # pylint: disable=broad-exception-caught\n            self._logger.error(\n                \"Failed to mark job as skipped for %s: %s\", self.post_guid, err\n            )\n        return job\n\n    def start_processing(self, priority: str) -> Dict[str, Any]:\n        \"\"\"\n        Handle the end-to-end lifecycle for a single post processing request.\n        Ensures a job exists and is marked ready for the worker thread.\n        \"\"\"\n        _, early_result = self._load_and_validate_post()\n        if early_result:\n            return early_result\n\n        _db.session.expire_all()\n\n        job = self.ensure_job()\n\n        if job.status == \"running\":\n            return {\n                \"status\": \"running\",\n                \"message\": \"Another processing job is already running for this 
episode\",\n                \"job_id\": job.id,\n            }\n\n        self._status_manager.update_job_status(\n            job,\n            \"pending\",\n            0,\n            f\"Queued for processing (priority={priority})\",\n            0.0,\n        )\n\n        return {\n            \"status\": \"started\",\n            \"message\": \"Job queued for processing\",\n            \"job_id\": job.id,\n        }\n"
  },
  {
    "path": "src/app/jobs_manager.py",
    "content": "import logging\nimport os\nfrom datetime import datetime, timedelta\nfrom threading import Event, Lock, Thread\nfrom typing import Any, Dict, List, Optional, Tuple, cast\n\nfrom sqlalchemy import case\n\nfrom app.db_guard import db_guard, reset_session\nfrom app.extensions import db as _db\nfrom app.extensions import scheduler\nfrom app.feeds import refresh_feed\nfrom app.job_manager import JobManager as SingleJobManager\nfrom app.models import Feed, JobsManagerRun, Post, ProcessingJob\nfrom app.processor import get_processor\nfrom app.writer.client import writer_client\nfrom podcast_processor.podcast_processor import ProcessorException\nfrom podcast_processor.processing_status_manager import ProcessingStatusManager\n\nlogger = logging.getLogger(\"global_logger\")\n\n\nclass JobsManager:\n    \"\"\"\n    Centralized manager for starting, tracking, listing, and cancelling\n    podcast processing jobs.\n\n    Owns a shared worker pool and coordinates with ProcessingStatusManager.\n    \"\"\"\n\n    # Class-level lock to ensure only one job processes at a time across ALL instances\n    _global_processing_lock = Lock()\n\n    def __init__(self) -> None:\n        # Status manager for DB interactions\n        self._status_manager = ProcessingStatusManager(\n            db_session=_db.session, logger=logger\n        )\n\n        # Track the singleton run id with thread-safe access\n        self._run_lock = Lock()\n        self._run_id: Optional[str] = None\n\n        # Persistent worker thread coordination\n        self._stop_event = Event()\n        self._work_event = Event()\n        self._worker_thread = Thread(\n            target=self._worker_loop, name=\"jobs-manager-worker\", daemon=True\n        )\n        self._worker_thread.start()\n\n        # Initialize run via writer\n        with scheduler.app.app_context():\n            try:\n                result = writer_client.action(\n                    \"ensure_active_run\",\n                    
{\"trigger\": \"startup\", \"context\": {\"source\": \"init\"}},\n                    wait=True,\n                )\n                if result and result.success and result.data:\n                    self._set_run_id(result.data[\"run_id\"])\n            except Exception as e:\n                logger.error(f\"Failed to initialize run: {e}\")\n\n    def _set_run_id(self, run_id: Optional[str]) -> None:\n        with self._run_lock:\n            self._run_id = run_id\n\n    def _get_run_id(self) -> Optional[str]:\n        with self._run_lock:\n            return self._run_id\n\n    def _wake_worker(self) -> None:\n        self._work_event.set()\n\n    def _wait_for_work(self, timeout: float = 5.0) -> None:\n        triggered = self._work_event.wait(timeout)\n        if triggered:\n            self._work_event.clear()\n\n    # ------------------------ Public API ------------------------\n    def start_post_processing(\n        self,\n        post_guid: str,\n        priority: str = \"interactive\",\n        *,\n        requested_by_user_id: Optional[int] = None,\n        billing_user_id: Optional[int] = None,\n    ) -> Dict[str, Any]:\n        \"\"\"\n        Idempotently start processing for a post. 
If an active job exists, return it.\n        \"\"\"\n        with scheduler.app.app_context():\n            ensure_result = writer_client.action(\n                \"ensure_active_run\",\n                {\n                    \"trigger\": \"interactive_start\",\n                    \"context\": {\"post_guid\": post_guid, \"priority\": priority},\n                },\n                wait=True,\n            )\n            run_id = None\n            if ensure_result and ensure_result.success and ensure_result.data:\n                run_id = ensure_result.data.get(\"run_id\")\n            self._set_run_id(run_id)\n            start_result = SingleJobManager(\n                post_guid,\n                self._status_manager,\n                logger,\n                run_id,\n                requested_by_user_id=requested_by_user_id,\n                billing_user_id=billing_user_id,\n            ).start_processing(priority)\n        if start_result.get(\"status\") in {\"started\", \"running\"}:\n            self._wake_worker()\n        return start_result\n\n    def enqueue_pending_jobs(\n        self,\n        trigger: str = \"system\",\n        context: Optional[Dict[str, Any]] = None,\n    ) -> Dict[str, Any]:\n        \"\"\"\n        Ensure all posts have job records and enqueue pending work.\n\n        Returns basic stats for logging/monitoring.\n        \"\"\"\n        with scheduler.app.app_context():\n            result = writer_client.action(\n                \"ensure_active_run\", {\"trigger\": trigger, \"context\": context}, wait=True\n            )\n\n            run_id = None\n            if result and result.success and result.data:\n                run_id = result.data[\"run_id\"]\n            self._set_run_id(run_id)\n\n            active_run = _db.session.get(JobsManagerRun, run_id) if run_id else None\n\n            created_count, pending_count = self._cleanup_and_process_new_posts(\n                active_run\n            )\n\n            response = 
{\n                \"status\": \"ok\",\n                \"created\": created_count,\n                \"pending\": pending_count,\n                \"enqueued\": pending_count,\n                \"run_id\": run_id,\n            }\n        if pending_count:\n            self._wake_worker()\n        return response\n\n    def _ensure_jobs_for_all_posts(self, run_id: Optional[str]) -> int:\n        \"\"\"Ensure every post has an associated ProcessingJob record.\"\"\"\n        posts_without_jobs = (\n            Post.query.outerjoin(ProcessingJob, ProcessingJob.post_guid == Post.guid)\n            .filter(ProcessingJob.id.is_(None))\n            .all()\n        )\n\n        created = 0\n        for post in posts_without_jobs:\n            if post.whitelisted:\n                SingleJobManager(\n                    post.guid,\n                    self._status_manager,\n                    logger,\n                    run_id,\n                ).ensure_job()\n                created += 1\n        return created\n\n    def get_post_status(self, post_guid: str) -> Dict[str, Any]:\n        with scheduler.app.app_context():\n            post = Post.query.filter_by(guid=post_guid).first()\n            if not post:\n                return {\n                    \"status\": \"error\",\n                    \"error_code\": \"NOT_FOUND\",\n                    \"message\": \"Post not found\",\n                }\n\n            job = (\n                ProcessingJob.query.filter_by(post_guid=post_guid)\n                .order_by(ProcessingJob.created_at.desc())\n                .first()\n            )\n\n            if not job:\n                if post.processed_audio_path and os.path.exists(\n                    post.processed_audio_path\n                ):\n                    return {\n                        \"status\": \"skipped\",\n                        \"step\": 4,\n                        \"step_name\": \"Processing skipped\",\n                        \"total_steps\": 4,\n      
                  \"progress_percentage\": 100.0,\n                        \"message\": \"Post already processed\",\n                        \"download_url\": f\"/api/posts/{post_guid}/download\",\n                    }\n                return {\n                    \"status\": \"not_started\",\n                    \"step\": 0,\n                    \"step_name\": \"Not started\",\n                    \"total_steps\": 4,\n                    \"progress_percentage\": 0.0,\n                    \"message\": \"No processing job found\",\n                }\n\n            response = {\n                \"status\": job.status,\n                \"step\": job.current_step,\n                \"step_name\": job.step_name or \"Unknown\",\n                \"total_steps\": job.total_steps,\n                \"progress_percentage\": job.progress_percentage,\n                \"message\": job.step_name\n                or f\"Step {job.current_step} of {job.total_steps}\",\n            }\n            if job.started_at:\n                response[\"started_at\"] = job.started_at.isoformat()\n            if (\n                job.status in {\"completed\", \"skipped\"}\n                and post.processed_audio_path\n                and os.path.exists(post.processed_audio_path)\n            ):\n                response[\"download_url\"] = f\"/api/posts/{post_guid}/download\"\n            if job.status == \"failed\" and job.error_message:\n                response[\"error\"] = job.error_message\n            if job.status == \"cancelled\" and job.error_message:\n                response[\"message\"] = job.error_message\n            return response\n\n    def get_job_status(self, job_id: str) -> Dict[str, Any]:\n        with scheduler.app.app_context():\n            job = _db.session.get(ProcessingJob, job_id)\n            if not job:\n                return {\n                    \"status\": \"error\",\n                    \"error_code\": \"NOT_FOUND\",\n                    \"message\": \"Job 
not found\",\n                }\n            return {\n                \"job_id\": job.id,\n                \"post_guid\": job.post_guid,\n                \"status\": job.status,\n                \"step\": job.current_step,\n                \"step_name\": job.step_name,\n                \"total_steps\": job.total_steps,\n                \"progress_percentage\": job.progress_percentage,\n                \"started_at\": job.started_at.isoformat() if job.started_at else None,\n                \"completed_at\": (\n                    job.completed_at.isoformat() if job.completed_at else None\n                ),\n                \"error\": job.error_message,\n            }\n\n    def list_active_jobs(self, limit: int = 100) -> List[Dict[str, Any]]:\n        with scheduler.app.app_context():\n            # Derive a simple priority from status: running > pending\n            priority_order = case(\n                (ProcessingJob.status == \"running\", 2),\n                (ProcessingJob.status == \"pending\", 1),\n                else_=0,\n            ).label(\"priority\")\n\n            rows = (\n                _db.session.query(ProcessingJob, Post, priority_order)\n                .outerjoin(Post, ProcessingJob.post_guid == Post.guid)\n                .filter(ProcessingJob.status.in_([\"pending\", \"running\"]))\n                .order_by(priority_order.desc(), ProcessingJob.created_at.desc())\n                .limit(limit)\n                .all()\n            )\n\n            results: List[Dict[str, Any]] = []\n            for job, post, prio in rows:\n                results.append(\n                    {\n                        \"job_id\": job.id,\n                        \"post_guid\": job.post_guid,\n                        \"post_title\": post.title if post else None,\n                        \"feed_title\": post.feed.title if post and post.feed else None,\n                        \"status\": job.status,\n                        \"priority\": int(prio) if prio 
is not None else 0,\n                        \"step\": job.current_step,\n                        \"step_name\": job.step_name,\n                        \"total_steps\": job.total_steps,\n                        \"progress_percentage\": job.progress_percentage,\n                        \"created_at\": (\n                            job.created_at.isoformat() if job.created_at else None\n                        ),\n                        \"started_at\": (\n                            job.started_at.isoformat() if job.started_at else None\n                        ),\n                        \"completed_at\": (\n                            job.completed_at.isoformat() if job.completed_at else None\n                        ),\n                        \"error_message\": job.error_message,\n                    }\n                )\n\n            return results\n\n    def list_all_jobs_detailed(self, limit: int = 200) -> List[Dict[str, Any]]:\n        with scheduler.app.app_context():\n            # Priority by status, others ranked lowest\n            priority_order = case(\n                (ProcessingJob.status == \"running\", 2),\n                (ProcessingJob.status == \"pending\", 1),\n                else_=0,\n            ).label(\"priority\")\n\n            rows = (\n                _db.session.query(ProcessingJob, Post, priority_order)\n                .outerjoin(Post, ProcessingJob.post_guid == Post.guid)\n                .order_by(priority_order.desc(), ProcessingJob.created_at.desc())\n                .limit(limit)\n                .all()\n            )\n\n            results: List[Dict[str, Any]] = []\n            for job, post, prio in rows:\n                results.append(\n                    {\n                        \"job_id\": job.id,\n                        \"post_guid\": job.post_guid,\n                        \"post_title\": post.title if post else None,\n                        \"feed_title\": post.feed.title if post and post.feed else None,\n    
                    \"status\": job.status,\n                        \"priority\": int(prio) if prio is not None else 0,\n                        \"step\": job.current_step,\n                        \"step_name\": job.step_name,\n                        \"total_steps\": job.total_steps,\n                        \"progress_percentage\": job.progress_percentage,\n                        \"created_at\": (\n                            job.created_at.isoformat() if job.created_at else None\n                        ),\n                        \"started_at\": (\n                            job.started_at.isoformat() if job.started_at else None\n                        ),\n                        \"completed_at\": (\n                            job.completed_at.isoformat() if job.completed_at else None\n                        ),\n                        \"error_message\": job.error_message,\n                    }\n                )\n\n            return results\n\n    def cancel_job(self, job_id: str) -> Dict[str, Any]:\n        with scheduler.app.app_context():\n            job = _db.session.get(ProcessingJob, job_id)\n            if not job:\n                return {\n                    \"status\": \"error\",\n                    \"error_code\": \"NOT_FOUND\",\n                    \"message\": \"Job not found\",\n                }\n\n            if job.status in [\"completed\", \"failed\", \"cancelled\", \"skipped\"]:\n                return {\n                    \"status\": \"error\",\n                    \"error_code\": \"ALREADY_FINISHED\",\n                    \"message\": f\"Job already {job.status}\",\n                }\n\n            # Mark job as cancelled in database\n            self._status_manager.mark_cancelled(job_id, \"Cancelled by user request\")\n\n            return {\n                \"status\": \"cancelled\",\n                \"job_id\": job_id,\n                \"message\": \"Job cancelled\",\n            }\n\n    def cancel_post_jobs(self, 
post_guid: str) -> Dict[str, Any]:\n        with scheduler.app.app_context():\n            # Find active jobs for this post in database\n            active_jobs = (\n                ProcessingJob.query.filter_by(post_guid=post_guid)\n                .filter(ProcessingJob.status.in_([\"pending\", \"running\"]))\n                .all()\n            )\n\n            job_ids = [job.id for job in active_jobs]\n            for job in active_jobs:\n                self._status_manager.mark_cancelled(job.id, \"Cancelled by user request\")\n\n            return {\n                \"status\": \"cancelled\",\n                \"post_guid\": post_guid,\n                \"job_ids\": job_ids,\n                \"message\": f\"Cancelled {len(job_ids)} jobs\",\n            }\n\n    def cleanup_stale_jobs(self, older_than: timedelta) -> int:\n        try:\n            result = writer_client.action(\n                \"cleanup_stale_jobs\",\n                {\"older_than_seconds\": older_than.total_seconds()},\n                wait=True,\n            )\n            if result and result.success and result.data:\n                return cast(int, result.data.get(\"count\", 0))\n            return 0\n        except Exception as e:\n            logger.error(f\"Failed to cleanup stale jobs: {e}\")\n            return 0\n\n    def cleanup_stuck_pending_jobs(self, stuck_threshold_minutes: int = 10) -> int:\n        \"\"\"\n        Clean up jobs that have been stuck in 'pending' status for too long.\n        This indicates they were never picked up by the thread pool.\n        \"\"\"\n        cutoff = datetime.utcnow() - timedelta(minutes=stuck_threshold_minutes)\n        with scheduler.app.app_context():\n            stuck_jobs = ProcessingJob.query.filter(\n                ProcessingJob.status == \"pending\", ProcessingJob.created_at < cutoff\n            ).all()\n\n            count = len(stuck_jobs)\n            for job in stuck_jobs:\n                try:\n                    
logger.warning(\n                        f\"Marking stuck pending job {job.id} as failed (created at {job.created_at})\"\n                    )\n                    self._status_manager.update_job_status(\n                        job,\n                        \"failed\",\n                        job.current_step,\n                        f\"Job was stuck in pending status for over {stuck_threshold_minutes} minutes\",\n                    )\n                except Exception as e:  # pylint: disable=broad-except\n                    logger.error(f\"Failed to update stuck job {job.id}: {e}\")\n\n            return count\n\n    def clear_all_jobs(self) -> Dict[str, Any]:\n        \"\"\"\n        Clear all processing jobs from the database.\n        This is typically called during application startup to ensure a clean state.\n        \"\"\"\n        try:\n            result = writer_client.action(\"clear_all_jobs\", {}, wait=True)\n            count = result.data if result and result.success else 0\n            logger.info(f\"Cleared {count} processing jobs on startup\")\n            return {\n                \"status\": \"success\",\n                \"cleared_jobs\": count,\n                \"message\": f\"Cleared {count} jobs from database\",\n            }\n        except Exception as e:\n            logger.error(f\"Error clearing all jobs: {e}\")\n            return {\"status\": \"error\", \"message\": f\"Failed to clear jobs: {str(e)}\"}\n\n    def start_refresh_all_feeds(\n        self,\n        trigger: str = \"scheduled\",\n        context: Optional[Dict[str, Any]] = None,\n    ) -> Dict[str, Any]:\n        \"\"\"\n        Refresh feeds and enqueue per-post processing into internal worker pool.\n        \"\"\"\n        with scheduler.app.app_context():\n            feeds = Feed.query.all()\n            for feed in feeds:\n                refresh_feed(feed)\n\n            # Clean up posts with missing audio files\n            
self._cleanup_inconsistent_posts()\n\n            # Process new posts\n            return self.enqueue_pending_jobs(trigger=trigger, context=context)\n\n    # ------------------------ Helpers ------------------------\n    def _cleanup_inconsistent_posts(self) -> None:\n        \"\"\"Clean up posts with missing audio files.\"\"\"\n        try:\n            writer_client.action(\"cleanup_missing_audio_paths\", {}, wait=True)\n        except Exception as e:\n            logger.error(\n                f\"Failed to cleanup inconsistent posts: {e}\",\n                exc_info=True,\n            )\n\n    def _cleanup_and_process_new_posts(\n        self, active_run: Optional[JobsManagerRun]\n    ) -> Tuple[int, int]:\n        \"\"\"Ensure all posts have jobs and return counts for monitoring.\"\"\"\n        run_id = active_run.id if active_run else None\n        created_jobs = self._ensure_jobs_for_all_posts(run_id)\n\n        pending_jobs = (\n            ProcessingJob.query.filter(ProcessingJob.status == \"pending\")\n            .order_by(ProcessingJob.created_at.asc())\n            .all()\n        )\n\n        if active_run and pending_jobs:\n            try:\n                writer_client.action(\n                    \"reassign_pending_jobs\", {\"run_id\": run_id}, wait=True\n                )\n            except Exception as e:  # pylint: disable=broad-except\n                logger.error(\"Failed to reassign pending jobs: %s\", e)\n\n        if created_jobs:\n            logger.info(\"Created %s new job records\", created_jobs)\n\n        logger.info(\n            \"Pending jobs ready for worker: count=%s run_id=%s\",\n            len(pending_jobs),\n            run_id,\n        )\n\n        return created_jobs, len(pending_jobs)\n\n    # Removed _get_active_job_for_guid - now using direct database queries\n\n    # ------------------------ Internal helpers ------------------------\n\n    def _dequeue_next_job(self) -> Optional[Tuple[str, str]]:\n        \"\"\"Return 
the next pending job id and post guid, or None if idle.\n\n        CRITICAL: This method atomically marks the job as \"running\" when dequeuing\n        to prevent race conditions where multiple jobs could be dequeued before\n        any is marked as running.\n        \"\"\"\n        try:\n            run_id = self._get_run_id()\n            result = writer_client.action(\"dequeue_job\", {\"run_id\": run_id}, wait=True)\n\n            if result and result.success and result.data:\n                job_id = result.data[\"job_id\"]\n                post_guid = result.data[\"post_guid\"]\n\n                logger.info(\n                    \"[JOB_DEQUEUE] Successfully dequeued and marked running: job_id=%s post_guid=%s\",\n                    job_id,\n                    post_guid,\n                )\n                return job_id, post_guid\n\n            return None\n        except Exception as e:\n            logger.error(f\"Error dequeuing job: {e}\")\n            return None\n\n    def _worker_loop(self) -> None:\n        \"\"\"Background loop that continuously processes pending jobs.\n\n        CRITICAL: This runs in a single dedicated daemon thread. 
Combined with\n        the _global_processing_lock in _process_job, this ensures truly sequential\n        job execution with no parallelism.\n        \"\"\"\n        import threading\n\n        logger.info(\n            \"[WORKER_LOOP] Started single worker thread: thread_name=%s thread_id=%s\",\n            threading.current_thread().name,\n            threading.current_thread().ident,\n        )\n        while not self._stop_event.is_set():\n            try:\n                job_details = self._dequeue_next_job()\n                if not job_details:\n                    self._wait_for_work()\n                    continue\n                job_id, post_guid = job_details\n                self._process_job(job_id, post_guid)\n            except Exception as exc:  # pylint: disable=broad-except\n                logger.error(\"Worker loop error: %s\", exc, exc_info=True)\n                reset_session(_db.session, logger, \"worker_loop_exception\", exc)\n\n    def _process_job(self, job_id: str, post_guid: str) -> None:\n        \"\"\"Execute a single job using the processor.\n\n        Uses a global processing lock to absolutely guarantee single-job execution.\n        \"\"\"\n        # Acquire global lock to ensure only one job runs at a time\n        logger.info(\n            \"[JOB_PROCESS] Waiting for processing lock: job_id=%s post_guid=%s\",\n            job_id,\n            post_guid,\n        )\n        with JobsManager._global_processing_lock:\n            logger.info(\n                \"[JOB_PROCESS] Acquired processing lock: job_id=%s post_guid=%s\",\n                job_id,\n                post_guid,\n            )\n            with scheduler.app.app_context():\n                with db_guard(\"process_job\", _db.session, logger):\n                    try:\n                        # Clear any failed transaction state from prior work on this session.\n                        try:\n                            _db.session.rollback()\n                        
except Exception:  # pylint: disable=broad-except\n                            pass\n\n                        # Expire all cached objects to ensure fresh reads\n                        _db.session.expire_all()\n\n                        logger.debug(\n                            \"Worker starting job_id=%s post_guid=%s\", job_id, post_guid\n                        )\n                        worker_post = Post.query.filter_by(guid=post_guid).first()\n                        if not worker_post:\n                            logger.error(\n                                \"Post with GUID %s not found; failing job %s\",\n                                post_guid,\n                                job_id,\n                            )\n                            job = _db.session.get(ProcessingJob, job_id)\n                            if job:\n                                self._status_manager.update_job_status(\n                                    job,\n                                    \"failed\",\n                                    job.current_step or 0,\n                                    \"Post not found\",\n                                    0.0,\n                                )\n                            return\n\n                        def _cancelled() -> bool:\n                            # Expire the job before re-querying to get fresh state\n                            _db.session.expire_all()\n                            current_job = _db.session.get(ProcessingJob, job_id)\n                            return (\n                                current_job is None or current_job.status == \"cancelled\"\n                            )\n\n                        get_processor().process(\n                            worker_post, job_id=job_id, cancel_callback=_cancelled\n                        )\n                    except ProcessorException as exc:\n                        logger.info(\n                            \"Job %s finished with processor 
exception: %s\", job_id, exc\n                        )\n                    except Exception as exc:  # pylint: disable=broad-except\n                        logger.error(\n                            \"Unexpected error in job %s: %s\", job_id, exc, exc_info=True\n                        )\n                        try:\n                            _db.session.expire_all()\n                            failed_job = _db.session.get(ProcessingJob, job_id)\n                            if failed_job and failed_job.status not in [\n                                \"completed\",\n                                \"cancelled\",\n                                \"failed\",\n                            ]:\n                                self._status_manager.update_job_status(\n                                    failed_job,\n                                    \"failed\",\n                                    failed_job.current_step or 0,\n                                    f\"Job execution failed: {exc}\",\n                                    failed_job.progress_percentage or 0.0,\n                                )\n                        except (\n                            Exception\n                        ) as cleanup_error:  # pylint: disable=broad-except\n                            logger.error(\n                                \"Failed to update job status after error: %s\",\n                                cleanup_error,\n                                exc_info=True,\n                            )\n                    finally:\n                        # Always clean up session state after job processing to release any locks\n                        try:\n                            _db.session.rollback()\n                        except Exception:  # pylint: disable=broad-except\n                            pass\n                        try:\n                            _db.session.remove()\n                        except Exception as exc:  # pylint: 
disable=broad-except\n                            logger.warning(\n                                \"Failed to remove session after job: %s\", exc\n                            )\n            logger.info(\n                \"[JOB_PROCESS] Released processing lock: job_id=%s post_guid=%s\",\n                job_id,\n                post_guid,\n            )\n\n\n# Singleton accessor\ndef get_jobs_manager() -> JobsManager:\n    if not hasattr(get_jobs_manager, \"_instance\"):\n        get_jobs_manager._instance = JobsManager()  # type: ignore[attr-defined]\n    return get_jobs_manager._instance  # type: ignore[attr-defined, no-any-return]\n\n\ndef scheduled_refresh_all_feeds() -> None:\n    \"\"\"Top-level function for APScheduler to invoke periodically.\"\"\"\n    try:\n        get_jobs_manager().start_refresh_all_feeds(trigger=\"scheduled\")\n    except Exception as e:  # pylint: disable=broad-except\n        logger.error(f\"Scheduled refresh failed: {e}\")\n"
  },
  {
    "path": "src/app/jobs_manager_run_service.py",
    "content": "\"\"\"Helpers for managing the singleton JobsManagerRun row.\"\"\"\n\nfrom __future__ import annotations\n\nimport logging\nfrom datetime import datetime\nfrom typing import Any, Dict, Optional, cast\n\nfrom sqlalchemy import func\n\nfrom app.models import JobsManagerRun, ProcessingJob\n\nlogger = logging.getLogger(\"writer\")\n\nSINGLETON_RUN_ID = \"jobs-manager-singleton\"\n\n\ndef _session_get(session: Any, ident: str) -> Optional[JobsManagerRun]:\n    \"\"\"Get a JobsManagerRun by id from a session-like object.\n\n    Accepts both modern Session objects that implement .get(model, id)\n    and older SQLAlchemy session objects where .query(...).get(id) is used.\n    Returns None if not found.\n    \"\"\"\n    getter = getattr(session, \"get\", None)\n    if callable(getter):\n        return cast(Optional[JobsManagerRun], getter(JobsManagerRun, ident))\n    # Fallback for older SQLAlchemy versions\n    return cast(Optional[JobsManagerRun], session.query(JobsManagerRun).get(ident))\n\n\ndef _build_context_payload(\n    trigger: str, context: Optional[Dict[str, object]], updated_at: datetime\n) -> Dict[str, object]:\n    payload: Dict[str, object] = {}\n    if context:\n        payload.update(context)\n    payload[\"last_trigger\"] = trigger\n    payload[\"last_trigger_at\"] = updated_at.isoformat()\n    return payload\n\n\ndef get_or_create_singleton_run(\n    session: Any, trigger: str, context: Optional[Dict[str, object]] = None\n) -> JobsManagerRun:\n    \"\"\"Return the singleton run, creating it if necessary.\"\"\"\n    now = datetime.utcnow()\n    run = _session_get(session, SINGLETON_RUN_ID)\n    if run:\n        run.trigger = trigger\n        run.context_json = _build_context_payload(trigger, context, now)\n        run.updated_at = now\n        if not run.started_at:\n            run.started_at = now\n        if not run.counters_reset_at:\n            run.counters_reset_at = run.started_at or now\n        session.flush()\n        return 
run\n\n    run = JobsManagerRun(\n        id=SINGLETON_RUN_ID,\n        status=\"running\",\n        trigger=trigger,\n        started_at=now,\n        counters_reset_at=now,\n        created_at=now,\n        updated_at=now,\n        context_json=_build_context_payload(trigger, context, now),\n    )\n    session.add(run)\n    session.flush()\n    return run\n\n\ndef ensure_active_run(\n    session: Any, trigger: str, context: Optional[Dict[str, object]] = None\n) -> JobsManagerRun:\n    \"\"\"Return the singleton run, ensuring it exists and is up to date.\"\"\"\n    return get_or_create_singleton_run(session, trigger, context)\n\n\ndef get_active_run(session: Any) -> Optional[JobsManagerRun]:\n    \"\"\"Return the singleton run if it exists.\"\"\"\n    return _session_get(session, SINGLETON_RUN_ID)\n\n\ndef recalculate_run_counts(session: Any) -> Optional[JobsManagerRun]:\n    \"\"\"\n    Recompute aggregate counters for the singleton run.\n\n    When no jobs remain in the system the counters are reset to zero so the UI\n    reflects an idle manager.\n    \"\"\"\n    run = get_active_run(session)\n    if not run:\n        return None\n\n    cutoff = run.counters_reset_at\n    # The linter incorrectly flags func.count as not callable.\n    query = session.query(\n        ProcessingJob.status,\n        func.count(ProcessingJob.id),  # pylint: disable=not-callable\n    ).filter(ProcessingJob.jobs_manager_run_id == run.id)\n    if cutoff:\n        query = query.filter(ProcessingJob.created_at >= cutoff)\n    counts = dict(query.group_by(ProcessingJob.status).all())\n\n    logger.debug(\n        \"[WRITER] recalculate_run_counts: run_id=%s counts=%s\",\n        getattr(run, \"id\", None),\n        counts,\n    )\n\n    now = datetime.utcnow()\n    queued = counts.get(\"pending\", 0) + counts.get(\"queued\", 0)\n    running = counts.get(\"running\", 0)\n    completed = counts.get(\"completed\", 0)\n    failed = counts.get(\"failed\", 0) + counts.get(\"cancelled\", 0)\n   
 skipped = counts.get(\"skipped\", 0)\n    total_jobs = sum(counts.values())\n\n    has_active_work = (queued + running) > 0\n\n    if has_active_work:\n        run.total_jobs = total_jobs\n        run.queued_jobs = queued\n        run.running_jobs = running\n        run.completed_jobs = completed\n        run.failed_jobs = failed\n        if hasattr(run, \"skipped_jobs\"):\n            run.skipped_jobs = skipped\n        run.updated_at = now\n        if run.running_jobs > 0:\n            run.status = \"running\"\n        else:\n            run.status = \"pending\"\n        if not run.started_at:\n            run.started_at = now\n        if not run.counters_reset_at:\n            run.counters_reset_at = run.started_at or now\n        run.completed_at = None\n    else:\n        run.status = \"pending\"\n        run.completed_at = now\n        run.started_at = None\n        run.total_jobs = 0\n        run.queued_jobs = 0\n        run.running_jobs = 0\n        run.completed_jobs = 0\n        run.failed_jobs = 0\n        if hasattr(run, \"skipped_jobs\"):\n            run.skipped_jobs = 0\n        run.updated_at = now\n        run.counters_reset_at = now\n\n    session.flush()\n    return run\n\n\ndef serialize_run(run: JobsManagerRun) -> Dict[str, object]:\n    \"\"\"Return a JSON-serialisable representation of a run.\"\"\"\n    progress_denom = max(run.total_jobs or 0, 1)\n    progress_percentage = (\n        ((run.completed_jobs + getattr(run, \"skipped_jobs\", 0)) / progress_denom)\n        * 100.0\n        if run.total_jobs\n        else 0.0\n    )\n\n    return {\n        \"id\": run.id,\n        \"status\": run.status,\n        \"trigger\": run.trigger,\n        \"started_at\": run.started_at.isoformat() if run.started_at else None,\n        \"completed_at\": run.completed_at.isoformat() if run.completed_at else None,\n        \"updated_at\": run.updated_at.isoformat() if run.updated_at else None,\n        \"total_jobs\": run.total_jobs,\n        
\"queued_jobs\": run.queued_jobs,\n        \"running_jobs\": run.running_jobs,\n        \"completed_jobs\": run.completed_jobs,\n        \"failed_jobs\": run.failed_jobs,\n        \"skipped_jobs\": getattr(run, \"skipped_jobs\", 0),\n        \"context\": run.context_json,\n        \"counters_reset_at\": (\n            run.counters_reset_at.isoformat() if run.counters_reset_at else None\n        ),\n        \"progress_percentage\": round(progress_percentage, 2),\n    }\n\n\ndef build_run_status_snapshot(session: Any) -> Optional[Dict[str, object]]:\n    \"\"\"\n    Return a fresh, non-persisted snapshot of the current run counters.\n\n    This mirrors recalculate_run_counts but does not mutate or flush the\n    JobsManagerRun row, making it safe for high-frequency polling without\n    competing for SQLite write locks.\n    \"\"\"\n    run = get_active_run(session)\n    if not run:\n        return None\n\n    cutoff = run.counters_reset_at\n    query = session.query(\n        ProcessingJob.status,\n        func.count(ProcessingJob.id),  # pylint: disable=not-callable\n    ).filter(ProcessingJob.jobs_manager_run_id == run.id)\n    if cutoff:\n        query = query.filter(ProcessingJob.created_at >= cutoff)\n    counts = dict(query.group_by(ProcessingJob.status).all())\n\n    queued = counts.get(\"pending\", 0) + counts.get(\"queued\", 0)\n    running = counts.get(\"running\", 0)\n    completed = counts.get(\"completed\", 0)\n    failed = counts.get(\"failed\", 0) + counts.get(\"cancelled\", 0)\n    skipped = counts.get(\"skipped\", 0)\n    total_jobs = sum(counts.values())\n\n    has_active_work = (queued + running) > 0\n    if has_active_work:\n        status = \"running\" if running > 0 else \"pending\"\n    else:\n        status = \"pending\"\n\n    progress_denom = max(total_jobs or 0, 1)\n    progress_percentage = (\n        ((completed + skipped) / progress_denom) * 100.0 if total_jobs else 0.0\n    )\n\n    return {\n        \"id\": 
run.id,\n        \"status\": status,\n        \"trigger\": run.trigger,\n        \"started_at\": run.started_at.isoformat() if run.started_at else None,\n        \"completed_at\": run.completed_at.isoformat() if run.completed_at else None,\n        \"updated_at\": run.updated_at.isoformat() if run.updated_at else None,\n        \"total_jobs\": total_jobs,\n        \"queued_jobs\": queued,\n        \"running_jobs\": running,\n        \"completed_jobs\": completed,\n        \"failed_jobs\": failed,\n        \"skipped_jobs\": skipped,\n        \"context\": run.context_json,\n        \"counters_reset_at\": (\n            run.counters_reset_at.isoformat() if run.counters_reset_at else None\n        ),\n        \"progress_percentage\": round(progress_percentage, 2),\n    }\n"
  },
  {
    "path": "src/app/logger.py",
"content": "import json\nimport logging\nimport os\n\n\nclass ExtraFormatter(logging.Formatter):\n    \"\"\"Formatter that appends structured extras to log lines.\n\n    Any LogRecord attributes not in the standard set are captured into a JSON\n    object and appended as ``extra={...}`` so contextual fields are visible in\n    plain-text logs.\n    \"\"\"\n\n    _standard_attrs = {\n        \"name\",\n        \"msg\",\n        \"args\",\n        \"levelname\",\n        \"levelno\",\n        \"pathname\",\n        \"filename\",\n        \"module\",\n        \"exc_info\",\n        \"exc_text\",\n        \"stack_info\",\n        \"lineno\",\n        \"funcName\",\n        \"created\",\n        \"msecs\",\n        \"relativeCreated\",\n        \"thread\",\n        \"threadName\",\n        \"processName\",\n        \"process\",\n        \"message\",\n        \"asctime\",\n        \"taskName\",  # added to LogRecord in Python 3.12; exclude so it never leaks into extras\n    }\n\n    def format(self, record: logging.LogRecord) -> str:\n        base = super().format(record)\n        extras = {\n            k: v for k, v in record.__dict__.items() if k not in self._standard_attrs\n        }\n        if extras:\n            try:\n                extras_json = json.dumps(extras, ensure_ascii=True, default=str)\n            except Exception:\n                extras_json = str(extras)\n            return f\"{base} | extra={extras_json}\"\n        return base\n\n\ndef setup_logger(\n    name: str, log_file: str, level: int = logging.DEBUG\n) -> logging.Logger:\n    \"\"\"Create or return a configured logger.\n\n    - Writes to the specified log_file\n    - Emits to console exactly once (no duplicates)\n    - Disables propagation to avoid duplicate root handling\n    - Guards against adding duplicate handlers across repeated calls\n    \"\"\"\n    file_formatter = ExtraFormatter(\"%(asctime)s %(levelname)s %(message)s\")\n    console_formatter = ExtraFormatter(\"%(levelname)s  [%(name)s] %(message)s\")\n\n    logger = logging.getLogger(name)\n    logger.setLevel(level)\n   
 # Prevent records from also bubbling up to root logger handlers (which can cause duplicates)\n    logger.propagate = False\n\n    # Ensure directory exists for log file\n    log_dir = os.path.dirname(log_file)\n    if log_dir:\n        os.makedirs(log_dir, exist_ok=True)\n\n    # Add file handler if not already present for this file\n    abs_log_file = os.path.abspath(log_file)\n    has_file_handler = any(\n        isinstance(h, logging.FileHandler)\n        and getattr(h, \"baseFilename\", None) == abs_log_file\n        for h in logger.handlers\n    )\n    if not has_file_handler:\n        file_handler = logging.FileHandler(abs_log_file)\n        file_handler.setFormatter(file_formatter)\n        logger.addHandler(file_handler)\n\n    # Add a single console handler if not already present.\n    # FileHandler subclasses StreamHandler, so a bare isinstance check would match\n    # the file handler added above and the console handler would never be attached;\n    # exclude FileHandler instances explicitly.\n    has_stream_handler = any(\n        isinstance(h, logging.StreamHandler) and not isinstance(h, logging.FileHandler)\n        for h in logger.handlers\n    )\n    if not has_stream_handler:\n        stream_handler = logging.StreamHandler()\n        stream_handler.setFormatter(console_formatter)\n        logger.addHandler(stream_handler)\n\n    return logger\n"
  },
  {
    "path": "src/app/models.py",
    "content": "import os\nimport uuid\nfrom datetime import datetime\n\nfrom sqlalchemy.orm import validates\n\nfrom app.auth.passwords import hash_password, verify_password\nfrom app.extensions import db\nfrom shared import defaults as DEFAULTS\n\n\ndef generate_uuid() -> str:\n    \"\"\"Generate a UUID4 string.\"\"\"\n    return str(uuid.uuid4())\n\n\ndef generate_job_id() -> str:\n    \"\"\"Generate a unique job ID.\"\"\"\n    return generate_uuid()\n\n\n# mypy typing issue https://github.com/python/mypy/issues/17918\nclass Feed(db.Model):  # type: ignore[name-defined, misc]\n    id = db.Column(db.Integer, primary_key=True, autoincrement=True)\n    alt_id = db.Column(\n        db.Text, nullable=True\n    )  # used for backwards compatibility with legacy YAML-based feed definitions\n    title = db.Column(db.Text, nullable=False)\n    description = db.Column(db.Text)\n    author = db.Column(db.Text)\n    rss_url = db.Column(db.Text, unique=True, nullable=False)\n    image_url = db.Column(db.Text)\n    auto_whitelist_new_episodes_override = db.Column(db.Boolean, nullable=True)\n\n    posts = db.relationship(\n        \"Post\", backref=\"feed\", lazy=True, order_by=\"Post.release_date.desc()\"\n    )\n    user_feeds = db.relationship(\n        \"UserFeed\",\n        back_populates=\"feed\",\n        cascade=\"all, delete-orphan\",\n    )\n\n    def __repr__(self) -> str:\n        return f\"<Feed {self.title}>\"\n\n\nclass FeedAccessToken(db.Model):  # type: ignore[name-defined, misc]\n    __tablename__ = \"feed_access_token\"\n\n    id = db.Column(db.Integer, primary_key=True, autoincrement=True)\n    token_id = db.Column(db.String(32), unique=True, nullable=False, index=True)\n    token_hash = db.Column(db.String(64), nullable=False)\n    token_secret = db.Column(db.String(128), nullable=True)\n    feed_id = db.Column(db.Integer, db.ForeignKey(\"feed.id\"), nullable=True)\n    user_id = db.Column(db.Integer, db.ForeignKey(\"users.id\"), nullable=False)\n    
created_at = db.Column(db.DateTime, default=datetime.utcnow, nullable=False)\n    last_used_at = db.Column(db.DateTime, nullable=True)\n    revoked = db.Column(db.Boolean, default=False, nullable=False)\n\n    feed = db.relationship(\"Feed\", backref=db.backref(\"access_tokens\", lazy=\"dynamic\"))\n    user = db.relationship(\n        \"User\", backref=db.backref(\"feed_access_tokens\", lazy=\"dynamic\")\n    )\n\n    def __repr__(self) -> str:\n        return (\n            f\"<FeedAccessToken feed={self.feed_id} user={self.user_id}\"\n            f\" revoked={self.revoked}>\"\n        )\n\n\nclass Post(db.Model):  # type: ignore[name-defined, misc]\n    feed_id = db.Column(db.Integer, db.ForeignKey(\"feed.id\"), nullable=False)\n    id = db.Column(db.Integer, primary_key=True, autoincrement=True)\n    guid = db.Column(db.Text, unique=True, nullable=False)\n    download_url = db.Column(\n        db.Text, unique=True, nullable=False\n    )  # remote download URL, not podly url\n    title = db.Column(db.Text, nullable=False)\n    unprocessed_audio_path = db.Column(db.Text)\n    processed_audio_path = db.Column(db.Text)\n    description = db.Column(db.Text)\n    release_date = db.Column(db.DateTime(timezone=True))\n    duration = db.Column(db.Integer)\n    whitelisted = db.Column(db.Boolean, default=False, nullable=False)\n    image_url = db.Column(db.Text)  # Episode thumbnail URL\n    download_count = db.Column(db.Integer, nullable=True, default=0)\n\n    # Latest (most recent) refined ad cut windows for this post.\n    # This is written by the ad classifier boundary refinement step and read by the\n    # audio processor to cut ads using refined (intra-segment) timestamps.\n    refined_ad_boundaries = db.Column(db.JSON, nullable=True)\n    refined_ad_boundaries_updated_at = db.Column(db.DateTime, nullable=True)\n\n    segments = db.relationship(\n        \"TranscriptSegment\",\n        backref=\"post\",\n        lazy=\"dynamic\",\n        
order_by=\"TranscriptSegment.sequence_num\",\n    )\n\n    def audio_len_bytes(self) -> int:\n        audio_len_bytes = 0\n        if self.processed_audio_path is not None and os.path.isfile(\n            self.processed_audio_path\n        ):\n            audio_len_bytes = os.path.getsize(self.processed_audio_path)\n\n        return audio_len_bytes\n\n\nclass TranscriptSegment(db.Model):  # type: ignore[name-defined, misc]\n    __tablename__ = \"transcript_segment\"\n    id = db.Column(db.Integer, primary_key=True, autoincrement=True)\n    post_id = db.Column(db.Integer, db.ForeignKey(\"post.id\"), nullable=False)\n    sequence_num = db.Column(db.Integer, nullable=False)\n    start_time = db.Column(db.Float, nullable=False)\n    end_time = db.Column(db.Float, nullable=False)\n    text = db.Column(db.Text, nullable=False)\n\n    identifications = db.relationship(\n        \"Identification\", backref=\"transcript_segment\", lazy=\"dynamic\"\n    )\n\n    __table_args__ = (\n        db.Index(\n            \"ix_transcript_segment_post_id_sequence_num\",\n            \"post_id\",\n            \"sequence_num\",\n            unique=True,\n        ),\n    )\n\n    def __repr__(self) -> str:\n        return f\"<TranscriptSegment {self.id} P:{self.post_id} S:{self.sequence_num} T:{self.start_time:.1f}-{self.end_time:.1f}>\"\n\n\nclass User(db.Model):  # type: ignore[name-defined, misc]\n    __tablename__ = \"users\"\n    id = db.Column(db.Integer, primary_key=True, autoincrement=True)\n    username = db.Column(db.String(255), unique=True, nullable=False, index=True)\n    password_hash = db.Column(db.String(255), nullable=False)\n    role = db.Column(db.String(50), nullable=False, default=\"user\")\n    feed_allowance = db.Column(db.Integer, nullable=False, default=0)\n    feed_subscription_status = db.Column(\n        db.String(32), nullable=False, default=\"inactive\"\n    )\n    stripe_customer_id = db.Column(db.String(64), nullable=True)\n    stripe_subscription_id = 
db.Column(db.String(64), nullable=True)\n    created_at = db.Column(db.DateTime, default=datetime.utcnow, nullable=False)\n    updated_at = db.Column(\n        db.DateTime, default=datetime.utcnow, onupdate=datetime.utcnow, nullable=False\n    )\n    # Discord SSO fields\n    discord_id = db.Column(db.String(32), unique=True, nullable=True, index=True)\n    discord_username = db.Column(db.String(100), nullable=True)\n    last_active = db.Column(db.DateTime, nullable=True)\n\n    # Admin override for feed allowance (if set, overrides plan-based allowance)\n    manual_feed_allowance = db.Column(db.Integer, nullable=True)\n\n    user_feeds = db.relationship(\n        \"UserFeed\",\n        back_populates=\"user\",\n        cascade=\"all, delete-orphan\",\n    )\n\n    @validates(\"username\")\n    def _normalize_username(self, key: str, value: str) -> str:\n        del key\n        return value.strip().lower()\n\n    def set_password(self, password: str) -> None:\n        self.password_hash = hash_password(password)\n\n    def verify_password(self, password: str) -> bool:\n        return verify_password(password, self.password_hash)\n\n    def __repr__(self) -> str:\n        return f\"<User {self.username} role={self.role}>\"\n\n\nclass ModelCall(db.Model):  # type: ignore[name-defined, misc]\n    __tablename__ = \"model_call\"\n    id = db.Column(db.Integer, primary_key=True, autoincrement=True)\n    post_id = db.Column(db.Integer, db.ForeignKey(\"post.id\"), nullable=False)\n\n    first_segment_sequence_num = db.Column(db.Integer, nullable=False)\n    last_segment_sequence_num = db.Column(db.Integer, nullable=False)\n\n    model_name = db.Column(db.String, nullable=False)\n    prompt = db.Column(db.Text, nullable=False)\n    response = db.Column(db.Text, nullable=True)\n    timestamp = db.Column(db.DateTime, nullable=False, default=datetime.utcnow)\n    status = db.Column(db.String, nullable=False, default=\"pending\")\n    error_message = db.Column(db.Text, 
nullable=True)\n    retry_attempts = db.Column(db.Integer, nullable=False, default=0)\n\n    identifications = db.relationship(\n        \"Identification\", backref=\"model_call\", lazy=\"dynamic\"\n    )\n    post = db.relationship(\"Post\", backref=db.backref(\"model_calls\", lazy=\"dynamic\"))\n\n    __table_args__ = (\n        db.Index(\n            \"ix_model_call_post_chunk_model\",\n            \"post_id\",\n            \"first_segment_sequence_num\",\n            \"last_segment_sequence_num\",\n            \"model_name\",\n            unique=True,\n        ),\n    )\n\n    def __repr__(self) -> str:\n        return f\"<ModelCall {self.id} P:{self.post_id} Segs:{self.first_segment_sequence_num}-{self.last_segment_sequence_num} M:{self.model_name} S:{self.status}>\"\n\n\nclass Identification(db.Model):  # type: ignore[name-defined, misc]\n    __tablename__ = \"identification\"\n    id = db.Column(db.Integer, primary_key=True, autoincrement=True)\n    transcript_segment_id = db.Column(\n        db.Integer, db.ForeignKey(\"transcript_segment.id\"), nullable=False\n    )\n    model_call_id = db.Column(\n        db.Integer, db.ForeignKey(\"model_call.id\"), nullable=False\n    )\n    confidence = db.Column(db.Float, nullable=True)\n    label = db.Column(db.String, nullable=False)\n\n    __table_args__ = (\n        db.Index(\n            \"ix_identification_segment_call_label\",\n            \"transcript_segment_id\",\n            \"model_call_id\",\n            \"label\",\n            unique=True,\n        ),\n    )\n\n    def __repr__(self) -> str:\n        # Ensure confidence is handled if None for f-string formatting\n        confidence_str = (\n            f\"{self.confidence:.2f}\" if self.confidence is not None else \"N/A\"\n        )\n        return f\"<Identification {self.id} TS:{self.transcript_segment_id} MC:{self.model_call_id} L:{self.label} C:{confidence_str}>\"\n\n\nclass JobsManagerRun(db.Model):  # type: ignore[name-defined, misc]\n    
__tablename__ = \"jobs_manager_run\"\n\n    id = db.Column(db.String(36), primary_key=True, default=generate_uuid)\n    status = db.Column(db.String(50), nullable=False, default=\"pending\", index=True)\n    trigger = db.Column(db.String(100), nullable=False)\n    started_at = db.Column(db.DateTime, nullable=True)\n    completed_at = db.Column(db.DateTime, nullable=True)\n    total_jobs = db.Column(db.Integer, nullable=False, default=0)\n    queued_jobs = db.Column(db.Integer, nullable=False, default=0)\n    running_jobs = db.Column(db.Integer, nullable=False, default=0)\n    completed_jobs = db.Column(db.Integer, nullable=False, default=0)\n    failed_jobs = db.Column(db.Integer, nullable=False, default=0)\n    skipped_jobs = db.Column(db.Integer, nullable=False, default=0)\n    context_json = db.Column(db.JSON, nullable=True)\n    counters_reset_at = db.Column(db.DateTime, nullable=True)\n    created_at = db.Column(db.DateTime, default=datetime.utcnow)\n    updated_at = db.Column(\n        db.DateTime, default=datetime.utcnow, onupdate=datetime.utcnow\n    )\n\n    processing_jobs = db.relationship(\n        \"ProcessingJob\", back_populates=\"run\", lazy=\"dynamic\"\n    )\n\n    def __repr__(self) -> str:\n        return (\n            f\"<JobsManagerRun {self.id} status={self.status} \"\n            f\"trigger={self.trigger} total={self.total_jobs}>\"\n        )\n\n\nclass ProcessingJob(db.Model):  # type: ignore[name-defined, misc]\n    __tablename__ = \"processing_job\"\n\n    id = db.Column(db.String(36), primary_key=True, default=generate_job_id)\n    jobs_manager_run_id = db.Column(\n        db.String(36), db.ForeignKey(\"jobs_manager_run.id\"), index=True\n    )\n    post_guid = db.Column(db.String(255), nullable=False, index=True)\n    status = db.Column(\n        db.String(50), nullable=False\n    )  # pending, running, completed, failed, cancelled, skipped\n    current_step = db.Column(db.Integer, default=0)  # 0-4 (0=not started, 4=completed)\n    
step_name = db.Column(db.String(100))\n    total_steps = db.Column(db.Integer, default=4)\n    progress_percentage = db.Column(db.Float, default=0.0)\n    started_at = db.Column(db.DateTime)\n    completed_at = db.Column(db.DateTime)\n    error_message = db.Column(db.Text)\n    scheduler_job_id = db.Column(db.String(255))  # APScheduler job ID\n    created_at = db.Column(db.DateTime, default=datetime.utcnow, index=True)\n    requested_by_user_id = db.Column(db.Integer, db.ForeignKey(\"users.id\"))\n    billing_user_id = db.Column(db.Integer, db.ForeignKey(\"users.id\"))\n\n    # Relationships\n    post = db.relationship(\n        \"Post\",\n        backref=\"processing_jobs\",\n        primaryjoin=\"ProcessingJob.post_guid == Post.guid\",\n        foreign_keys=[post_guid],\n    )\n    run = db.relationship(\"JobsManagerRun\", back_populates=\"processing_jobs\")\n    requested_by_user = db.relationship(\n        \"User\",\n        foreign_keys=[requested_by_user_id],\n        backref=db.backref(\"requested_jobs\", lazy=\"dynamic\"),\n    )\n    billing_user = db.relationship(\n        \"User\",\n        foreign_keys=[billing_user_id],\n        backref=db.backref(\"billed_jobs\", lazy=\"dynamic\"),\n    )\n\n    def __repr__(self) -> str:\n        return f\"<ProcessingJob {self.id} Post:{self.post_guid} Status:{self.status} Step:{self.current_step}/{self.total_steps}>\"\n\n\nclass UserFeed(db.Model):  # type: ignore[name-defined, misc]\n    __tablename__ = \"feed_supporter\"\n\n    id = db.Column(db.Integer, primary_key=True, autoincrement=True)\n    feed_id = db.Column(db.Integer, db.ForeignKey(\"feed.id\"), nullable=False)\n    user_id = db.Column(db.Integer, db.ForeignKey(\"users.id\"), nullable=False)\n    created_at = db.Column(db.DateTime, default=datetime.utcnow, nullable=False)\n\n    __table_args__ = (\n        db.UniqueConstraint(\"feed_id\", \"user_id\", name=\"uq_feed_supporter_feed_user\"),\n    )\n\n    feed = db.relationship(\"Feed\", 
back_populates=\"user_feeds\")\n    user = db.relationship(\"User\", back_populates=\"user_feeds\")\n\n    def __repr__(self) -> str:\n        return f\"<UserFeed feed={self.feed_id} user={self.user_id}>\"\n\n\n# ----- Application Settings (Singleton Tables) -----\n\n\nclass LLMSettings(db.Model):  # type: ignore[name-defined, misc]\n    __tablename__ = \"llm_settings\"\n\n    id = db.Column(db.Integer, primary_key=True, default=1)\n    llm_api_key = db.Column(db.Text, nullable=True)\n    llm_model = db.Column(db.Text, nullable=False, default=DEFAULTS.LLM_DEFAULT_MODEL)\n    openai_base_url = db.Column(db.Text, nullable=True)\n    openai_timeout = db.Column(\n        db.Integer, nullable=False, default=DEFAULTS.OPENAI_DEFAULT_TIMEOUT_SEC\n    )\n    openai_max_tokens = db.Column(\n        db.Integer, nullable=False, default=DEFAULTS.OPENAI_DEFAULT_MAX_TOKENS\n    )\n    llm_max_concurrent_calls = db.Column(\n        db.Integer, nullable=False, default=DEFAULTS.LLM_DEFAULT_MAX_CONCURRENT_CALLS\n    )\n    llm_max_retry_attempts = db.Column(\n        db.Integer, nullable=False, default=DEFAULTS.LLM_DEFAULT_MAX_RETRY_ATTEMPTS\n    )\n    llm_max_input_tokens_per_call = db.Column(db.Integer, nullable=True)\n    llm_enable_token_rate_limiting = db.Column(\n        db.Boolean, nullable=False, default=DEFAULTS.LLM_ENABLE_TOKEN_RATE_LIMITING\n    )\n    llm_max_input_tokens_per_minute = db.Column(db.Integer, nullable=True)\n    enable_boundary_refinement = db.Column(\n        db.Boolean, nullable=False, default=DEFAULTS.ENABLE_BOUNDARY_REFINEMENT\n    )\n    enable_word_level_boundary_refinder = db.Column(\n        db.Boolean,\n        nullable=False,\n        default=DEFAULTS.ENABLE_WORD_LEVEL_BOUNDARY_REFINDER,\n    )\n\n    created_at = db.Column(db.DateTime, nullable=False, default=datetime.utcnow)\n    updated_at = db.Column(db.DateTime, nullable=False, default=datetime.utcnow)\n\n\nclass WhisperSettings(db.Model):  # type: ignore[name-defined, misc]\n    
__tablename__ = \"whisper_settings\"\n\n    id = db.Column(db.Integer, primary_key=True, default=1)\n    whisper_type = db.Column(\n        db.Text, nullable=False, default=DEFAULTS.WHISPER_DEFAULT_TYPE\n    )  # local|remote|groq|test\n\n    # Local\n    local_model = db.Column(\n        db.Text, nullable=False, default=DEFAULTS.WHISPER_LOCAL_MODEL\n    )\n\n    # Remote\n    remote_model = db.Column(\n        db.Text, nullable=False, default=DEFAULTS.WHISPER_REMOTE_MODEL\n    )\n    remote_api_key = db.Column(db.Text, nullable=True)\n    remote_base_url = db.Column(\n        db.Text, nullable=False, default=DEFAULTS.WHISPER_REMOTE_BASE_URL\n    )\n    remote_language = db.Column(\n        db.Text, nullable=False, default=DEFAULTS.WHISPER_REMOTE_LANGUAGE\n    )\n    remote_timeout_sec = db.Column(\n        db.Integer, nullable=False, default=DEFAULTS.WHISPER_REMOTE_TIMEOUT_SEC\n    )\n    remote_chunksize_mb = db.Column(\n        db.Integer, nullable=False, default=DEFAULTS.WHISPER_REMOTE_CHUNKSIZE_MB\n    )\n\n    # Groq\n    groq_api_key = db.Column(db.Text, nullable=True)\n    groq_model = db.Column(db.Text, nullable=False, default=DEFAULTS.WHISPER_GROQ_MODEL)\n    groq_language = db.Column(\n        db.Text, nullable=False, default=DEFAULTS.WHISPER_GROQ_LANGUAGE\n    )\n    groq_max_retries = db.Column(\n        db.Integer, nullable=False, default=DEFAULTS.WHISPER_GROQ_MAX_RETRIES\n    )\n\n    created_at = db.Column(db.DateTime, nullable=False, default=datetime.utcnow)\n    updated_at = db.Column(db.DateTime, nullable=False, default=datetime.utcnow)\n\n\nclass ProcessingSettings(db.Model):  # type: ignore[name-defined, misc]\n    __tablename__ = \"processing_settings\"\n\n    id = db.Column(db.Integer, primary_key=True, default=1)\n    # Deprecated: paths are now hardcoded; keep columns for migration compatibility\n    system_prompt_path = db.Column(\n        db.Text, nullable=False, default=\"src/system_prompt.txt\"\n    )\n    user_prompt_template_path = 
db.Column(\n        db.Text, nullable=False, default=\"src/user_prompt.jinja\"\n    )\n    num_segments_to_input_to_prompt = db.Column(\n        db.Integer,\n        nullable=False,\n        default=DEFAULTS.PROCESSING_NUM_SEGMENTS_TO_INPUT_TO_PROMPT,\n    )\n\n    created_at = db.Column(db.DateTime, nullable=False, default=datetime.utcnow)\n    updated_at = db.Column(db.DateTime, nullable=False, default=datetime.utcnow)\n\n\nclass OutputSettings(db.Model):  # type: ignore[name-defined, misc]\n    __tablename__ = \"output_settings\"\n\n    id = db.Column(db.Integer, primary_key=True, default=1)\n    fade_ms = db.Column(db.Integer, nullable=False, default=DEFAULTS.OUTPUT_FADE_MS)\n    min_ad_segement_separation_seconds = db.Column(\n        db.Integer,\n        nullable=False,\n        default=DEFAULTS.OUTPUT_MIN_AD_SEGMENT_SEPARATION_SECONDS,\n    )\n    min_ad_segment_length_seconds = db.Column(\n        db.Integer,\n        nullable=False,\n        default=DEFAULTS.OUTPUT_MIN_AD_SEGMENT_LENGTH_SECONDS,\n    )\n    min_confidence = db.Column(\n        db.Float, nullable=False, default=DEFAULTS.OUTPUT_MIN_CONFIDENCE\n    )\n\n    created_at = db.Column(db.DateTime, nullable=False, default=datetime.utcnow)\n    updated_at = db.Column(db.DateTime, nullable=False, default=datetime.utcnow)\n\n\nclass AppSettings(db.Model):  # type: ignore[name-defined, misc]\n    __tablename__ = \"app_settings\"\n\n    id = db.Column(db.Integer, primary_key=True, default=1)\n    background_update_interval_minute = db.Column(\n        db.Integer, nullable=True\n    )  # intentionally nullable; default applied in config store/runtime\n    automatically_whitelist_new_episodes = db.Column(\n        db.Boolean,\n        nullable=False,\n        default=DEFAULTS.APP_AUTOMATICALLY_WHITELIST_NEW_EPISODES,\n    )\n    post_cleanup_retention_days = db.Column(\n        db.Integer,\n        nullable=True,\n        default=DEFAULTS.APP_POST_CLEANUP_RETENTION_DAYS,\n    )\n    
number_of_episodes_to_whitelist_from_archive_of_new_feed = db.Column(\n        db.Integer,\n        nullable=False,\n        default=DEFAULTS.APP_NUM_EPISODES_TO_WHITELIST_FROM_ARCHIVE_OF_NEW_FEED,\n    )\n    enable_public_landing_page = db.Column(\n        db.Boolean,\n        nullable=False,\n        default=DEFAULTS.APP_ENABLE_PUBLIC_LANDING_PAGE,\n    )\n    user_limit_total = db.Column(db.Integer, nullable=True)\n    autoprocess_on_download = db.Column(\n        db.Boolean,\n        nullable=False,\n        default=DEFAULTS.APP_AUTOPROCESS_ON_DOWNLOAD,\n    )\n\n    # Hash of the environment variables used to seed configuration.\n    # Used to detect changes in environment variables between restarts.\n    env_config_hash = db.Column(db.String(64), nullable=True)\n\n    created_at = db.Column(db.DateTime, nullable=False, default=datetime.utcnow)\n    updated_at = db.Column(db.DateTime, nullable=False, default=datetime.utcnow)\n\n\nclass DiscordSettings(db.Model):  # type: ignore[name-defined, misc]\n    __tablename__ = \"discord_settings\"\n\n    id = db.Column(db.Integer, primary_key=True, default=1)\n    client_id = db.Column(db.Text, nullable=True)\n    client_secret = db.Column(db.Text, nullable=True)\n    redirect_uri = db.Column(db.Text, nullable=True)\n    guild_ids = db.Column(db.Text, nullable=True)  # Comma-separated list\n    allow_registration = db.Column(db.Boolean, nullable=False, default=True)\n\n    created_at = db.Column(db.DateTime, nullable=False, default=datetime.utcnow)\n    updated_at = db.Column(db.DateTime, nullable=False, default=datetime.utcnow)\n"
  },
  {
    "path": "src/app/post_cleanup.py",
    "content": "\"\"\"Cleanup job for pruning processed posts and associated artifacts.\"\"\"\n\nfrom __future__ import annotations\n\nimport logging\nfrom datetime import datetime, timedelta\nfrom pathlib import Path\nfrom typing import Dict, Optional, Sequence, Tuple\n\nfrom sqlalchemy import func\nfrom sqlalchemy.orm import Query\n\nfrom app.db_guard import db_guard, reset_session\nfrom app.extensions import db, scheduler\nfrom app.models import Post, ProcessingJob\nfrom app.runtime_config import config as runtime_config\nfrom app.writer.client import writer_client\nfrom shared import defaults as DEFAULTS\n\nlogger = logging.getLogger(\"global_logger\")\n\n\ndef _build_cleanup_query(\n    retention_days: Optional[int],\n) -> Tuple[Optional[Query[\"Post\"]], Optional[datetime]]:\n    \"\"\"Construct the base query for posts eligible for cleanup.\"\"\"\n    if retention_days is None or retention_days <= 0:\n        return None, None\n\n    cutoff = datetime.utcnow() - timedelta(days=retention_days)\n\n    active_jobs_exists = (\n        db.session.query(ProcessingJob.id)\n        .filter(ProcessingJob.post_guid == Post.guid)\n        .filter(ProcessingJob.status.in_([\"pending\", \"running\"]))\n        .exists()\n    )\n\n    posts_query = Post.query.filter(Post.processed_audio_path.isnot(None)).filter(\n        ~active_jobs_exists\n    )\n\n    return posts_query, cutoff\n\n\ndef count_cleanup_candidates(\n    retention_days: Optional[int],\n) -> Tuple[int, Optional[datetime]]:\n    \"\"\"Return how many posts would currently be removed along with the cutoff.\"\"\"\n    posts_query, cutoff = _build_cleanup_query(retention_days)\n    if posts_query is None or cutoff is None:\n        return 0, None\n\n    posts = posts_query.all()\n    latest_completed = _load_latest_completed_map([post.guid for post in posts])\n    count = sum(\n        1\n        for post in posts\n        if _processed_timestamp_before_cutoff(post, cutoff, latest_completed)\n    )\n    return 
count, cutoff\n\n\ndef cleanup_processed_posts(retention_days: Optional[int]) -> int:\n    \"\"\"Prune processed posts older than the retention window.\n\n    Posts qualify when their processed audio artifact (or, if missing, the\n    latest completed job) is older than the retention window. Eligible posts\n    are un-whitelisted, artifacts are removed, and dependent rows are deleted,\n    but the post row is retained to prevent reprocessing. Returns the number of\n    posts that were cleaned. Callers must ensure an application context is\n    active.\n    \"\"\"\n    with db_guard(\"cleanup_processed_posts\", db.session, logger):\n        posts_query, cutoff = _build_cleanup_query(retention_days)\n        if posts_query is None or cutoff is None:\n            return 0\n\n        posts: Sequence[Post] = posts_query.all()\n        if not posts:\n            return 0\n\n        latest_completed = _load_latest_completed_map([post.guid for post in posts])\n\n        removed_posts = 0\n\n        for post in posts:\n            if not _processed_timestamp_before_cutoff(post, cutoff, latest_completed):\n                continue\n\n            logger.info(\n                \"Cleanup removing post '%s' (guid=%s) completed before %s\",\n                post.title,\n                post.guid,\n                cutoff.isoformat(),\n            )\n            _remove_associated_files(post)\n            try:\n                writer_client.action(\n                    \"cleanup_processed_post\", {\"post_id\": post.id}, wait=True\n                )\n            except Exception as exc:  # pylint: disable=broad-except\n                logger.error(\n                    \"Cleanup failed for post %s (guid=%s): %s\",\n                    post.id,\n                    post.guid,\n                    exc,\n                    exc_info=True,\n                )\n            else:\n                # Count only posts whose database cleanup actually succeeded.\n                removed_posts += 1\n\n        logger.info(\n            \"Cleanup job removed %s posts\",\n            
removed_posts,\n        )\n        return removed_posts\n\n\ndef scheduled_cleanup_processed_posts() -> None:\n    \"\"\"Entry-point for APScheduler.\"\"\"\n    retention = getattr(\n        runtime_config,\n        \"post_cleanup_retention_days\",\n        DEFAULTS.APP_POST_CLEANUP_RETENTION_DAYS,\n    )\n    if scheduler.app is None:\n        logger.warning(\"Cleanup skipped: scheduler has no associated app.\")\n        return\n\n    try:\n        with scheduler.app.app_context():\n            cleanup_processed_posts(retention)\n    except Exception as exc:  # pylint: disable=broad-except\n        logger.error(\"Scheduled cleanup failed: %s\", exc, exc_info=True)\n        reset_session(db.session, logger, \"scheduled_cleanup_processed_posts\", exc)\n\n\ndef _remove_associated_files(post: Post) -> None:\n    \"\"\"Delete processed and unprocessed audio files for a post.\"\"\"\n    for path_str in [post.unprocessed_audio_path, post.processed_audio_path]:\n        if not path_str:\n            continue\n        try:\n            file_path = Path(path_str)\n        except Exception:  # pylint: disable=broad-except\n            logger.warning(\"Cleanup: invalid path for post %s: %s\", post.guid, path_str)\n            continue\n        if not file_path.exists():\n            continue\n        try:\n            file_path.unlink()\n            logger.info(\"Cleanup deleted file: %s\", file_path)\n        except OSError as exc:\n            logger.warning(\"Cleanup unable to delete %s: %s\", file_path, exc)\n\n\ndef _load_latest_completed_map(\n    post_guids: Sequence[str],\n) -> Dict[str, Optional[datetime]]:\n    if not post_guids:\n        return {}\n\n    rows = (\n        db.session.query(\n            ProcessingJob.post_guid,\n            func.max(ProcessingJob.completed_at),\n        )\n        .filter(ProcessingJob.post_guid.in_(post_guids))\n        .group_by(ProcessingJob.post_guid)\n        .all()\n    )\n    return dict(rows)\n\n\ndef 
_processed_timestamp_before_cutoff(\n    post: Post, cutoff: datetime, latest_completed: Dict[str, Optional[datetime]]\n) -> bool:\n    file_timestamp = _get_processed_file_timestamp(post)\n    job_timestamp = latest_completed.get(post.guid)\n\n    candidate: Optional[datetime]\n    if file_timestamp and job_timestamp:\n        candidate = min(file_timestamp, job_timestamp)\n    else:\n        candidate = file_timestamp or job_timestamp\n\n    return bool(candidate and candidate < cutoff)\n\n\ndef _get_processed_file_timestamp(post: Post) -> Optional[datetime]:\n    if not post.processed_audio_path:\n        return None\n\n    try:\n        file_path = Path(post.processed_audio_path)\n    except Exception:  # pylint: disable=broad-except\n        logger.warning(\n            \"Cleanup: invalid processed path for post %s: %s\",\n            post.guid,\n            post.processed_audio_path,\n        )\n        return None\n\n    if not file_path.exists():\n        return None\n\n    try:\n        mtime = file_path.stat().st_mtime\n    except OSError as exc:\n        logger.warning(\"Cleanup: unable to stat processed file %s: %s\", file_path, exc)\n        return None\n\n    return datetime.utcfromtimestamp(mtime)\n"
  },
  {
    "path": "src/app/posts.py",
    "content": "import logging\nfrom pathlib import Path\nfrom typing import List, Optional\n\nfrom app.models import Post\nfrom app.writer.client import writer_client\nfrom podcast_processor.podcast_downloader import get_and_make_download_path\n\nlogger = logging.getLogger(\"global_logger\")\n\n\ndef _collect_processed_paths(post: Post) -> List[Path]:\n    \"\"\"Collect all possible processed audio paths to check for a post.\"\"\"\n    import re\n\n    from podcast_processor.podcast_downloader import sanitize_title\n    from shared.processing_paths import get_srv_root, paths_from_unprocessed_path\n\n    processed_paths_to_check: List[Path] = []\n\n    # 1. Check database path first (most reliable if set)\n    if post.processed_audio_path:\n        processed_paths_to_check.append(Path(post.processed_audio_path))\n\n    # 2. Compute path using paths_from_unprocessed_path (matches processor logic)\n    if post.unprocessed_audio_path and post.feed and post.feed.title:\n        processing_paths = paths_from_unprocessed_path(\n            post.unprocessed_audio_path, post.feed.title\n        )\n        if processing_paths:\n            processed_paths_to_check.append(processing_paths.post_processed_audio_path)\n\n    # 3. Fallback: compute expected path from post/feed titles\n    if post.feed and post.feed.title and post.title:\n        safe_feed_title = sanitize_title(post.feed.title)\n        safe_post_title = sanitize_title(post.title)\n        processed_paths_to_check.append(\n            get_srv_root() / safe_feed_title / f\"{safe_post_title}.mp3\"\n        )\n\n        # 4. 
Also check with underscore-style sanitization\n        sanitized_feed_title = re.sub(r\"[^a-zA-Z0-9\\s_.-]\", \"\", post.feed.title).strip()\n        sanitized_feed_title = sanitized_feed_title.rstrip(\".\")\n        sanitized_feed_title = re.sub(r\"\\s+\", \"_\", sanitized_feed_title)\n        processed_paths_to_check.append(\n            get_srv_root() / sanitized_feed_title / f\"{safe_post_title}.mp3\"\n        )\n\n    return processed_paths_to_check\n\n\ndef _dedupe_and_find_existing(paths: List[Path]) -> tuple[List[Path], Optional[Path]]:\n    \"\"\"Deduplicate paths and find the first existing one.\"\"\"\n    seen: set[Path] = set()\n    unique_paths: List[Path] = []\n    for p in paths:\n        resolved = p.resolve()\n        if resolved not in seen:\n            seen.add(resolved)\n            unique_paths.append(resolved)\n\n    existing_path: Optional[Path] = None\n    for p in unique_paths:\n        if p.exists():\n            existing_path = p\n            break\n\n    return unique_paths, existing_path\n\n\ndef _remove_file_if_exists(path: Optional[Path], file_type: str, post_id: int) -> None:\n    \"\"\"Remove a file if it exists and log the result.\"\"\"\n    if not path:\n        logger.debug(f\"{file_type} path is None for post {post_id}.\")\n        return\n\n    if not path.exists():\n        logger.debug(f\"No {file_type} file to remove for post {post_id}.\")\n        return\n\n    try:\n        path.unlink()\n        logger.info(f\"Removed {file_type} file: {path}\")\n    except OSError as e:\n        logger.error(f\"Failed to remove {file_type} file {path}: {e}\")\n\n\ndef remove_associated_files(post: Post) -> None:\n    \"\"\"\n    Remove unprocessed and processed audio files associated with a post.\n    Computes paths from post/feed metadata to ensure files are found even\n    if database paths are already cleared.\n\n    We check multiple possible locations for processed audio because the path\n    calculation has varied over time and 
between different code paths.\n    \"\"\"\n    try:\n        # Collect and find processed audio path\n        processed_paths = _collect_processed_paths(post)\n        unique_paths, processed_abs_path = _dedupe_and_find_existing(processed_paths)\n\n        # Compute expected unprocessed audio path\n        unprocessed_abs_path: Optional[Path] = None\n        if post.title:\n            unprocessed_path = get_and_make_download_path(post.title)\n            if unprocessed_path:\n                unprocessed_abs_path = Path(unprocessed_path).resolve()\n\n        # Fallback: if we couldn't find a processed path, try using the stored path directly\n        if processed_abs_path is None and post.processed_audio_path:\n            processed_abs_path = Path(post.processed_audio_path).resolve()\n\n        # Remove audio files\n        _remove_file_if_exists(unprocessed_abs_path, \"unprocessed audio\", post.id)\n\n        if processed_abs_path:\n            _remove_file_if_exists(processed_abs_path, \"processed audio\", post.id)\n        elif unique_paths:\n            logger.debug(\n                f\"No processed audio file to remove for post {post.id}. 
\"\n                f\"Checked paths: {[str(p) for p in unique_paths]}\"\n            )\n        else:\n            logger.debug(\n                f\"Could not determine processed audio path for post {post.id}.\"\n            )\n\n    except Exception as e:  # pylint: disable=broad-except\n        logger.error(\n            f\"Unexpected error in remove_associated_files for post {post.id}: {e}\",\n            exc_info=True,\n        )\n\n\ndef clear_post_processing_data(post: Post) -> None:\n    \"\"\"\n    Clear all processing data for a post including:\n    - Audio files (unprocessed and processed)\n    - Database entries (transcript segments, identifications, model calls, processing jobs)\n    - Reset relevant post fields\n    \"\"\"\n    try:\n        logger.info(\n            f\"Starting to clear processing data for post: {post.title} (ID: {post.id})\"\n        )\n\n        # Remove audio files first\n        remove_associated_files(post)\n\n        writer_client.action(\n            \"clear_post_processing_data\", {\"post_id\": post.id}, wait=True\n        )\n\n        logger.info(\n            f\"Successfully cleared all processing data for post: {post.title} (ID: {post.id})\"\n        )\n\n    except Exception as e:\n        logger.error(\n            f\"Error clearing processing data for post {post.id}: {e}\",\n            exc_info=True,\n        )\n        raise PostException(f\"Failed to clear processing data: {str(e)}\") from e\n\n\nclass PostException(Exception):\n    pass\n"
  },
  {
    "path": "src/app/processor.py",
    "content": "from app.runtime_config import config\nfrom podcast_processor.podcast_processor import PodcastProcessor\n\n\nclass ProcessorSingleton:\n    \"\"\"Singleton class to manage the PodcastProcessor instance.\"\"\"\n\n    _instance: PodcastProcessor | None = None\n\n    @classmethod\n    def get_instance(cls) -> PodcastProcessor:\n        \"\"\"Get or create the PodcastProcessor instance.\"\"\"\n        if cls._instance is None:\n            cls._instance = PodcastProcessor(config)\n        return cls._instance\n\n    @classmethod\n    def reset_instance(cls) -> None:\n        \"\"\"Reset the singleton instance (useful for testing).\"\"\"\n        cls._instance = None\n\n\ndef get_processor() -> PodcastProcessor:\n    \"\"\"Get the PodcastProcessor instance.\"\"\"\n    return ProcessorSingleton.get_instance()\n"
  },
  {
    "path": "src/app/routes/__init__.py",
    "content": "from flask import Flask\n\nfrom .auth_routes import auth_bp\nfrom .billing_routes import billing_bp\nfrom .config_routes import config_bp\nfrom .discord_routes import discord_bp\nfrom .feed_routes import feed_bp\nfrom .jobs_routes import jobs_bp\nfrom .main_routes import main_bp\nfrom .post_routes import post_bp\n\n\ndef register_routes(app: Flask) -> None:\n    \"\"\"Register all route blueprints with the Flask app.\"\"\"\n    app.register_blueprint(main_bp)\n    app.register_blueprint(feed_bp)\n    app.register_blueprint(post_bp)\n    app.register_blueprint(config_bp)\n    app.register_blueprint(jobs_bp)\n    app.register_blueprint(auth_bp)\n    app.register_blueprint(billing_bp)\n    app.register_blueprint(discord_bp)\n"
  },
  {
    "path": "src/app/routes/auth_routes.py",
    "content": "from __future__ import annotations\n\nimport logging\nfrom typing import cast\n\nfrom flask import Blueprint, Response, current_app, g, jsonify, request, session\n\nfrom app.auth.service import (\n    AuthServiceError,\n    DuplicateUserError,\n    InvalidCredentialsError,\n    LastAdminRemovalError,\n    PasswordValidationError,\n    UserLimitExceededError,\n    authenticate,\n    change_password,\n    create_user,\n    delete_user,\n    list_users,\n    set_manual_feed_allowance,\n    set_role,\n    update_password,\n    update_user_last_active,\n)\nfrom app.auth.state import failure_rate_limiter\nfrom app.extensions import db\nfrom app.models import User\nfrom app.runtime_config import config as runtime_config\n\nlogger = logging.getLogger(\"global_logger\")\n\n\nauth_bp = Blueprint(\"auth\", __name__)\n\nRouteResult = Response | tuple[Response, int] | tuple[Response, int, dict[str, str]]\n\nSESSION_USER_KEY = \"user_id\"\n\n\ndef _auth_enabled() -> bool:\n    settings = current_app.config.get(\"AUTH_SETTINGS\")\n    return bool(settings and settings.require_auth)\n\n\n@auth_bp.route(\"/api/auth/status\", methods=[\"GET\"])\ndef auth_status() -> Response:\n    landing_enabled = bool(getattr(runtime_config, \"enable_public_landing_page\", False))\n    return jsonify(\n        {\"require_auth\": _auth_enabled(), \"landing_page_enabled\": landing_enabled}\n    )\n\n\n@auth_bp.route(\"/api/auth/login\", methods=[\"POST\"])\ndef login() -> RouteResult:\n    if not _auth_enabled():\n        return jsonify({\"error\": \"Authentication is disabled.\"}), 404\n\n    payload = request.get_json(silent=True) or {}\n    username = (payload.get(\"username\") or \"\").strip()\n    password = payload.get(\"password\") or \"\"\n\n    if not username or not password:\n        return jsonify({\"error\": \"Username and password are required.\"}), 400\n\n    client_identifier = request.remote_addr or \"unknown\"\n    retry_after = 
failure_rate_limiter.retry_after(client_identifier)\n    if retry_after:\n        return (\n            jsonify({\"error\": \"Too many failed attempts.\", \"retry_after\": retry_after}),\n            429,\n            {\"Retry-After\": str(retry_after)},\n        )\n\n    authenticated = authenticate(username, password)\n    if authenticated is None:\n        backoff = failure_rate_limiter.register_failure(client_identifier)\n        response_headers: dict[str, str] = {}\n        if backoff:\n            response_headers[\"Retry-After\"] = str(backoff)\n        response = jsonify({\"error\": \"Invalid username or password.\"})\n        if response_headers:\n            return response, 401, response_headers\n        return response, 401\n\n    failure_rate_limiter.register_success(client_identifier)\n    session.clear()\n    session[SESSION_USER_KEY] = authenticated.id\n    session.permanent = True\n    update_user_last_active(authenticated.id)\n\n    # Calculate effective allowance for frontend display\n    allowance = getattr(authenticated, \"manual_feed_allowance\", None)\n    if allowance is None:\n        allowance = getattr(authenticated, \"feed_allowance\", 0)\n\n    return jsonify(\n        {\n            \"user\": {\n                \"id\": authenticated.id,\n                \"username\": authenticated.username,\n                \"role\": authenticated.role,\n                \"feed_allowance\": allowance,\n                \"feed_subscription_status\": getattr(\n                    authenticated, \"feed_subscription_status\", \"inactive\"\n                ),\n            }\n        }\n    )\n\n\n@auth_bp.route(\"/api/auth/logout\", methods=[\"POST\"])\ndef logout() -> RouteResult:\n    if not _auth_enabled():\n        return jsonify({\"error\": \"Authentication is disabled.\"}), 404\n\n    if getattr(g, \"current_user\", None) is None:\n        session.clear()\n        return jsonify({\"error\": \"Authentication required.\"}), 401\n\n    session.clear()\n   
 return Response(status=204)\n\n\n@auth_bp.route(\"/api/auth/me\", methods=[\"GET\"])\ndef auth_me() -> RouteResult:\n    if not _auth_enabled():\n        return jsonify({\"error\": \"Authentication is disabled.\"}), 404\n\n    user = _require_authenticated_user()\n    if user is None:\n        return _unauthorized_response()\n\n    # Calculate effective allowance for frontend display\n    allowance = getattr(user, \"manual_feed_allowance\", None)\n    if allowance is None:\n        allowance = getattr(user, \"feed_allowance\", 0)\n\n    return jsonify(\n        {\n            \"user\": {\n                \"id\": user.id,\n                \"username\": user.username,\n                \"role\": user.role,\n                \"feed_allowance\": allowance,\n                \"feed_subscription_status\": getattr(\n                    user, \"feed_subscription_status\", \"inactive\"\n                ),\n            }\n        }\n    )\n\n\n@auth_bp.route(\"/api/auth/change-password\", methods=[\"POST\"])\ndef change_password_route() -> RouteResult:\n    if not _auth_enabled():\n        return jsonify({\"error\": \"Authentication is disabled.\"}), 404\n\n    user = _require_authenticated_user()\n    if user is None:\n        return _unauthorized_response()\n\n    payload = request.get_json(silent=True) or {}\n    current_password = payload.get(\"current_password\") or \"\"\n    new_password = payload.get(\"new_password\") or \"\"\n\n    if not current_password or not new_password:\n        return (\n            jsonify({\"error\": \"Current and new passwords are required.\"}),\n            400,\n        )\n\n    try:\n        change_password(user, current_password, new_password)\n    except InvalidCredentialsError as exc:\n        return jsonify({\"error\": str(exc)}), 401\n    except PasswordValidationError as exc:\n        return jsonify({\"error\": str(exc)}), 400\n    except AuthServiceError as exc:  # fallback\n        logger.error(\"Password change failed: %s\", 
exc)\n        return jsonify({\"error\": \"Unable to change password.\"}), 500\n\n    return jsonify({\"status\": \"ok\"})\n\n\n@auth_bp.route(\"/api/auth/users\", methods=[\"GET\"])\ndef list_users_route() -> RouteResult:\n    if not _auth_enabled():\n        return jsonify({\"error\": \"Authentication is disabled.\"}), 404\n\n    user = _require_authenticated_user()\n    if user is None:\n        return _unauthorized_response()\n\n    if user.role != \"admin\":\n        return jsonify({\"error\": \"Admin privileges required.\"}), 403\n\n    users = list_users()\n    return jsonify(\n        {\n            \"users\": [\n                {\n                    \"id\": u.id,\n                    \"username\": u.username,\n                    \"role\": u.role,\n                    \"created_at\": u.created_at.isoformat(),\n                    \"updated_at\": u.updated_at.isoformat(),\n                    \"last_active\": u.last_active.isoformat() if u.last_active else None,\n                    \"feed_allowance\": getattr(u, \"feed_allowance\", 0),\n                    \"manual_feed_allowance\": getattr(u, \"manual_feed_allowance\", None),\n                    \"feed_subscription_status\": getattr(\n                        u, \"feed_subscription_status\", \"inactive\"\n                    ),\n                }\n                for u in users\n            ]\n        }\n    )\n\n\n@auth_bp.route(\"/api/auth/users\", methods=[\"POST\"])\ndef create_user_route() -> RouteResult:\n    if not _auth_enabled():\n        return jsonify({\"error\": \"Authentication is disabled.\"}), 404\n\n    user = _require_authenticated_user()\n    if user is None:\n        return _unauthorized_response()\n    if user.role != \"admin\":\n        return jsonify({\"error\": \"Admin privileges required.\"}), 403\n\n    payload = request.get_json(silent=True) or {}\n    username = (payload.get(\"username\") or \"\").strip()\n    password = payload.get(\"password\") or \"\"\n    role = 
(payload.get(\"role\") or \"user\").strip()\n\n    if not username or not password:\n        return jsonify({\"error\": \"Username and password are required.\"}), 400\n\n    try:\n        new_user = create_user(username, password, role)\n    except (\n        PasswordValidationError,\n        DuplicateUserError,\n        UserLimitExceededError,\n        AuthServiceError,\n    ) as exc:\n        status = 409 if isinstance(exc, DuplicateUserError) else 400\n        return jsonify({\"error\": str(exc)}), status\n\n    return (\n        jsonify(\n            {\n                \"user\": {\n                    \"id\": new_user.id,\n                    \"username\": new_user.username,\n                    \"role\": new_user.role,\n                    \"created_at\": new_user.created_at.isoformat(),\n                    \"updated_at\": new_user.updated_at.isoformat(),\n                }\n            }\n        ),\n        201,\n    )\n\n\n@auth_bp.route(\"/api/auth/users/<string:username>\", methods=[\"PATCH\"])\ndef update_user_route(username: str) -> RouteResult:\n    if not _auth_enabled():\n        return jsonify({\"error\": \"Authentication is disabled.\"}), 404\n\n    acting_user = _require_authenticated_user()\n    if acting_user is None:\n        return _unauthorized_response()\n\n    if acting_user.role != \"admin\":\n        return jsonify({\"error\": \"Admin privileges required.\"}), 403\n\n    target = User.query.filter_by(username=username.lower()).first()\n    if target is None:\n        return jsonify({\"error\": \"User not found.\"}), 404\n\n    payload = request.get_json(silent=True) or {}\n    role = payload.get(\"role\")\n    new_password = payload.get(\"password\")\n    manual_feed_allowance = payload.get(\"manual_feed_allowance\")\n\n    try:\n        if role is not None:\n            set_role(target, role)\n        if new_password:\n            update_password(target, new_password)\n        if \"manual_feed_allowance\" in payload:\n            
set_manual_feed_allowance(target, manual_feed_allowance)\n        return jsonify({\"status\": \"ok\"})\n    except (PasswordValidationError, LastAdminRemovalError, AuthServiceError) as exc:\n        status_code = 400\n        return jsonify({\"error\": str(exc)}), status_code\n\n\n@auth_bp.route(\"/api/auth/users/<string:username>\", methods=[\"DELETE\"])\ndef delete_user_route(username: str) -> RouteResult:\n    if not _auth_enabled():\n        return jsonify({\"error\": \"Authentication is disabled.\"}), 404\n\n    acting_user = _require_authenticated_user()\n    if acting_user is None:\n        return _unauthorized_response()\n    if acting_user.role != \"admin\":\n        return jsonify({\"error\": \"Admin privileges required.\"}), 403\n\n    target = User.query.filter_by(username=username.lower()).first()\n    if target is None:\n        return jsonify({\"error\": \"User not found.\"}), 404\n\n    try:\n        delete_user(target)\n    except LastAdminRemovalError as exc:\n        return jsonify({\"error\": str(exc)}), 400\n\n    return jsonify({\"status\": \"ok\"})\n\n\ndef _require_authenticated_user() -> User | None:\n    if not _auth_enabled():\n        return None\n\n    current = getattr(g, \"current_user\", None)\n    if current is None:\n        return None\n\n    return cast(User | None, db.session.get(User, current.id))\n\n\ndef _unauthorized_response() -> RouteResult:\n    if not _auth_enabled():\n        return jsonify({\"error\": \"Authentication is disabled.\"}), 404\n\n    return jsonify({\"error\": \"Authentication required.\"}), 401\n"
  },
  {
    "path": "src/app/routes/billing_routes.py",
    "content": "import logging\nimport os\nfrom typing import Any, Optional\n\nfrom flask import Blueprint, jsonify, request\n\nfrom app.extensions import db\nfrom app.models import User, UserFeed\nfrom app.writer.client import writer_client\n\nfrom .auth_routes import _require_authenticated_user\n\nlogger = logging.getLogger(\"global_logger\")\n\nbilling_bp = Blueprint(\"billing\", __name__)\n\n\ndef _get_stripe_client() -> tuple[Optional[Any], Optional[str]]:\n    secret = os.getenv(\"STRIPE_SECRET_KEY\")\n    if not secret:\n        return None, \"Stripe secret key missing\"\n    try:\n        import stripe\n    except ImportError:\n        return None, \"Stripe library not installed\"\n    stripe.api_key = secret\n    return stripe, None\n\n\ndef _product_id() -> Optional[str]:\n    return os.getenv(\"STRIPE_PRODUCT_ID\")\n\n\ndef _min_subscription_amount_cents() -> int:\n    \"\"\"Minimum non-zero subscription amount in cents.\n\n    Allow 0 to cancel, otherwise enforce this minimum.\n    Configurable via STRIPE_MIN_SUBSCRIPTION_AMOUNT_CENTS.\n    \"\"\"\n\n    raw = os.getenv(\"STRIPE_MIN_SUBSCRIPTION_AMOUNT_CENTS\")\n    if raw is None or raw == \"\":\n        return 100\n    try:\n        value = int(raw)\n    except ValueError:\n        logger.warning(\n            \"Invalid STRIPE_MIN_SUBSCRIPTION_AMOUNT_CENTS=%r; defaulting to 100\",\n            raw,\n        )\n        return 100\n    return max(0, value)\n\n\ndef _user_feed_usage(user: User) -> dict[str, int]:\n    feeds_in_use = UserFeed.query.filter_by(user_id=user.id).count()\n    allowance = getattr(user, \"manual_feed_allowance\", None)\n    if allowance is None:\n        allowance = getattr(user, \"feed_allowance\", 0) or 0\n    remaining = max(0, allowance - feeds_in_use)\n    return {\n        \"feed_allowance\": allowance,\n        \"feeds_in_use\": feeds_in_use,\n        \"remaining\": remaining,\n    }\n\n\n@billing_bp.route(\"/api/billing/summary\", methods=[\"GET\"])\ndef 
billing_summary() -> Any:\n    \"\"\"Return feed allowance and subscription state for the current user.\"\"\"\n    user = _require_authenticated_user()\n    if user is None:\n        logger.warning(\"Billing summary requested by unauthenticated user\")\n        return jsonify({\"error\": \"Authentication required\"}), 401\n\n    logger.info(\"Billing summary requested for user %s\", user.id)\n    usage = _user_feed_usage(user)\n    product_id = _product_id()\n    stripe_client, _ = _get_stripe_client()\n    current_amount = 0\n\n    if (\n        stripe_client is not None\n        and user.stripe_customer_id\n        and not user.stripe_subscription_id\n    ):\n        # Try to find an active subscription if we don't have one linked\n        subs = stripe_client.Subscription.list(\n            customer=user.stripe_customer_id, limit=1, status=\"active\"\n        )\n        if subs and subs.get(\"data\"):\n            sub = subs[\"data\"][0]\n            items = sub.get(\"items\", {}).get(\"data\", [])\n            # For PWYW bundle, allowance is 10 if active\n            feed_allowance = 10 if items else 0\n\n            writer_client.action(\n                \"set_user_billing_fields\",\n                {\n                    \"user_id\": user.id,\n                    \"stripe_subscription_id\": sub[\"id\"],\n                    \"feed_subscription_status\": sub[\"status\"],\n                    \"feed_allowance\": feed_allowance,\n                },\n                wait=True,\n            )\n            db.session.expire(user)\n            usage = _user_feed_usage(user)\n\n    # Fetch current price amount if subscribed\n    if (\n        stripe_client is not None\n        and user.stripe_subscription_id\n        and user.feed_subscription_status == \"active\"\n    ):\n        try:\n            sub = stripe_client.Subscription.retrieve(\n                user.stripe_subscription_id, expand=[\"items.data.price\"]\n            )\n            if sub and 
sub.get(\"items\") and sub[\"items\"][\"data\"]:\n                price_item = sub[\"items\"][\"data\"][0].get(\"price\")\n                if price_item:\n                    current_amount = price_item.get(\"unit_amount\", 0)\n        except Exception as e:\n            logger.error(\"Error fetching subscription details: %s\", e)\n\n    return jsonify(\n        {\n            \"feed_allowance\": usage[\"feed_allowance\"],\n            \"feeds_in_use\": usage[\"feeds_in_use\"],\n            \"remaining\": usage[\"remaining\"],\n            \"current_amount\": current_amount,\n            \"min_amount_cents\": _min_subscription_amount_cents(),\n            \"subscription_status\": getattr(\n                user, \"feed_subscription_status\", \"inactive\"\n            ),\n            \"stripe_subscription_id\": getattr(user, \"stripe_subscription_id\", None),\n            \"stripe_customer_id\": getattr(user, \"stripe_customer_id\", None),\n            \"product_id\": product_id,\n        }\n    )\n\n\ndef _build_return_urls() -> tuple[str, str]:\n    host = request.host_url.rstrip(\"/\")\n    success = f\"{host}/billing?checkout=success\"\n    cancel = f\"{host}/billing?checkout=cancel\"\n    return success, cancel\n\n\n@billing_bp.route(\"/api/billing/subscription\", methods=[\"POST\"])\ndef update_subscription() -> Any:  # pylint: disable=too-many-statements\n    \"\"\"Update subscription amount or create new subscription.\"\"\"\n    user = _require_authenticated_user()\n    if user is None:\n        logger.warning(\"Update subscription requested by unauthenticated user\")\n        return jsonify({\"error\": \"Authentication required\"}), 401\n\n    payload = request.get_json(silent=True) or {}\n    amount = int(payload.get(\"amount\") or 0)\n    logger.info(\"Update subscription for user %s: %s cents\", user.id, amount)\n\n    # Allow 0 to cancel, otherwise enforce configured minimum.\n    min_amount_cents = _min_subscription_amount_cents()\n    if 0 < amount < 
min_amount_cents:\n        min_amount_dollars = min_amount_cents / 100.0\n        return (\n            jsonify({\"error\": f\"Minimum amount is ${min_amount_dollars:.2f}\"}),\n            400,\n        )\n\n    stripe_client, stripe_err = _get_stripe_client()\n    product_id = _product_id()\n    if stripe_client is None or not product_id:\n        logger.error(\"Stripe not configured. err=%s\", stripe_err)\n        return (\n            jsonify(\n                {\n                    \"error\": \"STRIPE_NOT_CONFIGURED\",\n                    \"message\": \"Billing system is not configured.\",\n                }\n            ),\n            503,\n        )\n\n    try:\n        requested_subscription_id = payload.get(\"subscription_id\")\n        if (\n            requested_subscription_id\n            and not user.stripe_subscription_id\n            and stripe_client is not None\n        ):\n            # Attach known subscription id to the user if it belongs to their customer\n            sub = stripe_client.Subscription.retrieve(requested_subscription_id)\n            if sub and sub.get(\"customer\") == user.stripe_customer_id:\n                writer_client.action(\n                    \"set_user_billing_fields\",\n                    {\"user_id\": user.id, \"stripe_subscription_id\": sub[\"id\"]},\n                    wait=True,\n                )\n                db.session.expire(user)\n\n        # Ensure customer exists\n        if not user.stripe_customer_id:\n            customer = stripe_client.Customer.create(\n                name=user.username or f\"user-{user.id}\",\n                metadata={\"user_id\": user.id},\n            )\n            writer_client.action(\n                \"set_user_billing_fields\",\n                {\"user_id\": user.id, \"stripe_customer_id\": customer[\"id\"]},\n                wait=True,\n            )\n            db.session.expire(user)\n\n        # If subscription exists, update or cancel\n        if 
user.stripe_subscription_id:\n            if amount <= 0:\n                logger.info(\"Canceling subscription for user %s\", user.id)\n                stripe_client.Subscription.delete(user.stripe_subscription_id)\n                writer_client.action(\n                    \"set_user_billing_fields\",\n                    {\n                        \"user_id\": user.id,\n                        \"feed_allowance\": 0,\n                        \"feed_subscription_status\": \"canceled\",\n                        \"stripe_subscription_id\": None,\n                    },\n                    wait=True,\n                )\n                db.session.expire(user)\n                usage = _user_feed_usage(user)\n                return jsonify(\n                    {\n                        \"feed_allowance\": usage[\"feed_allowance\"],\n                        \"feeds_in_use\": usage[\"feeds_in_use\"],\n                        \"remaining\": usage[\"remaining\"],\n                        \"subscription_status\": user.feed_subscription_status,\n                        \"requires_stripe_checkout\": False,\n                        \"message\": \"Subscription canceled.\",\n                    }\n                )\n\n            # Update existing subscription with new price\n            sub = stripe_client.Subscription.retrieve(\n                user.stripe_subscription_id, expand=[\"items\"]\n            )\n            items = sub[\"items\"][\"data\"]\n            if not items:\n                return jsonify({\"error\": \"Subscription has no items\"}), 400\n            item_id = items[0][\"id\"]\n\n            updated = stripe_client.Subscription.modify(\n                user.stripe_subscription_id,\n                items=[\n                    {\n                        \"id\": item_id,\n                        \"price_data\": {\n                            \"currency\": \"usd\",\n                            \"product\": product_id,\n                            
\"unit_amount\": amount,\n                            \"recurring\": {\"interval\": \"month\"},\n                        },\n                    }\n                ],\n                proration_behavior=\"none\",\n            )\n            logger.info(\n                \"Updated subscription for user %s to amount %s\", user.id, amount\n            )\n            status = updated[\"status\"]\n            writer_client.action(\n                \"set_user_billing_fields\",\n                {\n                    \"user_id\": user.id,\n                    \"feed_allowance\": 10,  # Fixed allowance for active sub\n                    \"feed_subscription_status\": status,\n                },\n                wait=True,\n            )\n            db.session.expire(user)\n            usage = _user_feed_usage(user)\n            return jsonify(\n                {\n                    \"feed_allowance\": usage[\"feed_allowance\"],\n                    \"feeds_in_use\": usage[\"feeds_in_use\"],\n                    \"remaining\": usage[\"remaining\"],\n                    \"subscription_status\": status,\n                    \"requires_stripe_checkout\": False,\n                    \"message\": \"Subscription updated.\",\n                }\n            )\n\n        # Otherwise, create checkout session for a new subscription\n        if amount <= 0:\n            writer_client.action(\n                \"set_user_billing_fields\",\n                {\n                    \"user_id\": user.id,\n                    \"feed_allowance\": 0,\n                    \"feed_subscription_status\": \"inactive\",\n                },\n                wait=True,\n            )\n            db.session.expire(user)\n            usage = _user_feed_usage(user)\n            return jsonify(\n                {\n                    \"feed_allowance\": usage[\"feed_allowance\"],\n                    \"feeds_in_use\": usage[\"feeds_in_use\"],\n                    \"remaining\": usage[\"remaining\"],\n    
                \"subscription_status\": user.feed_subscription_status,\n                    \"requires_stripe_checkout\": False,\n                    \"message\": \"No subscription created for zero amount.\",\n                }\n            )\n        logger.info(\n            \"Creating checkout session for user %s with amount %s\", user.id, amount\n        )\n        success_url, cancel_url = _build_return_urls()\n        session = stripe_client.checkout.Session.create(\n            mode=\"subscription\",\n            customer=user.stripe_customer_id,\n            line_items=[\n                {\n                    \"price_data\": {\n                        \"currency\": \"usd\",\n                        \"product\": product_id,\n                        \"unit_amount\": amount,\n                        \"recurring\": {\"interval\": \"month\"},\n                    },\n                    \"quantity\": 1,\n                }\n            ],\n            subscription_data={\"metadata\": {\"user_id\": user.id}},\n            metadata={\"user_id\": user.id},\n            success_url=payload.get(\"success_url\") or success_url,\n            cancel_url=payload.get(\"cancel_url\") or cancel_url,\n        )\n        return jsonify(\n            {\n                \"checkout_url\": session[\"url\"],\n                \"requires_stripe_checkout\": True,\n                \"feed_allowance\": user.feed_allowance,\n                \"feeds_in_use\": _user_feed_usage(user)[\"feeds_in_use\"],\n                \"subscription_status\": user.feed_subscription_status,\n            }\n        )\n    except Exception as exc:  # pylint: disable=broad-except\n        logger.error(\"Stripe error updating subscription: %s\", exc)\n        return jsonify({\"error\": \"STRIPE_ERROR\", \"message\": str(exc)}), 502\n\n    usage = _user_feed_usage(user)\n    return jsonify(\n        {\n            \"feed_allowance\": usage[\"feed_allowance\"],\n            \"feeds_in_use\": 
usage[\"feeds_in_use\"],\n            \"remaining\": usage[\"remaining\"],\n            \"subscription_status\": user.feed_subscription_status,\n            \"requires_stripe_checkout\": True,\n            \"message\": \"Local update completed.\",\n        }\n    )\n\n\n@billing_bp.route(\"/api/billing/portal-session\", methods=[\"POST\"])\ndef billing_portal_session() -> Any:\n    user = _require_authenticated_user()\n    if user is None:\n        logger.warning(\"Billing portal session requested by unauthenticated user\")\n        return jsonify({\"error\": \"Authentication required\"}), 401\n\n    logger.info(\"Billing portal session requested for user %s\", user.id)\n    stripe_client, stripe_err = _get_stripe_client()\n    if stripe_client is None:\n        return jsonify({\"error\": \"STRIPE_NOT_CONFIGURED\", \"message\": stripe_err}), 400\n    if not user.stripe_customer_id:\n        return (\n            jsonify(\n                {\n                    \"error\": \"NO_STRIPE_CUSTOMER\",\n                    \"message\": \"No Stripe customer on file.\",\n                }\n            ),\n            400,\n        )\n\n    return_url, _ = _build_return_urls()\n    try:\n        session = stripe_client.billing_portal.Session.create(\n            customer=user.stripe_customer_id,\n            return_url=return_url,\n        )\n        return jsonify({\"url\": session[\"url\"]})\n    except Exception as exc:  # pylint: disable=broad-except\n        logger.error(\"Failed to create billing portal session: %s\", exc)\n        return jsonify({\"error\": \"STRIPE_ERROR\", \"message\": str(exc)}), 502\n\n\ndef _update_user_from_subscription(sub: Any) -> None:\n    customer_id = sub.get(\"customer\")\n    if not customer_id:\n        return\n    user = User.query.filter_by(stripe_customer_id=customer_id).first()\n    if not user:\n        return\n\n    status = sub.get(\"status\") if isinstance(sub, dict) else sub[\"status\"]\n\n    # For PWYW bundle, allowance is 10 
if active\n    feed_allowance = 10 if status in (\"active\", \"trialing\", \"past_due\") else 0\n\n    writer_client.action(\n        \"set_user_billing_by_customer_id\",\n        {\n            \"stripe_customer_id\": customer_id,\n            \"feed_allowance\": feed_allowance,\n            \"feed_subscription_status\": status,\n            \"stripe_subscription_id\": (\n                sub.get(\"id\") if isinstance(sub, dict) else sub[\"id\"]\n            ),\n        },\n        wait=True,\n    )\n\n\n@billing_bp.route(\"/api/billing/stripe-webhook\", methods=[\"POST\"])\ndef stripe_webhook() -> Any:\n    stripe_client, stripe_err = _get_stripe_client()\n    if stripe_client is None:\n        return jsonify({\"error\": \"STRIPE_NOT_CONFIGURED\", \"message\": stripe_err}), 400\n\n    payload = request.data\n    sig_header = request.headers.get(\"Stripe-Signature\")\n    webhook_secret = os.getenv(\"STRIPE_WEBHOOK_SECRET\")\n    if not webhook_secret:\n        logger.error(\"Stripe webhook secret not configured; rejecting webhook request.\")\n        return (\n            jsonify(\n                {\n                    \"error\": \"WEBHOOK_SECRET_MISSING\",\n                    \"message\": \"Stripe webhook secret is not configured.\",\n                }\n            ),\n            400,\n        )\n\n    try:\n        event = stripe_client.Webhook.construct_event(\n            payload, sig_header, webhook_secret\n        )\n        logger.info(\"Stripe webhook received: %s\", event[\"type\"])\n    except Exception as exc:  # pylint: disable=broad-except\n        logger.error(\"Invalid Stripe webhook: %s\", exc)\n        return jsonify({\"error\": \"INVALID_SIGNATURE\"}), 400\n\n    event_type = event[\"type\"]\n    data_object = event[\"data\"][\"object\"]\n\n    if event_type in (\n        \"customer.subscription.created\",\n        \"customer.subscription.updated\",\n        \"customer.subscription.deleted\",\n        \"customer.subscription.paused\",\n    
):\n        _update_user_from_subscription(data_object)\n    elif event_type == \"checkout.session.completed\":\n        sub_id = data_object.get(\"subscription\")\n        customer_id = data_object.get(\"customer\")\n        user_id = data_object.get(\"metadata\", {}).get(\"user_id\")\n        user = None\n        if customer_id:\n            user = User.query.filter_by(stripe_customer_id=customer_id).first()\n        if user is None and user_id:\n            user = db.session.get(User, int(user_id))\n        if user and customer_id:\n            writer_client.action(\n                \"set_user_billing_fields\",\n                {\"user_id\": user.id, \"stripe_customer_id\": customer_id},\n                wait=True,\n            )\n            db.session.expire(user)\n        if user and sub_id:\n            writer_client.action(\n                \"set_user_billing_fields\",\n                {\"user_id\": user.id, \"stripe_subscription_id\": sub_id},\n                wait=True,\n            )\n            db.session.expire(user)\n    else:\n        logger.info(\"Unhandled Stripe event: %s\", event_type)\n\n    return jsonify({\"status\": \"ok\"})\n"
  },
  {
    "path": "src/app/routes/config_routes.py",
    "content": "import logging\nimport os\nfrom typing import Any, Dict\n\nimport flask\nimport litellm\nfrom flask import Blueprint, jsonify, request\nfrom groq import Groq\nfrom openai import OpenAI\n\nfrom app.auth.guards import require_admin\nfrom app.config_store import read_combined, to_pydantic_config\nfrom app.processor import ProcessorSingleton\nfrom app.runtime_config import config as runtime_config\nfrom app.writer.client import writer_client\nfrom shared.llm_utils import model_uses_max_completion_tokens\n\nlogger = logging.getLogger(\"global_logger\")\n\n\nconfig_bp = Blueprint(\"config\", __name__)\n\n\ndef _mask_secret(value: Any | None) -> str | None:\n    if value is None:\n        return None\n    try:\n        secret = str(value).strip()\n    except Exception:  # pragma: no cover - defensive\n        return None\n\n    if not secret:\n        return None\n    if len(secret) <= 8:\n        return secret\n    return f\"{secret[:4]}...{secret[-4:]}\"\n\n\ndef _sanitize_config_for_client(cfg: Dict[str, Any]) -> Dict[str, Any]:\n    try:\n        data: Dict[str, Any] = dict(cfg)\n        llm: Dict[str, Any] = dict(data.get(\"llm\", {}))\n        whisper: Dict[str, Any] = dict(data.get(\"whisper\", {}))\n\n        llm_api_key = llm.pop(\"llm_api_key\", None)\n        if llm_api_key:\n            llm[\"llm_api_key_preview\"] = _mask_secret(llm_api_key)\n\n        whisper_api_key = whisper.pop(\"api_key\", None)\n        if whisper_api_key:\n            whisper[\"api_key_preview\"] = _mask_secret(whisper_api_key)\n\n        data[\"llm\"] = llm\n        data[\"whisper\"] = whisper\n        return data\n    except Exception:\n        return {}\n\n\n@config_bp.route(\"/api/config\", methods=[\"GET\"])\ndef api_get_config() -> flask.Response:\n    _, error_response = require_admin()\n    if error_response:\n        return error_response\n\n    try:\n        data = read_combined()\n\n        _hydrate_runtime_config(data)\n\n        env_metadata = 
_build_env_override_metadata(data)\n\n        return flask.jsonify(\n            {\n                \"config\": _sanitize_config_for_client(data),\n                \"env_overrides\": env_metadata,\n            }\n        )\n    except Exception as e:  # pylint: disable=broad-except\n        logger.error(f\"Failed to read configuration: {e}\")\n        return flask.make_response(\n            jsonify({\"error\": \"Failed to read configuration\"}), 500\n        )\n\n\ndef _hydrate_runtime_config(data: Dict[str, Any]) -> None:\n    _hydrate_llm_config(data)\n    _hydrate_whisper_config(data)\n    _hydrate_app_config(data)\n\n\ndef _hydrate_llm_config(data: Dict[str, Any]) -> None:\n    data.setdefault(\"llm\", {})\n    llm = data[\"llm\"]\n    llm[\"llm_api_key\"] = getattr(runtime_config, \"llm_api_key\", llm.get(\"llm_api_key\"))\n    llm[\"llm_model\"] = getattr(runtime_config, \"llm_model\", llm.get(\"llm_model\"))\n    llm[\"openai_base_url\"] = getattr(\n        runtime_config, \"openai_base_url\", llm.get(\"openai_base_url\")\n    )\n    llm[\"openai_timeout\"] = getattr(\n        runtime_config, \"openai_timeout\", llm.get(\"openai_timeout\")\n    )\n    llm[\"openai_max_tokens\"] = getattr(\n        runtime_config, \"openai_max_tokens\", llm.get(\"openai_max_tokens\")\n    )\n    llm[\"llm_max_concurrent_calls\"] = getattr(\n        runtime_config, \"llm_max_concurrent_calls\", llm.get(\"llm_max_concurrent_calls\")\n    )\n    llm[\"llm_max_retry_attempts\"] = getattr(\n        runtime_config, \"llm_max_retry_attempts\", llm.get(\"llm_max_retry_attempts\")\n    )\n    llm[\"llm_max_input_tokens_per_call\"] = getattr(\n        runtime_config,\n        \"llm_max_input_tokens_per_call\",\n        llm.get(\"llm_max_input_tokens_per_call\"),\n    )\n    llm[\"llm_enable_token_rate_limiting\"] = getattr(\n        runtime_config,\n        \"llm_enable_token_rate_limiting\",\n        llm.get(\"llm_enable_token_rate_limiting\"),\n    )\n    
llm[\"llm_max_input_tokens_per_minute\"] = getattr(\n        runtime_config,\n        \"llm_max_input_tokens_per_minute\",\n        llm.get(\"llm_max_input_tokens_per_minute\"),\n    )\n\n\ndef _hydrate_whisper_config(data: Dict[str, Any]) -> None:\n    data.setdefault(\"whisper\", {})\n    whisper = data[\"whisper\"]\n    rt_whisper = getattr(runtime_config, \"whisper\", None)\n\n    if isinstance(rt_whisper, dict):\n        _overlay_whisper_dict(whisper, rt_whisper)\n        return\n\n    if rt_whisper is not None and hasattr(rt_whisper, \"whisper_type\"):\n        _overlay_whisper_object(whisper, rt_whisper)\n\n\ndef _overlay_whisper_dict(target: Dict[str, Any], source: Dict[str, Any]) -> None:\n    wtype = source.get(\"whisper_type\")\n    target[\"whisper_type\"] = wtype or target.get(\"whisper_type\")\n    if wtype == \"local\":\n        target[\"model\"] = source.get(\"model\", target.get(\"model\"))\n    elif wtype == \"remote\":\n        _overlay_remote_whisper_fields(target, source)\n    elif wtype == \"groq\":\n        _overlay_groq_whisper_fields(target, source)\n\n\ndef _overlay_whisper_object(target: Dict[str, Any], source: Any) -> None:\n    wtype = getattr(source, \"whisper_type\")\n    target[\"whisper_type\"] = wtype\n    if wtype == \"local\":\n        target[\"model\"] = getattr(source, \"model\", target.get(\"model\"))\n    elif wtype == \"remote\":\n        _overlay_remote_whisper_fields(target, source)\n    elif wtype == \"groq\":\n        _overlay_groq_whisper_fields(target, source)\n\n\ndef _overlay_remote_whisper_fields(target: Dict[str, Any], source: Any) -> None:\n    target[\"model\"] = _get_attr_or_value(source, \"model\", target.get(\"model\"))\n    target[\"api_key\"] = _get_attr_or_value(source, \"api_key\", target.get(\"api_key\"))\n    target[\"base_url\"] = _get_attr_or_value(source, \"base_url\", target.get(\"base_url\"))\n    target[\"language\"] = _get_attr_or_value(source, \"language\", target.get(\"language\"))\n    
target[\"timeout_sec\"] = _get_attr_or_value(\n        source, \"timeout_sec\", target.get(\"timeout_sec\")\n    )\n    target[\"chunksize_mb\"] = _get_attr_or_value(\n        source, \"chunksize_mb\", target.get(\"chunksize_mb\")\n    )\n\n\ndef _overlay_groq_whisper_fields(target: Dict[str, Any], source: Any) -> None:\n    target[\"api_key\"] = _get_attr_or_value(source, \"api_key\", target.get(\"api_key\"))\n    target[\"model\"] = _get_attr_or_value(source, \"model\", target.get(\"model\"))\n    target[\"language\"] = _get_attr_or_value(source, \"language\", target.get(\"language\"))\n    target[\"max_retries\"] = _get_attr_or_value(\n        source, \"max_retries\", target.get(\"max_retries\")\n    )\n\n\ndef _get_attr_or_value(source: Any, key: str, default: Any) -> Any:\n    if isinstance(source, dict):\n        return source.get(key, default)\n    return getattr(source, key, default)\n\n\ndef _hydrate_app_config(data: Dict[str, Any]) -> None:\n    data.setdefault(\"app\", {})\n    app_cfg = data[\"app\"]\n    app_cfg[\"post_cleanup_retention_days\"] = getattr(\n        runtime_config,\n        \"post_cleanup_retention_days\",\n        app_cfg.get(\"post_cleanup_retention_days\"),\n    )\n    app_cfg[\"enable_public_landing_page\"] = getattr(\n        runtime_config,\n        \"enable_public_landing_page\",\n        app_cfg.get(\"enable_public_landing_page\"),\n    )\n    app_cfg[\"user_limit_total\"] = getattr(\n        runtime_config, \"user_limit_total\", app_cfg.get(\"user_limit_total\")\n    )\n    app_cfg[\"autoprocess_on_download\"] = getattr(\n        runtime_config,\n        \"autoprocess_on_download\",\n        app_cfg.get(\"autoprocess_on_download\"),\n    )\n\n\ndef _first_env(env_names: list[str]) -> tuple[str | None, str | None]:\n    \"\"\"Return first found environment variable name and value.\"\"\"\n    for name in env_names:\n        value = os.environ.get(name)\n        if value is not None and value != \"\":\n            return name, 
value\n    return None, None\n\n\ndef _register_override(\n    overrides: Dict[str, Any],\n    path: str,\n    env_var: str | None,\n    value: Any | None,\n    *,\n    secret: bool = False,\n) -> None:\n    \"\"\"Register an environment override in the metadata dict.\"\"\"\n    if not env_var or value is None:\n        return\n    entry: Dict[str, Any] = {\"env_var\": env_var}\n    if secret:\n        entry[\"is_secret\"] = True\n        entry[\"value_preview\"] = _mask_secret(value)\n    else:\n        entry[\"value\"] = value\n    overrides[path] = entry\n\n\ndef _register_llm_overrides(overrides: Dict[str, Any]) -> None:\n    \"\"\"Register LLM-related environment overrides.\"\"\"\n    env_var, env_value = _first_env([\"LLM_API_KEY\", \"OPENAI_API_KEY\", \"GROQ_API_KEY\"])\n    _register_override(overrides, \"llm.llm_api_key\", env_var, env_value, secret=True)\n\n    base_url = os.environ.get(\"OPENAI_BASE_URL\")\n    if base_url:\n        _register_override(\n            overrides, \"llm.openai_base_url\", \"OPENAI_BASE_URL\", base_url\n        )\n\n    llm_model = os.environ.get(\"LLM_MODEL\")\n    if llm_model:\n        _register_override(overrides, \"llm.llm_model\", \"LLM_MODEL\", llm_model)\n\n\ndef _register_groq_shared_overrides(overrides: Dict[str, Any]) -> None:\n    \"\"\"Register shared Groq API key override metadata.\"\"\"\n    groq_key = os.environ.get(\"GROQ_API_KEY\")\n    if groq_key:\n        _register_override(\n            overrides, \"groq.api_key\", \"GROQ_API_KEY\", groq_key, secret=True\n        )\n\n\ndef _register_remote_whisper_overrides(overrides: Dict[str, Any]) -> None:\n    \"\"\"Register remote whisper environment overrides.\"\"\"\n    remote_key = _first_env([\"WHISPER_REMOTE_API_KEY\", \"OPENAI_API_KEY\"])\n    _register_override(\n        overrides, \"whisper.api_key\", remote_key[0], remote_key[1], secret=True\n    )\n\n    remote_base = _first_env([\"WHISPER_REMOTE_BASE_URL\", \"OPENAI_BASE_URL\"])\n    
_register_override(overrides, \"whisper.base_url\", remote_base[0], remote_base[1])\n\n    remote_model = os.environ.get(\"WHISPER_REMOTE_MODEL\")\n    if remote_model:\n        _register_override(\n            overrides, \"whisper.model\", \"WHISPER_REMOTE_MODEL\", remote_model\n        )\n\n    remote_timeout = os.environ.get(\"WHISPER_REMOTE_TIMEOUT_SEC\")\n    if remote_timeout:\n        _register_override(\n            overrides,\n            \"whisper.timeout_sec\",\n            \"WHISPER_REMOTE_TIMEOUT_SEC\",\n            remote_timeout,\n        )\n\n    remote_chunksize = os.environ.get(\"WHISPER_REMOTE_CHUNKSIZE_MB\")\n    if remote_chunksize:\n        _register_override(\n            overrides,\n            \"whisper.chunksize_mb\",\n            \"WHISPER_REMOTE_CHUNKSIZE_MB\",\n            remote_chunksize,\n        )\n\n\ndef _register_groq_whisper_overrides(overrides: Dict[str, Any]) -> None:\n    \"\"\"Register groq whisper environment overrides.\"\"\"\n    groq_key = os.environ.get(\"GROQ_API_KEY\")\n    if groq_key:\n        _register_override(\n            overrides, \"whisper.api_key\", \"GROQ_API_KEY\", groq_key, secret=True\n        )\n\n    groq_model_env, groq_model_val = _first_env(\n        [\"GROQ_WHISPER_MODEL\", \"WHISPER_GROQ_MODEL\"]\n    )\n    _register_override(overrides, \"whisper.model\", groq_model_env, groq_model_val)\n\n    groq_retries = os.environ.get(\"GROQ_MAX_RETRIES\")\n    if groq_retries:\n        _register_override(\n            overrides, \"whisper.max_retries\", \"GROQ_MAX_RETRIES\", groq_retries\n        )\n\n\ndef _register_local_whisper_overrides(overrides: Dict[str, Any]) -> None:\n    \"\"\"Register local whisper environment overrides.\"\"\"\n    local_model = os.environ.get(\"WHISPER_LOCAL_MODEL\")\n    if local_model:\n        _register_override(\n            overrides, \"whisper.model\", \"WHISPER_LOCAL_MODEL\", local_model\n        )\n\n\ndef _determine_whisper_type_for_metadata(data: Dict[str, Any]) -> str 
| None:\n    \"\"\"Determine whisper type for environment metadata (with auto-detection).\"\"\"\n    whisper_cfg = data.get(\"whisper\", {}) or {}\n    wtype = whisper_cfg.get(\"whisper_type\")\n\n    env_whisper_type = os.environ.get(\"WHISPER_TYPE\")\n\n    # Auto-detect whisper type from API key environment variables if not explicitly set\n    # (matching the logic in config_store._apply_whisper_type_override)\n    if not env_whisper_type:\n        if os.environ.get(\"WHISPER_REMOTE_API_KEY\"):\n            env_whisper_type = \"remote\"\n        elif os.environ.get(\"GROQ_API_KEY\") and not os.environ.get(\"LLM_API_KEY\"):\n            env_whisper_type = \"groq\"\n\n    if env_whisper_type:\n        wtype = env_whisper_type.strip().lower()\n\n    return wtype if isinstance(wtype, str) else None\n\n\ndef _build_env_override_metadata(data: Dict[str, Any]) -> Dict[str, Any]:\n    overrides: Dict[str, Any] = {}\n\n    _register_llm_overrides(overrides)\n    _register_groq_shared_overrides(overrides)\n\n    env_whisper_type = os.environ.get(\"WHISPER_TYPE\")\n    if env_whisper_type:\n        _register_override(\n            overrides, \"whisper.whisper_type\", \"WHISPER_TYPE\", env_whisper_type\n        )\n\n    wtype = _determine_whisper_type_for_metadata(data)\n\n    if wtype == \"remote\":\n        _register_remote_whisper_overrides(overrides)\n    elif wtype == \"groq\":\n        _register_groq_whisper_overrides(overrides)\n    elif wtype == \"local\":\n        _register_local_whisper_overrides(overrides)\n\n    return overrides\n\n\n@config_bp.route(\"/api/config\", methods=[\"PUT\"])\ndef api_put_config() -> flask.Response:\n    _, error_response = require_admin()\n    if error_response:\n        return error_response\n\n    payload = request.get_json(silent=True) or {}\n\n    llm_payload = payload.get(\"llm\")\n    if isinstance(llm_payload, dict):\n        llm_payload.pop(\"llm_api_key_preview\", None)\n\n    whisper_payload = payload.get(\"whisper\")\n    
if isinstance(whisper_payload, dict):\n        whisper_payload.pop(\"api_key_preview\", None)\n\n    try:\n        result = writer_client.action(\n            \"update_combined_config\",\n            {\"payload\": payload},\n            wait=True,\n        )\n        if not result or not result.success:\n            raise RuntimeError(getattr(result, \"error\", \"Writer update failed\"))\n        data = result.data or {}\n\n        try:\n            db_cfg = to_pydantic_config()\n        except Exception as hydrate_err:  # pylint: disable=broad-except\n            logger.error(f\"Post-update config hydration failed: {hydrate_err}\")\n            return flask.make_response(\n                jsonify(\n                    {\"error\": \"Invalid configuration\", \"details\": str(hydrate_err)}\n                ),\n                400,\n            )\n\n        for field_name in runtime_config.__class__.model_fields.keys():\n            setattr(runtime_config, field_name, getattr(db_cfg, field_name))\n        ProcessorSingleton.reset_instance()\n\n        return flask.jsonify(_sanitize_config_for_client(data))\n    except Exception as e:  # pylint: disable=broad-except\n        logger.error(f\"Failed to update configuration: {e}\")\n        return flask.make_response(\n            jsonify({\"error\": \"Failed to update configuration\", \"details\": str(e)}), 400\n        )\n\n\n@config_bp.route(\"/api/config/test-llm\", methods=[\"POST\"])\ndef api_test_llm() -> flask.Response:\n    _, error_response = require_admin()\n    if error_response:\n        return error_response\n\n    payload: Dict[str, Any] = request.get_json(silent=True) or {}\n    llm: Dict[str, Any] = dict(payload.get(\"llm\", {}))\n\n    api_key: str | None = llm.get(\"llm_api_key\") or getattr(\n        runtime_config, \"llm_api_key\", None\n    )\n    model_val = llm.get(\"llm_model\")\n    model: str = (\n        model_val\n        if isinstance(model_val, str)\n        else getattr(runtime_config, 
\"llm_model\", \"gpt-4o\")\n    )\n    base_url: str | None = llm.get(\"openai_base_url\") or getattr(\n        runtime_config, \"openai_base_url\", None\n    )\n    timeout_val = llm.get(\"openai_timeout\")\n    timeout: int = (\n        int(timeout_val)\n        if timeout_val is not None\n        else int(getattr(runtime_config, \"openai_timeout\", 30))\n    )\n\n    if not api_key:\n        return flask.make_response(\n            jsonify({\"ok\": False, \"error\": \"Missing llm_api_key\"}), 400\n        )\n\n    try:\n        # Configure litellm for this probe\n        litellm.api_key = api_key\n        if base_url:\n            litellm.api_base = base_url\n\n        # Minimal completion to validate connectivity and credentials\n        messages = [\n            {\"role\": \"system\", \"content\": \"You are a healthcheck probe.\"},\n            {\"role\": \"user\", \"content\": \"ping\"},\n        ]\n\n        completion_kwargs: Dict[str, Any] = {\n            \"model\": model,\n            \"messages\": messages,\n            \"timeout\": timeout,\n        }\n\n        if model_uses_max_completion_tokens(model):\n            completion_kwargs[\"max_completion_tokens\"] = 1\n        else:\n            completion_kwargs[\"max_tokens\"] = 1\n\n        _ = litellm.completion(**completion_kwargs)\n\n        return flask.jsonify(\n            {\n                \"ok\": True,\n                \"message\": \"LLM connection OK\",\n                \"model\": model,\n                \"base_url\": base_url,\n            }\n        )\n    except Exception as e:  # pylint: disable=broad-except\n        logger.error(f\"LLM connection test failed: {e}\")\n        return flask.make_response(jsonify({\"ok\": False, \"error\": str(e)}), 400)\n\n\ndef _make_error_response(error_msg: str, status_code: int = 400) -> flask.Response:\n    return flask.make_response(jsonify({\"ok\": False, \"error\": error_msg}), status_code)\n\n\ndef _make_success_response(message: str, 
**extra_data: Any) -> flask.Response:\n    response_data = {\"ok\": True, \"message\": message}\n    response_data.update(extra_data)\n    return flask.jsonify(response_data)\n\n\ndef _get_whisper_config_value(\n    whisper_cfg: Dict[str, Any], key: str, default: Any | None = None\n) -> Any | None:\n    value = whisper_cfg.get(key)\n    if value is not None:\n        return value\n    try:\n        runtime_whisper = getattr(runtime_config, \"whisper\", None)\n        if runtime_whisper is not None:\n            return getattr(runtime_whisper, key, default)\n    except Exception:  # pragma: no cover - defensive\n        pass\n    return default\n\n\ndef _get_env_whisper_api_key(whisper_type: str) -> str | None:\n    if whisper_type == \"remote\":\n        return os.environ.get(\"WHISPER_REMOTE_API_KEY\") or os.environ.get(\n            \"OPENAI_API_KEY\"\n        )\n    if whisper_type == \"groq\":\n        return os.environ.get(\"GROQ_API_KEY\")\n    return None\n\n\ndef _determine_whisper_type(whisper_cfg: Dict[str, Any]) -> str | None:\n    wtype_any = whisper_cfg.get(\"whisper_type\")\n    if isinstance(wtype_any, str):\n        return wtype_any\n    try:\n        runtime_whisper = getattr(runtime_config, \"whisper\", None)\n        if runtime_whisper is not None and hasattr(runtime_whisper, \"whisper_type\"):\n            rt_type = getattr(runtime_whisper, \"whisper_type\")\n            return rt_type if isinstance(rt_type, str) else None\n    except Exception:  # pragma: no cover - defensive\n        pass\n    return None\n\n\ndef _test_local_whisper(whisper_cfg: Dict[str, Any]) -> flask.Response:\n    \"\"\"Test local whisper configuration.\"\"\"\n    model_name = _get_whisper_config_value(whisper_cfg, \"model\", \"base.en\")\n    try:\n        import whisper  # type: ignore[import-untyped]\n    except ImportError as e:\n        return _make_error_response(f\"whisper not installed: {e}\")\n\n    try:\n        available = whisper.available_models()\n    except 
Exception as e:  # pragma: no cover - library call\n        available = []\n        logger.warning(f\"Failed to list local whisper models: {e}\")\n\n    if model_name not in available:\n        return flask.make_response(\n            jsonify(\n                {\n                    \"ok\": False,\n                    \"error\": f\"Model '{model_name}' not available. Install or adjust model.\",\n                    \"available_models\": available,\n                }\n            ),\n            400,\n        )\n    return _make_success_response(f\"Local whisper OK (model {model_name})\")\n\n\ndef _test_remote_whisper(whisper_cfg: Dict[str, Any]) -> flask.Response:\n    \"\"\"Test remote whisper configuration.\"\"\"\n    api_key_any = _get_whisper_config_value(whisper_cfg, \"api_key\")\n    base_url_any = _get_whisper_config_value(\n        whisper_cfg, \"base_url\", \"https://api.openai.com/v1\"\n    )\n    timeout_any = _get_whisper_config_value(whisper_cfg, \"timeout_sec\", 30)\n\n    api_key: str | None = api_key_any if isinstance(api_key_any, str) else None\n    base_url: str | None = base_url_any if isinstance(base_url_any, str) else None\n    timeout: int = int(timeout_any) if timeout_any is not None else 30\n\n    if not api_key:\n        api_key = _get_env_whisper_api_key(\"remote\")\n\n    if not api_key:\n        return _make_error_response(\"Missing whisper.api_key\")\n\n    _ = OpenAI(base_url=base_url, api_key=api_key, timeout=timeout).models.list()\n    return _make_success_response(\"Remote whisper connection OK\", base_url=base_url)\n\n\ndef _test_groq_whisper(whisper_cfg: Dict[str, Any]) -> flask.Response:\n    \"\"\"Test groq whisper configuration.\"\"\"\n    groq_api_key_any = _get_whisper_config_value(whisper_cfg, \"api_key\")\n    groq_api_key: str | None = (\n        groq_api_key_any if isinstance(groq_api_key_any, str) else None\n    )\n\n    if not groq_api_key:\n        groq_api_key = _get_env_whisper_api_key(\"groq\")\n\n    if not 
groq_api_key:\n        return _make_error_response(\"Missing whisper.api_key\")\n\n    _ = Groq(api_key=groq_api_key).models.list()\n    return _make_success_response(\"Groq whisper connection OK\")\n\n\n@config_bp.route(\"/api/config/test-whisper\", methods=[\"POST\"])\ndef api_test_whisper() -> flask.Response:\n    \"\"\"Test whisper configuration based on whisper_type.\"\"\"\n    # pylint: disable=too-many-return-statements\n    _, error_response = require_admin()\n    if error_response:\n        return error_response\n\n    payload: Dict[str, Any] = request.get_json(silent=True) or {}\n    whisper_cfg: Dict[str, Any] = dict(payload.get(\"whisper\", {}))\n\n    wtype = _determine_whisper_type(whisper_cfg)\n    if not wtype:\n        return _make_error_response(\"Missing whisper_type\")\n\n    try:\n        if wtype == \"local\":\n            return _test_local_whisper(whisper_cfg)\n        if wtype == \"remote\":\n            return _test_remote_whisper(whisper_cfg)\n        if wtype == \"groq\":\n            return _test_groq_whisper(whisper_cfg)\n        return _make_error_response(f\"Unknown whisper_type '{wtype}'\")\n    except Exception as e:  # pylint: disable=broad-except\n        logger.error(f\"Whisper connection test failed: {e}\")\n        return _make_error_response(str(e))\n\n\n@config_bp.route(\"/api/config/whisper-capabilities\", methods=[\"GET\"])\ndef api_get_whisper_capabilities() -> flask.Response:\n    \"\"\"Report Whisper capabilities for the current runtime.\n\n    Currently returns a boolean indicating whether local Whisper is importable.\n    This enables the frontend to hide the 'local' option when unavailable.\n    \"\"\"\n    _, error_response = require_admin()\n    if error_response:\n        return error_response\n\n    local_available = False\n    try:  # pragma: no cover - simple import feature check\n        import whisper  # type: ignore[import-untyped]\n\n        # If import succeeds, we consider local whisper available.\n        # Optionally probe models 
list, but ignore failures here.\n        try:\n            _ = whisper.available_models()  # noqa: F841\n        except Exception:\n            pass\n        local_available = True\n    except Exception:\n        local_available = False\n\n    return flask.jsonify({\"local_available\": local_available})\n\n\n@config_bp.route(\"/api/config/api_configured_check\", methods=[\"GET\"])\ndef api_configured_check() -> flask.Response:\n    \"\"\"Return whether the API configuration is sufficient to process.\n\n    For our purposes, this means an LLM API key is present either in the\n    persisted config or the runtime overlay.\n    \"\"\"\n    _, error_response = require_admin()\n    if error_response:\n        return error_response\n\n    try:\n        data = read_combined()\n        _hydrate_runtime_config(data)\n\n        llm = data.get(\"llm\", {}) if isinstance(data, dict) else {}\n        api_key = llm.get(\"llm_api_key\")\n        configured = bool(api_key)\n        return flask.jsonify({\"configured\": configured})\n    except Exception as e:  # pylint: disable=broad-except\n        logger.error(f\"Failed to check API configuration: {e}\")\n        # Be conservative: report not configured on error\n        return flask.jsonify({\"configured\": False})\n"
  },
  {
    "path": "src/app/routes/discord_routes.py",
    "content": "from __future__ import annotations\n\nimport logging\nimport os\nfrom typing import TYPE_CHECKING\n\nfrom flask import (\n    Blueprint,\n    Response,\n    current_app,\n    jsonify,\n    request,\n    session,\n)\n\nfrom app.auth.discord_service import (\n    DiscordAuthError,\n    DiscordRegistrationDisabledError,\n    build_authorization_url,\n    check_guild_membership,\n    exchange_code_for_token,\n    find_or_create_user_from_discord,\n    generate_oauth_state,\n    get_discord_user,\n)\nfrom app.auth.discord_settings import reload_discord_settings\nfrom app.auth.guards import require_admin\nfrom app.writer.client import writer_client\n\nif TYPE_CHECKING:\n    from app.auth.discord_settings import DiscordSettings\n\nlogger = logging.getLogger(\"global_logger\")\n\ndiscord_bp = Blueprint(\"discord\", __name__)\n\nSESSION_OAUTH_STATE_KEY = \"discord_oauth_state\"\nSESSION_USER_KEY = \"user_id\"\nSESSION_OAUTH_PROMPT_UPGRADED = \"discord_prompt_upgraded\"\n\n\ndef _get_discord_settings() -> DiscordSettings | None:\n    return current_app.config.get(\"DISCORD_SETTINGS\")\n\n\ndef _mask_secret(value: str | None) -> str | None:\n    \"\"\"Mask a secret value for display.\"\"\"\n    if not value:\n        return None\n    if len(value) <= 8:\n        return value\n    return f\"{value[:4]}...{value[-4:]}\"\n\n\ndef _has_env_override(env_var: str) -> bool:\n    \"\"\"Check if an environment variable is set.\"\"\"\n    return bool(os.environ.get(env_var))\n\n\n@discord_bp.route(\"/api/auth/discord/status\", methods=[\"GET\"])\ndef discord_status() -> Response:\n    \"\"\"Return whether Discord SSO is enabled.\"\"\"\n    settings = _get_discord_settings()\n    return jsonify(\n        {\n            \"enabled\": settings.enabled if settings else False,\n        }\n    )\n\n\n@discord_bp.route(\"/api/auth/discord/config\", methods=[\"GET\"])\ndef discord_config_get() -> Response | tuple[Response, int]:\n    \"\"\"Get Discord configuration (admin 
only).\"\"\"\n    _, error_response = require_admin()\n    if error_response:\n        return error_response, error_response.status_code\n\n    settings = _get_discord_settings()\n\n    # Build env override info\n    env_overrides: dict[str, dict[str, str]] = {}\n    if _has_env_override(\"DISCORD_CLIENT_ID\"):\n        env_overrides[\"client_id\"] = {\"env_var\": \"DISCORD_CLIENT_ID\"}\n    if _has_env_override(\"DISCORD_CLIENT_SECRET\"):\n        env_overrides[\"client_secret\"] = {\n            \"env_var\": \"DISCORD_CLIENT_SECRET\",\n            \"is_secret\": \"true\",\n        }\n    if _has_env_override(\"DISCORD_REDIRECT_URI\"):\n        env_overrides[\"redirect_uri\"] = {\n            \"env_var\": \"DISCORD_REDIRECT_URI\",\n            \"value\": os.environ.get(\"DISCORD_REDIRECT_URI\", \"\"),\n        }\n    if _has_env_override(\"DISCORD_GUILD_IDS\"):\n        env_overrides[\"guild_ids\"] = {\n            \"env_var\": \"DISCORD_GUILD_IDS\",\n            \"value\": os.environ.get(\"DISCORD_GUILD_IDS\", \"\"),\n        }\n    if _has_env_override(\"DISCORD_ALLOW_REGISTRATION\"):\n        env_overrides[\"allow_registration\"] = {\n            \"env_var\": \"DISCORD_ALLOW_REGISTRATION\",\n            \"value\": os.environ.get(\"DISCORD_ALLOW_REGISTRATION\", \"\"),\n        }\n\n    return jsonify(\n        {\n            \"config\": {\n                \"enabled\": settings.enabled if settings else False,\n                \"client_id\": settings.client_id if settings else None,\n                \"client_secret_preview\": (\n                    _mask_secret(settings.client_secret) if settings else None\n                ),\n                \"redirect_uri\": settings.redirect_uri if settings else None,\n                \"guild_ids\": (\n                    \",\".join(settings.guild_ids)\n                    if settings and settings.guild_ids\n                    else \"\"\n                ),\n                \"allow_registration\": settings.allow_registration if 
settings else True,\n            },\n            \"env_overrides\": env_overrides,\n        }\n    )\n\n\n@discord_bp.route(\"/api/auth/discord/config\", methods=[\"PUT\"])\ndef discord_config_put() -> Response | tuple[Response, int]:\n    \"\"\"Update Discord configuration (admin only).\"\"\"\n    _, error_response = require_admin()\n    if error_response:\n        return error_response, error_response.status_code\n\n    payload = request.get_json(silent=True) or {}\n\n    try:\n        update_params: dict[str, object] = {}\n\n        if \"client_id\" in payload and not _has_env_override(\"DISCORD_CLIENT_ID\"):\n            update_params[\"client_id\"] = payload[\"client_id\"] or None\n\n        if \"client_secret\" in payload and not _has_env_override(\n            \"DISCORD_CLIENT_SECRET\"\n        ):\n            secret = payload[\"client_secret\"]\n            # Ignore a masked preview (e.g. \"abcd...wxyz\") echoed back unchanged;\n            # previews contain \"...\" but never end with it, so check containment.\n            if secret and \"...\" not in str(secret):\n                update_params[\"client_secret\"] = secret\n\n        if \"redirect_uri\" in payload and not _has_env_override(\"DISCORD_REDIRECT_URI\"):\n            update_params[\"redirect_uri\"] = payload[\"redirect_uri\"] or None\n\n        if \"guild_ids\" in payload and not _has_env_override(\"DISCORD_GUILD_IDS\"):\n            update_params[\"guild_ids\"] = payload[\"guild_ids\"] or None\n\n        if \"allow_registration\" in payload and not _has_env_override(\n            \"DISCORD_ALLOW_REGISTRATION\"\n        ):\n            update_params[\"allow_registration\"] = bool(payload[\"allow_registration\"])\n\n        if update_params:\n            result = writer_client.action(\n                \"update_discord_settings\", update_params, wait=True\n            )\n            if not result or not result.success:\n                raise RuntimeError(getattr(result, \"error\", \"Writer update failed\"))\n\n        # Reload settings into app config\n        new_settings = reload_discord_settings(current_app)\n\n        logger.info(\"Discord settings 
updated (enabled=%s)\", new_settings.enabled)\n\n        return jsonify(\n            {\n                \"status\": \"ok\",\n                \"config\": {\n                    \"enabled\": new_settings.enabled,\n                    \"client_id\": new_settings.client_id,\n                    \"client_secret_preview\": _mask_secret(new_settings.client_secret),\n                    \"redirect_uri\": new_settings.redirect_uri,\n                    \"guild_ids\": (\n                        \",\".join(new_settings.guild_ids)\n                        if new_settings.guild_ids\n                        else \"\"\n                    ),\n                    \"allow_registration\": new_settings.allow_registration,\n                },\n            }\n        )\n\n    except Exception as e:\n        logger.exception(\"Failed to update Discord settings: %s\", e)\n        return jsonify({\"error\": \"Failed to update Discord settings\"}), 500\n\n\n@discord_bp.route(\"/api/auth/discord/login\", methods=[\"GET\"])\ndef discord_login() -> Response | tuple[Response, int]:\n    \"\"\"Start the Discord OAuth2 flow by returning the authorization URL.\"\"\"\n    settings = _get_discord_settings()\n    if not settings or not settings.enabled:\n        return jsonify({\"error\": \"Discord SSO is not configured.\"}), 404\n\n    prompt = request.args.get(\"prompt\", \"none\")\n    state = generate_oauth_state()\n    session[SESSION_OAUTH_STATE_KEY] = state\n    session[SESSION_OAUTH_PROMPT_UPGRADED] = prompt == \"consent\"\n\n    auth_url = build_authorization_url(settings, state, prompt=prompt)\n    return jsonify({\"authorization_url\": auth_url})\n\n\n@discord_bp.route(\"/api/auth/discord/callback\", methods=[\"GET\"])\ndef discord_callback() -> Response:\n    \"\"\"Handle the OAuth2 callback from Discord.\"\"\"\n    settings = _get_discord_settings()\n    if not settings or not settings.enabled:\n        return Response(\n            response=\"\",\n            status=302,\n            
headers={\"Location\": \"/?error=discord_not_configured\"},\n        )\n\n    # Verify state to prevent CSRF\n    state = request.args.get(\"state\")\n    expected_state = session.pop(SESSION_OAUTH_STATE_KEY, None)\n    if not state or state != expected_state:\n        return Response(\n            response=\"\", status=302, headers={\"Location\": \"/?error=invalid_state\"}\n        )\n\n    # Check for error from Discord (e.g., user denied access)\n    error = request.args.get(\"error\")\n    if error:\n        if error in {\"interaction_required\", \"login_required\", \"consent_required\"}:\n            # Try again with an explicit consent prompt (only once) to avoid loops.\n            if not session.get(SESSION_OAUTH_PROMPT_UPGRADED):\n                new_state = generate_oauth_state()\n                session[SESSION_OAUTH_STATE_KEY] = new_state\n                session[SESSION_OAUTH_PROMPT_UPGRADED] = True\n                auth_url = build_authorization_url(\n                    settings, new_state, prompt=\"consent\"\n                )\n                return Response(response=\"\", status=302, headers={\"Location\": auth_url})\n\n        return Response(\n            response=\"\", status=302, headers={\"Location\": f\"/?error={error}\"}\n        )\n\n    code = request.args.get(\"code\")\n    if not code:\n        return Response(\n            response=\"\", status=302, headers={\"Location\": \"/?error=missing_code\"}\n        )\n\n    try:\n        # Exchange code for token\n        token_data = exchange_code_for_token(settings, code)\n        access_token = token_data[\"access_token\"]\n\n        # Get Discord user info\n        discord_user = get_discord_user(access_token)\n\n        # Check guild requirements if configured\n        if settings.guild_ids:\n            is_allowed = check_guild_membership(access_token, settings)\n            if not is_allowed:\n                return Response(\n                    response=\"\",\n                    
status=302,\n                    headers={\"Location\": \"/?error=guild_requirement_not_met\"},\n                )\n\n        # Find or create user\n        user = find_or_create_user_from_discord(discord_user, settings)\n\n        # Create session\n        session.clear()\n        session[SESSION_USER_KEY] = user.id\n        session.permanent = True\n        session.pop(SESSION_OAUTH_PROMPT_UPGRADED, None)\n\n        logger.info(\n            \"Discord SSO login successful for user %s (discord_id=%s)\",\n            user.username,\n            discord_user.id,\n        )\n        return Response(response=\"\", status=302, headers={\"Location\": \"/\"})\n\n    except DiscordRegistrationDisabledError:\n        return Response(\n            response=\"\",\n            status=302,\n            headers={\"Location\": \"/?error=registration_disabled\"},\n        )\n    except DiscordAuthError as e:\n        logger.warning(\"Discord auth error: %s\", e)\n        return Response(\n            response=\"\", status=302, headers={\"Location\": \"/?error=auth_failed\"}\n        )\n    except Exception as e:\n        logger.exception(\"Discord auth failed unexpectedly: %s\", e)\n        return Response(\n            response=\"\", status=302, headers={\"Location\": \"/?error=auth_failed\"}\n        )\n"
  },
  {
    "path": "src/app/routes/feed_routes.py",
    "content": "import logging\nimport re\nimport secrets\nfrom pathlib import Path\nfrom threading import Thread\nfrom typing import Any, Optional, cast\n\n# pylint: disable=chained-comparison\nfrom urllib.parse import urlencode, urlparse, urlunparse\n\nimport requests\nimport validators\nfrom flask import (\n    Blueprint,\n    Flask,\n    Response,\n    current_app,\n    g,\n    jsonify,\n    make_response,\n    redirect,\n    request,\n    send_from_directory,\n    url_for,\n)\nfrom flask.typing import ResponseReturnValue\n\nfrom app.auth import is_auth_enabled\nfrom app.auth.guards import require_admin\nfrom app.auth.service import update_user_last_active\nfrom app.extensions import db\nfrom app.feeds import (\n    add_or_refresh_feed,\n    generate_aggregate_feed_xml,\n    generate_feed_xml,\n    is_feed_active_for_user,\n    refresh_feed,\n)\nfrom app.jobs_manager import get_jobs_manager\nfrom app.models import (\n    Feed,\n    Post,\n    User,\n    UserFeed,\n)\nfrom app.writer.client import writer_client\nfrom podcast_processor.podcast_downloader import sanitize_title\nfrom shared.processing_paths import get_in_root, get_srv_root\n\nfrom .auth_routes import _require_authenticated_user as _auth_get_user\n\nlogger = logging.getLogger(\"global_logger\")\n\n\nfeed_bp = Blueprint(\"feed\", __name__)\n\n\ndef fix_url(url: str) -> str:\n    url = re.sub(r\"(http(s)?):/([^/])\", r\"\\1://\\3\", url)\n    if not url.startswith(\"http://\") and not url.startswith(\"https://\"):\n        url = \"https://\" + url\n    return url\n\n\ndef _user_feed_count(user_id: int) -> int:\n    return int(UserFeed.query.filter_by(user_id=user_id).count())\n\n\ndef _get_latest_post(feed: Feed) -> Post | None:\n    return cast(\n        Optional[Post],\n        Post.query.filter_by(feed_id=feed.id)\n        .order_by(Post.release_date.desc().nullslast(), Post.id.desc())\n        .first(),\n    )\n\n\ndef _ensure_user_feed_membership(feed: Feed, user_id: int | None) -> tuple[bool, 
int]:\n    \"\"\"Add a user↔feed link if missing. Returns (created, previous_feed_member_count).\"\"\"\n    if not user_id:\n        return False, UserFeed.query.filter_by(feed_id=feed.id).count()\n    result = writer_client.action(\n        \"ensure_user_feed_membership\",\n        {\"feed_id\": feed.id, \"user_id\": int(user_id)},\n        wait=True,\n    )\n    if not result or not result.success or not isinstance(result.data, dict):\n        raise RuntimeError(getattr(result, \"error\", \"Failed to join feed\"))\n    return bool(result.data.get(\"created\")), int(result.data.get(\"previous_count\") or 0)\n\n\ndef _whitelist_latest_for_first_member(\n    feed: Feed, requested_by_user_id: int | None\n) -> None:\n    \"\"\"When a feed goes from 0→1 members, whitelist and process the latest post.\"\"\"\n    try:\n        result = writer_client.action(\n            \"whitelist_latest_post_for_feed\", {\"feed_id\": feed.id}, wait=True\n        )\n        if not result or not result.success or not isinstance(result.data, dict):\n            return\n        post_guid = result.data.get(\"post_guid\")\n        updated = bool(result.data.get(\"updated\"))\n        if not updated or not post_guid:\n            return\n    except Exception:  # pylint: disable=broad-except\n        return\n    try:\n        get_jobs_manager().start_post_processing(\n            str(post_guid),\n            priority=\"interactive\",\n            requested_by_user_id=requested_by_user_id,\n            billing_user_id=requested_by_user_id,\n        )\n    except Exception as exc:  # pylint: disable=broad-except\n        logger.error(\n            \"Failed to enqueue processing for latest post %s: %s\", post_guid, exc\n        )\n\n\ndef _handle_developer_mode_feed(url: str, user: Optional[User]) -> ResponseReturnValue:\n    try:\n        feed_id_str = url.split(\"/\")[-1]\n        feed_num = int(feed_id_str)\n\n        result = writer_client.action(\n            \"create_dev_test_feed\",\n      
      {\n                \"rss_url\": url,\n                \"title\": f\"Test Feed {feed_num}\",\n                \"image_url\": \"https://via.placeholder.com/150\",\n                \"description\": \"A test feed for development\",\n                \"author\": \"Test Author\",\n                \"post_count\": 5,\n                \"guid_prefix\": f\"test-guid-{feed_num}\",\n                \"download_url_prefix\": f\"http://test-feed/{feed_num}\",\n            },\n            wait=True,\n        )\n        if not result or not result.success or not isinstance(result.data, dict):\n            raise RuntimeError(getattr(result, \"error\", \"Failed to create test feed\"))\n        feed_id = int(result.data[\"feed_id\"])\n        feed = db.session.get(Feed, feed_id)\n        if not feed:\n            raise RuntimeError(\"Test feed disappeared\")\n\n        if user:\n            created, previous_count = _ensure_user_feed_membership(feed, user.id)\n            if created and previous_count == 0:\n                _whitelist_latest_for_first_member(feed, getattr(user, \"id\", None))\n\n        return redirect(url_for(\"main.index\"))\n\n    except Exception as e:\n        logger.error(f\"Error adding test feed: {e}\")\n        return make_response((f\"Error adding test feed: {e}\", 500))\n\n\ndef _check_feed_allowance(user: User, url: str) -> Optional[ResponseReturnValue]:\n    if user.role == \"admin\":\n        return None\n\n    existing_feed = Feed.query.filter_by(rss_url=url).first()\n    existing_membership = None\n    if existing_feed:\n        existing_membership = UserFeed.query.filter_by(\n            feed_id=existing_feed.id, user_id=user.id\n        ).first()\n\n    # Use manual allowance if set, otherwise fall back to plan allowance\n    allowance = user.manual_feed_allowance\n    if allowance is None:\n        allowance = getattr(user, \"feed_allowance\", 0) or 0\n\n    if allowance > 0:\n        current_count = _user_feed_count(user.id)\n        if 
current_count >= allowance and existing_membership is None:\n            return (\n                jsonify(\n                    {\n                        \"error\": \"FEED_LIMIT_REACHED\",\n                        \"message\": f\"Your plan allows {allowance} feeds. Increase your plan to add more.\",\n                        \"feeds_in_use\": current_count,\n                        \"feed_allowance\": allowance,\n                    }\n                ),\n                402,\n            )\n    return None\n\n\n@feed_bp.route(\"/feed\", methods=[\"POST\"])\ndef add_feed() -> ResponseReturnValue:\n    settings = current_app.config.get(\"AUTH_SETTINGS\")\n    user = None\n    if settings and settings.require_auth:\n        user, error = _require_user_or_error()\n        if error:\n            return error\n    url = request.form.get(\"url\")\n    if not url:\n        return make_response((\"URL is required\", 400))\n\n    url = fix_url(url)\n\n    if current_app.config.get(\"developer_mode\") and url.startswith(\"http://test-feed/\"):\n        return _handle_developer_mode_feed(url, user)\n\n    if not validators.url(url):\n        return make_response((\"Invalid URL\", 400))\n\n    try:\n        if user:\n            allowance_error = _check_feed_allowance(user, url)\n            if allowance_error:\n                return allowance_error\n\n        feed = add_or_refresh_feed(url)\n        if user:\n            created, previous_count = _ensure_user_feed_membership(feed, user.id)\n            if created and previous_count == 0:\n                _whitelist_latest_for_first_member(feed, getattr(user, \"id\", None))\n        elif not is_auth_enabled():\n            # In no-auth mode, if this feed has no members, trigger whitelisting for the latest post.\n            if UserFeed.query.filter_by(feed_id=feed.id).count() == 0:\n                _whitelist_latest_for_first_member(feed, None)\n\n        app = cast(Any, current_app)._get_current_object()\n        Thread(\n  
          target=_enqueue_pending_jobs_async,\n            args=(app,),\n            daemon=True,\n            name=\"enqueue-jobs-after-add\",\n        ).start()\n        return redirect(url_for(\"main.index\"))\n    except Exception as e:  # pylint: disable=broad-except\n        logger.error(f\"Error adding feed: {e}\")\n        return make_response((f\"Error adding feed: {e}\", 500))\n\n\n@feed_bp.route(\"/api/feeds/<int:feed_id>/share-link\", methods=[\"POST\"])\ndef create_feed_share_link(feed_id: int) -> ResponseReturnValue:\n    settings = current_app.config.get(\"AUTH_SETTINGS\")\n    if not settings or not settings.require_auth:\n        return jsonify({\"error\": \"Authentication is disabled.\"}), 404\n\n    current = getattr(g, \"current_user\", None)\n    if current is None:\n        return jsonify({\"error\": \"Authentication required.\"}), 401\n\n    feed = Feed.query.get_or_404(feed_id)\n    user = db.session.get(User, current.id)\n    if user is None:\n        return jsonify({\"error\": \"User not found.\"}), 404\n\n    result = writer_client.action(\n        \"create_feed_access_token\",\n        {\"user_id\": user.id, \"feed_id\": feed.id},\n        wait=True,\n    )\n    if not result or not result.success or not isinstance(result.data, dict):\n        return jsonify({\"error\": \"Failed to create feed token\"}), 500\n    token_id = str(result.data[\"token_id\"])\n    secret = str(result.data[\"secret\"])\n\n    parsed = urlparse(request.host_url)\n    netloc = parsed.netloc\n    scheme = parsed.scheme\n    path = f\"/feed/{feed.id}\"\n    query = urlencode({\"feed_token\": token_id, \"feed_secret\": secret})\n    prefilled_url = urlunparse((scheme, netloc, path, \"\", query, \"\"))\n\n    return (\n        jsonify(\n            {\n                \"url\": prefilled_url,\n                \"feed_token\": token_id,\n                \"feed_secret\": secret,\n                \"feed_id\": feed.id,\n            }\n        ),\n        201,\n    
)\n\n\n@feed_bp.route(\"/api/feeds/search\", methods=[\"GET\"])\ndef search_feeds() -> ResponseReturnValue:\n    term = (request.args.get(\"term\") or \"\").strip()\n    logger.info(\"Searching for podcasts with term: %s\", term)\n    if not term:\n        return jsonify({\"error\": \"term parameter is required\"}), 400\n\n    try:\n        headers = {\n            \"User-Agent\": \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36\"\n        }\n        response = requests.get(\n            \"http://api.podcastindex.org/search\",\n            headers=headers,\n            params={\"term\": term},\n            timeout=10,\n        )\n        response.raise_for_status()\n        upstream_data = response.json()\n    except requests.exceptions.RequestException as exc:\n        logger.error(\"Podcast search request failed: %s\", exc)\n        return jsonify({\"error\": \"Search request failed\"}), 502\n    except ValueError:\n        logger.error(\"Podcast search returned non-JSON response\")\n        return (\n            jsonify({\"error\": \"Unexpected response from search provider\"}),\n            502,\n        )\n\n    results = upstream_data.get(\"results\") or []\n    transformed_results = []\n\n    if current_app.config.get(\"developer_mode\") and term.lower() == \"test\":\n        logger.info(\"Developer mode test search - adding mock results\")\n        for i in range(1, 11):\n            transformed_results.append(\n                {\n                    \"title\": f\"Test Feed {i}\",\n                    \"author\": \"Test Author\",\n                    \"feedUrl\": f\"http://test-feed/{i}\",\n                    \"artwork\": \"https://via.placeholder.com/150\",\n                    \"genres\": [\"Test Genre\"],\n                }\n            )\n    else:\n        logger.info(\n            \"(dev mode disabled) Podcast search returned %d results\", len(results)\n        )\n\n    for item in 
results:\n        feed_url = item.get(\"feedUrl\")\n        if not feed_url:\n            continue\n\n        transformed_results.append(\n            {\n                \"title\": item.get(\"collectionName\")\n                or item.get(\"trackName\")\n                or \"Unknown title\",\n                \"author\": item.get(\"artistName\") or \"\",\n                \"feedUrl\": feed_url,\n                \"artworkUrl\": item.get(\"artworkUrl100\")\n                or item.get(\"artworkUrl600\")\n                or \"\",\n                \"description\": item.get(\"collectionCensoredName\")\n                or item.get(\"trackCensoredName\")\n                or \"\",\n                \"genres\": item.get(\"genres\") or [],\n            }\n        )\n\n    total = upstream_data.get(\"resultCount\")\n    if not isinstance(total, int) or total == 0:\n        total = len(transformed_results)\n\n    return jsonify(\n        {\n            \"results\": transformed_results,\n            \"total\": total,\n        }\n    )\n\n\n@feed_bp.route(\"/feed/<int:f_id>\", methods=[\"GET\"])\ndef get_feed(f_id: int) -> Response:\n    if hasattr(g, \"current_user\") and g.current_user:\n        update_user_last_active(g.current_user.id)\n\n    feed = Feed.query.get_or_404(f_id)\n\n    # Refresh the feed\n    refresh_feed(feed)\n\n    # Generate the XML\n    xml_content = generate_feed_xml(feed)\n\n    response = make_response(xml_content)\n    response.headers[\"Content-Type\"] = \"application/rss+xml\"\n    return response\n\n\n@feed_bp.route(\"/feed/<int:f_id>\", methods=[\"DELETE\"])\ndef delete_feed(f_id: int) -> ResponseReturnValue:  # pylint: disable=too-many-branches\n    user, error = _require_user_or_error(allow_missing_auth=True)\n    if error:\n        return error\n\n    feed = Feed.query.get_or_404(f_id)\n    if user is not None and user.role != \"admin\":\n        return (\n            jsonify({\"error\": \"Only administrators can delete feeds.\"}),\n            
403,\n        )\n\n    # Get all post IDs for this feed\n    post_ids = [post.id for post in feed.posts]\n\n    # Delete audio files if they exist\n    for post in feed.posts:\n        if post.unprocessed_audio_path and Path(post.unprocessed_audio_path).exists():\n            try:\n                Path(post.unprocessed_audio_path).unlink()\n                logger.info(f\"Deleted unprocessed audio: {post.unprocessed_audio_path}\")\n            except Exception as e:  # pylint: disable=broad-except\n                logger.error(\n                    f\"Error deleting unprocessed audio {post.unprocessed_audio_path}: {e}\"\n                )\n\n        if post.processed_audio_path and Path(post.processed_audio_path).exists():\n            try:\n                Path(post.processed_audio_path).unlink()\n                logger.info(f\"Deleted processed audio: {post.processed_audio_path}\")\n            except Exception as e:  # pylint: disable=broad-except\n                logger.error(\n                    f\"Error deleting processed audio {post.processed_audio_path}: {e}\"\n                )\n\n    # Clean up directory structures\n    _cleanup_feed_directories(feed)\n\n    try:\n        result = writer_client.action(\n            \"delete_feed_cascade\", {\"feed_id\": feed.id}, wait=True\n        )\n        if not result or not result.success:\n            raise RuntimeError(getattr(result, \"error\", \"Failed to delete feed\"))\n    except Exception as e:  # pylint: disable=broad-except\n        logger.error(\"Failed to delete feed %s: %s\", feed.id, e)\n        return make_response((\"Failed to delete feed\", 500))\n\n    logger.info(\n        f\"Deleted feed: {feed.title} (ID: {feed.id}) with {len(post_ids)} posts\"\n    )\n    return make_response(\"\", 204)\n\n\n@feed_bp.route(\"/api/feeds/<int:f_id>/refresh\", methods=[\"POST\"])\ndef refresh_feed_endpoint(f_id: int) -> ResponseReturnValue:\n    \"\"\"\n    Refresh the specified feed and return a JSON response 
indicating the result.\n    \"\"\"\n    if hasattr(g, \"current_user\") and g.current_user:\n        update_user_last_active(g.current_user.id)\n\n    feed = Feed.query.get_or_404(f_id)\n    feed_title = feed.title\n    app = cast(Any, current_app)._get_current_object()\n\n    Thread(\n        target=_refresh_feed_background,\n        args=(app, f_id),\n        daemon=True,\n        name=f\"feed-refresh-{f_id}\",\n    ).start()\n\n    return (\n        jsonify(\n            {\n                \"status\": \"accepted\",\n                \"message\": f'Feed \"{feed_title}\" refresh queued for processing',\n            }\n        ),\n        202,\n    )\n\n\n@feed_bp.route(\"/api/feeds/<int:feed_id>/settings\", methods=[\"PATCH\"])\ndef update_feed_settings_endpoint(feed_id: int) -> ResponseReturnValue:\n    _, error_response = require_admin(\"update feed settings\")\n    if error_response is not None:\n        return error_response\n\n    payload = request.get_json(silent=True) or {}\n    if \"auto_whitelist_new_episodes_override\" not in payload:\n        return jsonify({\"error\": \"No settings provided.\"}), 400\n\n    override = payload.get(\"auto_whitelist_new_episodes_override\")\n    if override is not None and not isinstance(override, bool):\n        return (\n            jsonify(\n                {\n                    \"error\": \"auto_whitelist_new_episodes_override must be a boolean or null.\"\n                }\n            ),\n            400,\n        )\n\n    result = writer_client.action(\n        \"update_feed_settings\",\n        {\"feed_id\": feed_id, \"auto_whitelist_new_episodes_override\": override},\n        wait=True,\n    )\n    if result is None or not result.success:\n        return (\n            jsonify({\"error\": getattr(result, \"error\", \"Failed to update feed\")}),\n            500,\n        )\n\n    feed = db.session.get(Feed, feed_id)\n    if feed is None:\n        return jsonify({\"error\": \"Feed not found\"}), 404\n\n    return 
jsonify(_serialize_feed(feed, current_user=getattr(g, \"current_user\", None)))\n\n\ndef _refresh_feed_background(app: Flask, feed_id: int) -> None:\n    with app.app_context():\n        feed = db.session.get(Feed, feed_id)\n        if not feed:\n            logger.warning(\"Feed %s disappeared before refresh could run\", feed_id)\n            return\n\n        try:\n            refresh_feed(feed)\n            get_jobs_manager().enqueue_pending_jobs(\n                trigger=\"feed_refresh\", context={\"feed_id\": feed_id}\n            )\n        except Exception as exc:  # pylint: disable=broad-except\n            logger.error(\"Failed to refresh feed %s asynchronously: %s\", feed_id, exc)\n\n\n@feed_bp.route(\"/api/feeds/refresh-all\", methods=[\"POST\"])\ndef refresh_all_feeds_endpoint() -> Response:\n    \"\"\"Trigger a refresh for all feeds and enqueue pending jobs.\"\"\"\n    if hasattr(g, \"current_user\") and g.current_user:\n        update_user_last_active(g.current_user.id)\n\n    result = get_jobs_manager().start_refresh_all_feeds(trigger=\"manual_refresh\")\n    feed_count = Feed.query.count()\n    return jsonify(\n        {\n            \"status\": \"success\",\n            \"feeds_refreshed\": feed_count,\n            \"jobs_enqueued\": result.get(\"enqueued\", 0),\n        }\n    )\n\n\ndef _enqueue_pending_jobs_async(app: Flask) -> None:\n    with app.app_context():\n        try:\n            get_jobs_manager().enqueue_pending_jobs(trigger=\"feed_refresh\")\n        except Exception as exc:  # pylint: disable=broad-except\n            logger.error(\"Failed to enqueue pending jobs asynchronously: %s\", exc)\n\n\ndef _cleanup_feed_directories(feed: Feed) -> None:\n    \"\"\"\n    Clean up directory structures for a feed in both in/ and srv/ directories.\n\n    Args:\n        feed: The Feed object being deleted\n    \"\"\"\n    # Clean up srv/ directory (processed audio)\n    # srv/{sanitized_feed_title}/\n    sanitized_feed_title = 
sanitize_title(feed.title)\n    # Use the same sanitization logic as in processing_paths.py\n    sanitized_feed_title = re.sub(\n        r\"[^a-zA-Z0-9\\s_.-]\", \"\", sanitized_feed_title\n    ).strip()\n    sanitized_feed_title = sanitized_feed_title.rstrip(\".\")\n    sanitized_feed_title = re.sub(r\"\\s+\", \"_\", sanitized_feed_title)\n\n    srv_feed_dir = get_srv_root() / sanitized_feed_title\n    if srv_feed_dir.exists() and srv_feed_dir.is_dir():\n        try:\n            # Remove all files in the directory first\n            for file_path in srv_feed_dir.iterdir():\n                if file_path.is_file():\n                    file_path.unlink()\n                    logger.info(f\"Deleted processed audio file: {file_path}\")\n            # Remove the directory itself\n            srv_feed_dir.rmdir()\n            logger.info(f\"Deleted processed audio directory: {srv_feed_dir}\")\n        except Exception as e:  # pylint: disable=broad-except\n            logger.error(\n                f\"Error deleting processed audio directory {srv_feed_dir}: {e}\"\n            )\n\n    # Clean up in/ directories (unprocessed audio)\n    # in/{sanitized_post_title}/\n    for post in feed.posts:  # type: ignore[attr-defined]\n        sanitized_post_title = sanitize_title(post.title)\n        in_post_dir = get_in_root() / sanitized_post_title\n        if in_post_dir.exists() and in_post_dir.is_dir():\n            try:\n                # Remove all files in the directory first\n                for file_path in in_post_dir.iterdir():\n                    if file_path.is_file():\n                        file_path.unlink()\n                        logger.info(f\"Deleted unprocessed audio file: {file_path}\")\n                # Remove the directory itself\n                in_post_dir.rmdir()\n                logger.info(f\"Deleted unprocessed audio directory: {in_post_dir}\")\n            except Exception as e:  # pylint: disable=broad-except\n                logger.error(\n    
                f\"Error deleting unprocessed audio directory {in_post_dir}: {e}\"\n                )\n\n\n@feed_bp.route(\"/<path:something_or_rss>\", methods=[\"GET\"])\ndef get_feed_by_alt_or_url(something_or_rss: str) -> Response:\n    # first try to serve ANY static file matching the path\n    if current_app.static_folder is not None:\n        # Use Flask's safe helper to prevent directory traversal outside static_folder\n        try:\n            return send_from_directory(current_app.static_folder, something_or_rss)\n        except Exception:\n            # Not a valid static file; fall through to RSS/DB lookup\n            pass\n    feed = Feed.query.filter_by(rss_url=something_or_rss).first()\n    if feed:\n        xml_content = generate_feed_xml(feed)\n        response = make_response(xml_content)\n        response.headers[\"Content-Type\"] = \"application/rss+xml\"\n        return response\n\n    return make_response((\"Feed not found\", 404))\n\n\n@feed_bp.route(\"/feeds\", methods=[\"GET\"])\ndef api_feeds() -> ResponseReturnValue:\n    settings = current_app.config.get(\"AUTH_SETTINGS\")\n    if settings and settings.require_auth:\n        user, error = _require_user_or_error()\n        if error:\n            return error\n        if user and user.role != \"admin\":\n            feeds = (\n                Feed.query.join(UserFeed, UserFeed.feed_id == Feed.id)\n                .filter(UserFeed.user_id == user.id)\n                .all()\n            )\n            # Hack: Always include Feed 1\n            feed_1 = Feed.query.get(1)\n            if feed_1 and feed_1 not in feeds:\n                feeds.append(feed_1)\n        else:\n            feeds = Feed.query.all()\n        current_user = user\n    else:\n        feeds = Feed.query.all()\n        current_user = getattr(g, \"current_user\", None)\n\n    feeds_data = [_serialize_feed(feed, current_user=current_user) for feed in feeds]\n    return 
jsonify(feeds_data)\n\n\n@feed_bp.route(\"/api/feeds/<int:feed_id>/join\", methods=[\"POST\"])\ndef api_join_feed(feed_id: int) -> ResponseReturnValue:\n    user, error = _require_user_or_error()\n    if error:\n        return error\n    if user is None:\n        return jsonify({\"error\": \"Authentication required.\"}), 401\n\n    feed = Feed.query.get_or_404(feed_id)\n    existing_membership = UserFeed.query.filter_by(\n        feed_id=feed.id, user_id=user.id\n    ).first()\n    if user.role != \"admin\":\n        # Use manual allowance if set, otherwise fall back to plan allowance\n        allowance = user.manual_feed_allowance\n        if allowance is None:\n            allowance = getattr(user, \"feed_allowance\", 0) or 0\n\n        at_capacity = allowance > 0 and _user_feed_count(user.id) >= allowance\n        missing_membership = existing_membership is None\n        if at_capacity and missing_membership:\n            return (\n                jsonify(\n                    {\n                        \"error\": \"FEED_LIMIT_REACHED\",\n                        \"message\": f\"Your plan allows {allowance} feeds. 
Increase your plan to add more.\",\n                        \"feeds_in_use\": _user_feed_count(user.id),\n                        \"feed_allowance\": allowance,\n                    }\n                ),\n                402,\n            )\n    if existing_membership:\n        refreshed = Feed.query.get(feed_id)\n        return jsonify(_serialize_feed(refreshed or feed, current_user=user)), 200\n\n    created, previous_count = _ensure_user_feed_membership(\n        feed, getattr(user, \"id\", None)\n    )\n    if created and previous_count == 0:\n        _whitelist_latest_for_first_member(feed, getattr(user, \"id\", None))\n    refreshed = Feed.query.get(feed_id)\n    return (\n        jsonify(_serialize_feed(refreshed or feed, current_user=user)),\n        200,\n    )\n\n\n@feed_bp.route(\"/api/feeds/<int:feed_id>/exit\", methods=[\"POST\"])\ndef api_exit_feed(feed_id: int) -> ResponseReturnValue:\n    user, error = _require_user_or_error()\n    if error:\n        return error\n    if user is None:\n        return jsonify({\"error\": \"Authentication required.\"}), 401\n\n    feed = Feed.query.get_or_404(feed_id)\n    writer_client.action(\n        \"remove_user_feed_membership\",\n        {\"feed_id\": feed.id, \"user_id\": user.id},\n        wait=True,\n    )\n    refreshed = Feed.query.get(feed_id)\n    return (\n        jsonify(_serialize_feed(refreshed or feed, current_user=user)),\n        200,\n    )\n\n\n@feed_bp.route(\"/api/feeds/<int:feed_id>/leave\", methods=[\"POST\"])\ndef api_leave_feed(feed_id: int) -> ResponseReturnValue:\n    \"\"\"Remove current user membership; hide from their view.\"\"\"\n    user, error = _require_user_or_error()\n    if error:\n        return error\n    if user is None:\n        return jsonify({\"error\": \"Authentication required.\"}), 401\n\n    feed = Feed.query.get_or_404(feed_id)\n    writer_client.action(\n        \"remove_user_feed_membership\",\n        {\"feed_id\": feed.id, \"user_id\": user.id},\n        
wait=True,\n    )\n    return jsonify({\"status\": \"ok\", \"feed_id\": feed.id})\n\n\n@feed_bp.route(\"/feed/user/<int:user_id>\", methods=[\"GET\"])\ndef get_user_aggregate_feed(user_id: int) -> Response:\n    \"\"\"Serve the aggregate RSS feed for a specific user.\"\"\"\n    # Auth check is handled by middleware via feed_token\n    # If auth is disabled, this is public.\n    # If auth is enabled, middleware ensures we have a valid token for this user_id.\n\n    if is_auth_enabled():\n        current = getattr(g, \"current_user\", None)\n        if current is None:\n            return make_response((\"Authentication required\", 401))\n        if current.role != \"admin\" and current.id != user_id:\n            return make_response((\"Forbidden\", 403))\n\n    user = db.session.get(User, user_id)\n    if not user:\n        if user_id == 0 and not is_auth_enabled():\n            # Support anonymous aggregate feed when auth is disabled\n            xml_content = generate_aggregate_feed_xml(None)\n            response = make_response(xml_content)\n            response.headers[\"Content-Type\"] = \"application/rss+xml\"\n            return response\n        return make_response((\"User not found\", 404))\n\n    xml_content = generate_aggregate_feed_xml(user)\n    response = make_response(xml_content)\n    response.headers[\"Content-Type\"] = \"application/rss+xml\"\n    return response\n\n\n@feed_bp.route(\"/feed/aggregate\", methods=[\"GET\"])\ndef get_aggregate_feed_redirect() -> ResponseReturnValue:\n    \"\"\"Convenience endpoint to redirect to the user's aggregate feed.\"\"\"\n    settings = current_app.config.get(\"AUTH_SETTINGS\")\n\n    # Case 1: Auth Disabled -> Redirect to Admin User (or ID 0 if none exist)\n    if not settings or not settings.require_auth:\n        admin = User.query.filter_by(role=\"admin\").first()\n        user_id = admin.id if admin else 0\n        return redirect(url_for(\"feed.get_user_aggregate_feed\", user_id=user_id))\n\n    # Case 
2: Auth Enabled -> Require explicit user link\n    # Podcast players can't supply a session, so we can't identify the user\n    # without a token. If a browser session is present, redirect to that\n    # user's feed; otherwise ask the caller to use their unique link.\n\n    current = getattr(g, \"current_user\", None)\n    if current:\n        return redirect(url_for(\"feed.get_user_aggregate_feed\", user_id=current.id))\n\n    return (\n        jsonify(\n            {\n                \"error\": \"Authentication required\",\n                \"message\": \"Please use your unique aggregate feed URL from the dashboard.\",\n            }\n        ),\n        401,\n    )\n\n\n@feed_bp.route(\"/api/user/aggregate-link\", methods=[\"POST\"])\ndef create_aggregate_feed_link() -> ResponseReturnValue:\n    \"\"\"Generate a unique RSS link for the current user's aggregate feed.\"\"\"\n    settings = current_app.config.get(\"AUTH_SETTINGS\")\n\n    user = None\n    if not settings or not settings.require_auth:\n        # Auth disabled: Use admin user or first available user\n        user = User.query.filter_by(role=\"admin\").first()\n        if not user:\n            user = User.query.first()\n\n        if not user:\n            # Create a default admin user if none exists\n            default_username = \"admin\"\n            default_password = secrets.token_urlsafe(16)\n\n            result = writer_client.action(\n                \"create_user\",\n                {\n                    \"username\": default_username,\n                    \"password\": default_password,\n                    \"role\": \"admin\",\n                },\n                wait=True,\n            )\n            if result and result.success and isinstance(result.data, dict):\n                user_id = result.data.get(\"user_id\")\n                if user_id:\n                    user = db.session.get(User, user_id)\n\n            if not user:\n                return (\n                    
jsonify({\"error\": \"No user found and failed to create one.\"}),\n                    500,\n                )\n    else:\n        user, error = _require_user_or_error()\n        if error:\n            return error\n\n    if user is None:\n        return jsonify({\"error\": \"Authentication required.\"}), 401\n\n    # Create a token with feed_id=None (Aggregate Token)\n    result = writer_client.action(\n        \"create_feed_access_token\",\n        {\"user_id\": user.id, \"feed_id\": None},\n        wait=True,\n    )\n    if not result or not result.success or not isinstance(result.data, dict):\n        return jsonify({\"error\": \"Failed to create aggregate feed token\"}), 500\n\n    token_id = str(result.data[\"token_id\"])\n    secret = str(result.data[\"secret\"])\n\n    parsed = urlparse(request.host_url)\n    netloc = parsed.netloc\n    scheme = parsed.scheme\n    path = f\"/feed/user/{user.id}\"\n\n    # Include the token params only when auth is enabled; in single-user mode\n    # the plain URL works, and the token is still returned in the JSON payload\n    # so the link can be rebuilt if auth is enabled later.\n    if settings and settings.require_auth:\n        query = urlencode({\"feed_token\": token_id, \"feed_secret\": secret})\n    else:\n        query = \"\"\n\n    full_url = urlunparse((scheme, netloc, path, \"\", query, \"\"))\n\n    return (\n        jsonify(\n            {\n                \"url\": full_url,\n                \"feed_token\": token_id,\n                \"feed_secret\": secret,\n            }\n        ),\n        201,\n    )\n\n\ndef _require_user_or_error(\n    allow_missing_auth: bool = False,\n) -> tuple[User | None, ResponseReturnValue | None]:\n    settings = current_app.config.get(\"AUTH_SETTINGS\")\n    if not settings or not settings.require_auth:\n        if allow_missing_auth:\n            return None, None\n        return None, (jsonify({\"error\": \"Authentication 
is disabled.\"}), 404)\n\n    current = getattr(g, \"current_user\", None)\n    if current is None:\n        return None, (jsonify({\"error\": \"Authentication required.\"}), 401)\n\n    user = _auth_get_user()\n    if user is None:\n        return None, (jsonify({\"error\": \"User not found.\"}), 404)\n\n    return user, None\n\n\ndef _serialize_feed(\n    feed: Feed,\n    *,\n    current_user: Optional[User] = None,\n) -> dict[str, Any]:\n    auth_enabled = is_auth_enabled()\n    member_ids = [membership.user_id for membership in getattr(feed, \"user_feeds\", [])]\n\n    # In no-auth mode, everyone is functionally a member.\n    is_member = not auth_enabled or bool(\n        current_user and getattr(current_user, \"id\", None) in member_ids\n    )\n\n    # Hack: Always treat Feed 1 as a member\n    if feed.id == 1 and (current_user or not auth_enabled):\n        is_member = True\n\n    is_active_subscription = False\n    if is_member:\n        if current_user:\n            is_active_subscription = is_feed_active_for_user(feed.id, current_user)\n        elif not auth_enabled:\n            is_active_subscription = True\n\n    feed_payload = {\n        \"id\": feed.id,\n        \"title\": feed.title,\n        \"rss_url\": feed.rss_url,\n        \"description\": feed.description,\n        \"author\": feed.author,\n        \"image_url\": feed.image_url,\n        \"auto_whitelist_new_episodes_override\": getattr(\n            feed, \"auto_whitelist_new_episodes_override\", None\n        ),\n        \"posts_count\": len(feed.posts),\n        \"member_count\": len(member_ids),\n        \"is_member\": is_member,\n        \"is_active_subscription\": is_active_subscription,\n    }\n    return feed_payload\n"
  },
  {
    "path": "src/app/routes/jobs_routes.py",
    "content": "import logging\n\nimport flask\nfrom flask import Blueprint, request\nfrom flask.typing import ResponseReturnValue\n\nfrom app.extensions import db\nfrom app.jobs_manager import get_jobs_manager\nfrom app.jobs_manager_run_service import build_run_status_snapshot\nfrom app.post_cleanup import cleanup_processed_posts, count_cleanup_candidates\nfrom app.runtime_config import config as runtime_config\n\nlogger = logging.getLogger(\"global_logger\")\n\n\njobs_bp = Blueprint(\"jobs\", __name__)\n\n\n@jobs_bp.route(\"/api/jobs/active\", methods=[\"GET\"])\ndef api_list_active_jobs() -> ResponseReturnValue:\n    try:\n        limit = int(request.args.get(\"limit\", \"100\"))\n    except ValueError:\n        limit = 100\n    result = get_jobs_manager().list_active_jobs(limit=limit)\n    return flask.jsonify(result)\n\n\n@jobs_bp.route(\"/api/jobs/all\", methods=[\"GET\"])\ndef api_list_all_jobs() -> ResponseReturnValue:\n    try:\n        limit = int(request.args.get(\"limit\", \"100\"))\n    except ValueError:\n        limit = 100\n    result = get_jobs_manager().list_all_jobs_detailed(limit=limit)\n    return flask.jsonify(result)\n\n\n@jobs_bp.route(\"/api/job-manager/status\", methods=[\"GET\"])\ndef api_job_manager_status() -> ResponseReturnValue:\n    run_snapshot = build_run_status_snapshot(db.session)\n    return flask.jsonify({\"run\": run_snapshot})\n\n\n@jobs_bp.route(\"/api/jobs/<string:job_id>/cancel\", methods=[\"POST\"])\ndef api_cancel_job(job_id: str) -> ResponseReturnValue:\n    try:\n        result = get_jobs_manager().cancel_job(job_id)\n        status_code = (\n            200\n            if result.get(\"status\") == \"cancelled\"\n            else (404 if result.get(\"error_code\") == \"NOT_FOUND\" else 400)\n        )\n\n        db.session.expire_all()\n\n        return flask.jsonify(result), status_code\n    except Exception as e:\n        logger.error(f\"Failed to cancel job {job_id}: {e}\")\n        return (\n            
flask.jsonify(\n                {\n                    \"status\": \"error\",\n                    \"error_code\": \"CANCEL_FAILED\",\n                    \"message\": f\"Failed to cancel job: {str(e)}\",\n                }\n            ),\n            500,\n        )\n\n\n@jobs_bp.route(\"/api/jobs/cleanup/preview\", methods=[\"GET\"])\ndef api_cleanup_preview() -> ResponseReturnValue:\n    retention = getattr(runtime_config, \"post_cleanup_retention_days\", None)\n    count, cutoff = count_cleanup_candidates(retention)\n    return flask.jsonify(\n        {\n            \"count\": count,\n            \"retention_days\": retention,\n            \"cutoff_utc\": cutoff.isoformat() if cutoff else None,\n        }\n    )\n\n\n@jobs_bp.route(\"/api/jobs/cleanup/run\", methods=[\"POST\"])\ndef api_run_cleanup() -> ResponseReturnValue:\n    retention = getattr(runtime_config, \"post_cleanup_retention_days\", None)\n    if retention is None or retention <= 0:\n        return flask.jsonify(\n            {\n                \"status\": \"disabled\",\n                \"message\": \"Cleanup is disabled because retention_days <= 0.\",\n            }\n        )\n\n    try:\n        removed = cleanup_processed_posts(retention)\n        remaining, cutoff = count_cleanup_candidates(retention)\n    except Exception as exc:  # pylint: disable=broad-except\n        logger.error(\"Manual cleanup failed: %s\", exc, exc_info=True)\n        return (\n            flask.jsonify(\n                {\n                    \"status\": \"error\",\n                    \"message\": \"Cleanup job failed. Check server logs for details.\",\n                }\n            ),\n            500,\n        )\n\n    return flask.jsonify(\n        {\n            \"status\": \"ok\",\n            \"removed_posts\": removed,\n            \"remaining_candidates\": remaining,\n            \"retention_days\": retention,\n            \"cutoff_utc\": cutoff.isoformat() if cutoff else None,\n        }\n    )\n"
  },
  {
    "path": "src/app/routes/main_routes.py",
    "content": "import logging\nimport os\n\nimport flask\nfrom flask import Blueprint, send_from_directory\n\nfrom app.auth.guards import require_admin\nfrom app.extensions import db\nfrom app.models import Feed, Post, User\nfrom app.runtime_config import config\nfrom app.writer.client import writer_client\n\nlogger = logging.getLogger(\"global_logger\")\n\n\nmain_bp = Blueprint(\"main\", __name__)\n\n\n@main_bp.route(\"/\")\ndef index() -> flask.Response:\n    \"\"\"Serve the React app's index.html.\"\"\"\n    static_folder = flask.current_app.static_folder\n    if static_folder and os.path.exists(os.path.join(static_folder, \"index.html\")):\n        return send_from_directory(static_folder, \"index.html\")\n\n    feeds = Feed.query.all()\n    return flask.make_response(\n        flask.render_template(\"index.html\", feeds=feeds, config=config), 200\n    )\n\n\n@main_bp.route(\"/api/landing/status\", methods=[\"GET\"])\ndef landing_status() -> flask.Response:\n    \"\"\"Public landing-page status with user counts and limits.\n\n    Intended for the unauthenticated landing page; returns current user count\n    and configured total limit (if any) so the UI can show remaining slots.\n    \"\"\"\n\n    require_auth = False\n    landing_enabled = False\n\n    try:\n        settings = flask.current_app.config.get(\"AUTH_SETTINGS\")\n        require_auth = bool(settings and settings.require_auth)\n    except Exception:  # pragma: no cover - defensive\n        require_auth = False\n\n    try:\n        landing_enabled = bool(getattr(config, \"enable_public_landing_page\", False))\n    except Exception:  # pragma: no cover - defensive\n        landing_enabled = False\n\n    try:\n        user_count = int(User.query.count())\n    except Exception:  # pragma: no cover - defensive\n        user_count = 0\n\n    limit_raw = getattr(config, \"user_limit_total\", None)\n    try:\n        user_limit_total = int(limit_raw) if 
limit_raw is not None else None\n    except Exception:  # pragma: no cover - defensive\n        user_limit_total = None\n\n    slots_remaining = None\n    if user_limit_total is not None:\n        slots_remaining = max(user_limit_total - user_count, 0)\n\n    return flask.jsonify(\n        {\n            \"require_auth\": require_auth,\n            \"landing_page_enabled\": landing_enabled,\n            \"user_count\": user_count,\n            \"user_limit_total\": user_limit_total,\n            \"slots_remaining\": slots_remaining,\n        }\n    )\n\n\n@main_bp.route(\"/<path:path>\")\ndef catch_all(path: str) -> flask.Response:\n    \"\"\"Serve React app for all frontend routes, or serve static files.\"\"\"\n    # Don't handle API routes - let them be handled by API blueprint\n    if path.startswith(\"api/\"):\n        flask.abort(404)\n\n    static_folder = flask.current_app.static_folder\n    if static_folder:\n        # First try to serve a static file if it exists\n        static_file_path = os.path.join(static_folder, path)\n        if os.path.exists(static_file_path) and os.path.isfile(static_file_path):\n            return send_from_directory(static_folder, path)\n\n        # If it's not a static file and index.html exists, serve the React app\n        if os.path.exists(os.path.join(static_folder, \"index.html\")):\n            return send_from_directory(static_folder, \"index.html\")\n\n    # Fallback to 404\n    flask.abort(404)\n\n\n@main_bp.route(\"/feed/<int:f_id>/toggle-whitelist-all/<val>\", methods=[\"POST\"])\ndef whitelist_all(f_id: int, val: str) -> flask.Response:\n    _, error_response = require_admin(\"toggle whitelist for all posts\")\n    if error_response:\n        return error_response\n\n    feed = Feed.query.get_or_404(f_id)\n    new_status = val.lower() == \"true\"\n    try:\n        result = writer_client.action(\n            \"toggle_whitelist_all_for_feed\",\n            {\"feed_id\": feed.id, \"new_status\": new_status},\n        
    wait=True,\n        )\n        if not result or not result.success:\n            raise RuntimeError(getattr(result, \"error\", \"Unknown writer error\"))\n    except Exception:  # pylint: disable=broad-except\n        return flask.make_response(\n            (\n                flask.jsonify(\n                    {\n                        \"error\": \"Database busy, please retry\",\n                        \"retry_after_seconds\": 1,\n                    }\n                ),\n                503,\n            )\n        )\n    return flask.make_response(\"\", 200)\n\n\n@main_bp.route(\"/set_whitelist/<string:p_guid>/<val>\", methods=[\"GET\"])\ndef set_whitelist(p_guid: str, val: str) -> flask.Response:\n    logger.info(f\"Setting whitelist status for post with GUID: {p_guid} to {val}\")\n    post = Post.query.filter_by(guid=p_guid).first()\n    if post is None:\n        return flask.make_response((\"Post not found\", 404))\n\n    new_status = val.lower() == \"true\"\n    try:\n        result = writer_client.update(\n            \"Post\", post.id, {\"whitelisted\": new_status}, wait=True\n        )\n        if not result or not result.success:\n            raise RuntimeError(getattr(result, \"error\", \"Unknown writer error\"))\n        db.session.expire(post)\n    except Exception:  # pylint: disable=broad-except\n        return flask.make_response(\n            (\n                flask.jsonify(\n                    {\n                        \"error\": \"Database busy, please retry\",\n                        \"retry_after_seconds\": 1,\n                    }\n                ),\n                503,\n            )\n        )\n\n    return index()\n"
  },
  {
    "path": "src/app/routes/post_routes.py",
    "content": "import logging\nimport math\nimport os\nfrom pathlib import Path\nfrom typing import Any, Dict, Optional, cast\n\nimport flask\nfrom flask import Blueprint, g, jsonify, request, send_file\nfrom flask.typing import ResponseReturnValue\n\nfrom app.auth.guards import require_admin\nfrom app.auth.service import update_user_last_active\nfrom app.extensions import db\nfrom app.jobs_manager import get_jobs_manager\nfrom app.models import (\n    Feed,\n    Identification,\n    ModelCall,\n    Post,\n    TranscriptSegment,\n)\nfrom app.posts import clear_post_processing_data\nfrom app.routes.post_stats_utils import (\n    count_model_calls,\n    is_mixed_segment,\n    parse_refined_windows,\n)\nfrom app.runtime_config import config as runtime_config\nfrom app.writer.client import writer_client\n\nlogger = logging.getLogger(\"global_logger\")\n\n\npost_bp = Blueprint(\"post\", __name__)\n\n\ndef _is_latest_post(feed: Feed, post: Post) -> bool:\n    \"\"\"Return True if the post is the latest by release_date (fallback to id).\"\"\"\n    latest = (\n        Post.query.filter_by(feed_id=feed.id)\n        .order_by(Post.release_date.desc().nullslast(), Post.id.desc())\n        .first()\n    )\n    return bool(latest and latest.id == post.id)\n\n\ndef _increment_download_count(post: Post) -> None:\n    \"\"\"Safely increment the download counter for a post.\"\"\"\n    try:\n        writer_client.action(\n            \"increment_download_count\", {\"post_id\": post.id}, wait=False\n        )\n    except Exception as e:  # pylint: disable=broad-except\n        logger.error(f\"Failed to increment download count for post {post.guid}: {e}\")\n\n\ndef _ensure_whitelisted_for_download(\n    post: Post, p_guid: str\n) -> Optional[flask.Response]:\n    \"\"\"Make sure a post is whitelisted before serving or queuing processing.\"\"\"\n    if post.whitelisted:\n        return None\n\n    if not getattr(runtime_config, \"autoprocess_on_download\", False):\n        
logger.warning(\n            \"Post %s not whitelisted and auto-process is disabled\", post.guid\n        )\n        return flask.make_response((\"Post not whitelisted\", 403))\n\n    try:\n        writer_client.action(\n            \"whitelist_post\",\n            {\"post_id\": post.id},\n            wait=True,\n        )\n        post.whitelisted = True\n        logger.info(\"Auto-whitelisted post %s on download request\", p_guid)\n        return None\n    except Exception as exc:  # pylint: disable=broad-except\n        logger.warning(\n            \"Failed to auto-whitelist post %s on download: %s\", post.guid, exc\n        )\n        return flask.make_response((\"Post not whitelisted\", 403))\n\n\ndef _missing_processed_audio_response(post: Post, p_guid: str) -> flask.Response:\n    \"\"\"Return a response when processed audio is missing, optionally queueing work.\"\"\"\n    if not getattr(runtime_config, \"autoprocess_on_download\", False):\n        logger.warning(\"Processed audio not found for post: %s\", post.id)\n        return flask.make_response((\"Processed audio not found\", 404))\n\n    logger.info(\n        \"Auto-processing on download is enabled; queuing processing for %s\",\n        p_guid,\n    )\n    requester = getattr(getattr(g, \"current_user\", None), \"id\", None)\n    job_response = get_jobs_manager().start_post_processing(\n        p_guid,\n        priority=\"download\",\n        requested_by_user_id=requester,\n        billing_user_id=requester,\n    )\n    status = cast(Optional[str], job_response.get(\"status\"))\n    status_code = {\n        \"completed\": 200,\n        \"skipped\": 200,\n        \"error\": 400,\n        \"running\": 202,\n        \"started\": 202,\n    }.get(status or \"pending\", 202)\n    message = job_response.get(\n        \"message\",\n        \"Processing queued because audio was not ready for download\",\n    )\n    return flask.make_response(\n        flask.jsonify({**job_response, \"message\": message}),\n  
      status_code,\n    )\n\n\n@post_bp.route(\"/api/feeds/<int:feed_id>/posts\", methods=[\"GET\"])\ndef api_feed_posts(feed_id: int) -> flask.Response:\n    \"\"\"Return a paginated JSON list of posts for a specific feed.\"\"\"\n\n    # Ensure we have fresh data\n    db.session.expire_all()\n\n    feed = Feed.query.get_or_404(feed_id)\n\n    # Pagination and filtering\n    try:\n        page = int(request.args.get(\"page\", 1))\n    except (TypeError, ValueError):\n        page = 1\n    page = max(page, 1)\n\n    try:\n        page_size = int(request.args.get(\"page_size\", 25))\n    except (TypeError, ValueError):\n        page_size = 25\n    page_size = max(1, min(page_size, 200))\n\n    whitelisted_only = str(request.args.get(\"whitelisted_only\", \"false\")).lower() in {\n        \"1\",\n        \"true\",\n        \"yes\",\n        \"on\",\n    }\n\n    # Query posts directly to avoid stale relationship cache\n    base_query = Post.query.filter_by(feed_id=feed.id)\n    if whitelisted_only:\n        base_query = base_query.filter_by(whitelisted=True)\n\n    ordered_query = base_query.order_by(\n        Post.release_date.desc().nullslast(), Post.id.desc()\n    )\n\n    total_posts = ordered_query.count()\n    whitelisted_total = Post.query.filter_by(feed_id=feed.id, whitelisted=True).count()\n\n    db_posts = ordered_query.offset((page - 1) * page_size).limit(page_size).all()\n\n    posts = [\n        {\n            \"id\": post.id,\n            \"guid\": post.guid,\n            \"title\": post.title,\n            \"description\": post.description,\n            \"release_date\": (\n                post.release_date.isoformat() if post.release_date else None\n            ),\n            \"duration\": post.duration,\n            \"whitelisted\": post.whitelisted,\n            \"has_processed_audio\": post.processed_audio_path is not None,\n            \"has_unprocessed_audio\": post.unprocessed_audio_path is not None,\n            \"download_url\": 
post.download_url,\n            \"image_url\": post.image_url,\n            \"download_count\": post.download_count,\n        }\n        for post in db_posts\n    ]\n\n    total_pages = math.ceil(total_posts / page_size) if total_posts else 0\n\n    return flask.jsonify(\n        {\n            \"items\": posts,\n            \"page\": page,\n            \"page_size\": page_size,\n            \"total\": total_posts,\n            \"total_pages\": total_pages,\n            \"whitelisted_total\": whitelisted_total,\n        }\n    )\n\n\n@post_bp.route(\"/api/posts/<string:p_guid>/processing-estimate\", methods=[\"GET\"])\ndef api_post_processing_estimate(p_guid: str) -> ResponseReturnValue:\n    post = Post.query.filter_by(guid=p_guid).first()\n    if post is None:\n        return flask.make_response(flask.jsonify({\"error\": \"Post not found\"}), 404)\n\n    feed = db.session.get(Feed, post.feed_id)\n    if feed is None:\n        return flask.make_response(flask.jsonify({\"error\": \"Feed not found\"}), 404)\n\n    _, error = require_admin(\"estimate processing costs\")\n    if error:\n        return error\n\n    minutes = max(1.0, float(post.duration or 0) / 60.0) if post.duration else 60.0\n\n    return flask.jsonify(\n        {\n            \"post_guid\": post.guid,\n            \"estimated_minutes\": minutes,\n            \"can_process\": True,\n            \"reason\": None,\n        }\n    )\n\n\n@post_bp.route(\"/post/<string:p_guid>/json\", methods=[\"GET\"])\ndef get_post_json(p_guid: str) -> flask.Response:\n    logger.info(f\"API request for post details with GUID: {p_guid}\")\n    post = Post.query.filter_by(guid=p_guid).first()\n    if post is None:\n        return flask.make_response(jsonify({\"error\": \"Post not found\"}), 404)\n\n    segment_count = post.segments.count()\n    transcript_segments = []\n\n    if segment_count > 0:\n        sample_segments = post.segments.limit(5).all()\n        for segment in sample_segments:\n            
transcript_segments.append(\n                {\n                    \"id\": segment.id,\n                    \"sequence_num\": segment.sequence_num,\n                    \"start_time\": segment.start_time,\n                    \"end_time\": segment.end_time,\n                    \"text\": (\n                        segment.text[:100] + \"...\"\n                        if len(segment.text) > 100\n                        else segment.text\n                    ),\n                }\n            )\n\n    whisper_model_calls = []\n    for model_call in post.model_calls.filter(\n        ModelCall.model_name.like(\"%whisper%\")\n    ).all():\n        whisper_model_calls.append(\n            {\n                \"id\": model_call.id,\n                \"model_name\": model_call.model_name,\n                \"status\": model_call.status,\n                \"first_segment\": model_call.first_segment_sequence_num,\n                \"last_segment\": model_call.last_segment_sequence_num,\n                \"timestamp\": (\n                    model_call.timestamp.isoformat() if model_call.timestamp else None\n                ),\n                \"response\": (\n                    model_call.response[:100] + \"...\"\n                    if model_call.response and len(model_call.response) > 100\n                    else model_call.response\n                ),\n                \"error\": model_call.error_message,\n            }\n        )\n\n    post_data = {\n        \"id\": post.id,\n        \"guid\": post.guid,\n        \"title\": post.title,\n        \"feed_id\": post.feed_id,\n        \"unprocessed_audio_path\": post.unprocessed_audio_path,\n        \"processed_audio_path\": post.processed_audio_path,\n        \"has_unprocessed_audio\": post.unprocessed_audio_path is not None,\n        \"has_processed_audio\": post.processed_audio_path is not None,\n        \"transcript_segment_count\": segment_count,\n        \"transcript_sample\": transcript_segments,\n        
\"model_call_count\": post.model_calls.count(),\n        \"whisper_model_calls\": whisper_model_calls,\n        \"whitelisted\": post.whitelisted,\n        \"download_count\": post.download_count,\n    }\n\n    return flask.jsonify(post_data)\n\n\n@post_bp.route(\"/post/<string:p_guid>/debug\", methods=[\"GET\"])\ndef post_debug(p_guid: str) -> flask.Response:\n    \"\"\"Debug view for a post, showing model calls, transcript segments, and identifications.\"\"\"\n    post = Post.query.filter_by(guid=p_guid).first()\n    if post is None:\n        return flask.make_response((\"Post not found\", 404))\n\n    model_calls = (\n        ModelCall.query.filter_by(post_id=post.id)\n        .order_by(ModelCall.model_name, ModelCall.first_segment_sequence_num)\n        .all()\n    )\n\n    transcript_segments = post.segments.all()\n\n    identifications = (\n        Identification.query.join(TranscriptSegment)\n        .filter(TranscriptSegment.post_id == post.id)\n        .order_by(TranscriptSegment.sequence_num)\n        .all()\n    )\n\n    model_call_statuses, model_types = count_model_calls(model_calls)\n\n    content_segments = sum(1 for i in identifications if i.label == \"content\")\n    ad_segments = sum(1 for i in identifications if i.label == \"ad\")\n\n    stats = {\n        \"total_segments\": len(transcript_segments),\n        \"total_model_calls\": len(model_calls),\n        \"total_identifications\": len(identifications),\n        \"content_segments\": content_segments,\n        \"ad_segments_count\": ad_segments,\n        \"model_call_statuses\": model_call_statuses,\n        \"model_types\": model_types,\n        \"download_count\": post.download_count,\n    }\n\n    return flask.make_response(\n        flask.render_template(\n            \"post_debug.html\",\n            post=post,\n            model_calls=model_calls,\n            transcript_segments=transcript_segments,\n            identifications=identifications,\n            stats=stats,\n        ),\n   
     200,\n    )\n\n\n@post_bp.route(\"/api/posts/<string:p_guid>/stats\", methods=[\"GET\"])\ndef api_post_stats(p_guid: str) -> flask.Response:\n    \"\"\"Get processing statistics for a post in JSON format.\"\"\"\n    post = Post.query.filter_by(guid=p_guid).first()\n    if post is None:\n        return flask.make_response(flask.jsonify({\"error\": \"Post not found\"}), 404)\n\n    model_calls = (\n        ModelCall.query.filter_by(post_id=post.id)\n        .order_by(ModelCall.model_name, ModelCall.first_segment_sequence_num)\n        .all()\n    )\n\n    transcript_segments = post.segments.all()\n\n    identifications = (\n        Identification.query.join(TranscriptSegment)\n        .filter(TranscriptSegment.post_id == post.id)\n        .order_by(TranscriptSegment.sequence_num)\n        .all()\n    )\n\n    model_call_statuses: Dict[str, int] = {}\n    model_types: Dict[str, int] = {}\n\n    for call in model_calls:\n        if call.status not in model_call_statuses:\n            model_call_statuses[call.status] = 0\n        model_call_statuses[call.status] += 1\n\n        if call.model_name not in model_types:\n            model_types[call.model_name] = 0\n        model_types[call.model_name] += 1\n\n    content_segments = sum(1 for i in identifications if i.label == \"content\")\n    ad_segments = sum(1 for i in identifications if i.label == \"ad\")\n\n    # Refined ad windows are written by boundary refinement and are used for precise\n    # cutting. 
We also derive a UI-only \"mixed\" flag for segments that overlap a\n    # refined ad window but are not fully contained by it (i.e., segment contains\n    # both content and ad).\n    raw_refined = getattr(post, \"refined_ad_boundaries\", None) or []\n    refined_windows = parse_refined_windows(raw_refined)\n\n    model_call_details = []\n    for call in model_calls:\n        model_call_details.append(\n            {\n                \"id\": call.id,\n                \"model_name\": call.model_name,\n                \"status\": call.status,\n                \"segment_range\": f\"{call.first_segment_sequence_num}-{call.last_segment_sequence_num}\",\n                \"first_segment_sequence_num\": call.first_segment_sequence_num,\n                \"last_segment_sequence_num\": call.last_segment_sequence_num,\n                \"timestamp\": call.timestamp.isoformat() if call.timestamp else None,\n                \"retry_attempts\": call.retry_attempts,\n                \"error_message\": call.error_message,\n                \"prompt\": call.prompt,\n                \"response\": call.response,\n            }\n        )\n\n    transcript_segments_data = []\n    segment_mixed_by_id: Dict[int, bool] = {}\n    for segment in transcript_segments:\n        segment_identifications = [\n            i for i in identifications if i.transcript_segment_id == segment.id\n        ]\n\n        has_ad_label = any(i.label == \"ad\" for i in segment_identifications)\n        primary_label = \"ad\" if has_ad_label else \"content\"\n\n        seg_start = float(segment.start_time)\n        seg_end = float(segment.end_time)\n        mixed = bool(has_ad_label) and is_mixed_segment(\n            seg_start=seg_start, seg_end=seg_end, refined_windows=refined_windows\n        )\n        segment_mixed_by_id[int(segment.id)] = mixed\n\n        transcript_segments_data.append(\n            {\n                \"id\": segment.id,\n                \"sequence_num\": segment.sequence_num,\n            
    \"start_time\": round(segment.start_time, 1),\n                \"end_time\": round(segment.end_time, 1),\n                \"text\": segment.text,\n                \"primary_label\": primary_label,\n                \"mixed\": mixed,\n                \"identifications\": [\n                    {\n                        \"id\": ident.id,\n                        \"label\": ident.label,\n                        \"confidence\": (\n                            # Check against None so a 0.0 confidence survives\n                            # serialization instead of becoming null.\n                            round(ident.confidence, 2)\n                            if ident.confidence is not None\n                            else None\n                        ),\n                        \"model_call_id\": ident.model_call_id,\n                    }\n                    for ident in segment_identifications\n                ],\n            }\n        )\n\n    identifications_data = []\n    for identification in identifications:\n        segment = identification.transcript_segment\n        identifications_data.append(\n            {\n                \"id\": identification.id,\n                \"transcript_segment_id\": identification.transcript_segment_id,\n                \"label\": identification.label,\n                \"confidence\": (\n                    round(identification.confidence, 2)\n                    if identification.confidence is not None\n                    else None\n                ),\n                \"model_call_id\": identification.model_call_id,\n                \"segment_sequence_num\": segment.sequence_num,\n                \"segment_start_time\": round(segment.start_time, 1),\n                \"segment_end_time\": round(segment.end_time, 1),\n                \"segment_text\": segment.text,\n                \"mixed\": bool(segment_mixed_by_id.get(int(segment.id), False)),\n            }\n        )\n\n    stats_data = {\n        \"post\": {\n            \"guid\": post.guid,\n            \"title\": post.title,\n            \"duration\": post.duration,\n            \"release_date\": (\n                post.release_date.isoformat() if post.release_date else None\n            
),\n            \"whitelisted\": post.whitelisted,\n            \"has_processed_audio\": post.processed_audio_path is not None,\n            \"download_count\": post.download_count,\n        },\n        \"processing_stats\": {\n            \"total_segments\": len(transcript_segments),\n            \"total_model_calls\": len(model_calls),\n            \"total_identifications\": len(identifications),\n            \"content_segments\": content_segments,\n            \"ad_segments_count\": ad_segments,\n            \"model_call_statuses\": model_call_statuses,\n            \"model_types\": model_types,\n        },\n        \"model_calls\": model_call_details,\n        \"transcript_segments\": transcript_segments_data,\n        \"identifications\": identifications_data,\n    }\n\n    return flask.jsonify(stats_data)\n\n\n@post_bp.route(\"/api/posts/<string:p_guid>/whitelist\", methods=[\"POST\"])\ndef api_toggle_whitelist(p_guid: str) -> ResponseReturnValue:\n    \"\"\"Toggle whitelist status for a post via API (admins only).\"\"\"\n    post = Post.query.filter_by(guid=p_guid).first()\n    if post is None:\n        return flask.make_response(flask.jsonify({\"error\": \"Post not found\"}), 404)\n\n    feed = db.session.get(Feed, post.feed_id)\n    if feed is None:\n        return flask.make_response(flask.jsonify({\"error\": \"Feed not found\"}), 404)\n\n    user, error = require_admin(\"whitelist this episode\")\n    if error:\n        return error\n    if user is not None and user.role != \"admin\":\n        return (\n            flask.jsonify(\n                {\n                    \"error\": \"FORBIDDEN\",\n                    \"message\": \"Only admins can change whitelist status.\",\n                }\n            ),\n            403,\n        )\n\n    data = request.get_json()\n    if data is None or \"whitelisted\" not in data:\n        return flask.make_response(\n            flask.jsonify({\"error\": \"Missing whitelisted field\"}), 400\n        )\n\n    
try:\n        writer_client.update(\n            \"Post\", post.id, {\"whitelisted\": bool(data[\"whitelisted\"])}, wait=True\n        )\n        # Refresh post object\n        db.session.expire(post)\n    except Exception as e:\n        logger.error(f\"Failed to toggle whitelist: {e}\")\n        return (\n            flask.jsonify(\n                {\n                    \"error\": \"Failed to update post\",\n                }\n            ),\n            500,\n        )\n\n    response_body: Dict[str, Any] = {\n        \"guid\": post.guid,\n        \"whitelisted\": post.whitelisted,\n        \"message\": \"Whitelist status updated successfully\",\n    }\n\n    trigger_processing = bool(data.get(\"trigger_processing\"))\n    if post.whitelisted and trigger_processing:\n        billing_user_id = getattr(user, \"id\", None)\n        job_response = get_jobs_manager().start_post_processing(\n            post.guid,\n            priority=\"interactive\",\n            requested_by_user_id=billing_user_id,\n            billing_user_id=billing_user_id,\n        )\n        response_body[\"processing_job\"] = job_response\n\n    return flask.jsonify(response_body)\n\n\n@post_bp.route(\"/api/feeds/<int:feed_id>/toggle-whitelist-all\", methods=[\"POST\"])\ndef api_toggle_whitelist_all(feed_id: int) -> ResponseReturnValue:\n    \"\"\"Intelligently toggle whitelist status for all posts in a feed.\n\n    Admin only.\n    \"\"\"\n    feed = Feed.query.get_or_404(feed_id)\n\n    _, error = require_admin(\"toggle whitelist for all posts\")\n    if error:\n        return error\n\n    if not feed.posts:\n        return flask.jsonify(\n            {\n                \"message\": \"No posts found in this feed\",\n                \"whitelisted_count\": 0,\n                \"total_count\": 0,\n            }\n        )\n\n    all_whitelisted = all(post.whitelisted for post in feed.posts)\n    new_status = not all_whitelisted\n\n    try:\n        result = writer_client.action(\n            
\"toggle_whitelist_all_for_feed\",\n            {\"feed_id\": feed.id, \"new_status\": new_status},\n            wait=True,\n        )\n        if not result or not result.success:\n            raise RuntimeError(getattr(result, \"error\", \"Unknown writer error\"))\n        updated = int((result.data or {}).get(\"updated_count\") or 0)\n    except Exception:  # pylint: disable=broad-except\n        return (\n            flask.jsonify(\n                {\n                    \"error\": \"Database busy, please retry\",\n                    \"retry_after_seconds\": 1,\n                }\n            ),\n            503,\n        )\n\n    whitelisted_count = Post.query.filter_by(feed_id=feed.id, whitelisted=True).count()\n    total_count = Post.query.filter_by(feed_id=feed.id).count()\n\n    return flask.jsonify(\n        {\n            \"message\": f\"{'Whitelisted' if new_status else 'Unwhitelisted'} all posts\",\n            \"whitelisted_count\": whitelisted_count,\n            \"total_count\": total_count,\n            \"all_whitelisted\": new_status,\n            \"updated_count\": updated,\n        }\n    )\n\n\n@post_bp.route(\"/api/posts/<string:p_guid>/process\", methods=[\"POST\"])\ndef api_process_post(p_guid: str) -> ResponseReturnValue:\n    \"\"\"Start processing a post and return immediately.\n\n    Admin only.\n    \"\"\"\n    post = Post.query.filter_by(guid=p_guid).first()\n    if not post:\n        return (\n            flask.jsonify(\n                {\n                    \"status\": \"error\",\n                    \"error_code\": \"NOT_FOUND\",\n                    \"message\": \"Post not found\",\n                }\n            ),\n            404,\n        )\n\n    feed = db.session.get(Feed, post.feed_id)\n    if feed is None:\n        return (\n            flask.jsonify(\n                {\n                    \"status\": \"error\",\n                    \"error_code\": \"FEED_NOT_FOUND\",\n                    \"message\": \"Feed not 
found\",\n                }\n            ),\n            404,\n        )\n\n    user, error = require_admin(\"process this episode\")\n    if error:\n        return error\n\n    if not post.whitelisted:\n        return (\n            flask.jsonify(\n                {\n                    \"status\": \"error\",\n                    \"error_code\": \"NOT_WHITELISTED\",\n                    \"message\": \"Post not whitelisted\",\n                }\n            ),\n            400,\n        )\n\n    if post.processed_audio_path and os.path.exists(post.processed_audio_path):\n        return flask.jsonify(\n            {\n                \"status\": \"completed\",\n                \"message\": \"Post already processed\",\n                \"download_url\": f\"/api/posts/{p_guid}/download\",\n            }\n        )\n\n    billing_user_id = getattr(user, \"id\", None)\n\n    try:\n        result = get_jobs_manager().start_post_processing(\n            p_guid,\n            priority=\"interactive\",\n            requested_by_user_id=billing_user_id,\n            billing_user_id=billing_user_id,\n        )\n        status_code = 200 if result.get(\"status\") in (\"started\", \"completed\") else 400\n        return flask.jsonify(result), status_code\n    except Exception as e:\n        logger.error(f\"Failed to start processing job for {p_guid}: {e}\")\n        return (\n            flask.jsonify(\n                {\n                    \"status\": \"error\",\n                    \"error_code\": \"JOB_START_FAILED\",\n                    \"message\": f\"Failed to start processing job: {str(e)}\",\n                }\n            ),\n            500,\n        )\n\n\n@post_bp.route(\"/api/posts/<string:p_guid>/reprocess\", methods=[\"POST\"])\ndef api_reprocess_post(p_guid: str) -> ResponseReturnValue:\n    \"\"\"Clear all processing data for a post and start processing from scratch.\n\n    Admin only.\n    \"\"\"\n    logger.info(\"[API] Reprocess requested for post_guid=%s\", 
p_guid)\n\n    post = Post.query.filter_by(guid=p_guid).first()\n    if not post:\n        logger.warning(\"[API] Reprocess: post not found for guid=%s\", p_guid)\n        return (\n            flask.jsonify(\n                {\n                    \"status\": \"error\",\n                    \"error_code\": \"NOT_FOUND\",\n                    \"message\": \"Post not found\",\n                }\n            ),\n            404,\n        )\n\n    feed = db.session.get(Feed, post.feed_id)\n    if feed is None:\n        logger.warning(\n            \"[API] Reprocess: feed not found for guid=%s feed_id=%s\",\n            p_guid,\n            getattr(post, \"feed_id\", None),\n        )\n        return (\n            flask.jsonify(\n                {\n                    \"status\": \"error\",\n                    \"error_code\": \"FEED_NOT_FOUND\",\n                    \"message\": \"Feed not found\",\n                }\n            ),\n            404,\n        )\n\n    user, error = require_admin(\"reprocess this episode\")\n    if error:\n        logger.warning(\"[API] Reprocess: auth error for guid=%s\", p_guid)\n        return error\n    if user and user.role != \"admin\":\n        logger.warning(\n            \"[API] Reprocess: non-admin user attempted reprocess guid=%s user_id=%s role=%s\",\n            p_guid,\n            getattr(user, \"id\", None),\n            getattr(user, \"role\", None),\n        )\n        return (\n            flask.jsonify(\n                {\n                    \"status\": \"error\",\n                    \"error_code\": \"REPROCESS_FORBIDDEN\",\n                    \"message\": \"Only admins can reprocess episodes.\",\n                }\n            ),\n            403,\n        )\n\n    if not post.whitelisted:\n        logger.info(\n            \"[API] Reprocess: post not whitelisted guid=%s post_id=%s\",\n            p_guid,\n            getattr(post, \"id\", None),\n        )\n        return (\n            flask.jsonify(\n        
        {\n                    \"status\": \"error\",\n                    \"error_code\": \"NOT_WHITELISTED\",\n                    \"message\": \"Post not whitelisted\",\n                }\n            ),\n            400,\n        )\n\n    billing_user_id = getattr(user, \"id\", None)\n\n    try:\n        logger.info(\n            \"[API] Reprocess: cancelling jobs and clearing processing data guid=%s post_id=%s\",\n            p_guid,\n            getattr(post, \"id\", None),\n        )\n        get_jobs_manager().cancel_post_jobs(p_guid)\n        clear_post_processing_data(post)\n        logger.info(\n            \"[API] Reprocess: starting post processing guid=%s post_id=%s\",\n            p_guid,\n            getattr(post, \"id\", None),\n        )\n        result = get_jobs_manager().start_post_processing(\n            p_guid,\n            priority=\"interactive\",\n            requested_by_user_id=billing_user_id,\n            billing_user_id=billing_user_id,\n        )\n        status_code = 200 if result.get(\"status\") in (\"started\", \"completed\") else 400\n        if result.get(\"status\") == \"started\":\n            result[\"message\"] = \"Post cleared and reprocessing started\"\n        logger.info(\n            \"[API] Reprocess: completed guid=%s status=%s code=%s\",\n            p_guid,\n            result.get(\"status\"),\n            status_code,\n        )\n        return flask.jsonify(result), status_code\n    except Exception as e:\n        logger.error(f\"Failed to reprocess post {p_guid}: {e}\", exc_info=True)\n        return (\n            flask.jsonify(\n                {\n                    \"status\": \"error\",\n                    \"error_code\": \"REPROCESS_FAILED\",\n                    \"message\": f\"Failed to reprocess post: {str(e)}\",\n                }\n            ),\n            500,\n        )\n\n\n@post_bp.route(\"/api/posts/<string:p_guid>/status\", methods=[\"GET\"])\ndef api_post_status(p_guid: str) -> 
ResponseReturnValue:\n    \"\"\"Get the current processing status of a post via JobsManager.\"\"\"\n    result = get_jobs_manager().get_post_status(p_guid)\n    status_code = (\n        200\n        if result.get(\"status\") != \"error\"\n        else (404 if result.get(\"error_code\") == \"NOT_FOUND\" else 400)\n    )\n    return flask.jsonify(result), status_code\n\n\n@post_bp.route(\"/api/posts/<string:p_guid>/audio\", methods=[\"GET\"])\ndef api_get_post_audio(p_guid: str) -> ResponseReturnValue:\n    \"\"\"API endpoint to serve processed audio files with proper CORS headers.\"\"\"\n    logger.info(f\"API request for audio file with GUID: {p_guid}\")\n\n    post = Post.query.filter_by(guid=p_guid).first()\n    if post is None:\n        logger.warning(f\"Post with GUID: {p_guid} not found\")\n        return flask.make_response(\n            jsonify({\"error\": \"Post not found\", \"error_code\": \"NOT_FOUND\"}), 404\n        )\n\n    if not post.whitelisted:\n        logger.warning(f\"Post: {post.title} is not whitelisted\")\n        return flask.make_response(\n            jsonify({\"error\": \"Post not whitelisted\", \"error_code\": \"NOT_WHITELISTED\"}),\n            403,\n        )\n\n    if not post.processed_audio_path or not Path(post.processed_audio_path).exists():\n        logger.warning(f\"Processed audio not found for post: {post.id}\")\n        return flask.make_response(\n            jsonify(\n                {\n                    \"error\": \"Processed audio not available\",\n                    \"error_code\": \"AUDIO_NOT_READY\",\n                    \"message\": \"Post needs to be processed first\",\n                }\n            ),\n            404,\n        )\n\n    try:\n        response = send_file(\n            path_or_file=Path(post.processed_audio_path).resolve(),\n            mimetype=\"audio/mpeg\",\n            as_attachment=False,\n        )\n        response.headers[\"Accept-Ranges\"] = \"bytes\"\n        return response\n    
except Exception as e:  # pylint: disable=broad-except\n        logger.error(f\"Error serving audio file for {p_guid}: {e}\")\n        return flask.make_response(\n            jsonify(\n                {\"error\": \"Error serving audio file\", \"error_code\": \"SERVER_ERROR\"}\n            ),\n            500,\n        )\n\n\n@post_bp.route(\"/api/posts/<string:p_guid>/download\", methods=[\"GET\"])\ndef api_download_post(p_guid: str) -> flask.Response:\n    \"\"\"API endpoint to download processed audio files.\"\"\"\n    current_user = getattr(g, \"current_user\", None)\n    if current_user:\n        update_user_last_active(current_user.id)\n\n    logger.info(f\"Request to download post with GUID: {p_guid}\")\n    post = Post.query.filter_by(guid=p_guid).first()\n    if post is None:\n        logger.warning(f\"Post with GUID: {p_guid} not found\")\n        return flask.make_response((\"Post not found\", 404))\n\n    whitelist_response = _ensure_whitelisted_for_download(post, p_guid)\n    if whitelist_response:\n        return whitelist_response\n\n    if not post.processed_audio_path or not Path(post.processed_audio_path).exists():\n        return _missing_processed_audio_response(post, p_guid)\n\n    try:\n        response = send_file(\n            path_or_file=Path(post.processed_audio_path).resolve(),\n            mimetype=\"audio/mpeg\",\n            as_attachment=True,\n            download_name=f\"{post.title}.mp3\",\n        )\n    except Exception as e:  # pylint: disable=broad-except\n        logger.error(f\"Error serving file for {p_guid}: {e}\")\n        return flask.make_response((\"Error serving file\", 500))\n\n    _increment_download_count(post)\n    return response\n\n\n@post_bp.route(\"/api/posts/<string:p_guid>/download/original\", methods=[\"GET\"])\ndef api_download_original_post(p_guid: str) -> flask.Response:\n    \"\"\"API endpoint to download original (unprocessed) audio files.\"\"\"\n    logger.info(f\"Request to download original post 
with GUID: {p_guid}\")\n    post = Post.query.filter_by(guid=p_guid).first()\n    if post is None:\n        logger.warning(f\"Post with GUID: {p_guid} not found\")\n        return flask.make_response((\"Post not found\", 404))\n\n    if not post.whitelisted:\n        logger.warning(f\"Post: {post.title} is not whitelisted\")\n        return flask.make_response((\"Post not whitelisted\", 403))\n\n    if (\n        not post.unprocessed_audio_path\n        or not Path(post.unprocessed_audio_path).exists()\n    ):\n        logger.warning(f\"Original audio not found for post: {post.id}\")\n        return flask.make_response((\"Original audio not found\", 404))\n\n    try:\n        response = send_file(\n            path_or_file=Path(post.unprocessed_audio_path).resolve(),\n            mimetype=\"audio/mpeg\",\n            as_attachment=True,\n            download_name=f\"{post.title}_original.mp3\",\n        )\n    except Exception as e:  # pylint: disable=broad-except\n        logger.error(f\"Error serving original file for {p_guid}: {e}\")\n        return flask.make_response((\"Error serving file\", 500))\n\n    _increment_download_count(post)\n    return response\n\n\n# Legacy endpoints for backward compatibility\n@post_bp.route(\"/post/<string:p_guid>.mp3\", methods=[\"GET\"])\ndef download_post_legacy(p_guid: str) -> flask.Response:\n    return api_download_post(p_guid)\n\n\n@post_bp.route(\"/post/<string:p_guid>/original.mp3\", methods=[\"GET\"])\ndef download_original_post_legacy(p_guid: str) -> flask.Response:\n    return api_download_original_post(p_guid)\n"
  },
  {
    "path": "src/app/routes/post_stats_utils.py",
    "content": "from __future__ import annotations\n\nfrom typing import Any, Dict, Iterable, List, Tuple\n\n\ndef count_model_calls(\n    model_calls: Iterable[Any],\n) -> Tuple[Dict[str, int], Dict[str, int]]:\n    model_call_statuses: Dict[str, int] = {}\n    model_types: Dict[str, int] = {}\n\n    for call in model_calls:\n        status = getattr(call, \"status\", None)\n        model_name = getattr(call, \"model_name\", None)\n\n        if status is not None:\n            model_call_statuses[status] = model_call_statuses.get(status, 0) + 1\n        if model_name is not None:\n            model_types[model_name] = model_types.get(model_name, 0) + 1\n\n    return model_call_statuses, model_types\n\n\ndef parse_refined_windows(raw_refined: Any) -> List[Tuple[float, float]]:\n    refined_windows: List[Tuple[float, float]] = []\n    if not isinstance(raw_refined, list):\n        return refined_windows\n\n    for item in raw_refined:\n        if not isinstance(item, dict):\n            continue\n\n        start_raw = item.get(\"refined_start\")\n        end_raw = item.get(\"refined_end\")\n        if start_raw is None or end_raw is None:\n            continue\n\n        try:\n            start_v = float(start_raw)\n            end_v = float(end_raw)\n        except Exception:\n            continue\n\n        if end_v > start_v:\n            refined_windows.append((start_v, end_v))\n\n    return refined_windows\n\n\ndef is_mixed_segment(\n    *, seg_start: float, seg_end: float, refined_windows: List[Tuple[float, float]]\n) -> bool:\n    for win_start, win_end in refined_windows:\n        overlaps = seg_start <= win_end and seg_end >= win_start\n        if not overlaps:\n            continue\n\n        fully_contained = seg_start >= win_start and seg_end <= win_end\n        if not fully_contained:\n            return True\n\n    return False\n"
  },
  {
    "path": "src/app/runtime_config.py",
    "content": "\"\"\"\nRuntime configuration module - isolated to prevent circular imports.\nInitializes the global config object that is used throughout the application.\n\"\"\"\n\nimport os\nimport sys\n\nfrom shared import defaults as DEFAULTS\nfrom shared.config import Config as RuntimeConfig\nfrom shared.config import LocalWhisperConfig, OutputConfig, ProcessingConfig\n\nis_test = \"pytest\" in sys.modules\n\n# For tests, use in-memory config for deterministic behavior. For runtime,\n# initialize with sensible defaults; DB-backed settings will hydrate immediately after migrations.\nif is_test:\n    from shared.test_utils import create_standard_test_config\n\n    config = create_standard_test_config()\nelse:\n    config = RuntimeConfig(\n        llm_api_key=None,\n        llm_model=DEFAULTS.LLM_DEFAULT_MODEL,\n        openai_base_url=None,\n        openai_max_tokens=DEFAULTS.OPENAI_DEFAULT_MAX_TOKENS,\n        openai_timeout=DEFAULTS.OPENAI_DEFAULT_TIMEOUT_SEC,\n        output=OutputConfig(\n            fade_ms=DEFAULTS.OUTPUT_FADE_MS,\n            min_ad_segement_separation_seconds=DEFAULTS.OUTPUT_MIN_AD_SEGMENT_SEPARATION_SECONDS,\n            min_ad_segment_length_seconds=DEFAULTS.OUTPUT_MIN_AD_SEGMENT_LENGTH_SECONDS,\n            min_confidence=DEFAULTS.OUTPUT_MIN_CONFIDENCE,\n        ),\n        processing=ProcessingConfig(\n            num_segments_to_input_to_prompt=DEFAULTS.PROCESSING_NUM_SEGMENTS_TO_INPUT_TO_PROMPT,\n            max_overlap_segments=DEFAULTS.PROCESSING_MAX_OVERLAP_SEGMENTS,\n        ),\n        background_update_interval_minute=DEFAULTS.APP_BACKGROUND_UPDATE_INTERVAL_MINUTE,\n        post_cleanup_retention_days=DEFAULTS.APP_POST_CLEANUP_RETENTION_DAYS,\n        llm_max_concurrent_calls=DEFAULTS.LLM_DEFAULT_MAX_CONCURRENT_CALLS,\n        llm_max_retry_attempts=DEFAULTS.LLM_DEFAULT_MAX_RETRY_ATTEMPTS,\n        llm_enable_token_rate_limiting=DEFAULTS.LLM_ENABLE_TOKEN_RATE_LIMITING,\n        
llm_max_input_tokens_per_call=DEFAULTS.LLM_MAX_INPUT_TOKENS_PER_CALL,\n        llm_max_input_tokens_per_minute=DEFAULTS.LLM_MAX_INPUT_TOKENS_PER_MINUTE,\n        automatically_whitelist_new_episodes=DEFAULTS.APP_AUTOMATICALLY_WHITELIST_NEW_EPISODES,\n        number_of_episodes_to_whitelist_from_archive_of_new_feed=DEFAULTS.APP_NUM_EPISODES_TO_WHITELIST_FROM_ARCHIVE_OF_NEW_FEED,\n        whisper=LocalWhisperConfig(model=DEFAULTS.WHISPER_LOCAL_MODEL),\n        enable_public_landing_page=DEFAULTS.APP_ENABLE_PUBLIC_LANDING_PAGE,\n        user_limit_total=DEFAULTS.APP_USER_LIMIT_TOTAL,\n        developer_mode=os.environ.get(\"DEVELOPER_MODE\", \"false\").lower() == \"true\",\n        autoprocess_on_download=DEFAULTS.APP_AUTOPROCESS_ON_DOWNLOAD,\n    )\n"
  },
  {
    "path": "src/app/static/.gitignore",
    "content": "# This file ensures the static directory exists in the repository.\n# Frontend build assets are generated here but not committed to git.\n*\n!.gitignore"
  },
  {
    "path": "src/app/templates/index.html",
    "content": "<!DOCTYPE html>\n<html lang=\"en\">\n  <head>\n    <meta charset=\"UTF-8\" />\n    <meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\" />\n    <link\n      rel=\"icon\"\n      type=\"image/png\"\n      href=\"{{ url_for('static', filename='images/logos/favicon-48x48.png') }}\"\n      sizes=\"48x48\"\n    />\n    <link\n      rel=\"icon\"\n      type=\"image/svg+xml\"\n      href=\"{{ url_for('static', filename='images/logos/favicon.svg') }}\"\n    />\n    <link\n      rel=\"shortcut icon\"\n      href=\"{{ url_for('static', filename='images/logos/favicon.ico') }}\"\n    />\n    <link\n      rel=\"apple-touch-icon\"\n      sizes=\"180x180\"\n      href=\"{{ url_for('static', filename='images/logos/apple-touch-icon.png') }}\"\n    />\n    <meta name=\"apple-mobile-web-app-title\" content=\"Podly\" />\n    <link\n      rel=\"manifest\"\n      href=\"{{ url_for('static', filename='site.webmanifest') }}\"\n    />\n    <title>Podly - Redirecting to New UI</title>\n    <style>\n      body {\n        font-family: Arial, sans-serif;\n        text-align: center;\n        margin: 50px;\n        background-color: #f5f5f5;\n      }\n      .container {\n        max-width: 600px;\n        margin: 0 auto;\n        background: white;\n        padding: 40px;\n        border-radius: 10px;\n        box-shadow: 0 2px 10px rgba(0,0,0,0.1);\n      }\n      .logo {\n        width: 300px;\n        margin-bottom: 20px;\n      }\n      .redirect-link {\n        display: inline-block;\n        background-color: #007bff;\n        color: white;\n        padding: 15px 30px;\n        text-decoration: none;\n        border-radius: 5px;\n        font-size: 18px;\n        margin-top: 20px;\n        transition: background-color 0.3s;\n      }\n      .redirect-link:hover {\n        background-color: #0056b3;\n      }\n      .countdown {\n        margin-top: 20px;\n        font-size: 14px;\n        color: #666;\n      }\n    </style>\n  </head>\n  <body>\n    <div 
class=\"container\">\n      <img\n        src=\"{{ url_for('static', filename='images/logos/logo_with_text.png') }}\"\n        alt=\"Podly Logo\"\n        class=\"logo\"\n      />\n      <h1>Welcome to Podly</h1>\n      <p>We've moved to a new and improved interface!</p>\n      <p>You will be automatically redirected to our new UI in <span id=\"countdown\">5</span> seconds.</p>\n\n      {% set redirect_url = \"http://\" + request.host.split(':')[0] + \":5001\" %}\n\n      <a href=\"{{ redirect_url }}\" class=\"redirect-link\">\n        Go to New UI Now\n      </a>\n\n      <div class=\"countdown\">\n        <p>If you are not redirected automatically, click the button above.</p>\n      </div>\n    </div>\n\n    <script>\n      // Auto-redirect after 5 seconds\n      let countdown = 5;\n      const countdownElement = document.getElementById('countdown');\n\n      const currentHost = window.location.hostname;\n      const redirectUrl = `http://${currentHost}:5001`;\n\n      const timer = setInterval(() => {\n        countdown--;\n        countdownElement.textContent = countdown;\n\n        if (countdown <= 0) {\n          clearInterval(timer);\n          window.location.href = redirectUrl;\n        }\n      }, 1000);\n    </script>\n  </body>\n</html>\n"
  },
  {
    "path": "src/app/timeout_decorator.py",
    "content": "import functools\nimport threading\nfrom typing import Any, Callable, List, Optional, TypeVar\n\nT = TypeVar(\"T\")\n\n\nclass TimeoutException(Exception):\n    \"\"\"Custom exception to indicate a timeout.\"\"\"\n\n\ndef timeout_decorator(timeout: int) -> Callable[[Callable[..., T]], Callable[..., T]]:\n    \"\"\"\n    Decorator to enforce a timeout on a function.\n    If the function execution exceeds the timeout, a TimeoutException is raised.\n    If the function itself raises, the exception is re-raised in the caller.\n    \"\"\"\n\n    def decorator(func: Callable[..., T]) -> Callable[..., T]:\n        @functools.wraps(func)\n        def wrapper(*args: Any, **kwargs: Any) -> T:\n            timeout_flag = threading.Event()\n            result: List[Optional[T]] = [None]\n            error: List[Optional[BaseException]] = [None]\n\n            def target() -> None:\n                try:\n                    result[0] = func(*args, **kwargs)\n                except Exception as e:  # pylint: disable=broad-exception-caught\n                    # Capture the exception so the caller sees the failure\n                    # instead of silently receiving None.\n                    error[0] = e\n                finally:\n                    timeout_flag.set()\n\n            # Daemon thread so a timed-out call cannot block interpreter exit;\n            # note the worker keeps running in the background after a timeout.\n            thread = threading.Thread(target=target, daemon=True)\n            thread.start()\n            thread.join(timeout)\n            if not timeout_flag.is_set():\n                raise TimeoutException(\n                    f\"Function '{func.__name__}' exceeded timeout of {timeout} seconds.\"\n                )\n            if error[0] is not None:\n                raise error[0]\n            return result[0]  # type: ignore\n\n        return wrapper\n\n    return decorator\n"
  },
  {
    "path": "src/app/writer/__init__.py",
    "content": "from .executor import CommandExecutor\nfrom .service import run_writer_service\n\n__all__ = [\"CommandExecutor\", \"run_writer_service\"]\n"
  },
  {
    "path": "src/app/writer/__main__.py",
    "content": "from .service import run_writer_service\n\nif __name__ == \"__main__\":\n    run_writer_service()\n"
  },
  {
    "path": "src/app/writer/actions/__init__.py",
    "content": "\"\"\"Writer action function re-exports.\n\nMypy runs with `--no-implicit-reexport`, so imports use explicit aliasing.\n\"\"\"\n\n# pylint: disable=useless-import-alias\n\nfrom .cleanup import (\n    cleanup_missing_audio_paths_action as cleanup_missing_audio_paths_action,\n)\nfrom .cleanup import cleanup_processed_post_action as cleanup_processed_post_action\nfrom .cleanup import (\n    clear_post_processing_data_action as clear_post_processing_data_action,\n)\nfrom .feeds import add_feed_action as add_feed_action\nfrom .feeds import create_dev_test_feed_action as create_dev_test_feed_action\nfrom .feeds import create_feed_access_token_action as create_feed_access_token_action\nfrom .feeds import delete_feed_cascade_action as delete_feed_cascade_action\nfrom .feeds import (\n    ensure_user_feed_membership_action as ensure_user_feed_membership_action,\n)\nfrom .feeds import increment_download_count_action as increment_download_count_action\nfrom .feeds import refresh_feed_action as refresh_feed_action\nfrom .feeds import (\n    remove_user_feed_membership_action as remove_user_feed_membership_action,\n)\nfrom .feeds import (\n    toggle_whitelist_all_for_feed_action as toggle_whitelist_all_for_feed_action,\n)\nfrom .feeds import touch_feed_access_token_action as touch_feed_access_token_action\nfrom .feeds import update_feed_settings_action as update_feed_settings_action\nfrom .feeds import (\n    whitelist_latest_post_for_feed_action as whitelist_latest_post_for_feed_action,\n)\nfrom .feeds import whitelist_post_action as whitelist_post_action\nfrom .jobs import cancel_existing_jobs_action as cancel_existing_jobs_action\nfrom .jobs import cleanup_stale_jobs_action as cleanup_stale_jobs_action\nfrom .jobs import clear_all_jobs_action as clear_all_jobs_action\nfrom .jobs import create_job_action as create_job_action\nfrom .jobs import dequeue_job_action as dequeue_job_action\nfrom .jobs import mark_cancelled_action as mark_cancelled_action\nfrom 
.jobs import reassign_pending_jobs_action as reassign_pending_jobs_action\nfrom .jobs import update_job_status_action as update_job_status_action\nfrom .processor import insert_identifications_action as insert_identifications_action\nfrom .processor import mark_model_call_failed_action as mark_model_call_failed_action\nfrom .processor import replace_identifications_action as replace_identifications_action\nfrom .processor import replace_transcription_action as replace_transcription_action\nfrom .processor import upsert_model_call_action as upsert_model_call_action\nfrom .processor import (\n    upsert_whisper_model_call_action as upsert_whisper_model_call_action,\n)\nfrom .system import ensure_active_run_action as ensure_active_run_action\nfrom .system import update_combined_config_action as update_combined_config_action\nfrom .system import update_discord_settings_action as update_discord_settings_action\nfrom .users import create_user_action as create_user_action\nfrom .users import delete_user_action as delete_user_action\nfrom .users import set_manual_feed_allowance_action as set_manual_feed_allowance_action\nfrom .users import (\n    set_user_billing_by_customer_id_action as set_user_billing_by_customer_id_action,\n)\nfrom .users import set_user_billing_fields_action as set_user_billing_fields_action\nfrom .users import set_user_role_action as set_user_role_action\nfrom .users import update_user_last_active_action as update_user_last_active_action\nfrom .users import update_user_password_action as update_user_password_action\nfrom .users import upsert_discord_user_action as upsert_discord_user_action\n"
  },
  {
    "path": "src/app/writer/actions/cleanup.py",
    "content": "import logging\nimport os\nfrom typing import Any, Dict\n\nfrom app.extensions import db\nfrom app.jobs_manager_run_service import recalculate_run_counts\nfrom app.models import (\n    Identification,\n    ModelCall,\n    Post,\n    ProcessingJob,\n    TranscriptSegment,\n)\n\nlogger = logging.getLogger(\"writer\")\n\n\ndef cleanup_missing_audio_paths_action(params: Dict[str, Any]) -> int:\n    inconsistent_posts = Post.query.filter(\n        Post.whitelisted,\n        (\n            (Post.unprocessed_audio_path.isnot(None))\n            | (Post.processed_audio_path.isnot(None))\n        ),\n    ).all()\n\n    count = 0\n    for post in inconsistent_posts:\n        changed = False\n        if post.processed_audio_path and not os.path.exists(post.processed_audio_path):\n            post.processed_audio_path = None\n            changed = True\n        if post.unprocessed_audio_path and not os.path.exists(\n            post.unprocessed_audio_path\n        ):\n            post.unprocessed_audio_path = None\n            changed = True\n\n        if changed:\n            latest_job = (\n                ProcessingJob.query.filter_by(post_guid=post.guid)\n                .order_by(ProcessingJob.created_at.desc())\n                .first()\n            )\n            if latest_job and latest_job.status not in {\"pending\", \"running\"}:\n                latest_job.status = \"pending\"\n                latest_job.current_step = 0\n                latest_job.progress_percentage = 0.0\n                latest_job.step_name = \"Not started\"\n                latest_job.error_message = None\n                latest_job.started_at = None\n                latest_job.completed_at = None\n\n            count += 1\n\n    return count\n\n\ndef clear_post_processing_data_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    post_id = params.get(\"post_id\")\n    post = db.session.get(Post, post_id)\n    if not post:\n        raise ValueError(f\"Post {post_id} not 
found\")\n\n    logger.info(\"[WRITER] clear_post_processing_data_action: post_id=%s\", post_id)\n\n    # Chunked deletes for segments and identifications\n    while True:\n        ids_batch = [\n            row[0]\n            for row in db.session.query(TranscriptSegment.id)\n            .filter_by(post_id=post.id)\n            .limit(500)\n            .all()\n        ]\n        if not ids_batch:\n            logger.debug(\n                \"[WRITER] clear_post_processing_data_action: no more segments for post_id=%s\",\n                post_id,\n            )\n            break\n\n        db.session.query(Identification).filter(\n            Identification.transcript_segment_id.in_(ids_batch)\n        ).delete(synchronize_session=False)\n\n        db.session.query(TranscriptSegment).filter(\n            TranscriptSegment.id.in_(ids_batch)\n        ).delete(synchronize_session=False)\n\n    # Model calls\n    db.session.query(ModelCall).filter_by(post_id=post.id).delete()\n\n    # Processing jobs\n    db.session.query(ProcessingJob).filter_by(post_guid=post.guid).delete()\n\n    # Reset post fields\n    post.unprocessed_audio_path = None\n    post.processed_audio_path = None\n    post.duration = None\n\n    logger.info(\n        \"[WRITER] clear_post_processing_data_action: completed post_id=%s\", post_id\n    )\n\n    return {\"post_id\": post.id}\n\n\ndef cleanup_processed_post_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    post_id = params.get(\"post_id\")\n    if not post_id:\n        raise ValueError(\"post_id is required\")\n\n    post = db.session.get(Post, int(post_id))\n    if not post:\n        raise ValueError(f\"Post {post_id} not found\")\n\n    logger.info(\"[WRITER] cleanup_processed_post_action: post_id=%s\", post_id)\n\n    # Remove processing artifacts and dependent rows.\n    clear_post_processing_data_action({\"post_id\": post.id})\n    post.whitelisted = False\n\n    recalculate_run_counts(db.session)\n\n    logger.info(\"[WRITER] 
cleanup_processed_post_action: completed post_id=%s\", post_id)\n\n    return {\"post_id\": post.id}\n"
  },
  {
    "path": "src/app/writer/actions/feeds.py",
    "content": "import hashlib\nimport secrets\nimport uuid\nfrom datetime import datetime\nfrom typing import Any, Dict\n\nfrom sqlalchemy import func\n\nfrom app.extensions import db\nfrom app.jobs_manager_run_service import recalculate_run_counts\nfrom app.models import (\n    Feed,\n    FeedAccessToken,\n    Identification,\n    ModelCall,\n    Post,\n    ProcessingJob,\n    TranscriptSegment,\n    UserFeed,\n)\n\n\ndef refresh_feed_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    feed_id = params.get(\"feed_id\")\n    updates = params.get(\"updates\", {})\n    new_posts_data = params.get(\"new_posts\", [])\n\n    feed = db.session.get(Feed, feed_id)\n    if not feed:\n        raise ValueError(f\"Feed {feed_id} not found\")\n\n    for k, v in updates.items():\n        setattr(feed, k, v)\n\n    created_posts = []\n    for post_data in new_posts_data:\n        # Handle datetime deserialization\n        if \"release_date\" in post_data and isinstance(post_data[\"release_date\"], str):\n            post_data[\"release_date\"] = datetime.fromisoformat(\n                post_data[\"release_date\"]\n            )\n\n        post = Post(**post_data)\n        db.session.add(post)\n        created_posts.append(post)\n\n    db.session.flush()\n\n    for post in created_posts:\n        if post.whitelisted:\n            job = ProcessingJob(\n                id=str(uuid.uuid4()),\n                post_guid=post.guid,\n                status=\"pending\",\n                current_step=0,\n                total_steps=4,\n                progress_percentage=0.0,\n                created_at=datetime.utcnow(),\n            )\n            db.session.add(job)\n\n    recalculate_run_counts(db.session)\n\n    return {\"feed_id\": feed.id, \"new_posts_count\": len(created_posts)}\n\n\ndef add_feed_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    feed_data = params.get(\"feed\")\n    if not isinstance(feed_data, dict):\n        raise ValueError(\"feed data must be a 
dictionary\")\n    posts_data = params.get(\"posts\", [])\n\n    feed = Feed(**feed_data)\n    db.session.add(feed)\n    db.session.flush()\n\n    created_posts = []\n    for post_data in posts_data:\n        post_data[\"feed_id\"] = feed.id\n        if \"release_date\" in post_data and isinstance(post_data[\"release_date\"], str):\n            post_data[\"release_date\"] = datetime.fromisoformat(\n                post_data[\"release_date\"]\n            )\n\n        post = Post(**post_data)\n        db.session.add(post)\n        created_posts.append(post)\n\n    db.session.flush()\n\n    for post in created_posts:\n        if post.whitelisted:\n            job = ProcessingJob(\n                id=str(uuid.uuid4()),\n                post_guid=post.guid,\n                status=\"pending\",\n                current_step=0,\n                total_steps=4,\n                progress_percentage=0.0,\n                created_at=datetime.utcnow(),\n            )\n            db.session.add(job)\n\n    recalculate_run_counts(db.session)\n\n    return {\"feed_id\": feed.id}\n\n\ndef update_feed_settings_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    feed_id = params.get(\"feed_id\")\n    if not feed_id:\n        raise ValueError(\"feed_id is required\")\n\n    feed = db.session.get(Feed, int(feed_id))\n    if not feed:\n        raise ValueError(f\"Feed {feed_id} not found\")\n\n    if \"auto_whitelist_new_episodes_override\" in params:\n        feed.auto_whitelist_new_episodes_override = params.get(\n            \"auto_whitelist_new_episodes_override\"\n        )\n\n    db.session.flush()\n    return {\"feed_id\": feed.id}\n\n\ndef increment_download_count_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    post_id = params.get(\"post_id\")\n    if not post_id:\n        raise ValueError(\"post_id is required\")\n\n    updated = Post.query.filter_by(id=post_id).update(\n        {Post.download_count: func.coalesce(Post.download_count, 0) + 1},\n        
synchronize_session=False,\n    )\n\n    return {\"post_id\": post_id, \"updated\": updated}\n\n\ndef whitelist_post_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    post_id = params.get(\"post_id\")\n    if not post_id:\n        raise ValueError(\"post_id is required\")\n\n    updated = Post.query.filter_by(id=int(post_id)).update(\n        {Post.whitelisted: True}, synchronize_session=False\n    )\n    return {\"post_id\": int(post_id), \"updated\": int(updated)}\n\n\ndef ensure_user_feed_membership_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    feed_id = params.get(\"feed_id\")\n    user_id = params.get(\"user_id\")\n    if not feed_id or not user_id:\n        raise ValueError(\"feed_id and user_id are required\")\n\n    feed_id_i = int(feed_id)\n    user_id_i = int(user_id)\n\n    previous_count = int(UserFeed.query.filter_by(feed_id=feed_id_i).count())\n    existing = UserFeed.query.filter_by(feed_id=feed_id_i, user_id=user_id_i).first()\n    if existing:\n        return {\"created\": False, \"previous_count\": previous_count}\n\n    db.session.add(UserFeed(feed_id=feed_id_i, user_id=user_id_i))\n    db.session.flush()\n    return {\"created\": True, \"previous_count\": previous_count}\n\n\ndef remove_user_feed_membership_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    feed_id = params.get(\"feed_id\")\n    user_id = params.get(\"user_id\")\n    if not feed_id or not user_id:\n        raise ValueError(\"feed_id and user_id are required\")\n\n    removed = UserFeed.query.filter_by(\n        feed_id=int(feed_id), user_id=int(user_id)\n    ).delete(synchronize_session=False)\n    return {\"removed\": int(removed)}\n\n\ndef whitelist_latest_post_for_feed_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    feed_id = params.get(\"feed_id\")\n    if not feed_id:\n        raise ValueError(\"feed_id is required\")\n\n    latest = (\n        Post.query.filter_by(feed_id=int(feed_id))\n        .order_by(Post.release_date.desc().nullslast(), 
Post.id.desc())\n        .first()\n    )\n    if not latest:\n        return {\"updated\": False}\n    if latest.whitelisted:\n        return {\"updated\": False, \"post_guid\": latest.guid}\n\n    latest.whitelisted = True\n    db.session.flush()\n    return {\"updated\": True, \"post_guid\": latest.guid}\n\n\ndef toggle_whitelist_all_for_feed_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    feed_id = params.get(\"feed_id\")\n    new_status = params.get(\"new_status\")\n    if feed_id is None or new_status is None:\n        raise ValueError(\"feed_id and new_status are required\")\n\n    updated = Post.query.filter_by(feed_id=int(feed_id)).update(\n        {Post.whitelisted: bool(new_status)},\n        synchronize_session=False,\n    )\n    return {\"feed_id\": int(feed_id), \"updated_count\": int(updated)}\n\n\ndef create_dev_test_feed_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    rss_url = params.get(\"rss_url\")\n    title = params.get(\"title\")\n    if not rss_url or not title:\n        raise ValueError(\"rss_url and title are required\")\n\n    existing = Feed.query.filter_by(rss_url=rss_url).first()\n    if existing:\n        return {\"feed_id\": existing.id, \"created\": False}\n\n    feed = Feed(\n        title=title,\n        rss_url=rss_url,\n        image_url=params.get(\"image_url\"),\n        description=params.get(\"description\"),\n        author=params.get(\"author\"),\n    )\n    db.session.add(feed)\n    db.session.flush()\n\n    now = datetime.utcnow()\n    # Use a larger default so dev/test feeds exercise paging in the UI\n    post_count = int(params.get(\"post_count\") or 30)\n    for i in range(1, post_count + 1):\n        guid = f\"{params.get('guid_prefix') or 'test-guid'}-{feed.id}-{i}\"\n        post = Post(\n            feed_id=feed.id,\n            guid=guid,\n            title=f\"Test Episode {i}\",\n            download_url=f\"{params.get('download_url_prefix') or 'http://test-feed'}/{feed.id}/{i}.mp3\",\n            
release_date=now,\n            duration=3600,\n            description=f\"Test episode description {i}\",\n            whitelisted=True,\n        )\n        db.session.add(post)\n        db.session.flush()\n\n        # Supply id/created_at explicitly, matching the other ProcessingJob\n        # creation paths in this module (the id column has no default).\n        job = ProcessingJob(\n            id=str(uuid.uuid4()),\n            post_guid=post.guid,\n            status=\"completed\",\n            current_step=4,\n            total_steps=4,\n            progress_percentage=100.0,\n            created_at=now,\n            started_at=now,\n            completed_at=now,\n            step_name=\"completed\",\n        )\n        db.session.add(job)\n\n    return {\"feed_id\": feed.id, \"created\": True}\n\n\ndef delete_feed_cascade_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    feed_id = params.get(\"feed_id\")\n    if not feed_id:\n        raise ValueError(\"feed_id is required\")\n\n    feed_id_i = int(feed_id)\n    feed = db.session.get(Feed, feed_id_i)\n    if not feed:\n        return {\"deleted\": False}\n\n    post_rows = db.session.query(Post.id, Post.guid).filter_by(feed_id=feed_id_i).all()\n    post_ids = [row[0] for row in post_rows]\n    post_guids = [row[1] for row in post_rows]\n\n    batch_size = 200\n    if post_ids:\n        while True:\n            seg_ids = [\n                seg_id\n                for (seg_id,) in db.session.query(TranscriptSegment.id)\n                .filter(TranscriptSegment.post_id.in_(post_ids))\n                .limit(batch_size)\n                .all()\n            ]\n            if not seg_ids:\n                break\n            db.session.query(Identification).filter(\n                Identification.transcript_segment_id.in_(seg_ids)\n            ).delete(synchronize_session=False)\n            db.session.query(TranscriptSegment).filter(\n                TranscriptSegment.id.in_(seg_ids)\n            ).delete(synchronize_session=False)\n\n        while True:\n            mc_ids = [\n                mc_id\n                for (mc_id,) in db.session.query(ModelCall.id)\n                
.filter(ModelCall.post_id.in_(post_ids))\n                .limit(batch_size)\n                .all()\n            ]\n            if not mc_ids:\n                break\n            db.session.query(ModelCall).filter(ModelCall.id.in_(mc_ids)).delete(\n                synchronize_session=False\n            )\n\n        while True:\n            job_ids = [\n                job_id\n                for (job_id,) in db.session.query(ProcessingJob.id)\n                .filter(ProcessingJob.post_guid.in_(post_guids))\n                .limit(batch_size)\n                .all()\n            ]\n            if not job_ids:\n                break\n            db.session.query(ProcessingJob).filter(\n                ProcessingJob.id.in_(job_ids)\n            ).delete(synchronize_session=False)\n\n        db.session.query(Post).filter(Post.id.in_(post_ids)).delete(\n            synchronize_session=False\n        )\n\n    FeedAccessToken.query.filter(FeedAccessToken.feed_id == feed_id_i).delete(\n        synchronize_session=False\n    )\n    UserFeed.query.filter(UserFeed.feed_id == feed_id_i).delete(\n        synchronize_session=False\n    )\n    db.session.delete(feed)\n    return {\"deleted\": True, \"feed_id\": feed_id_i}\n\n\ndef _hash_token(secret_value: str) -> str:\n    return hashlib.sha256(secret_value.encode(\"utf-8\")).hexdigest()\n\n\ndef create_feed_access_token_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    user_id = params.get(\"user_id\")\n    feed_id = params.get(\"feed_id\")\n\n    if not user_id:\n        raise ValueError(\"user_id is required\")\n\n    # feed_id can be None for aggregate tokens\n\n    query = FeedAccessToken.query.filter_by(user_id=int(user_id), revoked=False)\n\n    if feed_id is not None:\n        query = query.filter_by(feed_id=int(feed_id))\n    else:\n        query = query.filter(FeedAccessToken.feed_id.is_(None))\n\n    existing = query.first()\n\n    if existing is not None:\n        if existing.token_secret:\n            return 
{\"token_id\": existing.token_id, \"secret\": existing.token_secret}\n\n        secret_value = secrets.token_urlsafe(18)\n        existing.token_hash = _hash_token(secret_value)\n        existing.token_secret = secret_value\n        db.session.flush()\n        return {\"token_id\": existing.token_id, \"secret\": secret_value}\n\n    token_id = uuid.uuid4().hex\n    secret_value = secrets.token_urlsafe(18)\n    token = FeedAccessToken(\n        token_id=token_id,\n        token_hash=_hash_token(secret_value),\n        token_secret=secret_value,\n        feed_id=int(feed_id) if feed_id is not None else None,\n        user_id=int(user_id),\n    )\n    db.session.add(token)\n    db.session.flush()\n    return {\"token_id\": token_id, \"secret\": secret_value}\n\n\ndef touch_feed_access_token_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    token_id = params.get(\"token_id\")\n    secret_value = params.get(\"secret\")\n    if not token_id:\n        raise ValueError(\"token_id is required\")\n\n    token = FeedAccessToken.query.filter_by(token_id=token_id, revoked=False).first()\n    if token is None:\n        return {\"updated\": False}\n\n    token.last_used_at = datetime.utcnow()\n    if token.token_secret is None and secret_value:\n        token.token_secret = str(secret_value)\n    db.session.flush()\n    return {\"updated\": True}\n"
  },
  {
    "path": "src/app/writer/actions/jobs.py",
    "content": "from datetime import datetime, timedelta\nfrom typing import Any, Dict, Optional\n\nfrom app.extensions import db\nfrom app.jobs_manager_run_service import recalculate_run_counts\nfrom app.models import ProcessingJob\n\n\ndef dequeue_job_action(params: Dict[str, Any]) -> Optional[Dict[str, Any]]:\n    run_id = params.get(\"run_id\")\n\n    # Check for running jobs\n    running_job = (\n        ProcessingJob.query.filter(ProcessingJob.status == \"running\")\n        .order_by(ProcessingJob.started_at.desc().nullslast())\n        .first()\n    )\n    if running_job:\n        return None\n\n    job = (\n        ProcessingJob.query.filter(ProcessingJob.status == \"pending\")\n        .order_by(ProcessingJob.created_at.asc())\n        .first()\n    )\n    if not job:\n        return None\n\n    job.status = \"running\"\n    job.started_at = datetime.utcnow()\n\n    if run_id and job.jobs_manager_run_id != run_id:\n        job.jobs_manager_run_id = run_id\n\n    return {\"job_id\": job.id, \"post_guid\": job.post_guid}\n\n\ndef cleanup_stale_jobs_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    older_than_seconds = params.get(\"older_than_seconds\", 3600)\n    cutoff = datetime.utcnow() - timedelta(seconds=older_than_seconds)\n\n    old_jobs = ProcessingJob.query.filter(ProcessingJob.created_at < cutoff).all()\n\n    count = len(old_jobs)\n    for job in old_jobs:\n        db.session.delete(job)\n\n    return {\"count\": count}\n\n\ndef clear_all_jobs_action(params: Dict[str, Any]) -> int:\n    all_jobs = ProcessingJob.query.all()\n    count = len(all_jobs)\n    for job in all_jobs:\n        db.session.delete(job)\n    return count\n\n\ndef create_job_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    job_data = params.get(\"job_data\")\n    if not isinstance(job_data, dict):\n        raise ValueError(\"job_data must be a dictionary\")\n\n    # Convert date strings back to datetime objects if necessary\n    if \"created_at\" in job_data and 
isinstance(job_data[\"created_at\"], str):\n        job_data[\"created_at\"] = datetime.fromisoformat(job_data[\"created_at\"])\n\n    job = ProcessingJob(**job_data)\n    db.session.add(job)\n\n    if job.jobs_manager_run_id:\n        recalculate_run_counts(db.session)\n\n    db.session.flush()\n    return {\"job_id\": job.id}\n\n\ndef cancel_existing_jobs_action(params: Dict[str, Any]) -> int:\n    post_guid = params.get(\"post_guid\")\n    current_job_id = params.get(\"current_job_id\")\n\n    existing_jobs = (\n        ProcessingJob.query.filter_by(post_guid=post_guid)\n        .filter(\n            ProcessingJob.status.in_([\"pending\", \"running\"]),\n            ProcessingJob.id != current_job_id,\n        )\n        .all()\n    )\n\n    count = len(existing_jobs)\n    for existing_job in existing_jobs:\n        db.session.delete(existing_job)\n\n    if count > 0:\n        recalculate_run_counts(db.session)\n\n    return count\n\n\ndef update_job_status_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    job_id = params.get(\"job_id\")\n    status = params.get(\"status\")\n    step = params.get(\"step\")\n    step_name = params.get(\"step_name\")\n    progress = params.get(\"progress\")\n    error_message = params.get(\"error_message\")\n\n    job = db.session.get(ProcessingJob, job_id)\n    if not job:\n        raise ValueError(f\"Job {job_id} not found\")\n\n    job.status = status\n    job.current_step = step\n    job.step_name = step_name\n    if progress is not None:\n        job.progress_percentage = progress\n\n    if error_message:\n        job.error_message = error_message\n\n    if status == \"running\" and not job.started_at:\n        job.started_at = datetime.utcnow()\n    elif (\n        status in [\"completed\", \"failed\", \"cancelled\", \"skipped\"]\n        and not job.completed_at\n    ):\n        job.completed_at = datetime.utcnow()\n\n    if job.jobs_manager_run_id:\n        recalculate_run_counts(db.session)\n\n    return {\"job_id\": 
job.id, \"status\": job.status}\n\n\ndef mark_cancelled_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    job_id = params.get(\"job_id\")\n    reason = params.get(\"reason\")\n\n    job = db.session.get(ProcessingJob, job_id)\n    if not job:\n        raise ValueError(f\"Job {job_id} not found\")\n\n    job.status = \"cancelled\"\n    job.error_message = reason\n    job.completed_at = datetime.utcnow()\n\n    if job.jobs_manager_run_id:\n        recalculate_run_counts(db.session)\n\n    return {\"job_id\": job.id, \"status\": \"cancelled\"}\n\n\ndef reassign_pending_jobs_action(params: Dict[str, Any]) -> int:\n    run_id = params.get(\"run_id\")\n    if not run_id:\n        return 0\n\n    pending_jobs = (\n        ProcessingJob.query.filter(ProcessingJob.status == \"pending\")\n        .order_by(ProcessingJob.created_at.asc())\n        .all()\n    )\n\n    reassigned = 0\n    for job in pending_jobs:\n        if job.jobs_manager_run_id != run_id:\n            job.jobs_manager_run_id = run_id\n            reassigned += 1\n\n    if reassigned:\n        recalculate_run_counts(db.session)\n\n    return reassigned\n"
  },
  {
    "path": "src/app/writer/actions/processor.py",
    "content": "from __future__ import annotations\n\nfrom datetime import datetime\nfrom typing import Any, Dict, Iterable, List\n\nfrom sqlalchemy.dialects.sqlite import insert as sqlite_insert\nfrom sqlalchemy.exc import IntegrityError\n\nfrom app.extensions import db\nfrom app.models import Identification, ModelCall, TranscriptSegment\n\n\ndef upsert_model_call_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    post_id = params.get(\"post_id\")\n    model_name = params.get(\"model_name\")\n    first_seq = params.get(\"first_segment_sequence_num\")\n    last_seq = params.get(\"last_segment_sequence_num\")\n    prompt = params.get(\"prompt\")\n\n    if post_id is None or model_name is None or first_seq is None or last_seq is None:\n        raise ValueError(\n            \"post_id, model_name, first_segment_sequence_num, last_segment_sequence_num are required\"\n        )\n    if not isinstance(prompt, str) or not prompt:\n        raise ValueError(\"prompt is required\")\n\n    def _query() -> ModelCall | None:\n        return (\n            db.session.query(ModelCall)\n            .filter_by(\n                post_id=int(post_id),\n                model_name=str(model_name),\n                first_segment_sequence_num=int(first_seq),\n                last_segment_sequence_num=int(last_seq),\n            )\n            .order_by(ModelCall.timestamp.desc())\n            .first()\n        )\n\n    model_call = _query()\n    if model_call is None:\n        model_call = ModelCall(\n            post_id=int(post_id),\n            first_segment_sequence_num=int(first_seq),\n            last_segment_sequence_num=int(last_seq),\n            model_name=str(model_name),\n            prompt=str(prompt),\n            status=\"pending\",\n            timestamp=datetime.utcnow(),\n            retry_attempts=0,\n            error_message=None,\n            response=None,\n        )\n        db.session.add(model_call)\n        try:\n            db.session.flush()\n        
except IntegrityError:\n            db.session.rollback()\n            model_call = _query()\n            if model_call is None:\n                raise\n\n    # Match prior behavior: reset only when pending/failed_retries.\n    if model_call.status in [\"pending\", \"failed_retries\"]:\n        model_call.status = \"pending\"\n        model_call.prompt = str(prompt)\n        model_call.retry_attempts = 0\n        model_call.error_message = None\n        model_call.response = None\n\n    db.session.flush()\n    return {\"model_call_id\": int(model_call.id)}\n\n\ndef upsert_whisper_model_call_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    post_id = params.get(\"post_id\")\n    model_name = params.get(\"model_name\")\n    first_seq = params.get(\"first_segment_sequence_num\", 0)\n    last_seq = params.get(\"last_segment_sequence_num\", -1)\n    prompt = params.get(\"prompt\") or \"Whisper transcription job\"\n\n    if post_id is None or model_name is None:\n        raise ValueError(\"post_id and model_name are required\")\n\n    reset_fields: Dict[str, Any] = params.get(\"reset_fields\") or {\n        \"status\": \"pending\",\n        \"prompt\": \"Whisper transcription job\",\n        \"retry_attempts\": 0,\n        \"error_message\": None,\n        \"response\": None,\n    }\n\n    def _query() -> ModelCall | None:\n        return (\n            db.session.query(ModelCall)\n            .filter_by(\n                post_id=int(post_id),\n                model_name=str(model_name),\n                first_segment_sequence_num=int(first_seq),\n                last_segment_sequence_num=int(last_seq),\n            )\n            .order_by(ModelCall.timestamp.desc())\n            .first()\n        )\n\n    model_call = _query()\n    if model_call is None:\n        model_call = ModelCall(\n            post_id=int(post_id),\n            model_name=str(model_name),\n            first_segment_sequence_num=int(first_seq),\n            
last_segment_sequence_num=int(last_seq),\n            prompt=str(prompt),\n            status=str(reset_fields.get(\"status\") or \"pending\"),\n            retry_attempts=int(reset_fields.get(\"retry_attempts\") or 0),\n            error_message=reset_fields.get(\"error_message\"),\n            response=reset_fields.get(\"response\"),\n            timestamp=datetime.utcnow(),\n        )\n        db.session.add(model_call)\n        try:\n            db.session.flush()\n        except IntegrityError:\n            db.session.rollback()\n            model_call = _query()\n            if model_call is None:\n                raise\n\n    for k, v in reset_fields.items():\n        if hasattr(model_call, k):\n            setattr(model_call, k, v)\n\n    db.session.flush()\n    return {\"model_call_id\": int(model_call.id)}\n\n\ndef _normalize_segments_payload(\n    segments: Iterable[Dict[str, Any]],\n) -> List[Dict[str, Any]]:\n    normalized: List[Dict[str, Any]] = []\n    for seg in segments:\n        if not isinstance(seg, dict):\n            continue\n        normalized.append(\n            {\n                \"post_id\": int(seg[\"post_id\"]),\n                \"sequence_num\": int(seg[\"sequence_num\"]),\n                \"start_time\": float(seg[\"start_time\"]),\n                \"end_time\": float(seg[\"end_time\"]),\n                \"text\": str(seg[\"text\"]),\n            }\n        )\n    return normalized\n\n\ndef replace_transcription_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    post_id = params.get(\"post_id\")\n    segments = params.get(\"segments\")\n    model_call_id = params.get(\"model_call_id\")\n\n    if post_id is None:\n        raise ValueError(\"post_id is required\")\n    if not isinstance(segments, list):\n        raise ValueError(\"segments must be a list\")\n\n    post_id_i = int(post_id)\n\n    seg_ids = [\n        row[0]\n        for row in db.session.query(TranscriptSegment.id)\n        .filter(TranscriptSegment.post_id == 
post_id_i)\n        .all()\n    ]\n    if seg_ids:\n        db.session.query(Identification).filter(\n            Identification.transcript_segment_id.in_(seg_ids)\n        ).delete(synchronize_session=False)\n\n    db.session.query(TranscriptSegment).filter(\n        TranscriptSegment.post_id == post_id_i\n    ).delete(synchronize_session=False)\n\n    payload = []\n    for i, seg in enumerate(segments):\n        if not isinstance(seg, dict):\n            continue\n        payload.append(\n            {\n                \"post_id\": post_id_i,\n                \"sequence_num\": int(seg.get(\"sequence_num\", i)),\n                \"start_time\": float(seg[\"start_time\"]),\n                \"end_time\": float(seg[\"end_time\"]),\n                \"text\": str(seg[\"text\"]),\n            }\n        )\n\n    if payload:\n        db.session.execute(sqlite_insert(TranscriptSegment).values(payload))\n\n    if model_call_id is not None:\n        mc = db.session.get(ModelCall, int(model_call_id))\n        if mc is not None:\n            mc.first_segment_sequence_num = 0\n            mc.last_segment_sequence_num = len(payload) - 1\n            mc.response = f\"{len(payload)} segments transcribed.\"\n            mc.status = \"success\"\n            mc.error_message = None\n\n    db.session.flush()\n    return {\"post_id\": post_id_i, \"segment_count\": len(payload)}\n\n\ndef mark_model_call_failed_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    model_call_id = params.get(\"model_call_id\")\n    error_message = params.get(\"error_message\")\n    status = params.get(\"status\", \"failed_permanent\")\n\n    if model_call_id is None:\n        raise ValueError(\"model_call_id is required\")\n\n    mc = db.session.get(ModelCall, int(model_call_id))\n    if mc is None:\n        return {\"updated\": False}\n\n    mc.status = str(status)\n    mc.error_message = str(error_message) if error_message is not None else None\n    db.session.flush()\n    return {\"updated\": True, 
\"model_call_id\": int(mc.id)}\n\n\ndef insert_identifications_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    identifications = params.get(\"identifications\")\n    if not isinstance(identifications, list):\n        raise ValueError(\"identifications must be a list\")\n\n    values = []\n    for ident in identifications:\n        if not isinstance(ident, dict):\n            continue\n        values.append(\n            {\n                \"transcript_segment_id\": int(ident[\"transcript_segment_id\"]),\n                \"model_call_id\": int(ident[\"model_call_id\"]),\n                \"label\": str(ident.get(\"label\") or \"ad\"),\n                \"confidence\": ident.get(\"confidence\"),\n            }\n        )\n\n    if not values:\n        return {\"inserted\": 0}\n\n    stmt = sqlite_insert(Identification).values(values).prefix_with(\"OR IGNORE\")\n    result = db.session.execute(stmt)\n    db.session.flush()\n    return {\"inserted\": int(getattr(result, \"rowcount\", 0) or 0)}\n\n\ndef replace_identifications_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    delete_ids = params.get(\"delete_ids\") or []\n    new_identifications = params.get(\"new_identifications\") or []\n\n    if not isinstance(delete_ids, list) or not isinstance(new_identifications, list):\n        raise ValueError(\"delete_ids and new_identifications must be lists\")\n\n    if delete_ids:\n        db.session.query(Identification).filter(\n            Identification.id.in_([int(i) for i in delete_ids])\n        ).delete(synchronize_session=False)\n\n    inserted = insert_identifications_action(\n        {\"identifications\": new_identifications}\n    ).get(\"inserted\", 0)\n\n    db.session.flush()\n    return {\"deleted\": len(delete_ids), \"inserted\": int(inserted)}\n"
  },
  {
    "path": "src/app/writer/actions/system.py",
    "content": "import logging\nfrom datetime import datetime\nfrom typing import Any, Dict\n\nfrom app.extensions import db\nfrom app.jobs_manager_run_service import get_or_create_singleton_run\nfrom app.models import DiscordSettings\n\nlogger = logging.getLogger(\"writer\")\n\n\ndef ensure_active_run_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    trigger = params.get(\"trigger\", \"system\")\n    context = params.get(\"context\")\n\n    logger.info(\n        \"[WRITER] ensure_active_run_action: trigger=%s context_keys=%s\",\n        trigger,\n        list(context.keys()) if isinstance(context, dict) else None,\n    )\n\n    run = get_or_create_singleton_run(db.session, trigger, context)\n    db.session.flush()  # Ensure ID is available\n\n    logger.info(\n        \"[WRITER] ensure_active_run_action: obtained run_id=%s status=%s\",\n        getattr(run, \"id\", None),\n        getattr(run, \"status\", None),\n    )\n\n    return {\"run_id\": run.id}\n\n\ndef update_discord_settings_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    settings = db.session.get(DiscordSettings, 1)\n    if settings is None:\n        settings = DiscordSettings(id=1)\n        db.session.add(settings)\n\n    for field in (\n        \"client_id\",\n        \"client_secret\",\n        \"redirect_uri\",\n        \"guild_ids\",\n        \"allow_registration\",\n    ):\n        if field in params:\n            setattr(settings, field, params.get(field))\n\n    settings.updated_at = datetime.utcnow()\n    db.session.flush()\n    return {\"updated\": True}\n\n\ndef update_combined_config_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    payload = params.get(\"payload\")\n    if not isinstance(payload, dict):\n        raise ValueError(\"payload must be a dictionary\")\n\n    # Import locally to avoid cyclic dependencies\n    from app.config_store import (  # pylint: disable=import-outside-toplevel\n        hydrate_runtime_config_inplace,\n        update_combined,\n    )\n\n    
updated = update_combined(payload)\n\n    # Ensure the running process sees the new config immediately\n    hydrate_runtime_config_inplace()\n\n    # Reset processor instance to pick up new config (e.g. litellm globals)\n    # Import locally to avoid cyclic dependencies\n    import importlib\n\n    processor = importlib.import_module(\"app.processor\")\n    processor.ProcessorSingleton.reset_instance()\n\n    if not isinstance(updated, dict):\n        return {\"updated\": True}\n    return updated\n"
  },
  {
    "path": "src/app/writer/actions/users.py",
    "content": "from datetime import datetime\nfrom typing import Any, Dict\n\nfrom app.extensions import db\nfrom app.models import FeedAccessToken, User\n\n\ndef create_user_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    username = (params.get(\"username\") or \"\").strip().lower()\n    password = params.get(\"password\")\n    role = params.get(\"role\") or \"user\"\n\n    if not username:\n        raise ValueError(\"username is required\")\n    if not isinstance(password, str) or not password:\n        raise ValueError(\"password is required\")\n    if role not in {\"admin\", \"user\"}:\n        raise ValueError(\"role must be 'admin' or 'user'\")\n    if User.query.filter_by(username=username).first():\n        raise ValueError(\"A user with that username already exists\")\n\n    user = User(username=username, role=role)\n    user.set_password(password)\n    db.session.add(user)\n    db.session.flush()\n    return {\"user_id\": user.id}\n\n\ndef update_user_password_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    user_id = params.get(\"user_id\")\n    new_password = params.get(\"new_password\")\n    if not user_id:\n        raise ValueError(\"user_id is required\")\n    if not isinstance(new_password, str) or not new_password:\n        raise ValueError(\"new_password is required\")\n\n    user = db.session.get(User, int(user_id))\n    if not user:\n        raise ValueError(f\"User {user_id} not found\")\n\n    user.set_password(new_password)\n    db.session.flush()\n    return {\"user_id\": user.id}\n\n\ndef delete_user_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    user_id = params.get(\"user_id\")\n    if not user_id:\n        raise ValueError(\"user_id is required\")\n    user = db.session.get(User, int(user_id))\n    if not user:\n        return {\"deleted\": False}\n\n    # FeedAccessToken.user_id is non-nullable; without cascading deletes SQLAlchemy\n    # will attempt to NULL the FK when deleting a User, causing an 
IntegrityError.\n    # Delete tokens explicitly as part of the writer action.\n    tokens = (\n        db.session.query(FeedAccessToken)\n        .filter(FeedAccessToken.user_id == user.id)\n        .all()\n    )\n    for token in tokens:\n        db.session.delete(token)\n\n    db.session.delete(user)\n    return {\"deleted\": True}\n\n\ndef set_user_role_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    user_id = params.get(\"user_id\")\n    role = params.get(\"role\")\n    if not user_id or not role:\n        raise ValueError(\"user_id and role are required\")\n    if role not in {\"admin\", \"user\"}:\n        raise ValueError(\"role must be 'admin' or 'user'\")\n    user = db.session.get(User, int(user_id))\n    if not user:\n        raise ValueError(f\"User {user_id} not found\")\n    user.role = role\n    db.session.flush()\n    return {\"user_id\": user.id}\n\n\ndef set_manual_feed_allowance_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    user_id = params.get(\"user_id\")\n    allowance = params.get(\"allowance\")\n\n    if not user_id:\n        raise ValueError(\"user_id is required\")\n\n    user = db.session.get(User, int(user_id))\n    if not user:\n        raise ValueError(f\"User {user_id} not found\")\n\n    if allowance is None:\n        user.manual_feed_allowance = None\n    else:\n        try:\n            user.manual_feed_allowance = int(allowance)\n        except (ValueError, TypeError) as exc:\n            raise ValueError(\"allowance must be an integer or None\") from exc\n\n    db.session.flush()\n    return {\"user_id\": user.id}\n\n\ndef upsert_discord_user_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    discord_id = params.get(\"discord_id\")\n    discord_username = params.get(\"discord_username\")\n    allow_registration = bool(params.get(\"allow_registration\", True))\n\n    if not discord_id or not discord_username:\n        raise ValueError(\"discord_id and discord_username are required\")\n\n    existing_user: User 
| None = User.query.filter_by(\n        discord_id=str(discord_id)\n    ).first()\n    if existing_user:\n        existing_user.discord_username = str(discord_username)\n        db.session.flush()\n        return {\"user_id\": existing_user.id, \"created\": False}\n\n    if not allow_registration:\n        raise ValueError(\"Self-registration via Discord is disabled\")\n\n    base_username = str(discord_username).lower().replace(\" \", \"_\")[:50]\n    username = base_username\n    counter = 1\n    while User.query.filter_by(username=username).first():\n        username = f\"{base_username}_{counter}\"\n        counter += 1\n\n    new_user = User(\n        username=username,\n        password_hash=\"\",\n        role=\"user\",\n        discord_id=str(discord_id),\n        discord_username=str(discord_username),\n    )\n    db.session.add(new_user)\n    db.session.flush()\n    return {\"user_id\": new_user.id, \"created\": True}\n\n\ndef set_user_billing_fields_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    user_id = params.get(\"user_id\")\n    if not user_id:\n        raise ValueError(\"user_id is required\")\n\n    user = db.session.get(User, int(user_id))\n    if not user:\n        raise ValueError(f\"User {user_id} not found\")\n\n    if \"stripe_customer_id\" in params:\n        user.stripe_customer_id = params.get(\"stripe_customer_id\")\n    if \"stripe_subscription_id\" in params:\n        user.stripe_subscription_id = params.get(\"stripe_subscription_id\")\n    if \"feed_allowance\" in params:\n        user.feed_allowance = int(params.get(\"feed_allowance\") or 0)\n    if \"feed_subscription_status\" in params:\n        user.feed_subscription_status = params.get(\"feed_subscription_status\") or \"\"\n\n    db.session.flush()\n    return {\"user_id\": user.id}\n\n\ndef set_user_billing_by_customer_id_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    customer_id = params.get(\"stripe_customer_id\")\n    if not customer_id:\n        raise 
ValueError(\"stripe_customer_id is required\")\n\n    user = User.query.filter_by(stripe_customer_id=customer_id).first()\n    if not user:\n        return {\"updated\": False}\n\n    if \"stripe_subscription_id\" in params:\n        user.stripe_subscription_id = params.get(\"stripe_subscription_id\")\n    if \"feed_allowance\" in params:\n        user.feed_allowance = int(params.get(\"feed_allowance\") or 0)\n    if \"feed_subscription_status\" in params:\n        user.feed_subscription_status = params.get(\"feed_subscription_status\") or \"\"\n\n    db.session.flush()\n    return {\"updated\": True, \"user_id\": user.id}\n\n\ndef update_user_last_active_action(params: Dict[str, Any]) -> Dict[str, Any]:\n    user_id = params.get(\"user_id\")\n    if not user_id:\n        raise ValueError(\"user_id is required\")\n\n    user = db.session.get(User, int(user_id))\n    if not user:\n        raise ValueError(f\"User {user_id} not found\")\n\n    user.last_active = datetime.utcnow()\n    db.session.flush()\n    return {\"user_id\": user.id, \"last_active\": user.last_active.isoformat()}\n"
  },
  {
    "path": "src/app/writer/client.py",
    "content": "import os\nimport uuid\nfrom queue import Empty\nfrom typing import Any, Callable, Dict, Optional, cast\n\nfrom flask import current_app\n\nfrom app.ipc import make_client_manager\nfrom app.writer.model_ops import execute_model_command\nfrom app.writer.protocol import WriteCommand, WriteCommandType, WriteResult\n\n\nclass WriterClient:\n    def __init__(self) -> None:\n        self.manager: Any = None\n        self.queue: Any = None\n\n    def connect(self) -> None:\n        if not self.manager:\n            self.manager = make_client_manager()\n            self.queue = self.manager.get_command_queue()  # pylint: disable=no-member\n\n    def _should_use_local_fallback(self) -> bool:\n        if os.environ.get(\"PYTEST_CURRENT_TEST\"):\n            return True\n        if os.environ.get(\"PODLY_WRITER_LOCAL_FALLBACK\") == \"1\":\n            return True\n        try:\n            return bool(getattr(current_app, \"testing\", False))\n        except Exception:  # pylint: disable=broad-except\n            return False\n\n    def _local_execute(self, cmd: WriteCommand) -> WriteResult:\n        # Import locally to avoid cyclic dependencies\n        from app import models  # pylint: disable=import-outside-toplevel\n        from app.extensions import db  # pylint: disable=import-outside-toplevel\n\n        model_map: Dict[str, Any] = {}\n        for name, obj in vars(models).items():\n            if isinstance(obj, type) and issubclass(obj, db.Model) and obj != db.Model:\n                model_map[name] = obj\n\n        try:\n            if cmd.type == WriteCommandType.TRANSACTION:\n                return self._local_execute_transaction(cmd, model_map)\n\n            result = self._local_execute_single(cmd, model_map)\n            if result.success:\n                db.session.commit()\n            else:\n                db.session.rollback()\n            return result\n        except Exception as exc:  # pylint: disable=broad-except\n            
db.session.rollback()\n            return WriteResult(cmd.id, False, error=str(exc))\n\n    def _local_execute_single(\n        self, cmd: WriteCommand, model_map: Dict[str, Any]\n    ) -> WriteResult:\n        if cmd.type == WriteCommandType.ACTION:\n            return self._local_execute_action(cmd)\n        return self._local_execute_model(cmd, model_map)\n\n    def _local_execute_transaction(\n        self, cmd: WriteCommand, model_map: Dict[str, Any]\n    ) -> WriteResult:\n        # Import locally to avoid cyclic dependencies\n        from app.extensions import db  # pylint: disable=import-outside-toplevel\n\n        results = []\n        for sub_cmd_data in cmd.data.get(\"commands\", []):\n            if isinstance(sub_cmd_data, dict):\n                sub_cmd = WriteCommand(\n                    id=sub_cmd_data.get(\"id\", \"sub\"),\n                    type=WriteCommandType(sub_cmd_data.get(\"type\")),\n                    model=sub_cmd_data.get(\"model\"),\n                    data=sub_cmd_data.get(\"data\", {}),\n                )\n            else:\n                sub_cmd = sub_cmd_data\n\n            res = self._local_execute_single(sub_cmd, model_map)\n            if not res.success:\n                db.session.rollback()\n                return WriteResult(\n                    cmd.id,\n                    False,\n                    error=f\"Transaction failed at {sub_cmd.id}: {res.error}\",\n                )\n            results.append(res)\n\n        db.session.commit()\n        return WriteResult(cmd.id, True, data={\"results\": [r.data for r in results]})\n\n    def _local_execute_action(self, cmd: WriteCommand) -> WriteResult:\n        # Import locally to avoid cyclic dependencies\n        # pylint: disable=import-outside-toplevel\n        from app.writer import actions as writer_actions\n\n        action_name = cmd.data.get(\"action\")\n        func_name = f\"{action_name}_action\" if action_name else None\n        func_obj = 
getattr(writer_actions, func_name, None) if func_name else None\n        if func_obj is None or not callable(func_obj):\n            return WriteResult(cmd.id, False, error=f\"Unknown action: {action_name}\")\n\n        func = cast(Callable[[Dict[str, Any]], Any], func_obj)\n        result = func(cmd.data.get(\"params\", {}))  # pylint: disable=not-callable\n        return WriteResult(\n            cmd.id,\n            True,\n            data=result if isinstance(result, dict) else {\"result\": result},\n        )\n\n    def _local_execute_model(\n        self, cmd: WriteCommand, model_map: Dict[str, Any]\n    ) -> WriteResult:\n        # Import locally to avoid cyclic dependencies\n        from app.extensions import db  # pylint: disable=import-outside-toplevel\n\n        if not cmd.model or cmd.model not in model_map:\n            return WriteResult(cmd.id, False, error=f\"Unknown model: {cmd.model}\")\n\n        model_cls = model_map[cmd.model]\n        return execute_model_command(\n            cmd=cmd, model_cls=model_cls, db_session=db.session\n        )\n\n    def submit(\n        self, cmd: WriteCommand, wait: bool = False, timeout: int = 10\n    ) -> Optional[WriteResult]:\n        if not self.queue:\n            try:\n                self.connect()\n            except Exception:  # pylint: disable=broad-except\n                if self._should_use_local_fallback():\n                    result = self._local_execute(cmd)\n                    return result if wait else None\n                raise\n\n        if wait:\n            if not self.manager:\n                raise RuntimeError(\"Manager not connected\")\n            # Create a temporary queue for the reply\n            reply_q = self.manager.Queue()  # pylint: disable=no-member\n            cmd.reply_queue = reply_q\n\n        if self.queue:\n            self.queue.put(cmd)\n\n        if wait:\n            try:\n                return reply_q.get(timeout=timeout)  # type: ignore\n            except 
Empty as exc:\n                raise TimeoutError(\"Writer service did not respond\") from exc\n        return None\n\n    def create(\n        self, model: str, data: Dict[str, Any], wait: bool = True\n    ) -> Optional[WriteResult]:\n        cmd = WriteCommand(\n            id=str(uuid.uuid4()), type=WriteCommandType.CREATE, model=model, data=data\n        )\n        return self.submit(cmd, wait=wait)\n\n    def update(\n        self, model: str, pk: Any, data: Dict[str, Any], wait: bool = True\n    ) -> Optional[WriteResult]:\n        data[\"id\"] = pk\n        cmd = WriteCommand(\n            id=str(uuid.uuid4()), type=WriteCommandType.UPDATE, model=model, data=data\n        )\n        return self.submit(cmd, wait=wait)\n\n    def delete(self, model: str, pk: Any, wait: bool = True) -> Optional[WriteResult]:\n        cmd = WriteCommand(\n            id=str(uuid.uuid4()),\n            type=WriteCommandType.DELETE,\n            model=model,\n            data={\"id\": pk},\n        )\n        return self.submit(cmd, wait=wait)\n\n    def action(\n        self, action_name: str, params: Dict[str, Any], wait: bool = True\n    ) -> Optional[WriteResult]:\n        cmd = WriteCommand(\n            id=str(uuid.uuid4()),\n            type=WriteCommandType.ACTION,\n            model=None,\n            data={\"action\": action_name, \"params\": params},\n        )\n        return self.submit(cmd, wait=wait)\n\n\n# Singleton instance\nwriter_client = WriterClient()\n"
  },
  {
    "path": "src/app/writer/executor.py",
    "content": "import logging\nfrom typing import Any, Callable, Dict\n\nfrom flask import Flask\n\nfrom app import models\nfrom app.extensions import db\nfrom app.writer import actions as writer_actions\nfrom app.writer.model_ops import execute_model_command\nfrom app.writer.protocol import WriteCommand, WriteCommandType, WriteResult\n\nlogger = logging.getLogger(\"writer\")\n\n\nclass CommandExecutor:\n    def __init__(self, app: Flask):\n        self.app = app\n        self.models = self._discover_models()\n        self.actions: Dict[str, Any] = {}  # Registry for custom actions\n        self._register_default_actions()\n\n    def _register_default_actions(self) -> None:\n        self.register_action(\n            \"ensure_active_run\", writer_actions.ensure_active_run_action\n        )\n        self.register_action(\"dequeue_job\", writer_actions.dequeue_job_action)\n        self.register_action(\n            \"cleanup_stale_jobs\", writer_actions.cleanup_stale_jobs_action\n        )\n        self.register_action(\"clear_all_jobs\", writer_actions.clear_all_jobs_action)\n        self.register_action(\n            \"cleanup_missing_audio_paths\",\n            writer_actions.cleanup_missing_audio_paths_action,\n        )\n        self.register_action(\"create_job\", writer_actions.create_job_action)\n        self.register_action(\n            \"cancel_existing_jobs\", writer_actions.cancel_existing_jobs_action\n        )\n        self.register_action(\n            \"update_job_status\", writer_actions.update_job_status_action\n        )\n        self.register_action(\"mark_cancelled\", writer_actions.mark_cancelled_action)\n        self.register_action(\n            \"reassign_pending_jobs\", writer_actions.reassign_pending_jobs_action\n        )\n        self.register_action(\"refresh_feed\", writer_actions.refresh_feed_action)\n        self.register_action(\"add_feed\", writer_actions.add_feed_action)\n        self.register_action(\n            
\"update_feed_settings\", writer_actions.update_feed_settings_action\n        )\n        self.register_action(\n            \"clear_post_processing_data\",\n            writer_actions.clear_post_processing_data_action,\n        )\n        self.register_action(\n            \"cleanup_processed_post\", writer_actions.cleanup_processed_post_action\n        )\n        self.register_action(\n            \"increment_download_count\", writer_actions.increment_download_count_action\n        )\n        self.register_action(\n            \"set_user_billing_fields\", writer_actions.set_user_billing_fields_action\n        )\n        self.register_action(\n            \"set_user_billing_by_customer_id\",\n            writer_actions.set_user_billing_by_customer_id_action,\n        )\n        self.register_action(\n            \"ensure_user_feed_membership\",\n            writer_actions.ensure_user_feed_membership_action,\n        )\n        self.register_action(\n            \"remove_user_feed_membership\",\n            writer_actions.remove_user_feed_membership_action,\n        )\n        self.register_action(\n            \"whitelist_latest_post_for_feed\",\n            writer_actions.whitelist_latest_post_for_feed_action,\n        )\n        self.register_action(\n            \"toggle_whitelist_all_for_feed\",\n            writer_actions.toggle_whitelist_all_for_feed_action,\n        )\n        self.register_action(\n            \"whitelist_post\",\n            writer_actions.whitelist_post_action,\n        )\n        self.register_action(\n            \"create_dev_test_feed\", writer_actions.create_dev_test_feed_action\n        )\n        self.register_action(\n            \"delete_feed_cascade\", writer_actions.delete_feed_cascade_action\n        )\n        self.register_action(\n            \"update_discord_settings\", writer_actions.update_discord_settings_action\n        )\n        self.register_action(\n            \"update_combined_config\", 
writer_actions.update_combined_config_action\n        )\n        self.register_action(\n            \"create_feed_access_token\", writer_actions.create_feed_access_token_action\n        )\n        self.register_action(\n            \"touch_feed_access_token\", writer_actions.touch_feed_access_token_action\n        )\n        self.register_action(\"create_user\", writer_actions.create_user_action)\n        self.register_action(\n            \"update_user_password\", writer_actions.update_user_password_action\n        )\n        self.register_action(\"delete_user\", writer_actions.delete_user_action)\n        self.register_action(\"set_user_role\", writer_actions.set_user_role_action)\n        self.register_action(\n            \"set_manual_feed_allowance\", writer_actions.set_manual_feed_allowance_action\n        )\n        self.register_action(\n            \"upsert_discord_user\", writer_actions.upsert_discord_user_action\n        )\n\n        self.register_action(\n            \"upsert_model_call\", writer_actions.upsert_model_call_action\n        )\n        self.register_action(\n            \"upsert_whisper_model_call\", writer_actions.upsert_whisper_model_call_action\n        )\n        self.register_action(\n            \"replace_transcription\", writer_actions.replace_transcription_action\n        )\n        self.register_action(\n            \"mark_model_call_failed\", writer_actions.mark_model_call_failed_action\n        )\n        self.register_action(\n            \"insert_identifications\", writer_actions.insert_identifications_action\n        )\n        self.register_action(\n            \"replace_identifications\", writer_actions.replace_identifications_action\n        )\n        self.register_action(\n            \"update_user_last_active\", writer_actions.update_user_last_active_action\n        )\n\n    def _discover_models(self) -> Dict[str, Any]:\n        \"\"\"Discover all SQLAlchemy models in app.models\"\"\"\n        model_map = {}\n        for 
name, obj in vars(models).items():\n            if isinstance(obj, type) and issubclass(obj, db.Model) and obj != db.Model:\n                model_map[name] = obj\n        return model_map\n\n    def register_action(self, name: str, func: Callable[[Dict[str, Any]], Any]) -> None:\n        self.actions[name] = func\n\n    def process_command(self, cmd: WriteCommand) -> WriteResult:\n        with self.app.app_context():\n            try:\n                logger.info(\n                    \"[WRITER] Processing command: id=%s type=%s model=%s\",\n                    cmd.id,\n                    cmd.type,\n                    cmd.model,\n                )\n                if cmd.type == WriteCommandType.TRANSACTION:\n                    result = self._handle_transaction(cmd)\n                    if result.success:\n                        logger.debug(\n                            \"[WRITER] Committing TRANSACTION command id=%s\", cmd.id\n                        )\n                        db.session.commit()\n                    else:\n                        logger.debug(\n                            \"[WRITER] Rolling back TRANSACTION command id=%s\", cmd.id\n                        )\n                        db.session.rollback()\n                    return result\n\n                # Single operation\n                result = self._execute_single_command(cmd)\n                if result.success:\n                    # Suppress commit log for empty dequeue_job actions (polling)\n                    is_polling_noop = (\n                        cmd.type == WriteCommandType.ACTION\n                        and cmd.data.get(\"action\") == \"dequeue_job\"\n                        and not result.data\n                    )\n\n                    if not is_polling_noop:\n                        logger.info(\"[WRITER] Committing single command id=%s\", cmd.id)\n                    db.session.commit()\n                else:\n                    logger.info(\"[WRITER] Rolling 
back single command id=%s\", cmd.id)\n                    db.session.rollback()\n                return result\n\n            except Exception as e:\n                logger.error(\n                    \"[WRITER] Error processing command id=%s: %s\",\n                    cmd.id,\n                    e,\n                    exc_info=True,\n                )\n                db.session.rollback()\n                return WriteResult(cmd.id, False, error=str(e))\n\n    def _execute_single_command(self, cmd: WriteCommand) -> WriteResult:\n        if cmd.type == WriteCommandType.ACTION:\n            return self._handle_action(cmd)\n\n        if not cmd.model or cmd.model not in self.models:\n            return WriteResult(cmd.id, False, error=f\"Unknown model: {cmd.model}\")\n\n        model_cls = self.models[cmd.model]\n        if cmd.type in (\n            WriteCommandType.CREATE,\n            WriteCommandType.UPDATE,\n            WriteCommandType.DELETE,\n        ):\n            return execute_model_command(\n                cmd=cmd, model_cls=model_cls, db_session=db.session\n            )\n\n        return WriteResult(cmd.id, False, error=\"Unknown command type\")\n\n    def _handle_transaction(self, cmd: WriteCommand) -> WriteResult:\n        sub_commands_data = cmd.data.get(\"commands\", [])\n        results = []\n\n        try:\n            for sub_cmd_data in sub_commands_data:\n                if isinstance(sub_cmd_data, dict):\n                    sub_cmd = WriteCommand(\n                        id=sub_cmd_data.get(\"id\", \"sub\"),\n                        type=WriteCommandType(sub_cmd_data.get(\"type\")),\n                        model=sub_cmd_data.get(\"model\"),\n                        data=sub_cmd_data.get(\"data\", {}),\n                    )\n                else:\n                    sub_cmd = sub_cmd_data\n\n                res = self._execute_single_command(sub_cmd)\n                if not res.success:\n                    # Let process_command 
handle rollback\n                    return WriteResult(\n                        cmd.id,\n                        False,\n                        error=f\"Transaction failed at {sub_cmd.id}: {res.error}\",\n                    )\n                results.append(res)\n\n            # Let process_command handle commit\n            return WriteResult(\n                cmd.id,\n                True,\n                data={\n                    \"results\": [\n                        {\n                            \"command_id\": r.command_id,\n                            \"success\": r.success,\n                            \"data\": r.data,\n                            \"error\": r.error,\n                        }\n                        for r in results\n                    ]\n                },\n            )\n\n        except Exception as e:\n            # Let process_command handle rollback\n            return WriteResult(cmd.id, False, error=str(e))\n\n    def _handle_action(self, cmd: WriteCommand) -> WriteResult:\n        action_name = cmd.data.get(\"action\")\n        if action_name not in self.actions:\n            return WriteResult(cmd.id, False, error=f\"Unknown action: {action_name}\")\n\n        func = self.actions[action_name]\n        try:\n            result = func(cmd.data.get(\"params\", {}))\n            # Commit is handled by process_command\n            return WriteResult(cmd.id, True, data=result)\n        except Exception:\n            # Re-raise with the original traceback; rollback is handled by process_command\n            raise\n"
  },
  {
    "path": "src/app/writer/model_ops.py",
    "content": "from __future__ import annotations\n\nfrom typing import Any\n\nfrom app.writer.protocol import WriteCommand, WriteCommandType, WriteResult\n\n\ndef execute_model_command(\n    *,\n    cmd: WriteCommand,\n    model_cls: Any,\n    db_session: Any,\n) -> WriteResult:\n    if cmd.type == WriteCommandType.CREATE:\n        obj = model_cls(**cmd.data)\n        db_session.add(obj)\n        db_session.flush()\n        data = {\"id\": obj.id} if hasattr(obj, \"id\") else None\n        return WriteResult(cmd.id, True, data=data)\n\n    if cmd.type == WriteCommandType.UPDATE:\n        pk = cmd.data.get(\"id\")\n        if not pk:\n            return WriteResult(cmd.id, False, error=\"Missing 'id' in data for UPDATE\")\n\n        obj = db_session.get(model_cls, pk)\n        if not obj:\n            return WriteResult(\n                cmd.id, False, error=f\"Record not found: {cmd.model} {pk}\"\n            )\n\n        for k, v in cmd.data.items():\n            if k != \"id\" and hasattr(obj, k):\n                setattr(obj, k, v)\n        return WriteResult(cmd.id, True)\n\n    if cmd.type == WriteCommandType.DELETE:\n        pk = cmd.data.get(\"id\")\n        if not pk:\n            return WriteResult(cmd.id, False, error=\"Missing 'id' in data for DELETE\")\n\n        obj = db_session.get(model_cls, pk)\n        if obj:\n            db_session.delete(obj)\n        return WriteResult(cmd.id, True)\n\n    return WriteResult(cmd.id, False, error=\"Unknown command type\")\n"
  },
  {
    "path": "src/app/writer/protocol.py",
    "content": "from dataclasses import dataclass\nfrom enum import Enum\nfrom typing import Any, Dict, Optional\n\n\nclass WriteCommandType(Enum):\n    CREATE = \"create\"\n    UPDATE = \"update\"\n    DELETE = \"delete\"\n    # Critical for integrity: Execute multiple operations in one commit\n    TRANSACTION = \"transaction\"\n    # For complex logic that needs to run inside the writer (e.g. \"deduct_credits_and_start_job\")\n    ACTION = \"action\"\n\n\n@dataclass\nclass WriteCommand:\n    id: str\n    type: WriteCommandType\n    model: Optional[str]\n    data: Dict[str, Any]\n    # The queue to send the result back to (managed by the client)\n    reply_queue: Any = None\n\n\n@dataclass\nclass WriteResult:\n    command_id: str\n    success: bool\n    data: Optional[Dict[str, Any]] = None\n    error: Optional[str] = None\n"
  },
  {
    "path": "src/app/writer/service.py",
    "content": "import logging\nimport threading\nimport time\n\nfrom app.ipc import get_queue, make_server_manager\nfrom app.logger import setup_logger\nfrom app.writer.protocol import WriteCommandType\n\nfrom .executor import CommandExecutor\n\nlogger = setup_logger(\"writer\", \"src/instance/logs/app.log\", level=logging.INFO)\n\n\ndef run_writer_service() -> None:\n    from app import create_writer_app\n\n    logger.info(\"Starting Writer Service...\")\n\n    # 1. Start the IPC Server\n    manager = make_server_manager()\n    server = manager.get_server()\n\n    server_thread = threading.Thread(target=server.serve_forever)\n    server_thread.daemon = True\n    server_thread.start()\n    logger.info(\"IPC Server started on port 50001\")\n\n    # 2. Get the queue\n    queue = get_queue()\n\n    # 3. Initialize App and Executor\n    app = create_writer_app()\n    executor = CommandExecutor(app)\n\n    logger.info(\"Writer Loop starting...\")\n\n    # 4. Writer Loop\n    while True:\n        try:\n            cmd = queue.get()\n\n            # Check if this is a polling command (dequeue_job)\n            is_polling = (\n                getattr(cmd, \"type\", None) == WriteCommandType.ACTION\n                and isinstance(getattr(cmd, \"data\", None), dict)\n                and cmd.data.get(\"action\") == \"dequeue_job\"\n            )\n\n            if not is_polling:\n                logger.info(\n                    \"[WRITER] Received command: id=%s type=%s model=%s has_reply=%s\",\n                    getattr(cmd, \"id\", None),\n                    getattr(cmd, \"type\", None),\n                    getattr(cmd, \"model\", None),\n                    bool(getattr(cmd, \"reply_queue\", None)),\n                )\n\n            result = executor.process_command(cmd)\n\n            # Only log finished/reply if not polling or if polling actually did something\n            if not is_polling or (result and result.data):\n                logger.info(\n                
    \"[WRITER] Finished command: id=%s success=%s error=%s\",\n                    getattr(result, \"command_id\", None),\n                    getattr(result, \"success\", None),\n                    getattr(result, \"error\", None),\n                )\n\n            if cmd.reply_queue:\n                if not is_polling or (result and result.data):\n                    logger.info(\n                        \"[WRITER] Sending reply for command id=%s\",\n                        getattr(cmd, \"id\", None),\n                    )\n                cmd.reply_queue.put(result)\n\n        except Exception as e:\n            logger.error(\"Error in writer loop: %s\", e, exc_info=True)\n            time.sleep(1)\n"
  },
  {
    "path": "src/boundary_refinement_prompt.jinja",
    "content": "You are analyzing podcast transcript segments to precisely identify advertisement boundaries.\n\nYour job is to determine the EXACT start and end points of advertisement content by analyzing transition patterns and content flow.\n\nBOUNDARY DETECTION RULES:\n\n**AD START INDICATORS** (extend boundary backward):\n- Sponsor introductions: \"This episode is brought to you by...\", \"And now a word from our sponsor\"\n- Transition phrases: \"Before we continue...\", \"Let me tell you about...\", \"Speaking of...\"\n- Host acknowledgments: \"I want to thank...\", \"Special thanks to...\", \"Our sponsor today is...\"\n- Subtle lead-ins: \"You know what's interesting...\", \"I've been using...\", \"Let me share something...\"\n\n**AD END INDICATORS** (extend boundary forward):\n- Sponsor conclusions: \"Thanks to [sponsor]\", \"That's [website].com\", \"Use code [PROMO]\"\n- Final CTAs: \"Visit today\", \"Don't wait\", \"Get started now\", \"Learn more at...\"\n- Transition back: \"Now back to...\", \"Let's continue...\", \"So anyway...\", \"Where were we...\"\n- Topic resumption: Clear return to previous discussion topic\n\n**CONTENT RESUMPTION SIGNALS** (stop ad boundary):\n- Natural conversation flow: Questions, responses, continued technical discussion\n- Topic changes: New subjects unrelated to sponsor\n- Interview continuation: \"So tell me about...\", \"What do you think about...\"\n- Technical deep-dives: Code examples, implementation details, architecture discussion\n\n**CONFIDENCE-BASED BOUNDARY RULES**:\n- **High Confidence (>0.9)**: Aggressive boundary extension, include subtle transitions\n- **Medium Confidence (0.7-0.9)**: Conservative extension, clear transition signals only\n- **Low Confidence (<0.7)**: Minimal changes, bias toward preserving content\n\n**ANALYSIS CONTEXT**:\n- **Detected Ad Block**: {{ad_start}}s - {{ad_end}}s\n- **Original Confidence**: {{ad_confidence}}\n\n**CONTEXT SEGMENTS**:\n{% for segment in context_segments 
-%}\n[{{segment.start_time}}] {{segment.text}}\n{% endfor %}\n\n**OUTPUT FORMAT**:\nRespond with valid JSON containing refined boundaries:\n```json\n{\n  \"refined_start\": {{ad_start}},\n  \"refined_end\": {{ad_end}},\n  \"start_adjustment_reason\": \"reason for start boundary change\",\n  \"end_adjustment_reason\": \"reason for end boundary change\"\n}\n```\n\n**REFINEMENT GUIDELINES**:\n- If no refinement needed, return original timestamps with \"No adjustment needed\" reasons\n- Keep adjustments close to the detected timestamps\n- For confidence {{ad_confidence}}: {% if ad_confidence > 0.9 %}be aggressive with boundary extension{% elif ad_confidence > 0.7 %}be conservative, only extend for clear signals{% else %}minimal changes, preserve content{% endif %}\n- Always ensure refined_start < refined_end\n"
  },
  {
    "path": "src/main.py",
    "content": "import os\n\nfrom waitress import serve\n\nfrom app import create_web_app\n\n\ndef main() -> None:\n    \"\"\"Main entry point for the application.\"\"\"\n    app = create_web_app()\n\n    # Start the application server\n    threads_env = os.environ.get(\"SERVER_THREADS\")\n    try:\n        threads = int(threads_env) if threads_env is not None else 1\n    except ValueError:\n        threads = 1\n\n    # Validate PORT the same way as SERVER_THREADS so serve() always gets an int\n    port_env = os.environ.get(\"PORT\")\n    try:\n        port = int(port_env) if port_env is not None else 5001\n    except ValueError:\n        port = 5001\n\n    serve(\n        app,\n        host=\"0.0.0.0\",\n        port=port,\n        threads=threads,\n    )\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "src/migrations/README",
    "content": "Single-database configuration for Flask.\n"
  },
  {
    "path": "src/migrations/alembic.ini",
    "content": "# A generic, single database configuration.\n\n[alembic]\n# template used to generate migration files\n# file_template = %%(rev)s_%%(slug)s\nscript_location = %(here)s\n\n# set to 'true' to run the environment during\n# the 'revision' command, regardless of autogenerate\n# revision_environment = false\n\n\n# Logging configuration\n[loggers]\nkeys = root,sqlalchemy,alembic,flask_migrate\n\n[handlers]\nkeys = console\n\n[formatters]\nkeys = generic\n\n[logger_root]\nlevel = DEBUG\nhandlers = console\nqualname =\n\n[logger_sqlalchemy]\nlevel = WARN\nhandlers =\nqualname = sqlalchemy.engine\n\n[logger_alembic]\nlevel = INFO\nhandlers =\nqualname = alembic\n\n[logger_flask_migrate]\nlevel = INFO\nhandlers =\nqualname = flask_migrate\n\n[handler_console]\nclass = StreamHandler\nargs = (sys.stderr,)\nlevel = NOTSET\nformatter = generic\n\n[formatter_generic]\nformat = %(levelname)-5.5s [%(name)s] %(message)s\ndatefmt = %H:%M:%S\n"
  },
  {
    "path": "src/migrations/env.py",
    "content": "import logging\nfrom logging.config import fileConfig\n\nfrom alembic import context\nfrom flask import current_app\n\n# this is the Alembic Config object, which provides\n# access to the values within the .ini file in use.\nconfig = context.config\n\n# Interpret the config file for Python logging.\n# This line sets up loggers basically.\nfileConfig(config.config_file_name, disable_existing_loggers=False)\nlogger = logging.getLogger(\"alembic.env\")\n\n\ndef get_engine():\n    try:\n        # this works with Flask-SQLAlchemy<3 and Alchemical\n        return current_app.extensions[\"migrate\"].db.get_engine()\n    except (TypeError, AttributeError):\n        # this works with Flask-SQLAlchemy>=3\n        return current_app.extensions[\"migrate\"].db.engine\n\n\ndef get_engine_url():\n    try:\n        return get_engine().url.render_as_string(hide_password=False).replace(\"%\", \"%%\")\n    except AttributeError:\n        return str(get_engine().url).replace(\"%\", \"%%\")\n\n\n# add your model's MetaData object here\n# for 'autogenerate' support\n# from myapp import mymodel\n# target_metadata = mymodel.Base.metadata\nconfig.set_main_option(\"sqlalchemy.url\", get_engine_url())\ntarget_db = current_app.extensions[\"migrate\"].db\n\n# other values from the config, defined by the needs of env.py,\n# can be acquired:\n# my_important_option = config.get_main_option(\"my_important_option\")\n# ... etc.\n\n\ndef get_metadata():\n    if hasattr(target_db, \"metadatas\"):\n        return target_db.metadatas[None]\n    return target_db.metadata\n\n\ndef run_migrations_offline():\n    \"\"\"Run migrations in 'offline' mode.\n\n    This configures the context with just a URL\n    and not an Engine, though an Engine is acceptable\n    here as well.  
By skipping the Engine creation\n    we don't even need a DBAPI to be available.\n\n    Calls to context.execute() here emit the given string to the\n    script output.\n\n    \"\"\"\n    url = config.get_main_option(\"sqlalchemy.url\")\n    context.configure(url=url, target_metadata=get_metadata(), literal_binds=True)\n\n    with context.begin_transaction():\n        context.run_migrations()\n\n\ndef run_migrations_online():\n    \"\"\"Run migrations in 'online' mode.\n\n    In this scenario we need to create an Engine\n    and associate a connection with the context.\n\n    \"\"\"\n\n    # this callback is used to prevent an auto-migration from being generated\n    # when there are no changes to the schema\n    # reference: http://alembic.zzzcomputing.com/en/latest/cookbook.html\n    def process_revision_directives(context, revision, directives):\n        if getattr(config.cmd_opts, \"autogenerate\", False):\n            script = directives[0]\n            if script.upgrade_ops.is_empty():\n                directives[:] = []\n                logger.info(\"No changes in schema detected.\")\n\n    conf_args = current_app.extensions[\"migrate\"].configure_args\n    if conf_args.get(\"process_revision_directives\") is None:\n        conf_args[\"process_revision_directives\"] = process_revision_directives\n\n    connectable = get_engine()\n\n    with connectable.connect() as connection:\n        context.configure(\n            connection=connection, target_metadata=get_metadata(), **conf_args\n        )\n\n        with context.begin_transaction():\n            context.run_migrations()\n\n\nif context.is_offline_mode():\n    run_migrations_offline()\nelse:\n    run_migrations_online()\n"
  },
  {
    "path": "src/migrations/script.py.mako",
    "content": "\"\"\"${message}\n\nRevision ID: ${up_revision}\nRevises: ${down_revision | comma,n}\nCreate Date: ${create_date}\n\n\"\"\"\nfrom alembic import op\nimport sqlalchemy as sa\n${imports if imports else \"\"}\n\n# revision identifiers, used by Alembic.\nrevision = ${repr(up_revision)}\ndown_revision = ${repr(down_revision)}\nbranch_labels = ${repr(branch_labels)}\ndepends_on = ${repr(depends_on)}\n\n\ndef upgrade():\n    ${upgrades if upgrades else \"pass\"}\n\n\ndef downgrade():\n    ${downgrades if downgrades else \"pass\"}\n"
  },
  {
    "path": "src/migrations/versions/0d954a44fa8e_feed_access.py",
    "content": "\"\"\"feed_access\n\nRevision ID: 0d954a44fa8e\nRevises: 91ff431c832e\nCreate Date: 2025-11-04 21:43:07.716121\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"0d954a44fa8e\"\ndown_revision = \"91ff431c832e\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    op.create_table(\n        \"feed_access_token\",\n        sa.Column(\"id\", sa.Integer(), autoincrement=True, nullable=False),\n        sa.Column(\"token_id\", sa.String(length=32), nullable=False),\n        sa.Column(\"token_hash\", sa.String(length=64), nullable=False),\n        sa.Column(\"feed_id\", sa.Integer(), nullable=False),\n        sa.Column(\"user_id\", sa.Integer(), nullable=False),\n        sa.Column(\"created_at\", sa.DateTime(), nullable=False),\n        sa.Column(\"last_used_at\", sa.DateTime(), nullable=True),\n        sa.Column(\"revoked\", sa.Boolean(), nullable=False),\n        sa.ForeignKeyConstraint(\n            [\"feed_id\"],\n            [\"feed.id\"],\n        ),\n        sa.ForeignKeyConstraint(\n            [\"user_id\"],\n            [\"users.id\"],\n        ),\n        sa.PrimaryKeyConstraint(\"id\"),\n    )\n    with op.batch_alter_table(\"feed_access_token\", schema=None) as batch_op:\n        batch_op.create_index(\n            batch_op.f(\"ix_feed_access_token_token_id\"), [\"token_id\"], unique=True\n        )\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"feed_access_token\", schema=None) as batch_op:\n        batch_op.drop_index(batch_op.f(\"ix_feed_access_token_token_id\"))\n\n    op.drop_table(\"feed_access_token\")\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/16311623dd58_env_hash.py",
    "content": "\"\"\"env_hash\n\nRevision ID: 16311623dd58\nRevises: 5bccc39c9685\nCreate Date: 2025-12-14 10:32:15.843860\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"16311623dd58\"\ndown_revision = \"5bccc39c9685\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"app_settings\", schema=None) as batch_op:\n        batch_op.add_column(\n            sa.Column(\"env_config_hash\", sa.String(length=64), nullable=True)\n        )\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"app_settings\", schema=None) as batch_op:\n        batch_op.drop_column(\"env_config_hash\")\n\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/185d3448990e_stripe.py",
    "content": "\"\"\"stripe\n\nRevision ID: 185d3448990e\nRevises: 35b12b2d9feb\nCreate Date: 2025-12-10 21:51:55.888021\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"185d3448990e\"\ndown_revision = \"35b12b2d9feb\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    bind = op.get_bind()\n    inspector = sa.inspect(bind)\n\n    if inspector.has_table(\"credit_transaction\"):\n        indexes = [i[\"name\"] for i in inspector.get_indexes(\"credit_transaction\")]\n        with op.batch_alter_table(\"credit_transaction\", schema=None) as batch_op:\n            if \"ix_credit_transaction_feed_id\" in indexes:\n                batch_op.drop_index(batch_op.f(\"ix_credit_transaction_feed_id\"))\n            if \"ix_credit_transaction_post_id\" in indexes:\n                batch_op.drop_index(batch_op.f(\"ix_credit_transaction_post_id\"))\n            if \"ix_credit_transaction_user_created\" in indexes:\n                batch_op.drop_index(batch_op.f(\"ix_credit_transaction_user_created\"))\n            if \"ix_credit_transaction_user_id\" in indexes:\n                batch_op.drop_index(batch_op.f(\"ix_credit_transaction_user_id\"))\n\n        op.drop_table(\"credit_transaction\")\n\n    if inspector.has_table(\"app_settings\"):\n        columns = [c[\"name\"] for c in inspector.get_columns(\"app_settings\")]\n        with op.batch_alter_table(\"app_settings\", schema=None) as batch_op:\n            if \"minutes_per_credit\" in columns:\n                batch_op.drop_column(\"minutes_per_credit\")\n\n    if inspector.has_table(\"users\"):\n        columns = [c[\"name\"] for c in inspector.get_columns(\"users\")]\n        with op.batch_alter_table(\"users\", schema=None) as batch_op:\n            if \"feed_allowance\" not in columns:\n                batch_op.add_column(\n                    
sa.Column(\"feed_allowance\", sa.Integer(), nullable=False)\n                )\n            if \"feed_subscription_status\" not in columns:\n                batch_op.add_column(\n                    sa.Column(\n                        \"feed_subscription_status\", sa.String(length=32), nullable=False\n                    )\n                )\n            if \"stripe_customer_id\" not in columns:\n                batch_op.add_column(\n                    sa.Column(\"stripe_customer_id\", sa.String(length=64), nullable=True)\n                )\n            if \"stripe_subscription_id\" not in columns:\n                batch_op.add_column(\n                    sa.Column(\n                        \"stripe_subscription_id\", sa.String(length=64), nullable=True\n                    )\n                )\n            if \"credits_balance\" in columns:\n                batch_op.drop_column(\"credits_balance\")\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! 
###\n    bind = op.get_bind()\n    inspector = sa.inspect(bind)\n\n    if inspector.has_table(\"users\"):\n        columns = [c[\"name\"] for c in inspector.get_columns(\"users\")]\n        with op.batch_alter_table(\"users\", schema=None) as batch_op:\n            if \"credits_balance\" not in columns:\n                batch_op.add_column(\n                    sa.Column(\n                        \"credits_balance\",\n                        sa.NUMERIC(precision=12, scale=2),\n                        nullable=False,\n                    )\n                )\n            if \"stripe_subscription_id\" in columns:\n                batch_op.drop_column(\"stripe_subscription_id\")\n            if \"stripe_customer_id\" in columns:\n                batch_op.drop_column(\"stripe_customer_id\")\n            if \"feed_subscription_status\" in columns:\n                batch_op.drop_column(\"feed_subscription_status\")\n            if \"feed_allowance\" in columns:\n                batch_op.drop_column(\"feed_allowance\")\n\n    if inspector.has_table(\"app_settings\"):\n        columns = [c[\"name\"] for c in inspector.get_columns(\"app_settings\")]\n        with op.batch_alter_table(\"app_settings\", schema=None) as batch_op:\n            if \"minutes_per_credit\" not in columns:\n                batch_op.add_column(\n                    sa.Column(\n                        \"minutes_per_credit\",\n                        sa.INTEGER(),\n                        server_default=sa.text(\"(60)\"),\n                        nullable=False,\n                    )\n                )\n\n    if not inspector.has_table(\"credit_transaction\"):\n        op.create_table(\n            \"credit_transaction\",\n            sa.Column(\"id\", sa.INTEGER(), nullable=False),\n            sa.Column(\"user_id\", sa.INTEGER(), nullable=False),\n            sa.Column(\"feed_id\", sa.INTEGER(), nullable=True),\n            sa.Column(\"post_id\", sa.INTEGER(), nullable=True),\n            
sa.Column(\"idempotency_key\", sa.VARCHAR(length=128), nullable=True),\n            sa.Column(\n                \"amount_signed\", sa.NUMERIC(precision=12, scale=2), nullable=False\n            ),\n            sa.Column(\"type\", sa.VARCHAR(length=32), nullable=False),\n            sa.Column(\"note\", sa.TEXT(), nullable=True),\n            sa.Column(\"created_at\", sa.DATETIME(), nullable=False),\n            sa.ForeignKeyConstraint(\n                [\"feed_id\"],\n                [\"feed.id\"],\n            ),\n            sa.ForeignKeyConstraint(\n                [\"post_id\"],\n                [\"post.id\"],\n            ),\n            sa.ForeignKeyConstraint(\n                [\"user_id\"],\n                [\"users.id\"],\n            ),\n            sa.PrimaryKeyConstraint(\"id\"),\n            sa.UniqueConstraint(\"idempotency_key\"),\n        )\n        with op.batch_alter_table(\"credit_transaction\", schema=None) as batch_op:\n            batch_op.create_index(\n                batch_op.f(\"ix_credit_transaction_user_id\"), [\"user_id\"], unique=False\n            )\n            batch_op.create_index(\n                batch_op.f(\"ix_credit_transaction_user_created\"),\n                [\"user_id\", \"created_at\"],\n                unique=False,\n            )\n            batch_op.create_index(\n                batch_op.f(\"ix_credit_transaction_post_id\"), [\"post_id\"], unique=False\n            )\n            batch_op.create_index(\n                batch_op.f(\"ix_credit_transaction_feed_id\"), [\"feed_id\"], unique=False\n            )\n\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/18c2402c9202_cleanup_retention_days.py",
    "content": "\"\"\"cleanup_retention_days\n\nRevision ID: 18c2402c9202\nRevises: a6f5df1a50ac\nCreate Date: 2025-11-03 22:05:56.956113\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"18c2402c9202\"\ndown_revision = \"a6f5df1a50ac\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"app_settings\", schema=None) as batch_op:\n        batch_op.add_column(\n            sa.Column(\"post_cleanup_retention_days\", sa.Integer(), nullable=True)\n        )\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"app_settings\", schema=None) as batch_op:\n        batch_op.drop_column(\"post_cleanup_retention_days\")\n\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/2e25a15d11de_per_feed_auto_whitelist.py",
    "content": "\"\"\"per feed auto whitelist\n\nRevision ID: 2e25a15d11de\nRevises: 82cfcc8e0326\nCreate Date: 2026-01-12 12:47:42.611999\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"2e25a15d11de\"\ndown_revision = \"82cfcc8e0326\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"feed\", schema=None) as batch_op:\n        batch_op.add_column(\n            sa.Column(\n                \"auto_whitelist_new_episodes_override\", sa.Boolean(), nullable=True\n            )\n        )\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"feed\", schema=None) as batch_op:\n        batch_op.drop_column(\"auto_whitelist_new_episodes_override\")\n\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/31d767deb401_credits.py",
    "content": "\"\"\"credits\n\nRevision ID: 31d767deb401\nRevises: 608e0b27fcda\nCreate Date: 2025-11-29 11:42:27.900494\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"31d767deb401\"\ndown_revision = \"608e0b27fcda\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    bind = op.get_bind()\n    inspector = sa.inspect(bind)\n    existing_tables = set(inspector.get_table_names())\n\n    # ### commands auto generated by Alembic - please adjust! ###\n    if \"credit_transaction\" not in existing_tables:\n        op.create_table(\n            \"credit_transaction\",\n            sa.Column(\"id\", sa.Integer(), autoincrement=True, nullable=False),\n            sa.Column(\"user_id\", sa.Integer(), nullable=False),\n            sa.Column(\"feed_id\", sa.Integer(), nullable=True),\n            sa.Column(\"post_id\", sa.Integer(), nullable=True),\n            sa.Column(\"idempotency_key\", sa.String(length=128), nullable=True),\n            sa.Column(\n                \"amount_signed\", sa.Numeric(precision=12, scale=1), nullable=False\n            ),\n            sa.Column(\"type\", sa.String(length=32), nullable=False),\n            sa.Column(\"note\", sa.Text(), nullable=True),\n            sa.Column(\"created_at\", sa.DateTime(), nullable=False),\n            sa.ForeignKeyConstraint(\n                [\"feed_id\"],\n                [\"feed.id\"],\n            ),\n            sa.ForeignKeyConstraint(\n                [\"post_id\"],\n                [\"post.id\"],\n            ),\n            sa.ForeignKeyConstraint(\n                [\"user_id\"],\n                [\"users.id\"],\n            ),\n            sa.PrimaryKeyConstraint(\"id\"),\n            sa.UniqueConstraint(\"idempotency_key\"),\n        )\n        with op.batch_alter_table(\"credit_transaction\", schema=None) as batch_op:\n            batch_op.create_index(\n                batch_op.f(\"ix_credit_transaction_feed_id\"), 
[\"feed_id\"], unique=False\n            )\n            batch_op.create_index(\n                batch_op.f(\"ix_credit_transaction_post_id\"), [\"post_id\"], unique=False\n            )\n            batch_op.create_index(\n                \"ix_credit_transaction_user_created\",\n                [\"user_id\", \"created_at\"],\n                unique=False,\n            )\n            batch_op.create_index(\n                batch_op.f(\"ix_credit_transaction_user_id\"), [\"user_id\"], unique=False\n            )\n\n    if \"app_settings\" in existing_tables:\n        app_columns = {col[\"name\"] for col in inspector.get_columns(\"app_settings\")}\n        if \"minutes_per_credit\" not in app_columns:\n            with op.batch_alter_table(\"app_settings\", schema=None) as batch_op:\n                batch_op.add_column(\n                    sa.Column(\n                        \"minutes_per_credit\",\n                        sa.Integer(),\n                        nullable=False,\n                        server_default=sa.text(\"60\"),\n                    )\n                )\n\n    if \"feed\" in existing_tables:\n        feed_columns = {col[\"name\"] for col in inspector.get_columns(\"feed\")}\n        if \"sponsor_user_id\" not in feed_columns:\n            with op.batch_alter_table(\"feed\", schema=None) as batch_op:\n                batch_op.add_column(\n                    sa.Column(\"sponsor_user_id\", sa.Integer(), nullable=True)\n                )\n                batch_op.add_column(sa.Column(\"sponsor_note\", sa.Text(), nullable=True))\n                batch_op.create_index(\n                    batch_op.f(\"ix_feed_sponsor_user_id\"),\n                    [\"sponsor_user_id\"],\n                    unique=False,\n                )\n                batch_op.create_foreign_key(\n                    \"fk_feed_sponsor_user_id\",\n                    \"users\",\n                    [\"sponsor_user_id\"],\n                    [\"id\"],\n                )\n\n    
if \"users\" in existing_tables:\n        user_columns = {col[\"name\"] for col in inspector.get_columns(\"users\")}\n        if \"credits_balance\" not in user_columns:\n            with op.batch_alter_table(\"users\", schema=None) as batch_op:\n                batch_op.add_column(\n                    sa.Column(\n                        \"credits_balance\",\n                        sa.Numeric(precision=12, scale=1),\n                        nullable=False,\n                        server_default=sa.text(\"1\"),\n                    )\n                )\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    bind = op.get_bind()\n    inspector = sa.inspect(bind)\n    existing_tables = set(inspector.get_table_names())\n\n    # ### commands auto generated by Alembic - please adjust! ###\n    if \"users\" in existing_tables:\n        user_columns = {col[\"name\"] for col in inspector.get_columns(\"users\")}\n        if \"credits_balance\" in user_columns:\n            with op.batch_alter_table(\"users\", schema=None) as batch_op:\n                batch_op.drop_column(\"credits_balance\")\n\n    if \"feed\" in existing_tables:\n        feed_columns = {col[\"name\"] for col in inspector.get_columns(\"feed\")}\n        if \"sponsor_user_id\" in feed_columns or \"sponsor_note\" in feed_columns:\n            with op.batch_alter_table(\"feed\", schema=None) as batch_op:\n                if \"fk_feed_sponsor_user_id\" in {\n                    fk[\"name\"] for fk in inspector.get_foreign_keys(\"feed\")\n                }:\n                    batch_op.drop_constraint(\n                        \"fk_feed_sponsor_user_id\", type_=\"foreignkey\"\n                    )\n                if \"ix_feed_sponsor_user_id\" in {\n                    idx[\"name\"] for idx in inspector.get_indexes(\"feed\")\n                }:\n                    batch_op.drop_index(batch_op.f(\"ix_feed_sponsor_user_id\"))\n                if \"sponsor_note\" in feed_columns:\n                   
 batch_op.drop_column(\"sponsor_note\")\n                if \"sponsor_user_id\" in feed_columns:\n                    batch_op.drop_column(\"sponsor_user_id\")\n\n    if \"app_settings\" in existing_tables:\n        app_columns = {col[\"name\"] for col in inspector.get_columns(\"app_settings\")}\n        if \"minutes_per_credit\" in app_columns:\n            with op.batch_alter_table(\"app_settings\", schema=None) as batch_op:\n                batch_op.drop_column(\"minutes_per_credit\")\n\n    if \"credit_transaction\" in existing_tables:\n        with op.batch_alter_table(\"credit_transaction\", schema=None) as batch_op:\n            existing_indexes = {\n                idx[\"name\"] for idx in inspector.get_indexes(\"credit_transaction\")\n            }\n            if batch_op.f(\"ix_credit_transaction_user_id\") in existing_indexes:\n                batch_op.drop_index(batch_op.f(\"ix_credit_transaction_user_id\"))\n            if \"ix_credit_transaction_user_created\" in existing_indexes:\n                batch_op.drop_index(\"ix_credit_transaction_user_created\")\n            if batch_op.f(\"ix_credit_transaction_post_id\") in existing_indexes:\n                batch_op.drop_index(batch_op.f(\"ix_credit_transaction_post_id\"))\n            if batch_op.f(\"ix_credit_transaction_feed_id\") in existing_indexes:\n                batch_op.drop_index(batch_op.f(\"ix_credit_transaction_feed_id\"))\n\n        op.drop_table(\"credit_transaction\")\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/35b12b2d9feb_landing_page.py",
    "content": "\"\"\"landing page\n\nRevision ID: 35b12b2d9feb\nRevises: eb51923af483\nCreate Date: 2025-12-01 23:49:10.400190\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"35b12b2d9feb\"\ndown_revision = \"eb51923af483\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"app_settings\", schema=None) as batch_op:\n        batch_op.add_column(\n            sa.Column(\n                \"enable_public_landing_page\",\n                sa.Boolean(),\n                nullable=False,\n                server_default=sa.text(\"false\"),\n            ),\n        )\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"app_settings\", schema=None) as batch_op:\n        batch_op.drop_column(\"enable_public_landing_page\")\n\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/3c7f5f7640e4_add_counters_reset_timestamp.py",
    "content": "\"\"\"add counters reset timestamp to jobs_manager_run\n\nRevision ID: 3c7f5f7640e4\nRevises: c0f8893ce927\nCreate Date: 2026-12-01 00:00:00.000000\n\n\"\"\"\n\nfrom __future__ import annotations\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"3c7f5f7640e4\"\ndown_revision = \"c0f8893ce927\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade() -> None:\n    bind = op.get_bind()\n    inspector = sa.inspect(bind)\n\n    existing_tables = set(inspector.get_table_names())\n    if \"jobs_manager_run\" not in existing_tables:\n        return\n\n    columns = {col[\"name\"] for col in inspector.get_columns(\"jobs_manager_run\")}\n    if \"counters_reset_at\" not in columns:\n        with op.batch_alter_table(\"jobs_manager_run\", schema=None) as batch_op:\n            batch_op.add_column(\n                sa.Column(\"counters_reset_at\", sa.DateTime(), nullable=True)\n            )\n\n        op.execute(\n            sa.text(\n                \"UPDATE jobs_manager_run \"\n                \"SET counters_reset_at = COALESCE(started_at, created_at, CURRENT_TIMESTAMP) \"\n                \"WHERE counters_reset_at IS NULL\"\n            )\n        )\n\n\ndef downgrade() -> None:\n    bind = op.get_bind()\n    inspector = sa.inspect(bind)\n\n    existing_tables = set(inspector.get_table_names())\n    if \"jobs_manager_run\" not in existing_tables:\n        return\n\n    columns = {col[\"name\"] for col in inspector.get_columns(\"jobs_manager_run\")}\n    if \"counters_reset_at\" in columns:\n        with op.batch_alter_table(\"jobs_manager_run\", schema=None) as batch_op:\n            batch_op.drop_column(\"counters_reset_at\")\n"
  },
  {
    "path": "src/migrations/versions/3d232f215842_migration.py",
    "content": "\"\"\"migration\n\nRevision ID: 3d232f215842\nRevises: f7a4195e0953\nCreate Date: 2026-01-11 18:35:34.763013\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"3d232f215842\"\ndown_revision = \"f7a4195e0953\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"llm_settings\", schema=None) as batch_op:\n        batch_op.add_column(\n            sa.Column(\n                \"enable_word_level_boundary_refinder\",\n                sa.Boolean(),\n                nullable=False,\n                server_default=sa.text(\"0\"),\n            )\n        )\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"llm_settings\", schema=None) as batch_op:\n        batch_op.drop_column(\"enable_word_level_boundary_refinder\")\n\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/3eb0a3a0870b_discord.py",
    "content": "\"\"\"discord\n\nRevision ID: 3eb0a3a0870b\nRevises: 31d767deb401\nCreate Date: 2025-11-29 12:41:40.446049\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"3eb0a3a0870b\"\ndown_revision = \"31d767deb401\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"users\", schema=None) as batch_op:\n        batch_op.add_column(\n            sa.Column(\"discord_id\", sa.String(length=32), nullable=True)\n        )\n        batch_op.add_column(\n            sa.Column(\"discord_username\", sa.String(length=100), nullable=True)\n        )\n        batch_op.create_index(\n            batch_op.f(\"ix_users_discord_id\"), [\"discord_id\"], unique=True\n        )\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"users\", schema=None) as batch_op:\n        batch_op.drop_index(batch_op.f(\"ix_users_discord_id\"))\n        batch_op.drop_column(\"discord_username\")\n        batch_op.drop_column(\"discord_id\")\n\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/401071604e7b_config_tables.py",
    "content": "\"\"\"Create settings tables and seed defaults\n\nRevision ID: 401071604e7b\nRevises: 611dcb5d7f12\nCreate Date: 2025-09-28 00:00:00.000000\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"401071604e7b\"\ndown_revision = \"611dcb5d7f12\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    bind = op.get_bind()\n    inspector = sa.inspect(bind)\n    existing_tables = set(inspector.get_table_names())\n\n    if \"llm_settings\" not in existing_tables:\n        op.create_table(\n            \"llm_settings\",\n            sa.Column(\"id\", sa.Integer(), nullable=False),\n            sa.Column(\"llm_api_key\", sa.Text(), nullable=True),\n            sa.Column(\n                \"llm_model\",\n                sa.Text(),\n                nullable=False,\n                server_default=\"groq/openai/gpt-oss-120b\",\n            ),\n            sa.Column(\"openai_base_url\", sa.Text(), nullable=True),\n            sa.Column(\n                \"openai_timeout\", sa.Integer(), nullable=False, server_default=\"300\"\n            ),\n            sa.Column(\n                \"openai_max_tokens\", sa.Integer(), nullable=False, server_default=\"4096\"\n            ),\n            sa.Column(\n                \"llm_max_concurrent_calls\",\n                sa.Integer(),\n                nullable=False,\n                server_default=\"3\",\n            ),\n            sa.Column(\n                \"llm_max_retry_attempts\",\n                sa.Integer(),\n                nullable=False,\n                server_default=\"5\",\n            ),\n            sa.Column(\"llm_max_input_tokens_per_call\", sa.Integer(), nullable=True),\n            sa.Column(\n                \"llm_enable_token_rate_limiting\",\n                sa.Boolean(),\n                nullable=False,\n                server_default=sa.text(\"0\"),\n            ),\n            sa.Column(\"llm_max_input_tokens_per_minute\", 
sa.Integer(), nullable=True),\n            sa.Column(\n                \"created_at\",\n                sa.DateTime(),\n                nullable=False,\n                server_default=sa.func.current_timestamp(),\n            ),\n            sa.Column(\n                \"updated_at\",\n                sa.DateTime(),\n                nullable=False,\n                server_default=sa.func.current_timestamp(),\n            ),\n            sa.PrimaryKeyConstraint(\"id\"),\n        )\n\n    if \"whisper_settings\" not in existing_tables:\n        op.create_table(\n            \"whisper_settings\",\n            sa.Column(\"id\", sa.Integer(), nullable=False),\n            sa.Column(\"whisper_type\", sa.Text(), nullable=False, server_default=\"groq\"),\n            sa.Column(\n                \"local_model\", sa.Text(), nullable=False, server_default=\"base.en\"\n            ),\n            sa.Column(\n                \"remote_model\", sa.Text(), nullable=False, server_default=\"whisper-1\"\n            ),\n            sa.Column(\"remote_api_key\", sa.Text(), nullable=True),\n            sa.Column(\n                \"remote_base_url\",\n                sa.Text(),\n                nullable=False,\n                server_default=\"https://api.openai.com/v1\",\n            ),\n            sa.Column(\n                \"remote_language\", sa.Text(), nullable=False, server_default=\"en\"\n            ),\n            sa.Column(\n                \"remote_timeout_sec\", sa.Integer(), nullable=False, server_default=\"600\"\n            ),\n            sa.Column(\n                \"remote_chunksize_mb\", sa.Integer(), nullable=False, server_default=\"24\"\n            ),\n            sa.Column(\"groq_api_key\", sa.Text(), nullable=True),\n            sa.Column(\n                \"groq_model\",\n                sa.Text(),\n                nullable=False,\n                server_default=\"whisper-large-v3-turbo\",\n            ),\n            sa.Column(\"groq_language\", sa.Text(), 
nullable=False, server_default=\"en\"),\n            sa.Column(\n                \"groq_max_retries\", sa.Integer(), nullable=False, server_default=\"3\"\n            ),\n            sa.Column(\n                \"groq_initial_backoff\", sa.Float(), nullable=False, server_default=\"1.0\"\n            ),\n            sa.Column(\n                \"groq_backoff_factor\", sa.Float(), nullable=False, server_default=\"2.0\"\n            ),\n            sa.Column(\n                \"created_at\",\n                sa.DateTime(),\n                nullable=False,\n                server_default=sa.func.current_timestamp(),\n            ),\n            sa.Column(\n                \"updated_at\",\n                sa.DateTime(),\n                nullable=False,\n                server_default=sa.func.current_timestamp(),\n            ),\n            sa.PrimaryKeyConstraint(\"id\"),\n        )\n\n    if \"processing_settings\" not in existing_tables:\n        op.create_table(\n            \"processing_settings\",\n            sa.Column(\"id\", sa.Integer(), nullable=False),\n            sa.Column(\n                \"system_prompt_path\",\n                sa.Text(),\n                nullable=False,\n                server_default=\"src/system_prompt.txt\",\n            ),\n            sa.Column(\n                \"user_prompt_template_path\",\n                sa.Text(),\n                nullable=False,\n                server_default=\"src/user_prompt.jinja\",\n            ),\n            sa.Column(\n                \"num_segments_to_input_to_prompt\",\n                sa.Integer(),\n                nullable=False,\n                server_default=\"60\",\n            ),\n            sa.Column(\n                \"created_at\",\n                sa.DateTime(),\n                nullable=False,\n                server_default=sa.func.current_timestamp(),\n            ),\n            sa.Column(\n                \"updated_at\",\n                sa.DateTime(),\n                
nullable=False,\n                server_default=sa.func.current_timestamp(),\n            ),\n            sa.PrimaryKeyConstraint(\"id\"),\n        )\n\n    if \"output_settings\" not in existing_tables:\n        op.create_table(\n            \"output_settings\",\n            sa.Column(\"id\", sa.Integer(), nullable=False),\n            sa.Column(\"fade_ms\", sa.Integer(), nullable=False, server_default=\"3000\"),\n            sa.Column(\n                \"min_ad_segement_separation_seconds\",\n                sa.Integer(),\n                nullable=False,\n                server_default=\"60\",\n            ),\n            sa.Column(\n                \"min_ad_segment_length_seconds\",\n                sa.Integer(),\n                nullable=False,\n                server_default=\"14\",\n            ),\n            sa.Column(\n                \"min_confidence\", sa.Float(), nullable=False, server_default=\"0.8\"\n            ),\n            sa.Column(\n                \"created_at\",\n                sa.DateTime(),\n                nullable=False,\n                server_default=sa.func.current_timestamp(),\n            ),\n            sa.Column(\n                \"updated_at\",\n                sa.DateTime(),\n                nullable=False,\n                server_default=sa.func.current_timestamp(),\n            ),\n            sa.PrimaryKeyConstraint(\"id\"),\n        )\n\n    if \"app_settings\" not in existing_tables:\n        op.create_table(\n            \"app_settings\",\n            sa.Column(\"id\", sa.Integer(), nullable=False),\n            sa.Column(\"background_update_interval_minute\", sa.Integer(), nullable=True),\n            sa.Column(\n                \"automatically_whitelist_new_episodes\",\n                sa.Boolean(),\n                nullable=False,\n                server_default=sa.text(\"1\"),\n            ),\n            sa.Column(\n                \"number_of_episodes_to_whitelist_from_archive_of_new_feed\",\n                
sa.Integer(),\n                nullable=False,\n                server_default=\"1\",\n            ),\n            sa.Column(\n                \"created_at\",\n                sa.DateTime(),\n                nullable=False,\n                server_default=sa.func.current_timestamp(),\n            ),\n            sa.Column(\n                \"updated_at\",\n                sa.DateTime(),\n                nullable=False,\n                server_default=sa.func.current_timestamp(),\n            ),\n            sa.PrimaryKeyConstraint(\"id\"),\n        )\n\n    # Seed singleton rows (id=1) - SQLite requires one statement per execute\n    op.execute(\n        sa.text(\"INSERT INTO llm_settings (id) VALUES (1) ON CONFLICT(id) DO NOTHING\")\n    )\n    op.execute(\n        sa.text(\n            \"INSERT INTO whisper_settings (id) VALUES (1) ON CONFLICT(id) DO NOTHING\"\n        )\n    )\n    op.execute(\n        sa.text(\n            \"INSERT INTO processing_settings (id) VALUES (1) ON CONFLICT(id) DO NOTHING\"\n        )\n    )\n    op.execute(\n        sa.text(\n            \"INSERT INTO output_settings (id) VALUES (1) ON CONFLICT(id) DO NOTHING\"\n        )\n    )\n    op.execute(\n        sa.text(\"INSERT INTO app_settings (id) VALUES (1) ON CONFLICT(id) DO NOTHING\")\n    )\n\n\ndef downgrade():\n    op.drop_table(\"app_settings\")\n    op.drop_table(\"output_settings\")\n    op.drop_table(\"processing_settings\")\n    op.drop_table(\"whisper_settings\")\n    op.drop_table(\"llm_settings\")\n"
  },
  {
    "path": "src/migrations/versions/58b4eedd4c61_add_last_active_to_user.py",
    "content": "\"\"\"add_last_active_to_user\n\nRevision ID: 58b4eedd4c61\nRevises: 73a6b9f9b643\nCreate Date: 2025-12-20 14:01:36.022682\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"58b4eedd4c61\"\ndown_revision = \"73a6b9f9b643\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"users\", schema=None) as batch_op:\n        batch_op.add_column(sa.Column(\"last_active\", sa.DateTime(), nullable=True))\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"users\", schema=None) as batch_op:\n        batch_op.drop_column(\"last_active\")\n\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/5bccc39c9685_zero_initial_allowance.py",
    "content": "\"\"\"zero initial allowance\n\nRevision ID: 5bccc39c9685\nRevises: ab643af6472e\nCreate Date: 2025-12-12 14:21:35.530141\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"5bccc39c9685\"\ndown_revision = \"ab643af6472e\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    pass\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    pass\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/608e0b27fcda_stronger_access_token.py",
    "content": "\"\"\"stronger_access_token\n\nRevision ID: 608e0b27fcda\nRevises: f6d5fee57cc3\nCreate Date: 2025-11-05 21:27:10.923394\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"608e0b27fcda\"\ndown_revision = \"f6d5fee57cc3\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"feed_access_token\", schema=None) as batch_op:\n        batch_op.add_column(\n            sa.Column(\"token_secret\", sa.String(length=128), nullable=True)\n        )\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"feed_access_token\", schema=None) as batch_op:\n        batch_op.drop_column(\"token_secret\")\n\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/611dcb5d7f12_add_image_url_to_post_model_for_episode_.py",
    "content": "\"\"\"Add image_url to Post model for episode thumbnails\n\nRevision ID: 611dcb5d7f12\nRevises: b038c2f99086\nCreate Date: 2025-05-25 13:39:49.168287\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"611dcb5d7f12\"\ndown_revision = \"b038c2f99086\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"post\", schema=None) as batch_op:\n        batch_op.add_column(sa.Column(\"image_url\", sa.Text(), nullable=True))\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"post\", schema=None) as batch_op:\n        batch_op.drop_column(\"image_url\")\n\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/6e0e16299dcb_alternate_feed_id.py",
    "content": "\"\"\"alternate feed ID\n\nRevision ID: 6e0e16299dcb\nRevises: 770771437280\nCreate Date: 2024-11-23 11:04:37.861614\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"6e0e16299dcb\"\ndown_revision = \"770771437280\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"feed\", schema=None) as batch_op:\n        batch_op.add_column(sa.Column(\"alt_id\", sa.Text(), nullable=True))\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"feed\", schema=None) as batch_op:\n        batch_op.drop_column(\"alt_id\")\n\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/73a6b9f9b643_allow_null_feed_id_for_aggregate_tokens.py",
    "content": "\"\"\"allow_null_feed_id_for_aggregate_tokens\n\nRevision ID: 73a6b9f9b643\nRevises: 89d86978f407\nCreate Date: 2025-12-14 13:28:57.243239\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"73a6b9f9b643\"\ndown_revision = \"89d86978f407\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"feed_access_token\", schema=None) as batch_op:\n        batch_op.alter_column(\"feed_id\", existing_type=sa.INTEGER(), nullable=True)\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"feed_access_token\", schema=None) as batch_op:\n        batch_op.alter_column(\"feed_id\", existing_type=sa.INTEGER(), nullable=False)\n\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/770771437280_episode_whitelist.py",
    "content": "\"\"\"episode whitelist\n\nRevision ID: 770771437280\nRevises: fa3a95ecd67d\nCreate Date: 2024-11-16 08:27:46.081562\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"770771437280\"\ndown_revision = \"fa3a95ecd67d\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"post\", schema=None) as batch_op:\n        batch_op.add_column(\n            sa.Column(\n                \"whitelisted\", sa.Boolean(), nullable=False, server_default=sa.false()\n            )\n        )\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"post\", schema=None) as batch_op:\n        batch_op.drop_column(\"whitelisted\")\n\n    op.create_table(\n        \"ad_identification\",\n        sa.Column(\"id\", sa.INTEGER(), nullable=False),\n        sa.Column(\"post_id\", sa.INTEGER(), nullable=False),\n        sa.Column(\"content\", sa.TEXT(), nullable=False),\n        sa.Column(\"timestamp\", sa.DATETIME(), nullable=True),\n        sa.ForeignKeyConstraint(\n            [\"post_id\"],\n            [\"post.id\"],\n        ),\n        sa.PrimaryKeyConstraint(\"id\"),\n        sa.UniqueConstraint(\"post_id\"),\n    )\n    op.create_table(\n        \"identification\",\n        sa.Column(\"id\", sa.INTEGER(), nullable=False),\n        sa.Column(\"post_id\", sa.INTEGER(), nullable=False),\n        sa.Column(\"content\", sa.TEXT(), nullable=False),\n        sa.Column(\"timestamp\", sa.DATETIME(), nullable=True),\n        sa.ForeignKeyConstraint(\n            [\"post_id\"],\n            [\"post.id\"],\n        ),\n        sa.PrimaryKeyConstraint(\"id\"),\n        sa.UniqueConstraint(\"post_id\"),\n    )\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/7de4e57ec4bb_discord_settings.py",
    "content": "\"\"\"discord settings\n\nRevision ID: 7de4e57ec4bb\nRevises: 3eb0a3a0870b\nCreate Date: 2025-11-29 12:47:45.289285\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"7de4e57ec4bb\"\ndown_revision = \"3eb0a3a0870b\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    op.create_table(\n        \"discord_settings\",\n        sa.Column(\"id\", sa.Integer(), nullable=False),\n        sa.Column(\"client_id\", sa.Text(), nullable=True),\n        sa.Column(\"client_secret\", sa.Text(), nullable=True),\n        sa.Column(\"redirect_uri\", sa.Text(), nullable=True),\n        sa.Column(\"guild_ids\", sa.Text(), nullable=True),\n        sa.Column(\"allow_registration\", sa.Boolean(), nullable=False),\n        sa.Column(\"created_at\", sa.DateTime(), nullable=False),\n        sa.Column(\"updated_at\", sa.DateTime(), nullable=False),\n        sa.PrimaryKeyConstraint(\"id\"),\n    )\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    op.drop_table(\"discord_settings\")\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/802a2365976d_gruanular_credits.py",
    "content": "\"\"\"gruanular credits\n\nRevision ID: 802a2365976d\nRevises: 7de4e57ec4bb\nCreate Date: 2025-11-29 19:10:18.950548\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"802a2365976d\"\ndown_revision = \"7de4e57ec4bb\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"credit_transaction\", schema=None) as batch_op:\n        batch_op.alter_column(\n            \"amount_signed\",\n            existing_type=sa.NUMERIC(precision=12, scale=1),\n            type_=sa.Numeric(precision=12, scale=2),\n            existing_nullable=False,\n        )\n\n    with op.batch_alter_table(\"users\", schema=None) as batch_op:\n        batch_op.alter_column(\n            \"credits_balance\",\n            existing_type=sa.NUMERIC(precision=12, scale=1),\n            type_=sa.Numeric(precision=12, scale=2),\n            existing_nullable=False,\n        )\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"users\", schema=None) as batch_op:\n        batch_op.alter_column(\n            \"credits_balance\",\n            existing_type=sa.Numeric(precision=12, scale=2),\n            type_=sa.NUMERIC(precision=12, scale=1),\n            existing_nullable=False,\n        )\n\n    with op.batch_alter_table(\"credit_transaction\", schema=None) as batch_op:\n        batch_op.alter_column(\n            \"amount_signed\",\n            existing_type=sa.Numeric(precision=12, scale=2),\n            type_=sa.NUMERIC(precision=12, scale=1),\n            existing_nullable=False,\n        )\n\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/82cfcc8e0326_refined_cuts.py",
    "content": "\"\"\"refined cuts\n\nRevision ID: 82cfcc8e0326\nRevises: 3d232f215842\nCreate Date: 2026-01-11 20:44:32.127284\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"82cfcc8e0326\"\ndown_revision = \"3d232f215842\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"post\", schema=None) as batch_op:\n        batch_op.add_column(\n            sa.Column(\"refined_ad_boundaries\", sa.JSON(), nullable=True)\n        )\n        batch_op.add_column(\n            sa.Column(\"refined_ad_boundaries_updated_at\", sa.DateTime(), nullable=True)\n        )\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"post\", schema=None) as batch_op:\n        batch_op.drop_column(\"refined_ad_boundaries_updated_at\")\n        batch_op.drop_column(\"refined_ad_boundaries\")\n\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/89d86978f407_limit_users.py",
    "content": "\"\"\"limit users\n\nRevision ID: 89d86978f407\nRevises: 16311623dd58\nCreate Date: 2025-12-14 12:45:22.788888\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"89d86978f407\"\ndown_revision = \"16311623dd58\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"app_settings\", schema=None) as batch_op:\n        batch_op.add_column(sa.Column(\"user_limit_total\", sa.Integer(), nullable=True))\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"app_settings\", schema=None) as batch_op:\n        batch_op.drop_column(\"user_limit_total\")\n\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/91ff431c832e_download_count.py",
    "content": "\"\"\"download_count\n\nRevision ID: 91ff431c832e\nRevises: 18c2402c9202\nCreate Date: 2025-11-03 23:24:04.934488\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"91ff431c832e\"\ndown_revision = \"18c2402c9202\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"jobs_manager_run\", schema=None) as batch_op:\n        batch_op.alter_column(\n            \"created_at\",\n            existing_type=sa.DATETIME(),\n            nullable=True,\n            existing_server_default=sa.text(\"(CURRENT_TIMESTAMP)\"),\n        )\n        batch_op.alter_column(\n            \"updated_at\",\n            existing_type=sa.DATETIME(),\n            nullable=True,\n            existing_server_default=sa.text(\"(CURRENT_TIMESTAMP)\"),\n        )\n        batch_op.drop_column(\"previous_run_id\")\n\n    with op.batch_alter_table(\"post\", schema=None) as batch_op:\n        batch_op.add_column(sa.Column(\"download_count\", sa.Integer(), nullable=True))\n\n    with op.batch_alter_table(\"users\", schema=None) as batch_op:\n        batch_op.drop_constraint(batch_op.f(\"uq_users_username\"), type_=\"unique\")\n        batch_op.drop_index(batch_op.f(\"ix_users_username\"))\n        batch_op.create_index(\n            batch_op.f(\"ix_users_username\"), [\"username\"], unique=True\n        )\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! 
###\n    with op.batch_alter_table(\"users\", schema=None) as batch_op:\n        batch_op.drop_index(batch_op.f(\"ix_users_username\"))\n        batch_op.create_index(\n            batch_op.f(\"ix_users_username\"), [\"username\"], unique=False\n        )\n        batch_op.create_unique_constraint(batch_op.f(\"uq_users_username\"), [\"username\"])\n\n    with op.batch_alter_table(\"post\", schema=None) as batch_op:\n        batch_op.drop_column(\"download_count\")\n\n    with op.batch_alter_table(\"jobs_manager_run\", schema=None) as batch_op:\n        batch_op.add_column(\n            sa.Column(\"previous_run_id\", sa.VARCHAR(length=36), nullable=True)\n        )\n        batch_op.alter_column(\n            \"updated_at\",\n            existing_type=sa.DATETIME(),\n            nullable=False,\n            existing_server_default=sa.text(\"(CURRENT_TIMESTAMP)\"),\n        )\n        batch_op.alter_column(\n            \"created_at\",\n            existing_type=sa.DATETIME(),\n            nullable=False,\n            existing_server_default=sa.text(\"(CURRENT_TIMESTAMP)\"),\n        )\n\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/999b921ffc58_migration.py",
    "content": "\"\"\"migration\n\nRevision ID: 999b921ffc58\nRevises: 401071604e7b\nCreate Date: 2025-10-18 15:11:24.463135\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"999b921ffc58\"\ndown_revision = \"401071604e7b\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    bind = op.get_bind()\n    inspector = sa.inspect(bind)\n    existing_tables = set(inspector.get_table_names())\n\n    # Create jobs_manager_run table only if it doesn't exist (makes migration idempotent)\n    if \"jobs_manager_run\" not in existing_tables:\n        op.create_table(\n            \"jobs_manager_run\",\n            sa.Column(\"id\", sa.String(length=36), nullable=False),\n            sa.Column(\n                \"status\", sa.String(length=50), nullable=False, server_default=\"pending\"\n            ),\n            sa.Column(\"trigger\", sa.String(length=100), nullable=False),\n            sa.Column(\"started_at\", sa.DateTime(), nullable=True),\n            sa.Column(\"completed_at\", sa.DateTime(), nullable=True),\n            sa.Column(\"total_jobs\", sa.Integer(), nullable=False, server_default=\"0\"),\n            sa.Column(\"queued_jobs\", sa.Integer(), nullable=False, server_default=\"0\"),\n            sa.Column(\"running_jobs\", sa.Integer(), nullable=False, server_default=\"0\"),\n            sa.Column(\n                \"completed_jobs\", sa.Integer(), nullable=False, server_default=\"0\"\n            ),\n            sa.Column(\"failed_jobs\", sa.Integer(), nullable=False, server_default=\"0\"),\n            sa.Column(\"context_json\", sa.JSON(), nullable=True),\n            sa.Column(\"previous_run_id\", sa.String(length=36), nullable=True),\n            sa.Column(\n                \"created_at\",\n                sa.DateTime(),\n                nullable=False,\n                
server_default=sa.func.current_timestamp(),\n            ),\n            sa.Column(\n                \"updated_at\",\n                sa.DateTime(),\n                nullable=False,\n                server_default=sa.func.current_timestamp(),\n            ),\n            sa.PrimaryKeyConstraint(\"id\"),\n        )\n\n    # Index on status for quick filtering (create only if missing)\n    if \"jobs_manager_run\" in existing_tables:\n        existing_indexes = {\n            idx[\"name\"] for idx in inspector.get_indexes(\"jobs_manager_run\")\n        }\n    else:\n        existing_indexes = set()\n\n    if \"ix_jobs_manager_run_status\" not in existing_indexes:\n        op.create_index(\n            \"ix_jobs_manager_run_status\", \"jobs_manager_run\", [\"status\"], unique=False\n        )\n\n    # Add jobs_manager_run_id column and FK to processing_job only if column doesn't exist\n    processing_cols = {col[\"name\"] for col in inspector.get_columns(\"processing_job\")}\n    if \"jobs_manager_run_id\" not in processing_cols:\n        with op.batch_alter_table(\"processing_job\", schema=None) as batch_op:\n            batch_op.add_column(\n                sa.Column(\"jobs_manager_run_id\", sa.String(length=36), nullable=True)\n            )\n            batch_op.create_index(\n                batch_op.f(\"ix_processing_job_jobs_manager_run_id\"),\n                [\"jobs_manager_run_id\"],\n                unique=False,\n            )\n            batch_op.create_foreign_key(\n                \"fk_processing_job_jobs_manager_run_id\",\n                \"jobs_manager_run\",\n                [\"jobs_manager_run_id\"],\n                [\"id\"],\n            )\n\n    with op.batch_alter_table(\"whisper_settings\", schema=None) as batch_op:\n        batch_op.drop_column(\"groq_initial_backoff\")\n        batch_op.drop_column(\"groq_backoff_factor\")\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please 
adjust! ###\n    bind = op.get_bind()\n    inspector = sa.inspect(bind)\n    existing_tables = set(inspector.get_table_names())\n\n    with op.batch_alter_table(\"whisper_settings\", schema=None) as batch_op:\n        batch_op.add_column(\n            sa.Column(\n                \"groq_backoff_factor\",\n                sa.FLOAT(),\n                server_default=sa.text(\"'2.0'\"),\n                nullable=False,\n            )\n        )\n        batch_op.add_column(\n            sa.Column(\n                \"groq_initial_backoff\",\n                sa.FLOAT(),\n                server_default=sa.text(\"'1.0'\"),\n                nullable=False,\n            )\n        )\n\n    with op.batch_alter_table(\"processing_job\", schema=None) as batch_op:\n        # Only drop FK/index/column if they exist\n        processing_cols = {\n            col[\"name\"] for col in inspector.get_columns(\"processing_job\")\n        }\n        if \"jobs_manager_run_id\" in processing_cols:\n            batch_op.drop_constraint(\n                \"fk_processing_job_jobs_manager_run_id\", type_=\"foreignkey\"\n            )\n            batch_op.drop_index(batch_op.f(\"ix_processing_job_jobs_manager_run_id\"))\n            batch_op.drop_column(\"jobs_manager_run_id\")\n\n    # Drop jobs_manager_run index and table if present\n    if \"jobs_manager_run\" in existing_tables:\n        # Drop the status index only if it exists, mirroring the upgrade's\n        # idempotence checks instead of swallowing errors with a bare except.\n        existing_indexes = {\n            idx[\"name\"] for idx in inspector.get_indexes(\"jobs_manager_run\")\n        }\n        if \"ix_jobs_manager_run_status\" in existing_indexes:\n            op.drop_index(\n                \"ix_jobs_manager_run_status\", table_name=\"jobs_manager_run\"\n            )\n        op.drop_table(\"jobs_manager_run\")\n\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/a6f5df1a50ac_add_users_table.py",
    "content": "\"\"\"add users table\n\nRevision ID: a6f5df1a50ac\nRevises: 3c7f5f7640e4\nCreate Date: 2024-05-15 00:00:00.000000\n\"\"\"\n\nfrom __future__ import annotations\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"a6f5df1a50ac\"\ndown_revision = \"3c7f5f7640e4\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade() -> None:\n    op.create_table(\n        \"users\",\n        sa.Column(\"id\", sa.Integer(), primary_key=True, autoincrement=True),\n        sa.Column(\"username\", sa.String(length=255), nullable=False),\n        sa.Column(\"password_hash\", sa.String(length=255), nullable=False),\n        sa.Column(\"role\", sa.String(length=50), nullable=False, server_default=\"user\"),\n        sa.Column(\n            \"created_at\",\n            sa.DateTime(),\n            nullable=False,\n            server_default=sa.text(\"CURRENT_TIMESTAMP\"),\n        ),\n        sa.Column(\n            \"updated_at\",\n            sa.DateTime(),\n            nullable=False,\n            server_default=sa.text(\"CURRENT_TIMESTAMP\"),\n        ),\n        sa.UniqueConstraint(\"username\", name=\"uq_users_username\"),\n    )\n    op.create_index(\"ix_users_username\", \"users\", [\"username\"], unique=False)\n\n\ndef downgrade() -> None:\n    op.drop_index(\"ix_users_username\", table_name=\"users\")\n    op.drop_table(\"users\")\n"
  },
  {
    "path": "src/migrations/versions/ab643af6472e_add_manual_feed_allowance_to_user.py",
    "content": "\"\"\"add_manual_feed_allowance_to_user\n\nRevision ID: ab643af6472e\nRevises: 185d3448990e\nCreate Date: 2025-12-12 14:06:14.400553\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"ab643af6472e\"\ndown_revision = \"185d3448990e\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"feed\", schema=None) as batch_op:\n        batch_op.drop_index(batch_op.f(\"ix_feed_sponsor_user_id\"))\n        batch_op.drop_constraint(\n            batch_op.f(\"fk_feed_sponsor_user_id\"), type_=\"foreignkey\"\n        )\n        batch_op.drop_column(\"sponsor_user_id\")\n        batch_op.drop_column(\"sponsor_note\")\n\n    with op.batch_alter_table(\"users\", schema=None) as batch_op:\n        batch_op.add_column(\n            sa.Column(\"manual_feed_allowance\", sa.Integer(), nullable=True)\n        )\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"users\", schema=None) as batch_op:\n        batch_op.drop_column(\"manual_feed_allowance\")\n\n    with op.batch_alter_table(\"feed\", schema=None) as batch_op:\n        batch_op.add_column(sa.Column(\"sponsor_note\", sa.TEXT(), nullable=True))\n        batch_op.add_column(sa.Column(\"sponsor_user_id\", sa.INTEGER(), nullable=True))\n        batch_op.create_foreign_key(\n            batch_op.f(\"fk_feed_sponsor_user_id\"), \"users\", [\"sponsor_user_id\"], [\"id\"]\n        )\n        batch_op.create_index(\n            batch_op.f(\"ix_feed_sponsor_user_id\"), [\"sponsor_user_id\"], unique=False\n        )\n\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/b038c2f99086_add_processingjob_table_for_async_.py",
    "content": "\"\"\"Add ProcessingJob table for async episode processing\n\nRevision ID: b038c2f99086\nRevises: b92e47a03bb2\nCreate Date: 2025-05-25 12:18:50.783647\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"b038c2f99086\"\ndown_revision = \"b92e47a03bb2\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    op.create_table(\n        \"processing_job\",\n        sa.Column(\"id\", sa.String(length=36), nullable=False),\n        sa.Column(\"post_guid\", sa.String(length=255), nullable=False),\n        sa.Column(\"status\", sa.String(length=50), nullable=False),\n        sa.Column(\"current_step\", sa.Integer(), nullable=True),\n        sa.Column(\"step_name\", sa.String(length=100), nullable=True),\n        sa.Column(\"total_steps\", sa.Integer(), nullable=True),\n        sa.Column(\"progress_percentage\", sa.Float(), nullable=True),\n        sa.Column(\"started_at\", sa.DateTime(), nullable=True),\n        sa.Column(\"completed_at\", sa.DateTime(), nullable=True),\n        sa.Column(\"error_message\", sa.Text(), nullable=True),\n        sa.Column(\"scheduler_job_id\", sa.String(length=255), nullable=True),\n        sa.Column(\"created_at\", sa.DateTime(), nullable=True),\n        sa.PrimaryKeyConstraint(\"id\"),\n    )\n    with op.batch_alter_table(\"processing_job\", schema=None) as batch_op:\n        batch_op.create_index(\n            batch_op.f(\"ix_processing_job_created_at\"), [\"created_at\"], unique=False\n        )\n        batch_op.create_index(\n            batch_op.f(\"ix_processing_job_post_guid\"), [\"post_guid\"], unique=False\n        )\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! 
###\n    with op.batch_alter_table(\"processing_job\", schema=None) as batch_op:\n        batch_op.drop_index(batch_op.f(\"ix_processing_job_post_guid\"))\n        batch_op.drop_index(batch_op.f(\"ix_processing_job_created_at\"))\n\n    op.drop_table(\"processing_job\")\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/b92e47a03bb2_refactor_transcripts_to_db_tables_.py",
    "content": "\"\"\"Refactor transcripts to DB tables: TranscriptSegment, ModelCall, Identification\n\nRevision ID: b92e47a03bb2\nRevises: ded4b70feadb\nCreate Date: 2025-05-11 12:24:43.232263\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"b92e47a03bb2\"\ndown_revision = \"ded4b70feadb\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    op.create_table(\n        \"model_call\",\n        sa.Column(\"id\", sa.Integer(), autoincrement=True, nullable=False),\n        sa.Column(\"post_id\", sa.Integer(), nullable=False),\n        sa.Column(\"first_segment_sequence_num\", sa.Integer(), nullable=False),\n        sa.Column(\"last_segment_sequence_num\", sa.Integer(), nullable=False),\n        sa.Column(\"model_name\", sa.String(), nullable=False),\n        sa.Column(\"prompt\", sa.Text(), nullable=False),\n        sa.Column(\"response\", sa.Text(), nullable=True),\n        sa.Column(\"timestamp\", sa.DateTime(), nullable=False),\n        sa.Column(\"status\", sa.String(), nullable=False),\n        sa.Column(\"error_message\", sa.Text(), nullable=True),\n        sa.Column(\"retry_attempts\", sa.Integer(), nullable=False),\n        sa.ForeignKeyConstraint(\n            [\"post_id\"],\n            [\"post.id\"],\n        ),\n        sa.PrimaryKeyConstraint(\"id\"),\n    )\n    with op.batch_alter_table(\"model_call\", schema=None) as batch_op:\n        batch_op.create_index(\n            \"ix_model_call_post_chunk_model\",\n            [\n                \"post_id\",\n                \"first_segment_sequence_num\",\n                \"last_segment_sequence_num\",\n                \"model_name\",\n            ],\n            unique=True,\n        )\n\n    op.create_table(\n        \"transcript_segment\",\n        sa.Column(\"id\", sa.Integer(), autoincrement=True, nullable=False),\n        sa.Column(\"post_id\", 
sa.Integer(), nullable=False),\n        sa.Column(\"sequence_num\", sa.Integer(), nullable=False),\n        sa.Column(\"start_time\", sa.Float(), nullable=False),\n        sa.Column(\"end_time\", sa.Float(), nullable=False),\n        sa.Column(\"text\", sa.Text(), nullable=False),\n        sa.ForeignKeyConstraint(\n            [\"post_id\"],\n            [\"post.id\"],\n        ),\n        sa.PrimaryKeyConstraint(\"id\"),\n    )\n    with op.batch_alter_table(\"transcript_segment\", schema=None) as batch_op:\n        batch_op.create_index(\n            \"ix_transcript_segment_post_id_sequence_num\",\n            [\"post_id\", \"sequence_num\"],\n            unique=True,\n        )\n\n    op.create_table(\n        \"identification\",\n        sa.Column(\"id\", sa.Integer(), autoincrement=True, nullable=False),\n        sa.Column(\"transcript_segment_id\", sa.Integer(), nullable=False),\n        sa.Column(\"model_call_id\", sa.Integer(), nullable=False),\n        sa.Column(\"confidence\", sa.Float(), nullable=True),\n        sa.Column(\"label\", sa.String(), nullable=False),\n        sa.ForeignKeyConstraint(\n            [\"model_call_id\"],\n            [\"model_call.id\"],\n        ),\n        sa.ForeignKeyConstraint(\n            [\"transcript_segment_id\"],\n            [\"transcript_segment.id\"],\n        ),\n        sa.PrimaryKeyConstraint(\"id\"),\n    )\n    with op.batch_alter_table(\"identification\", schema=None) as batch_op:\n        batch_op.create_index(\n            \"ix_identification_segment_call_label\",\n            [\"transcript_segment_id\", \"model_call_id\", \"label\"],\n            unique=True,\n        )\n\n    op.drop_table(\"transcript\")\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! 
###\n    op.create_table(\n        \"transcript\",\n        sa.Column(\"id\", sa.INTEGER(), nullable=False),\n        sa.Column(\"post_id\", sa.INTEGER(), nullable=False),\n        sa.Column(\"content\", sa.TEXT(), nullable=False),\n        sa.Column(\"timestamp\", sa.DATETIME(), nullable=True),\n        sa.ForeignKeyConstraint(\n            [\"post_id\"],\n            [\"post.id\"],\n        ),\n        sa.PrimaryKeyConstraint(\"id\"),\n        sa.UniqueConstraint(\"post_id\"),\n    )\n    with op.batch_alter_table(\"identification\", schema=None) as batch_op:\n        batch_op.drop_index(\"ix_identification_segment_call_label\")\n\n    op.drop_table(\"identification\")\n    with op.batch_alter_table(\"transcript_segment\", schema=None) as batch_op:\n        batch_op.drop_index(\"ix_transcript_segment_post_id_sequence_num\")\n\n    op.drop_table(\"transcript_segment\")\n    with op.batch_alter_table(\"model_call\", schema=None) as batch_op:\n        batch_op.drop_index(\"ix_model_call_post_chunk_model\")\n\n    op.drop_table(\"model_call\")\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/bae70e584468_.py",
    "content": "\"\"\"empty message\n\nRevision ID: bae70e584468\nRevises:\nCreate Date: 2024-10-20 14:45:30.170794\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"bae70e584468\"\ndown_revision = None\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    op.create_table(\n        \"feed\",\n        sa.Column(\"id\", sa.Integer(), autoincrement=True, nullable=False),\n        sa.Column(\"title\", sa.Text(), nullable=False),\n        sa.Column(\"description\", sa.Text(), nullable=True),\n        sa.Column(\"author\", sa.Text(), nullable=True),\n        sa.Column(\"rss_url\", sa.Text(), nullable=False, unique=True),\n        sa.PrimaryKeyConstraint(\"id\"),\n    )\n\n    op.create_table(\n        \"post\",\n        sa.Column(\"id\", sa.Integer(), autoincrement=True, nullable=False),\n        sa.Column(\"feed_id\", sa.Integer(), nullable=False),\n        sa.Column(\"guid\", sa.Text(), nullable=False, unique=True),\n        sa.Column(\"download_url\", sa.Text(), nullable=False, unique=True),\n        sa.Column(\"title\", sa.Text(), nullable=False),\n        sa.Column(\"description\", sa.Text(), nullable=True),\n        sa.Column(\"release_date\", sa.Date(), nullable=True),\n        sa.Column(\"duration\", sa.Integer(), nullable=True),\n        sa.ForeignKeyConstraint(\n            [\"feed_id\"],\n            [\"feed.id\"],\n        ),\n        sa.PrimaryKeyConstraint(\"id\"),\n    )\n\n    op.create_table(\n        \"transcript\",\n        sa.Column(\"id\", sa.Integer(), autoincrement=True, nullable=False),\n        sa.Column(\"post_id\", sa.Integer(), nullable=False, unique=True),\n        sa.Column(\"content\", sa.Text(), nullable=False),\n        sa.Column(\"timestamp\", sa.DateTime(), nullable=True),\n        sa.ForeignKeyConstraint(\n            [\"post_id\"],\n            [\"post.id\"],\n        ),\n        
sa.PrimaryKeyConstraint(\"id\"),\n    )\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    # Drop tables in reverse dependency order so the initial schema is reversible\n    op.drop_table(\"transcript\")\n    op.drop_table(\"post\")\n    op.drop_table(\"feed\")\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/c0f8893ce927_add_skipped_jobs_columns.py",
    "content": "\"\"\"add skipped jobs counters\n\nRevision ID: c0f8893ce927\nRevises: 999b921ffc58\nCreate Date: 2026-11-27 00:00:00.000000\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"c0f8893ce927\"\ndown_revision = \"999b921ffc58\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    bind = op.get_bind()\n    inspector = sa.inspect(bind)\n\n    existing_tables = set(inspector.get_table_names())\n    if \"jobs_manager_run\" not in existing_tables:\n        return\n\n    columns = {col[\"name\"] for col in inspector.get_columns(\"jobs_manager_run\")}\n    if \"skipped_jobs\" not in columns:\n        with op.batch_alter_table(\"jobs_manager_run\", schema=None) as batch_op:\n            batch_op.add_column(\n                sa.Column(\n                    \"skipped_jobs\",\n                    sa.Integer(),\n                    nullable=False,\n                    server_default=\"0\",\n                )\n            )\n\n        # Align existing rows to default value\n        op.execute(\n            sa.text(\n                \"UPDATE jobs_manager_run SET skipped_jobs = 0 WHERE skipped_jobs IS NULL\"\n            )\n        )\n\n\ndef downgrade():\n    bind = op.get_bind()\n    inspector = sa.inspect(bind)\n\n    existing_tables = set(inspector.get_table_names())\n    if \"jobs_manager_run\" not in existing_tables:\n        return\n\n    columns = {col[\"name\"] for col in inspector.get_columns(\"jobs_manager_run\")}\n    if \"skipped_jobs\" in columns:\n        with op.batch_alter_table(\"jobs_manager_run\", schema=None) as batch_op:\n            batch_op.drop_column(\"skipped_jobs\")\n"
  },
  {
    "path": "src/migrations/versions/ded4b70feadb_add_image_metadata_to_feed.py",
    "content": "\"\"\"Add image metadata to feed\n\nRevision ID: ded4b70feadb\nRevises: 6e0e16299dcb\nCreate Date: 2025-03-01 14:30:20.177608\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"ded4b70feadb\"\ndown_revision = \"6e0e16299dcb\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    with op.batch_alter_table(\"feed\", schema=None) as batch_op:\n        batch_op.add_column(sa.Column(\"image_url\", sa.Text(), nullable=True))\n\n\ndef downgrade():\n    with op.batch_alter_table(\"feed\", schema=None) as batch_op:\n        batch_op.drop_column(\"image_url\")\n"
  },
  {
    "path": "src/migrations/versions/e1325294473b_add_autoprocess_on_download.py",
    "content": "\"\"\"add autoprocess_on_download\n\nRevision ID: e1325294473b\nRevises: 58b4eedd4c61\nCreate Date: 2025-12-25 20:45:12.595954\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"e1325294473b\"\ndown_revision = \"58b4eedd4c61\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"app_settings\", schema=None) as batch_op:\n        batch_op.add_column(\n            sa.Column(\n                \"autoprocess_on_download\",\n                sa.Boolean(),\n                nullable=False,\n                server_default=sa.false(),  # ensure existing SQLite rows get a value\n            )\n        )\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"app_settings\", schema=None) as batch_op:\n        batch_op.drop_column(\"autoprocess_on_download\")\n\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/eb51923af483_multiple_supporters.py",
    "content": "\"\"\"multiple supporters\n\nRevision ID: eb51923af483\nRevises: 802a2365976d\nCreate Date: 2025-12-01 22:25:13.104687\n\n\"\"\"\n\nfrom datetime import datetime\n\nimport sqlalchemy as sa\nfrom alembic import op\nfrom sqlalchemy import inspect\n\n# revision identifiers, used by Alembic.\nrevision = \"eb51923af483\"\ndown_revision = \"802a2365976d\"\nbranch_labels = None\ndepends_on = None\n\n\ndef _table_exists(table_name: str) -> bool:\n    \"\"\"Check if a table exists in the database.\"\"\"\n    connection = op.get_bind()\n    inspector = inspect(connection)\n    return table_name in inspector.get_table_names()\n\n\ndef _column_exists(table_name: str, column_name: str) -> bool:\n    \"\"\"Check if a column exists in a table.\"\"\"\n    connection = op.get_bind()\n    inspector = inspect(connection)\n    columns = [col[\"name\"] for col in inspector.get_columns(table_name)]\n    return column_name in columns\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! 
###\n\n    # Create feed_supporter table if it doesn't exist\n    if not _table_exists(\"feed_supporter\"):\n        op.create_table(\n            \"feed_supporter\",\n            sa.Column(\"id\", sa.Integer(), autoincrement=True, nullable=False),\n            sa.Column(\"feed_id\", sa.Integer(), nullable=False),\n            sa.Column(\"user_id\", sa.Integer(), nullable=False),\n            sa.Column(\"created_at\", sa.DateTime(), nullable=False),\n            sa.ForeignKeyConstraint(\n                [\"feed_id\"],\n                [\"feed.id\"],\n            ),\n            sa.ForeignKeyConstraint(\n                [\"user_id\"],\n                [\"users.id\"],\n            ),\n            sa.PrimaryKeyConstraint(\"id\"),\n            sa.UniqueConstraint(\n                \"feed_id\", \"user_id\", name=\"uq_feed_supporter_feed_user\"\n            ),\n        )\n\n    # Add columns to processing_job if they don't exist\n    if not _column_exists(\"processing_job\", \"requested_by_user_id\"):\n        with op.batch_alter_table(\"processing_job\", schema=None) as batch_op:\n            batch_op.add_column(\n                sa.Column(\"requested_by_user_id\", sa.Integer(), nullable=True)\n            )\n            batch_op.add_column(\n                sa.Column(\"billing_user_id\", sa.Integer(), nullable=True)\n            )\n            batch_op.create_foreign_key(\n                \"fk_processing_job_billing_user_id\",\n                \"users\",\n                [\"billing_user_id\"],\n                [\"id\"],\n            )\n            batch_op.create_foreign_key(\n                \"fk_processing_job_requested_by_user_id\",\n                \"users\",\n                [\"requested_by_user_id\"],\n                [\"id\"],\n            )\n\n    # Seed supporter rows for existing sponsors so they keep access permissions.\n    connection = op.get_bind()\n    feed_supporter_table = sa.table(\n        \"feed_supporter\",\n        sa.column(\"feed_id\", 
sa.Integer),\n        sa.column(\"user_id\", sa.Integer),\n        sa.column(\"created_at\", sa.DateTime),\n    )\n\n    # Check which sponsor/feed combos already exist\n    existing = set()\n    result = connection.execute(sa.text(\"SELECT feed_id, user_id FROM feed_supporter\"))\n    for row in result:\n        existing.add((row._mapping[\"feed_id\"], row._mapping[\"user_id\"]))\n\n    result = connection.execute(\n        sa.text(\n            \"SELECT id AS feed_id, sponsor_user_id FROM feed WHERE sponsor_user_id IS NOT NULL\"\n        )\n    )\n    inserts = []\n    seen = set()\n    for row in result:\n        feed_id = row._mapping[\"feed_id\"]\n        user_id = row._mapping[\"sponsor_user_id\"]\n        if not user_id:\n            continue\n        key = (feed_id, user_id)\n        if key in seen or key in existing:\n            continue\n        seen.add(key)\n        inserts.append(\n            {\n                \"feed_id\": feed_id,\n                \"user_id\": user_id,\n                \"created_at\": datetime.utcnow(),\n            }\n        )\n    if inserts:\n        op.bulk_insert(feed_supporter_table, inserts)\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"processing_job\", schema=None) as batch_op:\n        batch_op.drop_constraint(\n            \"fk_processing_job_requested_by_user_id\", type_=\"foreignkey\"\n        )\n        batch_op.drop_constraint(\n            \"fk_processing_job_billing_user_id\", type_=\"foreignkey\"\n        )\n        batch_op.drop_column(\"billing_user_id\")\n        batch_op.drop_column(\"requested_by_user_id\")\n\n    op.drop_table(\"feed_supporter\")\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/f6d5fee57cc3_tz_fix.py",
    "content": "\"\"\"tz_fix\n\nRevision ID: f6d5fee57cc3\nRevises: 0d954a44fa8e\nCreate Date: 2025-11-04 22:31:38.563280\n\n\"\"\"\n\nimport datetime\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"f6d5fee57cc3\"\ndown_revision = \"0d954a44fa8e\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    bind = op.get_bind()\n    inspector = sa.inspect(bind)\n    column_names = {col[\"name\"] for col in inspector.get_columns(\"post\")}\n\n    if \"release_date\" not in column_names and \"release_date_tmp\" in column_names:\n        with op.batch_alter_table(\"post\", schema=None) as batch_op:\n            batch_op.alter_column(\"release_date_tmp\", new_column_name=\"release_date\")\n        return\n\n    if \"release_date\" not in column_names:\n        # Nothing to migrate (already applied manually, or table missing column)\n        return\n\n    if \"release_date_tmp\" not in column_names:\n        with op.batch_alter_table(\"post\", schema=None) as batch_op:\n            batch_op.add_column(\n                sa.Column(\"release_date_tmp\", sa.DateTime(timezone=True), nullable=True)\n            )\n\n    metadata = sa.MetaData()\n    post = sa.Table(\"post\", metadata, autoload_with=bind)\n\n    select_stmt = sa.select(post.c.id, post.c.release_date)\n    rows = bind.execute(select_stmt).fetchall()\n    for row in rows:\n        if row.release_date is None:\n            continue\n        if isinstance(row.release_date, datetime.datetime):\n            dt = row.release_date\n        else:\n            dt = datetime.datetime.combine(row.release_date, datetime.time())\n        dt = dt.replace(tzinfo=datetime.timezone.utc)\n        bind.execute(\n            post.update().where(post.c.id == row.id).values(release_date_tmp=dt)\n        )\n\n    inspector = sa.inspect(bind)\n    column_names = {col[\"name\"] for col in inspector.get_columns(\"post\")}\n    if \"release_date\" in column_names:\n        
with op.batch_alter_table(\"post\", schema=None) as batch_op:\n            batch_op.drop_column(\"release_date\")\n\n    inspector = sa.inspect(bind)\n    column_names = {col[\"name\"] for col in inspector.get_columns(\"post\")}\n    if \"release_date_tmp\" in column_names:\n        with op.batch_alter_table(\"post\", schema=None) as batch_op:\n            batch_op.alter_column(\"release_date_tmp\", new_column_name=\"release_date\")\n\n\ndef downgrade():\n    bind = op.get_bind()\n    inspector = sa.inspect(bind)\n    column_names = {col[\"name\"] for col in inspector.get_columns(\"post\")}\n\n    if \"release_date\" not in column_names and \"release_date_date\" in column_names:\n        with op.batch_alter_table(\"post\", schema=None) as batch_op:\n            batch_op.alter_column(\"release_date_date\", new_column_name=\"release_date\")\n        return\n\n    if \"release_date\" not in column_names:\n        # Nothing to revert\n        return\n\n    if \"release_date_date\" not in column_names:\n        with op.batch_alter_table(\"post\", schema=None) as batch_op:\n            batch_op.add_column(\n                sa.Column(\"release_date_date\", sa.DATE(), nullable=True)\n            )\n\n    metadata = sa.MetaData()\n    post = sa.Table(\"post\", metadata, autoload_with=bind)\n\n    select_stmt = sa.select(post.c.id, post.c.release_date)\n    rows = bind.execute(select_stmt).fetchall()\n    for row in rows:\n        if row.release_date is None:\n            continue\n        if isinstance(row.release_date, datetime.datetime):\n            dt = row.release_date\n        else:\n            dt = datetime.datetime.combine(row.release_date, datetime.time())\n        date_only = dt.astimezone(datetime.timezone.utc).date()\n        bind.execute(\n            post.update().where(post.c.id == row.id).values(release_date_date=date_only)\n        )\n\n    inspector = sa.inspect(bind)\n    column_names = {col[\"name\"] for col in inspector.get_columns(\"post\")}\n    if 
\"release_date\" in column_names:\n        with op.batch_alter_table(\"post\", schema=None) as batch_op:\n            batch_op.drop_column(\"release_date\")\n\n    inspector = sa.inspect(bind)\n    column_names = {col[\"name\"] for col in inspector.get_columns(\"post\")}\n    if \"release_date_date\" in column_names:\n        with op.batch_alter_table(\"post\", schema=None) as batch_op:\n            batch_op.alter_column(\"release_date_date\", new_column_name=\"release_date\")\n"
  },
  {
    "path": "src/migrations/versions/f7a4195e0953_add_enable_boundary_refinement_to_llm_.py",
    "content": "\"\"\"add enable_boundary_refinement to llm_settings\n\nRevision ID: f7a4195e0953\nRevises: e1325294473b\nCreate Date: 2026-01-06 23:02:56.142954\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"f7a4195e0953\"\ndown_revision = \"e1325294473b\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"llm_settings\", schema=None) as batch_op:\n        batch_op.add_column(\n            sa.Column(\n                \"enable_boundary_refinement\",\n                sa.Boolean(),\n                nullable=False,\n                server_default=sa.text(\"1\"),\n            )\n        )\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"llm_settings\", schema=None) as batch_op:\n        batch_op.drop_column(\"enable_boundary_refinement\")\n\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/migrations/versions/fa3a95ecd67d_audio_processing_paths.py",
    "content": "\"\"\"audio processing paths\n\nRevision ID: fa3a95ecd67d\nRevises: bae70e584468\nCreate Date: 2024-11-09 16:48:09.337029\n\n\"\"\"\n\nimport sqlalchemy as sa\nfrom alembic import op\n\n# revision identifiers, used by Alembic.\nrevision = \"fa3a95ecd67d\"\ndown_revision = \"bae70e584468\"\nbranch_labels = None\ndepends_on = None\n\n\ndef upgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"post\", schema=None) as batch_op:\n        batch_op.add_column(\n            sa.Column(\"unprocessed_audio_path\", sa.Text(), nullable=True)\n        )\n        batch_op.add_column(sa.Column(\"processed_audio_path\", sa.Text(), nullable=True))\n\n    # ### end Alembic commands ###\n\n\ndef downgrade():\n    # ### commands auto generated by Alembic - please adjust! ###\n    with op.batch_alter_table(\"post\", schema=None) as batch_op:\n        batch_op.drop_column(\"processed_audio_path\")\n        batch_op.drop_column(\"unprocessed_audio_path\")\n\n    # ### end Alembic commands ###\n"
  },
  {
    "path": "src/podcast_processor/__init__.py",
    "content": "from warnings import filterwarnings\n\nfrom beartype.claw import beartype_this_package\nfrom beartype.roar import BeartypeDecorHintPep585DeprecationWarning\n\nbeartype_this_package()\n\nfilterwarnings(\"ignore\", category=BeartypeDecorHintPep585DeprecationWarning)\n"
  },
  {
    "path": "src/podcast_processor/ad_classifier.py",
    "content": "import logging\nimport math\nimport time\n\n# pylint: disable=too-many-lines\nfrom datetime import datetime\nfrom typing import Any, Dict, List, Optional, Set, Tuple, Union\n\nimport litellm\nfrom jinja2 import Template\nfrom litellm.exceptions import InternalServerError\nfrom litellm.types.utils import Choices\nfrom pydantic import ValidationError\nfrom sqlalchemy import and_\n\nfrom app.extensions import db\nfrom app.models import Identification, ModelCall, Post, TranscriptSegment\nfrom app.writer.client import writer_client\nfrom podcast_processor.boundary_refiner import BoundaryRefiner\nfrom podcast_processor.cue_detector import CueDetector\nfrom podcast_processor.llm_concurrency_limiter import (\n    ConcurrencyContext,\n    LLMConcurrencyLimiter,\n    get_concurrency_limiter,\n)\nfrom podcast_processor.model_output import (\n    AdSegmentPredictionList,\n    clean_and_parse_model_output,\n)\nfrom podcast_processor.prompt import transcript_excerpt_for_prompt\nfrom podcast_processor.token_rate_limiter import (\n    TokenRateLimiter,\n    configure_rate_limiter_for_model,\n)\nfrom podcast_processor.transcribe import Segment\nfrom podcast_processor.word_boundary_refiner import WordBoundaryRefiner\nfrom shared.config import Config, TestWhisperConfig\nfrom shared.llm_utils import model_uses_max_completion_tokens\n\n\nclass ClassifyParams:\n    def __init__(\n        self,\n        system_prompt: str,\n        user_prompt_template: Template,\n        post: Post,\n        num_segments_per_prompt: int,\n        max_overlap_segments: int,\n    ):\n        self.system_prompt = system_prompt\n        self.user_prompt_template = user_prompt_template\n        self.post = post\n        self.num_segments_per_prompt = num_segments_per_prompt\n        self.max_overlap_segments = max_overlap_segments\n\n\nclass ClassifyException(Exception):\n    \"\"\"Custom exception for classification errors.\"\"\"\n\n\nclass AdClassifier:\n    \"\"\"Handles the classification 
of ad segments in podcast transcripts.\"\"\"\n\n    def __init__(\n        self,\n        config: Config,\n        logger: Optional[logging.Logger] = None,\n        model_call_query: Optional[Any] = None,\n        identification_query: Optional[Any] = None,\n        db_session: Optional[Any] = None,\n    ):\n        self.config = config\n        self.logger = logger or logging.getLogger(\"global_logger\")\n        self.model_call_query = model_call_query or ModelCall.query\n        self.identification_query = identification_query or Identification.query\n        self.db_session = db_session or db.session\n\n        # Initialize rate limiter for the configured model\n        self.rate_limiter: Optional[TokenRateLimiter]\n        if self.config.llm_enable_token_rate_limiting:\n            tokens_per_minute = self.config.llm_max_input_tokens_per_minute\n            if tokens_per_minute is None:\n                # Use model-specific defaults\n                self.rate_limiter = configure_rate_limiter_for_model(\n                    self.config.llm_model\n                )\n            else:\n                # Use custom limit\n                from podcast_processor.token_rate_limiter import get_rate_limiter\n\n                self.rate_limiter = get_rate_limiter(tokens_per_minute)\n                self.logger.info(\n                    f\"Using custom token rate limit: {tokens_per_minute}/min\"\n                )\n        else:\n            self.rate_limiter = None\n            self.logger.info(\"Token rate limiting disabled\")\n\n        # Initialize concurrency limiter for LLM API calls\n        self.concurrency_limiter: Optional[LLMConcurrencyLimiter]\n        max_concurrent = getattr(self.config, \"llm_max_concurrent_calls\", 3)\n        if max_concurrent > 0:\n            self.concurrency_limiter = get_concurrency_limiter(max_concurrent)\n            self.logger.info(\n                f\"LLM concurrency limiting enabled: max {max_concurrent} concurrent calls\"\n   
         )\n        else:\n            self.concurrency_limiter = None\n            self.logger.info(\"LLM concurrency limiting disabled\")\n\n        # Initialize cue detector for neighbor expansion\n        self.cue_detector = CueDetector()\n\n        # Initialize boundary refiner (conditionally based on config)\n        self.boundary_refiner: Optional[BoundaryRefiner] = None\n        if config.enable_boundary_refinement:\n            if getattr(config, \"enable_word_level_boundary_refinder\", False):\n                self.boundary_refiner = WordBoundaryRefiner(config, self.logger)  # type: ignore[assignment]\n                self.logger.info(\"Word-level boundary refiner enabled\")\n            else:\n                self.boundary_refiner = BoundaryRefiner(config, self.logger)\n                self.logger.info(\"Boundary refinement enabled\")\n        else:\n            self.logger.info(\"Boundary refinement disabled via config\")\n\n    def classify(\n        self,\n        *,\n        transcript_segments: List[TranscriptSegment],\n        system_prompt: str,\n        user_prompt_template: Template,\n        post: Post,\n    ) -> None:\n        \"\"\"\n        Classifies transcript segments to identify ad segments.\n\n        Args:\n            transcript_segments: List of transcript segments to classify\n            system_prompt: System prompt for the LLM\n            user_prompt_template: User prompt template for the LLM\n            post: Post containing the podcast to classify\n        \"\"\"\n        self.logger.info(\n            f\"Starting ad classification for post {post.id} with {len(transcript_segments)} segments.\"\n        )\n\n        if not transcript_segments:\n            self.logger.info(\n                f\"No transcript segments to classify for post {post.id}. 
Skipping.\"\n            )\n            return\n\n        classify_params = ClassifyParams(\n            system_prompt=system_prompt,\n            user_prompt_template=user_prompt_template,\n            post=post,\n            num_segments_per_prompt=self.config.processing.num_segments_to_input_to_prompt,\n            max_overlap_segments=self.config.processing.max_overlap_segments,\n        )\n\n        total_segments = len(transcript_segments)\n\n        try:\n            current_index = 0\n            next_overlap_segments: List[TranscriptSegment] = []\n            max_iterations = (\n                total_segments + 10\n            )  # Safety limit to prevent infinite loops\n            iteration_count = 0\n            while current_index < total_segments and iteration_count < max_iterations:\n                consumed_segments, next_overlap_segments = self._step(\n                    classify_params,\n                    next_overlap_segments,\n                    current_index,\n                    transcript_segments,\n                )\n                current_index += consumed_segments\n                iteration_count += 1\n                if consumed_segments == 0:\n                    self.logger.error(\n                        f\"No progress made in iteration {iteration_count} for post {post.id}. 
\"\n                        \"Breaking to avoid infinite loop.\"\n                    )\n                    break\n\n            # Expand neighbors using bulk operations\n            # NOTE: Use self.db_session.query() instead of self.identification_query\n            # to ensure all operations use the same session consistently.\n            ad_identifications = (\n                self.db_session.query(Identification)\n                .join(TranscriptSegment)\n                .filter(\n                    TranscriptSegment.post_id == post.id,\n                    Identification.label == \"ad\",\n                )\n                .all()\n            )\n\n            if ad_identifications:\n                # Get model_call from first identification\n                model_call = (\n                    ad_identifications[0].model_call if ad_identifications else None\n                )\n                if model_call:\n                    created = self.expand_neighbors_bulk(\n                        ad_identifications=ad_identifications,\n                        model_call=model_call,\n                        post_id=post.id,\n                        window=5,\n                    )\n                    self.logger.info(\n                        f\"Created {created} neighbor identifications via bulk ops\"\n                    )\n\n            # Pass 2: Refine boundaries\n            if self.boundary_refiner:\n                self._refine_boundaries(transcript_segments, post)\n\n        except ClassifyException as e:\n            self.logger.error(f\"Classification failed for post {post.id}: {e}\")\n            return\n\n    def _step(\n        self,\n        classify_params: ClassifyParams,\n        prev_overlap_segments: List[TranscriptSegment],\n        current_index: int,\n        transcript_segments: List[TranscriptSegment],\n    ) -> Tuple[int, List[TranscriptSegment]]:\n        overlap_segments = self._apply_overlap_cap(prev_overlap_segments)\n        
remaining_segments = transcript_segments[current_index:]\n\n        (\n            chunk_segments,\n            user_prompt_str,\n            consumed_segments,\n            token_limit_trimmed,\n        ) = self._build_chunk_payload(\n            overlap_segments=overlap_segments,\n            remaining_segments=remaining_segments,\n            total_segments=transcript_segments,\n            post=classify_params.post,\n            system_prompt=classify_params.system_prompt,\n            user_prompt_template=classify_params.user_prompt_template,\n            max_new_segments=classify_params.num_segments_per_prompt,\n        )\n\n        if not chunk_segments or consumed_segments <= 0:\n            self.logger.error(\n                \"No progress made while building classification chunk for post %s. \"\n                \"Stopping to avoid infinite loop.\",\n                classify_params.post.id,\n            )\n            raise ClassifyException(\n                \"No progress made while building classification chunk.\"\n            )\n\n        if token_limit_trimmed:\n            self.logger.debug(\n                \"Token limit trimming applied for post %s at transcript index %s. 
\"\n                \"Processing chunk with %s new segments across %s total segments.\",\n                classify_params.post.id,\n                current_index,\n                consumed_segments,\n                len(chunk_segments),\n            )\n\n        identified_segments = self._process_chunk(\n            chunk_segments=chunk_segments,\n            system_prompt=classify_params.system_prompt,\n            user_prompt_str=user_prompt_str,\n            post=classify_params.post,\n        )\n\n        next_overlap_segments = self._compute_next_overlap_segments(\n            chunk_segments=chunk_segments,\n            identified_segments=identified_segments,\n            max_overlap_segments=classify_params.max_overlap_segments,\n        )\n\n        if next_overlap_segments:\n            self.logger.debug(\n                \"Carrying forward %s overlap segments for post %s: %s\",\n                len(next_overlap_segments),\n                classify_params.post.id,\n                [seg.sequence_num for seg in next_overlap_segments],\n            )\n\n        return consumed_segments, next_overlap_segments\n\n    def _process_chunk(\n        self,\n        *,\n        chunk_segments: List[TranscriptSegment],\n        system_prompt: str,\n        post: Post,\n        user_prompt_str: str,\n    ) -> List[TranscriptSegment]:\n        \"\"\"Process a chunk of transcript segments for classification.\"\"\"\n        if not chunk_segments:\n            return []\n\n        first_seq_num = chunk_segments[0].sequence_num\n        last_seq_num = chunk_segments[-1].sequence_num\n\n        self.logger.info(\n            f\"Processing classification for post {post.id}, segments {first_seq_num}-{last_seq_num}.\"\n        )\n\n        model_call = self._get_or_create_model_call(\n            post=post,\n            first_seq_num=first_seq_num,\n            last_seq_num=last_seq_num,\n            user_prompt_str=user_prompt_str,\n        )\n\n        if not model_call:\n   
         self.logger.error(\"ModelCall object is unexpectedly None. Skipping chunk.\")\n            return []\n\n        if self._should_call_llm(model_call):\n            self._perform_llm_call(\n                model_call=model_call,\n                system_prompt=system_prompt,\n            )\n\n        if model_call.status == \"success\" and model_call.response:\n            return self._process_successful_response(\n                model_call=model_call,\n                current_chunk_db_segments=chunk_segments,\n            )\n        if model_call.status != \"success\":\n            self.logger.info(\n                f\"LLM call for ModelCall {model_call.id} was not successful (status: {model_call.status}). No identifications to process.\"\n            )\n        return []\n\n    def _build_chunk_payload(\n        self,\n        *,\n        overlap_segments: List[TranscriptSegment],\n        remaining_segments: List[TranscriptSegment],\n        total_segments: List[TranscriptSegment],\n        post: Post,\n        system_prompt: str,\n        user_prompt_template: Template,\n        max_new_segments: int,\n    ) -> Tuple[List[TranscriptSegment], str, int, bool]:\n        \"\"\"Construct chunk data while enforcing overlap and token constraints.\"\"\"\n        if not remaining_segments:\n            return ([], \"\", 0, False)\n\n        capped_overlap = self._apply_overlap_cap(overlap_segments)\n        new_segment_count = min(max_new_segments, len(remaining_segments))\n        token_limit_trimmed = False\n\n        while new_segment_count > 0:\n            base_segments = remaining_segments[:new_segment_count]\n            chunk_segments = self._combine_overlap_segments(\n                overlap_segments=capped_overlap,\n                base_segments=base_segments,\n            )\n\n            if not chunk_segments:\n                return ([], \"\", 0, token_limit_trimmed)\n\n            includes_start = (\n                chunk_segments[0].id == 
total_segments[0].id\n                if total_segments\n                else False\n            )\n            includes_end = (\n                chunk_segments[-1].id == total_segments[-1].id\n                if total_segments\n                else False\n            )\n\n            user_prompt_str = self._generate_user_prompt(\n                current_chunk_db_segments=chunk_segments,\n                post=post,\n                user_prompt_template=user_prompt_template,\n                includes_start=includes_start,\n                includes_end=includes_end,\n            )\n\n            if (\n                self.config.llm_max_input_tokens_per_call is not None\n                and not self._validate_token_limit(user_prompt_str, system_prompt)\n            ):\n                token_limit_trimmed = True\n                if new_segment_count == 1:\n                    self.logger.warning(\n                        \"Even single segment at transcript index %s exceeds token limit \"\n                        \"for post %s. 
Proceeding with minimal chunk.\",\n                        base_segments[0].sequence_num,\n                        post.id,\n                    )\n                    return (chunk_segments, user_prompt_str, new_segment_count, True)\n                new_segment_count -= 1\n                continue\n\n            return (\n                chunk_segments,\n                user_prompt_str,\n                new_segment_count,\n                token_limit_trimmed,\n            )\n\n        return ([], \"\", 0, token_limit_trimmed)\n\n    def _combine_overlap_segments(\n        self,\n        *,\n        overlap_segments: List[TranscriptSegment],\n        base_segments: List[TranscriptSegment],\n    ) -> List[TranscriptSegment]:\n        \"\"\"Combine overlap and new segments while preserving order and removing duplicates.\"\"\"\n        combined: List[TranscriptSegment] = []\n        seen_ids: Set[int] = set()\n\n        for segment in overlap_segments:\n            if segment.id not in seen_ids:\n                combined.append(segment)\n                seen_ids.add(segment.id)\n\n        for segment in base_segments:\n            if segment.id not in seen_ids:\n                combined.append(segment)\n                seen_ids.add(segment.id)\n\n        self.logger.debug(\n            \"Combined overlap (%s segments) and base (%s segments) into %s total segments. 
\"\n            \"Overlap seq nums: %s, Base seq nums: %s\",\n            len(overlap_segments),\n            len(base_segments),\n            len(combined),\n            [seg.sequence_num for seg in overlap_segments],\n            [seg.sequence_num for seg in base_segments],\n        )\n\n        return combined\n\n    def _compute_next_overlap_segments(\n        self,\n        *,\n        chunk_segments: List[TranscriptSegment],\n        identified_segments: List[TranscriptSegment],\n        max_overlap_segments: int,\n    ) -> List[TranscriptSegment]:\n        \"\"\"Determine which segments should be carried forward to the next chunk.\"\"\"\n        if max_overlap_segments <= 0 or not chunk_segments:\n            return []\n\n        # Baseline: carry ~50% of the chunk to guarantee overlap even without detections\n        base_tail_count = max(1, math.ceil(len(chunk_segments) / 2))\n        overlap_candidates = list(chunk_segments[-base_tail_count:])\n\n        if identified_segments:\n            # Preserve from earliest detected ad through the end of the chunk\n            identified_ids = {seg.id for seg in identified_segments}\n            earliest_index = None\n            for i, seg in enumerate(chunk_segments):\n                if seg.id in identified_ids:\n                    earliest_index = i\n                    break\n\n            if earliest_index is not None:\n                ad_tail = chunk_segments[earliest_index:]\n                overlap_candidates = self._combine_overlap_segments(\n                    overlap_segments=ad_tail,\n                    base_segments=overlap_candidates,\n                )\n\n            # Conditional tail replay: always include the final ~15 seconds when ads are present\n            tail_replay_segments = self._segments_covering_tail(\n                chunk_segments=chunk_segments, seconds=15.0\n            )\n            overlap_candidates = self._combine_overlap_segments(\n                
overlap_segments=tail_replay_segments,\n                base_segments=overlap_candidates,\n            )\n\n        capped = self._apply_overlap_cap(\n            overlap_candidates, max_override=max_overlap_segments\n        )\n        self.logger.debug(\n            \"Carrying forward %s overlap segments: seq_nums %s (identified=%s)\",\n            len(capped),\n            [seg.sequence_num for seg in capped],\n            bool(identified_segments),\n        )\n        return capped\n\n    def _apply_overlap_cap(\n        self,\n        overlap_segments: List[TranscriptSegment],\n        max_override: Optional[int] = None,\n    ) -> List[TranscriptSegment]:\n        \"\"\"Ensure stored overlap obeys configured limits.\"\"\"\n        max_overlap = (\n            self.config.processing.max_overlap_segments\n            if max_override is None\n            else max_override\n        )\n        if max_overlap <= 0:\n            if overlap_segments:\n                self.logger.debug(\n                    \"Discarding %s overlap segments because max_overlap_segments is %s.\",\n                    len(overlap_segments),\n                    max_overlap,\n                )\n            return []\n        if not overlap_segments:\n            return []\n\n        if len(overlap_segments) <= max_overlap:\n            self.logger.debug(\n                \"Overlap cap check: %s segments within limit of %s, no trimming needed\",\n                len(overlap_segments),\n                max_overlap,\n            )\n            return list(overlap_segments)\n\n        trimmed = overlap_segments[-max_overlap:]\n        self.logger.debug(\n            \"Overlap cap enforcement: trimming from %s to %s segments (max=%s). 
\"\n            \"Keeping seq_nums: %s\",\n            len(overlap_segments),\n            len(trimmed),\n            max_overlap,\n            [seg.sequence_num for seg in trimmed],\n        )\n        return trimmed\n\n    def _segments_covering_tail(\n        self, *, chunk_segments: List[TranscriptSegment], seconds: float\n    ) -> List[TranscriptSegment]:\n        \"\"\"Return the minimal set of segments covering the last `seconds` of audio.\"\"\"\n        if not chunk_segments:\n            return []\n\n        last_end_time = (\n            chunk_segments[-1].end_time\n            if chunk_segments[-1].end_time is not None\n            else chunk_segments[-1].start_time\n        )\n        cutoff = last_end_time - seconds\n\n        tail_segments: List[TranscriptSegment] = []\n        for seg in reversed(chunk_segments):\n            tail_segments.append(seg)\n            if seg.start_time <= cutoff:\n                break\n\n        return list(reversed(tail_segments))\n\n    def _validate_token_limit(self, user_prompt_str: str, system_prompt: str) -> bool:\n        \"\"\"Validate that the prompt doesn't exceed the configured token limit.\"\"\"\n        if self.config.llm_max_input_tokens_per_call is None:\n            return True\n\n        # Create messages as they would be sent to the API\n        messages = [\n            {\"role\": \"system\", \"content\": system_prompt},\n            {\"role\": \"user\", \"content\": user_prompt_str},\n        ]\n\n        # Count tokens (reuse the existing token counting logic from rate limiter)\n        if self.rate_limiter:\n            token_count = self.rate_limiter.count_tokens(\n                messages, self.config.llm_model\n            )\n        else:\n            # Fallback token estimation if no rate limiter\n            total_chars = len(system_prompt) + len(user_prompt_str)\n            token_count = total_chars // 4  # ~4 characters per token\n\n        is_valid = token_count <= 
self.config.llm_max_input_tokens_per_call\n\n        if not is_valid:\n            self.logger.debug(\n                f\"Prompt exceeds token limit: {token_count} > {self.config.llm_max_input_tokens_per_call}\"\n            )\n        else:\n            self.logger.debug(\n                f\"Prompt within token limit: {token_count} <= {self.config.llm_max_input_tokens_per_call}\"\n            )\n\n        return is_valid\n\n    def _prepare_api_call(\n        self, model_call_obj: ModelCall, system_prompt: str\n    ) -> Optional[Dict[str, Any]]:\n        \"\"\"Prepare API call arguments and validate token limits.\"\"\"\n        # Prepare messages for the API call\n        messages = [\n            {\"role\": \"system\", \"content\": system_prompt},\n            {\"role\": \"user\", \"content\": model_call_obj.prompt},\n        ]\n\n        # Use rate limiter to wait if necessary and track token usage\n        if self.rate_limiter:\n            self.rate_limiter.wait_if_needed(messages, model_call_obj.model_name)\n\n            # Get usage stats for logging\n            usage_stats = self.rate_limiter.get_usage_stats()\n            self.logger.info(\n                f\"Token usage: {usage_stats['current_usage']}/{usage_stats['limit']} \"\n                f\"({usage_stats['usage_percentage']:.1f}%) for ModelCall {model_call_obj.id}\"\n            )\n\n        # Final validation: Check per-call token limit before making API call\n        if self.config.llm_max_input_tokens_per_call is not None:\n            if not self._validate_token_limit(model_call_obj.prompt, system_prompt):\n                error_msg = (\n                    f\"Prompt for ModelCall {model_call_obj.id} exceeds configured \"\n                    f\"token limit of {self.config.llm_max_input_tokens_per_call}. 
\"\n                    f\"Consider reducing num_segments_to_input_to_prompt.\"\n                )\n                self.logger.error(error_msg)\n                if model_call_obj.id is not None:\n                    res = writer_client.update(\n                        \"ModelCall\",\n                        model_call_obj.id,\n                        {\"status\": \"failed\", \"error_message\": error_msg},\n                        wait=True,\n                    )\n                    if not res or not res.success:\n                        raise RuntimeError(\n                            getattr(res, \"error\", \"Failed to update ModelCall\")\n                        )\n                    # Update local object to reflect database state\n                    model_call_obj.status = \"failed\"\n                    model_call_obj.error_message = error_msg\n                return None\n\n        # Prepare completion arguments\n        completion_args = {\n            \"model\": model_call_obj.model_name,\n            \"messages\": messages,\n            \"timeout\": self.config.openai_timeout,\n        }\n\n        # Use max_completion_tokens for newer OpenAI models (o1, gpt-5, gpt-4o variants)\n        # OpenAI deprecated max_tokens for these models in favor of max_completion_tokens\n        # Check if this is a model that requires max_completion_tokens\n        # This includes: gpt-5, gpt-4o variants, o1 series, and latest chatgpt models\n        uses_max_completion_tokens = model_uses_max_completion_tokens(\n            model_call_obj.model_name\n        )\n\n        # Debug logging to help diagnose model parameter issues\n        self.logger.info(\n            f\"Model: '{model_call_obj.model_name}', using max_completion_tokens: {uses_max_completion_tokens}\"\n        )\n\n        if uses_max_completion_tokens:\n            completion_args[\"max_completion_tokens\"] = self.config.openai_max_tokens\n        else:\n            # For older models and non-OpenAI 
models, use max_tokens\n            completion_args[\"max_tokens\"] = self.config.openai_max_tokens\n\n        return completion_args\n\n    def _generate_user_prompt(\n        self,\n        *,\n        current_chunk_db_segments: List[TranscriptSegment],\n        post: Post,\n        user_prompt_template: Template,\n        includes_start: bool,\n        includes_end: bool,\n    ) -> str:\n        \"\"\"Generate the user prompt string for the LLM.\"\"\"\n        temp_pydantic_segments_for_prompt = [\n            Segment(start=db_seg.start_time, end=db_seg.end_time, text=db_seg.text)\n            for db_seg in current_chunk_db_segments\n        ]\n\n        return user_prompt_template.render(\n            podcast_title=post.title,\n            podcast_topic=post.description if post.description else \"\",\n            transcript=transcript_excerpt_for_prompt(\n                segments=temp_pydantic_segments_for_prompt,\n                includes_start=includes_start,\n                includes_end=includes_end,\n            ),\n        )\n\n    def _get_or_create_model_call(\n        self,\n        *,\n        post: Post,\n        first_seq_num: int,\n        last_seq_num: int,\n        user_prompt_str: str,\n    ) -> Optional[ModelCall]:\n        \"\"\"Get an existing ModelCall or create a new one via writer.\"\"\"\n        model = self.config.llm_model\n        result = writer_client.action(\n            \"upsert_model_call\",\n            {\n                \"post_id\": post.id,\n                \"model_name\": model,\n                \"first_segment_sequence_num\": first_seq_num,\n                \"last_segment_sequence_num\": last_seq_num,\n                \"prompt\": user_prompt_str,\n            },\n            wait=True,\n        )\n        if not result or not result.success:\n            raise RuntimeError(getattr(result, \"error\", \"Failed to upsert ModelCall\"))\n\n        model_call_id = (result.data or {}).get(\"model_call_id\")\n        if 
model_call_id is None:\n            raise RuntimeError(\"Writer did not return model_call_id\")\n\n        model_call = self.db_session.get(ModelCall, int(model_call_id))\n        if model_call is None:\n            raise RuntimeError(f\"ModelCall {model_call_id} not found after upsert\")\n        return model_call\n\n    def _should_call_llm(self, model_call: ModelCall) -> bool:\n        \"\"\"Determine if an LLM call should be made.\"\"\"\n        return model_call.status not in (\"success\", \"failed_permanent\")\n\n    def _perform_llm_call(self, *, model_call: ModelCall, system_prompt: str) -> None:\n        \"\"\"Perform the LLM call for classification.\"\"\"\n        self.logger.info(\n            f\"Calling LLM for ModelCall {model_call.id} (post {model_call.post_id}, segments {model_call.first_segment_sequence_num}-{model_call.last_segment_sequence_num}).\"\n        )\n        try:\n            if isinstance(self.config.whisper, TestWhisperConfig):\n                self._handle_test_mode_call(model_call)\n            else:\n                self._call_model(model_call_obj=model_call, system_prompt=system_prompt)\n        except Exception as e:  # pylint: disable=broad-exception-caught\n            self.logger.error(\n                f\"LLM interaction via _call_model for ModelCall {model_call.id} resulted in an exception: {e}\",\n                exc_info=True,\n            )\n\n    def _handle_test_mode_call(self, model_call: ModelCall) -> None:\n        \"\"\"Handle LLM call in test mode.\"\"\"\n        self.logger.info(\"Test mode: Simulating successful LLM call for classify.\")\n        test_response = AdSegmentPredictionList(ad_segments=[]).model_dump_json()\n        res = writer_client.update(\n            \"ModelCall\",\n            model_call.id,\n            {\n                \"response\": test_response,\n                \"status\": \"success\",\n                \"error_message\": None,\n                \"retry_attempts\": 1,\n            },\n      
      wait=True,\n        )\n        if not res or not res.success:\n            raise RuntimeError(getattr(res, \"error\", \"Failed to update ModelCall\"))\n        # Update local object to reflect database state\n        model_call.status = \"success\"\n        model_call.response = test_response\n        model_call.error_message = None\n\n    def _process_successful_response(\n        self,\n        *,\n        model_call: ModelCall,\n        current_chunk_db_segments: List[TranscriptSegment],\n    ) -> List[TranscriptSegment]:\n        \"\"\"Process a successful LLM response and create Identification records.\"\"\"\n        self.logger.info(\n            f\"LLM call for ModelCall {model_call.id} was successful. Parsing response.\"\n        )\n        try:\n            prediction_list = clean_and_parse_model_output(model_call.response)\n            created_identification_count, matched_segments = (\n                self._create_identifications(\n                    prediction_list=prediction_list,\n                    current_chunk_db_segments=current_chunk_db_segments,\n                    model_call=model_call,\n                )\n            )\n\n            if created_identification_count > 0:\n                self.logger.info(\n                    f\"Created {created_identification_count} new Identification records for ModelCall {model_call.id}.\"\n                )\n            return matched_segments\n        except (ValidationError, AssertionError) as e:\n            self.logger.error(\n                f\"Error processing LLM response for ModelCall {model_call.id}: {e}\",\n                exc_info=True,\n            )\n        return []\n\n    def _create_identifications(\n        self,\n        *,\n        prediction_list: AdSegmentPredictionList,\n        current_chunk_db_segments: List[TranscriptSegment],\n        model_call: ModelCall,\n    ) -> Tuple[int, List[TranscriptSegment]]:\n        \"\"\"Create Identification records from the prediction 
list.\"\"\"\n        to_insert: List[Dict[str, Any]] = []\n        matched_segments: List[TranscriptSegment] = []\n        processed_segment_ids: Set[int] = set()\n        content_type = prediction_list.content_type\n\n        for pred in prediction_list.ad_segments:\n            adjusted_confidence = self._adjust_confidence(\n                base_confidence=pred.confidence,\n                content_type=content_type,\n            )\n\n            if adjusted_confidence < self.config.output.min_confidence:\n                self.logger.info(\n                    f\"Ad prediction offset {pred.segment_offset:.2f} for post {model_call.post_id} ignored due to low adjusted confidence: {adjusted_confidence:.2f} (min: {self.config.output.min_confidence})\"\n                )\n                continue\n\n            matched_segment = self._find_matching_segment(\n                segment_offset=pred.segment_offset,\n                current_chunk_db_segments=current_chunk_db_segments,\n            )\n\n            if not matched_segment:\n                self.logger.warning(\n                    f\"Could not find matching TranscriptSegment for ad prediction offset {pred.segment_offset:.2f} in post {model_call.post_id}, chunk {model_call.first_segment_sequence_num}-{model_call.last_segment_sequence_num}. 
Confidence: {pred.confidence:.2f}\"\n                )\n                continue\n\n            if matched_segment.id in processed_segment_ids:\n                continue\n\n            processed_segment_ids.add(matched_segment.id)\n            matched_segments.append(matched_segment)\n\n            if self._segment_has_ad_identification(matched_segment.id):\n                self.logger.debug(\n                    \"Segment %s for post %s already has an ad identification; skipping new record.\",\n                    matched_segment.id,\n                    model_call.post_id,\n                )\n                continue\n\n            to_insert.append(\n                {\n                    \"transcript_segment_id\": matched_segment.id,\n                    \"model_call_id\": model_call.id,\n                    \"label\": \"ad\",\n                    \"confidence\": adjusted_confidence,\n                }\n            )\n\n            self._maybe_add_preroll_context(\n                matched_segment=matched_segment,\n                current_chunk_db_segments=current_chunk_db_segments,\n                model_call=model_call,\n                processed_segment_ids=processed_segment_ids,\n                matched_segments=matched_segments,\n                base_confidence=adjusted_confidence,\n                to_insert=to_insert,\n            )\n\n        if not to_insert:\n            return 0, matched_segments\n\n        res = writer_client.action(\n            \"insert_identifications\",\n            {\"identifications\": to_insert},\n            wait=True,\n        )\n        if not res or not res.success:\n            raise RuntimeError(\n                getattr(res, \"error\", \"Failed to insert identifications\")\n            )\n\n        inserted = int((res.data or {}).get(\"inserted\") or 0)\n        return inserted, matched_segments\n\n    def _adjust_confidence(\n        self, *, base_confidence: float, content_type: Optional[str]\n    ) -> float:\n        
\"\"\"Demote confidence for self-promo/educational contexts.\"\"\"\n        if not content_type:\n            return base_confidence\n\n        if content_type in {\"educational/self_promo\", \"technical_discussion\"}:\n            return max(0.0, base_confidence - 0.25)\n        if content_type == \"transition\":\n            return max(0.0, base_confidence - 0.1)\n        return base_confidence\n\n    def _maybe_add_preroll_context(\n        self,\n        *,\n        matched_segment: TranscriptSegment,\n        current_chunk_db_segments: List[TranscriptSegment],\n        model_call: ModelCall,\n        processed_segment_ids: Set[int],\n        matched_segments: List[TranscriptSegment],\n        base_confidence: float,\n        to_insert: List[Dict[str, Any]],\n    ) -> int:\n        \"\"\"If an ad is detected within the first 45s, include up to 3 preceding intro segments.\"\"\"\n        if matched_segment.start_time > 45.0:\n            return 0\n\n        created = 0\n        matched_index = current_chunk_db_segments.index(matched_segment)\n        start_index = max(0, matched_index - 3)\n        for seg in current_chunk_db_segments[start_index:matched_index]:\n            if seg.id in processed_segment_ids:\n                continue\n            if self._segment_has_ad_identification(seg.id):\n                continue\n\n            processed_segment_ids.add(seg.id)\n            matched_segments.append(seg)\n            to_insert.append(\n                {\n                    \"transcript_segment_id\": seg.id,\n                    \"model_call_id\": model_call.id,\n                    \"label\": \"ad\",\n                    \"confidence\": max(\n                        base_confidence, self.config.output.min_confidence\n                    ),\n                }\n            )\n            created += 1\n\n        if created:\n            self.logger.debug(\n                \"Pre-roll look-back added %s intro segments before %s (post %s)\",\n                
created,\n                matched_segment.sequence_num,\n                model_call.post_id,\n            )\n        return created\n\n    def _find_matching_segment(\n        self,\n        *,\n        segment_offset: float,\n        current_chunk_db_segments: List[TranscriptSegment],\n    ) -> Optional[TranscriptSegment]:\n        \"\"\"Find the TranscriptSegment that matches the given segment offset.\"\"\"\n        min_diff = float(\"inf\")\n        matched_segment = None\n        for ts_segment in current_chunk_db_segments:\n            diff = abs(ts_segment.start_time - segment_offset)\n            if diff < min_diff and diff < 0.5:  # Tolerance of 0.5 seconds\n                matched_segment = ts_segment\n                min_diff = diff\n        return matched_segment\n\n    def _segment_has_ad_identification(self, transcript_segment_id: int) -> bool:\n        \"\"\"Check if a transcript segment already has an ad identification.\n\n        NOTE: Uses self.db_session.query() for session consistency.\n        \"\"\"\n        return (\n            self.db_session.query(Identification)\n            .filter_by(\n                transcript_segment_id=transcript_segment_id,\n                label=\"ad\",\n            )\n            .first()\n            is not None\n        )\n\n    def _is_retryable_error(self, error: Exception) -> bool:\n        \"\"\"Determine if an error should be retried.\"\"\"\n        if isinstance(error, InternalServerError):\n            return True\n\n        # Check for retryable HTTP errors in other exception types\n        error_str = str(error).lower()\n        return (\n            \"503\" in error_str\n            or \"service unavailable\" in error_str\n            or \"rate_limit_error\" in error_str\n            or \"ratelimiterror\" in error_str\n            or \"429\" in error_str\n            or \"rate limit\" in error_str\n        )\n\n    def _call_model(\n        self,\n        model_call_obj: ModelCall,\n        
system_prompt: str,\n        max_retries: Optional[int] = None,\n    ) -> Optional[str]:\n        \"\"\"Call the LLM model with retry logic.\"\"\"\n        # Use configured retry count if not specified\n        retry_count = (\n            max_retries\n            if max_retries is not None\n            else getattr(self.config, \"llm_max_retry_attempts\", 3)\n        )\n\n        last_error: Optional[Exception] = None\n        raw_response_content = None\n        original_retry_attempts = (\n            0\n            if model_call_obj.retry_attempts is None\n            else model_call_obj.retry_attempts\n        )\n\n        for attempt in range(retry_count):\n            retry_attempts_value = original_retry_attempts + attempt + 1\n            current_attempt_num = attempt + 1\n\n            self.logger.info(\n                f\"Calling model {model_call_obj.model_name} for ModelCall {model_call_obj.id} (attempt {current_attempt_num}/{retry_count})\"\n            )\n\n            try:\n                # Persist retry attempt + pending status via writer\n                if model_call_obj.id is not None:\n                    pending_res = writer_client.update(\n                        \"ModelCall\",\n                        model_call_obj.id,\n                        {\"status\": \"pending\", \"retry_attempts\": retry_attempts_value},\n                        wait=True,\n                    )\n                    if not pending_res or not pending_res.success:\n                        raise RuntimeError(\n                            getattr(pending_res, \"error\", \"Failed to update ModelCall\")\n                        )\n\n                # Prepare API call and validate token limits\n                completion_args = self._prepare_api_call(model_call_obj, system_prompt)\n                if completion_args is None:\n                    return None  # Token limit exceeded\n\n                # Use concurrency limiter if available\n                if 
self.concurrency_limiter:\n                    with ConcurrencyContext(self.concurrency_limiter, timeout=30.0):\n                        response = litellm.completion(**completion_args)\n                else:\n                    response = litellm.completion(**completion_args)\n\n                response_first_choice = response.choices[0]\n                assert isinstance(response_first_choice, Choices)\n                content = response_first_choice.message.content\n                assert content is not None\n                raw_response_content = content\n\n                success_res = writer_client.update(\n                    \"ModelCall\",\n                    model_call_obj.id,\n                    {\n                        \"response\": raw_response_content,\n                        \"status\": \"success\",\n                        \"error_message\": None,\n                        \"retry_attempts\": retry_attempts_value,\n                    },\n                    wait=True,\n                )\n                if not success_res or not success_res.success:\n                    raise RuntimeError(\n                        getattr(success_res, \"error\", \"Failed to update ModelCall\")\n                    )\n                # Update local object to reflect database state\n                model_call_obj.status = \"success\"\n                model_call_obj.response = raw_response_content\n                model_call_obj.error_message = None\n                self.logger.info(\n                    f\"Model call {model_call_obj.id} successful on attempt {current_attempt_num}.\"\n                )\n                return raw_response_content\n\n            except Exception as e:\n                last_error = e\n                if self._is_retryable_error(e):\n                    self._handle_retryable_error(\n                        model_call_obj=model_call_obj,\n                        error=e,\n                        attempt=attempt,\n                     
   current_attempt_num=current_attempt_num,\n                    )\n                    # Continue to next retry\n                else:\n                    self.logger.error(\n                        f\"Non-retryable LLM error for ModelCall {model_call_obj.id} (attempt {current_attempt_num}): {e}\",\n                        exc_info=True,\n                    )\n                    fail_res = writer_client.update(\n                        \"ModelCall\",\n                        model_call_obj.id,\n                        {\"status\": \"failed_permanent\", \"error_message\": str(e)},\n                        wait=True,\n                    )\n                    if not fail_res or not fail_res.success:\n                        raise RuntimeError(\n                            getattr(fail_res, \"error\", \"Failed to update ModelCall\")\n                        ) from e\n                    # Update local object to reflect database state\n                    model_call_obj.status = \"failed_permanent\"\n                    model_call_obj.error_message = str(e)\n                    raise  # Re-raise non-retryable exceptions immediately\n\n        # If we get here, all retries were exhausted\n        self._handle_retry_exhausted(model_call_obj, retry_count, last_error)\n\n        if last_error:\n            raise last_error\n        raise RuntimeError(\n            f\"Maximum retries ({retry_count}) exceeded for ModelCall {model_call_obj.id}.\"\n        )\n\n    def _handle_retryable_error(\n        self,\n        *,\n        model_call_obj: ModelCall,\n        error: Union[InternalServerError, Exception],\n        attempt: int,\n        current_attempt_num: int,\n    ) -> None:\n        \"\"\"Handle a retryable error during LLM call.\"\"\"\n        self.logger.error(\n            f\"LLM retryable error for ModelCall {model_call_obj.id} (attempt {current_attempt_num}): {error}\"\n        )\n        res = writer_client.update(\n            \"ModelCall\",\n            
model_call_obj.id,\n            {\"error_message\": str(error)},\n            wait=True,\n        )\n        if not res or not res.success:\n            raise RuntimeError(getattr(res, \"error\", \"Failed to update ModelCall\"))\n        # Update local object to reflect database state\n        model_call_obj.error_message = str(error)\n\n        # Use longer backoff for rate limiting errors\n        error_str = str(error).lower()\n        if any(\n            term in error_str\n            for term in [\"rate_limit_error\", \"ratelimiterror\", \"429\", \"rate limit\"]\n        ):\n            # For rate limiting, use longer backoff: 60, 120, 240 seconds\n            wait_time = 60 * (2**attempt)\n            self.logger.info(\n                f\"Rate limit detected. Waiting {wait_time}s before retry for ModelCall {model_call_obj.id}.\"\n            )\n        else:\n            # For other errors, use shorter exponential backoff: 1, 2, 4 seconds\n            wait_time = (2**attempt) * 1\n            self.logger.info(\n                f\"Waiting {wait_time}s before next retry for ModelCall {model_call_obj.id}.\"\n            )\n\n        time.sleep(wait_time)\n\n    def _handle_retry_exhausted(\n        self,\n        model_call_obj: ModelCall,\n        max_retries: int,\n        last_error: Optional[Exception],\n    ) -> None:\n        \"\"\"Handle the case when all retries are exhausted.\"\"\"\n        self.logger.error(\n            f\"Failed to call model for ModelCall {model_call_obj.id} after {max_retries} attempts.\"\n        )\n        if last_error:\n            error_message = str(last_error)\n        else:\n            error_message = f\"Maximum retries ({max_retries}) exceeded without a specific InternalServerError.\"\n\n        res = writer_client.update(\n            \"ModelCall\",\n            model_call_obj.id,\n            {\"status\": \"failed_retries\", \"error_message\": error_message},\n            wait=True,\n        )\n        if not res or 
not res.success:\n            raise RuntimeError(getattr(res, \"error\", \"Failed to update ModelCall\"))\n        # Update local object to reflect database state\n        model_call_obj.status = \"failed_retries\"\n        model_call_obj.error_message = error_message\n\n    def _get_segments_bulk(\n        self, post_id: int, sequence_numbers: List[int]\n    ) -> Dict[int, TranscriptSegment]:\n        \"\"\"Fetch multiple segments in one query.\n\n        NOTE: Must use self.db_session.query() instead of TranscriptSegment.query\n        to ensure we use the same session. Using TranscriptSegment.query\n        (the Flask-SQLAlchemy scoped session) can lead to SQLite lock issues\n        when another query on self.db_session is mid-transaction.\n        \"\"\"\n        segments = (\n            self.db_session.query(TranscriptSegment)\n            .filter(\n                and_(\n                    TranscriptSegment.post_id == post_id,\n                    TranscriptSegment.sequence_num.in_(sequence_numbers),\n                )\n            )\n            .all()\n        )\n        return {seg.sequence_num: seg for seg in segments}\n\n    def _get_existing_ids_bulk(\n        self, post_id: int, model_call_id: int\n    ) -> Set[Tuple[int, int, str]]:\n        \"\"\"Fetch all existing identifications as a set for O(1) lookup.\n\n        NOTE: Uses self.db_session.query() for session consistency.\n        \"\"\"\n        ids = (\n            self.db_session.query(Identification)\n            .join(TranscriptSegment)\n            .filter(\n                and_(\n                    TranscriptSegment.post_id == post_id,\n                    Identification.model_call_id == model_call_id,\n                )\n            )\n            .all()\n        )\n        return {(i.transcript_segment_id, i.model_call_id, i.label) for i in ids}\n\n    def _create_identifications_bulk(\n        self, identifications: List[Dict[str, Any]]\n    ) -> int:\n        \"\"\"Bulk insert 
identifications\"\"\"\n        if not identifications:\n            return 0\n        res = writer_client.action(\n            \"insert_identifications\",\n            {\"identifications\": identifications},\n            wait=True,\n        )\n        if not res or not res.success:\n            raise RuntimeError(\n                getattr(res, \"error\", \"Failed to insert identifications\")\n            )\n        return int((res.data or {}).get(\"inserted\") or 0)\n\n    def expand_neighbors_bulk(\n        self,\n        ad_identifications: List[Identification],\n        model_call: ModelCall,\n        post_id: int,\n        window: int = 5,\n    ) -> int:\n        \"\"\"Expand neighbors using bulk operations (3 queries instead of 900)\"\"\"\n\n        # PHASE 1: Bulk data collection (2 queries)\n\n        # Collect all sequence numbers we need\n        sequence_numbers = set()\n        for ident in ad_identifications:\n            base_seq = ident.transcript_segment.sequence_num\n            for offset in range(-window, window + 1):\n                sequence_numbers.add(base_seq + offset)\n\n        # Query 1: Bulk fetch segments\n        segments_by_seq = self._get_segments_bulk(post_id, list(sequence_numbers))\n\n        # Query 2: Bulk fetch existing identifications\n        existing = self._get_existing_ids_bulk(post_id, model_call.id)\n\n        # PHASE 2: In-memory processing (0 queries)\n\n        to_create = []\n        for ident in ad_identifications:\n            base_seq = ident.transcript_segment.sequence_num\n\n            for offset in range(-window, window + 1):\n                if offset == 0:\n                    continue\n\n                neighbor_seq = base_seq + offset\n                seg = segments_by_seq.get(neighbor_seq)\n                if not seg:\n                    continue\n\n                # Check if already exists (O(1) lookup)\n                key = (seg.id, model_call.id, \"ad\")\n                if key in existing:\n          
          continue\n\n                text = seg.text or \"\"\n                signals = self.cue_detector.analyze(text)\n                has_strong_cue = (\n                    signals[\"url\"]\n                    or signals[\"promo\"]\n                    or signals[\"phone\"]\n                    or signals[\"cta\"]\n                )\n                is_transition = signals[\"transition\"]\n                is_self_promo = signals[\"self_promo\"]\n\n                gap_seconds = abs(\n                    (seg.start_time or 0.0)\n                    - (ident.transcript_segment.start_time or 0.0)\n                )\n\n                if not self._should_expand_neighbor(\n                    has_strong_cue=has_strong_cue,\n                    is_transition=is_transition,\n                    gap_seconds=gap_seconds,\n                ):\n                    continue\n\n                confidence = self._neighbor_confidence(\n                    has_strong_cue=has_strong_cue,\n                    is_transition=is_transition,\n                    is_self_promo=is_self_promo,\n                    gap_seconds=gap_seconds,\n                )\n\n                to_create.append(\n                    {\n                        \"transcript_segment_id\": seg.id,\n                        \"model_call_id\": model_call.id,\n                        \"label\": \"ad\",\n                        \"confidence\": confidence,\n                    }\n                )\n                existing.add(key)  # Avoid duplicates in this batch\n\n        # PHASE 3: Bulk insert (1 query)\n\n        if to_create:\n            return self._create_identifications_bulk(to_create)\n        return 0\n\n    def _should_expand_neighbor(\n        self,\n        *,\n        has_strong_cue: bool,\n        is_transition: bool,\n        gap_seconds: float,\n    ) -> bool:\n        if not self.config.enable_boundary_refinement:\n            return has_strong_cue\n\n        if has_strong_cue or 
is_transition:\n            return True\n\n        return gap_seconds <= 10.0\n\n    @staticmethod\n    def _neighbor_confidence(\n        *,\n        has_strong_cue: bool,\n        is_transition: bool,\n        is_self_promo: bool,\n        gap_seconds: float,\n    ) -> float:\n        confidence = 0.72 if is_transition else 0.75\n        if has_strong_cue:\n            confidence = 0.85 if gap_seconds <= 10.0 else 0.8\n        if is_self_promo:\n            confidence = max(0.5, confidence - 0.25)\n        return confidence\n\n    def _refine_boundaries(\n        self, transcript_segments: List[TranscriptSegment], post: Post\n    ) -> None:\n        \"\"\"Apply boundary refinement to detected ads.\n\n        NOTE: Uses self.db_session.query() for session consistency.\n        \"\"\"\n        if not self.boundary_refiner:\n            return\n\n        # Latest refined boundaries for downstream audio cuts. Overwrites prior\n        # values for the post (\"latest successful\" semantics).\n        refined_boundaries: List[Dict[str, Any]] = []\n\n        # Get ad identifications\n        identifications = (\n            self.db_session.query(Identification)\n            .join(TranscriptSegment)\n            .filter(TranscriptSegment.post_id == post.id, Identification.label == \"ad\")\n            .all()\n        )\n\n        # Group into ad blocks\n        ad_blocks = self._group_into_blocks(identifications)\n\n        for block in ad_blocks:\n            # Skip low confidence or very short blocks\n            if block[\"confidence\"] < 0.6 or (block[\"end\"] - block[\"start\"]) < 15.0:\n                continue\n\n            # Refine\n            seq_nums = [\n                ident.transcript_segment.sequence_num\n                for ident in block[\"identifications\"]\n                if ident.transcript_segment is not None\n            ]\n\n            refinement = self.boundary_refiner.refine(\n                ad_start=block[\"start\"],\n                
ad_end=block[\"end\"],\n                confidence=block[\"confidence\"],\n                all_segments=[\n                    {\n                        \"sequence_num\": s.sequence_num,\n                        \"start_time\": s.start_time,\n                        \"text\": s.text,\n                        \"end_time\": s.end_time,\n                    }\n                    for s in transcript_segments\n                ],\n                post_id=post.id,\n                first_seq_num=min(seq_nums) if seq_nums else None,\n                last_seq_num=max(seq_nums) if seq_nums else None,\n            )\n\n            # Apply refinement: delete old identifications, create new ones\n            # Note: Get model_call from block identifications\n            model_call = (\n                block[\"identifications\"][0].model_call\n                if block[\"identifications\"]\n                else None\n            )\n            if model_call:\n                self._apply_refinement(\n                    block, refinement, transcript_segments, post, model_call\n                )\n\n                refined_boundaries.append(\n                    {\n                        \"orig_start\": float(block[\"start\"]),\n                        \"orig_end\": float(block[\"end\"]),\n                        \"refined_start\": float(refinement.refined_start),\n                        \"refined_end\": float(refinement.refined_end),\n                        \"confidence\": float(block.get(\"confidence\", 0.0) or 0.0),\n                    }\n                )\n\n        # Store latest refined boundaries on the post so audio processing can cut\n        # using refined timestamps (including word-level refined start times).\n        # Clear the value when we have no refined boundaries so stale data doesn't\n        # affect future audio cuts.\n        try:\n            res = writer_client.update(\n                \"Post\",\n                post.id,\n                {\n             
       \"refined_ad_boundaries\": refined_boundaries or None,\n                    \"refined_ad_boundaries_updated_at\": datetime.utcnow(),\n                },\n                wait=True,\n            )\n            if not res or not res.success:\n                raise RuntimeError(\n                    getattr(res, \"error\", \"Failed to update refined ad boundaries\")\n                )\n        except Exception as exc:  # pylint: disable=broad-except\n            # Best-effort: cutting can fall back to segment-derived windows.\n            self.logger.warning(\n                \"Failed to persist refined ad boundaries for post %s: %s\",\n                post.id,\n                exc,\n            )\n\n    def _group_into_blocks(\n        self, identifications: List[Identification]\n    ) -> List[Dict[str, Any]]:\n        \"\"\"Group adjacent identifications into ad blocks\"\"\"\n        if not identifications:\n            return []\n\n        identifications = sorted(\n            identifications, key=lambda i: i.transcript_segment.start_time\n        )\n        blocks: List[Dict[str, Any]] = []\n        current: List[Identification] = []\n\n        for ident in identifications:\n            if (\n                not current\n                or ident.transcript_segment.start_time\n                - current[-1].transcript_segment.end_time\n                <= 10.0\n            ):\n                current.append(ident)\n            else:\n                blocks.append(self._create_block(current))\n                current = [ident]\n\n        if current:\n            blocks.append(self._create_block(current))\n\n        return blocks\n\n    def _create_block(self, identifications: List[Identification]) -> Dict[str, Any]:\n        return {\n            \"start\": min(i.transcript_segment.start_time for i in identifications),\n            \"end\": max(i.transcript_segment.end_time for i in identifications),\n            \"confidence\": sum(i.confidence for i in 
identifications)\n            / len(identifications),\n            \"identifications\": identifications,\n        }\n\n    def _apply_refinement(\n        self,\n        block: Dict[str, Any],\n        refinement: Any,\n        transcript_segments: List[TranscriptSegment],\n        post: Post,\n        model_call: ModelCall,\n    ) -> None:\n        \"\"\"Update identifications based on refined boundaries\"\"\"\n        delete_ids = [\n            i.id\n            for i in block.get(\"identifications\", [])\n            if getattr(i, \"id\", None) is not None\n        ]\n\n        new_identifications: List[Dict[str, Any]] = []\n        for seg in transcript_segments:\n            seg_start = float(seg.start_time or 0.0)\n            seg_end = float(seg.end_time or seg_start)\n            # Keep segments that overlap the refined window. This preserves the\n            # containing segment when refined boundaries fall mid-segment.\n            if seg_start <= float(refinement.refined_end) and seg_end >= float(\n                refinement.refined_start\n            ):\n                new_identifications.append(\n                    {\n                        \"transcript_segment_id\": seg.id,\n                        \"model_call_id\": model_call.id,\n                        \"label\": \"ad\",\n                        \"confidence\": block[\"confidence\"],\n                    }\n                )\n\n        res = writer_client.action(\n            \"replace_identifications\",\n            {\"delete_ids\": delete_ids, \"new_identifications\": new_identifications},\n            wait=True,\n        )\n        if not res or not res.success:\n            raise RuntimeError(\n                getattr(res, \"error\", \"Failed to replace identifications\")\n            )\n"
  },
  {
    "path": "src/podcast_processor/ad_merger.py",
"content": "import re\nfrom dataclasses import dataclass\nfrom typing import Dict, List, Pattern\n\nfrom app.models import Identification, TranscriptSegment\n\n\n@dataclass\nclass AdGroup:\n    segments: List[TranscriptSegment]\n    identifications: List[Identification]\n    start_time: float\n    end_time: float\n    confidence_avg: float\n    keywords: List[str]\n\n\nclass AdMerger:\n    def __init__(self) -> None:\n        self.url_pattern: Pattern[str] = re.compile(\n            r\"\\b([a-z0-9\\-\\.]+\\.(?:com|net|org|io))\\b\", re.I\n        )\n        # Non-capturing group so findall() returns the full promo phrase\n        # (e.g. \"code acme\") rather than just the trigger word.\n        self.promo_pattern: Pattern[str] = re.compile(\n            r\"\\b(?:code|promo|save)\\s+\\w+\\b\", re.I\n        )\n        self.phone_pattern: Pattern[str] = re.compile(r\"\\b\\d{3}[ -]?\\d{3}[ -]?\\d{4}\\b\")\n\n    def merge(\n        self,\n        ad_segments: List[TranscriptSegment],\n        identifications: List[Identification],\n        max_gap: float = 8.0,\n        min_content_gap: float = 12.0,\n    ) -> List[AdGroup]:\n        \"\"\"Merge ad segments using content analysis\"\"\"\n        if not ad_segments:\n            return []\n\n        # Sort by time\n        ad_segments = sorted(ad_segments, key=lambda s: s.start_time)\n\n        # Group by proximity\n        groups = self._group_by_proximity(ad_segments, identifications, max_gap)\n\n        # Refine using content analysis\n        groups = self._refine_by_content(groups, min_content_gap)\n\n        # Filter weak groups\n        return [g for g in groups if self._is_valid_group(g)]\n\n    def _group_by_proximity(\n        self,\n        segments: List[TranscriptSegment],\n        identifications: List[Identification],\n        max_gap: float,\n    ) -> List[AdGroup]:\n        \"\"\"Initial grouping by time proximity\"\"\"\n        id_lookup: Dict[int, Identification] = {\n            i.transcript_segment_id: i for i in identifications\n        }\n        groups: List[AdGroup] = []\n        current: List[TranscriptSegment] = []\n\n        
for seg in segments:\n            if not current or seg.start_time - current[-1].end_time <= max_gap:\n                current.append(seg)\n            else:\n                if current:\n                    groups.append(self._create_group(current, id_lookup))\n                current = [seg]\n\n        if current:\n            groups.append(self._create_group(current, id_lookup))\n\n        return groups\n\n    def _create_group(\n        self,\n        segments: List[TranscriptSegment],\n        id_lookup: Dict[int, Identification],\n    ) -> AdGroup:\n        ids = [id_lookup[s.id] for s in segments if s.id in id_lookup]\n        return AdGroup(\n            segments=segments,\n            identifications=ids,\n            start_time=segments[0].start_time,\n            end_time=segments[-1].end_time,\n            confidence_avg=sum(i.confidence for i in ids) / len(ids) if ids else 0.0,\n            keywords=self._extract_keywords(segments),\n        )\n\n    def _extract_keywords(self, segments: List[TranscriptSegment]) -> List[str]:\n        \"\"\"Extract URLs, promo codes, brands\"\"\"\n        text = \" \".join(s.text or \"\" for s in segments).lower()\n        keywords: List[str] = []\n\n        # URLs\n        keywords.extend(self.url_pattern.findall(text))\n\n        # Promo codes\n        keywords.extend(self.promo_pattern.findall(text))\n\n        # Phone numbers\n        if self.phone_pattern.search(text):\n            keywords.append(\"phone\")\n\n        # Brand names (capitalized words appearing 2+ times)\n        words = re.findall(r\"\\b[A-Z][a-z]+\\b\", \" \".join(s.text or \"\" for s in segments))\n        counts: Dict[str, int] = {}\n        for word in words:\n            if len(word) > 3:\n                counts[word] = counts.get(word, 0) + 1\n        keywords.extend(w.lower() for w, c in counts.items() if c >= 2)\n\n        return list(set(keywords))\n\n    def _refine_by_content(\n        self, groups: List[AdGroup], min_content_gap: float\n    ) 
-> List[AdGroup]:\n        \"\"\"Merge groups with shared sponsors\"\"\"\n        if len(groups) <= 1:\n            return groups\n\n        refined: List[AdGroup] = []\n        i = 0\n\n        while i < len(groups):\n            current = groups[i]\n\n            if i + 1 < len(groups):\n                next_group = groups[i + 1]\n                gap = next_group.start_time - current.end_time\n\n                if gap <= min_content_gap and self._should_merge(current, next_group):\n                    # Merge\n                    merged = AdGroup(\n                        segments=current.segments + next_group.segments,\n                        identifications=current.identifications\n                        + next_group.identifications,\n                        start_time=current.start_time,\n                        end_time=next_group.end_time,\n                        confidence_avg=(\n                            current.confidence_avg + next_group.confidence_avg\n                        )\n                        / 2,\n                        keywords=list(set(current.keywords + next_group.keywords)),\n                    )\n                    refined.append(merged)\n                    i += 2\n                else:\n                    refined.append(current)\n                    i += 1\n            else:\n                refined.append(current)\n                i += 1\n\n        return refined\n\n    def _should_merge(self, group1: AdGroup, group2: AdGroup) -> bool:\n        \"\"\"Check if groups belong to same sponsor\"\"\"\n        # High confidence → merge\n        if group1.confidence_avg >= 0.9 and group2.confidence_avg >= 0.9:\n            return True\n\n        # Shared keywords (URL or brand)\n        shared = set(group1.keywords) & set(group2.keywords)\n        if len(shared) >= 1:\n            return True\n\n        # Small gap with good confidence\n        gap = group2.start_time - group1.end_time\n        if (\n            gap <= 10.0\n         
   and group1.confidence_avg >= 0.8\n            and group2.confidence_avg >= 0.8\n        ):\n            return True\n\n        return False\n\n    def _is_valid_group(self, group: AdGroup) -> bool:\n        \"\"\"Filter out weak single-segment groups\"\"\"\n        duration = group.end_time - group.start_time\n        if duration > 180.0 and not group.keywords and group.confidence_avg < 0.9:\n            # Long sponsor monologues without clear cues are likely educational/self-promo\n            return False\n        if len(group.segments) < 2 or duration <= 10.0:\n            # Keep only if has strong keywords or high confidence\n            return len(group.keywords) >= 1 or group.confidence_avg >= 0.9\n        return True\n"
  },
  {
    "path": "src/podcast_processor/audio.py",
    "content": "import logging\nimport math\nimport os\nimport tempfile\nfrom pathlib import Path\nfrom typing import List, Optional, Tuple\n\nimport ffmpeg  # type: ignore[import-untyped]\n\nlogger = logging.getLogger(\"global_logger\")\n\n\ndef get_audio_duration_ms(file_path: str) -> Optional[int]:\n    try:\n        logger.debug(\"[FFMPEG_PROBE] Probing audio file: %s\", file_path)\n        probe = ffmpeg.probe(file_path)\n        format_info = probe[\"format\"]\n        duration_seconds = float(format_info[\"duration\"])\n        duration_milliseconds = duration_seconds * 1000\n        logger.debug(\"[FFMPEG_PROBE] Duration: %.2f seconds\", duration_seconds)\n        return int(duration_milliseconds)\n    except ffmpeg.Error as e:\n        logger.error(\n            \"[FFMPEG_PROBE] Error probing file %s: %s\",\n            file_path,\n            e.stderr.decode() if e.stderr else str(e),\n        )\n        return None\n\n\ndef clip_segments_with_fade(\n    ad_segments_ms: List[Tuple[int, int]],\n    fade_ms: int,\n    in_path: str,\n    out_path: str,\n) -> None:\n\n    audio_duration_ms = get_audio_duration_ms(in_path)\n    assert audio_duration_ms is not None\n\n    # Try the complex filter approach first, fall back to simple if it fails\n    # Catch both ffmpeg.Error (runtime) and broader exceptions (filter graph construction)\n    try:\n        _clip_segments_complex(\n            ad_segments_ms, fade_ms, in_path, out_path, audio_duration_ms\n        )\n    except ffmpeg.Error as e:\n        err_msg = e.stderr.decode() if getattr(e, \"stderr\", None) else str(e)\n        logger.warning(\n            \"Complex filter failed (ffmpeg error), trying simple approach: %s\", err_msg\n        )\n        _clip_segments_simple(ad_segments_ms, in_path, out_path, audio_duration_ms)\n    except Exception as e:  # pylint: disable=broad-except\n        # Catches filter graph construction errors like \"multiple outgoing edges\"\n        logger.warning(\n            
\"Complex filter failed (graph error), trying simple approach: %s\", e\n        )\n        _clip_segments_simple(ad_segments_ms, in_path, out_path, audio_duration_ms)\n\n\ndef _clip_segments_complex(\n    ad_segments_ms: List[Tuple[int, int]],\n    fade_ms: int,\n    in_path: str,\n    out_path: str,\n    audio_duration_ms: int,\n) -> None:\n    \"\"\"Original complex approach with fades.\"\"\"\n\n    trimmed_list = []\n\n    last_end = 0\n    for start_ms, end_ms in ad_segments_ms:\n        trimmed_list.extend(\n            [\n                ffmpeg.input(in_path).filter(\n                    \"atrim\", start=last_end / 1000.0, end=start_ms / 1000.0\n                ),\n                ffmpeg.input(in_path)\n                .filter(\n                    \"atrim\", start=start_ms / 1000.0, end=(start_ms + fade_ms) / 1000.0\n                )\n                .filter(\"afade\", t=\"out\", ss=0, d=fade_ms / 1000.0),\n                ffmpeg.input(in_path)\n                .filter(\"atrim\", start=(end_ms - fade_ms) / 1000.0, end=end_ms / 1000.0)\n                .filter(\"afade\", t=\"in\", ss=0, d=fade_ms / 1000.0),\n            ]\n        )\n\n        last_end = end_ms\n\n    if last_end != audio_duration_ms:\n        trimmed_list.append(\n            ffmpeg.input(in_path).filter(\n                \"atrim\", start=last_end / 1000.0, end=audio_duration_ms / 1000.0\n            )\n        )\n\n    logger.info(\n        \"[FFMPEG_CONCAT] Starting audio concatenation: %s -> %s (%d segments)\",\n        in_path,\n        out_path,\n        len(trimmed_list),\n    )\n    ffmpeg.concat(*trimmed_list, v=0, a=1).output(out_path).overwrite_output().run()\n    logger.info(\"[FFMPEG_CONCAT] Completed audio concatenation: %s\", out_path)\n\n\ndef _clip_segments_simple(\n    ad_segments_ms: List[Tuple[int, int]],\n    in_path: str,\n    out_path: str,\n    audio_duration_ms: int,\n) -> None:\n    \"\"\"Simpler approach without fades - more reliable for many segments.\"\"\"\n\n    
# Build list of segments to keep (inverse of ad segments)\n    keep_segments: List[Tuple[int, int]] = []\n    last_end = 0\n\n    for start_ms, end_ms in ad_segments_ms:\n        if start_ms > last_end:\n            keep_segments.append((last_end, start_ms))\n        last_end = end_ms\n\n    if last_end < audio_duration_ms:\n        keep_segments.append((last_end, audio_duration_ms))\n\n    if not keep_segments:\n        raise ValueError(\"No audio segments to keep after ad removal\")\n\n    logger.info(\n        \"[FFMPEG_SIMPLE] Starting simple concat with %d segments\", len(keep_segments)\n    )\n\n    # Create temp directory for intermediate files\n    with tempfile.TemporaryDirectory() as temp_dir:\n        segment_files = []\n\n        # Extract each segment to keep\n        for i, (start_ms, end_ms) in enumerate(keep_segments):\n            segment_path = os.path.join(temp_dir, f\"segment_{i}.mp3\")\n            start_sec = start_ms / 1000.0\n            duration_sec = (end_ms - start_ms) / 1000.0\n\n            (\n                ffmpeg.input(in_path)\n                .output(\n                    segment_path, ss=start_sec, t=duration_sec, acodec=\"libmp3lame\", q=2\n                )\n                .overwrite_output()\n                .run(quiet=True)\n            )\n\n            segment_files.append(segment_path)\n\n        # Create concat file list\n        concat_list_path = os.path.join(temp_dir, \"concat_list.txt\")\n        with open(concat_list_path, \"w\", encoding=\"utf-8\") as file_list:\n            for seg_file in segment_files:\n                file_list.write(f\"file '{seg_file}'\\n\")\n\n        # Concatenate all segments\n        (\n            ffmpeg.input(concat_list_path, format=\"concat\", safe=0)\n            .output(out_path, acodec=\"libmp3lame\", q=2)\n            .overwrite_output()\n            .run(quiet=True)\n        )\n\n    logger.info(\"[FFMPEG_SIMPLE] Completed simple audio concatenation: %s\", out_path)\n\n\ndef 
trim_file(in_path: Path, out_path: Path, start_ms: int, end_ms: int) -> None:\n    duration_ms = end_ms - start_ms\n\n    if duration_ms <= 0:\n        return\n\n    start_sec = max(start_ms, 0) / 1000.0\n    duration_sec = duration_ms / 1000.0\n\n    logger.debug(\n        \"[FFMPEG_TRIM] Trimming %s -> %s (start=%.2fs, duration=%.2fs)\",\n        in_path,\n        out_path,\n        start_sec,\n        duration_sec,\n    )\n    (\n        ffmpeg.input(str(in_path))\n        .output(\n            str(out_path),\n            ss=start_sec,\n            t=duration_sec,\n            acodec=\"copy\",\n            vn=None,\n        )\n        .overwrite_output()\n        .run()\n    )\n\n\ndef split_audio(\n    audio_file_path: Path,\n    audio_chunk_path: Path,\n    chunk_size_bytes: int,\n) -> List[Tuple[Path, int]]:\n\n    audio_chunk_path.mkdir(parents=True, exist_ok=True)\n\n    logger.info(\n        \"[FFMPEG_SPLIT] Splitting audio file: %s into chunks of %d bytes\",\n        audio_file_path,\n        chunk_size_bytes,\n    )\n    duration_ms = get_audio_duration_ms(str(audio_file_path))\n    assert duration_ms is not None\n    if chunk_size_bytes <= 0:\n        raise ValueError(\"chunk_size_bytes must be a positive integer\")\n\n    file_size_bytes = audio_file_path.stat().st_size\n    if file_size_bytes == 0:\n        raise ValueError(\"Cannot split zero-byte audio file\")\n\n    chunk_ratio = chunk_size_bytes / file_size_bytes\n    chunk_duration_ms = max(1, math.ceil(duration_ms * chunk_ratio))\n\n    num_chunks = max(1, math.ceil(duration_ms / chunk_duration_ms))\n    logger.info(\n        \"[FFMPEG_SPLIT] Will create %d chunks (duration per chunk: %d ms)\",\n        num_chunks,\n        chunk_duration_ms,\n    )\n\n    chunks: List[Tuple[Path, int]] = []\n\n    for i in range(num_chunks):\n        start_offset_ms = i * chunk_duration_ms\n        if start_offset_ms >= duration_ms:\n            break\n\n        end_offset_ms = min(duration_ms, (i + 1) * 
chunk_duration_ms)\n\n        export_path = audio_chunk_path / f\"{i}.mp3\"\n        logger.debug(\n            \"[FFMPEG_SPLIT] Creating chunk %d/%d: %s\", i + 1, num_chunks, export_path\n        )\n        trim_file(audio_file_path, export_path, start_offset_ms, end_offset_ms)\n        chunks.append((export_path, start_offset_ms))\n\n    logger.info(\"[FFMPEG_SPLIT] Split complete: created %d chunks\", len(chunks))\n    return chunks\n"
  },
  {
    "path": "src/podcast_processor/audio_processor.py",
"content": "import logging\nfrom typing import Any, List, Optional, Tuple\n\nfrom app.extensions import db\nfrom app.models import Identification, ModelCall, Post, TranscriptSegment\nfrom app.writer.client import writer_client\nfrom podcast_processor.ad_merger import AdMerger\nfrom podcast_processor.audio import clip_segments_with_fade, get_audio_duration_ms\nfrom shared.config import Config\n\n\nclass AudioProcessor:\n    \"\"\"Handles audio processing and ad segment removal from podcast files.\"\"\"\n\n    def __init__(\n        self,\n        config: Config,\n        logger: Optional[logging.Logger] = None,\n        identification_query: Optional[Any] = None,\n        transcript_segment_query: Optional[Any] = None,\n        model_call_query: Optional[Any] = None,\n        db_session: Optional[Any] = None,\n    ):\n        self.logger = logger or logging.getLogger(\"global_logger\")\n        self.config = config\n        self._identification_query_provided = identification_query is not None\n        self.identification_query = identification_query or Identification.query\n        self.transcript_segment_query = (\n            transcript_segment_query or TranscriptSegment.query\n        )\n        self.model_call_query = model_call_query or ModelCall.query\n        self.db_session = db_session or db.session\n        self.ad_merger = AdMerger()\n\n    def get_ad_segments(self, post: Post) -> List[Tuple[float, float]]:\n        \"\"\"\n        Retrieves ad segments from the database for a given post.\n\n        NOTE: Unless an identification_query was injected (e.g. in tests), this\n        uses self.db_session.query() so that all operations share one session.\n\n        Args:\n            post: The Post object to retrieve ad segments for\n\n        Returns:\n            A list of tuples containing start and end times (in seconds) of ad segments\n        \"\"\"\n        self.logger.info(f\"Retrieving ad segments from database for post {post.id}.\")\n\n        query = (\n   
         self.identification_query\n            if self._identification_query_provided\n            else self.db_session.query(Identification)\n        )\n\n        ad_identifications = (\n            query.join(\n                TranscriptSegment,\n                Identification.transcript_segment_id == TranscriptSegment.id,\n            )\n            .join(ModelCall, Identification.model_call_id == ModelCall.id)\n            .filter(\n                TranscriptSegment.post_id == post.id,\n                Identification.label == \"ad\",\n                Identification.confidence >= self.config.output.min_confidence,\n                ModelCall.status\n                == \"success\",  # Only consider identifications from successful LLM calls\n            )\n            .all()\n        )\n\n        if not ad_identifications:\n            self.logger.info(\n                f\"No ad segments found meeting criteria for post {post.id}.\"\n            )\n            return []\n\n        # Get full segment objects with text for content analysis\n        # Filter out any identifications with missing segments (DB integrity check)\n        ad_segments_with_text = []\n        valid_identifications = []\n        for ident in ad_identifications:\n            segment = ident.transcript_segment\n            if segment:\n                ad_segments_with_text.append(segment)\n                valid_identifications.append(ident)\n            else:\n                # This should ideally not happen if DB integrity is maintained\n                self.logger.warning(\n                    f\"Identification {ident.id} for post {post.id} refers to a missing TranscriptSegment {ident.transcript_segment_id}. 
Skipping.\"\n                )\n\n        if not ad_segments_with_text:\n            self.logger.info(\n                f\"No valid ad segments with transcript data for post {post.id}.\"\n            )\n            return []\n\n        # Content-aware merge\n        ad_groups = self.ad_merger.merge(\n            ad_segments=ad_segments_with_text,\n            identifications=valid_identifications,\n            max_gap=float(self.config.output.min_ad_segment_separation_seconds),\n            min_content_gap=12.0,\n        )\n\n        # If boundary refinement persisted refined windows on the post, prefer those\n        # refined timestamps for audio cutting (this allows word-level refinement to\n        # affect the actual cut start time).\n        if getattr(self.config, \"enable_boundary_refinement\", False):\n            self._apply_refined_boundaries(post, ad_groups)\n\n        self.logger.info(\n            f\"Merged {len(ad_segments_with_text)} segments into {len(ad_groups)} groups for post {post.id}\"\n        )\n\n        # Convert to time tuples for merge_ad_segments()\n        ad_segments_times = [(g.start_time, g.end_time) for g in ad_groups]\n        ad_segments_times.sort(key=lambda x: x[0])\n        return ad_segments_times\n\n    def _apply_refined_boundaries(self, post: Post, ad_groups: Any) -> None:\n        post_row = self._safe_get_post_row(post)\n        refined = getattr(post_row, \"refined_ad_boundaries\", None) if post_row else None\n        parsed = self._parse_refined_boundaries(refined)\n        if not parsed:\n            return\n\n        for group in ad_groups:\n            overlap_window = self._refined_overlap_window_for_group(group, parsed)\n            if overlap_window is None:\n                continue\n            refined_start_min, refined_end_max = overlap_window\n\n            new_start = max(group.start_time, refined_start_min)\n            new_end = min(group.end_time, refined_end_max)\n            if new_end > new_start:\n   
             group.start_time = new_start\n                group.end_time = new_end\n\n    def _safe_get_post_row(self, post: Post) -> Optional[Post]:\n        try:\n            return self.db_session.get(Post, post.id)\n        except Exception:  # pylint: disable=broad-except\n            return None\n\n    @staticmethod\n    def _parse_refined_boundaries(\n        refined: Any,\n    ) -> List[Tuple[float, float, float, float]]:\n        if not refined or not isinstance(refined, list):\n            return []\n\n        parsed: List[Tuple[float, float, float, float]] = []\n        for item in refined:\n            if not isinstance(item, dict):\n                continue\n\n            orig_start_raw = item.get(\"orig_start\")\n            orig_end_raw = item.get(\"orig_end\")\n            refined_start_raw = item.get(\"refined_start\")\n            refined_end_raw = item.get(\"refined_end\")\n            if (\n                orig_start_raw is None\n                or orig_end_raw is None\n                or refined_start_raw is None\n                or refined_end_raw is None\n            ):\n                continue\n\n            try:\n                orig_start = float(orig_start_raw)\n                orig_end = float(orig_end_raw)\n                refined_start = float(refined_start_raw)\n                refined_end = float(refined_end_raw)\n            except Exception:  # pylint: disable=broad-except\n                continue\n\n            if refined_end <= refined_start:\n                continue\n\n            parsed.append((orig_start, orig_end, refined_start, refined_end))\n\n        return parsed\n\n    @staticmethod\n    def _refined_overlap_window_for_group(\n        group: Any,\n        parsed: List[Tuple[float, float, float, float]],\n    ) -> Optional[Tuple[float, float]]:\n        overlaps: List[Tuple[float, float]] = []\n        for orig_start, orig_end, refined_start, refined_end in parsed:\n            overlap = max(\n                0.0,\n   
             min(group.end_time, orig_end) - max(group.start_time, orig_start),\n            )\n            if overlap > 0.0:\n                overlaps.append((refined_start, refined_end))\n\n        if not overlaps:\n            return None\n\n        refined_start_min = min(s for s, _ in overlaps)\n        refined_end_max = max(e for _, e in overlaps)\n        return refined_start_min, refined_end_max\n\n    def merge_ad_segments(\n        self,\n        *,\n        duration_ms: int,\n        ad_segments: List[Tuple[float, float]],\n        min_ad_segment_length_seconds: float,\n        min_ad_segment_separation_seconds: float,\n    ) -> List[Tuple[int, int]]:\n        \"\"\"\n        Merges nearby ad segments and filters out segments that are too short.\n\n        Args:\n            duration_ms: Duration of the audio in milliseconds\n            ad_segments: List of ad segments as (start, end) tuples in seconds\n            min_ad_segment_length_seconds: Minimum length of an ad segment to retain\n            min_ad_segment_separation_seconds: Minimum separation between segments before merging\n\n        Returns:\n            List of merged ad segments as (start, end) tuples in milliseconds\n        \"\"\"\n        audio_duration_seconds = duration_ms / 1000.0\n\n        self.logger.info(\n            f\"Creating new audio with ads segments removed between: {ad_segments}\"\n        )\n        if not ad_segments:\n            return []\n\n        ad_segments = sorted(ad_segments)\n\n        last_segment = self._get_last_segment_if_near_end(\n            ad_segments,\n            audio_duration_seconds=audio_duration_seconds,\n            min_separation=min_ad_segment_separation_seconds,\n        )\n\n        ad_segments = self._merge_close_segments(\n            ad_segments, min_separation=min_ad_segment_separation_seconds\n        )\n        ad_segments = self._filter_short_segments(\n            ad_segments, min_length=min_ad_segment_length_seconds\n        )\n  
      ad_segments = self._restore_last_segment_if_needed(ad_segments, last_segment)\n        ad_segments = self._extend_last_segment_to_end_if_needed(\n            ad_segments,\n            audio_duration_seconds=audio_duration_seconds,\n            min_separation=min_ad_segment_separation_seconds,\n        )\n\n        self.logger.info(f\"Joined ad segments into: {ad_segments}\")\n        return [(int(start * 1000), int(end * 1000)) for start, end in ad_segments]\n\n    def _get_last_segment_if_near_end(\n        self,\n        ad_segments: List[Tuple[float, float]],\n        *,\n        audio_duration_seconds: float,\n        min_separation: float,\n    ) -> Optional[Tuple[float, float]]:\n        if not ad_segments:\n            return None\n        if (audio_duration_seconds - ad_segments[-1][1]) < min_separation:\n            return ad_segments[-1]\n        return None\n\n    def _merge_close_segments(\n        self,\n        ad_segments: List[Tuple[float, float]],\n        *,\n        min_separation: float,\n    ) -> List[Tuple[float, float]]:\n        merged = list(ad_segments)\n        i = 0\n        while i < len(merged) - 1:\n            if merged[i][1] + min_separation >= merged[i + 1][0]:\n                merged[i] = (merged[i][0], merged[i + 1][1])\n                merged.pop(i + 1)\n            else:\n                i += 1\n        return merged\n\n    def _filter_short_segments(\n        self,\n        ad_segments: List[Tuple[float, float]],\n        *,\n        min_length: float,\n    ) -> List[Tuple[float, float]]:\n        return [s for s in ad_segments if (s[1] - s[0]) >= min_length]\n\n    def _restore_last_segment_if_needed(\n        self,\n        ad_segments: List[Tuple[float, float]],\n        last_segment: Optional[Tuple[float, float]],\n    ) -> List[Tuple[float, float]]:\n        if last_segment is None:\n            return ad_segments\n        if not ad_segments or ad_segments[-1] != last_segment:\n            return [*ad_segments, 
last_segment]\n        return ad_segments\n\n    def _extend_last_segment_to_end_if_needed(\n        self,\n        ad_segments: List[Tuple[float, float]],\n        *,\n        audio_duration_seconds: float,\n        min_separation: float,\n    ) -> List[Tuple[float, float]]:\n        if not ad_segments:\n            return ad_segments\n        if (audio_duration_seconds - ad_segments[-1][1]) < min_separation:\n            return [*ad_segments[:-1], (ad_segments[-1][0], audio_duration_seconds)]\n        return ad_segments\n\n    def process_audio(self, post: Post, output_path: str) -> None:\n        \"\"\"\n        Process the podcast audio by removing ad segments.\n\n        Args:\n            post: The Post object containing the podcast to process\n            output_path: Path where the processed audio file should be saved\n        \"\"\"\n        ad_segments = self.get_ad_segments(post)\n\n        duration_ms = get_audio_duration_ms(post.unprocessed_audio_path)\n        if duration_ms is None:\n            raise ValueError(\n                f\"Could not determine duration for audio: {post.unprocessed_audio_path}\"\n            )\n\n        # Store duration in seconds\n        post.duration = duration_ms / 1000.0\n\n        merged_ad_segments = self.merge_ad_segments(\n            duration_ms=duration_ms,\n            ad_segments=ad_segments,\n            min_ad_segment_length_seconds=float(\n                self.config.output.min_ad_segment_length_seconds\n            ),\n            min_ad_segment_separation_seconds=float(\n                self.config.output.min_ad_segment_separation_seconds\n            ),\n        )\n\n        clip_segments_with_fade(\n            in_path=post.unprocessed_audio_path,\n            ad_segments_ms=merged_ad_segments,\n            fade_ms=self.config.output.fade_ms,\n            out_path=output_path,\n        )\n\n        post.processed_audio_path = output_path\n        result = writer_client.update(\n            \"Post\",\n    
        post.id,\n            {\"processed_audio_path\": output_path, \"duration\": post.duration},\n            wait=True,\n        )\n        if not result or not result.success:\n            raise RuntimeError(getattr(result, \"error\", \"Failed to update post\"))\n        try:\n            self.db_session.expire(post)\n        except Exception:  # pylint: disable=broad-except\n            pass\n\n        self.logger.info(\n            f\"Audio processing complete for post {post.id}, saved to {output_path}\"\n        )\n"
  },
  {
    "path": "src/podcast_processor/boundary_refiner.py",
    "content": "\"\"\"LLM-based boundary refiner.\n\nNote: We intentionally share some call-setup patterns with WordBoundaryRefiner.\nPylint may flag these as R0801 (duplicate-code); we ignore that for this module.\n\"\"\"\n\n# pylint: disable=duplicate-code\n\nimport json\nimport logging\nimport re\nfrom dataclasses import dataclass\nfrom pathlib import Path\nfrom typing import Any, Dict, List, Optional\n\nimport litellm\nfrom jinja2 import Template\n\nfrom app.writer.client import writer_client\nfrom shared.config import Config\n\n# Internal defaults for boundary expansion; not user-configurable.\nMAX_START_EXTENSION_SECONDS = 30.0\nMAX_END_EXTENSION_SECONDS = 15.0\n\n\n@dataclass\nclass BoundaryRefinement:\n    refined_start: float\n    refined_end: float\n    start_adjustment_reason: str\n    end_adjustment_reason: str\n\n\nclass BoundaryRefiner:\n    def __init__(self, config: Config, logger: Optional[logging.Logger] = None):\n        self.config = config\n        self.logger = logger or logging.getLogger(__name__)\n        self.template = self._load_template()\n\n    def _load_template(self) -> Template:\n        path = (\n            Path(__file__).resolve().parent.parent  # project src root\n            / \"boundary_refinement_prompt.jinja\"\n        )\n        if path.exists():\n            return Template(path.read_text())\n        # Minimal fallback\n        return Template(\n            \"\"\"Refine ad boundaries.\nAd: {{ad_start}}s-{{ad_end}}s\n{% for seg in context_segments %}[{{seg.start_time}}] {{seg.text}}\n{% endfor %}\nReturn JSON: {\"refined_start\": {{ad_start}}, \"refined_end\": {{ad_end}}, \"start_reason\": \"\", \"end_reason\": \"\"}\"\"\"\n        )\n\n    def refine(\n        self,\n        ad_start: float,\n        ad_end: float,\n        confidence: float,\n        all_segments: List[Dict[str, Any]],\n        *,\n        post_id: Optional[int] = None,\n        first_seq_num: Optional[int] = None,\n        last_seq_num: Optional[int] = 
None,\n    ) -> BoundaryRefinement:\n        \"\"\"Refine ad boundaries using LLM analysis and record the call in ModelCall.\"\"\"\n        self.logger.debug(\n            \"Refining boundaries\",\n            extra={\n                \"ad_start\": ad_start,\n                \"ad_end\": ad_end,\n                \"confidence\": confidence,\n                \"segments_count\": len(all_segments),\n            },\n        )\n        context = self._get_context(ad_start, ad_end, all_segments)\n        self.logger.debug(\n            \"Context window selected\",\n            extra={\n                \"context_size\": len(context),\n                \"first_seg\": context[0] if context else None,\n            },\n        )\n\n        prompt = self.template.render(\n            ad_start=ad_start,\n            ad_end=ad_end,\n            ad_confidence=confidence,\n            context_segments=context,\n        )\n\n        model_call_id: Optional[int] = None\n        raw_response: Optional[str] = None\n\n        # Record the intent to call the LLM when we have enough context to do so\n        if (\n            post_id is not None\n            and first_seq_num is not None\n            and last_seq_num is not None\n        ):\n            try:\n                res = writer_client.action(\n                    \"upsert_model_call\",\n                    {\n                        \"post_id\": post_id,\n                        \"model_name\": self.config.llm_model,\n                        \"first_segment_sequence_num\": first_seq_num,\n                        \"last_segment_sequence_num\": last_seq_num,\n                        \"prompt\": prompt,\n                    },\n                    wait=True,\n                )\n                if res and res.success:\n                    model_call_id = (res.data or {}).get(\"model_call_id\")\n            except Exception as e:  # best-effort; do not block refinement\n                self.logger.warning(\n                    
\"Boundary refine: failed to upsert ModelCall: %s\", e\n                )\n\n        try:\n            response = litellm.completion(\n                model=self.config.llm_model,\n                messages=[{\"role\": \"user\", \"content\": prompt}],\n                temperature=0.1,\n                max_tokens=4096,\n                timeout=self.config.openai_timeout,\n                api_key=self.config.llm_api_key,\n                base_url=self.config.openai_base_url,\n            )\n\n            choice = response.choices[0] if response.choices else None\n            content = \"\"\n            if choice:\n                # Prefer chat content; fall back to text for completion-style responses\n                content = (\n                    getattr(getattr(choice, \"message\", None), \"content\", None) or \"\"\n                )\n                if not content:\n                    content = getattr(choice, \"text\", \"\") or \"\"\n            raw_response = content\n            self.logger.debug(\n                \"LLM response received\",\n                extra={\n                    \"model\": self.config.llm_model,\n                    \"content_preview\": content[:200],\n                },\n            )\n            # Full response for debugging parse issues; remove or redact if noisy.\n            raw_preview = content[:1000]\n            self.logger.debug(\n                \"LLM response raw (%s chars, preview up to 1000): %r\",\n                len(content),\n                raw_preview,\n                extra={\"model\": self.config.llm_model},\n            )\n            # Log the full response object so provider quirks are visible.\n            try:\n                response_payload = (\n                    response.model_dump()\n                    if hasattr(response, \"model_dump\")\n                    else response\n                )\n                self.logger.debug(\n                    \"LLM full response object\",\n                    
extra={\"response_payload\": response_payload},\n                )\n            except Exception:\n                self.logger.debug(\"LLM full response object unavailable\", exc_info=True)\n            # Persist the raw response immediately so it's available even if parsing fails.\n            self._update_model_call(\n                model_call_id,\n                status=\"received_response\",\n                response=raw_response,\n                error_message=None,\n            )\n            # Parse JSON (strip markdown fences). Log parse diagnostics so failures are actionable.\n            cleaned = re.sub(r\"```json|```\", \"\", content.strip())\n            json_candidates = re.findall(r\"\\{.*?\\}\", cleaned, re.DOTALL)\n            parse_error: Optional[str] = None\n            parsed: Optional[Dict[str, Any]] = None\n\n            for candidate in json_candidates:\n                try:\n                    parsed = json.loads(candidate)\n                    break\n                except Exception as exc:  # capture the last parse error for logging\n                    parse_error = str(exc)\n\n            if parsed:\n                refined = self._validate(\n                    ad_start,\n                    ad_end,\n                    BoundaryRefinement(\n                        refined_start=float(parsed[\"refined_start\"]),\n                        refined_end=float(parsed[\"refined_end\"]),\n                        start_adjustment_reason=parsed.get(\n                            \"start_adjustment_reason\", parsed.get(\"start_reason\", \"\")\n                        ),\n                        end_adjustment_reason=parsed.get(\n                            \"end_adjustment_reason\", parsed.get(\"end_reason\", \"\")\n                        ),\n                    ),\n                )\n                self._update_model_call(\n                    model_call_id,\n                    status=\"success\",\n                    response=raw_response,\n 
                   error_message=None,\n                )\n                self.logger.info(\n                    \"LLM refinement applied\",\n                    extra={\n                        \"refined_start\": refined.refined_start,\n                        \"refined_end\": refined.refined_end,\n                    },\n                )\n                return refined\n\n            self.logger.warning(\n                \"Boundary refinement LLM response had no parseable JSON; falling back to heuristic\",\n                extra={\n                    \"model_call_id\": model_call_id,\n                    \"ad_start\": ad_start,\n                    \"ad_end\": ad_end,\n                    \"json_candidate_count\": len(json_candidates),\n                    \"parse_error\": parse_error,\n                    \"first_candidate_preview\": (\n                        json_candidates[0][:200] if json_candidates else None\n                    ),\n                    \"content_preview\": (content or \"\")[:200],\n                    \"raw_response\": raw_response,\n                    \"raw_response_len\": len(content),\n                },\n            )\n            # Also emit the raw response in-band so it shows up in plain-text logs.\n            self.logger.debug(\n                \"Boundary refinement raw response (len=%s): %r\",\n                len(content),\n                raw_preview,\n                extra={\"model_call_id\": model_call_id},\n            )\n            self._update_model_call(\n                model_call_id,\n                status=\"success_heuristic\",\n                response=raw_response,\n                error_message=parse_error or \"parse_failed\",\n            )\n        except Exception as e:\n            self._update_model_call(\n                model_call_id,\n                status=\"failed_permanent\",\n                response=raw_response,\n                error_message=str(e),\n            )\n            
self.logger.warning(f\"LLM refinement failed: {e}, using heuristic\")\n\n        # Fallback: heuristic refinement\n        return self._heuristic_refine(ad_start, ad_end, context)\n\n    def _update_model_call(\n        self,\n        model_call_id: Optional[int],\n        *,\n        status: str,\n        response: Optional[str],\n        error_message: Optional[str],\n    ) -> None:\n        \"\"\"Best-effort ModelCall updater; no-op if call creation failed.\"\"\"\n        if model_call_id is None:\n            return\n        try:\n            writer_client.update(\n                \"ModelCall\",\n                int(model_call_id),\n                {\n                    \"status\": status,\n                    \"response\": response,\n                    \"error_message\": error_message,\n                    \"retry_attempts\": 1,\n                },\n                wait=True,\n            )\n        except Exception as exc:  # best-effort; do not block refinement\n            self.logger.warning(\n                \"Boundary refine: failed to update ModelCall %s: %s\",\n                model_call_id,\n                exc,\n            )\n\n    def _get_context(\n        self, ad_start: float, ad_end: float, all_segments: List[Dict[str, Any]]\n    ) -> List[Dict[str, Any]]:\n        \"\"\"Get ±8 segments around ad\"\"\"\n        ad_segs = [s for s in all_segments if ad_start <= s[\"start_time\"] <= ad_end]\n        if not ad_segs:\n            return []\n\n        first_idx = all_segments.index(ad_segs[0])\n        last_idx = all_segments.index(ad_segs[-1])\n\n        start_idx = max(0, first_idx - 8)\n        end_idx = min(len(all_segments), last_idx + 9)\n\n        return all_segments[start_idx:end_idx]\n\n    def _heuristic_refine(\n        self, ad_start: float, ad_end: float, context: List[Dict[str, Any]]\n    ) -> BoundaryRefinement:\n        \"\"\"Simple pattern-based refinement\"\"\"\n        intro_patterns = [\"brought to you\", \"sponsor\", \"let me 
tell you\"]\n        outro_patterns = [\".com\", \"thanks to\", \"use code\", \"visit\"]\n\n        refined_start = ad_start\n        refined_end = ad_end\n\n        # Check before ad for intros\n        for seg in context:\n            if seg[\"start_time\"] < ad_start:\n                if any(p in seg[\"text\"].lower() for p in intro_patterns):\n                    self.logger.debug(\n                        \"Intro pattern matched\",\n                        extra={\n                            \"matched_text\": seg[\"text\"],\n                            \"start_time\": seg[\"start_time\"],\n                        },\n                    )\n                    refined_start = seg[\"start_time\"]\n\n        # Check after ad for outros\n        for seg in context:\n            if seg[\"start_time\"] > ad_end:\n                if any(p in seg[\"text\"].lower() for p in outro_patterns):\n                    self.logger.debug(\n                        \"Outro pattern matched\",\n                        extra={\n                            \"matched_text\": seg[\"text\"],\n                            \"start_time\": seg[\"start_time\"],\n                        },\n                    )\n                    refined_end = seg.get(\"end_time\", seg[\"start_time\"] + 5.0)\n\n        result = BoundaryRefinement(\n            refined_start,\n            refined_end,\n            \"heuristic\",\n            \"heuristic\",\n        )\n        self.logger.info(\n            \"Heuristic refinement applied\",\n            extra={\n                \"refined_start\": result.refined_start,\n                \"refined_end\": result.refined_end,\n            },\n        )\n        return result\n\n    def _validate(\n        self, orig_start: float, orig_end: float, refinement: BoundaryRefinement\n    ) -> BoundaryRefinement:\n        \"\"\"Constrain refinement to reasonable bounds\"\"\"\n        max_start_ext = MAX_START_EXTENSION_SECONDS\n        max_end_ext = 
MAX_END_EXTENSION_SECONDS\n\n        refinement.refined_start = max(\n            refinement.refined_start, orig_start - max_start_ext\n        )\n        refinement.refined_end = min(refinement.refined_end, orig_end + max_end_ext)\n        if refinement.refined_start >= refinement.refined_end:\n            refinement.refined_start = orig_start\n            refinement.refined_end = orig_end\n\n        self.logger.debug(\n            \"Refinement validated\",\n            extra={\n                \"orig_start\": orig_start,\n                \"orig_end\": orig_end,\n                \"refined_start\": refinement.refined_start,\n                \"refined_end\": refinement.refined_end,\n            },\n        )\n\n        return refinement\n"
  },
  {
    "path": "src/podcast_processor/cue_detector.py",
    "content": "import re\nfrom typing import Dict, List, Pattern, Tuple\n\n\nclass CueDetector:\n    def __init__(self) -> None:\n        self.url_pattern: Pattern[str] = re.compile(\n            r\"\\b([a-z0-9\\-\\.]+\\.(?:com|net|org|io))\\b\", re.I\n        )\n        self.promo_pattern: Pattern[str] = re.compile(\n            r\"\\b(code|promo|save|discount)\\s+\\w+\\b\", re.I\n        )\n        self.phone_pattern: Pattern[str] = re.compile(\n            r\"\\b(?:\\+?1[ -]?)?\\d{3}[ -]?\\d{3}[ -]?\\d{4}\\b\"\n        )\n        self.cta_pattern: Pattern[str] = re.compile(\n            r\"\\b(visit|go to|check out|head over|sign up|start today|start now|use code|offer|deal|free trial)\\b\",\n            re.I,\n        )\n        self.transition_pattern: Pattern[str] = re.compile(\n            r\"\\b(back to the show|after the break|stay tuned|we'll be right back|now back)\\b\",\n            re.I,\n        )\n        self.self_promo_pattern: Pattern[str] = re.compile(\n            r\"\\b(my|our)\\s+(book|course|newsletter|fund|patreon|substack|community|platform)\\b\",\n            re.I,\n        )\n\n    def has_cue(self, text: str) -> bool:\n        return bool(\n            self.url_pattern.search(text)\n            or self.promo_pattern.search(text)\n            or self.phone_pattern.search(text)\n            or self.cta_pattern.search(text)\n        )\n\n    def analyze(self, text: str) -> Dict[str, bool]:\n        return {\n            \"url\": bool(self.url_pattern.search(text)),\n            \"promo\": bool(self.promo_pattern.search(text)),\n            \"phone\": bool(self.phone_pattern.search(text)),\n            \"cta\": bool(self.cta_pattern.search(text)),\n            \"transition\": bool(self.transition_pattern.search(text)),\n            \"self_promo\": bool(self.self_promo_pattern.search(text)),\n        }\n\n    def highlight_cues(self, text: str) -> str:\n        \"\"\"\n        Highlights detected cues in the text by wrapping them in *** 
***.\n        Useful for drawing attention to cues in LLM prompts.\n        \"\"\"\n        matches: List[Tuple[int, int]] = []\n        patterns = [\n            self.url_pattern,\n            self.promo_pattern,\n            self.phone_pattern,\n            self.cta_pattern,\n            self.transition_pattern,\n            self.self_promo_pattern,\n        ]\n\n        for pattern in patterns:\n            for match in pattern.finditer(text):\n                matches.append(match.span())\n\n        if not matches:\n            return text\n\n        # Sort by start, then end (descending) to handle containment\n        matches.sort(key=lambda x: (x[0], -x[1]))\n\n        # Merge overlapping intervals (matches is guaranteed non-empty here)\n        merged: List[Tuple[int, int]] = []\n        curr_start, curr_end = matches[0]\n        for next_start, next_end in matches[1:]:\n            if next_start < curr_end:  # Overlap\n                curr_end = max(curr_end, next_end)\n            else:\n                merged.append((curr_start, curr_end))\n                curr_start, curr_end = next_start, next_end\n        merged.append((curr_start, curr_end))\n\n        # Reconstruct string backwards to avoid index shifting\n        result_parts: List[str] = []\n        last_idx = len(text)\n\n        for start, end in reversed(merged):\n            result_parts.append(text[end:last_idx])  # Unchanged suffix\n            result_parts.append(\" ***\")\n            result_parts.append(text[start:end])  # The match\n            result_parts.append(\"*** \")\n            last_idx = start\n\n        result_parts.append(text[:last_idx])  # Remaining prefix\n\n        return \"\".join(reversed(result_parts))\n"
  },
  {
    "path": "src/podcast_processor/llm_concurrency_limiter.py",
    "content": "\"\"\"\nLLM concurrency limiter to control the number of simultaneous LLM API calls.\n\nThis module provides a semaphore-based concurrency control mechanism to prevent\ntoo many simultaneous LLM API calls, which can help avoid rate limiting and\nimprove system stability.\n\"\"\"\n\nimport logging\nimport threading\nfrom typing import Any, Optional\n\nlogger = logging.getLogger(__name__)\n\n\nclass LLMConcurrencyLimiter:\n    \"\"\"Controls the number of concurrent LLM API calls using a semaphore.\"\"\"\n\n    def __init__(self, max_concurrent_calls: int):\n        \"\"\"\n        Initialize the concurrency limiter.\n\n        Args:\n            max_concurrent_calls: Maximum number of simultaneous LLM API calls allowed\n        \"\"\"\n        if max_concurrent_calls <= 0:\n            raise ValueError(\"max_concurrent_calls must be greater than 0\")\n\n        self.max_concurrent_calls = max_concurrent_calls\n        self._semaphore = threading.Semaphore(max_concurrent_calls)\n\n        logger.info(\n            f\"LLM concurrency limiter initialized with {max_concurrent_calls} max concurrent calls\"\n        )\n\n    def acquire(self, timeout: Optional[float] = None) -> bool:\n        \"\"\"\n        Acquire a slot for making an LLM API call.\n\n        Note: Consider using ConcurrencyContext for automatic resource management.\n\n        Args:\n            timeout: Maximum time to wait for a slot in seconds. 
None means wait indefinitely.\n\n        Returns:\n            True if a slot was acquired, False if timeout occurred\n        \"\"\"\n        # Disable specific pylint warning for this line as manual semaphore control is needed\n        acquired = self._semaphore.acquire(  # pylint: disable=consider-using-with\n            timeout=timeout\n        )\n        if acquired:\n            logger.debug(\"Acquired LLM concurrency slot\")\n        else:\n            logger.warning(\n                f\"Failed to acquire LLM concurrency slot within {timeout}s timeout\"\n            )\n        return acquired\n\n    def release(self) -> None:\n        \"\"\"\n        Release a slot after completing an LLM API call.\n\n        Note: Consider using ConcurrencyContext for automatic resource management.\n        \"\"\"\n        self._semaphore.release()\n        logger.debug(\"Released LLM concurrency slot\")\n\n    def get_available_slots(self) -> int:\n        \"\"\"Get the approximate number of currently available slots.\n\n        Reads the CPython-internal counter of threading.Semaphore, so the\n        value may be stale by the time the caller uses it.\n        \"\"\"\n        return self._semaphore._value  # pylint: disable=protected-access\n\n    def get_active_calls(self) -> int:\n        \"\"\"Get the approximate number of currently active LLM calls.\"\"\"\n        return (\n            self.max_concurrent_calls\n            - self._semaphore._value  # pylint: disable=protected-access\n        )\n\n\n# Global concurrency limiter instance\n_CONCURRENCY_LIMITER: Optional[LLMConcurrencyLimiter] = None\n\n\ndef get_concurrency_limiter(max_concurrent_calls: int = 3) -> LLMConcurrencyLimiter:\n    \"\"\"Get or create the global concurrency limiter instance.\"\"\"\n    global _CONCURRENCY_LIMITER  # pylint: disable=global-statement\n    if (\n        _CONCURRENCY_LIMITER is None\n        or _CONCURRENCY_LIMITER.max_concurrent_calls != max_concurrent_calls\n    ):\n        _CONCURRENCY_LIMITER = LLMConcurrencyLimiter(max_concurrent_calls)\n    return _CONCURRENCY_LIMITER\n\n\nclass ConcurrencyContext:\n    \"\"\"Context manager for controlling LLM API call concurrency.\"\"\"\n\n    def __init__(self, limiter: LLMConcurrencyLimiter, timeout:
Optional[float] = None):\n        \"\"\"\n        Initialize the context manager.\n\n        Args:\n            limiter: The concurrency limiter to use\n            timeout: Maximum time to wait for a slot\n        \"\"\"\n        self.limiter = limiter\n        self.timeout = timeout\n        self.acquired = False\n\n    def __enter__(self) -> \"ConcurrencyContext\":\n        \"\"\"Acquire a concurrency slot.\"\"\"\n        self.acquired = self.limiter.acquire(timeout=self.timeout)\n        if not self.acquired:\n            raise RuntimeError(\n                f\"Could not acquire LLM concurrency slot within {self.timeout}s\"\n            )\n        return self\n\n    def __exit__(\n        self,\n        exc_type: Optional[type],\n        exc_val: Optional[BaseException],\n        exc_tb: Optional[Any],\n    ) -> None:\n        \"\"\"Release the concurrency slot.\"\"\"\n        if self.acquired:\n            self.limiter.release()\n"
  },
  {
    "path": "src/podcast_processor/llm_error_classifier.py",
    "content": "\"\"\"\nEnhanced error classification for LLM API calls.\n\nProvides more robust and extensible error handling beyond simple string matching.\n\"\"\"\n\nimport re\nfrom typing import Union\n\nfrom litellm.exceptions import InternalServerError\n\n\nclass LLMErrorClassifier:\n    \"\"\"Classifies LLM API errors into retryable and non-retryable categories.\"\"\"\n\n    # Rate limiting error patterns\n    RATE_LIMIT_PATTERNS = [\n        re.compile(r\"rate.?limit\", re.IGNORECASE),\n        re.compile(r\"too many requests\", re.IGNORECASE),\n        re.compile(r\"quota.?exceeded\", re.IGNORECASE),\n        re.compile(r\"429\", re.IGNORECASE),  # HTTP 429 status\n    ]\n\n    # Timeout error patterns\n    TIMEOUT_PATTERNS = [\n        re.compile(r\"timeout\", re.IGNORECASE),\n        re.compile(r\"timed.?out\", re.IGNORECASE),\n        re.compile(r\"408\", re.IGNORECASE),  # HTTP 408 status\n        re.compile(r\"504\", re.IGNORECASE),  # HTTP 504 status\n    ]\n\n    # Server error patterns (retryable)\n    SERVER_ERROR_PATTERNS = [\n        re.compile(r\"internal.?server.?error\", re.IGNORECASE),\n        re.compile(r\"502\", re.IGNORECASE),  # Bad Gateway\n        re.compile(r\"503\", re.IGNORECASE),  # Service Unavailable\n        re.compile(r\"500\", re.IGNORECASE),  # Internal Server Error\n    ]\n\n    # Non-retryable error patterns\n    NON_RETRYABLE_PATTERNS = [\n        re.compile(r\"authentication\", re.IGNORECASE),\n        re.compile(r\"authorization\", re.IGNORECASE),\n        re.compile(r\"invalid.?api.?key\", re.IGNORECASE),\n        re.compile(r\"401\", re.IGNORECASE),  # Unauthorized\n        re.compile(r\"403\", re.IGNORECASE),  # Forbidden\n        re.compile(r\"400\", re.IGNORECASE),  # Bad Request\n        re.compile(r\"invalid.?parameter\", re.IGNORECASE),\n    ]\n\n    @classmethod\n    def is_retryable_error(cls, error: Union[Exception, str]) -> bool:\n        \"\"\"\n        Determine if an error should be retried.\n\n        
Args:\n            error: Exception instance or error string\n\n        Returns:\n            True if the error should be retried, False otherwise\n        \"\"\"\n        # Handle specific exception types\n        if isinstance(error, InternalServerError):\n            return True\n\n        # Convert to string for pattern matching\n        error_str = str(error)\n\n        # Check for non-retryable errors first (higher priority)\n        if cls._matches_patterns(error_str, cls.NON_RETRYABLE_PATTERNS):\n            return False\n\n        # Check for retryable error patterns\n        retryable_patterns = (\n            cls.RATE_LIMIT_PATTERNS + cls.TIMEOUT_PATTERNS + cls.SERVER_ERROR_PATTERNS\n        )\n\n        return cls._matches_patterns(error_str, retryable_patterns)\n\n    @classmethod\n    def get_error_category(cls, error: Union[Exception, str]) -> str:\n        \"\"\"\n        Categorize the error type for better handling.\n\n        Returns:\n            One of: 'rate_limit', 'timeout', 'server_error', 'auth_error', 'client_error', 'unknown'\n        \"\"\"\n        error_str = str(error)\n\n        if cls._matches_patterns(error_str, cls.RATE_LIMIT_PATTERNS):\n            return \"rate_limit\"\n        if cls._matches_patterns(error_str, cls.TIMEOUT_PATTERNS):\n            return \"timeout\"\n        if cls._matches_patterns(error_str, cls.SERVER_ERROR_PATTERNS):\n            return \"server_error\"\n        if cls._matches_patterns(error_str, cls.NON_RETRYABLE_PATTERNS):\n            if any(\n                pattern.search(error_str)\n                for pattern in [\n                    re.compile(r\"authentication\", re.IGNORECASE),\n                    re.compile(r\"authorization\", re.IGNORECASE),\n                    re.compile(r\"401\", re.IGNORECASE),\n                    re.compile(r\"403\", re.IGNORECASE),\n                ]\n            ):\n                return \"auth_error\"\n            return \"client_error\"\n        return 
\"unknown\"\n\n    @classmethod\n    def get_suggested_backoff(cls, error: Union[Exception, str], attempt: int) -> float:\n        \"\"\"\n        Get suggested backoff time based on error type and attempt number.\n\n        Args:\n            error: The error that occurred\n            attempt: Current attempt number (0-based)\n\n        Returns:\n            Suggested backoff time in seconds\n        \"\"\"\n        category = cls.get_error_category(error)\n        base_backoff = float(2**attempt)  # Exponential backoff\n\n        # Adjust based on error type\n        if category == \"rate_limit\":\n            return base_backoff * 2.0  # Longer backoff for rate limits\n        if category == \"timeout\":\n            return base_backoff * 1.5  # Moderate backoff for timeouts\n        if category == \"server_error\":\n            return base_backoff  # Standard backoff for server errors\n        return base_backoff\n\n    @staticmethod\n    def _matches_patterns(text: str, patterns: list[re.Pattern[str]]) -> bool:\n        \"\"\"Check if text matches any of the provided regex patterns.\"\"\"\n        return any(pattern.search(text) for pattern in patterns)\n"
  },
  {
    "path": "src/podcast_processor/llm_model_call_utils.py",
    "content": "from __future__ import annotations\n\nimport logging\nfrom typing import Any, Optional\n\nfrom app.writer.client import writer_client\n\n\ndef render_prompt_and_upsert_model_call(\n    *,\n    template: Any,\n    ad_start: float,\n    ad_end: float,\n    confidence: float,\n    context_segments: Any,\n    post_id: Optional[int],\n    first_seq_num: Optional[int],\n    last_seq_num: Optional[int],\n    model_name: str,\n    logger: logging.Logger,\n    log_prefix: str,\n) -> tuple[str, Optional[int]]:\n    prompt = template.render(\n        ad_start=ad_start,\n        ad_end=ad_end,\n        ad_confidence=confidence,\n        context_segments=context_segments,\n    )\n\n    model_call_id = try_upsert_model_call(\n        post_id=post_id,\n        first_seq_num=first_seq_num,\n        last_seq_num=last_seq_num,\n        model_name=model_name,\n        prompt=prompt,\n        logger=logger,\n        log_prefix=log_prefix,\n    )\n\n    return prompt, model_call_id\n\n\ndef try_upsert_model_call(\n    *,\n    post_id: Optional[int],\n    first_seq_num: Optional[int],\n    last_seq_num: Optional[int],\n    model_name: str,\n    prompt: str,\n    logger: logging.Logger,\n    log_prefix: str,\n) -> Optional[int]:\n    \"\"\"Best-effort ModelCall creation.\n\n    Returns model_call_id if successfully created/upserted, else None.\n    \"\"\"\n    if post_id is None or first_seq_num is None or last_seq_num is None:\n        return None\n\n    try:\n        res = writer_client.action(\n            \"upsert_model_call\",\n            {\n                \"post_id\": post_id,\n                \"model_name\": model_name,\n                \"first_segment_sequence_num\": first_seq_num,\n                \"last_segment_sequence_num\": last_seq_num,\n                \"prompt\": prompt,\n            },\n            wait=True,\n        )\n        if res and res.success:\n            return (res.data or {}).get(\"model_call_id\")\n    except Exception as exc:  # 
best-effort\n        logger.warning(\"%s: failed to upsert ModelCall: %s\", log_prefix, exc)\n\n    return None\n\n\ndef try_update_model_call(\n    model_call_id: Optional[int],\n    *,\n    status: str,\n    response: Optional[str],\n    error_message: Optional[str],\n    logger: logging.Logger,\n    log_prefix: str,\n) -> None:\n    \"\"\"Best-effort ModelCall updater; no-op if call creation failed.\"\"\"\n    if model_call_id is None:\n        return\n\n    try:\n        writer_client.update(\n            \"ModelCall\",\n            int(model_call_id),\n            {\n                \"status\": status,\n                \"response\": response,\n                \"error_message\": error_message,\n                \"retry_attempts\": 1,\n            },\n            wait=True,\n        )\n    except Exception as exc:  # best-effort\n        logger.warning(\n            \"%s: failed to update ModelCall %s: %s\",\n            log_prefix,\n            model_call_id,\n            exc,\n        )\n\n\ndef extract_litellm_content(response: Any) -> str:\n    \"\"\"Extracts the primary text content from a litellm completion response.\"\"\"\n    choices = getattr(response, \"choices\", None) or []\n    choice = choices[0] if choices else None\n    if not choice:\n        return \"\"\n\n    # Prefer chat content; fall back to text for completion-style responses\n    content = getattr(getattr(choice, \"message\", None), \"content\", None) or \"\"\n    if not content:\n        content = getattr(choice, \"text\", \"\") or \"\"\n    return str(content)\n"
  },
  {
    "path": "src/podcast_processor/model_output.py",
    "content": "import logging\nimport re\nfrom typing import List, Literal, Optional\n\nfrom pydantic import BaseModel\n\nlogger = logging.getLogger(__name__)\n\n\nclass AdSegmentPrediction(BaseModel):\n    segment_offset: float\n    confidence: float\n\n\nclass AdSegmentPredictionList(BaseModel):\n    ad_segments: List[AdSegmentPrediction]\n    content_type: Optional[\n        Literal[\n            \"technical_discussion\",\n            \"educational/self_promo\",\n            \"promotional_external\",\n            \"transition\",\n        ]\n    ] = None\n    confidence: Optional[float] = None\n\n\ndef _attempt_json_repair(json_str: str) -> str:\n    \"\"\"\n    Attempt to repair truncated JSON by adding missing closing brackets.\n\n    This handles cases where the LLM response was cut off mid-JSON,\n    e.g., '{\"ad_segments\":[{\"segment_offset\":10.5,\"confidence\":0.92}'\n    \"\"\"\n    # Count opening and closing brackets/braces\n    open_braces = json_str.count(\"{\")\n    close_braces = json_str.count(\"}\")\n    open_brackets = json_str.count(\"[\")\n    close_brackets = json_str.count(\"]\")\n\n    # If brackets are balanced, no repair needed\n    if open_braces == close_braces and open_brackets == close_brackets:\n        return json_str\n\n    logger.warning(\n        f\"Detected unbalanced JSON: {open_braces} '{{' vs {close_braces} '}}', \"\n        f\"{open_brackets} '[' vs {close_brackets} ']'. 
Attempting repair.\"\n    )\n\n    # Remove any trailing incomplete key-value pair\n    # e.g., '...\"confidence\":0.9' or '...\"key\":\"val' or '...\"key\":'\n    # First, try to find the last complete value\n    repaired = json_str.rstrip()\n\n    # If it ends with a comma, remove it (incomplete next element)\n    repaired = repaired.rstrip(\",\")\n\n    # If it ends with a colon or an incomplete string, truncate to the last complete element\n    # Pattern: ends with \"key\": or \"key\":\"incomplete or similar\n    incomplete_patterns = [\n        r',\"[^\"]*\":\\s*$',  # ,\"key\":\n        r',\"[^\"]*\":\\s*\"[^\"]*$',  # ,\"key\":\"incomplete\n    ]\n\n    for pattern in incomplete_patterns:\n        match = re.search(pattern, repaired)\n        if match:\n            repaired = repaired[: match.start()]\n            logger.debug(f\"Removed incomplete trailing content: {match.group()}\")\n            break\n\n    # Recount after cleanup\n    open_braces = repaired.count(\"{\")\n    close_braces = repaired.count(\"}\")\n    open_brackets = repaired.count(\"[\")\n    close_brackets = repaired.count(\"]\")\n\n    # Add missing closing brackets/braces in the right order.\n    # For our schema the ad_segments array closes before the outer object,\n    # so append ']' first, then '}'.\n    missing_brackets = close_brackets - open_brackets  # negative means we need more ]\n    missing_braces = close_braces - open_braces  # negative means we need more }\n\n    if missing_brackets < 0:\n        repaired += \"]\" * abs(missing_brackets)\n    if missing_braces < 0:\n        repaired += \"}\" * abs(missing_braces)\n\n    logger.info(\"Repaired JSON by adding missing closing brackets/braces\")\n\n    return repaired\n\n\ndef clean_and_parse_model_output(model_output: str) -> AdSegmentPredictionList:\n    start_marker, end_marker = \"{\", \"}\"\n\n    # Raise an explicit error rather than assert, which is stripped under `python -O`.\n    if start_marker not in model_output:\n        raise ValueError(f\"No opening brace found in: {model_output[:200]}\")\n\n    start_idx = model_output.index(start_marker)\n    model_output = model_output[start_idx:]\n\n    # If we have at least as many closing braces as opening braces, trim to the last\n    # closing brace to drop any trailing non-JSON content. Otherwise, keep the\n    # content as-is so we can attempt repair on truncated JSON.\n    open_braces = model_output.count(start_marker)\n    close_braces = model_output.count(end_marker)\n    if close_braces >= open_braces and close_braces > 0:\n        model_output = model_output[: 1 + model_output.rindex(end_marker)]\n\n    # Normalize common LLM quirks: single-quoted JSON and embedded newlines.\n    # Note: this naive quote swap can corrupt apostrophes inside string values.\n    model_output = model_output.replace(\"'\", '\"')\n    model_output = model_output.replace(\"\\n\", \"\")\n    model_output = model_output.strip()\n\n    # First attempt: try to parse as-is\n    try:\n        return AdSegmentPredictionList.parse_raw(model_output)\n    except Exception as first_error:\n        logger.debug(f\"Initial parse failed: {first_error}\")\n\n        # Second attempt: try to repair truncated JSON\n        try:\n            repaired_output = _attempt_json_repair(model_output)\n            result = AdSegmentPredictionList.parse_raw(repaired_output)\n            logger.info(\"Successfully parsed model output after JSON repair\")\n            return result\n        except Exception as repair_error:\n            logger.error(\n                f\"JSON repair also failed. Original output (first 500 chars): {model_output[:500]}\"\n            )\n            # Re-raise the original error with more context\n            raise first_error from repair_error\n"
  },
  {
    "path": "src/podcast_processor/podcast_downloader.py",
"content": "from __future__ import annotations\n\nimport logging\nimport os\nimport re\nfrom pathlib import Path\nfrom typing import Any, Iterator, Optional, Set\n\nimport requests\nimport validators\nfrom flask import abort\n\nfrom shared.interfaces import Post\nfrom shared.processing_paths import get_in_root\n\nlogger = logging.getLogger(__name__)\n\nDOWNLOAD_DIR = str(get_in_root())\n\n\nclass PodcastDownloader:\n    \"\"\"\n    Handles downloading podcast episodes with robust file checking and path management.\n    \"\"\"\n\n    def __init__(\n        self, download_dir: str = DOWNLOAD_DIR, logger: Optional[logging.Logger] = None\n    ):\n        self.download_dir = download_dir\n        self.logger = logger or logging.getLogger(__name__)\n\n    def download_episode(self, post: Post, dest_path: str) -> Optional[str]:\n        \"\"\"\n        Download a podcast episode if it doesn't already exist.\n\n        Args:\n            post: The Post object containing the podcast episode to download\n            dest_path: Destination file path for the downloaded audio\n\n        Returns:\n            Path to the downloaded file, or None if download failed\n        \"\"\"\n        # Destination is required; validate it before touching the filesystem.\n        download_path = dest_path\n        if not download_path:\n            self.logger.error(f\"Invalid download path for post {post.id}\")\n            return None\n        Path(download_path).parent.mkdir(parents=True, exist_ok=True)\n\n        # First, check if the file truly exists and has nonzero size.\n        try:\n            if os.path.isfile(download_path):\n                if os.path.getsize(download_path) > 0:\n                    self.logger.info(\"Episode already downloaded.\")\n                    return download_path\n                self.logger.info(\"File is zero bytes, re-downloading.\")\n        except FileNotFoundError:\n            # File vanished between the checks (race); fall through and re-download.\n            pass\n\n        # If we get here, the file is missing or zero bytes -> perform download\n        audio_link = post.download_url\n        if audio_link is None or not validators.url(audio_link):\n            abort(404)\n\n        self.logger.info(f\"Downloading {audio_link} into {download_path}...\")\n        referer = \"https://open.acast.com/\" if \"acast.com\" in audio_link else None\n        headers = {\n            \"User-Agent\": \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36\",\n            # requests drops header entries whose value is None.\n            \"Referer\": referer,\n        }\n        with requests.get(\n            audio_link, stream=True, timeout=60, headers=headers\n        ) as response:\n            if response.status_code == 200:\n                with open(download_path, \"wb\") as file:\n                    for chunk in response.iter_content(chunk_size=8192):\n                        file.write(chunk)\n                self.logger.info(\"Download complete.\")\n            else:\n                self.logger.warning(\n                    f\"Failed to download the podcast episode, response: {response.status_code}\"\n                )\n                return None\n\n        return download_path\n\n    def get_and_make_download_path(self, post_title: str) -> Path:\n        \"\"\"\n        Generate the download path for a post and create necessary directories.\n\n        Args:\n            post_title: The title of the post to generate a path for\n\n        Returns:\n            Path object for the download location\n        \"\"\"\n        sanitized_title = sanitize_title(post_title)\n\n        post_directory = sanitized_title\n        post_filename = sanitized_title + \".mp3\"\n\n        post_directory_path = Path(self.download_dir) / post_directory\n\n        post_directory_path.mkdir(parents=True, exist_ok=True)\n\n        return post_directory_path / post_filename\n\n\ndef sanitize_title(title: str) -> str:\n    \"\"\"Sanitize a title for use in file paths.\"\"\"\n    return re.sub(r\"[^a-zA-Z0-9\\s]\", \"\",
title)\n\n\ndef find_audio_link(entry: Any) -> str:\n    \"\"\"Find the audio link in a feed entry.\"\"\"\n    audio_mime_types: Set[str] = {\n        \"audio/mpeg\",\n        \"audio/mp3\",\n        \"audio/x-mp3\",\n        \"audio/mpeg3\",\n        \"audio/mp4\",\n        \"audio/m4a\",\n        \"audio/x-m4a\",\n        \"audio/aac\",\n        \"audio/wav\",\n        \"audio/x-wav\",\n        \"audio/ogg\",\n        \"audio/opus\",\n        \"audio/flac\",\n    }\n\n    for url in _iter_enclosure_audio_urls(entry, audio_mime_types):\n        return url\n    for url in _iter_link_audio_urls(entry, audio_mime_types, match_any_audio=False):\n        return url\n    for url in _iter_link_audio_urls(entry, audio_mime_types, match_any_audio=True):\n        return url\n\n    return str(getattr(entry, \"id\", \"\"))\n\n\ndef _iter_enclosure_audio_urls(entry: Any, audio_mime_types: Set[str]) -> Iterator[str]:\n    enclosures = getattr(entry, \"enclosures\", None) or []\n    for enclosure in enclosures:\n        enc_type = (getattr(enclosure, \"type\", \"\") or \"\").lower()\n        if enc_type not in audio_mime_types:\n            continue\n        href = getattr(enclosure, \"href\", None)\n        if href:\n            yield str(href)\n        url = getattr(enclosure, \"url\", None)\n        if url:\n            yield str(url)\n\n\ndef _iter_link_audio_urls(\n    entry: Any,\n    audio_mime_types: Set[str],\n    *,\n    match_any_audio: bool,\n) -> Iterator[str]:\n    links = getattr(entry, \"links\", None) or []\n    for link in links:\n        link_type = (getattr(link, \"type\", \"\") or \"\").lower()\n        if match_any_audio:\n            if not link_type.startswith(\"audio/\"):\n                continue\n        else:\n            if link_type not in audio_mime_types:\n                continue\n\n        href = getattr(link, \"href\", None)\n        if href:\n            yield str(href)\n\n\n# Backward compatibility - create a default 
instance\n_default_downloader = PodcastDownloader()\n\n\ndef download_episode(post: Post, dest_path: str) -> Optional[str]:\n    return _default_downloader.download_episode(post, dest_path)\n\n\ndef get_and_make_download_path(post_title: str) -> Path:\n    return _default_downloader.get_and_make_download_path(post_title)\n"
  },
  {
    "path": "src/podcast_processor/podcast_processor.py",
    "content": "import logging\nimport os\nimport shutil\nimport threading\nfrom pathlib import Path\nfrom typing import Any, Callable, Dict, List, Optional\n\nimport litellm\nfrom jinja2 import Template\nfrom sqlalchemy.orm import object_session\n\nfrom app.extensions import db\nfrom app.models import Post, ProcessingJob, TranscriptSegment\nfrom app.writer.client import writer_client\nfrom podcast_processor.ad_classifier import AdClassifier\nfrom podcast_processor.audio_processor import AudioProcessor\nfrom podcast_processor.podcast_downloader import PodcastDownloader, sanitize_title\nfrom podcast_processor.processing_status_manager import ProcessingStatusManager\nfrom podcast_processor.prompt import (\n    DEFAULT_SYSTEM_PROMPT_PATH,\n    DEFAULT_USER_PROMPT_TEMPLATE_PATH,\n)\nfrom podcast_processor.transcription_manager import TranscriptionManager\nfrom shared.config import Config\nfrom shared.processing_paths import (\n    ProcessingPaths,\n    get_job_unprocessed_path,\n    get_srv_root,\n    paths_from_unprocessed_path,\n)\n\nlogger = logging.getLogger(\"global_logger\")\n\n\ndef get_post_processed_audio_path(post: Post) -> Optional[ProcessingPaths]:\n    \"\"\"\n    Generate the processed audio path based on the post's unprocessed audio path.\n    Returns None if unprocessed_audio_path is not set.\n    \"\"\"\n    unprocessed_path = post.unprocessed_audio_path\n    if not unprocessed_path or not isinstance(unprocessed_path, str):\n        logger.warning(f\"Post {post.id} has no unprocessed_audio_path.\")\n        return None\n\n    title = post.feed.title\n    if not title or not isinstance(title, str):\n        logger.warning(f\"Post {post.id} has no feed title.\")\n        return None\n\n    return paths_from_unprocessed_path(unprocessed_path, title)\n\n\ndef get_post_processed_audio_path_cached(\n    post: Post, feed_title: str\n) -> Optional[ProcessingPaths]:\n    \"\"\"\n    Generate the processed audio path using cached feed title to avoid ORM 
access.\n    Returns None if unprocessed_audio_path is not set.\n    \"\"\"\n    unprocessed_path = post.unprocessed_audio_path\n    if not unprocessed_path or not isinstance(unprocessed_path, str):\n        logger.warning(f\"Post {post.id} has no unprocessed_audio_path.\")\n        return None\n\n    if not feed_title or not isinstance(feed_title, str):\n        logger.warning(f\"Post {post.id} has no feed title.\")\n        return None\n\n    return paths_from_unprocessed_path(unprocessed_path, feed_title)\n\n\nclass PodcastProcessor:\n    \"\"\"\n    Main coordinator for podcast processing workflow.\n    Delegates to specialized components for transcription, ad classification, and audio processing.\n    \"\"\"\n\n    lock_lock = threading.Lock()\n    locks: Dict[str, threading.Lock] = {}  # Now keyed by post GUID instead of file path\n\n    def __init__(\n        self,\n        config: Config,\n        logger: Optional[logging.Logger] = None,\n        transcription_manager: Optional[TranscriptionManager] = None,\n        ad_classifier: Optional[AdClassifier] = None,\n        audio_processor: Optional[AudioProcessor] = None,\n        status_manager: Optional[ProcessingStatusManager] = None,\n        db_session: Optional[Any] = None,\n        downloader: Optional[PodcastDownloader] = None,\n    ) -> None:\n        super().__init__()\n        self.logger = logger or logging.getLogger(\"global_logger\")\n        self.output_dir = str(get_srv_root())\n        self.config: Config = config\n        self.db_session = db_session or db.session\n\n        # Initialize downloader\n        self.downloader = downloader or PodcastDownloader(logger=self.logger)\n\n        # Initialize status manager\n        self.status_manager = status_manager or ProcessingStatusManager(\n            self.db_session, self.logger\n        )\n\n        litellm.api_base = self.config.openai_base_url\n        litellm.api_key = self.config.llm_api_key\n\n        # Initialize components with default 
implementations if not provided\n        if transcription_manager is None:\n            self.transcription_manager = TranscriptionManager(self.logger, config)\n        else:\n            self.transcription_manager = transcription_manager\n\n        if ad_classifier is None:\n            self.ad_classifier = AdClassifier(config)\n        else:\n            self.ad_classifier = ad_classifier\n\n        if audio_processor is None:\n            self.audio_processor = AudioProcessor(config=config, logger=self.logger)\n        else:\n            self.audio_processor = audio_processor\n\n    # pylint: disable=too-many-branches, too-many-statements\n    def process(\n        self,\n        post: Post,\n        job_id: str,\n        cancel_callback: Optional[Callable[[], bool]] = None,\n    ) -> str:\n        \"\"\"\n        Process a podcast by downloading, transcribing, identifying ads, and removing ad segments.\n        Updates the existing job record for tracking progress.\n\n        Args:\n            post: The Post object containing the podcast to process\n            job_id: Job ID of the existing job to update (required)\n            cancel_callback: Optional callback to check for cancellation\n\n        Returns:\n            Path to the processed audio file\n        \"\"\"\n        job = self.db_session.get(ProcessingJob, job_id)\n        if not job:\n            raise ProcessorException(f\"Job with ID {job_id} not found\")\n\n        # Cache job and post attributes early to avoid ORM access after expire_all()\n        # This includes relationship access like post.feed.title\n        cached_post_guid = post.guid\n        cached_post_title = post.title\n        cached_feed_title = post.feed.title\n        cached_job_id = job.id\n        cached_current_step = job.current_step\n\n        try:\n            self.logger.debug(\n                \"processor.process enter: job_id=%s post_guid=%s job_bound=%s\",\n                job_id,\n                getattr(post, 
\"guid\", None),\n                object_session(job) is not None,\n            )\n            # Update job to running status\n            self.status_manager.update_job_status(\n                job, \"running\", 0, \"Starting processing\"\n            )\n\n            # Validate post\n            if not post.whitelisted:\n                raise ProcessorException(\n                    f\"Post with GUID {cached_post_guid} not whitelisted\"\n                )\n\n            # Check if processed audio already exists (database or disk)\n            if self._check_existing_processed_audio(post):\n                self.status_manager.update_job_status(\n                    job, \"completed\", 4, \"Processing complete\", 100.0\n                )\n                return str(post.processed_audio_path)\n\n            simulated_path = self._simulate_developer_processing(\n                post,\n                job,\n                cached_post_guid,\n                cached_post_title,\n                cached_feed_title,\n                cached_job_id,\n            )\n            if simulated_path:\n                return simulated_path\n\n            # Step 1: Download (if needed)\n            self._handle_download_step(\n                post, job, cached_post_guid, cached_post_title, cached_job_id\n            )\n            self._raise_if_cancelled(job, 1, cancel_callback)\n\n            # Get processing paths and acquire lock\n            processed_audio_path = self._acquire_processing_lock(\n                post, job, cached_post_guid, cached_job_id, cached_feed_title\n            )\n\n            try:\n                if os.path.exists(processed_audio_path):\n                    self.logger.info(f\"Audio already processed: {post}\")\n                    # Update the database with the processed audio path\n                    self._remove_unprocessed_audio(post)\n                    result = writer_client.update(\n                        \"Post\",\n                        
post.id,\n                        {\n                            \"processed_audio_path\": processed_audio_path,\n                            \"unprocessed_audio_path\": None,\n                        },\n                        wait=True,\n                    )\n                    if not result or not result.success:\n                        raise RuntimeError(\n                            getattr(result, \"error\", \"Failed to update post\")\n                        )\n                    self.status_manager.update_job_status(\n                        job, \"completed\", 4, \"Processing complete\", 100.0\n                    )\n                    return processed_audio_path\n\n                # Perform the main processing steps\n                self._perform_processing_steps(\n                    post, job, processed_audio_path, cancel_callback\n                )\n\n                self.logger.info(f\"Processing podcast: {post} complete\")\n                return processed_audio_path\n            finally:\n                # Release lock using cached GUID without touching ORM state after potential rollback\n                try:\n                    if cached_post_guid is not None:\n                        lock = PodcastProcessor.locks.get(cached_post_guid)\n                        if lock is not None and lock.locked():\n                            lock.release()\n                except Exception:\n                    # Best-effort lock release; avoid masking original exceptions\n                    pass\n\n        except ProcessorException as e:\n            error_msg = str(e)\n            if \"Processing job in progress\" in error_msg:\n                self.status_manager.update_job_status(\n                    job,\n                    \"failed\",\n                    cached_current_step,\n                    \"Another processing job is already running for this episode\",\n                )\n            else:\n                
self.status_manager.update_job_status(\n                    job, \"failed\", cached_current_step, error_msg\n                )\n            raise\n\n        except Exception as e:\n            self.logger.error(\n                \"processor.process unexpected error: job_id=%s %s\",\n                job_id,\n                e,\n                exc_info=True,\n            )\n            self.status_manager.update_job_status(\n                job, \"failed\", cached_current_step, f\"Unexpected error: {str(e)}\"\n            )\n            raise\n\n    def _acquire_processing_lock(\n        self,\n        post: Post,\n        job: ProcessingJob,\n        post_guid: str,\n        job_id: str,\n        feed_title: str,\n    ) -> str:\n        \"\"\"\n        Acquire processing lock for the post and return the processed audio path.\n        Lock is now based on post GUID for better granularity and reliability.\n\n        Args:\n            post: The Post object to process\n            job: The ProcessingJob for tracking\n            post_guid: Cached post GUID to avoid ORM access\n            job_id: Cached job ID to avoid ORM access\n            feed_title: Cached feed title to avoid ORM access\n\n        Returns:\n            Path to the processed audio file\n\n        Raises:\n            ProcessorException: If lock cannot be acquired or paths are invalid\n        \"\"\"\n        # Get processing paths\n        working_paths = get_post_processed_audio_path_cached(post, feed_title)\n        if working_paths is None:\n            raise ProcessorException(\"Processed audio path not found\")\n\n        processed_audio_path = str(working_paths.post_processed_audio_path)\n\n        # Use post GUID as lock key instead of file path for better granularity\n        lock_key = post_guid\n\n        # Acquire lock (this is where we cancel existing jobs if we can get the lock)\n        locked = False\n        with PodcastProcessor.lock_lock:\n            if lock_key not in 
PodcastProcessor.locks:\n                PodcastProcessor.locks[lock_key] = threading.Lock()\n                PodcastProcessor.locks[lock_key].acquire(blocking=False)\n                locked = True\n\n        if not locked and not PodcastProcessor.locks[lock_key].acquire(blocking=False):\n            raise ProcessorException(\"Processing job in progress\")\n\n        # Cancel existing jobs since we got the lock\n        self.status_manager.cancel_existing_jobs(post_guid, job_id)\n\n        self.make_dirs(working_paths)\n        return processed_audio_path\n\n    def _perform_processing_steps(\n        self,\n        post: Post,\n        job: ProcessingJob,\n        processed_audio_path: str,\n        cancel_callback: Optional[Callable[[], bool]] = None,\n    ) -> None:\n        \"\"\"\n        Perform the main processing steps: transcription, ad classification, and audio processing.\n\n        Args:\n            post: The Post object to process\n            job: The ProcessingJob for tracking\n            processed_audio_path: Path where the processed audio will be saved\n        \"\"\"\n        # Step 2: Transcribe audio\n        self.status_manager.update_job_status(\n            job, \"running\", 2, \"Transcribing audio\", 50.0\n        )\n        transcript_segments = self.transcription_manager.transcribe(post)\n        self._raise_if_cancelled(job, 2, cancel_callback)\n\n        # Step 3: Classify ad segments\n        self._classify_ad_segments(post, job, transcript_segments)\n        self._raise_if_cancelled(job, 3, cancel_callback)\n\n        # Step 4: Process audio (remove ad segments)\n        self.status_manager.update_job_status(\n            job, \"running\", 4, \"Processing audio\", 90.0\n        )\n        self.audio_processor.process_audio(post, processed_audio_path)\n\n        # Update the database with the processed audio path\n        self._remove_unprocessed_audio(post)\n        result = writer_client.update(\n            \"Post\",\n            
post.id,\n            {\n                \"processed_audio_path\": processed_audio_path,\n                \"unprocessed_audio_path\": None,\n            },\n            wait=True,\n        )\n        if not result or not result.success:\n            raise RuntimeError(getattr(result, \"error\", \"Failed to update post\"))\n\n        # Mark job complete\n        self.status_manager.update_job_status(\n            job, \"completed\", 4, \"Processing complete\", 100.0\n        )\n\n    def _raise_if_cancelled(\n        self,\n        job: ProcessingJob,\n        current_step: int,\n        cancel_callback: Optional[Callable[[], bool]],\n    ) -> None:\n        \"\"\"Helper to centralize cancellation checking and update job state.\"\"\"\n        if cancel_callback and cancel_callback():\n            self.status_manager.update_job_status(\n                job, \"cancelled\", current_step, \"Cancellation requested\"\n            )\n            raise ProcessorException(\"Cancelled\")\n\n    def _classify_ad_segments(\n        self,\n        post: Post,\n        job: ProcessingJob,\n        transcript_segments: List[TranscriptSegment],\n    ) -> None:\n        \"\"\"\n        Classify ad segments in the transcript.\n\n        Args:\n            post: The Post object being processed\n            job: The ProcessingJob for tracking\n            transcript_segments: The transcript segments to classify\n        \"\"\"\n        self.status_manager.update_job_status(\n            job, \"running\", 3, \"Identifying ads\", 75.0\n        )\n        user_prompt_template = self.get_user_prompt_template(\n            DEFAULT_USER_PROMPT_TEMPLATE_PATH\n        )\n        system_prompt = self.get_system_prompt(DEFAULT_SYSTEM_PROMPT_PATH)\n        self.ad_classifier.classify(\n            transcript_segments=transcript_segments,\n            system_prompt=system_prompt,\n            user_prompt_template=user_prompt_template,\n            post=post,\n        )\n\n    def 
_simulate_developer_processing(\n        self,\n        post: Post,\n        job: ProcessingJob,\n        post_guid: str,\n        post_title: str,\n        feed_title: str,\n        job_id: str,\n    ) -> Optional[str]:\n        \"\"\"Short-circuit processing for developer-mode test feeds.\n\n        When developer mode is enabled, or a post comes from a synthetic test feed\n        (download_url contains \"test-feed\" or the GUID starts with \"test-guid\"),\n        skip the full pipeline and copy a tiny bundled MP3 into the expected\n        processed/unprocessed locations. This keeps the UI happy without relying\n        on external downloads or LLM calls.\n        \"\"\"\n\n        download_url = (post.download_url or \"\").lower()\n        is_test_feed = \"test-feed\" in download_url or post_guid.startswith(\"test-guid\")\n        if not (self.config.developer_mode or is_test_feed):\n            return None\n\n        sample_audio = (\n            Path(__file__).resolve().parent.parent / \"tests\" / \"data\" / \"count_0_99.mp3\"\n        )\n        if not sample_audio.exists():\n            self.status_manager.update_job_status(\n                job,\n                \"failed\",\n                job.current_step or 0,\n                \"Developer sample audio missing\",\n            )\n            raise ProcessorException(\"Developer sample audio missing\")\n\n        self.status_manager.update_job_status(\n            job,\n            \"running\",\n            1,\n            \"Simulating processing (developer mode)\",\n            25.0,\n        )\n\n        unprocessed_path = get_job_unprocessed_path(post_guid, job_id, post_title)\n        unprocessed_path.parent.mkdir(parents=True, exist_ok=True)\n        shutil.copyfile(sample_audio, unprocessed_path)\n\n        processed_path = (\n            get_srv_root()\n            / sanitize_title(feed_title)\n            / f\"{sanitize_title(post_title)}.mp3\"\n        )\n        processed_path.parent.mkdir(parents=True, exist_ok=True)\n        
shutil.copyfile(sample_audio, processed_path)\n\n        result = writer_client.update(\n            \"Post\",\n            post.id,\n            {\n                \"unprocessed_audio_path\": str(unprocessed_path),\n                \"processed_audio_path\": str(processed_path),\n            },\n            wait=True,\n        )\n        if not result or not result.success:\n            raise RuntimeError(getattr(result, \"error\", \"Failed to update post\"))\n\n        self.status_manager.update_job_status(\n            job,\n            \"completed\",\n            4,\n            \"Processing complete (developer mode)\",\n            100.0,\n        )\n\n        return str(processed_path)\n\n    def _handle_download_step(\n        self,\n        post: Post,\n        job: ProcessingJob,\n        post_guid: str,\n        post_title: str,\n        job_id: str,\n    ) -> None:\n        \"\"\"\n        Handle the download step with progress tracking and robust file checking.\n        This method checks for existing files on disk before downloading.\n\n        Args:\n            post: The Post object being processed\n            job: The ProcessingJob for tracking\n            post_guid: Cached post GUID to avoid ORM access\n            post_title: Cached post title to avoid ORM access\n            job_id: Cached job ID to avoid ORM access\n        \"\"\"\n        # If we have a path in the database, check if the file actually exists\n        if post.unprocessed_audio_path is not None:\n            if (\n                os.path.exists(post.unprocessed_audio_path)\n                and os.path.getsize(post.unprocessed_audio_path) > 0\n            ):\n                self.logger.debug(\n                    f\"Unprocessed audio already available at: {post.unprocessed_audio_path}\"\n                )\n                return\n            self.logger.info(\n                f\"Database path {post.unprocessed_audio_path} doesn't exist or is empty, resetting\"\n            )\n   
         result = writer_client.update(\n                \"Post\", post.id, {\"unprocessed_audio_path\": None}, wait=True\n            )\n            if not result or not result.success:\n                raise RuntimeError(getattr(result, \"error\", \"Failed to update post\"))\n\n        # Compute a unique per-job expected path\n        expected_unprocessed_path = get_job_unprocessed_path(\n            post_guid, job_id, post_title\n        )\n\n        if (\n            expected_unprocessed_path.exists()\n            and expected_unprocessed_path.stat().st_size > 0\n        ):\n            # Found a local unprocessed file\n            unprocessed_path_str = str(expected_unprocessed_path.resolve())\n            self.logger.info(\n                f\"Found existing unprocessed audio for post '{post_title}' at '{unprocessed_path_str}'. \"\n                \"Updating the database path.\"\n            )\n            result = writer_client.update(\n                \"Post\",\n                post.id,\n                {\"unprocessed_audio_path\": unprocessed_path_str},\n                wait=True,\n            )\n            if not result or not result.success:\n                raise RuntimeError(getattr(result, \"error\", \"Failed to update post\"))\n            return\n\n        # Need to download the file\n        self.status_manager.update_job_status(\n            job, \"running\", 1, \"Downloading episode\", 25.0\n        )\n        self.logger.info(f\"Downloading post: {post_title}\")\n        download_path = self.downloader.download_episode(\n            post, dest_path=str(expected_unprocessed_path)\n        )\n        if download_path is None:\n            raise ProcessorException(\"Download failed\")\n        result = writer_client.update(\n            \"Post\", post.id, {\"unprocessed_audio_path\": download_path}, wait=True\n        )\n        if not result or not result.success:\n            raise RuntimeError(getattr(result, \"error\", \"Failed to update 
post\"))\n\n    def make_dirs(self, processing_paths: ProcessingPaths) -> None:\n        \"\"\"Create necessary directories for output files.\"\"\"\n        if processing_paths.post_processed_audio_path:\n            processing_paths.post_processed_audio_path.parent.mkdir(\n                parents=True, exist_ok=True\n            )\n\n    def get_system_prompt(self, system_prompt_path: str) -> str:\n        \"\"\"Load the system prompt from a file.\"\"\"\n        with open(system_prompt_path, \"r\") as f:\n            return f.read()\n\n    def get_user_prompt_template(self, prompt_template_path: str) -> Template:\n        \"\"\"Load the user prompt template from a file.\"\"\"\n        with open(prompt_template_path, \"r\") as f:\n            return Template(f.read())\n\n    def remove_audio_files_and_reset_db(self, post_id: Optional[int]) -> None:\n        \"\"\"\n        Removes unprocessed/processed audio for the given post from disk,\n        and resets the DB fields so the next run will re-download the files.\n        \"\"\"\n        if post_id is None:\n            return\n\n        post = self.db_session.get(Post, post_id)\n        if not post:\n            self.logger.warning(\n                f\"Could not find Post with ID {post_id} to remove files.\"\n            )\n            return\n\n        if post.unprocessed_audio_path and os.path.isfile(post.unprocessed_audio_path):\n            try:\n                os.remove(post.unprocessed_audio_path)\n                self.logger.info(\n                    f\"Removed unprocessed file: {post.unprocessed_audio_path}\"\n                )\n            except OSError as e:\n                self.logger.error(\n                    f\"Failed to remove unprocessed file '{post.unprocessed_audio_path}': {e}\"\n                )\n\n        if post.processed_audio_path and os.path.isfile(post.processed_audio_path):\n            try:\n                os.remove(post.processed_audio_path)\n                
self.logger.info(f\"Removed processed file: {post.processed_audio_path}\")\n            except OSError as e:\n                self.logger.error(\n                    f\"Failed to remove processed file '{post.processed_audio_path}': {e}\"\n                )\n\n        result = writer_client.update(\n            \"Post\",\n            post.id,\n            {\"unprocessed_audio_path\": None, \"processed_audio_path\": None},\n            wait=True,\n        )\n        if not result or not result.success:\n            raise RuntimeError(getattr(result, \"error\", \"Failed to update post\"))\n\n    def _remove_unprocessed_audio(self, post: Post) -> None:\n        \"\"\"\n        Delete the downloaded source audio and clear its DB reference.\n\n        Used after we have a finalized processed file so stale downloads do not\n        accumulate on disk.\n        \"\"\"\n        path = post.unprocessed_audio_path\n        if not path:\n            return\n\n        if os.path.isfile(path):\n            try:\n                os.remove(path)\n                self.logger.info(\"Removed unprocessed file after processing: %s\", path)\n            except OSError as exc:  # best-effort cleanup\n                self.logger.warning(\n                    \"Failed to remove unprocessed file '%s': %s\", path, exc\n                )\n        post.unprocessed_audio_path = None\n\n    def _check_existing_processed_audio(self, post: Post) -> bool:\n        \"\"\"\n        Check if processed audio already exists, either in database or on disk.\n        Updates the database path if found on disk.\n\n        Returns:\n            True if processed audio exists and is valid, False otherwise\n        \"\"\"\n        # If we have a path in the database, check if the file actually exists\n        if post.processed_audio_path is not None:\n            if (\n                os.path.exists(post.processed_audio_path)\n                and os.path.getsize(post.processed_audio_path) > 0\n            ):\n 
               self.logger.info(\n                    f\"Processed audio already available at: {post.processed_audio_path}\"\n                )\n                return True\n            self.logger.info(\n                f\"Database path {post.processed_audio_path} doesn't exist or is empty, resetting\"\n            )\n            result = writer_client.update(\n                \"Post\", post.id, {\"processed_audio_path\": None}, wait=True\n            )\n            if not result or not result.success:\n                raise RuntimeError(getattr(result, \"error\", \"Failed to update post\"))\n\n        # Check if file exists on disk at expected location\n        safe_feed_title = sanitize_title(post.feed.title)\n        safe_post_title = sanitize_title(post.title)\n        expected_processed_path = (\n            get_srv_root() / safe_feed_title / f\"{safe_post_title}.mp3\"\n        )\n\n        if (\n            expected_processed_path.exists()\n            and expected_processed_path.stat().st_size > 0\n        ):\n            # Found a local processed file\n            processed_path_str = str(expected_processed_path.resolve())\n            self.logger.info(\n                f\"Found existing processed audio for post '{post.title}' at '{processed_path_str}'. \"\n                \"Updating the database path.\"\n            )\n            result = writer_client.update(\n                \"Post\",\n                post.id,\n                {\"processed_audio_path\": processed_path_str},\n                wait=True,\n            )\n            if not result or not result.success:\n                raise RuntimeError(getattr(result, \"error\", \"Failed to update post\"))\n            return True\n\n        return False\n\n\nclass ProcessorException(Exception):\n    \"\"\"Exception raised for podcast processing errors.\"\"\"\n"
  },
  {
    "path": "src/podcast_processor/processing_status_manager.py",
    "content": "import logging\nimport uuid\nfrom datetime import datetime\nfrom typing import Any, Optional, cast\n\nfrom sqlalchemy.orm import object_session\n\nfrom app.models import ProcessingJob\nfrom app.writer.client import writer_client\n\n\nclass ProcessingStatusManager:\n    \"\"\"\n    Manages processing job status, creation, updates, and cleanup.\n    Handles all database operations related to job tracking via Writer Service.\n    \"\"\"\n\n    def __init__(self, db_session: Any, logger: Optional[logging.Logger] = None):\n        self.db_session = db_session\n        self.logger = logger or logging.getLogger(__name__)\n\n    def generate_job_id(self) -> str:\n        \"\"\"Generate a unique job ID.\"\"\"\n        return str(uuid.uuid4())\n\n    def create_job(\n        self,\n        post_guid: str,\n        job_id: str,\n        run_id: Optional[str] = None,\n        *,\n        requested_by_user_id: Optional[int] = None,\n        billing_user_id: Optional[int] = None,\n    ) -> ProcessingJob:\n        \"\"\"Create a new pending job record for the provided post.\"\"\"\n        job_data = {\n            \"id\": job_id,\n            \"jobs_manager_run_id\": run_id,\n            \"post_guid\": post_guid,\n            \"status\": \"pending\",\n            \"current_step\": 0,\n            \"total_steps\": 4,\n            \"progress_percentage\": 0.0,\n            \"created_at\": datetime.utcnow().isoformat(),\n            \"requested_by_user_id\": requested_by_user_id,\n            \"billing_user_id\": billing_user_id,\n        }\n\n        writer_client.action(\"create_job\", {\"job_data\": job_data}, wait=True)\n\n        self.db_session.expire_all()\n        job = self.db_session.get(ProcessingJob, job_id)\n        if not job:\n            raise RuntimeError(f\"Failed to create job {job_id}\")\n        return cast(ProcessingJob, job)\n\n    def cancel_existing_jobs(self, post_guid: str, current_job_id: str) -> None:\n        \"\"\"Delete any existing 
active jobs for this post.\"\"\"\n        writer_client.action(\n            \"cancel_existing_jobs\",\n            {\"post_guid\": post_guid, \"current_job_id\": current_job_id},\n            wait=True,\n        )\n        self.db_session.expire_all()\n\n    def update_job_status(\n        self,\n        job: ProcessingJob,\n        status: str,\n        step: int,\n        step_name: str,\n        progress: Optional[float] = None,\n    ) -> None:\n        \"\"\"Update job status in database.\"\"\"\n        # Cache job attributes before any operations that might expire the object\n        job_id = job.id\n        post_guid = job.post_guid\n        total_steps = job.total_steps\n        is_bound = object_session(job) is not None\n\n        self.logger.info(\n            \"[JOB_STATUS_UPDATE] job_id=%s status=%s step=%s step_name=%s bound=%s\",\n            job_id,\n            status,\n            step,\n            step_name,\n            is_bound,\n        )\n\n        if progress is None:\n            progress = (step / total_steps) * 100.0\n\n        writer_client.action(\n            \"update_job_status\",\n            {\n                \"job_id\": job_id,\n                \"status\": status,\n                \"step\": step,\n                \"step_name\": step_name,\n                \"progress\": progress,\n            },\n            wait=True,\n        )\n\n        self.db_session.expire_all()\n\n        if status in {\"failed\", \"cancelled\"}:\n            self.logger.error(\n                \"[JOB_STATUS_ERROR] job_id=%s post_guid=%s status=%s step=%s step_name=%s progress=%.2f\",\n                job_id,\n                post_guid,  # cached above; job may be expired/detached after expire_all\n                status,\n                step,\n                step_name,\n                progress,\n            )\n\n    def mark_cancelled(self, job_id: str, error_message: Optional[str] = None) -> None:\n        writer_client.action(\n            \"mark_cancelled\", {\"job_id\": 
job_id, \"reason\": error_message}, wait=True\n        )\n        self.db_session.expire_all()\n        self.logger.info(f\"Successfully cancelled job {job_id}\")\n"
  },
  {
    "path": "src/podcast_processor/prompt.py",
    "content": "from typing import List\n\nfrom podcast_processor.cue_detector import CueDetector\nfrom podcast_processor.model_output import AdSegmentPrediction, AdSegmentPredictionList\nfrom podcast_processor.transcribe import Segment\n\nDEFAULT_SYSTEM_PROMPT_PATH = \"src/system_prompt.txt\"\nDEFAULT_USER_PROMPT_TEMPLATE_PATH = \"src/user_prompt.jinja\"\n\n_cue_detector = CueDetector()\n\n\ndef transcript_excerpt_for_prompt(\n    segments: List[Segment], includes_start: bool, includes_end: bool\n) -> str:\n\n    excerpts = [\n        f\"[{segment.start}] {_cue_detector.highlight_cues(segment.text)}\"\n        for segment in segments\n    ]\n    if includes_start:\n        excerpts.insert(0, \"[TRANSCRIPT START]\")\n    if includes_end:\n        excerpts.append(\"[TRANSCRIPT END]\")\n\n    return \"\\n\".join(excerpts)\n\n\ndef generate_system_prompt() -> str:\n    valid_empty_example = AdSegmentPredictionList(ad_segments=[]).model_dump_json(\n        exclude_none=True\n    )\n\n    output_for_one_shot_example = AdSegmentPredictionList(\n        ad_segments=[\n            AdSegmentPrediction(segment_offset=59.8, confidence=0.95),\n            AdSegmentPrediction(segment_offset=64.8, confidence=0.9),\n            AdSegmentPrediction(segment_offset=73.8, confidence=0.92),\n            AdSegmentPrediction(segment_offset=77.8, confidence=0.98),\n            AdSegmentPrediction(segment_offset=79.8, confidence=0.9),\n        ],\n        content_type=\"promotional_external\",\n        confidence=0.96,\n    ).model_dump_json(exclude_none=True)\n\n    example_output_for_prompt = output_for_one_shot_example.strip()\n\n    one_shot_transcript_example = transcript_excerpt_for_prompt(\n        [\n            Segment(start=53.8, end=-1, text=\"That's all coming after the break.\"),\n            Segment(\n                start=59.8,\n                end=-1,\n                text=\"On this week's episode of Wildcard, actor Chris Pine tells \"\n                \"us, it's okay not 
to be perfect.\",\n            ),\n            Segment(\n                start=64.8,\n                end=-1,\n                text=\"My film got absolutely decimated when it premiered, which \"\n                \"brings up for me one of my primary triggers or whatever it was \"\n                \"like, not being liked.\",\n            ),\n            Segment(\n                start=73.8,\n                end=-1,\n                text=\"I'm Rachel Martin, Chris Pine on How to Find Joy in Imperfection.\",\n            ),\n            Segment(\n                start=77.8,\n                end=-1,\n                text=\"That's on the new podcast, Wildcard.\",\n            ),\n            Segment(\n                start=79.8,\n                end=-1,\n                text=\"The Game Where Cards control the conversation.\",\n            ),\n            Segment(\n                start=83.8,\n                end=-1,\n                text=\"And welcome back to the show, today we're talking to Professor Hopkins\",\n            ),\n        ],\n        includes_start=False,\n        includes_end=False,\n    )\n\n    technical_example = transcript_excerpt_for_prompt(\n        [\n            Segment(\n                start=4762.7,\n                end=-1,\n                text=\"Our brains are configured differently.\",\n            ),\n            Segment(\n                start=4765.6,\n                end=-1,\n                text=\"My brain is configured perfectly for Ruby, perfectly for a dynamically typed language.\",\n            ),\n            Segment(\n                start=4831.3,\n                end=-1,\n                text=\"Shopify exists at a scale most programmers never touch, and it still runs on Rails.\",\n            ),\n            Segment(start=4933.2, end=-1, text=\"Shopify.com has supported this show.\"),\n        ],\n        includes_start=False,\n        includes_end=False,\n    )\n\n    # pylint: disable=line-too-long\n    return f\"\"\"Your job is 
to identify advertisements in podcast transcript excerpts with high precision, continuity awareness, and content-context sensitivity.\n\nCRITICAL: distinguish external sponsor ads from technical discussion and self-promotion.\n\nCONTENT-AWARE TAXONOMY:\n- technical_discussion: Educational content, case studies, implementation details. Company names may appear as examples; do not mark as ads.\n- educational/self_promo: Host discussing their own products, newsletters, funds, or courses (may include CTAs but are first-party).\n- promotional_external: True sponsor ads for external companies with sales intent, URLs, promo codes, or explicit offers.\n- transition: Brief bumpers that connect to or from ads; include if they are part of an ad block.\n\nJSON CONTRACT (strict):\n- Always respond with: {{\"ad_segments\": [...], \"content_type\": \"<taxonomy>\", \"confidence\": <0.0-1.0>}}\n- Each ad_segments item must be: {{\"segment_offset\": <seconds.float>, \"confidence\": <0.0-1.0>}}\n- If there are no ads, respond with: {valid_empty_example} (no extra keys).\n\nDURATION AND CUE GUIDANCE:\n- Ads are typically 15–120 seconds and contain CTAs, URLs/domains, promo/discount codes, phone numbers, or phrases like \"brought to you by\".\n- Integrated ads can be longer but maintain sales intent; continuous mention of the same sponsor for >3 minutes without CTAs is likely educational/self_promo.\n- Pre-roll/mid-roll/post-roll intros (\"a word from our sponsor\") and quick outros (\"back to the show\") belong to the ad block.\n\nDECISION RULES:\n1) Continuous ads: once an ad starts, follow it to its natural conclusion; include 1–5 second transitions.\n2) Strong cues: treat URLs/domains, promo/discount language, and phone numbers as strong sponsor indicators.\n3) Self-promotion guardrail: host promoting their own products/platforms → classify as educational/self_promo with lower confidence unless explicit external sponsorship language is present.\n4) Boundary bias: if later segments 
clearly form an ad for a sponsor, pull in the prior two intro/transition lines as ad content.\n5) Prefer labeling as content unless multiple strong ad cues appear with clear external branding.\n\nThis transcript excerpt is broken into segments starting with a timestamp [X] (seconds). Output every segment that is advertisement content.\n\nExample (external sponsor with CTA):\n{one_shot_transcript_example}\nOutput: {example_output_for_prompt}\n\nExample (technical mention, not an ad):\n{technical_example}\nOutput: {{\"ad_segments\": [{{\"segment_offset\": 4933.2, \"confidence\": 0.75}}], \"content_type\": \"technical_discussion\", \"confidence\": 0.45}}\n\\n\\n\"\"\"\n"
  },
  {
    "path": "src/podcast_processor/token_rate_limiter.py",
    "content": "\"\"\"\nToken-based rate limiting for LLM API calls.\n\nThis module provides client-side rate limiting based on input token consumption\nto prevent hitting API provider rate limits (e.g., Anthropic's 30,000 tokens/minute).\n\"\"\"\n\nimport logging\nimport threading\nimport time\nfrom collections import deque\nfrom datetime import datetime\nfrom typing import Dict, List, Optional, Tuple, Union\n\nlogger = logging.getLogger(__name__)\n\n\nclass TokenRateLimiter:\n    \"\"\"\n    Client-side rate limiter that tracks token usage over time windows.\n\n    Prevents hitting API rate limits by calculating token usage and waiting\n    when necessary before making API calls.\n    \"\"\"\n\n    def __init__(self, tokens_per_minute: int = 30000, window_minutes: int = 1):\n        \"\"\"\n        Initialize the rate limiter.\n\n        Args:\n            tokens_per_minute: Maximum tokens allowed per minute\n            window_minutes: Time window for rate limiting (default: 1 minute)\n        \"\"\"\n        self.tokens_per_minute = tokens_per_minute\n        self.window_seconds = window_minutes * 60\n        self.token_usage: deque[Tuple[float, int]] = (\n            deque()\n        )  # [(timestamp, token_count), ...]\n        self.lock = threading.Lock()\n\n        logger.info(\n            f\"Initialized TokenRateLimiter: {tokens_per_minute} tokens/{window_minutes}min\"\n        )\n\n    def count_tokens(self, messages: List[Dict[str, str]], model: str) -> int:\n        \"\"\"\n        Count tokens in messages using litellm's token counting.\n\n        Args:\n            messages: List of message dicts with 'role' and 'content'\n            model: Model name for accurate token counting\n\n        Returns:\n            Number of input tokens\n        \"\"\"\n        try:\n            # Simple token estimation: ~4 characters per token\n            total_chars = sum(len(msg.get(\"content\", \"\")) for msg in messages)\n            estimated_tokens = 
total_chars // 4\n            logger.debug(f\"Estimated {estimated_tokens} tokens for model {model}\")\n            return estimated_tokens\n        except Exception as e:\n            # Fallback: conservative estimate\n            logger.warning(f\"Token counting failed, using fallback. Error: {e}\")\n            return 1000  # Conservative fallback\n\n    def _cleanup_old_usage(self, current_time: float) -> None:\n        \"\"\"Remove token usage records outside the time window.\"\"\"\n        cutoff_time = current_time - self.window_seconds\n        while self.token_usage and self.token_usage[0][0] < cutoff_time:\n            self.token_usage.popleft()\n\n    def _get_current_usage(self, current_time: float) -> int:\n        \"\"\"Get total token usage within the current time window.\"\"\"\n        self._cleanup_old_usage(current_time)\n        return sum(count for _, count in self.token_usage)\n\n    def check_rate_limit(\n        self, messages: List[Dict[str, str]], model: str\n    ) -> Tuple[bool, float]:\n        \"\"\"\n        Check if we can make an API call without hitting rate limits.\n\n        Args:\n            messages: Messages to send to the API\n            model: Model name\n\n        Returns:\n            Tuple of (can_proceed, wait_seconds)\n            - can_proceed: True if call can be made immediately\n            - wait_seconds: Seconds to wait if can_proceed is False\n        \"\"\"\n        token_count = self.count_tokens(messages, model)\n        current_time = time.time()\n\n        with self.lock:\n            current_usage = self._get_current_usage(current_time)\n\n            # Check if adding this request would exceed the limit\n            if current_usage + token_count <= self.tokens_per_minute:\n                return True, 0.0\n\n            # Calculate wait time: find when oldest tokens will expire\n            if not self.token_usage:\n                return True, 0.0\n\n            oldest_time = self.token_usage[0][0]\n     
       wait_seconds = (oldest_time + self.window_seconds) - current_time\n            wait_seconds = max(0, wait_seconds)\n\n            logger.info(\n                f\"Rate limit check: current={current_usage}, \"\n                f\"requested={token_count}, \"\n                f\"limit={self.tokens_per_minute}, \"\n                f\"wait={wait_seconds:.1f}s\"\n            )\n\n            return False, wait_seconds\n\n    def record_usage(self, messages: List[Dict[str, str]], model: str) -> None:\n        \"\"\"\n        Record token usage for a successful API call.\n\n        Args:\n            messages: Messages that were sent to the API\n            model: Model name that was used\n        \"\"\"\n        token_count = self.count_tokens(messages, model)\n        current_time = time.time()\n\n        with self.lock:\n            self.token_usage.append((current_time, token_count))\n            logger.debug(\n                f\"Recorded {token_count} tokens at {datetime.fromtimestamp(current_time)}\"\n            )\n\n    def wait_if_needed(self, messages: List[Dict[str, str]], model: str) -> None:\n        \"\"\"\n        Wait if necessary to avoid hitting rate limits, then record usage.\n\n        Args:\n            messages: Messages to send to the API\n            model: Model name\n        \"\"\"\n        can_proceed, wait_seconds = self.check_rate_limit(messages, model)\n\n        if not can_proceed and wait_seconds > 0:\n            logger.info(\n                f\"Rate limiting: waiting {wait_seconds:.1f}s to avoid API limits\"\n            )\n            time.sleep(wait_seconds)\n\n        # Record the usage immediately before making the call\n        self.record_usage(messages, model)\n\n    def get_usage_stats(self) -> Dict[str, Union[int, float]]:\n        \"\"\"Get current usage statistics.\"\"\"\n        current_time = time.time()\n        with self.lock:\n            current_usage = self._get_current_usage(current_time)\n            
usage_percentage = (current_usage / self.tokens_per_minute) * 100\n\n            return {\n                \"current_usage\": current_usage,\n                \"limit\": self.tokens_per_minute,\n                \"usage_percentage\": usage_percentage,\n                \"window_seconds\": self.window_seconds,\n                \"active_records\": len(self.token_usage),\n            }\n\n\n# Global rate limiter instance\n_RATE_LIMITER: Optional[TokenRateLimiter] = None  # pylint: disable=invalid-name\n\n\ndef get_rate_limiter(tokens_per_minute: int = 30000) -> TokenRateLimiter:\n    \"\"\"Get or create the global rate limiter instance.\"\"\"\n    global _RATE_LIMITER  # pylint: disable=global-statement\n    if _RATE_LIMITER is None or _RATE_LIMITER.tokens_per_minute != tokens_per_minute:\n        _RATE_LIMITER = TokenRateLimiter(tokens_per_minute=tokens_per_minute)\n    return _RATE_LIMITER\n\n\ndef configure_rate_limiter_for_model(model: str) -> TokenRateLimiter:\n    \"\"\"\n    Configure rate limiter with appropriate limits for the given model.\n\n    Args:\n        model: Model name (e.g., \"anthropic/claude-sonnet-4-20250514\")\n\n    Returns:\n        Configured TokenRateLimiter instance\n    \"\"\"\n    # Model-specific rate limits (tokens per minute)\n    model_limits = {\n        # Anthropic models\n        \"anthropic/claude-3-5-sonnet-20240620\": 30000,\n        \"anthropic/claude-sonnet-4-20250514\": 30000,\n        \"anthropic/claude-3-opus-20240229\": 30000,\n        # OpenAI models\n        \"gpt-4o-mini\": 200000,\n        \"gpt-4o\": 150000,\n        \"gpt-4\": 40000,\n        # Google Gemini models\n        \"gemini/gemini-3-flash-preview\": 60000,\n        \"gemini/gemini-2.5-flash\": 60000,\n        \"gemini/gemini-2.5-pro\": 30000,\n    }\n\n    # Extract base model name and find limit\n    tokens_per_minute = 30000  # Conservative default\n    for model_pattern, limit in model_limits.items():\n        if model_pattern in model:\n            
tokens_per_minute = limit\n            break\n\n    logger.info(\n        f\"Configured rate limiter for {model}: {tokens_per_minute} tokens/minute\"\n    )\n    return get_rate_limiter(tokens_per_minute)\n"
  },
  {
    "path": "src/podcast_processor/transcribe.py",
    "content": "import logging\nimport shutil\nimport time\nfrom abc import ABC, abstractmethod\nfrom pathlib import Path\nfrom typing import Any, List\n\nfrom groq import Groq\nfrom openai import OpenAI\nfrom openai.types.audio.transcription_segment import TranscriptionSegment\nfrom pydantic import BaseModel\n\nfrom podcast_processor.audio import split_audio\nfrom shared.config import GroqWhisperConfig, RemoteWhisperConfig\n\n\nclass Segment(BaseModel):\n    start: float\n    end: float\n    text: str\n\n\nclass Transcriber(ABC):\n\n    @property\n    @abstractmethod\n    def model_name(self) -> str:\n        pass\n\n    @abstractmethod\n    def transcribe(self, audio_file_path: str) -> List[Segment]:\n        pass\n\n\nclass LocalTranscriptSegment(BaseModel):\n    id: int\n    seek: int\n    start: float\n    end: float\n    text: str\n    tokens: List[int]\n    temperature: float\n    avg_logprob: float\n    compression_ratio: float\n    no_speech_prob: float\n\n    def to_segment(self) -> Segment:\n        return Segment(start=self.start, end=self.end, text=self.text)\n\n\nclass TestWhisperTranscriber(Transcriber):\n\n    def __init__(self, logger: logging.Logger):\n        self.logger = logger\n\n    @property\n    def model_name(self) -> str:\n        return \"test_whisper\"\n\n    def transcribe(self, _: str) -> List[Segment]:\n        self.logger.info(\"Using test whisper\")\n        return [\n            Segment(start=0, end=1, text=\"This is a test\"),\n            Segment(start=1, end=2, text=\"This is another test\"),\n        ]\n\n\nclass LocalWhisperTranscriber(Transcriber):\n\n    def __init__(self, logger: logging.Logger, whisper_model: str):\n        self.logger = logger\n        self.whisper_model = whisper_model\n\n    @property\n    def model_name(self) -> str:\n        return f\"local_{self.whisper_model}\"\n\n    @staticmethod\n    def convert_to_pydantic(\n        transcript_data: List[Any],\n    ) -> List[LocalTranscriptSegment]:\n        
return [LocalTranscriptSegment(**item) for item in transcript_data]\n\n    @staticmethod\n    def local_seg_to_seg(local_segments: List[LocalTranscriptSegment]) -> List[Segment]:\n        return [seg.to_segment() for seg in local_segments]\n\n    def transcribe(self, audio_file_path: str) -> List[Segment]:\n        # Import whisper only when needed to avoid CUDA dependencies during module import\n        try:\n            import whisper  # type: ignore[import-untyped]\n        except ImportError as e:\n            self.logger.error(f\"Failed to import whisper: {e}\")\n            raise ImportError(\n                \"whisper library is required for LocalWhisperTranscriber\"\n            ) from e\n\n        self.logger.info(\"Using local whisper\")\n        models = whisper.available_models()\n        self.logger.info(f\"Available models: {models}\")\n\n        model = whisper.load_model(name=self.whisper_model)\n\n        self.logger.info(\"Beginning transcription\")\n        start = time.time()\n        result = model.transcribe(audio_file_path, fp16=False, language=\"English\")\n        end = time.time()\n        elapsed = end - start\n        self.logger.info(f\"Transcription completed in {elapsed}\")\n        segments = result[\"segments\"]\n        typed_segments = self.convert_to_pydantic(segments)\n\n        return self.local_seg_to_seg(typed_segments)\n\n\nclass OpenAIWhisperTranscriber(Transcriber):\n\n    def __init__(self, logger: logging.Logger, config: RemoteWhisperConfig):\n        self.logger = logger\n        self.config = config\n\n        self.openai_client = OpenAI(\n            base_url=config.base_url,\n            api_key=config.api_key,\n            timeout=config.timeout_sec,\n        )\n\n    @property\n    def model_name(self) -> str:\n        return self.config.model  # e.g. 
\"whisper-1\"\n\n    def transcribe(self, audio_file_path: str) -> List[Segment]:\n        self.logger.info(\n            \"[WHISPER_REMOTE] Starting remote whisper transcription for: %s\",\n            audio_file_path,\n        )\n        audio_chunk_path = audio_file_path + \"_parts\"\n\n        chunks = split_audio(\n            Path(audio_file_path),\n            Path(audio_chunk_path),\n            self.config.chunksize_mb * 1024 * 1024,\n        )\n\n        self.logger.info(\"[WHISPER_REMOTE] Processing %d chunks\", len(chunks))\n        all_segments: List[TranscriptionSegment] = []\n\n        for idx, chunk in enumerate(chunks):\n            chunk_path, offset = chunk\n            self.logger.info(\n                \"[WHISPER_REMOTE] Processing chunk %d/%d: %s\",\n                idx + 1,\n                len(chunks),\n                chunk_path,\n            )\n            segments = self.get_segments_for_chunk(str(chunk_path))\n            self.logger.info(\n                \"[WHISPER_REMOTE] Chunk %d/%d complete: %d segments\",\n                idx + 1,\n                len(chunks),\n                len(segments),\n            )\n            all_segments.extend(self.add_offset_to_segments(segments, offset))\n\n        shutil.rmtree(audio_chunk_path)\n        self.logger.info(\n            \"[WHISPER_REMOTE] Transcription complete: %d total segments\",\n            len(all_segments),\n        )\n        return self.convert_segments(all_segments)\n\n    @staticmethod\n    def convert_segments(segments: List[TranscriptionSegment]) -> List[Segment]:\n        return [\n            Segment(\n                start=seg.start,\n                end=seg.end,\n                text=seg.text,\n            )\n            for seg in segments\n        ]\n\n    @staticmethod\n    def add_offset_to_segments(\n        segments: List[TranscriptionSegment], offset_ms: int\n    ) -> List[TranscriptionSegment]:\n        offset_sec = float(offset_ms) / 1000.0\n        for 
segment in segments:\n            segment.start += offset_sec\n            segment.end += offset_sec\n\n        return segments\n\n    def get_segments_for_chunk(self, chunk_path: str) -> List[TranscriptionSegment]:\n        with open(chunk_path, \"rb\") as f:\n            self.logger.info(\n                \"[WHISPER_API_CALL] Sending chunk to API: %s (timeout=%ds)\",\n                chunk_path,\n                self.config.timeout_sec,\n            )\n\n            transcription = self.openai_client.audio.transcriptions.create(\n                model=self.config.model,\n                file=f,\n                timestamp_granularities=[\"segment\"],\n                language=self.config.language,\n                response_format=\"verbose_json\",\n            )\n\n            self.logger.debug(\"Got transcription\")\n\n            segments = transcription.segments\n            assert segments is not None\n\n            self.logger.debug(f\"Got {len(segments)} segments\")\n\n            return segments\n\n\nclass GroqTranscriptionSegment(BaseModel):\n    start: float\n    end: float\n    text: str\n\n\nclass GroqWhisperTranscriber(Transcriber):\n\n    def __init__(self, logger: logging.Logger, config: GroqWhisperConfig):\n        self.logger = logger\n        self.config = config\n        self.client = Groq(\n            api_key=config.api_key,\n            max_retries=config.max_retries,\n        )\n\n    @property\n    def model_name(self) -> str:\n        return f\"groq_{self.config.model}\"\n\n    def transcribe(self, audio_file_path: str) -> List[Segment]:\n        self.logger.info(\n            \"[WHISPER_GROQ] Starting Groq whisper transcription for: %s\",\n            audio_file_path,\n        )\n        audio_chunk_path = audio_file_path + \"_parts\"\n\n        chunks = split_audio(\n            Path(audio_file_path), Path(audio_chunk_path), 12 * 1024 * 1024\n        )\n\n        self.logger.info(\"[WHISPER_GROQ] Processing %d chunks\", len(chunks))\n     
   all_segments: List[GroqTranscriptionSegment] = []\n\n        for idx, chunk in enumerate(chunks):\n            chunk_path, offset = chunk\n            self.logger.info(\n                \"[WHISPER_GROQ] Processing chunk %d/%d: %s\",\n                idx + 1,\n                len(chunks),\n                chunk_path,\n            )\n            segments = self.get_segments_for_chunk(str(chunk_path))\n            self.logger.info(\n                \"[WHISPER_GROQ] Chunk %d/%d complete: %d segments\",\n                idx + 1,\n                len(chunks),\n                len(segments),\n            )\n            all_segments.extend(self.add_offset_to_segments(segments, offset))\n\n        shutil.rmtree(audio_chunk_path)\n        self.logger.info(\n            \"[WHISPER_GROQ] Transcription complete: %d total segments\",\n            len(all_segments),\n        )\n        return self.convert_segments(all_segments)\n\n    @staticmethod\n    def convert_segments(segments: List[GroqTranscriptionSegment]) -> List[Segment]:\n        return [\n            Segment(\n                start=seg.start,\n                end=seg.end,\n                text=seg.text,\n            )\n            for seg in segments\n        ]\n\n    @staticmethod\n    def add_offset_to_segments(\n        segments: List[GroqTranscriptionSegment], offset_ms: int\n    ) -> List[GroqTranscriptionSegment]:\n        offset_sec = float(offset_ms) / 1000.0\n        for segment in segments:\n            segment.start += offset_sec\n            segment.end += offset_sec\n\n        return segments\n\n    def get_segments_for_chunk(self, chunk_path: str) -> List[GroqTranscriptionSegment]:\n\n        self.logger.info(\"[GROQ_API_CALL] Sending chunk to Groq API: %s\", chunk_path)\n        transcription = self.client.audio.transcriptions.create(\n            file=Path(chunk_path),\n            model=self.config.model,\n            response_format=\"verbose_json\",  # Ensure segments are included\n            
language=self.config.language,\n        )\n        self.logger.info(\n            \"[GROQ_API_CALL] Received response from Groq API for: %s\", chunk_path\n        )\n\n        if transcription.segments is None:  # type: ignore [attr-defined]\n            self.logger.warning(\n                \"[GROQ_API_CALL] No segments found in transcription for %s\", chunk_path\n            )\n            return []\n\n        groq_segments = [\n            GroqTranscriptionSegment(\n                start=seg[\"start\"], end=seg[\"end\"], text=seg[\"text\"]\n            )\n            for seg in transcription.segments  # type: ignore [attr-defined]\n        ]\n\n        self.logger.info(\n            \"[GROQ_API_CALL] Got %d segments from chunk\", len(groq_segments)\n        )\n        return groq_segments\n"
  },
  {
    "path": "src/podcast_processor/transcription_manager.py",
    "content": "import logging\nfrom typing import Any, List, Optional\n\nfrom app.extensions import db\nfrom app.models import ModelCall, Post, TranscriptSegment\nfrom app.writer.client import writer_client\nfrom shared.config import (\n    Config,\n    GroqWhisperConfig,\n    LocalWhisperConfig,\n    RemoteWhisperConfig,\n    TestWhisperConfig,\n)\n\nfrom .transcribe import (\n    GroqWhisperTranscriber,\n    LocalWhisperTranscriber,\n    OpenAIWhisperTranscriber,\n    TestWhisperTranscriber,\n    Transcriber,\n)\n\n\nclass TranscriptionManager:\n    \"\"\"Handles the transcription of podcast audio files.\"\"\"\n\n    def __init__(\n        self,\n        logger: logging.Logger,\n        config: Config,\n        model_call_query: Optional[Any] = None,\n        segment_query: Optional[Any] = None,\n        db_session: Optional[Any] = None,\n        transcriber: Optional[Transcriber] = None,\n    ):\n        self.logger = logger\n        self.config = config\n        self.transcriber = transcriber or self._create_transcriber()\n        self._model_call_query_provided = model_call_query is not None\n        self.model_call_query = model_call_query or ModelCall.query\n        self._segment_query_provided = segment_query is not None\n        self.segment_query = segment_query or TranscriptSegment.query\n        self.db_session = db_session or db.session\n\n    def _create_transcriber(self) -> Transcriber:\n        \"\"\"Create the appropriate transcriber based on configuration.\"\"\"\n        assert self.config.whisper is not None, (\n            \"validate_whisper_config ensures that even if old style whisper \"\n            \"config is given, it will be translated and config.whisper set.\"\n        )\n\n        if isinstance(self.config.whisper, TestWhisperConfig):\n            return TestWhisperTranscriber(self.logger)\n        if isinstance(self.config.whisper, RemoteWhisperConfig):\n            return OpenAIWhisperTranscriber(self.logger, self.config.whisper)\n   
     if isinstance(self.config.whisper, LocalWhisperConfig):\n            return LocalWhisperTranscriber(self.logger, self.config.whisper.model)\n        if isinstance(self.config.whisper, GroqWhisperConfig):\n            return GroqWhisperTranscriber(self.logger, self.config.whisper)\n        raise ValueError(f\"unhandled whisper config {self.config.whisper}\")\n\n    def _check_existing_transcription(\n        self, post: Post\n    ) -> Optional[List[TranscriptSegment]]:\n        \"\"\"Checks for existing successful transcription and returns segments if valid.\n\n        NOTE: Defaults to using self.db_session for queries to keep a single session,\n        but will honor injected model_call_query/segment_query when provided (e.g. tests).\n        \"\"\"\n        model_call_query = (\n            self.model_call_query\n            if self._model_call_query_provided\n            else self.db_session.query(ModelCall)\n        )\n        segment_query = (\n            self.segment_query\n            if self._segment_query_provided\n            else self.db_session.query(TranscriptSegment)\n        )\n\n        existing_whisper_call = (\n            model_call_query.filter_by(\n                post_id=post.id,\n                model_name=self.transcriber.model_name,\n                status=\"success\",\n            )\n            .order_by(ModelCall.timestamp.desc())\n            .first()\n        )\n\n        if existing_whisper_call:\n            self.logger.info(\n                f\"Found existing successful Whisper ModelCall {existing_whisper_call.id} for post {post.id}.\"\n            )\n            db_segments: List[TranscriptSegment] = (\n                segment_query.filter_by(post_id=post.id)\n                .order_by(TranscriptSegment.sequence_num)\n                .all()\n            )\n            if db_segments:\n                if (\n                    existing_whisper_call.last_segment_sequence_num\n                    == len(db_segments) - 1\n        
        ):\n                    self.logger.info(\n                        f\"Returning {len(db_segments)} existing transcript segments from database for post {post.id}.\"\n                    )\n                    return db_segments\n                self.logger.warning(\n                    f\"ModelCall {existing_whisper_call.id} for post {post.id} indicates {existing_whisper_call.last_segment_sequence_num + 1} segments, but found {len(db_segments)} in DB. Re-transcribing.\"\n                )\n            else:\n                self.logger.warning(\n                    f\"Successful ModelCall {existing_whisper_call.id} found for post {post.id}, but no transcript segments in DB. Re-transcribing.\"\n                )\n        else:\n            self.logger.info(\n                f\"No existing successful Whisper ModelCall found for post {post.id} with model {self.transcriber.model_name}. Proceeding to transcribe.\"\n            )\n        return None\n\n    def _get_or_create_whisper_model_call(self, post: Post) -> ModelCall:\n        \"\"\"Create or reuse the placeholder ModelCall row for a Whisper run via writer.\"\"\"\n        result = writer_client.action(\n            \"upsert_whisper_model_call\",\n            {\n                \"post_id\": post.id,\n                \"model_name\": self.transcriber.model_name,\n                \"first_segment_sequence_num\": 0,\n                \"last_segment_sequence_num\": -1,\n                \"prompt\": \"Whisper transcription job\",\n            },\n            wait=True,\n        )\n        if not result or not result.success:\n            raise RuntimeError(getattr(result, \"error\", \"Failed to upsert ModelCall\"))\n\n        model_call_id = (result.data or {}).get(\"model_call_id\")\n        if model_call_id is None:\n            raise RuntimeError(\"Writer did not return model_call_id\")\n        model_call = self.db_session.get(ModelCall, int(model_call_id))\n        if model_call is None:\n            raise 
RuntimeError(f\"ModelCall {model_call_id} not found after upsert\")\n        return model_call\n\n    def transcribe(self, post: Post) -> List[TranscriptSegment]:\n        \"\"\"\n        Transcribes a podcast audio file, or retrieves existing transcription.\n\n        Args:\n            post: The Post object containing the podcast audio to transcribe\n\n        Returns:\n            A list of TranscriptSegment objects with the transcription results\n        \"\"\"\n        self.logger.info(\n            f\"Starting transcription process for post {post.id} using {self.transcriber.model_name}\"\n        )\n\n        existing_segments = self._check_existing_transcription(post)\n        if existing_segments is not None:\n            return existing_segments\n\n        # Create or reuse the ModelCall record for this transcription attempt\n        current_whisper_call = self._get_or_create_whisper_model_call(post)\n        self.logger.info(\n            f\"Prepared Whisper ModelCall {current_whisper_call.id} for post {post.id}.\"\n        )\n\n        try:\n            self.logger.info(\n                f\"[TRANSCRIBE_START] Calling transcriber {self.transcriber.model_name} for post {post.id}, audio: {post.unprocessed_audio_path}\"\n            )\n            # Expire session state before long-running transcription to avoid stale locks\n            self.db_session.expire_all()\n\n            pydantic_segments = self.transcriber.transcribe(post.unprocessed_audio_path)\n            self.logger.info(\n                f\"[TRANSCRIBE_COMPLETE] Transcription by {self.transcriber.model_name} for post {post.id} resulted in {len(pydantic_segments)} segments.\"\n            )\n\n            segments_payload = [\n                {\n                    \"sequence_num\": i,\n                    \"start_time\": round(seg.start, 1),\n                    \"end_time\": round(seg.end, 1),\n                    \"text\": seg.text,\n                }\n                for i, seg in 
enumerate(pydantic_segments or [])\n            ]\n\n            write_res = writer_client.action(\n                \"replace_transcription\",\n                {\n                    \"post_id\": post.id,\n                    \"segments\": segments_payload,\n                    \"model_call_id\": current_whisper_call.id,\n                },\n                wait=True,\n            )\n            if not write_res or not write_res.success:\n                raise RuntimeError(\n                    getattr(write_res, \"error\", \"Failed to persist transcription\")\n                )\n\n            segment_query = (\n                self.segment_query\n                if self._segment_query_provided\n                else self.db_session.query(TranscriptSegment)\n            )\n            db_segments: List[TranscriptSegment] = (\n                segment_query.filter_by(post_id=post.id)\n                .order_by(TranscriptSegment.sequence_num)\n                .all()\n            )\n            self.logger.info(\n                f\"Successfully stored {len(db_segments)} transcript segments and updated ModelCall {current_whisper_call.id} for post {post.id}.\"\n            )\n            return db_segments\n\n        except Exception as e:\n            self.logger.error(\n                f\"Transcription failed for post {post.id} using {self.transcriber.model_name}. 
Error: {e}\",\n                exc_info=True,\n            )\n\n            fail_res = writer_client.action(\n                \"mark_model_call_failed\",\n                {\n                    \"model_call_id\": current_whisper_call.id,\n                    \"error_message\": str(e),\n                    \"status\": \"failed_permanent\",\n                },\n                wait=True,\n            )\n            if not fail_res or not fail_res.success:\n                self.logger.error(\n                    \"Failed to mark ModelCall %s as failed via writer: %s\",\n                    current_whisper_call.id,\n                    getattr(fail_res, \"error\", None),\n                )\n\n            raise\n"
  },
  {
    "path": "src/podcast_processor/word_boundary_refiner.py",
    "content": "\"\"\"LLM-based word-boundary refiner.\n\nNote: We intentionally share some call-setup patterns with BoundaryRefiner.\nPylint may flag these as R0801 (duplicate-code); we ignore that for this module.\n\"\"\"\n\n# pylint: disable=duplicate-code\n\nimport json\nimport logging\nimport re\nfrom dataclasses import dataclass\nfrom pathlib import Path\nfrom typing import Any, Dict, List, Optional, Tuple, cast\n\nimport litellm\nfrom jinja2 import Template\n\nfrom podcast_processor.llm_model_call_utils import (\n    extract_litellm_content,\n    render_prompt_and_upsert_model_call,\n    try_update_model_call,\n)\nfrom shared.config import Config\n\n# Keep the same internal bounds as the existing BoundaryRefiner.\nMAX_START_EXTENSION_SECONDS = 30.0\nMAX_END_EXTENSION_SECONDS = 15.0\n\n\n@dataclass\nclass WordBoundaryRefinement:\n    refined_start: float\n    refined_end: float\n    start_adjustment_reason: str\n    end_adjustment_reason: str\n\n\nclass WordBoundaryRefiner:\n    \"\"\"Refine ad start boundary by finding the first ad word and estimating its time.\n\n    This refiner is intentionally heuristic-timed because we only have segment-level\n    timestamps today.\n    \"\"\"\n\n    def __init__(self, config: Config, logger: Optional[logging.Logger] = None):\n        self.config = config\n        self.logger = logger or logging.getLogger(__name__)\n        self.template = self._load_template()\n\n    def _load_template(self) -> Template:\n        path = (\n            Path(__file__).resolve().parent.parent  # project src root\n            / \"word_boundary_refinement_prompt.jinja\"\n        )\n        if path.exists():\n            return Template(path.read_text())\n        return Template(\n            \"\"\"Find start/end phrases for the ad break.\nAd: {{ad_start}}s-{{ad_end}}s\n{% for seg in context_segments %}[seq={{seg.sequence_num}} start={{seg.start_time}} end={{seg.end_time}}] {{seg.text}}\n{% endfor %}\n    Return JSON: 
{\"refined_start_segment_seq\": 0, \"refined_start_phrase\": \"\", \"refined_end_segment_seq\": 0, \"refined_end_phrase\": \"\", \"start_adjustment_reason\": \"\", \"end_adjustment_reason\": \"\"}\n\"\"\"\n        )\n\n    def refine(\n        self,\n        ad_start: float,\n        ad_end: float,\n        confidence: float,\n        all_segments: List[Dict[str, Any]],\n        *,\n        post_id: Optional[int] = None,\n        first_seq_num: Optional[int] = None,\n        last_seq_num: Optional[int] = None,\n    ) -> WordBoundaryRefinement:\n        context = self._get_context(\n            ad_start,\n            ad_end,\n            all_segments,\n            first_seq_num=first_seq_num,\n            last_seq_num=last_seq_num,\n        )\n\n        prompt, model_call_id = render_prompt_and_upsert_model_call(\n            template=self.template,\n            ad_start=ad_start,\n            ad_end=ad_end,\n            confidence=confidence,\n            context_segments=context,\n            post_id=post_id,\n            first_seq_num=first_seq_num,\n            last_seq_num=last_seq_num,\n            model_name=self.config.llm_model,\n            logger=self.logger,\n            log_prefix=\"Word boundary refine\",\n        )\n\n        raw_response: Optional[str] = None\n\n        try:\n            response = litellm.completion(\n                model=self.config.llm_model,\n                messages=[{\"role\": \"user\", \"content\": prompt}],\n                temperature=0.1,\n                max_tokens=2048,\n                timeout=self.config.openai_timeout,\n                api_key=self.config.llm_api_key,\n                base_url=self.config.openai_base_url,\n            )\n\n            content = extract_litellm_content(response)\n            raw_response = content\n            self._update_model_call(\n                model_call_id,\n                status=\"received_response\",\n                response=raw_response,\n                
error_message=None,\n            )\n\n            parsed = self._parse_json(content)\n            if not parsed:\n                self.logger.warning(\n                    \"Word boundary refine: no parseable JSON; falling back to original start\",\n                    extra={\"content_preview\": (content or \"\")[:200]},\n                )\n                self._update_model_call(\n                    model_call_id,\n                    status=\"success_heuristic\",\n                    response=raw_response,\n                    error_message=\"parse_failed\",\n                )\n                return self._fallback(ad_start, ad_end)\n\n            payload = self._extract_payload(parsed)\n\n            refined_start, start_changed, start_reason, start_err = self._refine_start(\n                ad_start=ad_start,\n                all_segments=all_segments,\n                context_segments=context,\n                start_segment_seq=payload[\"start_segment_seq\"],\n                start_phrase=payload[\"start_phrase\"],\n                start_word=payload[\"start_word\"],\n                start_occurrence=payload[\"start_occurrence\"],\n                start_word_index=payload[\"start_word_index\"],\n                start_reason=payload[\"start_reason\"],\n            )\n            refined_end, end_changed, end_reason, end_err = self._refine_end(\n                ad_end=ad_end,\n                all_segments=all_segments,\n                context_segments=context,\n                end_segment_seq=payload[\"end_segment_seq\"],\n                end_phrase=payload[\"end_phrase\"],\n                end_reason=payload[\"end_reason\"],\n            )\n\n            partial_errors = [e for e in [start_err, end_err] if e]\n\n            # If caller didn't provide reasons, default to unchanged for untouched sides.\n            start_reason = self._default_reason(start_reason, changed=start_changed)\n            end_reason = self._default_reason(end_reason, 
changed=end_changed)\n\n            # Guardrail: never return an invalid window.\n            if refined_end <= refined_start:\n                self._update_model_call(\n                    model_call_id,\n                    status=\"success_heuristic\",\n                    response=raw_response,\n                    error_message=\"invalid_refined_window\",\n                )\n                return self._fallback(ad_start, ad_end)\n\n            self._update_model_call(\n                model_call_id,\n                status=self._result_status(start_changed, end_changed, partial_errors),\n                response=raw_response,\n                error_message=(\",\".join(partial_errors) if partial_errors else None),\n            )\n\n            return WordBoundaryRefinement(\n                refined_start=refined_start,\n                refined_end=refined_end,\n                start_adjustment_reason=start_reason,\n                end_adjustment_reason=end_reason,\n            )\n\n        except Exception as exc:\n            self._update_model_call(\n                model_call_id,\n                status=\"failed_permanent\",\n                response=raw_response,\n                error_message=str(exc),\n            )\n            self.logger.warning(\"Word boundary refine failed: %s\", exc)\n            return self._fallback(ad_start, ad_end)\n\n    def _fallback(self, ad_start: float, ad_end: float) -> WordBoundaryRefinement:\n        return WordBoundaryRefinement(\n            refined_start=ad_start,\n            refined_end=ad_end,\n            start_adjustment_reason=\"heuristic_fallback\",\n            end_adjustment_reason=\"unchanged\",\n        )\n\n    def _constrain_start(self, estimated_start: float, orig_start: float) 
-> float:\n        # Allow limited backward extension (for an early boundary) but cap it.\n        return max(estimated_start, orig_start - MAX_START_EXTENSION_SECONDS)\n\n    def _constrain_end(self, estimated_end: float, orig_end: float) -> float:\n        # Allow slight forward extension (for late boundary) but cap it.\n        return min(estimated_end, orig_end + MAX_END_EXTENSION_SECONDS)\n\n    def _parse_json(self, content: str) -> Optional[Dict[str, Any]]:\n        cleaned = re.sub(r\"```json|```\", \"\", (content or \"\").strip())\n        json_candidates = re.findall(r\"\\{.*?\\}\", cleaned, re.DOTALL)\n        for candidate in json_candidates:\n            try:\n                loaded = json.loads(candidate)\n                if isinstance(loaded, dict):\n                    return cast(Dict[str, Any], loaded)\n            except Exception:\n                continue\n        return None\n\n    @staticmethod\n    def _has_text(value: Any) -> bool:\n        if value is None:\n            return False\n        try:\n            return bool(str(value).strip())\n        except Exception:\n            return False\n\n    def _extract_payload(self, parsed: Dict[str, Any]) -> Dict[str, Any]:\n        occurrence = parsed.get(\"occurrence\")\n        if occurrence is None:\n            # Tolerate a common LLM misspelling of the key.\n            occurrence = parsed.get(\"occurance\")\n\n        return {\n            \"start_segment_seq\": parsed.get(\"refined_start_segment_seq\"),\n            \"start_phrase\": parsed.get(\"refined_start_phrase\"),\n            \"end_segment_seq\": parsed.get(\"refined_end_segment_seq\"),\n            \"end_phrase\": parsed.get(\"refined_end_phrase\"),\n            \"start_word\": parsed.get(\"refined_start_word\"),\n            \"start_occurrence\": occurrence,\n            \"start_word_index\": parsed.get(\"refined_start_word_index\"),\n            \"start_reason\": str(parsed.get(\"start_adjustment_reason\") or \"\"),\n            \"end_reason\": str(parsed.get(\"end_adjustment_reason\") or \"\"),\n        }\n\n    @staticmethod\n    def _default_reason(reason: str, 
*, changed: bool) -> str:\n        if reason:\n            return reason\n        return \"refined\" if changed else \"unchanged\"\n\n    @staticmethod\n    def _result_status(\n        start_changed: bool, end_changed: bool, partial_errors: List[str]\n    ) -> str:\n        if partial_errors and not start_changed and not end_changed:\n            return \"success_heuristic\"\n        return \"success\"\n\n    def _refine_start(\n        self,\n        *,\n        ad_start: float,\n        all_segments: List[Dict[str, Any]],\n        context_segments: List[Dict[str, Any]],\n        start_segment_seq: Any,\n        start_phrase: Any,\n        start_word: Any,\n        start_occurrence: Any,\n        start_word_index: Any,\n        start_reason: str,\n    ) -> Tuple[float, bool, str, Optional[str]]:\n        if self._has_text(start_phrase):\n            estimated_start = self._estimate_phrase_time(\n                all_segments=all_segments,\n                context_segments=context_segments,\n                preferred_segment_seq=start_segment_seq,\n                phrase=start_phrase,\n                direction=\"start\",\n            )\n            if estimated_start is None:\n                return float(ad_start), False, start_reason, \"start_phrase_not_found\"\n            return (\n                self._constrain_start(float(estimated_start), ad_start),\n                True,\n                start_reason,\n                None,\n            )\n\n        if self._has_text(start_word) or start_word_index is not None:\n            estimated_start = self._estimate_word_time(\n                all_segments=all_segments,\n                segment_seq=start_segment_seq,\n                word=start_word,\n                occurrence=start_occurrence,\n                word_index=start_word_index,\n            )\n            return (\n                self._constrain_start(float(estimated_start), ad_start),\n                True,\n                start_reason,\n            
    None,\n            )\n\n        return float(ad_start), False, (start_reason or \"unchanged\"), None\n\n    def _refine_end(\n        self,\n        *,\n        ad_end: float,\n        all_segments: List[Dict[str, Any]],\n        context_segments: List[Dict[str, Any]],\n        end_segment_seq: Any,\n        end_phrase: Any,\n        end_reason: str,\n    ) -> Tuple[float, bool, str, Optional[str]]:\n        if not self._has_text(end_phrase):\n            return float(ad_end), False, (end_reason or \"unchanged\"), None\n\n        estimated_end = self._estimate_phrase_time(\n            all_segments=all_segments,\n            context_segments=context_segments,\n            preferred_segment_seq=end_segment_seq,\n            phrase=end_phrase,\n            direction=\"end\",\n        )\n        if estimated_end is None:\n            return float(ad_end), False, end_reason, \"end_phrase_not_found\"\n\n        return (\n            self._constrain_end(float(estimated_end), ad_end),\n            True,\n            end_reason,\n            None,\n        )\n\n    def _get_context(\n        self,\n        ad_start: float,\n        ad_end: float,\n        all_segments: List[Dict[str, Any]],\n        *,\n        first_seq_num: Optional[int],\n        last_seq_num: Optional[int],\n    ) -> List[Dict[str, Any]]:\n        selected = self._context_by_seq_window(\n            all_segments,\n            first_seq_num=first_seq_num,\n            last_seq_num=last_seq_num,\n        )\n        if selected:\n            return selected\n\n        return self._context_by_time_overlap(ad_start, ad_end, all_segments)\n\n    def _context_by_seq_window(\n        self,\n        all_segments: List[Dict[str, Any]],\n        *,\n        first_seq_num: Optional[int],\n        last_seq_num: Optional[int],\n    ) -> List[Dict[str, Any]]:\n        if first_seq_num is None or last_seq_num is None or not all_segments:\n            return []\n\n        seq_values: List[int] = []\n        for 
segment in all_segments:\n            try:\n                seq_values.append(int(segment.get(\"sequence_num\", -1)))\n            except Exception:\n                continue\n        if not seq_values:\n            return []\n\n        min_seq = min(seq_values)\n        max_seq = max(seq_values)\n        start_seq = max(min_seq, int(first_seq_num) - 2)\n        end_seq = min(max_seq, int(last_seq_num) + 2)\n\n        selected: List[Dict[str, Any]] = []\n        for segment in all_segments:\n            try:\n                seq = int(segment.get(\"sequence_num\", -1))\n            except Exception:\n                continue\n            if start_seq <= seq <= end_seq:\n                selected.append(segment)\n\n        return selected\n\n    def _context_by_time_overlap(\n        self,\n        ad_start: float,\n        ad_end: float,\n        all_segments: List[Dict[str, Any]],\n    ) -> List[Dict[str, Any]]:\n        ad_segs = [\n            s for s in all_segments if self._segment_overlaps(s, ad_start, ad_end)\n        ]\n        if not ad_segs:\n            return []\n\n        first_idx = all_segments.index(ad_segs[0])\n        last_idx = all_segments.index(ad_segs[-1])\n        start_idx = max(0, first_idx - 2)\n        end_idx = min(len(all_segments), last_idx + 3)\n        return all_segments[start_idx:end_idx]\n\n    @staticmethod\n    def _segment_overlaps(\n        segment: Dict[str, Any], ad_start: float, ad_end: float\n    ) -> bool:\n        try:\n            seg_start = float(segment.get(\"start_time\", 0.0))\n        except Exception:\n            seg_start = 0.0\n        try:\n            seg_end = float(segment.get(\"end_time\", seg_start))\n        except Exception:\n            seg_end = seg_start\n        return seg_start <= float(ad_end) and seg_end >= float(ad_start)\n\n    def _estimate_phrase_times(\n        self,\n        *,\n        all_segments: List[Dict[str, Any]],\n        context_segments: List[Dict[str, Any]],\n        
start_segment_seq: Any,\n        start_phrase: Any,\n        end_segment_seq: Any,\n        end_phrase: Any,\n    ) -> Tuple[Optional[float], Optional[float]]:\n        start_time = self._estimate_phrase_time(\n            all_segments=all_segments,\n            context_segments=context_segments,\n            preferred_segment_seq=start_segment_seq,\n            phrase=start_phrase,\n            direction=\"start\",\n        )\n        end_time = self._estimate_phrase_time(\n            all_segments=all_segments,\n            context_segments=context_segments,\n            preferred_segment_seq=end_segment_seq,\n            phrase=end_phrase,\n            direction=\"end\",\n        )\n        return start_time, end_time\n\n    def _estimate_phrase_time(\n        self,\n        *,\n        all_segments: List[Dict[str, Any]],\n        context_segments: List[Dict[str, Any]],\n        preferred_segment_seq: Any,\n        phrase: Any,\n        direction: str,\n    ) -> Optional[float]:\n        phrase_tokens = self._split_words(str(phrase or \"\"))\n        phrase_tokens = [t.lower() for t in phrase_tokens if t]\n        if not phrase_tokens:\n            return None\n\n        # Search order:\n        # 1) preferred segment (if provided)\n        # 2) other provided context segments (ad-range ±2)\n        candidates: List[Dict[str, Any]] = []\n        preferred_seg = self._find_segment(all_segments, preferred_segment_seq)\n        if preferred_seg is not None:\n            candidates.append(preferred_seg)\n\n        # De-duplicate and order additional candidates.\n        ordered_context = list(context_segments or [])\n        try:\n            ordered_context.sort(key=lambda s: int(s.get(\"sequence_num\", -1)))\n        except Exception:\n            pass\n        if direction == \"end\":\n            ordered_context = list(reversed(ordered_context))\n\n        preferred_seq_int: Optional[int]\n        try:\n            preferred_seq_int = 
int(preferred_segment_seq)\n        except Exception:\n            preferred_seq_int = None\n\n        for seg in ordered_context:\n            try:\n                seq = int(seg.get(\"sequence_num\", -1))\n            except Exception:\n                seq = None\n            if preferred_seq_int is not None and seq == preferred_seq_int:\n                continue\n            candidates.append(seg)\n\n        for seg in candidates:\n            start_time = float(seg.get(\"start_time\", 0.0))\n            end_time = float(seg.get(\"end_time\", start_time))\n            duration = max(0.0, end_time - start_time)\n            words = [w.lower() for w in self._split_words(str(seg.get(\"text\", \"\")))]\n            if not words or duration <= 0.0:\n                continue\n\n            match = self._find_phrase_match(\n                words=words,\n                phrase_tokens=phrase_tokens,\n                direction=direction,\n                max_words=4,\n            )\n            if match is None:\n                continue\n\n            match_start_idx, match_end_idx = match\n            seconds_per_word = duration / float(len(words))\n            if direction == \"start\":\n                estimated = start_time + (float(match_start_idx) * seconds_per_word)\n                return min(estimated, end_time)\n\n            # direction == \"end\": end boundary at the end of the last matched word.\n            estimated = start_time + (float(match_end_idx + 1) * seconds_per_word)\n            return min(estimated, end_time)\n\n        return None\n\n    def _find_phrase_match(\n        self,\n        *,\n        words: List[str],\n        phrase_tokens: List[str],\n        direction: str,\n        max_words: int,\n    ) -> Optional[Tuple[int, int]]:\n        if not words or not phrase_tokens:\n            return None\n\n        if direction == \"start\":\n            base = phrase_tokens[:max_words]\n            for k in range(len(base), 0, -1):\n              
  target = base[:k]\n                match = self._find_subsequence(words, target, choose=\"first\")\n                if match is not None:\n                    return match\n            return None\n\n        # direction == \"end\"\n        base = phrase_tokens[-max_words:]\n        for k in range(len(base), 0, -1):\n            target = base[-k:]\n            match = self._find_subsequence(words, target, choose=\"last\")\n            if match is not None:\n                return match\n        return None\n\n    def _find_subsequence(\n        self, words: List[str], target: List[str], *, choose: str\n    ) -> Optional[Tuple[int, int]]:\n        if not target or len(target) > len(words):\n            return None\n\n        matches: List[Tuple[int, int]] = []\n        k = len(target)\n        for i in range(0, len(words) - k + 1):\n            if words[i : i + k] == target:\n                matches.append((i, i + k - 1))\n\n        if not matches:\n            return None\n        if choose == \"last\":\n            return matches[-1]\n        return matches[0]\n\n    def _estimate_word_time(\n        self,\n        *,\n        all_segments: List[Dict[str, Any]],\n        segment_seq: Any,\n        word: Any,\n        occurrence: Any,\n        word_index: Any,\n    ) -> float:\n        seg = self._find_segment(all_segments, segment_seq)\n        if not seg:\n            return float(all_segments[0][\"start_time\"]) if all_segments else 0.0\n\n        start_time = float(seg.get(\"start_time\", 0.0))\n        end_time = float(seg.get(\"end_time\", start_time))\n        duration = max(0.0, end_time - start_time)\n\n        words = self._split_words(str(seg.get(\"text\", \"\")))\n        if not words or duration <= 0.0:\n            return start_time\n\n        resolved_index = self._resolve_word_index(\n            words,\n            word=word,\n            occurrence=occurrence,\n            word_index=word_index,\n        )\n\n        # Heuristic timing: constant 
word duration within the segment.\n        # words_per_second = num_words / segment_duration\n        # seconds_per_word = 1 / words_per_second = segment_duration / num_words\n        seconds_per_word = duration / float(len(words))\n        estimated = start_time + (float(resolved_index) * seconds_per_word)\n        # Guardrail: never return a start after the block end.\n        return min(estimated, float(seg.get(\"end_time\", end_time)))\n\n    def _find_segment(\n        self, all_segments: List[Dict[str, Any]], segment_seq: Any\n    ) -> Optional[Dict[str, Any]]:\n        if segment_seq is None:\n            return None\n        try:\n            seq_int = int(segment_seq)\n        except Exception:\n            return None\n\n        for seg in all_segments:\n            if int(seg.get(\"sequence_num\", -1)) == seq_int:\n                return seg\n        return None\n\n    def _split_words(self, text: str) -> List[str]:\n        # Word count/indexing heuristic: split on whitespace, then normalize away\n        # leading/trailing punctuation to keep indices stable.\n        raw_tokens = [t for t in re.split(r\"\\s+\", (text or \"\").strip()) if t]\n        normalized = [self._normalize_token(t) for t in raw_tokens]\n        return [t for t in normalized if t]\n\n    def _normalize_token(self, token: str) -> str:\n        # Strip leading/trailing punctuation; keep internal apostrophes.\n        # Examples:\n        #   \"(brought\" -> \"brought\"\n        #   \"you...\" -> \"you\"\n        #   \"don't\" -> \"don't\"\n        return re.sub(r\"(^[^A-Za-z0-9']+)|([^A-Za-z0-9']+$)\", \"\", token)\n\n    def _resolve_word_index(\n        self, words: List[str], *, word: Any, occurrence: Any, word_index: Any\n    ) -> int:\n        # Prefer the verbatim word match if provided.\n        # `occurrence` chooses which matching instance to use.\n        # Defaults to \"first\" if missing/invalid.\n        target_raw = str(word).strip() if word is not None else \"\"\n      
  target = self._normalize_token(target_raw).lower()\n        if target:\n            match_indexes = [\n                idx for idx, w in enumerate(words) if (w or \"\").lower() == target\n            ]\n            if match_indexes:\n                occ = str(occurrence).strip().lower() if occurrence is not None else \"\"\n                if occ == \"last\":\n                    return match_indexes[-1]\n                # Default to first if LLM response is missing/invalid.\n                return match_indexes[0]\n\n        try:\n            idx_int = int(word_index)\n        except Exception:\n            idx_int = 0\n\n        idx_int = max(0, min(idx_int, len(words) - 1))\n        return idx_int\n\n    def _update_model_call(\n        self,\n        model_call_id: Optional[int],\n        *,\n        status: str,\n        response: Optional[str],\n        error_message: Optional[str],\n    ) -> None:\n        try_update_model_call(\n            model_call_id,\n            status=status,\n            response=response,\n            error_message=error_message,\n            logger=self.logger,\n            log_prefix=\"Word boundary refine\",\n        )\n"
  },
  {
    "path": "src/shared/__init__.py",
    "content": ""
  },
  {
    "path": "src/shared/config.py",
    "content": "from __future__ import annotations\n\nfrom typing import Literal, Optional\n\nfrom pydantic import BaseModel, Field, model_validator\n\nfrom shared import defaults as DEFAULTS\n\n\nclass ProcessingConfig(BaseModel):\n    num_segments_to_input_to_prompt: int\n    max_overlap_segments: int = Field(\n        default=DEFAULTS.PROCESSING_MAX_OVERLAP_SEGMENTS,\n        ge=0,\n        description=\"Maximum number of previously identified segments carried into the next prompt.\",\n    )\n\n    @model_validator(mode=\"after\")\n    def validate_overlap_limits(self) -> \"ProcessingConfig\":\n        assert (\n            self.max_overlap_segments <= self.num_segments_to_input_to_prompt\n        ), \"max_overlap_segments must be <= num_segments_to_input_to_prompt\"\n        return self\n\n\nclass OutputConfig(BaseModel):\n    fade_ms: int\n    min_ad_segement_separation_seconds: int\n    min_ad_segment_length_seconds: int\n    min_confidence: float\n\n    @property\n    def min_ad_segment_separation_seconds(self) -> int:\n        \"\"\"Backwards-compatible alias for the misspelled config field.\"\"\"\n        return self.min_ad_segement_separation_seconds\n\n    @min_ad_segment_separation_seconds.setter\n    def min_ad_segment_separation_seconds(self, value: int) -> None:\n        self.min_ad_segement_separation_seconds = value\n\n\nWhisperConfigTypes = Literal[\"remote\", \"local\", \"test\", \"groq\"]\n\n\nclass TestWhisperConfig(BaseModel):\n    whisper_type: Literal[\"test\"] = \"test\"\n\n\nclass RemoteWhisperConfig(BaseModel):\n    whisper_type: Literal[\"remote\"] = \"remote\"\n    base_url: str = DEFAULTS.WHISPER_REMOTE_BASE_URL\n    api_key: str\n    language: str = DEFAULTS.WHISPER_REMOTE_LANGUAGE\n    model: str = DEFAULTS.WHISPER_REMOTE_MODEL\n    timeout_sec: int = DEFAULTS.WHISPER_REMOTE_TIMEOUT_SEC\n    chunksize_mb: int = DEFAULTS.WHISPER_REMOTE_CHUNKSIZE_MB\n\n\nclass GroqWhisperConfig(BaseModel):\n    whisper_type: Literal[\"groq\"] = 
\"groq\"\n    api_key: str\n    language: str = DEFAULTS.WHISPER_GROQ_LANGUAGE\n    model: str = DEFAULTS.WHISPER_GROQ_MODEL\n    max_retries: int = DEFAULTS.WHISPER_GROQ_MAX_RETRIES\n\n\nclass LocalWhisperConfig(BaseModel):\n    whisper_type: Literal[\"local\"] = \"local\"\n    model: str = DEFAULTS.WHISPER_LOCAL_MODEL\n\n\nclass Config(BaseModel):\n    llm_api_key: Optional[str] = Field(default=None)\n    llm_model: str = Field(default=DEFAULTS.LLM_DEFAULT_MODEL)\n    openai_base_url: Optional[str] = None\n    openai_max_tokens: int = DEFAULTS.OPENAI_DEFAULT_MAX_TOKENS\n    openai_timeout: int = DEFAULTS.OPENAI_DEFAULT_TIMEOUT_SEC\n    # Optional: Rate limiting controls\n    llm_max_concurrent_calls: int = Field(\n        default=DEFAULTS.LLM_DEFAULT_MAX_CONCURRENT_CALLS,\n        description=\"Maximum concurrent LLM calls to prevent rate limiting\",\n    )\n    llm_max_retry_attempts: int = Field(\n        default=DEFAULTS.LLM_DEFAULT_MAX_RETRY_ATTEMPTS,\n        description=\"Maximum retry attempts for failed LLM calls\",\n    )\n    llm_max_input_tokens_per_call: Optional[int] = Field(\n        default=DEFAULTS.LLM_MAX_INPUT_TOKENS_PER_CALL,\n        description=\"Maximum input tokens per LLM call to stay under API limits\",\n    )\n    # Token-based rate limiting\n    llm_enable_token_rate_limiting: bool = Field(\n        default=DEFAULTS.LLM_ENABLE_TOKEN_RATE_LIMITING,\n        description=\"Enable client-side token-based rate limiting\",\n    )\n    llm_max_input_tokens_per_minute: Optional[int] = Field(\n        default=DEFAULTS.LLM_MAX_INPUT_TOKENS_PER_MINUTE,\n        description=\"Override default tokens per minute limit for the model\",\n    )\n    enable_boundary_refinement: bool = Field(\n        default=DEFAULTS.ENABLE_BOUNDARY_REFINEMENT,\n        description=\"Enable LLM-based ad boundary refinement for improved precision (consumes additional LLM tokens)\",\n    )\n    enable_word_level_boundary_refinder: bool = Field(\n        
default=DEFAULTS.ENABLE_WORD_LEVEL_BOUNDARY_REFINDER,\n        description=\"Enable word-level (heuristic-timed) ad boundary refinement\",\n    )\n    developer_mode: bool = Field(\n        default=False,\n        description=\"Enable developer mode features like test feeds\",\n    )\n    output: OutputConfig\n    processing: ProcessingConfig\n    server: Optional[str] = Field(\n        default=None,\n        deprecated=True,\n        description=\"deprecated in favor of request-aware URL generation\",\n    )\n    background_update_interval_minute: Optional[int] = (\n        DEFAULTS.APP_BACKGROUND_UPDATE_INTERVAL_MINUTE\n    )\n    post_cleanup_retention_days: Optional[int] = Field(\n        default=DEFAULTS.APP_POST_CLEANUP_RETENTION_DAYS,\n        description=\"Number of days to retain processed post data before cleanup. None disables cleanup.\",\n    )\n    # removed job_timeout\n    whisper: Optional[\n        LocalWhisperConfig | RemoteWhisperConfig | TestWhisperConfig | GroqWhisperConfig\n    ] = Field(\n        default=None,\n        discriminator=\"whisper_type\",\n    )\n    remote_whisper: Optional[bool] = Field(\n        default=False,\n        deprecated=True,\n        description=\"deprecated in favor of [Remote|Local]WhisperConfig\",\n    )\n    whisper_model: Optional[str] = Field(\n        default=DEFAULTS.WHISPER_LOCAL_MODEL,\n        deprecated=True,\n        description=\"deprecated in favor of [Remote|Local]WhisperConfig\",\n    )\n    automatically_whitelist_new_episodes: bool = (\n        DEFAULTS.APP_AUTOMATICALLY_WHITELIST_NEW_EPISODES\n    )\n    number_of_episodes_to_whitelist_from_archive_of_new_feed: int = (\n        DEFAULTS.APP_NUM_EPISODES_TO_WHITELIST_FROM_ARCHIVE_OF_NEW_FEED\n    )\n    enable_public_landing_page: bool = DEFAULTS.APP_ENABLE_PUBLIC_LANDING_PAGE\n    user_limit_total: int | None = DEFAULTS.APP_USER_LIMIT_TOTAL\n    autoprocess_on_download: bool = DEFAULTS.APP_AUTOPROCESS_ON_DOWNLOAD\n\n    def redacted(self) -> 
Config:\n        update: dict[str, object] = {\"llm_api_key\": \"X\" * 10}\n        # Also mask the whisper API key when the active config carries one\n        # (remote and groq whisper configs both hold an api_key field).\n        if self.whisper is not None and hasattr(self.whisper, \"api_key\"):\n            update[\"whisper\"] = self.whisper.model_copy(update={\"api_key\": \"X\" * 10})\n        return self.model_copy(update=update, deep=True)\n\n    @model_validator(mode=\"after\")\n    def validate_whisper_config(self) -> \"Config\":\n        new_style = self.whisper is not None\n\n        if new_style:\n            self.whisper_model = None\n            self.remote_whisper = None\n            return self\n\n        # if we have old style, change to the equivalent new style\n        if self.remote_whisper:\n            assert (\n                self.llm_api_key is not None\n            ), \"must supply api key to use remote whisper\"\n            self.whisper = RemoteWhisperConfig(\n                api_key=self.llm_api_key,\n                base_url=self.openai_base_url or \"https://api.openai.com/v1\",\n            )\n        else:\n            assert (\n                self.whisper_model is not None\n            ), \"must supply whisper model to use local whisper\"\n            self.whisper = LocalWhisperConfig(model=self.whisper_model)\n\n        self.whisper_model = None\n        self.remote_whisper = None\n\n        return self\n"
  },
  {
    "path": "src/shared/defaults.py",
    "content": "from __future__ import annotations\n\n# Centralized default values for application configuration.\n# Single source of truth for defaults across runtime, DB models, and Pydantic config.\n\n# LLM defaults\nLLM_DEFAULT_MODEL = \"groq/openai/gpt-oss-120b\"\nOPENAI_DEFAULT_MAX_TOKENS = 4096\nOPENAI_DEFAULT_TIMEOUT_SEC = 300\nLLM_DEFAULT_MAX_CONCURRENT_CALLS = 3\nLLM_DEFAULT_MAX_RETRY_ATTEMPTS = 5\nLLM_ENABLE_TOKEN_RATE_LIMITING = False\nLLM_MAX_INPUT_TOKENS_PER_CALL: int | None = None\nLLM_MAX_INPUT_TOKENS_PER_MINUTE: int | None = None\nENABLE_BOUNDARY_REFINEMENT = True\nENABLE_WORD_LEVEL_BOUNDARY_REFINDER = False\n\n# Whisper defaults\nWHISPER_DEFAULT_TYPE = \"groq\"\nWHISPER_LOCAL_MODEL = \"base.en\"\nWHISPER_REMOTE_BASE_URL = \"https://api.openai.com/v1\"\nWHISPER_REMOTE_MODEL = \"whisper-1\"\nWHISPER_REMOTE_LANGUAGE = \"en\"\nWHISPER_REMOTE_TIMEOUT_SEC = 600\nWHISPER_REMOTE_CHUNKSIZE_MB = 24\n\nWHISPER_GROQ_MODEL = \"whisper-large-v3-turbo\"\nWHISPER_GROQ_LANGUAGE = \"en\"\nWHISPER_GROQ_MAX_RETRIES = 3\n\n# Processing defaults\nPROCESSING_NUM_SEGMENTS_TO_INPUT_TO_PROMPT = 60\nPROCESSING_MAX_OVERLAP_SEGMENTS = 30\n\n# Output defaults\nOUTPUT_FADE_MS = 3000\nOUTPUT_MIN_AD_SEGMENT_SEPARATION_SECONDS = 60\nOUTPUT_MIN_AD_SEGMENT_LENGTH_SECONDS = 14\nOUTPUT_MIN_CONFIDENCE = 0.8\n\n# App defaults\nAPP_BACKGROUND_UPDATE_INTERVAL_MINUTE = 30\nAPP_AUTOMATICALLY_WHITELIST_NEW_EPISODES = True\nAPP_NUM_EPISODES_TO_WHITELIST_FROM_ARCHIVE_OF_NEW_FEED = 1\nAPP_POST_CLEANUP_RETENTION_DAYS = 5\nAPP_ENABLE_PUBLIC_LANDING_PAGE = False\nAPP_USER_LIMIT_TOTAL: int | None = None\nAPP_AUTOPROCESS_ON_DOWNLOAD = False\n\n# Credits defaults\nMINUTES_PER_CREDIT = 60\n"
  },
  {
    "path": "src/shared/interfaces.py",
    "content": "from __future__ import annotations\n\nfrom typing import Optional, Protocol, runtime_checkable\n\n\n@runtime_checkable\nclass Post(Protocol):\n    \"\"\"Interface for post objects to break cyclic dependencies.\"\"\"\n\n    id: int\n    guid: str\n    download_url: Optional[str]\n    title: str\n\n    @property\n    def whitelisted(self) -> bool:\n        \"\"\"Whether this post is whitelisted for processing.\"\"\"\n"
  },
  {
    "path": "src/shared/llm_utils.py",
    "content": "\"\"\"Shared helpers for working with LLM provider quirks.\"\"\"\n\nfrom __future__ import annotations\n\nfrom typing import Final\n\n# Patterns for models that require the `max_completion_tokens` parameter\n# instead of the legacy `max_tokens`. OpenAI began enforcing this on the\n# newer gpt-4o / gpt-5 / o1 style models.\n_MAX_COMPLETION_TOKEN_MODELS: Final[tuple[str, ...]] = (\n    \"gpt-5\",\n    \"gpt-4o\",\n    \"o1-\",\n    \"o1_\",\n    \"o1/\",\n    \"chatgpt-4o-latest\",\n)\n\n\ndef model_uses_max_completion_tokens(model_name: str | None) -> bool:\n    \"\"\"Return True when the target model expects `max_completion_tokens`.\"\"\"\n    if not model_name:\n        return False\n    model_lower = model_name.lower()\n    return any(pattern in model_lower for pattern in _MAX_COMPLETION_TOKEN_MODELS)\n"
  },
  {
    "path": "src/shared/processing_paths.py",
    "content": "import os\nimport re\nfrom dataclasses import dataclass\nfrom pathlib import Path\n\n\n@dataclass\nclass ProcessingPaths:\n    post_processed_audio_path: Path\n\n\ndef paths_from_unprocessed_path(\n    unprocessed_path: str, feed_title: str\n) -> ProcessingPaths:\n    unprocessed_filename = Path(unprocessed_path).name\n    # Sanitize feed_title to prevent illegal characters in paths\n    # Keep spaces, alphanumeric. Remove others.\n    sanitized_feed_title = re.sub(r\"[^a-zA-Z0-9\\s_.-]\", \"\", feed_title).strip()\n    # Remove any trailing dots that might result from sanitization\n    sanitized_feed_title = sanitized_feed_title.rstrip(\".\")\n    # Replace spaces with underscores for friendlier directory names\n    sanitized_feed_title = re.sub(r\"\\s+\", \"_\", sanitized_feed_title)\n\n    return ProcessingPaths(\n        post_processed_audio_path=get_srv_root()\n        / sanitized_feed_title\n        / unprocessed_filename,\n    )\n\n\ndef get_job_unprocessed_path(post_guid: str, job_id: str, post_title: str) -> Path:\n    \"\"\"Return a unique per-job path for the unprocessed audio file.\n\n    Layout: in/jobs/{post_guid}/{job_id}/{sanitized_title}.mp3\n    \"\"\"\n    # Keep same sanitization behavior used for download filenames\n    sanitized_title = re.sub(r\"[^a-zA-Z0-9\\s]\", \"\", post_title).strip()\n    return get_in_root() / \"jobs\" / post_guid / job_id / f\"{sanitized_title}.mp3\"\n\n\n# ---- New centralized data-root helpers ----\n\n\ndef get_instance_dir() -> Path:\n    \"\"\"Absolute instance directory inside the container.\n\n    Defaults to /app/src/instance. 
Can be overridden via PODLY_INSTANCE_DIR for tests.\n    \"\"\"\n    return Path(os.environ.get(\"PODLY_INSTANCE_DIR\", \"/app/src/instance\"))\n\n\ndef get_base_podcast_data_dir() -> Path:\n    \"\"\"Root under which podcasts (in/srv) live, e.g., /app/src/instance/data.\"\"\"\n    return Path(\n        os.environ.get(\"PODLY_PODCAST_DATA_DIR\", str(get_instance_dir() / \"data\"))\n    )\n\n\ndef get_in_root() -> Path:\n    return get_base_podcast_data_dir() / \"in\"\n\n\ndef get_srv_root() -> Path:\n    return get_base_podcast_data_dir() / \"srv\"\n"
  },
  {
    "path": "src/shared/test_utils.py",
    "content": "\"\"\"\nShared configuration helpers to avoid code duplication.\n\"\"\"\n\nfrom .config import Config, OutputConfig, ProcessingConfig\n\n\ndef create_standard_test_config(\n    llm_api_key: str = \"test-key\",\n    llm_max_input_tokens_per_call: int | None = None,\n    num_segments_to_input_to_prompt: int = 400,\n    max_overlap_segments: int = 30,\n) -> Config:\n    \"\"\"\n    Create a standardized configuration for testing and demos.\n\n    Args:\n        llm_api_key: API key for testing\n        llm_max_input_tokens_per_call: Optional token limit\n        num_segments_to_input_to_prompt: Number of segments per prompt\n        max_overlap_segments: Maximum number of previously identified segments to carry forward\n\n    Returns:\n        Configured Config object for testing\n    \"\"\"\n    return Config(\n        llm_api_key=llm_api_key,\n        llm_max_input_tokens_per_call=llm_max_input_tokens_per_call,\n        output=OutputConfig(\n            fade_ms=2000,\n            min_ad_segement_separation_seconds=60,\n            min_ad_segment_length_seconds=14,\n            min_confidence=0.7,\n        ),\n        processing=ProcessingConfig(\n            num_segments_to_input_to_prompt=num_segments_to_input_to_prompt,\n            max_overlap_segments=max_overlap_segments,\n        ),\n    )\n"
  },
  {
    "path": "src/system_prompt.txt",
    "content": "Your job is to identify advertisements in podcast transcript excerpts with high precision, continuity awareness, and content-context sensitivity.\n\nCRITICAL: distinguish external sponsor ads from technical discussion and self-promotion.\n\nCONTENT-AWARE TAXONOMY:\n- technical_discussion: Educational content, case studies, implementation details. Company names may appear as examples; do not mark as ads.\n- educational/self_promo: Host discussing their own products, newsletters, funds, or courses (may include CTAs but are first-party).\n- promotional_external: True sponsor ads for external companies with sales intent, URLs, promo codes, or explicit offers.\n- transition: Brief bumpers that connect to or from ads; include if they are part of an ad block.\n\nJSON CONTRACT (strict):\n- Always respond with: {\"ad_segments\": [...], \"content_type\": \"<taxonomy>\", \"confidence\": <0.0-1.0>}\n- Each ad_segments item must be: {\"segment_offset\": <seconds.float>, \"confidence\": <0.0-1.0>}\n- If there are no ads, respond with: {\"ad_segments\":[]} (no extra keys).\n\nDURATION AND CUE GUIDANCE:\n- Ads are typically 15–120 seconds and contain CTAs, URLs/domains, promo/discount codes, phone numbers, or phrases like \"brought to you by\".\n- Integrated ads can be longer but maintain sales intent; continuous mention of the same sponsor for >3 minutes without CTAs is likely educational/self_promo.\n- Pre-roll/mid-roll/post-roll intros (\"a word from our sponsor\") and quick outros (\"back to the show\") belong to the ad block.\n\nDECISION RULES:\n1) Continuous ads: once an ad starts, follow it to its natural conclusion; include 1–5 second transitions.\n2) Strong cues: treat URLs/domains, promo/discount language, and phone numbers as strong sponsor indicators.\n3) Self-promotion guardrail: host promoting their own products/platforms → classify as educational/self_promo with lower confidence unless explicit external sponsorship language is present.\n4) Boundary 
bias: if later segments clearly form an ad for a sponsor, pull in the prior two intro/transition lines as ad content.\n5) Prefer labeling as content unless multiple strong ad cues appear with clear external branding.\n\nThis transcript excerpt is broken into segments starting with a timestamp [X] (seconds). Output every segment that is advertisement content.\n\nExample (external sponsor with CTA):\n[53.8] That's all coming after the break.\n[59.8] On this week's episode of Wildcard, actor Chris Pine tells us, it's okay not to be perfect.\n[64.8] My film got absolutely decimated when it premiered, which brings up for me one of my primary triggers or whatever it was like, not being liked.\n[73.8] I'm Rachel Martin, Chris Pine on How to Find Joy in Imperfection.\n[77.8] That's on the new podcast, Wildcard.\n[79.8] The Game Where Cards control the conversation.\n[83.8] And welcome back to the show, today we're talking to Professor Hopkins\nOutput: {\"ad_segments\":[{\"segment_offset\":59.8,\"confidence\":0.95},{\"segment_offset\":64.8,\"confidence\":0.9},{\"segment_offset\":73.8,\"confidence\":0.92},{\"segment_offset\":77.8,\"confidence\":0.98},{\"segment_offset\":79.8,\"confidence\":0.9}],\"content_type\":\"promotional_external\",\"confidence\":0.96}\n\nExample (technical mention, not an ad):\n[4762.7] Our brains are configured differently.\n[4765.6] My brain is configured perfectly for Ruby, perfectly for a dynamically typed language.\n[4831.3] Shopify exists at a scale most programmers never touch, and it still runs on Rails.\n[4933.2] Shopify.com has supported this show.\nOutput: {\"ad_segments\": [{\"segment_offset\": 4933.2, \"confidence\": 0.75}], \"content_type\": \"technical_discussion\", \"confidence\": 0.45}\n\n\n"
  },
  {
    "path": "src/tests/__init__.py",
    "content": "\"\"\"Tests package for podly.\"\"\"\n"
  },
  {
    "path": "src/tests/conftest.py",
    "content": "\"\"\"\nFixtures for pytest tests in the tests directory.\n\"\"\"\n\nimport logging\nimport sys\nfrom pathlib import Path\nfrom typing import Generator\nfrom unittest.mock import MagicMock\n\nimport pytest\nfrom flask import Flask\n\nfrom app.extensions import db\nfrom app.models import ProcessingJob, TranscriptSegment\nfrom podcast_processor.ad_classifier import AdClassifier\nfrom podcast_processor.audio_processor import AudioProcessor\nfrom podcast_processor.podcast_downloader import PodcastDownloader\nfrom podcast_processor.processing_status_manager import ProcessingStatusManager\nfrom podcast_processor.transcription_manager import TranscriptionManager\nfrom shared.config import Config\nfrom shared.test_utils import create_standard_test_config\n\n# Set up whisper and torch mocks\nwhisper_mock = MagicMock()\nwhisper_mock.available_models.return_value = [\n    \"tiny\",\n    \"base\",\n    \"small\",\n    \"medium\",\n    \"large\",\n]\nwhisper_mock.load_model.return_value = MagicMock()\nwhisper_mock.load_model.return_value.transcribe.return_value = {\"segments\": []}\n\ntorch_mock = MagicMock()\ntorch_mock.cuda = MagicMock()\ntorch_mock.device = MagicMock()\n\n# Pre-mock the modules to avoid imports during test collection\nsys.modules[\"whisper\"] = whisper_mock\nsys.modules[\"torch\"] = torch_mock\n\n\n@pytest.fixture\ndef app() -> Generator[Flask, None, None]:\n    \"\"\"Create a Flask app for testing.\"\"\"\n    app = Flask(__name__)\n    app.config[\"SQLALCHEMY_DATABASE_URI\"] = \"sqlite:///:memory:\"\n    app.config[\"SQLALCHEMY_TRACK_MODIFICATIONS\"] = False\n\n    with app.app_context():\n        db.init_app(app)\n        db.create_all()\n        yield app\n\n\n@pytest.fixture\ndef test_config() -> Config:\n    return create_standard_test_config()\n\n\n@pytest.fixture\ndef test_logger() -> logging.Logger:\n    return logging.getLogger(\"test_logger\")\n\n\n@pytest.fixture\ndef mock_db_session() -> MagicMock:\n    \"\"\"Create a mock 
database session\"\"\"\n    mock_session = MagicMock()\n    mock_session.add = MagicMock()\n    mock_session.add_all = MagicMock()\n    mock_session.commit = MagicMock()\n    mock_session.rollback = MagicMock()\n    return mock_session\n\n\n@pytest.fixture\ndef mock_transcription_manager() -> MagicMock:\n    manager = MagicMock(spec=TranscriptionManager)\n    manager.transcribe.return_value = [\n        TranscriptSegment(\n            sequence_num=0, start_time=0.0, end_time=5.0, text=\"Test segment 1\"\n        ),\n        TranscriptSegment(\n            sequence_num=1, start_time=5.0, end_time=10.0, text=\"Test segment 2\"\n        ),\n    ]\n    return manager\n\n\n@pytest.fixture\ndef mock_ad_classifier() -> MagicMock:\n    classifier = MagicMock(spec=AdClassifier)\n    classifier.classify.return_value = None  # classify method has no return value\n    return classifier\n\n\n@pytest.fixture\ndef mock_audio_processor() -> MagicMock:\n    processor = MagicMock(spec=AudioProcessor)\n    processor.get_ad_segments.return_value = [(0.0, 5.0)]\n    return processor\n\n\n@pytest.fixture\ndef mock_downloader() -> MagicMock:\n    downloader = MagicMock(spec=PodcastDownloader)\n    downloader.get_and_make_download_path.return_value = Path(\"test_path\")\n    downloader.download_episode.return_value = Path(\"test_path\")\n    return downloader\n\n\n@pytest.fixture\ndef mock_status_manager() -> MagicMock:\n    status_manager = MagicMock(spec=ProcessingStatusManager)\n    status_manager.create_job.return_value = ProcessingJob(id=\"test_job_id\")\n    status_manager.cancel_existing_jobs.return_value = None\n    return status_manager\n"
  },
  {
    "path": "src/tests/test_ad_classifier.py",
    "content": "from typing import Generator\nfrom unittest.mock import MagicMock, patch\n\nimport pytest\nfrom flask import Flask\nfrom jinja2 import Template\nfrom litellm.exceptions import InternalServerError\nfrom litellm.types.utils import Choices\n\nfrom app.extensions import db\nfrom app.models import ModelCall, Post, TranscriptSegment\nfrom podcast_processor.ad_classifier import AdClassifier\nfrom podcast_processor.model_output import (\n    AdSegmentPrediction,\n    AdSegmentPredictionList,\n)\nfrom shared.config import Config\nfrom shared.test_utils import create_standard_test_config\n\n\n@pytest.fixture\ndef app() -> Generator[Flask, None, None]:\n    \"\"\"Create and configure a Flask app for testing.\"\"\"\n    app = Flask(__name__)\n    app.config[\"SQLALCHEMY_DATABASE_URI\"] = \"sqlite:///:memory:\"\n    app.config[\"SQLALCHEMY_TRACK_MODIFICATIONS\"] = False\n\n    with app.app_context():\n        db.init_app(app)\n        db.create_all()\n        yield app\n\n\n@pytest.fixture\ndef test_config() -> Config:\n    return create_standard_test_config()\n\n\n@pytest.fixture\ndef mock_db_session() -> MagicMock:\n    \"\"\"Create a mock database session\"\"\"\n    mock_session = MagicMock()\n    mock_session.add = MagicMock()\n    mock_session.add_all = MagicMock()\n    mock_session.commit = MagicMock()\n    mock_session.rollback = MagicMock()\n    return mock_session\n\n\n@pytest.fixture\ndef test_classifier(test_config: Config) -> AdClassifier:\n    \"\"\"Create an AdClassifier with default dependencies\"\"\"\n    return AdClassifier(config=test_config)\n\n\n@pytest.fixture\ndef test_classifier_with_mocks(\n    test_config: Config, mock_db_session: MagicMock\n) -> AdClassifier:\n    \"\"\"Create an AdClassifier with mock dependencies\"\"\"\n    mock_model_call_query = MagicMock()\n    mock_identification_query = MagicMock()\n\n    return AdClassifier(\n        config=test_config,\n        model_call_query=mock_model_call_query,\n        
identification_query=mock_identification_query,\n        db_session=mock_db_session,\n    )\n\n\ndef test_call_model(test_config: Config, app: Flask) -> None:\n    \"\"\"Test the _call_model method with mocked litellm\"\"\"\n    with app.app_context():\n        classifier = AdClassifier(config=test_config, db_session=db.session)\n\n        # Create and persist a ModelCall row (writer_client local fallback updates by id)\n        dummy_model_call = ModelCall(\n            post_id=0,\n            model_name=test_config.llm_model,\n            prompt=\"test prompt\",\n            first_segment_sequence_num=0,\n            last_segment_sequence_num=0,\n            status=\"pending\",\n        )\n        db.session.add(dummy_model_call)\n        db.session.commit()\n\n        # Create a mock message and choice directly\n        mock_message = MagicMock()\n        mock_message.content = \"test response\"\n\n        mock_choice = MagicMock(spec=Choices)\n        mock_choice.message = mock_message\n\n        mock_response = MagicMock()\n        mock_response.choices = [mock_choice]\n\n        # Patch the litellm.completion function for this test\n        with patch(\"litellm.completion\", return_value=mock_response):\n            # Call the method\n            response = classifier._call_model(\n                model_call_obj=dummy_model_call,\n                system_prompt=\"test system prompt\",\n            )\n\n            # Verify response\n            assert response == \"test response\"\n            refreshed = db.session.get(ModelCall, dummy_model_call.id)\n            assert refreshed is not None\n            assert refreshed.status == \"success\"\n            assert refreshed.response == \"test response\"\n\n\ndef test_call_model_retry_on_internal_error(test_config: Config, app: Flask) -> None:\n    \"\"\"Test that _call_model retries on InternalServerError\"\"\"\n    with app.app_context():\n        classifier = AdClassifier(config=test_config, 
db_session=db.session)\n\n        dummy_model_call = ModelCall(\n            post_id=0,\n            model_name=test_config.llm_model,\n            prompt=\"test prompt\",\n            first_segment_sequence_num=0,\n            last_segment_sequence_num=0,\n            status=\"pending\",\n        )\n        db.session.add(dummy_model_call)\n        db.session.commit()\n\n        # Create a mock message and choice directly\n        mock_message = MagicMock()\n        mock_message.content = \"test response\"\n\n        mock_choice = MagicMock(spec=Choices)\n        mock_choice.message = mock_message\n\n        mock_response = MagicMock()\n        mock_response.choices = [mock_choice]\n\n        # First call fails, second succeeds\n        mock_completion_side_effects = [\n            InternalServerError(\n                message=\"test error\",\n                llm_provider=\"test_provider\",\n                model=\"test_model\",\n            ),\n            mock_response,\n        ]\n\n        # Patch time.sleep to avoid waiting during tests\n        with patch(\"time.sleep\"), patch(\n            \"litellm.completion\", side_effect=mock_completion_side_effects\n        ) as mocked_completion:\n            response = classifier._call_model(\n                model_call_obj=dummy_model_call,\n                system_prompt=\"test system prompt\",\n            )\n\n            assert response == \"test response\"\n            assert mocked_completion.call_count == 2\n            refreshed = db.session.get(ModelCall, dummy_model_call.id)\n            assert refreshed is not None\n            assert refreshed.status == \"success\"\n            assert refreshed.response == \"test response\"\n            assert refreshed.retry_attempts == 2\n\n\ndef test_process_chunk(test_config: Config, app: Flask) -> None:\n    \"\"\"Test processing a chunk of transcript segments\"\"\"\n    with app.app_context():\n        # Create mocks\n        mock_db_session = MagicMock()\n        
mock_model_call_query = MagicMock()\n\n        # Create the classifier with our mocks\n        classifier = AdClassifier(\n            config=test_config,\n            model_call_query=mock_model_call_query,\n            db_session=mock_db_session,\n        )\n\n        # Create test data\n        post = Post(id=1, title=\"Test Post\")\n        segments = [\n            TranscriptSegment(\n                id=1,\n                post_id=1,\n                sequence_num=0,\n                start_time=0.0,\n                end_time=10.0,\n                text=\"Test segment 1\",\n            ),\n            TranscriptSegment(\n                id=2,\n                post_id=1,\n                sequence_num=1,\n                start_time=10.0,\n                end_time=20.0,\n                text=\"Test segment 2\",\n            ),\n        ]\n\n        # Create a proper Jinja2 Template object\n        user_template = Template(\"Test template: {{ podcast_title }}\")\n\n        user_prompt = classifier._generate_user_prompt(\n            current_chunk_db_segments=segments,\n            post=post,\n            user_prompt_template=user_template,\n            includes_start=True,\n            includes_end=True,\n        )\n\n        # Create an actual ModelCall instance instead of a MagicMock\n        model_call = ModelCall(\n            post_id=1,\n            model_name=test_config.llm_model,\n            prompt=\"test prompt\",\n            first_segment_sequence_num=0,\n            last_segment_sequence_num=1,\n            status=\"success\",\n            response='{\"ad_segments\": []}',\n        )\n\n        # Use patch.multiple to mock multiple methods with a single context manager\n        mock_get_model_call = MagicMock(return_value=model_call)\n        mock_process_response = MagicMock(return_value=segments)\n\n        with patch.multiple(\n            classifier,\n            _get_or_create_model_call=mock_get_model_call,\n            
_process_successful_response=mock_process_response,\n        ):\n            result = classifier._process_chunk(\n                chunk_segments=segments,\n                system_prompt=\"test system prompt\",\n                post=post,\n                user_prompt_str=user_prompt,\n            )\n\n            mock_get_model_call.assert_called_once()\n            mock_process_response.assert_called_once()\n            assert result == segments\n\n\ndef test_compute_next_overlap_segments_includes_context(\n    test_classifier_with_mocks: AdClassifier,\n) -> None:\n    classifier = test_classifier_with_mocks\n    segments = [\n        TranscriptSegment(\n            id=i + 1,\n            post_id=1,\n            sequence_num=i,\n            start_time=float(i),\n            end_time=float(i + 1),\n            text=f\"Segment {i}\",\n        )\n        for i in range(6)\n    ]\n\n    identified_segments = [segments[2], segments[3], segments[4]]\n\n    result = classifier._compute_next_overlap_segments(\n        chunk_segments=segments,\n        identified_segments=identified_segments,\n        max_overlap_segments=6,\n    )\n\n    assert [seg.sequence_num for seg in result] == [0, 1, 2, 3, 4, 5]\n\n\ndef test_compute_next_overlap_segments_respects_cap(\n    test_classifier_with_mocks: AdClassifier,\n) -> None:\n    classifier = test_classifier_with_mocks\n    segments = [\n        TranscriptSegment(\n            id=i + 1,\n            post_id=1,\n            sequence_num=i,\n            start_time=float(i),\n            end_time=float(i + 1),\n            text=f\"Segment {i}\",\n        )\n        for i in range(6)\n    ]\n    identified_segments = [segments[2], segments[3], segments[4]]\n\n    result = classifier._compute_next_overlap_segments(\n        chunk_segments=segments,\n        identified_segments=identified_segments,\n        max_overlap_segments=2,\n    )\n\n    assert [seg.sequence_num for seg in result] == [4, 5]\n\n\ndef 
test_compute_next_overlap_segments_baseline_overlap_without_ads(\n    test_classifier_with_mocks: AdClassifier,\n) -> None:\n    classifier = test_classifier_with_mocks\n    segments = [\n        TranscriptSegment(\n            id=i + 1,\n            post_id=1,\n            sequence_num=i,\n            start_time=float(i),\n            end_time=float(i + 1),\n            text=f\"Segment {i}\",\n        )\n        for i in range(8)\n    ]\n\n    result = classifier._compute_next_overlap_segments(\n        chunk_segments=segments, identified_segments=[], max_overlap_segments=4\n    )\n\n    assert [seg.sequence_num for seg in result] == [4, 5, 6, 7]\n\n\ndef test_create_identifications_skips_existing_ad_label(\n    test_classifier_with_mocks: AdClassifier,\n) -> None:\n    classifier = test_classifier_with_mocks\n    mock_query = classifier.identification_query\n    mock_query.filter_by.return_value.first.return_value = MagicMock()\n\n    segment = TranscriptSegment(\n        id=1,\n        post_id=1,\n        sequence_num=0,\n        start_time=0.0,\n        end_time=10.0,\n        text=\"Test segment\",\n    )\n    prediction_list = AdSegmentPredictionList(\n        ad_segments=[AdSegmentPrediction(segment_offset=0.0, confidence=0.9)]\n    )\n    model_call = ModelCall(\n        post_id=1,\n        model_name=classifier.config.llm_model,\n        prompt=\"prompt\",\n        first_segment_sequence_num=0,\n        last_segment_sequence_num=0,\n    )\n\n    created_count, matched_segments = classifier._create_identifications(\n        prediction_list=prediction_list,\n        current_chunk_db_segments=[segment],\n        model_call=model_call,\n    )\n\n    assert created_count == 0\n    assert matched_segments == [segment]\n    classifier.db_session.add.assert_not_called()\n\n\ndef test_build_chunk_payload_trims_for_token_limit(\n    test_classifier_with_mocks: AdClassifier,\n) -> None:\n    classifier = test_classifier_with_mocks\n    
classifier.config.processing.num_segments_to_input_to_prompt = 3\n    classifier.config.processing.max_overlap_segments = 5\n    classifier.config.llm_max_input_tokens_per_call = 1000\n\n    overlap_segments = [\n        TranscriptSegment(\n            id=1,\n            post_id=1,\n            sequence_num=0,\n            start_time=0.0,\n            end_time=1.0,\n            text=\"Overlap\",\n        )\n    ]\n    remaining_segments = [\n        TranscriptSegment(\n            id=i + 2,\n            post_id=1,\n            sequence_num=i + 1,\n            start_time=float(i + 1),\n            end_time=float(i + 2),\n            text=f\"Segment {i + 1}\",\n        )\n        for i in range(3)\n    ]\n\n    system_prompt = \"System\"\n    template = Template(\"{{ transcript }}\")\n\n    with patch.object(\n        classifier,\n        \"_validate_token_limit\",\n        side_effect=[False, True],\n    ) as mock_validator:\n        chunk_segments, user_prompt, consumed, trimmed = (\n            classifier._build_chunk_payload(\n                overlap_segments=overlap_segments,\n                remaining_segments=remaining_segments,\n                total_segments=overlap_segments + remaining_segments,\n                post=Post(id=1, title=\"Test\"),\n                system_prompt=system_prompt,\n                user_prompt_template=template,\n                max_new_segments=3,\n            )\n        )\n\n    assert trimmed is True\n    assert consumed == 2\n    assert len(chunk_segments) >= consumed\n    assert mock_validator.call_count == 2\n    assert user_prompt\n"
  },
  {
    "path": "src/tests/test_ad_classifier_rate_limiting_integration.py",
    "content": "\"\"\"\nTests for rate limiting integration in AdClassifier.\n\"\"\"\n\nfrom unittest.mock import Mock, patch\n\nfrom podcast_processor.ad_classifier import AdClassifier\nfrom podcast_processor.token_rate_limiter import TokenRateLimiter\n\nfrom .test_helpers import create_test_config\n\n\nclass TestAdClassifierRateLimiting:\n    \"\"\"Test cases for rate limiting integration in AdClassifier.\"\"\"\n\n    def test_rate_limiter_initialization_enabled(self):\n        \"\"\"Test that rate limiter is properly initialized when enabled.\"\"\"\n        config = create_test_config()\n\n        with patch(\"podcast_processor.ad_classifier.db.session\") as mock_session:\n            classifier = AdClassifier(config=config, db_session=mock_session)\n\n            assert classifier.rate_limiter is not None\n            assert isinstance(classifier.rate_limiter, TokenRateLimiter)\n            assert (\n                classifier.rate_limiter.tokens_per_minute == 30000\n            )  # Anthropic default\n\n    def test_rate_limiter_initialization_disabled(self):\n        \"\"\"Test that rate limiter is None when disabled.\"\"\"\n        config = create_test_config(llm_enable_token_rate_limiting=False)\n\n        with patch(\"podcast_processor.ad_classifier.db.session\") as mock_session:\n            classifier = AdClassifier(config=config, db_session=mock_session)\n\n            assert classifier.rate_limiter is None\n\n    def test_rate_limiter_custom_limit(self):\n        \"\"\"Test rate limiter with custom token limit.\"\"\"\n        config = create_test_config(llm_max_input_tokens_per_minute=15000)\n\n        with patch(\"podcast_processor.ad_classifier.db.session\") as mock_session:\n            classifier = AdClassifier(config=config, db_session=mock_session)\n\n            assert classifier.rate_limiter is not None\n            assert classifier.rate_limiter.tokens_per_minute == 15000\n\n    def test_is_retryable_error_rate_limit_errors(self):\n        
\"\"\"Test that rate limit errors are correctly identified as retryable.\"\"\"\n        config = create_test_config()\n\n        with patch(\"podcast_processor.ad_classifier.db.session\") as mock_session:\n            classifier = AdClassifier(config=config, db_session=mock_session)\n\n            # Test various rate limit error formats\n            rate_limit_errors = [\n                Exception(\"rate_limit_error: too many requests\"),\n                Exception(\"RateLimitError from API\"),\n                Exception(\"HTTP 429 rate limit exceeded\"),\n                Exception(\"rate limit reached\"),\n                Exception(\"Service temporarily unavailable (503)\"),\n            ]\n\n            for error in rate_limit_errors:\n                assert classifier._is_retryable_error(error) is True\n\n    def test_is_retryable_error_non_retryable(self):\n        \"\"\"Test that non-retryable errors are correctly identified.\"\"\"\n        config = create_test_config()\n\n        with patch(\"podcast_processor.ad_classifier.db.session\") as mock_session:\n            classifier = AdClassifier(config=config, db_session=mock_session)\n\n            # Test non-retryable errors\n            non_retryable_errors = [\n                Exception(\"Invalid API key\"),\n                Exception(\"Bad request (400)\"),\n                ValueError(\"Invalid input\"),\n            ]\n\n            for error in non_retryable_errors:\n                assert classifier._is_retryable_error(error) is False\n\n    @patch(\"podcast_processor.ad_classifier.litellm\")\n    @patch(\"podcast_processor.ad_classifier.isinstance\")\n    def test_call_model_with_rate_limiter(self, mock_isinstance, mock_litellm):\n        \"\"\"Test that _call_model uses rate limiter when available.\"\"\"\n        # Make isinstance return True for our mock objects\n        mock_isinstance.return_value = True\n\n        config = create_test_config()\n\n        with 
patch(\"podcast_processor.ad_classifier.db.session\") as mock_session:\n            classifier = AdClassifier(config=config, db_session=mock_session)\n\n            # Mock the rate limiter\n            classifier.rate_limiter = Mock(spec=TokenRateLimiter)\n            classifier.rate_limiter.wait_if_needed = Mock()\n            classifier.rate_limiter.get_usage_stats = Mock(\n                return_value={\n                    \"current_usage\": 1000,\n                    \"limit\": 30000,\n                    \"usage_percentage\": 3.3,\n                }\n            )\n\n            # Mock successful API response\n            mock_response = Mock()\n            mock_choice = Mock()\n            mock_choice.message.content = \"test response\"\n            mock_response.choices = [mock_choice]\n            mock_litellm.completion.return_value = mock_response\n\n            # Create a test ModelCall using actual ModelCall class\n            from app.models import ModelCall\n\n            model_call = ModelCall(\n                id=1,\n                model_name=\"anthropic/claude-3-5-sonnet-20240620\",\n                prompt=\"test prompt\",\n                status=\"pending\",\n            )\n\n            # Call the model\n            result = classifier._call_model(model_call, \"test system prompt\")\n\n            # Verify rate limiter was used\n            classifier.rate_limiter.wait_if_needed.assert_called_once()\n            classifier.rate_limiter.get_usage_stats.assert_called_once()\n\n            # Verify API was called with correct parameters\n            mock_litellm.completion.assert_called_once()\n            call_args = mock_litellm.completion.call_args\n            assert call_args[1][\"model\"] == \"anthropic/claude-3-5-sonnet-20240620\"\n            assert len(call_args[1][\"messages\"]) == 2\n            assert call_args[1][\"messages\"][0][\"role\"] == \"system\"\n            assert call_args[1][\"messages\"][1][\"role\"] == \"user\"\n\n        
    assert result == \"test response\"\n\n    @patch(\"time.sleep\")\n    def test_rate_limit_backoff_timing(self, mock_sleep):\n        \"\"\"Test that rate limit errors use longer backoff timing.\"\"\"\n        config = create_test_config()\n\n        with patch(\"podcast_processor.ad_classifier.db.session\") as mock_session:\n            classifier = AdClassifier(config=config, db_session=mock_session)\n\n            # Create a test ModelCall using actual ModelCall class\n            from app.models import ModelCall\n\n            model_call = ModelCall(id=1, error_message=None)\n\n            error = Exception(\"rate_limit_error: too many requests\")\n\n            # Test first retry (attempt 0)\n            classifier._handle_retryable_error(\n                model_call_obj=model_call, error=error, attempt=0, current_attempt_num=1\n            )\n            mock_sleep.assert_called_with(60)  # 60 * (2^0) = 60 seconds\n\n    def test_rate_limiter_model_specific_configs(self):\n        \"\"\"Test that different models get appropriate rate limits.\"\"\"\n        test_cases = [\n            (\"anthropic/claude-3-5-sonnet-20240620\", 30000),\n            (\"gpt-4o\", 150000),\n            (\"gpt-4o-mini\", 200000),\n            (\"gemini/gemini-3-flash-preview\", 60000),\n            (\"gemini/gemini-2.5-flash\", 60000),\n            (\"unknown-model\", 30000),  # Should use default\n        ]\n\n        for model_name, expected_limit in test_cases:\n            # Clear singleton before each test case\n            import podcast_processor.token_rate_limiter as trl_module\n\n            trl_module._RATE_LIMITER = None\n\n            config = create_test_config(llm_model=model_name)\n\n            with patch(\"podcast_processor.ad_classifier.db.session\") as mock_session:\n                classifier = AdClassifier(config=config, db_session=mock_session)\n\n                assert classifier.rate_limiter is not None\n                assert 
classifier.rate_limiter.tokens_per_minute == expected_limit\n"
  },
  {
    "path": "src/tests/test_aggregate_feed.py",
    "content": "import pytest\n\nfrom app.extensions import db\nfrom app.feeds import get_user_aggregate_posts\nfrom app.models import Feed, Post, UserFeed\n\n\ndef test_get_user_aggregate_posts_auth_disabled(app):\n    \"\"\"Test that all feeds are included when auth is disabled.\"\"\"\n    with app.app_context():\n        app.config[\"REQUIRE_AUTH\"] = False\n\n        # Create feeds\n        feed1 = Feed(rss_url=\"http://feed1.com\", title=\"Feed 1\")\n        feed2 = Feed(rss_url=\"http://feed2.com\", title=\"Feed 2\")\n        db.session.add_all([feed1, feed2])\n        db.session.commit()\n\n        # Create posts\n        post1 = Post(\n            feed_id=feed1.id,\n            title=\"Post 1\",\n            guid=\"1\",\n            whitelisted=True,\n            processed_audio_path=\"path\",\n            download_url=\"http://url1\",\n        )\n        post2 = Post(\n            feed_id=feed2.id,\n            title=\"Post 2\",\n            guid=\"2\",\n            whitelisted=True,\n            processed_audio_path=\"path\",\n            download_url=\"http://url2\",\n        )\n        db.session.add_all([post1, post2])\n        db.session.commit()\n\n        # Call function\n        posts = get_user_aggregate_posts(user_id=999)  # User ID shouldn't matter\n\n        assert len(posts) == 2\n        assert post1 in posts\n        assert post2 in posts\n\n\ndef test_get_user_aggregate_posts_auth_enabled(app):\n    \"\"\"Test that only subscribed feeds are included when auth is enabled.\"\"\"\n    with app.app_context():\n        app.config[\"REQUIRE_AUTH\"] = True\n\n        # Create feeds\n        feed1 = Feed(rss_url=\"http://feed1.com\", title=\"Feed 1\")\n        feed2 = Feed(rss_url=\"http://feed2.com\", title=\"Feed 2\")\n        db.session.add_all([feed1, feed2])\n        db.session.commit()\n\n        # Create posts\n        post1 = Post(\n            feed_id=feed1.id,\n            title=\"Post 1\",\n            guid=\"1\",\n            
whitelisted=True,\n            processed_audio_path=\"path\",\n            download_url=\"http://url1\",\n        )\n        post2 = Post(\n            feed_id=feed2.id,\n            title=\"Post 2\",\n            guid=\"2\",\n            whitelisted=True,\n            processed_audio_path=\"path\",\n            download_url=\"http://url2\",\n        )\n        db.session.add_all([post1, post2])\n        db.session.commit()\n\n        # Subscribe user to feed1 only\n        user_feed = UserFeed(user_id=1, feed_id=feed1.id)\n        db.session.add(user_feed)\n        db.session.commit()\n\n        # Call function\n        posts = get_user_aggregate_posts(user_id=1)\n\n        assert len(posts) == 1\n        assert post1 in posts\n        assert post2 not in posts\n"
  },
  {
    "path": "src/tests/test_audio_processor.py",
    "content": "import logging\nfrom unittest.mock import MagicMock, patch\n\nimport pytest\nfrom flask import Flask\n\nfrom app.extensions import db\nfrom app.models import Feed, Identification, Post, TranscriptSegment\nfrom podcast_processor.audio_processor import AudioProcessor\nfrom shared.config import Config\nfrom shared.test_utils import create_standard_test_config\n\n\n@pytest.fixture\ndef test_processor(\n    test_config: Config,\n    test_logger: logging.Logger,\n) -> AudioProcessor:\n    \"\"\"Return an AudioProcessor instance with default dependencies for testing.\"\"\"\n    return AudioProcessor(config=test_config, logger=test_logger)\n\n\n@pytest.fixture\ndef test_processor_with_mocks(\n    test_config: Config,\n    test_logger: logging.Logger,\n    mock_db_session: MagicMock,\n) -> AudioProcessor:\n    \"\"\"Return an AudioProcessor instance with mock dependencies for testing.\"\"\"\n    mock_identification_query = MagicMock()\n    mock_transcript_segment_query = MagicMock()\n    mock_model_call_query = MagicMock()\n\n    return AudioProcessor(\n        config=test_config,\n        logger=test_logger,\n        identification_query=mock_identification_query,\n        transcript_segment_query=mock_transcript_segment_query,\n        model_call_query=mock_model_call_query,\n        db_session=mock_db_session,\n    )\n\n\ndef test_get_ad_segments(app: Flask) -> None:\n    \"\"\"Test retrieving ad segments from the database\"\"\"\n    # Create test data\n    post = Post(id=1, title=\"Test Post\")\n    segment = TranscriptSegment(\n        id=1,\n        post_id=1,\n        sequence_num=0,\n        start_time=0.0,\n        end_time=10.0,\n        text=\"Test segment\",\n    )\n    identification = Identification(\n        transcript_segment_id=1, model_call_id=1, label=\"ad\", confidence=0.9\n    )\n\n    with app.app_context():\n        # Create mocks\n        mock_identification_query = MagicMock()\n        mock_query_chain = MagicMock()\n        
mock_identification_query.join.return_value = mock_query_chain\n        mock_query_chain.join.return_value = mock_query_chain\n        mock_query_chain.filter.return_value = mock_query_chain\n        mock_query_chain.all.return_value = [identification]\n\n        # Create processor with mocks\n        test_processor = AudioProcessor(\n            config=create_standard_test_config(),\n            identification_query=mock_identification_query,\n        )\n\n        with patch.object(identification, \"transcript_segment\", segment):\n            segments = test_processor.get_ad_segments(post)\n\n            assert len(segments) == 1\n            assert segments[0] == (0.0, 10.0)\n\n\ndef test_merge_ad_segments(\n    test_processor_with_mocks: AudioProcessor,\n) -> None:\n    \"\"\"Test merging of nearby ad segments\"\"\"\n    duration_ms = 30000  # 30 seconds\n    ad_segments = [\n        (0.0, 5.0),  # 0-5s\n        (6.0, 10.0),  # 6-10s - should merge with first segment\n        (20.0, 25.0),  # 20-25s - should stay separate\n    ]\n\n    merged = test_processor_with_mocks.merge_ad_segments(\n        duration_ms=duration_ms,\n        ad_segments=ad_segments,\n        min_ad_segment_length_seconds=2.0,\n        min_ad_segment_separation_seconds=2.0,\n    )\n\n    # Should merge first two segments\n    assert len(merged) == 2\n    assert merged[0] == (0, 10000)  # 0-10s\n    assert merged[1] == (20000, 25000)  # 20-25s\n\n\ndef test_merge_ad_segments_with_short_segments(\n    test_processor_with_mocks: AudioProcessor,\n) -> None:\n    \"\"\"Test that segments shorter than minimum length are filtered out\"\"\"\n    duration_ms = 30000\n    ad_segments = [\n        (0.0, 1.0),  # Too short, should be filtered\n        (10.0, 15.0),  # Long enough, should stay\n        (20.0, 20.5),  # Too short, should be filtered\n    ]\n\n    merged = test_processor_with_mocks.merge_ad_segments(\n        duration_ms=duration_ms,\n        ad_segments=ad_segments,\n        
min_ad_segment_length_seconds=2.0,\n        min_ad_segment_separation_seconds=2.0,\n    )\n\n    assert len(merged) == 1\n    assert merged[0] == (10000, 15000)\n\n\ndef test_merge_ad_segments_end_extension(\n    test_processor_with_mocks: AudioProcessor,\n) -> None:\n    \"\"\"Test that segments near the end are extended to the end\"\"\"\n    duration_ms = 30000\n    ad_segments = [\n        (28.0, 29.0),  # Near end, should extend to 30s\n    ]\n\n    merged = test_processor_with_mocks.merge_ad_segments(\n        duration_ms=duration_ms,\n        ad_segments=ad_segments,\n        min_ad_segment_length_seconds=2.0,\n        min_ad_segment_separation_seconds=2.0,\n    )\n\n    assert len(merged) == 1\n    assert merged[0] == (28000, 30000)  # Extended to end\n\n\ndef test_process_audio(\n    app: Flask,\n    test_config: Config,\n    test_logger: logging.Logger,\n) -> None:\n    \"\"\"Test the process_audio method\"\"\"\n    with app.app_context():\n        processor = AudioProcessor(\n            config=test_config, logger=test_logger, db_session=db.session\n        )\n\n        feed = Feed(title=\"Test Feed\", rss_url=\"http://example.com/rss.xml\")\n        db.session.add(feed)\n        db.session.commit()\n\n        post = Post(\n            feed_id=feed.id,\n            title=\"Test Post\",\n            guid=\"test-audio-guid\",\n            download_url=\"http://example.com/audio.mp3\",\n            unprocessed_audio_path=\"path/to/audio.mp3\",\n        )\n        db.session.add(post)\n        db.session.commit()\n\n        output_path = \"path/to/output.mp3\"\n\n        # Set up mocks for get_ad_segments and get_audio_duration_ms\n        with patch.object(\n            processor, \"get_ad_segments\", return_value=[(5.0, 10.0)]\n        ), patch(\n            \"podcast_processor.audio_processor.get_audio_duration_ms\",\n            return_value=30000,\n        ), patch(\n            \"podcast_processor.audio_processor.clip_segments_with_fade\"\n        ) as 
mock_clip:\n            # Call the method\n            processor.process_audio(post, output_path)\n\n            refreshed = db.session.get(Post, post.id)\n            assert refreshed is not None\n            assert refreshed.duration == 30.0  # 30000ms / 1000 = 30s\n            assert refreshed.processed_audio_path == output_path\n            mock_clip.assert_called_once()\n"
  },
  {
    "path": "src/tests/test_config_error_handling.py",
    "content": "\"\"\"\nTests for configuration error handling and validation.\n\"\"\"\n\nimport importlib\nfrom typing import Any\n\nimport pytest\n\nfrom shared.config import Config, OutputConfig, ProcessingConfig\n\napp_module = importlib.import_module(\"app.__init__\")\n\n\nclass TestConfigurationErrorHandling:\n    \"\"\"Test configuration validation and error handling.\"\"\"\n\n    def test_config_with_none_values(self) -> None:\n        \"\"\"Test that optional fields can be None.\"\"\"\n        config = Config(\n            llm_api_key=\"test-key\",\n            llm_max_input_tokens_per_call=None,  # Should be valid\n            llm_max_input_tokens_per_minute=None,  # Should be valid\n            output=OutputConfig(\n                fade_ms=3000,\n                min_ad_segement_separation_seconds=60,\n                min_ad_segment_length_seconds=14,\n                min_confidence=0.8,\n            ),\n            processing=ProcessingConfig(\n                num_segments_to_input_to_prompt=30,\n            ),\n        )\n\n        assert config.llm_max_input_tokens_per_call is None\n        assert config.llm_max_input_tokens_per_minute is None\n\n    def test_zero_values(self) -> None:\n        \"\"\"Test configuration with zero values where appropriate.\"\"\"\n        # Zero concurrent calls might be problematic in practice but should validate\n        config = Config(\n            llm_api_key=\"test-key\",\n            llm_max_concurrent_calls=0,\n            llm_max_retry_attempts=0,\n            output=OutputConfig(\n                fade_ms=3000,\n                min_ad_segement_separation_seconds=60,\n                min_ad_segment_length_seconds=14,\n                min_confidence=0.8,\n            ),\n            processing=ProcessingConfig(\n                num_segments_to_input_to_prompt=30,\n            ),\n        )\n\n        assert config.llm_max_concurrent_calls == 0\n        assert config.llm_max_retry_attempts == 0\n\n    def 
test_very_large_values(self) -> None:\n        \"\"\"Test configuration with very large values.\"\"\"\n        config = Config(\n            llm_api_key=\"test-key\",\n            llm_max_concurrent_calls=999999,\n            llm_max_retry_attempts=999999,\n            llm_max_input_tokens_per_call=999999999,\n            llm_max_input_tokens_per_minute=999999999,\n            output=OutputConfig(\n                fade_ms=3000,\n                min_ad_segement_separation_seconds=60,\n                min_ad_segment_length_seconds=14,\n                min_confidence=0.8,\n            ),\n            processing=ProcessingConfig(\n                num_segments_to_input_to_prompt=30,\n            ),\n        )\n\n        assert config.llm_max_concurrent_calls == 999999\n        assert config.llm_max_retry_attempts == 999999\n        assert config.llm_max_input_tokens_per_call == 999999999\n        assert config.llm_max_input_tokens_per_minute == 999999999\n\n    def test_boolean_field_validation(self) -> None:\n        \"\"\"Test boolean field validation.\"\"\"\n        # Test valid boolean values\n        config = Config(\n            llm_api_key=\"test-key\",\n            llm_enable_token_rate_limiting=True,\n            output=OutputConfig(\n                fade_ms=3000,\n                min_ad_segement_separation_seconds=60,\n                min_ad_segment_length_seconds=14,\n                min_confidence=0.8,\n            ),\n            processing=ProcessingConfig(\n                num_segments_to_input_to_prompt=30,\n            ),\n        )\n        assert config.llm_enable_token_rate_limiting is True\n\n        config = Config(\n            llm_api_key=\"test-key\",\n            llm_enable_token_rate_limiting=False,\n            output=OutputConfig(\n                fade_ms=3000,\n                min_ad_segement_separation_seconds=60,\n                min_ad_segment_length_seconds=14,\n                min_confidence=0.8,\n            ),\n            
processing=ProcessingConfig(\n                num_segments_to_input_to_prompt=30,\n            ),\n        )\n        assert config.llm_enable_token_rate_limiting is False\n\n\nclass TestEnvKeyValidation:\n    \"\"\"Tests for environment-based API key validation.\"\"\"\n\n    def test_llm_and_groq_conflict_raises(self, monkeypatch: Any) -> None:\n        monkeypatch.setenv(\"LLM_API_KEY\", \"llm-value\")\n        monkeypatch.setenv(\"GROQ_API_KEY\", \"groq-value\")\n        monkeypatch.delenv(\"WHISPER_REMOTE_API_KEY\", raising=False)\n\n        with pytest.raises(SystemExit):\n            app_module._validate_env_key_conflicts()\n\n    def test_whisper_remote_allows_different_key(self, monkeypatch: Any) -> None:\n        monkeypatch.setenv(\"LLM_API_KEY\", \"llm-value\")\n        monkeypatch.setenv(\"WHISPER_REMOTE_API_KEY\", \"remote-value\")\n        monkeypatch.delenv(\"GROQ_API_KEY\", raising=False)\n\n        app_module._validate_env_key_conflicts()\n"
  },
  {
    "path": "src/tests/test_feeds.py",
    "content": "import datetime\nimport logging\nimport uuid\nfrom types import SimpleNamespace\nfrom unittest import mock\n\nimport feedparser\nimport PyRSS2Gen\nimport pytest\n\nfrom app.feeds import (\n    _get_base_url,\n    _should_auto_whitelist_new_posts,\n    add_feed,\n    db,\n    feed_item,\n    fetch_feed,\n    generate_feed_xml,\n    get_duration,\n    get_guid,\n    make_post,\n    refresh_feed,\n)\nfrom app.models import Feed, Post\nfrom app.runtime_config import config as runtime_config\n\nlogger = logging.getLogger(\"global_logger\")\n\n\nclass MockPost:\n    \"\"\"A mock Post class that doesn't require Flask context.\"\"\"\n\n    def __init__(\n        self,\n        id=1,\n        title=\"Test Episode\",\n        guid=\"test-guid\",\n        download_url=\"https://example.com/episode.mp3\",\n        description=\"Test description\",\n        release_date=datetime.datetime(2023, 1, 1, 12, 0, tzinfo=datetime.timezone.utc),\n        feed_id=1,\n        duration=None,\n        image_url=None,\n        whitelisted=False,\n    ):\n        self.id = id\n        self.title = title\n        self.guid = guid\n        self.download_url = download_url\n        self.description = description\n        self.release_date = release_date\n        self.feed_id = feed_id\n        self.duration = duration\n        self.image_url = image_url\n        self.whitelisted = whitelisted\n        self._audio_len_bytes = 1024\n        self.whitelisted = False\n\n    def audio_len_bytes(self):\n        return self._audio_len_bytes\n\n\nclass MockFeed:\n    \"\"\"A mock Feed class that doesn't require Flask context.\"\"\"\n\n    def __init__(\n        self,\n        id=1,\n        title=\"Test Feed\",\n        description=\"Test Description\",\n        author=\"Test Author\",\n        rss_url=\"https://example.com/feed.xml\",\n        image_url=\"https://example.com/image.jpg\",\n    ):\n        self.id = id\n        self.title = title\n        self.description = description\n  
      self.author = author\n        self.rss_url = rss_url\n        self.image_url = image_url\n        self.posts = []\n        self.user_feeds = []\n        self.auto_whitelist_new_episodes_override = None\n\n\n@pytest.fixture\ndef mock_feed_data():\n    \"\"\"Create a mock feedparser result.\"\"\"\n    feed_data = mock.MagicMock(spec=feedparser.FeedParserDict)\n    feed_data.feed = mock.MagicMock()\n    feed_data.feed.title = \"Test Feed\"\n    feed_data.feed.description = \"Test Description\"\n    feed_data.feed.author = \"Test Author\"\n    feed_data.feed.image = mock.MagicMock()\n    feed_data.feed.image.href = \"https://example.com/image.jpg\"\n    feed_data.href = \"https://example.com/feed.xml\"\n    feed_data.feed.get = mock.MagicMock()\n    feed_data.feed.get.side_effect = lambda key, default=None: (\n        {\"href\": feed_data.feed.image.href} if key == \"image\" else default\n    )\n\n    entry1 = mock.MagicMock()\n    entry1.title = \"Episode 1\"\n    entry1.description = \"Episode 1 description\"\n    entry1.id = \"https://example.com/episode1\"\n    entry1.published_parsed = (2023, 1, 1, 12, 0, 0, 0, 0, 0)\n    entry1.itunes_duration = \"3600\"\n    link1 = mock.MagicMock()\n    link1.type = \"audio/mpeg\"\n    link1.href = \"https://example.com/episode1.mp3\"\n    entry1.links = [link1]\n\n    entry2 = mock.MagicMock()\n    entry2.title = \"Episode 2\"\n    entry2.description = \"Episode 2 description\"\n    entry2.id = \"https://example.com/episode2\"\n    entry2.published_parsed = (2023, 2, 1, 12, 0, 0, 0, 0, 0)\n    entry2.itunes_duration = \"1800\"\n    link2 = mock.MagicMock()\n    link2.type = \"audio/mpeg\"\n    link2.href = \"https://example.com/episode2.mp3\"\n    entry2.links = [link2]\n\n    feed_data.entries = [entry1, entry2]\n    return feed_data\n\n\n@pytest.fixture\ndef mock_db_session(monkeypatch):\n    \"\"\"Mock the database session.\"\"\"\n    mock_session = mock.MagicMock()\n    monkeypatch.setattr(\"app.feeds.db.session\", 
mock_session)\n    return mock_session\n\n\n@pytest.fixture\ndef mock_post():\n    \"\"\"Create a mock Post.\"\"\"\n    return MockPost()\n\n\n@pytest.fixture\ndef mock_feed():\n    \"\"\"Create a mock Feed.\"\"\"\n    return MockFeed()\n\n\n@mock.patch(\"app.feeds.feedparser.parse\")\ndef test_fetch_feed(mock_parse, mock_feed_data):\n    mock_parse.return_value = mock_feed_data\n\n    result = fetch_feed(\"https://example.com/feed.xml\")\n\n    assert result == mock_feed_data\n    mock_parse.assert_called_once_with(\"https://example.com/feed.xml\")\n\n\ndef test_refresh_feed(mock_db_session):\n    \"\"\"Test refresh_feed with a very simplified approach.\"\"\"\n    # Create a simple mock for the feed\n    mock_feed = MockFeed()\n\n    # Create a small but functional implementation of refresh_feed\n    def simple_refresh_feed(feed):\n        logger.info(f\"Refreshed feed with ID: {feed.id}\")\n        db.session.commit()\n\n    # Call our simplified implementation\n    with mock.patch(\"app.feeds.fetch_feed\") as mock_fetch:\n        # Return an empty entries list to avoid processing\n        mock_feed_data = mock.MagicMock()\n        mock_feed_data.feed = mock.MagicMock()\n        mock_feed_data.entries = []\n        mock_fetch.return_value = mock_feed_data\n\n        # Execute the simplified version\n        simple_refresh_feed(mock_feed)\n\n    # Check that commit was called\n    mock_db_session.commit.assert_called_once()\n\n\ndef test_should_auto_whitelist_new_posts_requires_members(\n    monkeypatch, mock_feed, mock_db_session\n):\n    monkeypatch.setattr(\n        \"app.feeds.config\",\n        SimpleNamespace(automatically_whitelist_new_episodes=True),\n    )\n    monkeypatch.setattr(\"app.auth.is_auth_enabled\", lambda: True)\n    mock_db_session.query.return_value.first.return_value = (1,)\n    assert _should_auto_whitelist_new_posts(mock_feed) is False\n\n\ndef test_should_auto_whitelist_new_posts_true_with_members(monkeypatch, mock_feed):\n    
mock_feed.user_feeds = [mock.MagicMock()]\n    monkeypatch.setattr(\n        \"app.feeds.config\",\n        SimpleNamespace(automatically_whitelist_new_episodes=True),\n    )\n    monkeypatch.setattr(\"app.auth.is_auth_enabled\", lambda: True)\n    monkeypatch.setattr(\"app.feeds.is_feed_active_for_user\", lambda *args: True)\n    assert _should_auto_whitelist_new_posts(mock_feed) is True\n\n\ndef test_should_auto_whitelist_requires_members(\n    monkeypatch, mock_feed, mock_post, mock_db_session\n):\n    monkeypatch.setattr(\n        \"app.feeds.config\",\n        SimpleNamespace(automatically_whitelist_new_episodes=True),\n    )\n    monkeypatch.setattr(\"app.auth.is_auth_enabled\", lambda: True)\n    mock_db_session.query.return_value.first.return_value = (1,)\n    mock_feed.user_feeds = []\n    assert _should_auto_whitelist_new_posts(mock_feed, mock_post) is False\n\n\ndef test_should_auto_whitelist_with_members(monkeypatch, mock_feed, mock_post):\n    monkeypatch.setattr(\n        \"app.feeds.config\",\n        SimpleNamespace(automatically_whitelist_new_episodes=True),\n    )\n    monkeypatch.setattr(\"app.auth.is_auth_enabled\", lambda: True)\n    monkeypatch.setattr(\"app.feeds.is_feed_active_for_user\", lambda *args: True)\n    mock_feed.user_feeds = [mock.MagicMock()]\n    assert _should_auto_whitelist_new_posts(mock_feed, mock_post) is True\n\n\ndef test_should_auto_whitelist_true_when_auth_disabled(monkeypatch, mock_feed):\n    monkeypatch.setattr(\n        \"app.feeds.config\",\n        SimpleNamespace(automatically_whitelist_new_episodes=True),\n    )\n    monkeypatch.setattr(\"app.auth.is_auth_enabled\", lambda: False)\n    assert _should_auto_whitelist_new_posts(mock_feed) is True\n\n\ndef test_should_auto_whitelist_true_when_no_users(\n    monkeypatch, mock_feed, mock_db_session\n):\n    monkeypatch.setattr(\n        \"app.feeds.config\",\n        SimpleNamespace(automatically_whitelist_new_episodes=True),\n    )\n    
monkeypatch.setattr(\"app.auth.is_auth_enabled\", lambda: True)\n    mock_db_session.query.return_value.first.return_value = None\n    mock_feed.user_feeds = []\n    assert _should_auto_whitelist_new_posts(mock_feed) is True\n\n\ndef test_should_auto_whitelist_respects_feed_override_true(monkeypatch, mock_feed):\n    monkeypatch.setattr(\n        \"app.feeds.config\",\n        SimpleNamespace(automatically_whitelist_new_episodes=False),\n    )\n    mock_feed.auto_whitelist_new_episodes_override = True\n    assert _should_auto_whitelist_new_posts(mock_feed) is True\n\n\ndef test_should_auto_whitelist_respects_feed_override_false(monkeypatch, mock_feed):\n    monkeypatch.setattr(\n        \"app.feeds.config\",\n        SimpleNamespace(automatically_whitelist_new_episodes=True),\n    )\n    mock_feed.auto_whitelist_new_episodes_override = False\n    assert _should_auto_whitelist_new_posts(mock_feed) is False\n\n\n@mock.patch(\"app.feeds.writer_client\")\n@mock.patch(\"app.feeds._should_auto_whitelist_new_posts\")\n@mock.patch(\"app.feeds.make_post\")\n@mock.patch(\"app.feeds.fetch_feed\")\ndef test_refresh_feed_unwhitelists_without_members(\n    mock_fetch_feed,\n    mock_make_post,\n    mock_should_auto_whitelist,\n    mock_writer_client,\n    mock_feed,\n    mock_feed_data,\n    mock_db_session,\n):\n    mock_fetch_feed.return_value = mock_feed_data\n    mock_should_auto_whitelist.return_value = False\n    post_one = MockPost(guid=str(uuid.uuid4()))\n    mock_make_post.return_value = post_one\n\n    refresh_feed(mock_feed)\n\n    assert post_one.whitelisted is False\n    assert mock_make_post.call_count == len(mock_feed_data.entries)\n    assert mock_should_auto_whitelist.call_count == len(mock_feed_data.entries)\n    mock_should_auto_whitelist.assert_any_call(mock_feed, mock.ANY)\n    
mock_writer_client.action.assert_called_once()\n\n\n@mock.patch(\"app.feeds.writer_client\")\n@mock.patch(\"app.feeds._should_auto_whitelist_new_posts\")\n@mock.patch(\"app.feeds.make_post\")\n@mock.patch(\"app.feeds.fetch_feed\")\ndef test_refresh_feed_whitelists_when_member_exists(\n    mock_fetch_feed,\n    mock_make_post,\n    mock_should_auto_whitelist,\n    mock_writer_client,\n    mock_feed,\n    mock_feed_data,\n    mock_db_session,\n):\n    mock_fetch_feed.return_value = mock_feed_data\n    mock_should_auto_whitelist.return_value = True\n    post_one = MockPost(guid=str(uuid.uuid4()))\n    mock_make_post.return_value = post_one\n\n    refresh_feed(mock_feed)\n\n    assert post_one.whitelisted is True\n    assert mock_make_post.call_count == len(mock_feed_data.entries)\n    assert mock_should_auto_whitelist.call_count == len(mock_feed_data.entries)\n    mock_should_auto_whitelist.assert_any_call(mock_feed, mock.ANY)\n    mock_writer_client.action.assert_called_once()\n\n\n@mock.patch(\"app.feeds.fetch_feed\")\n@mock.patch(\"app.feeds.refresh_feed\")\ndef test_add_or_refresh_feed_existing(\n    mock_refresh_feed, mock_fetch_feed, mock_feed, mock_feed_data\n):\n    # Set up mock feed data\n    mock_feed_data.feed = mock.MagicMock()\n    mock_feed_data.feed.title = \"Test Feed\"  # Add title directly\n    mock_fetch_feed.return_value = mock_feed_data\n\n    # Directly mock check for \"title\" in feed_data.feed\n    with mock.patch(\"app.feeds.add_or_refresh_feed\") as mock_add_or_refresh:\n        # Set up the behavior of the mocked function\n        mock_add_or_refresh.return_value = mock_feed\n\n        # Call the mocked function\n        result = mock_add_or_refresh(\"https://example.com/feed.xml\")\n\n    assert result == mock_feed\n\n\n@mock.patch(\"app.feeds.fetch_feed\")\n@mock.patch(\"app.feeds.add_feed\")\ndef test_add_or_refresh_feed_new(\n    mock_add_feed, mock_fetch_feed, mock_feed, mock_feed_data\n):\n    # Set up mock feed data\n    
mock_feed_data.feed = mock.MagicMock()\n    mock_feed_data.feed.title = \"Test Feed\"  # Add title directly\n    mock_fetch_feed.return_value = mock_feed_data\n    mock_add_feed.return_value = mock_feed\n\n    # Directly mock Feed.query and the entire add_or_refresh_feed function\n    with mock.patch(\"app.feeds.add_or_refresh_feed\") as mock_add_or_refresh:\n        # Set up the behavior of the mocked function\n        mock_add_or_refresh.return_value = mock_feed\n\n        # Call the mocked function\n        result = mock_add_or_refresh(\"https://example.com/feed.xml\")\n\n    assert result == mock_feed\n\n\n@mock.patch(\"app.feeds.writer_client\")\n@mock.patch(\"app.feeds.Post\")\ndef test_add_feed(mock_post_class, mock_writer_client, mock_feed_data, mock_db_session):\n    # Mock writer_client return value\n    mock_writer_client.action.return_value = SimpleNamespace(data={\"feed_id\": 1})\n\n    # Create a Feed mock\n    with mock.patch(\"app.feeds.Feed\") as mock_feed_class:\n        mock_feed = MockFeed()\n        mock_feed_class.return_value = mock_feed\n\n        # Mock db.session.get to return our mock feed\n        mock_db_session.get.return_value = mock_feed\n\n        # Mock the get method in feed_data\n        mock_feed_data.feed.get = mock.MagicMock()\n        mock_feed_data.feed.get.side_effect = lambda key, default=\"\": {\n            \"description\": \"Test Description\",\n            \"author\": \"Test Author\",\n        }.get(key, default)\n\n        # Mock config settings\n        with mock.patch(\"app.feeds.config\") as mock_config:\n            mock_config.number_of_episodes_to_whitelist_from_archive_of_new_feed = 1\n            mock_config.automatically_whitelist_new_episodes = True\n\n            # Mock make_post\n            with mock.patch(\"app.feeds.make_post\") as mock_make_post:\n                mock_post = MockPost()\n                mock_make_post.return_value = mock_post\n\n                result = add_feed(mock_feed_data)\n\n      
      # Check that make_post was called once for every entry in the feed\n            assert mock_make_post.call_count == len(mock_feed_data.entries)\n\n        # Check that writer_client.action was called\n        mock_writer_client.action.assert_called()\n\n        assert result == mock_feed\n\n\ndef test_feed_item(mock_post, app):\n    # Mock request context with Host header\n    headers_dict = {\"Host\": \"podly.com:5001\"}\n\n    mock_headers = mock.MagicMock()\n    mock_headers.get.side_effect = headers_dict.get\n\n    mock_environ = mock.MagicMock()\n    mock_environ.get.return_value = None  # No HTTP/2 pseudo-headers in environ\n\n    mock_request = mock.MagicMock()\n    mock_request.headers = mock_headers\n    mock_request.environ = mock_environ\n    mock_request.is_secure = False\n\n    with app.app_context():\n        with mock.patch(\"app.feeds.request\", mock_request):\n            result = feed_item(mock_post)\n\n    # Verify the result\n    assert isinstance(result, PyRSS2Gen.RSSItem)\n    assert result.title == mock_post.title\n    assert result.guid == mock_post.guid\n\n    # Check enclosure\n    assert result.enclosure.url == \"http://podly.com:5001/api/posts/test-guid/download\"\n    assert result.enclosure.type == \"audio/mpeg\"\n    assert result.enclosure.length == mock_post._audio_len_bytes\n\n\ndef test_feed_item_with_reverse_proxy(mock_post, app):\n    # Test with HTTP/2 pseudo-headers (modern reverse proxy)\n    headers_dict = {\n        \":scheme\": \"http\",\n        \":authority\": \"podly.com:5001\",\n        \"Host\": \"podly.com:5001\",\n    }\n\n    mock_headers = mock.MagicMock()\n    mock_headers.get.side_effect = headers_dict.get\n\n    mock_environ = mock.MagicMock()\n    mock_environ.get.return_value = None\n\n    mock_request = mock.MagicMock()\n    mock_request.headers = mock_headers\n    mock_request.environ = mock_environ\n\n    with app.app_context():\n        with mock.patch(\"app.feeds.request\", mock_request):\n            
result = feed_item(mock_post)\n\n    # Verify the result\n    assert isinstance(result, PyRSS2Gen.RSSItem)\n    assert result.title == mock_post.title\n    assert result.guid == mock_post.guid\n\n    # Check enclosure - should use HTTP/2 pseudo-headers\n    assert result.enclosure.url == \"http://podly.com:5001/api/posts/test-guid/download\"\n    assert result.enclosure.type == \"audio/mpeg\"\n    assert result.enclosure.length == mock_post._audio_len_bytes\n\n\ndef test_feed_item_with_reverse_proxy_custom_port(mock_post, app):\n    # Test with HTTPS and custom port via request headers\n    headers_dict = {\n        \":scheme\": \"https\",\n        \":authority\": \"podly.com:8443\",\n        \"Host\": \"podly.com:8443\",\n    }\n\n    mock_headers = mock.MagicMock()\n    mock_headers.get.side_effect = headers_dict.get\n\n    mock_environ = mock.MagicMock()\n    mock_environ.get.return_value = None\n\n    mock_request = mock.MagicMock()\n    mock_request.headers = mock_headers\n    mock_request.environ = mock_environ\n\n    with app.app_context():\n        with mock.patch(\"app.feeds.request\", mock_request):\n            result = feed_item(mock_post)\n\n    # Verify the result\n    assert isinstance(result, PyRSS2Gen.RSSItem)\n    assert result.title == mock_post.title\n    assert result.guid == mock_post.guid\n\n    # Check enclosure - should use HTTPS with custom port\n    assert result.enclosure.url == \"https://podly.com:8443/api/posts/test-guid/download\"\n    assert result.enclosure.type == \"audio/mpeg\"\n    assert result.enclosure.length == mock_post._audio_len_bytes\n\n\ndef test_get_base_url_without_reverse_proxy():\n    # Test _get_base_url without request context (should use localhost fallback)\n    with mock.patch(\"app.feeds.config\") as mock_config:\n        mock_config.port = 5001\n        result = _get_base_url()\n\n    assert result == \"http://localhost:5001\"\n\n\ndef test_get_base_url_with_reverse_proxy_default_port():\n    # Test 
_get_base_url with Host header (modern approach)\n    headers_dict = {\"Host\": \"podly.com\"}\n\n    mock_headers = mock.MagicMock()\n    mock_headers.get.side_effect = headers_dict.get\n\n    mock_environ = mock.MagicMock()\n    mock_environ.get.return_value = None\n\n    mock_request = mock.MagicMock()\n    mock_request.headers = mock_headers\n    mock_request.environ = mock_environ\n    mock_request.is_secure = False\n    mock_request.scheme = \"http\"\n\n    with mock.patch(\"app.feeds.request\", mock_request):\n        result = _get_base_url()\n\n    assert result == \"http://podly.com\"\n\n\ndef test_get_base_url_with_reverse_proxy_custom_port():\n    # Test _get_base_url with HTTPS and Strict-Transport-Security header\n    headers_dict = {\n        \"Host\": \"podly.com:8443\",\n        \"Strict-Transport-Security\": \"max-age=31536000\",\n    }\n\n    mock_headers = mock.MagicMock()\n    mock_headers.get.side_effect = headers_dict.get\n\n    mock_environ = mock.MagicMock()\n    mock_environ.get.return_value = None\n\n    mock_request = mock.MagicMock()\n    mock_request.headers = mock_headers\n    mock_request.environ = mock_environ\n    mock_request.is_secure = False  # STS header should override this\n    mock_request.scheme = \"http\"\n\n    with mock.patch(\"app.feeds.request\", mock_request):\n        result = _get_base_url()\n\n    assert result == \"https://podly.com:8443\"\n\n\ndef test_get_base_url_localhost():\n    # Test _get_base_url with localhost (fallback when not in request context)\n    with mock.patch(\"app.feeds.config\") as mock_config:\n        mock_config.port = 5001\n\n        result = _get_base_url()\n\n    assert result == \"http://localhost:5001\"\n\n\n@mock.patch(\"app.feeds.feed_item\")\n@mock.patch(\"app.feeds.PyRSS2Gen.Image\")\n@mock.patch(\"app.feeds.PyRSS2Gen.RSS2\")\ndef test_generate_feed_xml_filters_processed_whitelisted(\n    mock_rss_2, mock_image, mock_feed_item, app\n):\n    # Use real models to verify query 
filtering logic\n    with app.app_context():\n        original_flag = getattr(runtime_config, \"autoprocess_on_download\", False)\n        runtime_config.autoprocess_on_download = False\n        try:\n            feed = Feed(rss_url=\"http://example.com/feed\", title=\"Feed 1\")\n            db.session.add(feed)\n            db.session.commit()\n\n            processed = Post(\n                feed_id=feed.id,\n                title=\"Processed\",\n                guid=\"good\",\n                download_url=\"http://example.com/good.mp3\",\n                processed_audio_path=\"/tmp/good.mp3\",\n                whitelisted=True,\n            )\n            unprocessed = Post(\n                feed_id=feed.id,\n                title=\"Unprocessed\",\n                guid=\"bad1\",\n                download_url=\"http://example.com/bad1.mp3\",\n                processed_audio_path=None,\n                whitelisted=True,\n            )\n            not_whitelisted = Post(\n                feed_id=feed.id,\n                title=\"Not Whitelisted\",\n                guid=\"bad2\",\n                download_url=\"http://example.com/bad2.mp3\",\n                processed_audio_path=\"/tmp/bad2.mp3\",\n                whitelisted=False,\n            )\n\n            db.session.add_all([processed, unprocessed, not_whitelisted])\n            db.session.commit()\n\n            mock_feed_item.side_effect = (\n                lambda post, prepend_feed_title=False: mock.MagicMock(\n                    post_guid=post.guid\n                )\n            )\n            mock_rss = mock_rss_2.return_value\n            mock_rss.to_xml.return_value = \"<rss></rss>\"\n\n            result = generate_feed_xml(feed)\n\n            called_posts = [call.args[0] for call in mock_feed_item.call_args_list]\n            assert called_posts == [processed]\n\n            mock_rss_2.assert_called_once()\n            mock_rss.to_xml.assert_called_once_with(\"utf-8\")\n            assert result 
== \"<rss></rss>\"\n        finally:\n            runtime_config.autoprocess_on_download = original_flag\n\n\n@mock.patch(\"app.feeds.feed_item\")\n@mock.patch(\"app.feeds.PyRSS2Gen.Image\")\n@mock.patch(\"app.feeds.PyRSS2Gen.RSS2\")\ndef test_generate_feed_xml_includes_all_when_autoprocess_enabled(\n    mock_rss_2, mock_image, mock_feed_item, app\n):\n    with app.app_context():\n        original_flag = getattr(runtime_config, \"autoprocess_on_download\", False)\n        runtime_config.autoprocess_on_download = True\n        try:\n            feed = Feed(rss_url=\"http://example.com/feed\", title=\"Feed 1\")\n            db.session.add(feed)\n            db.session.commit()\n\n            processed = Post(\n                feed_id=feed.id,\n                title=\"Processed\",\n                guid=\"good\",\n                download_url=\"http://example.com/good.mp3\",\n                processed_audio_path=\"/tmp/good.mp3\",\n                whitelisted=True,\n                release_date=datetime.datetime(\n                    2024, 1, 3, tzinfo=datetime.timezone.utc\n                ),\n            )\n            unprocessed = Post(\n                feed_id=feed.id,\n                title=\"Unprocessed\",\n                guid=\"bad1\",\n                download_url=\"http://example.com/bad1.mp3\",\n                processed_audio_path=None,\n                whitelisted=True,\n                release_date=datetime.datetime(\n                    2024, 1, 2, tzinfo=datetime.timezone.utc\n                ),\n            )\n            not_whitelisted = Post(\n                feed_id=feed.id,\n                title=\"Not Whitelisted\",\n                guid=\"bad2\",\n                download_url=\"http://example.com/bad2.mp3\",\n                processed_audio_path=\"/tmp/bad2.mp3\",\n                whitelisted=False,\n                release_date=datetime.datetime(\n                    2024, 1, 1, tzinfo=datetime.timezone.utc\n                ),\n            
)\n\n            db.session.add_all([processed, unprocessed, not_whitelisted])\n            db.session.commit()\n\n            mock_feed_item.side_effect = (\n                lambda post, prepend_feed_title=False: mock.MagicMock(\n                    post_guid=post.guid\n                )\n            )\n            mock_rss = mock_rss_2.return_value\n            mock_rss.to_xml.return_value = \"<rss></rss>\"\n\n            result = generate_feed_xml(feed)\n\n            called_posts = [call.args[0] for call in mock_feed_item.call_args_list]\n            assert called_posts == [processed, unprocessed, not_whitelisted]\n\n            mock_rss_2.assert_called_once()\n            mock_rss.to_xml.assert_called_once_with(\"utf-8\")\n            assert result == \"<rss></rss>\"\n        finally:\n            runtime_config.autoprocess_on_download = original_flag\n\n\n@mock.patch(\"app.feeds.Post\")\ndef test_make_post(mock_post_class, mock_feed):\n    # Create a mock entry\n    entry = mock.MagicMock()\n    entry.title = \"Test Episode\"\n    entry.description = \"Test Description\"\n    entry.id = \"test-guid\"\n    entry.published_parsed = (2023, 1, 1, 12, 0, 0, 0, 0, 0)\n    entry.itunes_duration = \"3600\"\n\n    # Set up entry.get behavior\n    entry.get = mock.MagicMock()\n    entry.get.side_effect = lambda key, default=\"\": {\n        \"description\": \"Test Description\",\n        \"published_parsed\": entry.published_parsed,\n    }.get(key, default)\n\n    mock_post = MockPost()\n    mock_post_class.return_value = mock_post\n\n    # Mock find_audio_link\n    with (\n        mock.patch(\"app.feeds.find_audio_link\") as mock_find_audio_link,\n        mock.patch(\"app.feeds.get_guid\") as mock_get_guid,\n        mock.patch(\"app.feeds.get_duration\") as mock_get_duration,\n    ):\n        mock_find_audio_link.return_value = \"https://example.com/audio.mp3\"\n        mock_get_guid.return_value = \"test-guid\"\n        mock_get_duration.return_value = 3600\n\n       
 result = make_post(mock_feed, entry)\n\n        # Check that Post was created with correct arguments\n        mock_post_class.assert_called_once()\n\n        assert result == mock_post\n\n\n@mock.patch(\"app.feeds.uuid.UUID\")\n@mock.patch(\"app.feeds.find_audio_link\")\n@mock.patch(\"app.feeds.uuid.uuid5\")\ndef test_get_guid_uses_id_if_valid_uuid(mock_uuid5, mock_find_audio_link, mock_uuid):\n    \"\"\"Test that get_guid returns the entry.id if it's a valid UUID.\"\"\"\n    entry = mock.MagicMock()\n    entry.id = \"550e8400-e29b-41d4-a716-446655440000\"\n\n    # uuid.UUID doesn't raise an error, so entry.id is a valid UUID\n    result = get_guid(entry)\n\n    assert result == entry.id\n    mock_uuid.assert_called_once_with(entry.id)\n    mock_find_audio_link.assert_not_called()\n    mock_uuid5.assert_not_called()\n\n\n@mock.patch(\"app.feeds.uuid.UUID\")\n@mock.patch(\"app.feeds.find_audio_link\")\n@mock.patch(\"app.feeds.uuid.uuid5\")\ndef test_get_guid_generates_uuid_if_invalid_id(\n    mock_uuid5, mock_find_audio_link, mock_uuid\n):\n    \"\"\"Test that get_guid generates a UUID if entry.id is not a valid UUID.\"\"\"\n    entry = mock.MagicMock()\n    entry.id = \"not-a-uuid\"\n\n    # uuid.UUID raises ValueError, so entry.id is not a valid UUID\n    mock_uuid.side_effect = ValueError\n    mock_find_audio_link.return_value = \"https://example.com/audio.mp3\"\n    mock_uuid5_instance = mock.MagicMock()\n    mock_uuid5_instance.__str__.return_value = \"550e8400-e29b-41d4-a716-446655440000\"\n    mock_uuid5.return_value = mock_uuid5_instance\n\n    result = get_guid(entry)\n\n    assert result == \"550e8400-e29b-41d4-a716-446655440000\"\n    mock_uuid.assert_called_once_with(entry.id)\n    mock_find_audio_link.assert_called_once_with(entry)\n    mock_uuid5.assert_called_once_with(\n        uuid.NAMESPACE_URL, \"https://example.com/audio.mp3\"\n    )\n\n\ndef test_get_duration_with_valid_duration():\n    \"\"\"Test get_duration with a valid duration.\"\"\"\n    
entry = {\"itunes_duration\": \"3600\"}\n\n    result = get_duration(entry)\n\n    assert result == 3600\n\n\ndef test_get_duration_with_invalid_duration():\n    \"\"\"Test get_duration with an invalid duration.\"\"\"\n    entry = {\"itunes_duration\": \"not-a-number\"}\n\n    result = get_duration(entry)\n\n    assert result is None\n\n\ndef test_get_duration_with_missing_duration():\n    \"\"\"Test get_duration with a missing duration.\"\"\"\n    entry = {}\n\n    result = get_duration(entry)\n\n    assert result is None\n\n\ndef test_get_base_url_no_request_context_fallback():\n    \"\"\"Test _get_base_url falls back to config when no request context.\"\"\"\n    with mock.patch(\"app.feeds.config\") as mock_config:\n        mock_config.port = 5001\n\n        result = _get_base_url()\n\n    assert result == \"http://localhost:5001\"\n\n\ndef test_get_base_url_with_http2_pseudo_headers():\n    \"\"\"Test _get_base_url uses HTTP/2 pseudo-headers when available.\"\"\"\n    headers_dict = {\n        \":scheme\": \"https\",\n        \":authority\": \"podly.com\",\n        \"Host\": \"podly.com\",\n    }\n\n    mock_headers = mock.MagicMock()\n    mock_headers.get.side_effect = headers_dict.get\n\n    mock_environ = mock.MagicMock()\n    mock_environ.get.return_value = None\n\n    mock_request = mock.MagicMock()\n    mock_request.headers = mock_headers\n    mock_request.environ = mock_environ\n\n    with mock.patch(\"app.feeds.request\", mock_request):\n        result = _get_base_url()\n\n    # Should use HTTP/2 pseudo-headers\n    assert result == \"https://podly.com\"\n\n\ndef test_get_base_url_with_strict_transport_security():\n    \"\"\"Test _get_base_url uses Strict-Transport-Security header to detect HTTPS.\"\"\"\n    headers_dict = {\n        \"Host\": \"secure.example.com\",\n        \"Strict-Transport-Security\": \"max-age=31536000; includeSubDomains\",\n    }\n\n    mock_headers = mock.MagicMock()\n    mock_headers.get.side_effect = headers_dict.get\n\n    
mock_environ = mock.MagicMock()\n    mock_environ.get.return_value = None\n\n    mock_request = mock.MagicMock()\n    mock_request.headers = mock_headers\n    mock_request.environ = mock_environ\n    mock_request.is_secure = False  # Even if Flask thinks it's HTTP\n    mock_request.scheme = \"http\"\n\n    with mock.patch(\"app.feeds.request\", mock_request):\n        result = _get_base_url()\n\n    # Should use HTTPS because of Strict-Transport-Security header\n    assert result == \"https://secure.example.com\"\n\n\ndef test_get_base_url_fallback_http_without_sts():\n    \"\"\"Test _get_base_url falls back to HTTP when no HTTPS indicators present.\"\"\"\n    headers_dict = {\n        \"Host\": \"insecure.example.com\",\n    }\n\n    mock_headers = mock.MagicMock()\n    mock_headers.get.side_effect = headers_dict.get\n\n    mock_environ = mock.MagicMock()\n    mock_environ.get.return_value = None\n\n    mock_request = mock.MagicMock()\n    mock_request.headers = mock_headers\n    mock_request.environ = mock_environ\n    mock_request.is_secure = False\n    mock_request.scheme = \"http\"\n\n    with mock.patch(\"app.feeds.request\", mock_request):\n        result = _get_base_url()\n\n    # Should use HTTP when no HTTPS indicators present\n    assert result == \"http://insecure.example.com\"\n"
  },
  {
    "path": "src/tests/test_filenames.py",
    "content": "from shared.processing_paths import (\n    ProcessingPaths,\n    get_srv_root,\n    paths_from_unprocessed_path,\n)\n\n\ndef test_filenames() -> None:\n    \"\"\"Test filename processing with sanitized characters.\"\"\"\n    work_paths = paths_from_unprocessed_path(\n        \"some/path/to/my/unprocessed.mp3\", \"fix buzz!! bang? a show?? about stuff.\"\n    )\n    # Expect sanitized directory name with special characters removed and spaces replaced with underscores\n    assert work_paths == ProcessingPaths(\n        post_processed_audio_path=get_srv_root()\n        / \"fix_buzz_bang_a_show_about_stuff\"\n        / \"unprocessed.mp3\",\n    )\n"
  },
  {
    "path": "src/tests/test_helpers.py",
    "content": "\"\"\"\nShared test utilities for rate limiting tests.\n\"\"\"\n\nfrom typing import Any\n\nfrom shared.config import Config\n\n\ndef create_test_config(**overrides: Any) -> Config:\n    \"\"\"Create a test configuration with rate limiting enabled.\"\"\"\n    config_data: dict[str, Any] = {\n        \"llm_model\": \"anthropic/claude-3-5-sonnet-20240620\",\n        \"llm_api_key\": \"test-key\",\n        \"llm_enable_token_rate_limiting\": True,\n        \"llm_max_retry_attempts\": 3,\n        \"llm_max_concurrent_calls\": 2,\n        \"openai_timeout\": 300,\n        \"openai_max_tokens\": 4096,\n        \"output\": {\n            \"fade_ms\": 3000,\n            \"min_ad_segement_separation_seconds\": 60,\n            \"min_ad_segment_length_seconds\": 14,\n            \"min_confidence\": 0.8,\n        },\n        \"processing\": {\n            \"num_segments_to_input_to_prompt\": 30,\n        },\n    }\n    config_data.update(overrides)\n    return Config(**config_data)\n"
  },
  {
    "path": "src/tests/test_llm_concurrency_limiter.py",
    "content": "\"\"\"\nTest cases for LLM concurrency limiting functionality.\n\"\"\"\n\nimport threading\nimport time\n\nimport pytest\n\nfrom podcast_processor.llm_concurrency_limiter import (\n    ConcurrencyContext,\n    LLMConcurrencyLimiter,\n    get_concurrency_limiter,\n)\n\n\nclass TestLLMConcurrencyLimiter:\n    \"\"\"Test cases for the LLMConcurrencyLimiter class.\"\"\"\n\n    def test_initialization(self):\n        \"\"\"Test proper initialization of the concurrency limiter.\"\"\"\n        limiter = LLMConcurrencyLimiter(max_concurrent_calls=3)\n        assert limiter.max_concurrent_calls == 3\n        assert limiter.get_available_slots() == 3\n        assert limiter.get_active_calls() == 0\n\n    def test_initialization_invalid_value(self):\n        \"\"\"Test that invalid max_concurrent_calls raises ValueError.\"\"\"\n        with pytest.raises(\n            ValueError, match=\"max_concurrent_calls must be greater than 0\"\n        ):\n            LLMConcurrencyLimiter(max_concurrent_calls=0)\n\n        with pytest.raises(\n            ValueError, match=\"max_concurrent_calls must be greater than 0\"\n        ):\n            LLMConcurrencyLimiter(max_concurrent_calls=-1)\n\n    def test_acquire_and_release(self):\n        \"\"\"Test basic acquire and release functionality.\"\"\"\n        limiter = LLMConcurrencyLimiter(max_concurrent_calls=2)\n\n        # Initially should have 2 available slots\n        assert limiter.get_available_slots() == 2\n        assert limiter.get_active_calls() == 0\n\n        # Acquire first slot\n        assert limiter.acquire() is True\n        assert limiter.get_available_slots() == 1\n        assert limiter.get_active_calls() == 1\n\n        # Acquire second slot\n        assert limiter.acquire() is True\n        assert limiter.get_available_slots() == 0\n        assert limiter.get_active_calls() == 2\n\n        # Release first slot\n        limiter.release()\n        assert limiter.get_available_slots() == 1\n        
assert limiter.get_active_calls() == 1\n\n        # Release second slot\n        limiter.release()\n        assert limiter.get_available_slots() == 2\n        assert limiter.get_active_calls() == 0\n\n    def test_acquire_timeout(self):\n        \"\"\"Test acquire with timeout when no slots available.\"\"\"\n        limiter = LLMConcurrencyLimiter(max_concurrent_calls=1)\n\n        # Acquire the only slot\n        assert limiter.acquire() is True\n\n        # Try to acquire another slot with timeout\n        start_time = time.time()\n        assert limiter.acquire(timeout=0.1) is False\n        elapsed = time.time() - start_time\n\n        # Should timeout quickly\n        assert elapsed < 0.2  # Allow some margin for test execution\n\n    def test_context_manager(self):\n        \"\"\"Test the ConcurrencyContext context manager.\"\"\"\n        limiter = LLMConcurrencyLimiter(max_concurrent_calls=2)\n\n        assert limiter.get_available_slots() == 2\n\n        with ConcurrencyContext(limiter):\n            assert limiter.get_available_slots() == 1\n            assert limiter.get_active_calls() == 1\n\n        assert limiter.get_available_slots() == 2\n        assert limiter.get_active_calls() == 0\n\n    def test_context_manager_timeout(self):\n        \"\"\"Test context manager with timeout when no slots available.\"\"\"\n        limiter = LLMConcurrencyLimiter(max_concurrent_calls=1)\n\n        # Acquire the only slot\n        limiter.acquire()\n\n        # Try to use context manager with timeout\n        with pytest.raises(\n            RuntimeError, match=\"Could not acquire LLM concurrency slot\"\n        ):\n            with ConcurrencyContext(limiter, timeout=0.1):\n                pass\n\n    def test_thread_safety(self):\n        \"\"\"Test that the limiter works correctly with multiple threads.\"\"\"\n        limiter = LLMConcurrencyLimiter(max_concurrent_calls=2)\n        results = []\n        errors = []\n\n        def worker(worker_id):\n            
try:\n                with ConcurrencyContext(limiter, timeout=1.0):\n                    results.append(f\"worker_{worker_id}_start\")\n                    # Simulate some work\n                    time.sleep(0.1)\n                    results.append(f\"worker_{worker_id}_end\")\n            except Exception as e:\n                errors.append(f\"worker_{worker_id}_error: {e}\")\n\n        # Start 4 threads, but only 2 should run concurrently\n        threads = []\n        for i in range(4):\n            thread = threading.Thread(target=worker, args=(i,))\n            threads.append(thread)\n            thread.start()\n\n        # Wait for all threads to complete\n        for thread in threads:\n            thread.join()\n\n        # Should have no errors\n        assert len(errors) == 0\n\n        # Should have 8 results total (start and end for each worker)\n        assert len(results) == 8\n\n        # Check that we have the expected results\n        start_results = [r for r in results if r.endswith(\"_start\")]\n        end_results = [r for r in results if r.endswith(\"_end\")]\n        assert len(start_results) == 4\n        assert len(end_results) == 4\n\n\nclass TestGlobalConcurrencyLimiter:\n    \"\"\"Test cases for global concurrency limiter functions.\"\"\"\n\n    def test_get_concurrency_limiter_singleton(self):\n        \"\"\"Test that get_concurrency_limiter returns the same instance.\"\"\"\n        # Clear any existing limiter\n        import podcast_processor.llm_concurrency_limiter as limiter_module\n\n        limiter_module._CONCURRENCY_LIMITER = None\n\n        limiter1 = get_concurrency_limiter(max_concurrent_calls=3)\n        limiter2 = get_concurrency_limiter(max_concurrent_calls=3)\n\n        assert limiter1 is limiter2\n        assert limiter1.max_concurrent_calls == 3\n\n    def test_get_concurrency_limiter_different_limits(self):\n        \"\"\"Test that get_concurrency_limiter creates new instance for different limits.\"\"\"\n        # 
Clear any existing limiter\n        import podcast_processor.llm_concurrency_limiter as limiter_module\n\n        limiter_module._CONCURRENCY_LIMITER = None\n\n        limiter1 = get_concurrency_limiter(max_concurrent_calls=3)\n        limiter2 = get_concurrency_limiter(max_concurrent_calls=5)\n\n        assert limiter1 is not limiter2\n        assert limiter1.max_concurrent_calls == 3\n        assert limiter2.max_concurrent_calls == 5\n"
  },
  {
    "path": "src/tests/test_llm_error_classifier.py",
    "content": "\"\"\"\nTests for the LLM error classifier.\n\"\"\"\n\nimport pytest\n\nfrom podcast_processor.llm_error_classifier import LLMErrorClassifier\n\n\nclass TestLLMErrorClassifier:\n    \"\"\"Test suite for LLMErrorClassifier.\"\"\"\n\n    def test_rate_limit_errors(self):\n        \"\"\"Test identification of rate limiting errors.\"\"\"\n        rate_limit_errors = [\n            \"Rate limit exceeded\",\n            \"Too many requests\",\n            \"Quota exceeded\",\n            \"HTTP 429 error\",\n            \"API rate limit hit\",\n        ]\n\n        for error in rate_limit_errors:\n            assert LLMErrorClassifier.is_retryable_error(error)\n            assert LLMErrorClassifier.get_error_category(error) == \"rate_limit\"\n\n    def test_timeout_errors(self):\n        \"\"\"Test identification of timeout errors.\"\"\"\n        timeout_errors = [\n            \"Request timeout\",\n            \"Connection timed out\",\n            \"HTTP 408 error\",\n            \"HTTP 504 Gateway Timeout\",\n        ]\n\n        for error in timeout_errors:\n            assert LLMErrorClassifier.is_retryable_error(error)\n            assert LLMErrorClassifier.get_error_category(error) == \"timeout\"\n\n    def test_server_errors(self):\n        \"\"\"Test identification of server errors.\"\"\"\n        server_errors = [\n            \"Internal server error\",\n            \"HTTP 500 error\",\n            \"HTTP 502 Bad Gateway\",\n            \"HTTP 503 Service Unavailable\",\n        ]\n\n        for error in server_errors:\n            assert LLMErrorClassifier.is_retryable_error(error)\n            assert LLMErrorClassifier.get_error_category(error) == \"server_error\"\n\n    def test_non_retryable_errors(self):\n        \"\"\"Test identification of non-retryable errors.\"\"\"\n        non_retryable_errors = [\n            \"Authentication failed\",\n            \"Invalid API key\",\n            \"Authorization denied\",\n            \"HTTP 401 
Unauthorized\",\n            \"HTTP 403 Forbidden\",\n            \"HTTP 400 Bad Request\",\n        ]\n\n        for error in non_retryable_errors:\n            assert not LLMErrorClassifier.is_retryable_error(error)\n            category = LLMErrorClassifier.get_error_category(error)\n            assert category in [\"auth_error\", \"client_error\"]\n\n    def test_auth_vs_client_errors(self):\n        \"\"\"Test distinction between auth errors and other client errors.\"\"\"\n        auth_errors = [\n            \"Authentication failed\",\n            \"Authorization denied\",\n            \"HTTP 401 error\",\n            \"HTTP 403 error\",\n        ]\n\n        for error in auth_errors:\n            assert LLMErrorClassifier.get_error_category(error) == \"auth_error\"\n\n        client_errors = [\n            \"HTTP 400 Bad Request\",\n            \"Invalid parameter\",\n        ]\n\n        for error in client_errors:\n            assert LLMErrorClassifier.get_error_category(error) == \"client_error\"\n\n    def test_unknown_errors(self):\n        \"\"\"Test handling of unknown error types.\"\"\"\n        unknown_errors = [\n            \"Something weird happened\",\n            \"Unexpected error\",\n            \"HTTP 418 I'm a teapot\",\n        ]\n\n        for error in unknown_errors:\n            assert not LLMErrorClassifier.is_retryable_error(error)\n            assert LLMErrorClassifier.get_error_category(error) == \"unknown\"\n\n    def test_suggested_backoff(self):\n        \"\"\"Test suggested backoff times for different error types.\"\"\"\n        # Rate limit errors should have longer backoff\n        rate_limit_backoff = LLMErrorClassifier.get_suggested_backoff(\n            \"Rate limit exceeded\", 1\n        )\n        server_error_backoff = LLMErrorClassifier.get_suggested_backoff(\n            \"Internal server error\", 1\n        )\n        assert rate_limit_backoff > server_error_backoff\n\n        # Timeout errors should have moderate 
backoff\n        timeout_backoff = LLMErrorClassifier.get_suggested_backoff(\"Request timeout\", 1)\n        assert timeout_backoff > server_error_backoff\n        assert timeout_backoff < rate_limit_backoff\n\n        # Backoff should increase with attempt number\n        backoff_attempt_1 = LLMErrorClassifier.get_suggested_backoff(\n            \"Rate limit exceeded\", 1\n        )\n        backoff_attempt_2 = LLMErrorClassifier.get_suggested_backoff(\n            \"Rate limit exceeded\", 2\n        )\n        assert backoff_attempt_2 > backoff_attempt_1\n\n    def test_exception_objects(self):\n        \"\"\"Test handling of actual exception objects.\"\"\"\n        # Exception instances are classified via their string representation\n        error = Exception(\"Internal server error\")\n        assert LLMErrorClassifier.is_retryable_error(error)\n\n        # A plain string with a more specific server-error pattern\n        server_error_msg = \"HTTP 500 Internal Server Error\"\n        assert LLMErrorClassifier.is_retryable_error(server_error_msg)\n\n    def test_case_insensitive_matching(self):\n        \"\"\"Test that error classification is case insensitive.\"\"\"\n        assert LLMErrorClassifier.is_retryable_error(\"RATE LIMIT EXCEEDED\")\n        assert LLMErrorClassifier.is_retryable_error(\"Rate Limit Exceeded\")\n        assert LLMErrorClassifier.is_retryable_error(\"rate limit exceeded\")\n\n        assert not LLMErrorClassifier.is_retryable_error(\"AUTHENTICATION FAILED\")\n        assert not LLMErrorClassifier.is_retryable_error(\"Authentication Failed\")\n        assert not LLMErrorClassifier.is_retryable_error(\"authentication failed\")\n"
  },
  {
    "path": "src/tests/test_parse_model_output.py",
    "content": "import pytest\nfrom pydantic import ValidationError\n\nfrom podcast_processor.model_output import (\n    AdSegmentPrediction,\n    AdSegmentPredictionList,\n    clean_and_parse_model_output,\n)\n\n\ndef test_clean_parse_output() -> None:\n    model_output = \"\"\"\nextra stuff bla bla\n{\"ad_segments\": [{\"segment_offset\": 123.45, \"confidence\": 0.7}]}. Note: Advertisements in the above podcast excerpt are identified with a moderate level of confidence due to their promotional nature, but not being from within the core content (i.e., discussing the movie or artwork) which suggests these segments could be a\n\"\"\"\n    assert clean_and_parse_model_output(model_output) == AdSegmentPredictionList(\n        ad_segments=[\n            AdSegmentPrediction(\n                segment_offset=123.45,\n                confidence=0.7,\n            )\n        ]\n    )\n\n\ndef test_parse_multiple_segments_output() -> None:\n    model_output = \"\"\"\n{\"ad_segments\": [\n    {\"segment_offset\": 123.45, \"confidence\": 0.7},\n    {\"segment_offset\": 23.45, \"confidence\": 0.8},\n    {\"segment_offset\": 45.67, \"confidence\": 0.9}\n]\n}\"\"\"\n    assert clean_and_parse_model_output(model_output) == AdSegmentPredictionList(\n        ad_segments=[\n            AdSegmentPrediction(segment_offset=123.45, confidence=0.7),\n            AdSegmentPrediction(segment_offset=23.45, confidence=0.8),\n            AdSegmentPrediction(segment_offset=45.67, confidence=0.9),\n        ]\n    )\n\n\ndef test_clean_parse_output_malformed() -> None:\n    model_output = \"\"\"\n{\"ad_segments\": uhoh1.7, 1114.8, 1116.4, 1118.2, 1119.5, 1121.0, 1123.2, 1125.2], \"confidence\": 0.7}. 
Note: Advertisements in the above podcast excerpt are identified with a moderate level of confidence due to their promotional nature, but not being from within the core content (i.e., discussing the movie or artwork) which suggests these segments could be a\n\"\"\"\n    with pytest.raises(ValidationError):\n        clean_and_parse_model_output(model_output)\n\n\ndef test_clean_parse_output_with_content_type() -> None:\n    model_output = \"\"\"\n{\"ad_segments\": [{\"segment_offset\": 12.0, \"confidence\": 0.86}], \"content_type\": \"promotional_external\", \"confidence\": 0.91}\n\"\"\"\n\n    assert clean_and_parse_model_output(model_output) == AdSegmentPredictionList(\n        ad_segments=[AdSegmentPrediction(segment_offset=12.0, confidence=0.86)],\n        content_type=\"promotional_external\",\n        confidence=0.91,\n    )\n\n\ndef test_clean_parse_output_truncated_missing_closing_brackets() -> None:\n    \"\"\"Test parsing truncated JSON missing closing ]} at the end.\"\"\"\n    model_output = '{\"ad_segments\":[{\"segment_offset\":10.5,\"confidence\":0.92}'\n    result = clean_and_parse_model_output(model_output)\n    assert result == AdSegmentPredictionList(\n        ad_segments=[AdSegmentPrediction(segment_offset=10.5, confidence=0.92)]\n    )\n\n\ndef test_clean_parse_output_truncated_multiple_segments() -> None:\n    \"\"\"Test parsing truncated JSON with multiple complete segments but missing closing.\"\"\"\n    model_output = '{\"ad_segments\":[{\"segment_offset\":10.5,\"confidence\":0.92},{\"segment_offset\":25.0,\"confidence\":0.85}'\n    result = clean_and_parse_model_output(model_output)\n    assert result == AdSegmentPredictionList(\n        ad_segments=[\n            AdSegmentPrediction(segment_offset=10.5, confidence=0.92),\n            AdSegmentPrediction(segment_offset=25.0, confidence=0.85),\n        ]\n    )\n\n\ndef test_clean_parse_output_truncated_with_content_type() -> None:\n    \"\"\"Test parsing truncated JSON that includes 
content_type but is missing final }.\"\"\"\n    model_output = '{\"ad_segments\":[{\"segment_offset\":12.0,\"confidence\":0.86}],\"content_type\":\"promotional_external\",\"confidence\":0.92'\n    result = clean_and_parse_model_output(model_output)\n    assert result == AdSegmentPredictionList(\n        ad_segments=[AdSegmentPrediction(segment_offset=12.0, confidence=0.86)],\n        content_type=\"promotional_external\",\n        confidence=0.92,\n    )\n"
  },
  {
    "path": "src/tests/test_podcast_downloader.py",
    "content": "from unittest import mock\n\nimport pytest\n\nfrom app.models import Feed, Post\nfrom podcast_processor.podcast_downloader import (\n    PodcastDownloader,\n    find_audio_link,\n    sanitize_title,\n)\n\n\n@pytest.fixture\ndef test_post(app):\n    \"\"\"Create a real Post object for testing.\"\"\"\n    with app.app_context():\n        # Create a test feed first\n        feed = Feed(\n            title=\"Test Feed\",\n            description=\"Test Description\",\n            author=\"Test Author\",\n            rss_url=\"https://example.com/feed.xml\",\n        )\n\n        # Create a test post\n        post = Post(\n            feed_id=1,  # Will be set properly when feed is saved\n            guid=\"test-guid-123\",\n            download_url=\"https://example.com/podcast.mp3\",\n            title=\"Test Episode\",\n            description=\"Test episode description\",\n        )\n        post.feed = feed  # Set the relationship\n\n        return post\n\n\n@pytest.fixture\ndef downloader(tmp_path):\n    \"\"\"Create a PodcastDownloader instance with a temporary directory.\"\"\"\n    return PodcastDownloader(download_dir=str(tmp_path))\n\n\n@pytest.fixture\ndef mock_entry():\n    entry = mock.MagicMock()\n    link1 = mock.MagicMock()\n    link1.type = \"audio/mpeg\"\n    link1.href = \"https://example.com/podcast.mp3\"\n\n    link2 = mock.MagicMock()\n    link2.type = \"text/html\"\n    link2.href = \"https://example.com/episode\"\n\n    entry.links = [link1, link2]\n    entry.id = \"https://example.com/episode-id\"\n    return entry\n\n\ndef test_sanitize_title():\n    assert sanitize_title(\"Test Episode!@#$%^&*()\") == \"Test Episode\"\n    assert (\n        sanitize_title(\"123-ABC_DEF.mp3\") == \"123ABCDEFmp3\"\n    )  # Fixed expected output to match actual behavior\n    assert sanitize_title(\"\") == \"\"\n\n\ndef test_get_and_make_download_path(downloader):\n    path = downloader.get_and_make_download_path(\"Test Episode!\")\n\n    # Check 
that the directory was created\n    assert path.parent.exists()\n    assert path.parent.is_dir()\n\n    # Check that the path is correct\n    assert path.name == \"Test Episode.mp3\"\n\n\ndef test_find_audio_link_with_audio_link(mock_entry):\n    assert find_audio_link(mock_entry) == \"https://example.com/podcast.mp3\"\n\n\ndef test_find_audio_link_without_audio_link():\n    entry = mock.MagicMock()\n    entry.links = []\n    entry.id = \"https://example.com/episode-id\"\n\n    assert find_audio_link(entry) == \"https://example.com/episode-id\"\n\n\n@mock.patch(\"podcast_processor.podcast_downloader.requests.get\")\ndef test_download_episode_already_exists(mock_get, test_post, downloader, app):\n    with app.app_context():\n        # Create the directory and file\n        episode_dir = downloader.get_and_make_download_path(test_post.title).parent\n        episode_dir.mkdir(parents=True, exist_ok=True)\n        episode_file = episode_dir / \"Test Episode.mp3\"\n        episode_file.write_bytes(b\"dummy data\")\n\n        result = downloader.download_episode(test_post, dest_path=str(episode_file))\n\n        # Check that we didn't try to download the file\n        mock_get.assert_not_called()\n\n        # Check that the correct path was returned\n        assert result == str(episode_file)\n\n\n@mock.patch(\"podcast_processor.podcast_downloader.requests.get\")\ndef test_download_episode_new_file(mock_get, test_post, downloader, app):\n    with app.app_context():\n        # Setup mock response\n        mock_response = mock.MagicMock()\n        mock_response.status_code = 200\n        mock_response.iter_content.return_value = [b\"podcast audio content\"]\n        mock_response.__enter__.return_value = mock_response\n        mock_response.__exit__.return_value = None\n        mock_get.return_value = mock_response\n\n        expected_path = downloader.get_and_make_download_path(test_post.title)\n        result = downloader.download_episode(test_post, 
dest_path=str(expected_path))\n\n        # Check that we tried to download the file\n        mock_get.assert_called_once_with(\n            \"https://example.com/podcast.mp3\", headers=mock.ANY, stream=True, timeout=60\n        )\n\n        # Check that the file was created with the correct content\n        expected_path = downloader.get_and_make_download_path(test_post.title)\n        assert expected_path.exists()\n        assert expected_path.read_bytes() == b\"podcast audio content\"\n\n        # Check that the correct path was returned\n        assert result == str(expected_path)\n\n\n@mock.patch(\"podcast_processor.podcast_downloader.requests.get\")\ndef test_download_episode_download_failed(mock_get, test_post, downloader, app):\n    with app.app_context():\n        # Setup mock response\n        mock_response = mock.MagicMock()\n        mock_response.status_code = 404\n        mock_response.__enter__.return_value = mock_response\n        mock_response.__exit__.return_value = None\n        mock_get.return_value = mock_response\n\n        expected_path = downloader.get_and_make_download_path(test_post.title)\n        result = downloader.download_episode(test_post, dest_path=str(expected_path))\n\n        # Check that we tried to download the file\n        mock_get.assert_called_once_with(\n            \"https://example.com/podcast.mp3\", headers=mock.ANY, stream=True, timeout=60\n        )\n\n        # Check that no file was created\n        expected_path = downloader.get_and_make_download_path(test_post.title)\n        assert not expected_path.exists()\n\n        # Check that None was returned\n        assert result is None\n\n\n@mock.patch(\"podcast_processor.podcast_downloader.validators.url\")\n@mock.patch(\"podcast_processor.podcast_downloader.abort\")\ndef test_download_episode_invalid_url(\n    mock_abort, mock_validator, test_post, downloader, app\n):\n    with app.app_context():\n        # Make the validator fail\n        mock_validator.return_value = 
False\n\n        expected_path = downloader.get_and_make_download_path(test_post.title)\n        downloader.download_episode(test_post, dest_path=str(expected_path))\n\n        # Check that abort was called with 404\n        mock_abort.assert_called_once_with(404)\n\n\n@mock.patch(\"podcast_processor.podcast_downloader.requests.get\")\ndef test_download_episode_invalid_post_title(mock_get, test_post, downloader, app):\n    with app.app_context():\n        # Test with a post that has an invalid title that results in empty sanitized title\n        test_post.title = \"!@#$%^&*()\"  # This will sanitize to empty string\n\n        with mock.patch.object(\n            downloader, \"get_and_make_download_path\"\n        ) as mock_get_path:\n            mock_get_path.return_value = \"\"\n\n            expected_path = downloader.get_and_make_download_path(test_post.title)\n            result = downloader.download_episode(test_post, dest_path=expected_path)\n\n            # Check that None was returned\n            assert result is None\n            mock_get.assert_not_called()\n"
  },
  {
    "path": "src/tests/test_podcast_processor_cleanup.py",
    "content": "from unittest.mock import MagicMock\n\nfrom app.extensions import db\nfrom app.models import Feed, Post\nfrom podcast_processor.ad_classifier import AdClassifier\nfrom podcast_processor.audio_processor import AudioProcessor\nfrom podcast_processor.podcast_downloader import PodcastDownloader\nfrom podcast_processor.podcast_processor import PodcastProcessor\nfrom podcast_processor.processing_status_manager import ProcessingStatusManager\nfrom podcast_processor.transcription_manager import TranscriptionManager\nfrom shared.test_utils import create_standard_test_config\n\n\ndef test_remove_unprocessed_audio_deletes_file(app, tmp_path) -> None:\n    file_path = tmp_path / \"raw.mp3\"\n    file_path.write_text(\"audio\")\n\n    with app.app_context():\n        # Create a real Post object\n        feed = Feed(\n            title=\"Test Feed\",\n            description=\"Test Description\",\n            author=\"Test Author\",\n            rss_url=\"https://example.com/feed.xml\",\n        )\n        db.session.add(feed)\n        db.session.commit()\n\n        post = Post(\n            guid=\"test-guid\",\n            title=\"Test Episode\",\n            download_url=\"https://example.com/episode.mp3\",\n            feed_id=feed.id,\n            unprocessed_audio_path=str(file_path),\n        )\n        db.session.add(post)\n        db.session.commit()\n\n        processor = PodcastProcessor(\n            config=create_standard_test_config(),\n            transcription_manager=MagicMock(spec=TranscriptionManager),\n            ad_classifier=MagicMock(spec=AdClassifier),\n            audio_processor=MagicMock(spec=AudioProcessor),\n            status_manager=MagicMock(spec=ProcessingStatusManager),\n            db_session=db.session,\n            downloader=MagicMock(spec=PodcastDownloader),\n        )\n\n        processor._remove_unprocessed_audio(post)\n\n        assert post.unprocessed_audio_path is None\n        assert not file_path.exists()\n"
  },
  {
    "path": "src/tests/test_post_cleanup.py",
    "content": "from __future__ import annotations\n\nfrom datetime import datetime, timedelta\nfrom pathlib import Path\n\nfrom app.extensions import db\nfrom app.models import (\n    Feed,\n    Identification,\n    ModelCall,\n    Post,\n    ProcessingJob,\n    TranscriptSegment,\n)\nfrom app.post_cleanup import cleanup_processed_posts, count_cleanup_candidates\n\n\ndef _create_feed() -> Feed:\n    feed = Feed(\n        title=\"Test Feed\",\n        description=\"desc\",\n        author=\"author\",\n        rss_url=\"https://example.com/feed.xml\",\n        image_url=\"https://example.com/image.png\",\n    )\n    db.session.add(feed)\n    db.session.commit()\n    return feed\n\n\ndef _create_post(feed: Feed, guid: str, download_url: str) -> Post:\n    post = Post(\n        feed_id=feed.id,\n        guid=guid,\n        download_url=download_url,\n        title=f\"Episode {guid}\",\n        description=\"test\",\n        whitelisted=True,\n    )\n    db.session.add(post)\n    db.session.commit()\n    return post\n\n\ndef test_cleanup_removes_expired_posts(app, tmp_path) -> None:\n    with app.app_context():\n        feed = _create_feed()\n\n        old_post = _create_post(feed, \"old-guid\", \"https://example.com/old.mp3\")\n        recent_post = _create_post(\n            feed, \"recent-guid\", \"https://example.com/recent.mp3\"\n        )\n\n        old_processed = Path(tmp_path) / \"old_processed.mp3\"\n        old_unprocessed = Path(tmp_path) / \"old_unprocessed.mp3\"\n        old_processed.write_text(\"processed\")\n        old_unprocessed.write_text(\"unprocessed\")\n        old_post.processed_audio_path = str(old_processed)\n        old_post.unprocessed_audio_path = str(old_unprocessed)\n        db.session.commit()\n\n        completed_at = datetime.utcnow() - timedelta(days=10)\n        db.session.add(\n            ProcessingJob(\n                id=\"job-old\",\n                post_guid=old_post.guid,\n                status=\"completed\",\n               
 current_step=4,\n                total_steps=4,\n                progress_percentage=100.0,\n                created_at=completed_at,\n                started_at=completed_at,\n                completed_at=completed_at,\n            )\n        )\n\n        recent_completed = datetime.utcnow() - timedelta(days=2)\n        db.session.add(\n            ProcessingJob(\n                id=\"job-recent\",\n                post_guid=recent_post.guid,\n                status=\"completed\",\n                current_step=4,\n                total_steps=4,\n                progress_percentage=100.0,\n                created_at=recent_completed,\n                started_at=recent_completed,\n                completed_at=recent_completed,\n            )\n        )\n\n        # Populate related tables for the old post to ensure cascading deletes\n        model_call = ModelCall(\n            post_id=old_post.id,\n            first_segment_sequence_num=0,\n            last_segment_sequence_num=0,\n            model_name=\"test\",\n            prompt=\"prompt\",\n            response=\"resp\",\n            status=\"completed\",\n            timestamp=completed_at,\n        )\n        db.session.add(model_call)\n        segment = TranscriptSegment(\n            post_id=old_post.id,\n            sequence_num=0,\n            start_time=0.0,\n            end_time=1.0,\n            text=\"segment\",\n        )\n        db.session.add(segment)\n        db.session.flush()\n        db.session.add(\n            Identification(\n                transcript_segment_id=segment.id,\n                model_call_id=model_call.id,\n                confidence=0.5,\n                label=\"ad\",\n            )\n        )\n\n        db.session.commit()\n\n        removed = cleanup_processed_posts(retention_days=5)\n\n        assert removed == 1\n        cleaned_old_post = Post.query.filter_by(guid=\"old-guid\").first()\n        assert cleaned_old_post is not None\n        assert 
cleaned_old_post.whitelisted is False\n        assert cleaned_old_post.processed_audio_path is None\n        assert cleaned_old_post.unprocessed_audio_path is None\n        assert Post.query.filter_by(guid=\"recent-guid\").first() is not None\n        assert ProcessingJob.query.filter_by(post_guid=\"old-guid\").first() is None\n        assert Identification.query.count() == 0\n        assert TranscriptSegment.query.count() == 0\n        assert ModelCall.query.count() == 0\n        assert not old_processed.exists()\n        assert not old_unprocessed.exists()\n\n\ndef test_cleanup_skips_when_retention_disabled(app) -> None:\n    with app.app_context():\n        feed = _create_feed()\n        post = _create_post(feed, \"guid\", \"https://example.com/audio.mp3\")\n        completed_at = datetime.utcnow() - timedelta(days=10)\n        db.session.add(\n            ProcessingJob(\n                id=\"job-disable\",\n                post_guid=post.guid,\n                status=\"completed\",\n                current_step=4,\n                total_steps=4,\n                progress_percentage=100.0,\n                created_at=completed_at,\n                started_at=completed_at,\n                completed_at=completed_at,\n            )\n        )\n        db.session.commit()\n\n        removed = cleanup_processed_posts(retention_days=None)\n        assert removed == 0\n        assert Post.query.filter_by(guid=\"guid\").first() is not None\n\n\ndef test_cleanup_includes_non_whitelisted_processed_posts(app, tmp_path) -> None:\n    with app.app_context():\n        feed = _create_feed()\n        post = _create_post(feed, \"non-white\", \"https://example.com/nonwhite.mp3\")\n        post.whitelisted = False\n        post.release_date = datetime.utcnow() - timedelta(days=10)\n        processed = tmp_path / \"processed.mp3\"\n        processed.write_text(\"audio\")\n        post.processed_audio_path = str(processed)\n\n        # Add old completed job so post qualifies for 
cleanup\n        completed_at = datetime.utcnow() - timedelta(days=10)\n        db.session.add(\n            ProcessingJob(\n                id=\"job-non-white\",\n                post_guid=post.guid,\n                status=\"completed\",\n                current_step=4,\n                total_steps=4,\n                progress_percentage=100.0,\n                created_at=completed_at,\n                started_at=completed_at,\n                completed_at=completed_at,\n            )\n        )\n        db.session.commit()\n\n        count, _ = count_cleanup_candidates(retention_days=5)\n        assert count == 1\n\n        removed = cleanup_processed_posts(retention_days=5)\n        assert removed == 1\n        cleaned_post = Post.query.filter_by(guid=\"non-white\").first()\n        assert cleaned_post is not None\n        assert cleaned_post.whitelisted is False\n        assert cleaned_post.processed_audio_path is None\n        assert cleaned_post.unprocessed_audio_path is None\n\n\ndef test_cleanup_skips_unprocessed_unwhitelisted_posts(app) -> None:\n    with app.app_context():\n        feed = _create_feed()\n        post = _create_post(feed, \"non-white-2\", \"https://example.com/nonwhite2.mp3\")\n        post.whitelisted = False\n        post.release_date = datetime.utcnow() - timedelta(days=10)\n        db.session.commit()\n\n        count, _ = count_cleanup_candidates(retention_days=5)\n        assert count == 0\n\n        removed = cleanup_processed_posts(retention_days=5)\n        assert removed == 0\n        assert Post.query.filter_by(guid=\"non-white-2\").first() is not None\n"
  },
  {
    "path": "src/tests/test_post_routes.py",
    "content": "import datetime\nfrom types import SimpleNamespace\nfrom unittest import mock\n\nfrom flask import g\n\nfrom app.extensions import db\nfrom app.models import Feed, Post, User\nfrom app.routes.post_routes import post_bp\nfrom app.runtime_config import config as runtime_config\n\n\ndef test_download_endpoints_increment_counter(app, tmp_path):\n    \"\"\"Ensure both processed and original downloads increment the counter.\"\"\"\n    app.testing = True\n    app.register_blueprint(post_bp)\n\n    with app.app_context():\n        feed = Feed(title=\"Test Feed\", rss_url=\"https://example.com/feed.xml\")\n        db.session.add(feed)\n        db.session.commit()\n\n        processed_audio = tmp_path / \"processed.mp3\"\n        processed_audio.write_bytes(b\"processed audio\")\n\n        original_audio = tmp_path / \"original.mp3\"\n        original_audio.write_bytes(b\"original audio\")\n\n        post = Post(\n            feed_id=feed.id,\n            guid=\"test-guid\",\n            download_url=\"https://example.com/audio.mp3\",\n            title=\"Test Episode\",\n            processed_audio_path=str(processed_audio),\n            unprocessed_audio_path=str(original_audio),\n            whitelisted=True,\n        )\n        db.session.add(post)\n        db.session.commit()\n\n        client = app.test_client()\n\n        # Mock writer_client to simulate DB update\n        with mock.patch(\"app.routes.post_routes.writer_client\") as mock_writer:\n\n            def side_effect(action, params, wait=False):\n                if action == \"increment_download_count\":\n                    post_id = params[\"post_id\"]\n                    Post.query.filter_by(id=post_id).update(\n                        {Post.download_count: (Post.download_count or 0) + 1}\n                    )\n                    db.session.commit()\n\n            mock_writer.action.side_effect = side_effect\n\n            response = client.get(f\"/api/posts/{post.guid}/download\")\n     
       assert response.status_code == 200\n            db.session.refresh(post)\n            assert post.download_count == 1\n\n            response = client.get(f\"/api/posts/{post.guid}/download/original\")\n            assert response.status_code == 200\n            db.session.refresh(post)\n            assert post.download_count == 2\n\n\ndef test_download_triggers_processing_when_enabled(app):\n    \"\"\"Start processing when processed audio is missing and toggle is enabled.\"\"\"\n    app.testing = True\n    app.register_blueprint(post_bp)\n\n    with app.app_context():\n        feed = Feed(title=\"Test Feed\", rss_url=\"https://example.com/feed.xml\")\n        db.session.add(feed)\n        db.session.commit()\n\n        post = Post(\n            feed_id=feed.id,\n            guid=\"missing-audio-guid\",\n            download_url=\"https://example.com/audio.mp3\",\n            title=\"Missing Audio\",\n            whitelisted=True,\n        )\n        db.session.add(post)\n        db.session.commit()\n        post_guid = post.guid\n\n    client = app.test_client()\n    original_flag = runtime_config.autoprocess_on_download\n    runtime_config.autoprocess_on_download = True\n    try:\n        with mock.patch(\"app.routes.post_routes.get_jobs_manager\") as mock_mgr:\n            mock_mgr.return_value.start_post_processing.return_value = {\n                \"status\": \"started\",\n                \"job_id\": \"job-123\",\n            }\n            response = client.get(f\"/api/posts/{post_guid}/download\")\n            assert response.status_code == 202\n            payload = response.get_json()\n            assert payload[\"status\"] == \"started\"\n            mock_mgr.return_value.start_post_processing.assert_called_once_with(\n                post_guid,\n                priority=\"download\",\n                requested_by_user_id=None,\n                billing_user_id=None,\n            )\n    finally:\n        runtime_config.autoprocess_on_download = 
original_flag\n\n\ndef test_download_missing_audio_returns_404_when_disabled(app):\n    \"\"\"Keep existing 404 behavior when toggle is off.\"\"\"\n    app.testing = True\n    app.register_blueprint(post_bp)\n\n    with app.app_context():\n        feed = Feed(title=\"Test Feed\", rss_url=\"https://example.com/feed.xml\")\n        db.session.add(feed)\n        db.session.commit()\n\n        post = Post(\n            feed_id=feed.id,\n            guid=\"missing-audio-404\",\n            download_url=\"https://example.com/audio.mp3\",\n            title=\"Missing Audio\",\n            whitelisted=True,\n        )\n        db.session.add(post)\n        db.session.commit()\n        post_guid = post.guid\n\n    client = app.test_client()\n    original_flag = runtime_config.autoprocess_on_download\n    runtime_config.autoprocess_on_download = False\n    try:\n        with mock.patch(\"app.routes.post_routes.get_jobs_manager\") as mock_mgr:\n            response = client.get(f\"/api/posts/{post_guid}/download\")\n            assert response.status_code == 404\n            mock_mgr.return_value.start_post_processing.assert_not_called()\n    finally:\n        runtime_config.autoprocess_on_download = original_flag\n\n\ndef test_download_auto_whitelists_post(app, tmp_path):\n    \"\"\"Download request should whitelist the post automatically.\"\"\"\n    app.testing = True\n    app.register_blueprint(post_bp)\n\n    with app.app_context():\n        feed = Feed(title=\"Test Feed\", rss_url=\"https://example.com/feed.xml\")\n        db.session.add(feed)\n        db.session.commit()\n\n        processed_audio = tmp_path / \"processed.mp3\"\n        processed_audio.write_bytes(b\"processed audio\")\n\n        post = Post(\n            feed_id=feed.id,\n            guid=\"auto-whitelist-guid\",\n            download_url=\"https://example.com/audio.mp3\",\n            title=\"Auto Whitelist Episode\",\n            processed_audio_path=str(processed_audio),\n            
whitelisted=False,\n        )\n        db.session.add(post)\n        db.session.commit()\n        post_guid = post.guid\n        post_id = post.id\n\n    client = app.test_client()\n\n    original_flag = runtime_config.autoprocess_on_download\n    runtime_config.autoprocess_on_download = True\n\n    with mock.patch(\"app.routes.post_routes.writer_client\") as mock_writer:\n        mock_writer.action.return_value = SimpleNamespace(success=True, data=None)\n        response = client.get(f\"/api/posts/{post_guid}/download\")\n        assert response.status_code == 200\n        mock_writer.action.assert_has_calls(\n            [\n                mock.call(\"whitelist_post\", {\"post_id\": post_id}, wait=True),\n                mock.call(\"increment_download_count\", {\"post_id\": post_id}, wait=False),\n            ]\n        )\n    runtime_config.autoprocess_on_download = original_flag\n\n\ndef test_download_rejects_when_not_whitelisted_and_toggle_off(app):\n    \"\"\"Ensure download is forbidden when not whitelisted and auto-process toggle is off.\"\"\"\n    app.testing = True\n    app.register_blueprint(post_bp)\n\n    with app.app_context():\n        feed = Feed(title=\"Test Feed\", rss_url=\"https://example.com/feed.xml\")\n        db.session.add(feed)\n        db.session.commit()\n\n        post = Post(\n            feed_id=feed.id,\n            guid=\"no-autoprocess-whitelist\",\n            download_url=\"https://example.com/audio.mp3\",\n            title=\"No Auto\",\n            whitelisted=False,\n        )\n        db.session.add(post)\n        db.session.commit()\n        post_guid = post.guid\n\n    client = app.test_client()\n    original_flag = runtime_config.autoprocess_on_download\n    runtime_config.autoprocess_on_download = False\n    try:\n        response = client.get(f\"/api/posts/{post_guid}/download\")\n        assert response.status_code == 403\n    finally:\n        runtime_config.autoprocess_on_download = original_flag\n\n\ndef 
test_toggle_whitelist_all_requires_admin(app):\n    \"\"\"Ensure bulk whitelist actions are limited to admins.\"\"\"\n    app.testing = True\n    app.register_blueprint(post_bp)\n    app.config[\"AUTH_SETTINGS\"] = SimpleNamespace(require_auth=True)\n\n    with app.app_context():\n        admin_user = User(username=\"admin\", password_hash=\"hash\", role=\"admin\")\n        regular_user = User(username=\"user\", password_hash=\"hash\", role=\"user\")\n        feed = Feed(title=\"Admin Feed\", rss_url=\"https://example.com/feed.xml\")\n        db.session.add_all([admin_user, regular_user, feed])\n        db.session.commit()\n\n        posts = [\n            Post(\n                feed_id=feed.id,\n                guid=f\"guid-{idx}\",\n                download_url=f\"https://example.com/{idx}.mp3\",\n                title=f\"Episode {idx}\",\n                whitelisted=False,\n            )\n            for idx in range(2)\n        ]\n        db.session.add_all(posts)\n        db.session.commit()\n\n        admin_id = admin_user.id\n        regular_id = regular_user.id\n        feed_id = feed.id\n\n    current_user = {\"id\": admin_id}\n\n    @app.before_request\n    def _mock_auth() -> None:\n        g.current_user = SimpleNamespace(id=current_user[\"id\"])\n\n    client = app.test_client()\n    current_user[\"id\"] = regular_id\n    response = client.post(f\"/api/feeds/{feed_id}/toggle-whitelist-all\")\n    assert response.status_code == 403\n    assert response.get_json()[\"error\"].startswith(\"Only admins\")\n\n    current_user[\"id\"] = admin_id\n    response = client.post(f\"/api/feeds/{feed_id}/toggle-whitelist-all\")\n    assert response.status_code == 200\n    with app.app_context():\n        whitelisted = Post.query.filter_by(feed_id=feed_id, whitelisted=True).count()\n        assert whitelisted == 2\n\n\ndef test_feed_posts_pagination_and_filtering(app):\n    \"\"\"Feed posts endpoint should paginate and support whitelisted filter.\"\"\"\n\n    
app.testing = True\n    app.register_blueprint(post_bp)\n\n    with app.app_context():\n        feed = Feed(title=\"Paged Feed\", rss_url=\"https://example.com/feed.xml\")\n        db.session.add(feed)\n        db.session.commit()\n\n        base_date = datetime.date(2024, 1, 1)\n        posts = []\n        # Create 30 posts with ascending release dates; even ones whitelisted.\n        for idx in range(30):\n            post = Post(\n                feed_id=feed.id,\n                guid=f\"guid-{idx}\",\n                download_url=f\"https://example.com/{idx}.mp3\",\n                title=f\"Episode {idx}\",\n                release_date=base_date + datetime.timedelta(days=idx),\n                whitelisted=(idx % 2 == 0),\n            )\n            posts.append(post)\n\n        db.session.add_all(posts)\n        db.session.commit()\n\n        client = app.test_client()\n\n        # Default page (1) should return 25 items ordered newest-first\n        response = client.get(f\"/api/feeds/{feed.id}/posts\")\n        assert response.status_code == 200\n        data = response.get_json()\n        assert data[\"page\"] == 1\n        assert data[\"page_size\"] == 25\n        assert data[\"total\"] == 30\n        assert data[\"total_pages\"] == 2\n        assert len(data[\"items\"]) == 25\n        # First item should be newest (idx 29)\n        assert data[\"items\"][0][\"guid\"] == \"guid-29\"\n        # Last item on page 1 should be idx 5 (25 items: 29..5)\n        assert data[\"items\"][-1][\"guid\"] == \"guid-5\"\n\n        # Page 2 should return remaining 5\n        response = client.get(f\"/api/feeds/{feed.id}/posts\", query_string={\"page\": 2})\n        assert response.status_code == 200\n        data_page_2 = response.get_json()\n        assert data_page_2[\"page\"] == 2\n        assert len(data_page_2[\"items\"]) == 5\n        # Items should be 4..0\n        assert {item[\"guid\"] for item in data_page_2[\"items\"]} == {\n            \"guid-4\",\n            
\"guid-3\",\n            \"guid-2\",\n            \"guid-1\",\n            \"guid-0\",\n        }\n\n        # Whitelisted filter should only return whitelisted posts (15 total)\n        response = client.get(\n            f\"/api/feeds/{feed.id}/posts\",\n            query_string={\"whitelisted_only\": \"true\"},\n        )\n        assert response.status_code == 200\n        filtered = response.get_json()\n        assert filtered[\"total\"] == 15\n        assert filtered[\"whitelisted_total\"] == 15\n        assert all(item[\"whitelisted\"] for item in filtered[\"items\"])\n"
  },
  {
    "path": "src/tests/test_posts.py",
    "content": "from pathlib import Path\nfrom unittest.mock import patch\n\nfrom app.models import Post\nfrom app.posts import remove_associated_files\n\n\nclass TestPostsFunctions:\n    \"\"\"Test class for functions in the app.posts module.\"\"\"\n\n    @patch(\"app.posts._remove_file_if_exists\")\n    @patch(\"app.posts._dedupe_and_find_existing\")\n    @patch(\"app.posts._collect_processed_paths\")\n    @patch(\"app.posts.get_and_make_download_path\")\n    @patch(\"app.posts.logger\")\n    def test_remove_associated_files_files_dont_exist(\n        self,\n        mock_logger,\n        mock_get_download_path,\n        mock_collect_paths,\n        mock_dedupe,\n        mock_remove_file,\n        app,\n    ):\n        \"\"\"Test remove_associated_files when files don't exist.\"\"\"\n        with app.app_context():\n            # Set up mocks\n            mock_collect_paths.return_value = [Path(\"/path/to/processed.mp3\")]\n            mock_dedupe.return_value = (\n                [Path(\"/path/to/processed.mp3\")],\n                None,  # No existing file found\n            )\n            mock_get_download_path.return_value = \"/path/to/unprocessed.mp3\"\n\n            # Create test post\n            post = Post(id=1, title=\"Test Post\")\n\n            # Call the function\n            remove_associated_files(post)\n\n            # Verify _remove_file_if_exists was called for unprocessed path\n            assert mock_remove_file.call_count >= 1\n\n            # Verify debug logging for no processed file\n            mock_logger.debug.assert_called()\n"
  },
  {
    "path": "src/tests/test_process_audio.py",
    "content": "import tempfile\nfrom pathlib import Path\n\nfrom podcast_processor.audio import (\n    clip_segments_with_fade,\n    get_audio_duration_ms,\n    split_audio,\n)\n\nTEST_FILE_DURATION = 66_048\nTEST_FILE_PATH = \"src/tests/data/count_0_99.mp3\"\n\n\ndef test_get_duration_ms() -> None:\n    assert get_audio_duration_ms(TEST_FILE_PATH) == TEST_FILE_DURATION\n\n\ndef test_clip_segment_with_fade() -> None:\n    fade_len_ms = 5_000\n    ad_start_offset_ms, ad_end_offset_ms = 3_000, 21_000\n\n    with tempfile.NamedTemporaryFile(delete=True, suffix=\".mp3\") as temp_file:\n        clip_segments_with_fade(\n            [(ad_start_offset_ms, ad_end_offset_ms)],\n            fade_len_ms,\n            TEST_FILE_PATH,\n            temp_file.name,\n        )\n\n        expected_duration = (\n            TEST_FILE_DURATION\n            - (ad_end_offset_ms - ad_start_offset_ms)\n            + 2 * fade_len_ms\n            + 56  # not sure where this fudge comes from\n        )\n        actual_duration = get_audio_duration_ms(temp_file.name)\n        assert actual_duration is not None, \"Failed to get audio duration\"\n        assert abs(actual_duration - expected_duration) <= 60, (\n            f\"Duration mismatch: expected {expected_duration}ms, got {actual_duration}ms, \"\n            f\"difference: {abs(actual_duration - expected_duration)}ms\"\n        )\n\n\ndef test_clip_segment_with_fade_beginning() -> None:\n    fade_len_ms = 5_000\n    ad_start_offset_ms, ad_end_offset_ms = 0, 18_000\n\n    with tempfile.NamedTemporaryFile(delete=True, suffix=\".mp3\") as temp_file:\n        clip_segments_with_fade(\n            [(ad_start_offset_ms, ad_end_offset_ms)],\n            fade_len_ms,\n            TEST_FILE_PATH,\n            temp_file.name,\n        )\n\n        expected_duration = (\n            TEST_FILE_DURATION\n            - (ad_end_offset_ms - ad_start_offset_ms)\n            + 2 * fade_len_ms\n            + 56  # not sure where this fudge comes from\n  
      )\n        actual_duration = get_audio_duration_ms(temp_file.name)\n        assert actual_duration is not None, \"Failed to get audio duration\"\n        assert abs(actual_duration - expected_duration) <= 60, (\n            f\"Duration mismatch: expected {expected_duration}ms, got {actual_duration}ms, \"\n            f\"difference: {abs(actual_duration - expected_duration)}ms\"\n        )\n\n\ndef test_clip_segment_with_fade_end() -> None:\n    fade_len_ms = 5_000\n    ad_start_offset_ms, ad_end_offset_ms = (\n        TEST_FILE_DURATION - 18_000,\n        TEST_FILE_DURATION,\n    )\n\n    with tempfile.NamedTemporaryFile(delete=True, suffix=\".mp3\") as temp_file:\n        clip_segments_with_fade(\n            [(ad_start_offset_ms, ad_end_offset_ms)],\n            fade_len_ms,\n            TEST_FILE_PATH,\n            temp_file.name,\n        )\n\n        expected_duration = (\n            TEST_FILE_DURATION\n            - (ad_end_offset_ms - ad_start_offset_ms)\n            + 2 * fade_len_ms\n            + 56  # not sure where this fudge comes from\n        )\n        actual_duration = get_audio_duration_ms(temp_file.name)\n        assert actual_duration is not None, \"Failed to get audio duration\"\n        assert abs(actual_duration - expected_duration) <= 60, (\n            f\"Duration mismatch: expected {expected_duration}ms, got {actual_duration}ms, \"\n            f\"difference: {abs(actual_duration - expected_duration)}ms\"\n        )\n\n\ndef test_split_audio() -> None:\n    with tempfile.TemporaryDirectory() as temp_dir:\n        temp_dir_path = Path(temp_dir)\n        split_audio(Path(TEST_FILE_PATH), temp_dir_path, 38_000)\n\n        expected = {\n            \"0.mp3\": (6_384, 38_108),\n            \"1.mp3\": (6_384, 38_252),\n            \"2.mp3\": (6_384, 38_108),\n            \"3.mp3\": (6_384, 38_108),\n            \"4.mp3\": (6_384, 38_252),\n            \"5.mp3\": (6_384, 38_252),\n            \"6.mp3\": (6_384, 38_252),\n            
\"7.mp3\": (6_384, 38_108),\n            \"8.mp3\": (6_384, 38_108),\n            \"9.mp3\": (6_384, 38_252),\n            \"10.mp3\": (2_784, 16_508),\n        }\n\n        for split in temp_dir_path.iterdir():\n            assert split.name in expected\n            duration_ms, filesize = expected[split.name]\n            actual_duration = get_audio_duration_ms(str(split))\n            assert (\n                actual_duration is not None\n            ), f\"Failed to get audio duration for {split}\"\n            assert abs(actual_duration - duration_ms) <= 100, (\n                f\"Duration mismatch for {split}. Expected {duration_ms}ms, got {actual_duration}ms, \"\n                f\"difference: {abs(actual_duration - duration_ms)}ms\"\n            )\n            assert (\n                abs(filesize - split.stat().st_size) <= 500\n            ), f\"filesize <> 500 bytes for {split}. found {split.stat().st_size}, expected {filesize}\"  # pylint: disable=line-too-long\n"
  },
  {
    "path": "src/tests/test_rate_limiting_config.py",
    "content": "\"\"\"\nTests for new rate limiting configuration options.\n\"\"\"\n\nfrom typing import Any\n\nfrom shared.config import Config\n\n\nclass TestRateLimitingConfig:\n    \"\"\"Test cases for rate limiting configuration.\"\"\"\n\n    def test_default_rate_limiting_config(self) -> None:\n        \"\"\"Test that rate limiting defaults are properly set.\"\"\"\n        config_data: dict[str, Any] = {\n            \"llm_api_key\": \"test-key\",\n            \"output\": {\n                \"fade_ms\": 3000,\n                \"min_ad_segement_separation_seconds\": 60,\n                \"min_ad_segment_length_seconds\": 14,\n                \"min_confidence\": 0.8,\n            },\n            \"processing\": {\n                \"num_segments_to_input_to_prompt\": 30,\n            },\n        }\n\n        config = Config(**config_data)\n\n        # Test default values\n        assert config.llm_max_concurrent_calls == 3\n        assert config.llm_max_retry_attempts == 5\n        assert config.llm_max_input_tokens_per_call is None\n        assert config.llm_enable_token_rate_limiting is False\n        assert config.llm_max_input_tokens_per_minute is None\n\n    def test_custom_rate_limiting_config(self) -> None:\n        \"\"\"Test that custom rate limiting values are properly set.\"\"\"\n        config_data: dict[str, Any] = {\n            \"llm_api_key\": \"test-key\",\n            \"llm_max_concurrent_calls\": 5,\n            \"llm_max_retry_attempts\": 10,\n            \"llm_max_input_tokens_per_call\": 50000,\n            \"llm_enable_token_rate_limiting\": False,\n            \"llm_max_input_tokens_per_minute\": 100000,\n            \"output\": {\n                \"fade_ms\": 3000,\n                \"min_ad_segement_separation_seconds\": 60,\n                \"min_ad_segment_length_seconds\": 14,\n                \"min_confidence\": 0.8,\n            },\n            \"processing\": {\n                \"num_segments_to_input_to_prompt\": 30,\n            
},\n        }\n\n        config = Config(**config_data)\n\n        # Test custom values\n        assert config.llm_max_concurrent_calls == 5\n        assert config.llm_max_retry_attempts == 10\n        assert config.llm_max_input_tokens_per_call == 50000\n        assert config.llm_enable_token_rate_limiting is False\n        assert config.llm_max_input_tokens_per_minute == 100000\n\n    def test_partial_rate_limiting_config(self) -> None:\n        \"\"\"Test that partial rate limiting config uses defaults for missing values.\"\"\"\n        config_data: dict[str, Any] = {\n            \"llm_api_key\": \"test-key\",\n            \"llm_max_retry_attempts\": 7,  # Only override this one\n            \"output\": {\n                \"fade_ms\": 3000,\n                \"min_ad_segement_separation_seconds\": 60,\n                \"min_ad_segment_length_seconds\": 14,\n                \"min_confidence\": 0.8,\n            },\n            \"processing\": {\n                \"num_segments_to_input_to_prompt\": 30,\n            },\n        }\n\n        config = Config(**config_data)\n\n        # Test that custom value is set\n        assert config.llm_max_retry_attempts == 7\n\n        # Test that defaults are used for other values\n        assert config.llm_max_concurrent_calls == 3\n        assert config.llm_max_input_tokens_per_call is None\n        assert config.llm_enable_token_rate_limiting is False\n        assert config.llm_max_input_tokens_per_minute is None\n\n    def test_config_field_descriptions(self) -> None:\n        \"\"\"Test that config fields have proper descriptions.\"\"\"\n        # Test that the field definitions include helpful descriptions\n        config_fields = Config.model_fields\n\n        assert \"llm_max_concurrent_calls\" in config_fields\n        assert \"Maximum concurrent LLM calls\" in str(\n            config_fields[\"llm_max_concurrent_calls\"].description\n        )\n\n        assert \"llm_max_retry_attempts\" in config_fields\n        
assert \"Maximum retry attempts\" in str(\n            config_fields[\"llm_max_retry_attempts\"].description\n        )\n\n        assert \"llm_enable_token_rate_limiting\" in config_fields\n        assert \"client-side token-based rate limiting\" in str(\n            config_fields[\"llm_enable_token_rate_limiting\"].description\n        )\n"
  },
  {
    "path": "src/tests/test_rate_limiting_edge_cases.py",
    "content": "\"\"\"\nAdditional edge case tests for rate limiting functionality.\n\"\"\"\n\nimport time\nfrom typing import Any\nfrom unittest.mock import patch\n\nfrom podcast_processor.ad_classifier import AdClassifier\nfrom podcast_processor.token_rate_limiter import TokenRateLimiter\n\nfrom .test_helpers import create_test_config\n\n\nclass TestRateLimitingEdgeCases:\n    \"\"\"Test edge cases and boundary conditions for rate limiting.\"\"\"\n\n    def test_token_counting_edge_cases(self) -> None:\n        \"\"\"Test token counting with edge cases.\"\"\"\n        limiter = TokenRateLimiter()\n\n        # Test empty content\n        messages: list[dict[str, str]] = [{\"role\": \"user\", \"content\": \"\"}]\n        tokens = limiter.count_tokens(messages, \"gpt-4\")\n        assert tokens == 0\n\n        # Test malformed message structure\n        messages = [{\"role\": \"user\"}]  # Missing content\n        tokens = limiter.count_tokens(messages, \"gpt-4\")\n        assert tokens == 0\n\n        # Test very large message\n        large_content = \"word \" * 10000  # ~50k characters\n        messages = [{\"role\": \"user\", \"content\": large_content}]\n        tokens = limiter.count_tokens(messages, \"gpt-4\")\n        assert tokens > 10000  # Should estimate significant tokens\n\n    def test_rate_limiter_boundary_conditions(self) -> None:\n        \"\"\"Test rate limiter at exact boundary conditions.\"\"\"\n        limiter = TokenRateLimiter(tokens_per_minute=100, window_minutes=1)\n\n        current_time = time.time()\n\n        # Fill exactly to the limit\n        limiter.token_usage.append((current_time - 30, 100))\n\n        # Try to add exactly 0 more tokens\n        messages: list[dict[str, str]] = []\n        can_proceed, wait_seconds = limiter.check_rate_limit(messages, \"gpt-4\")\n        assert can_proceed is True\n        assert wait_seconds == 0.0\n\n        # Try to add 1 more token (should exceed)\n        messages = [{\"role\": \"user\", 
\"content\": \"x\"}]  # Minimal content\n        can_proceed, wait_seconds = limiter.check_rate_limit(messages, \"gpt-4\")\n        # Whether this passes depends on exact token estimation for tiny messages,\n        # but the result must always be a well-formed (bool, non-negative wait) pair.\n        assert isinstance(can_proceed, bool)\n        assert wait_seconds >= 0.0\n\n    def test_rate_limiter_time_window_edge(self) -> None:\n        \"\"\"Test rate limiter behavior at time window boundaries.\"\"\"\n        limiter = TokenRateLimiter(tokens_per_minute=100, window_minutes=1)\n\n        current_time = time.time()\n\n        # Add usage at different window boundaries\n        limiter.token_usage.append((current_time - 61, 50))  # Outside 60-second window\n        limiter.token_usage.append((current_time - 59, 40))  # Inside window\n\n        # Check current usage\n        usage = limiter._get_current_usage(current_time)\n        assert usage == 40  # Only the second entry should count\n\n    def test_config_validation_boundary_values(self) -> None:\n        \"\"\"Test configuration with boundary values.\"\"\"\n        # Test minimum values\n        config = create_test_config(\n            llm_max_concurrent_calls=1,\n            llm_max_retry_attempts=1,\n            llm_max_input_tokens_per_call=1,\n            llm_max_input_tokens_per_minute=1,\n        )\n        assert config.llm_max_concurrent_calls == 1\n        assert config.llm_max_retry_attempts == 1\n        assert config.llm_max_input_tokens_per_call == 1\n        assert config.llm_max_input_tokens_per_minute == 1\n\n    def test_error_classification_comprehensive(self) -> None:\n        \"\"\"Test comprehensive error classification scenarios.\"\"\"\n        config = create_test_config()\n\n        with patch(\"podcast_processor.ad_classifier.db.session\") as mock_session:\n            classifier = AdClassifier(config=config, db_session=mock_session)\n\n            retryable_errors = [\n                Exception(\"HTTP 429: Rate limit exceeded\"),\n                Exception(\"rate_limit_error: too many requests\"),\n                
Exception(\"RateLimitError: Request rate limit exceeded\"),\n                Exception(\"Service temporarily unavailable (503)\"),\n                Exception(\"service unavailable\"),\n                Exception(\"Error 503: Service unavailable\"),\n                Exception(\"rate limit reached\"),\n            ]\n\n            # Test specific LiteLLM exceptions by importing at runtime\n            try:\n                from litellm.exceptions import InternalServerError\n\n                # InternalServerError requires specific parameters, so create a simple one\n                retryable_errors.append(\n                    InternalServerError(\n                        \"Service unavailable\", llm_provider=\"test\", model=\"test\"\n                    )\n                )\n            except (ImportError, TypeError):\n                # If litellm.exceptions not available or constructor changed, skip this specific test\n                pass\n\n            for error in retryable_errors:\n                assert classifier._is_retryable_error(error) is True\n\n            non_retryable_errors = [\n                Exception(\"Invalid API key (401)\"),\n                Exception(\"Bad request (400)\"),\n                Exception(\"Forbidden (403)\"),\n                ValueError(\"Invalid input format\"),\n                Exception(\"Model not found (404)\"),\n                Exception(\"Connection timeout\"),  # Not in the retryable list\n                Exception(\"Internal server error (500)\"),  # Not in the retryable list\n            ]\n\n            for error in non_retryable_errors:\n                assert classifier._is_retryable_error(error) is False\n\n    @patch(\"time.sleep\")\n    def test_backoff_progression(self, mock_sleep: Any) -> None:\n        \"\"\"Test the complete backoff progression for different error types.\"\"\"\n        config = create_test_config()\n\n        with patch(\"podcast_processor.ad_classifier.db.session\") as mock_session:\n         
   classifier = AdClassifier(config=config, db_session=mock_session)\n\n            from app.models import ModelCall\n\n            model_call = ModelCall(id=1, error_message=None)\n\n            # Test rate limit error backoff progression\n            rate_limit_error = Exception(\"rate_limit_error: too many requests\")\n\n            # First attempt (attempt=0): 60 * (2^0) = 60\n            classifier._handle_retryable_error(\n                model_call_obj=model_call,\n                error=rate_limit_error,\n                attempt=0,\n                current_attempt_num=1,\n            )\n\n            # Second attempt (attempt=1): 60 * (2^1) = 120\n            classifier._handle_retryable_error(\n                model_call_obj=model_call,\n                error=rate_limit_error,\n                attempt=1,\n                current_attempt_num=2,\n            )\n\n            # Third attempt (attempt=2): 60 * (2^2) = 240\n            classifier._handle_retryable_error(\n                model_call_obj=model_call,\n                error=rate_limit_error,\n                attempt=2,\n                current_attempt_num=3,\n            )\n\n            # Check the sleep calls\n            expected_calls = [60, 120, 240]\n            actual_calls = [call[0][0] for call in mock_sleep.call_args_list]\n            assert actual_calls == expected_calls\n\n            # Reset for non-rate-limit error test\n            mock_sleep.reset_mock()\n\n            # Test regular error backoff progression: 1, 2, 4 seconds\n            regular_error = Exception(\"Internal server error\")\n\n            classifier._handle_retryable_error(\n                model_call_obj=model_call,\n                error=regular_error,\n                attempt=0,\n                current_attempt_num=1,\n            )\n            classifier._handle_retryable_error(\n                model_call_obj=model_call,\n                error=regular_error,\n                attempt=1,\n                
current_attempt_num=2,\n            )\n            classifier._handle_retryable_error(\n                model_call_obj=model_call,\n                error=regular_error,\n                attempt=2,\n                current_attempt_num=3,\n            )\n\n            expected_calls = [1, 2, 4]\n            actual_calls = [call[0][0] for call in mock_sleep.call_args_list]\n            assert actual_calls == expected_calls\n\n    def test_rate_limiter_with_very_short_window(self) -> None:\n        \"\"\"Test rate limiter with very short time windows.\"\"\"\n        # Use 1 minute window but test with 10-second spacing\n        limiter = TokenRateLimiter(tokens_per_minute=60, window_minutes=1)\n\n        current_time = time.time()\n\n        # Add usage just outside typical processing time\n        limiter.token_usage.append((current_time - 65, 30))  # Outside 1-min window\n        limiter.token_usage.append((current_time - 5, 20))  # 5 seconds ago\n\n        usage = limiter._get_current_usage(current_time)\n        assert usage == 20  # Only the recent usage should count\n\n    def test_model_configuration_case_sensitivity(self) -> None:\n        \"\"\"Test that model configuration handles different cases and formats.\"\"\"\n        from podcast_processor.token_rate_limiter import (\n            configure_rate_limiter_for_model,\n        )\n\n        # Test different cases of the same model\n        test_cases = [\n            \"gpt-4o-mini\",\n            \"GPT-4O-MINI\",  # Different case\n            \"some-provider/gpt-4o-mini/version\",  # With provider prefix/suffix\n        ]\n\n        for model_name in test_cases:\n            # Clear singleton to ensure fresh test\n            import podcast_processor.token_rate_limiter as trl_module\n\n            trl_module._RATE_LIMITER = None\n\n            # Only the exact lowercase match should work due to current implementation\n            limiter = configure_rate_limiter_for_model(model_name)\n            if 
\"gpt-4o-mini\" in model_name.lower():\n                expected_limit = (\n                    200000\n                    if model_name == \"gpt-4o-mini\" or \"gpt-4o-mini\" in model_name\n                    else 30000\n                )\n            else:\n                expected_limit = 30000  # Default\n\n            assert limiter.tokens_per_minute == expected_limit\n\n    def test_thread_safety_stress(self) -> None:\n        \"\"\"More intensive thread safety test.\"\"\"\n        import threading\n\n        limiter = TokenRateLimiter(\n            tokens_per_minute=50000\n        )  # Higher limit for stress test\n        messages: list[dict[str, str]] = [{\"role\": \"user\", \"content\": \"test \" * 100}]\n\n        results: list[tuple[int, int, float]] = []\n        errors: list[tuple[int, Exception]] = []\n\n        def worker(worker_id: int) -> None:\n            try:\n                for i in range(20):\n                    start_time = time.time()\n                    limiter.wait_if_needed(messages, \"gpt-4\")\n                    end_time = time.time()\n                    results.append((worker_id, i, end_time - start_time))\n            except Exception as e:\n                errors.append((worker_id, e))\n\n        # Run 10 threads with 20 calls each\n        threads = []\n        for worker_id in range(10):\n            thread = threading.Thread(target=worker, args=(worker_id,))\n            threads.append(thread)\n            thread.start()\n\n        for thread in threads:\n            thread.join()\n\n        # Should have no errors\n        assert len(errors) == 0\n\n        # Should have recorded all calls\n        assert len(limiter.token_usage) == 200  # 10 threads * 20 calls\n\n        # All calls should complete relatively quickly (no excessive waiting)\n        max_wait_time = max(result[2] for result in results)\n        assert max_wait_time < 5.0  # Should not wait more than 5 seconds\n"
  },
  {
    "path": "src/tests/test_session_auth.py",
"content": "from __future__ import annotations\n\nfrom typing import Iterator\nfrom urllib.parse import parse_qs, urlparse\n\nimport pytest\nfrom flask import Flask, Response, g, jsonify\n\nfrom app.auth import AuthSettings\nfrom app.auth.middleware import init_auth_middleware\nfrom app.auth.state import failure_rate_limiter\nfrom app.extensions import db\nfrom app.models import Feed, Post, User\nfrom app.routes.auth_routes import auth_bp\nfrom app.routes.feed_routes import feed_bp\n\n\n@pytest.fixture\ndef auth_app() -> Iterator[Flask]:\n    app = Flask(__name__)\n    app.config.update(\n        SECRET_KEY=\"test-secret\",\n        SESSION_COOKIE_NAME=\"podly_session\",\n        SQLALCHEMY_DATABASE_URI=\"sqlite:///:memory:\",\n        SQLALCHEMY_TRACK_MODIFICATIONS=False,\n    )\n\n    settings = AuthSettings(\n        require_auth=True,\n        admin_username=\"admin\",\n        admin_password=\"password\",\n    )\n    app.config[\"AUTH_SETTINGS\"] = settings\n    app.config[\"REQUIRE_AUTH\"] = True\n\n    db.init_app(app)\n    with app.app_context():\n        db.create_all()\n        user = User(username=\"admin\", role=\"admin\")\n        user.set_password(\"password\")\n        db.session.add(user)\n        db.session.commit()\n\n    failure_rate_limiter._storage.clear()\n\n    init_auth_middleware(app)\n    app.register_blueprint(auth_bp)\n    app.register_blueprint(feed_bp)\n\n    @app.route(\"/api/protected\", methods=[\"GET\"])\n    def protected() -> Response:\n        current = getattr(g, \"current_user\", None)\n        if current is None:\n            return jsonify({\"error\": \"missing user\"}), 500\n        return jsonify({\"status\": \"ok\", \"user\": current.username})\n\n    @app.route(\"/feed/1\", methods=[\"GET\"])\n    def feed() -> Response:\n        current = getattr(g, \"current_user\", None)\n        if current is None:\n            return Response(\"missing user\", status=500)\n        return Response(\"ok\", mimetype=\"text/plain\")\n\n    
@app.route(\"/api/posts/<string:guid>/download\", methods=[\"GET\"])\n    def download(guid: str) -> Response:\n        del guid\n        current = getattr(g, \"current_user\", None)\n        if current is None:\n            return Response(\"missing user\", status=500)\n        return Response(\"download\", mimetype=\"text/plain\")\n\n    yield app\n\n    with app.app_context():\n        db.session.remove()\n        db.drop_all()\n\n\ndef test_login_sets_session_cookie_and_allows_authenticated_requests(\n    auth_app: Flask,\n) -> None:\n    client = auth_app.test_client()\n\n    response = client.post(\n        \"/api/auth/login\",\n        json={\"username\": \"admin\", \"password\": \"password\"},\n    )\n    assert response.status_code == 200\n    set_cookie = response.headers.get(\"Set-Cookie\", \"\")\n    assert \"podly_session\" in set_cookie\n\n    me = client.get(\"/api/auth/me\")\n    assert me.status_code == 200\n    assert me.get_json()[\"user\"][\"username\"] == \"admin\"\n\n    protected = client.get(\"/api/protected\")\n    assert protected.status_code == 200\n    assert protected.get_json()[\"status\"] == \"ok\"\n\n\ndef test_logout_clears_session(auth_app: Flask) -> None:\n    client = auth_app.test_client()\n    client.post(\"/api/auth/login\", json={\"username\": \"admin\", \"password\": \"password\"})\n\n    response = client.post(\"/api/auth/logout\")\n    assert response.status_code == 204\n\n    protected = client.get(\"/api/protected\")\n    assert protected.status_code == 401\n    assert protected.headers.get(\"WWW-Authenticate\") is None\n\n\ndef test_protected_route_without_session_returns_json_401(auth_app: Flask) -> None:\n    client = auth_app.test_client()\n    response = client.get(\"/api/protected\")\n    assert response.status_code == 401\n    assert response.get_json()[\"error\"] == \"Authentication required.\"\n    assert response.headers.get(\"WWW-Authenticate\") is None\n\n\ndef 
test_feed_requires_token_when_no_session(auth_app: Flask) -> None:\n    client = auth_app.test_client()\n\n    unauthorized = client.get(\"/feed/1\")\n    assert unauthorized.status_code == 401\n    assert \"Invalid or missing feed token\" in unauthorized.get_data(as_text=True)\n\n\ndef test_share_link_generates_token_and_allows_query_access(auth_app: Flask) -> None:\n    client = auth_app.test_client()\n    with auth_app.app_context():\n        feed = Feed(title=\"Example\", rss_url=\"https://example.com/feed.xml\")\n        db.session.add(feed)\n        db.session.commit()\n        feed_id = feed.id\n\n        post = Post(\n            feed_id=feed_id,\n            guid=\"episode-1\",\n            download_url=\"https://example.com/audio.mp3\",\n            title=\"Episode\",\n            whitelisted=True,\n        )\n        db.session.add(post)\n        db.session.commit()\n\n    client.post(\"/api/auth/login\", json={\"username\": \"admin\", \"password\": \"password\"})\n    share = client.post(f\"/api/feeds/{feed_id}/share-link\")\n    assert share.status_code == 201\n    payload = share.get_json()\n    assert payload[\"feed_id\"] == feed_id\n\n    token_id = payload[\"feed_token\"]\n    secret = payload[\"feed_secret\"]\n\n    parsed = urlparse(payload[\"url\"])\n    params = parse_qs(parsed.query)\n    assert params.get(\"feed_token\", [None])[0] == token_id\n    assert params.get(\"feed_secret\", [None])[0] == secret\n\n    anon_client = auth_app.test_client()\n\n    feed_response = anon_client.get(\n        f\"/feed/{feed_id}\",\n        query_string={\"feed_token\": token_id, \"feed_secret\": secret},\n    )\n    assert feed_response.status_code == 200\n    assert feed_response.data == b\"ok\"\n\n    download_response = anon_client.get(\n        \"/api/posts/episode-1/download\",\n        query_string={\"feed_token\": token_id, \"feed_secret\": secret},\n    )\n    assert download_response.status_code == 200\n\n\ndef 
test_share_link_returns_same_token_for_user_and_feed(auth_app: Flask) -> None:\n    client = auth_app.test_client()\n    with auth_app.app_context():\n        feed = Feed(title=\"Stable\", rss_url=\"https://example.com/stable.xml\")\n        db.session.add(feed)\n        db.session.commit()\n        feed_id = feed.id\n\n    client.post(\"/api/auth/login\", json={\"username\": \"admin\", \"password\": \"password\"})\n\n    first = client.post(f\"/api/feeds/{feed_id}/share-link\").get_json()\n    second = client.post(f\"/api/feeds/{feed_id}/share-link\").get_json()\n\n    assert first[\"url\"] == second[\"url\"]\n    assert first[\"feed_token\"] == second[\"feed_token\"]\n    assert first[\"feed_secret\"] == second[\"feed_secret\"]\n"
  },
  {
    "path": "src/tests/test_token_limit_config.py",
    "content": "\"\"\"\nSimple integration test for the llm_max_input_tokens_per_call feature.\n\"\"\"\n\nfrom shared.test_utils import create_standard_test_config\n\n\ndef test_config_validation() -> None:\n    \"\"\"Test that the config validation works with the new setting.\"\"\"\n    # Test with token limit\n    config_with_limit = create_standard_test_config(llm_max_input_tokens_per_call=50000)\n\n    assert config_with_limit.llm_max_input_tokens_per_call == 50000\n    assert config_with_limit.processing.num_segments_to_input_to_prompt == 400\n\n    # Test without token limit\n    config_without_limit = create_standard_test_config()\n\n    assert config_without_limit.llm_max_input_tokens_per_call is None\n    assert config_without_limit.processing.num_segments_to_input_to_prompt == 400\n\n\nif __name__ == \"__main__\":\n    test_config_validation()\n    print(\"✓ Config validation test passed!\")\n"
  },
  {
    "path": "src/tests/test_token_rate_limiter.py",
    "content": "\"\"\"\nTests for the TokenRateLimiter class and related functionality.\n\"\"\"\n\nimport threading\nimport time\nfrom unittest.mock import patch\n\nfrom podcast_processor.token_rate_limiter import (\n    TokenRateLimiter,\n    configure_rate_limiter_for_model,\n    get_rate_limiter,\n)\n\n\nclass TestTokenRateLimiter:\n    \"\"\"Test cases for the TokenRateLimiter class.\"\"\"\n\n    def test_initialization(self) -> None:\n        \"\"\"Test rate limiter initialization with default and custom parameters.\"\"\"\n        # Test default initialization\n        limiter = TokenRateLimiter()\n        assert limiter.tokens_per_minute == 30000\n        assert limiter.window_seconds == 60\n        assert len(limiter.token_usage) == 0\n\n        # Test custom initialization\n        limiter = TokenRateLimiter(tokens_per_minute=15000, window_minutes=2)\n        assert limiter.tokens_per_minute == 15000\n        assert limiter.window_seconds == 120\n\n    def test_count_tokens(self) -> None:\n        \"\"\"Test token counting functionality.\"\"\"\n        limiter = TokenRateLimiter()\n\n        # Test empty messages\n        messages: list[dict[str, str]] = []\n        tokens = limiter.count_tokens(messages, \"gpt-4\")\n        assert tokens == 0\n\n        # Test single message\n        messages = [{\"role\": \"user\", \"content\": \"Hello world\"}]\n        tokens = limiter.count_tokens(messages, \"gpt-4\")\n        assert tokens > 0  # Should estimate some tokens\n\n        # Test multiple messages\n        messages = [\n            {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n            {\"role\": \"user\", \"content\": \"What is the weather like today?\"},\n        ]\n        tokens = limiter.count_tokens(messages, \"gpt-4\")\n        assert tokens > 0\n\n    def test_token_counting_fallback(self) -> None:\n        \"\"\"Test token counting fallback on error.\"\"\"\n        limiter = TokenRateLimiter()\n\n        # Test with 
malformed message (should use fallback)\n        messages: list[dict[str, str]] = [{\"role\": \"user\"}]  # Missing content\n        tokens = limiter.count_tokens(messages, \"gpt-4\")\n        assert tokens == 0  # Should return 0 for missing content\n\n    def test_cleanup_old_usage(self) -> None:\n        \"\"\"Test cleanup of old token usage records.\"\"\"\n        limiter = TokenRateLimiter(tokens_per_minute=1000, window_minutes=1)\n\n        current_time = time.time()\n\n        # Add some old usage records\n        limiter.token_usage.append((current_time - 120, 100))  # 2 minutes ago\n        limiter.token_usage.append((current_time - 30, 200))  # 30 seconds ago\n        limiter.token_usage.append((current_time - 10, 300))  # 10 seconds ago\n\n        # Cleanup should remove the 2-minute-old record\n        limiter._cleanup_old_usage(current_time)\n\n        assert len(limiter.token_usage) == 2\n        assert limiter.token_usage[0][1] == 200  # 30 seconds ago should remain\n        assert limiter.token_usage[1][1] == 300  # 10 seconds ago should remain\n\n    def test_get_current_usage(self) -> None:\n        \"\"\"Test getting current token usage within time window.\"\"\"\n        limiter = TokenRateLimiter(tokens_per_minute=1000, window_minutes=1)\n\n        current_time = time.time()\n\n        # Add usage records\n        limiter.token_usage.append((current_time - 120, 100))  # Outside window\n        limiter.token_usage.append((current_time - 30, 200))  # Within window\n        limiter.token_usage.append((current_time - 10, 300))  # Within window\n\n        usage = limiter._get_current_usage(current_time)\n        assert usage == 500  # 200 + 300 (only records within window)\n\n    def test_check_rate_limit_within_limits(self) -> None:\n        \"\"\"Test rate limit check when within limits.\"\"\"\n        limiter = TokenRateLimiter(tokens_per_minute=1000)\n\n        messages: list[dict[str, str]] = [{\"role\": \"user\", \"content\": \"Short 
message\"}]\n        can_proceed, wait_seconds = limiter.check_rate_limit(messages, \"gpt-4\")\n\n        assert can_proceed is True\n        assert wait_seconds == 0.0\n\n    def test_check_rate_limit_exceeds_limits(self) -> None:\n        \"\"\"Test rate limit check when exceeding limits.\"\"\"\n        limiter = TokenRateLimiter(tokens_per_minute=100)  # Very low limit\n\n        current_time = time.time()\n\n        # Add usage that nearly fills the limit\n        limiter.token_usage.append((current_time - 30, 90))\n\n        # Try to add more tokens that would exceed the limit\n        messages: list[dict[str, str]] = [\n            {\n                \"role\": \"user\",\n                \"content\": \"This is a longer message that should exceed the token limit\",\n            }\n        ]\n        can_proceed, wait_seconds = limiter.check_rate_limit(messages, \"gpt-4\")\n\n        assert can_proceed is False\n        assert wait_seconds > 0\n\n    def test_record_usage(self) -> None:\n        \"\"\"Test recording token usage.\"\"\"\n        limiter = TokenRateLimiter()\n\n        messages: list[dict[str, str]] = [{\"role\": \"user\", \"content\": \"Test message\"}]\n        initial_count = len(limiter.token_usage)\n\n        limiter.record_usage(messages, \"gpt-4\")\n\n        assert len(limiter.token_usage) == initial_count + 1\n        timestamp, token_count = limiter.token_usage[-1]\n        assert timestamp > 0\n        assert token_count > 0\n\n    def test_wait_if_needed_no_wait(self) -> None:\n        \"\"\"Test wait_if_needed when no waiting is required.\"\"\"\n        limiter = TokenRateLimiter(tokens_per_minute=10000)  # High limit\n\n        messages: list[dict[str, str]] = [{\"role\": \"user\", \"content\": \"Short message\"}]\n        start_time = time.time()\n\n        limiter.wait_if_needed(messages, \"gpt-4\")\n\n        end_time = time.time()\n        elapsed = end_time - start_time\n\n        # Should not have waited significantly\n        
assert elapsed < 1.0\n\n        # Should have recorded usage\n        assert len(limiter.token_usage) > 0\n\n    def test_wait_if_needed_with_wait(self) -> None:\n        \"\"\"Test wait_if_needed when waiting is required.\"\"\"\n        limiter = TokenRateLimiter(tokens_per_minute=50)  # Very low limit\n\n        # Fill up the rate limit\n        current_time = time.time()\n        limiter.token_usage.append((current_time - 10, 45))\n\n        messages: list[dict[str, str]] = [\n            {\"role\": \"user\", \"content\": \"This message should trigger waiting\"}\n        ]\n\n        # Mock time.sleep to avoid actual waiting in tests\n        with patch(\"time.sleep\") as mock_sleep:\n            limiter.wait_if_needed(messages, \"gpt-4\")\n\n            # Should have called sleep\n            mock_sleep.assert_called_once()\n            call_args = mock_sleep.call_args[0]\n            assert call_args[0] > 0  # Should have waited some positive amount\n\n    def test_get_usage_stats(self) -> None:\n        \"\"\"Test getting usage statistics.\"\"\"\n        limiter = TokenRateLimiter(tokens_per_minute=1000)\n\n        # Add some usage\n        current_time = time.time()\n        limiter.token_usage.append((current_time - 30, 200))\n        limiter.token_usage.append((current_time - 10, 300))\n\n        stats = limiter.get_usage_stats()\n\n        assert \"current_usage\" in stats\n        assert \"limit\" in stats\n        assert \"usage_percentage\" in stats\n        assert \"window_seconds\" in stats\n        assert \"active_records\" in stats\n\n        assert stats[\"current_usage\"] == 500\n        assert stats[\"limit\"] == 1000\n        assert stats[\"usage_percentage\"] == 50.0\n        assert stats[\"window_seconds\"] == 60\n        assert stats[\"active_records\"] == 2\n\n    def test_thread_safety(self) -> None:\n        \"\"\"Test that the rate limiter is thread-safe.\"\"\"\n        limiter = TokenRateLimiter(tokens_per_minute=10000)\n        
messages: list[dict[str, str]] = [{\"role\": \"user\", \"content\": \"Test message\"}]\n\n        def worker() -> None:\n            for _ in range(10):\n                limiter.wait_if_needed(messages, \"gpt-4\")\n\n        # Run multiple threads concurrently\n        threads = []\n        for _ in range(5):\n            thread = threading.Thread(target=worker)\n            threads.append(thread)\n            thread.start()\n\n        # Wait for all threads to complete\n        for thread in threads:\n            thread.join()\n\n        # Should have recorded usage from all threads\n        assert len(limiter.token_usage) == 50  # 5 threads * 10 calls each\n\n\nclass TestGlobalRateLimiter:\n    \"\"\"Test cases for global rate limiter functions.\"\"\"\n\n    def test_get_rate_limiter_singleton(self) -> None:\n        \"\"\"Test that get_rate_limiter returns the same instance.\"\"\"\n        limiter1 = get_rate_limiter(5000)\n        limiter2 = get_rate_limiter(5000)\n\n        assert limiter1 is limiter2  # Should be the same instance\n        assert limiter1.tokens_per_minute == 5000\n\n    def test_get_rate_limiter_different_limits(self) -> None:\n        \"\"\"Test that get_rate_limiter creates new instance for different limits.\"\"\"\n        limiter1 = get_rate_limiter(5000)\n        limiter2 = get_rate_limiter(8000)\n\n        assert limiter1 is not limiter2  # Should be different instances\n        assert limiter1.tokens_per_minute == 5000\n        assert limiter2.tokens_per_minute == 8000\n\n    def test_configure_rate_limiter_for_model_anthropic(self) -> None:\n        \"\"\"Test model-specific configuration for Anthropic models.\"\"\"\n        limiter = configure_rate_limiter_for_model(\n            \"anthropic/claude-3-5-sonnet-20240620\"\n        )\n        assert limiter.tokens_per_minute == 30000\n\n    def test_configure_rate_limiter_for_model_openai(self) -> None:\n        \"\"\"Test model-specific configuration for OpenAI models.\"\"\"\n        # 
Test each model in isolation to avoid singleton issues\n        import podcast_processor.token_rate_limiter as trl_module\n\n        # Test gpt-4o-mini first (higher limit)\n        trl_module._RATE_LIMITER = None\n        limiter = configure_rate_limiter_for_model(\"gpt-4o-mini\")\n        assert limiter.tokens_per_minute == 200000\n\n        # Test gpt-4o (lower limit)\n        trl_module._RATE_LIMITER = None\n        limiter = configure_rate_limiter_for_model(\"gpt-4o\")\n        assert limiter.tokens_per_minute == 150000\n\n    def test_configure_rate_limiter_for_model_gemini(self) -> None:\n        \"\"\"Test model-specific configuration for Gemini models.\"\"\"\n        import podcast_processor.token_rate_limiter as trl_module\n\n        trl_module._RATE_LIMITER = None\n        limiter = configure_rate_limiter_for_model(\"gemini/gemini-3-flash-preview\")\n        assert limiter.tokens_per_minute == 60000\n\n        trl_module._RATE_LIMITER = None\n        limiter = configure_rate_limiter_for_model(\"gemini/gemini-2.5-flash\")\n        assert limiter.tokens_per_minute == 60000\n\n    def test_configure_rate_limiter_for_model_unknown(self) -> None:\n        \"\"\"Test model-specific configuration for unknown models.\"\"\"\n        limiter = configure_rate_limiter_for_model(\"unknown/model-name\")\n        assert limiter.tokens_per_minute == 30000  # Should use default\n\n    def test_configure_rate_limiter_partial_match(self) -> None:\n        \"\"\"Test model-specific configuration with partial model names.\"\"\"\n        # Test that partial matches work\n        limiter = configure_rate_limiter_for_model(\"some-prefix/gpt-4o/some-suffix\")\n        assert limiter.tokens_per_minute == 150000  # Should match gpt-4o\n"
  },
  {
    "path": "src/tests/test_transcribe.py",
    "content": "import logging\nfrom typing import Any\nfrom unittest.mock import MagicMock\n\nimport pytest\nfrom openai.types.audio.transcription_segment import TranscriptionSegment\n\n# from pytest_mock import MockerFixture\n\n\n@pytest.mark.skip\ndef test_remote_transcribe() -> None:\n    # import here instead of the toplevel because torch is not installed properly in CI.\n    from podcast_processor.transcribe import (  # pylint: disable=import-outside-toplevel\n        OpenAIWhisperTranscriber,\n    )\n\n    logger = logging.getLogger(\"global_logger\")\n    from shared.test_utils import create_standard_test_config\n\n    config = create_standard_test_config().model_dump()\n\n    transcriber = OpenAIWhisperTranscriber(logger, config)\n\n    transcription = transcriber.transcribe(\"file.mp3\")\n    assert transcription == []\n\n\n@pytest.mark.skip\ndef test_local_transcribe() -> None:\n    # import here instead of the toplevel because torch is not installed properly in CI.\n    from podcast_processor.transcribe import (  # pylint: disable=import-outside-toplevel\n        LocalWhisperTranscriber,\n    )\n\n    logger = logging.getLogger(\"global_logger\")\n    transcriber = LocalWhisperTranscriber(logger, \"base.en\")\n    transcription = transcriber.transcribe(\"src/tests/file.mp3\")\n    assert transcription == []\n\n\n@pytest.mark.skip\ndef test_groq_transcribe(mocker: Any) -> None:\n    # import here instead of the toplevel because dependencies aren't installed properly in CI.\n    from podcast_processor.transcribe import (  # pylint: disable=import-outside-toplevel\n        GroqWhisperTranscriber,\n    )\n    from shared.config import (  # pylint: disable=import-outside-toplevel\n        GroqWhisperConfig,\n    )\n\n    # Mock the requests call\n    mock_response = MagicMock()\n    mock_response.status_code = 200\n    mock_response.json.return_value = {\n        \"segments\": [\n            {\"start\": 0.0, \"end\": 1.0, \"text\": \"This is a test 
segment.\"},\n            {\"start\": 1.0, \"end\": 2.0, \"text\": \"This is another test segment.\"},\n        ]\n    }\n    mocker.patch(\"requests.post\", return_value=mock_response)\n\n    # Mock file operations\n    mocker.patch(\"builtins.open\", mocker.mock_open(read_data=\"test audio data\"))\n    mocker.patch(\"pathlib.Path.exists\", return_value=True)\n    mocker.patch(\"podcast_processor.audio.split_audio\", return_value=[(\"test.mp3\", 0)])\n    mocker.patch(\"shutil.rmtree\")\n\n    logger = logging.getLogger(\"global_logger\")\n    config = GroqWhisperConfig(\n        api_key=\"test_key\", model=\"whisper-large-v3-turbo\", language=\"en\"\n    )\n\n    transcriber = GroqWhisperTranscriber(logger, config)\n    transcription = transcriber.transcribe(\"test.mp3\")\n\n    assert len(transcription) == 2\n    assert transcription[0].text == \"This is a test segment.\"\n    assert transcription[1].text == \"This is another test segment.\"\n\n\ndef test_offset() -> None:\n    # import here instead of the toplevel because torch is not installed properly in CI.\n    from podcast_processor.transcribe import (  # pylint: disable=import-outside-toplevel\n        OpenAIWhisperTranscriber,\n    )\n\n    assert OpenAIWhisperTranscriber.add_offset_to_segments(\n        [\n            TranscriptionSegment(\n                id=1,\n                avg_logprob=2,\n                seek=6,\n                temperature=7,\n                text=\"hi\",\n                tokens=[],\n                compression_ratio=3,\n                no_speech_prob=4,\n                start=12.345,\n                end=45.678,\n            )\n        ],\n        123,\n    ) == [\n        TranscriptionSegment(\n            id=1,\n            avg_logprob=2,\n            seek=6,\n            temperature=7,\n            text=\"hi\",\n            tokens=[],\n            compression_ratio=3,\n            no_speech_prob=4,\n            start=12.468,\n            end=45.800999999999995,\n        )\n  
  ]\n"
  },
  {
    "path": "src/tests/test_transcription_manager.py",
    "content": "import logging\nfrom typing import Generator\nfrom unittest.mock import MagicMock\n\nimport pytest\nfrom flask import Flask\n\nfrom app.extensions import db\nfrom app.models import Feed, ModelCall, Post, TranscriptSegment\nfrom podcast_processor.transcribe import Segment, Transcriber\nfrom podcast_processor.transcription_manager import TranscriptionManager\nfrom shared.config import Config, TestWhisperConfig\nfrom shared.test_utils import create_standard_test_config\n\n\nclass MockTranscriber(Transcriber):\n    \"\"\"Mock transcriber for testing TranscriptionManager.\"\"\"\n\n    def __init__(self, mock_response=None):\n        self.mock_response = mock_response or []\n        self._model_name = \"mock_transcriber\"\n\n    @property\n    def model_name(self) -> str:\n        \"\"\"Implementation of the abstract property\"\"\"\n        return self._model_name\n\n    def transcribe(self, audio_path):\n        \"\"\"Return mock segments or raise exception based on configuration.\"\"\"\n        if isinstance(self.mock_response, Exception):\n            raise self.mock_response\n        return self.mock_response\n\n\n@pytest.fixture\ndef app() -> Generator[Flask, None, None]:\n    \"\"\"Create and configure a Flask app for testing.\"\"\"\n    app = Flask(__name__)\n    app.config[\"SQLALCHEMY_DATABASE_URI\"] = \"sqlite:///:memory:\"\n    app.config[\"SQLALCHEMY_TRACK_MODIFICATIONS\"] = False\n\n    with app.app_context():\n        db.init_app(app)\n        db.create_all()\n        yield app\n\n\n@pytest.fixture\ndef test_config() -> Config:\n    config = create_standard_test_config()\n    # Override whisper config to use test mode\n    config.whisper = TestWhisperConfig()\n    return config\n\n\n@pytest.fixture\ndef test_logger() -> logging.Logger:\n    return logging.getLogger(\"test_logger\")\n\n\n@pytest.fixture\ndef mock_db_session() -> MagicMock:\n    \"\"\"Create a mock database session\"\"\"\n    mock_session = MagicMock()\n    mock_session.add = 
MagicMock()\n    mock_session.add_all = MagicMock()\n    mock_session.commit = MagicMock()\n    mock_session.rollback = MagicMock()\n    return mock_session\n\n\n@pytest.fixture\ndef mock_transcriber() -> MockTranscriber:\n    \"\"\"Return a mock transcriber for testing.\"\"\"\n    return MockTranscriber(\n        [\n            Segment(start=0.0, end=5.0, text=\"Test segment 1\"),\n            Segment(start=5.0, end=10.0, text=\"Test segment 2\"),\n        ]\n    )\n\n\n@pytest.fixture\ndef test_manager(\n    test_config: Config,\n    test_logger: logging.Logger,\n    mock_db_session: MagicMock,\n    mock_transcriber: MockTranscriber,\n    app: Flask,\n) -> TranscriptionManager:\n    \"\"\"Return a TranscriptionManager instance for testing.\"\"\"\n    with app.app_context():\n        # We need to create mock query objects with proper structure\n        mock_model_call_query = MagicMock()\n        mock_segment_query = MagicMock()\n\n        # Create a manager with our mocks\n        return TranscriptionManager(\n            test_logger,\n            test_config,\n            model_call_query=mock_model_call_query,\n            segment_query=mock_segment_query,\n            db_session=mock_db_session,\n            transcriber=mock_transcriber,\n        )\n\n\ndef test_check_existing_transcription_success(\n    test_manager: TranscriptionManager,\n    app: Flask,\n) -> None:\n    \"\"\"Test finding existing successful transcription\"\"\"\n    post = Post(id=1, title=\"Test Post\")\n\n    # Create test data\n    model_call = ModelCall(\n        post_id=1,\n        model_name=test_manager.transcriber.model_name,\n        status=\"success\",\n        first_segment_sequence_num=0,\n        last_segment_sequence_num=1,\n    )\n    segments = [\n        TranscriptSegment(\n            post_id=1, sequence_num=0, start_time=0.0, end_time=5.0, text=\"Segment 1\"\n        ),\n        TranscriptSegment(\n            post_id=1, sequence_num=1, start_time=5.0, end_time=10.0, 
text=\"Segment 2\"\n        ),\n    ]\n\n    with app.app_context():\n        # Configure the existing mocks in the manager\n        test_manager.model_call_query.filter_by().order_by().first.return_value = (\n            model_call\n        )\n        test_manager.segment_query.filter_by().order_by().all.return_value = segments\n\n        result = test_manager._check_existing_transcription(post)\n\n        assert result is not None\n        assert len(result) == 2\n        assert result[0].text == \"Segment 1\"\n        assert result[1].text == \"Segment 2\"\n\n\ndef test_check_existing_transcription_no_model_call(\n    test_manager: TranscriptionManager,\n    app: Flask,\n) -> None:\n    \"\"\"Test when no existing ModelCall exists\"\"\"\n    post = Post(id=1, title=\"Test Post\")\n\n    with app.app_context():\n        # Set return value for the existing mock in the manager\n        test_manager.model_call_query.filter_by().order_by().first.return_value = None\n\n        result = test_manager._check_existing_transcription(post)\n        assert result is None\n\n\ndef test_transcribe_new(\n    test_config: Config,\n    test_logger: logging.Logger,\n    app: Flask,\n) -> None:\n    \"\"\"Test transcribing a new audio file\"\"\"\n    with app.app_context():\n        feed = Feed(title=\"Test Feed\", rss_url=\"http://example.com/rss.xml\")\n        post = Post(\n            feed=feed,\n            guid=\"guid-1\",\n            download_url=\"http://example.com/audio-1.mp3\",\n            title=\"Test Post\",\n            unprocessed_audio_path=\"/path/to/audio.mp3\",\n        )\n        db.session.add_all([feed, post])\n        db.session.commit()\n\n        transcriber = MockTranscriber(\n            [\n                Segment(start=0.0, end=5.0, text=\"Test segment 1\"),\n                Segment(start=5.0, end=10.0, text=\"Test segment 2\"),\n            ]\n        )\n        manager = TranscriptionManager(\n            test_logger,\n            test_config,\n      
      db_session=db.session,\n            transcriber=transcriber,\n        )\n\n        segments = manager.transcribe(post)\n\n        assert len(segments) == 2\n        assert segments[0].text == \"Test segment 1\"\n        assert segments[1].text == \"Test segment 2\"\n        assert TranscriptSegment.query.filter_by(post_id=post.id).count() == 2\n        assert ModelCall.query.filter_by(post_id=post.id).count() == 1\n        assert ModelCall.query.filter_by(post_id=post.id).first().status == \"success\"\n\n\ndef test_transcribe_handles_error(\n    test_config: Config,\n    test_logger: logging.Logger,\n    app: Flask,\n) -> None:\n    \"\"\"Test error handling during transcription\"\"\"\n    with app.app_context():\n        feed = Feed(title=\"Test Feed\", rss_url=\"http://example.com/rss.xml\")\n        post = Post(\n            feed=feed,\n            guid=\"guid-err\",\n            download_url=\"http://example.com/audio-err.mp3\",\n            title=\"Test Post\",\n            unprocessed_audio_path=\"/path/to/audio.mp3\",\n        )\n        db.session.add_all([feed, post])\n        db.session.commit()\n\n        # Create a mock transcriber that raises an exception\n        error_transcriber = MockTranscriber(Exception(\"Transcription failed\"))\n\n        manager = TranscriptionManager(\n            test_logger,\n            test_config,\n            db_session=db.session,\n            transcriber=error_transcriber,\n        )\n\n        # Test the exception\n        with pytest.raises(Exception) as exc_info:\n            manager.transcribe(post)\n\n        assert str(exc_info.value) == \"Transcription failed\"\n        call = (\n            ModelCall.query.filter_by(post_id=post.id)\n            .order_by(ModelCall.timestamp.desc())\n            .first()\n        )\n        assert call is not None\n        assert call.status == \"failed_permanent\"\n        assert call.error_message == \"Transcription failed\"\n\n\ndef 
test_transcribe_reuses_placeholder_model_call(\n    test_config: Config,\n    test_logger: logging.Logger,\n    app: Flask,\n) -> None:\n    \"\"\"Ensure we reuse existing placeholder ModelCall rows instead of crashing on uniqueness.\"\"\"\n    with app.app_context():\n        feed = Feed(title=\"Test Feed\", rss_url=\"http://example.com/rss.xml\")\n        post = Post(\n            feed=feed,\n            guid=\"guid-123\",\n            download_url=\"http://example.com/audio.mp3\",\n            title=\"Test Post\",\n            unprocessed_audio_path=\"/tmp/audio.mp3\",\n        )\n        db.session.add_all([feed, post])\n        db.session.commit()\n\n        existing_call = ModelCall(\n            post_id=post.id,\n            model_name=\"mock_transcriber\",\n            first_segment_sequence_num=0,\n            last_segment_sequence_num=-1,\n            prompt=\"Whisper transcription job\",\n            status=\"failed_permanent\",\n        )\n        db.session.add(existing_call)\n        db.session.commit()\n\n        manager = TranscriptionManager(\n            test_logger,\n            test_config,\n            db_session=db.session,\n            transcriber=MockTranscriber(\n                [\n                    Segment(start=0.0, end=5.0, text=\"Segment 1\"),\n                    Segment(start=5.0, end=10.0, text=\"Segment 2\"),\n                ]\n            ),\n        )\n\n        segments = manager.transcribe(post)\n\n        assert len(segments) == 2\n        assert ModelCall.query.count() == 1\n        refreshed_call = ModelCall.query.first()\n        assert refreshed_call.id == existing_call.id\n        assert refreshed_call.status == \"success\"\n        assert refreshed_call.last_segment_sequence_num == 1\n"
  },
  {
    "path": "src/user_prompt.jinja",
    "content": "You are analyzing \"{{podcast_title}}\", a podcast about {{podcast_topic}}.\nReturn only the JSON contract described in the system prompt using the transcript excerpt below.\n\n{{transcript}}\n"
  },
  {
    "path": "src/word_boundary_refinement_prompt.jinja",
    "content": "You are analyzing podcast transcript segments to identify the precise START and END of advertisement content.\n\nYour job is to locate short, distinctive phrases at the START and END of the ad break within the provided segments.\n\nBOUNDARY DETECTION RULES:\n\n**AD START INDICATORS** (extend boundary backward):\n- Sponsor introductions: \"This episode is brought to you by...\", \"And now a word from our sponsor\"\n- Transition phrases: \"Before we continue...\", \"Let me tell you about...\", \"Speaking of...\"\n- Host acknowledgments: \"I want to thank...\", \"Special thanks to...\", \"Our sponsor today is...\"\n- Subtle lead-ins: \"You know what's interesting...\", \"I've been using...\", \"Let me share something...\"\n\n**AD END INDICATORS** (extend boundary forward or tighten earlier):\n- Transition back to content: \"And we're back\", \"Now back to the show\", \"Alright, let's get back to...\"\n- Host resumes discussion: references to the previous topic immediately after sponsor talk\n- Audible wrap-up phrases: \"Check them out\", \"Use code...\", \"Link in the description\" followed by topic continuation\n\n**ANALYSIS CONTEXT**:\n- **Detected Ad Block**: {{ad_start}}s - {{ad_end}}s\n- **Original Confidence**: {{ad_confidence}}\n\n**CONTEXT SEGMENTS**:\nEach segment has a stable sequence number and timing.\n\n{% for segment in context_segments -%}\n[seq={{segment.sequence_num}} start={{segment.start_time}} end={{segment.end_time}}] {{segment.text}}\n{% endfor %}\n\n**OUTPUT FORMAT**:\nRespond with valid JSON.\n\n- Identify the segment that contains the START of the ad break.\n- Identify a short phrase at the START of the ad break: the first 4 words of the promo/sponsor read.\n- Identify the segment that contains the END of the ad break.\n- Identify a short phrase at the END of the ad break: the last 4 words right before returning to content.\n\nPhrase requirements:\n- Each phrase should be a contiguous sequence of words that appears in the 
segment text.\n- Prefer phrases that are fully contained within a single segment.\n- Use exactly 4 words when possible. If you cannot, return fewer words (3, 2, or 1) that still appear contiguously.\n\nPartial output is allowed:\n- If you are unsure about the START phrase, you may omit `refined_start_phrase` (or set it to null/empty) and we will keep the original detected start boundary.\n- If you are unsure about the END phrase, you may omit `refined_end_phrase` (or set it to null/empty) and we will keep the original detected end boundary.\n\n```json\n{\n  \"refined_start_segment_seq\": 0,\n  \"refined_start_phrase\": \"this episode is brought\",\n  \"refined_end_segment_seq\": 0,\n  \"refined_end_phrase\": \"now back to the\",\n  \"start_adjustment_reason\": \"reason for start boundary change\",\n  \"end_adjustment_reason\": \"reason for end boundary change\"\n}\n```\n\n**REFINEMENT GUIDELINES**:\n- If no refinement is needed, pick the best segment/phrase corresponding to the existing detected start.\n- Prefer to refine both START and END boundaries, but return partial results if only one side is confident.\n- Always ensure the chosen start phrase occurs near the detected start boundary.\n- Always ensure the chosen end phrase occurs near the detected end boundary.\n"
  },
  {
    "path": "tests/test_cue_detector.py",
    "content": "import unittest\n\nfrom podcast_processor.cue_detector import CueDetector\nfrom podcast_processor.prompt import transcript_excerpt_for_prompt\nfrom podcast_processor.transcribe import Segment\n\n\nclass TestCueDetector(unittest.TestCase):\n    def setUp(self) -> None:\n        self.detector = CueDetector()\n\n    def test_highlight_cues_url(self) -> None:\n        text = \"Check out example.com for more info.\"\n        # \"Check out\" is a CTA, \"example.com\" is a URL. Both should be highlighted.\n        expected = \"*** Check out *** *** example.com *** for more info.\"\n        self.assertEqual(self.detector.highlight_cues(text), expected)\n\n    def test_highlight_cues_promo(self) -> None:\n        text = \"Use promo code SAVE20 now.\"\n        # \"promo code\" matches promo_pattern.\n        # \"code SAVE20\" would also match promo_pattern, but re.finditer is non-overlapping for a single pattern.\n        # So only \"promo code\" is captured.\n        expected = \"Use *** promo code *** SAVE20 now.\"\n        self.assertEqual(self.detector.highlight_cues(text), expected)\n\n    def test_highlight_cues_cta(self) -> None:\n        text = \"Please visit our website.\"\n        expected = \"Please *** visit *** our website.\"\n        self.assertEqual(self.detector.highlight_cues(text), expected)\n\n    def test_highlight_cues_multiple(self) -> None:\n        text = \"Visit example.com and use code TEST.\"\n        # \"Visit\" -> cta\n        # \"example.com\" -> url\n        # \"use code\" -> cta\n        # \"code TEST\" -> promo\n        # \"use code TEST\" -> \"use code\" (cta) overlaps with \"code TEST\" (promo)\n        # \"use code\" (22, 30)\n        # \"code TEST\" (26, 35)\n        # Merged: (22, 35) -> \"use code TEST\"\n        expected = \"*** Visit *** *** example.com *** and *** use code TEST ***.\"\n        self.assertEqual(self.detector.highlight_cues(text), expected)\n\n    def test_highlight_cues_no_cues(self) -> None:\n        
text = \"Just a normal sentence.\"\n        self.assertEqual(self.detector.highlight_cues(text), text)\n\n    def test_integration_prompt(self) -> None:\n        segments = [\n            Segment(start=10.0, end=15.0, text=\"Welcome back to the show.\"),\n            Segment(start=15.0, end=20.0, text=\"Go to mywebsite.com today.\"),\n        ]\n        result = transcript_excerpt_for_prompt(\n            segments, includes_start=False, includes_end=False\n        )\n\n        # \"back to the show\" is a transition cue\n        expected_line1 = \"[10.0] Welcome *** back to the show ***.\"\n        # \"Go to\" is CTA, \"mywebsite.com\" is URL\n        expected_line2 = \"[15.0] *** Go to *** *** mywebsite.com *** today.\"\n\n        self.assertIn(expected_line1, result)\n        self.assertIn(expected_line2, result)\n\n\nif __name__ == \"__main__\":\n    unittest.main()\n"
  }
]