[
  {
    "path": ".gitattributes",
    "content": "# Auto detect text files and perform LF normalization\n* text=auto\n"
  },
  {
    "path": ".github/workflows/release.yml",
    "content": "name: Upload Python Package\n\non:\n  push:\n    tags:\n      - \"V*\"\n\njobs:\n  deploy:\n    runs-on: ubuntu-latest\n\n    steps:\n      - uses: actions/checkout@v2\n      - uses: actions/setup-python@v2\n      - name: Install pypa/build\n        run: python -m  pip install build --user\n\n      - name: Build a binary wheel and a source tarball\n        run: python -m build --sdist --wheel --outdir dist/ .\n\n      - name: Publish distribution 📦 to PyPI\n        uses: pypa/gh-action-pypi-publish@master\n        with:\n          password: ${{ secrets.PYPI_API_TOKEN }}\n"
  },
  {
    "path": ".gitignore",
    "content": "# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n\n# Distribution / packaging\n.Python\nbuild/\ndevelop-eggs/\ndist/\ndownloads/\neggs/\n.eggs/\nlib/\nlib64/\nparts/\nsdist/\nvar/\nwheels/\npip-wheel-metadata/\nshare/python-wheels/\n*.egg-info/\n.installed.cfg\n*.egg\nMANIFEST\n\n# PyInstaller\n#  Usually these files are written by a python script from a template\n#  before PyInstaller builds the exe, so as to inject date/other infos into it.\n*.manifest\n*.spec\n\n# Installer logs\npip-log.txt\npip-delete-this-directory.txt\n\n# Unit test / coverage reports\nhtmlcov/\n.tox/\n.nox/\n.coverage\n.coverage.*\n.cache\nnosetests.xml\ncoverage.xml\n*.cover\n*.py,cover\n.hypothesis/\n.pytest_cache/\n\n# Translations\n*.mo\n*.pot\n\n# Django stuff:\n*.log\nlocal_settings.py\ndb.sqlite3\ndb.sqlite3-journal\n\n# Flask stuff:\ninstance/\n.webassets-cache\n\n# Scrapy stuff:\n.scrapy\n\n# Sphinx documentation\ndocs/_build/\n\n# PyBuilder\ntarget/\n\n# Jupyter Notebook\n.ipynb_checkpoints\n\n# IPython\nprofile_default/\nipython_config.py\n\n# pyenv\n.python-version\n\n# pipenv\n#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.\n#   However, in case of collaboration, if having platform-specific dependencies or dependencies\n#   having no cross-platform support, pipenv may install dependencies that don't work, or not\n#   install all needed dependencies.\n#Pipfile.lock\n\n# PEP 582; used by e.g. github.com/David-OConnor/pyflow\n__pypackages__/\n\n# Celery stuff\ncelerybeat-schedule\ncelerybeat.pid\n\n# SageMath parsed files\n*.sage.py\n\n# Environments\n.env\n.venv\nenv/\nvenv/\nENV/\nenv.bak/\nvenv.bak/\n\n# Spyder project settings\n.spyderproject\n.spyproject\n\n# Rope project settings\n.ropeproject\n\n# mkdocs documentation\n/site\n\n# mypy\n.mypy_cache/\n.dmypy.json\ndmypy.json\n\n# Pyre type checker\n.pyre/\n.idea/\n.vscode/\nDatacamp/\n"
  },
  {
    "path": "LICENSE",
    "content": "MIT License\n\nCopyright (c) 2020 Mohammad Al-Fetyani\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "README.md",
    "content": "# Datacamp Downloader\n\n[![GitHub license](https://img.shields.io/github/license/TRoboto/datacamp-downloader)](https://github.com/TRoboto/datacamp-downloader/blob/master/LICENSE)\n[![PyPI version](https://badge.fury.io/py/datacamp-downloader.svg)](https://pypi.org/project/datacamp-downloader/)\n[![Documentation Status](https://readthedocs.org/projects/ansicolortags/badge/?version=latest)](https://github.com/TRoboto/datacamp-downloader/blob/master/docs.md)\n\n[![Downloads](https://pepy.tech/badge/datacamp-downloader)](https://pepy.tech/project/datacamp-downloader)\n[![GitHub stars](https://img.shields.io/github/stars/TRoboto/datacamp-downloader)](https://github.com/TRoboto/datacamp-downloader/stargazers)\n[![GitHub forks](https://img.shields.io/github/forks/TRoboto/datacamp-downloader)](https://github.com/TRoboto/datacamp-downloader/network/members)\n[![GitHub contributors](https://img.shields.io/github/contributors/TRoboto/datacamp-downloader)](https://github.com/TRoboto/datacamp-downloader/graphs/contributors)\n\n## Table of Contents\n\n- [Datacamp Downloader](#datacamp-downloader)\n  - [Table of Contents](#table-of-contents)\n  - [Description](#description)\n  - [Installation](#installation)\n    - [PIP](#pip)\n    - [From source](#from-source)\n    - [Autocompletion](#autocompletion)\n  - [Documentation](#documentation)\n  - [Getting Started](#getting-started)\n    - [Login](#login)\n    - [Download](#download)\n  - [User Privacy](#user-privacy)\n  - [Disclaimer](#disclaimer)\n\n## Update\n\nDatacamp Downloader V3.2 is now available. The major change is that the tool now uses selenium for the backend. 
See the changelog for versions [3.0](https://github.com/TRoboto/datacamp-downloader/pull/39), [3.1](https://github.com/TRoboto/datacamp-downloader/pull/42)\nand [3.2](https://github.com/TRoboto/datacamp-downloader/pull/47).\n\n## Description\n\nDatacamp Downloader is a command-line interface tool developed in Python\nto help you download your completed content on [Datacamp](https://datacamp.com)\nand keep it locally on your computer.\n\nDatacamp Downloader helps you download all videos, slides, audios, exercises, transcripts, datasets and subtitles in organized folders.\n\nThe design and development of this tool was inspired by [udacimak](https://github.com/udacimak/udacimak).\n\n**Datacampers!**\n\nIf you find this CLI helpful, please support the developers by starring this repository.\n\n## Installation\n\n### PIP\n\nIf you use pip, you can install datacamp-downloader with:\n\n```\npip install datacamp-downloader\n```\n\n### From source\n\nYou can install the tool directly from this repository with:\n\n```\npip install git+https://github.com/TRoboto/datacamp-downloader.git\n```\n\n### Autocompletion\n\nTo allow command autocompletion with `[TAB][TAB]`, run:\n\n```\ndatacamp --install-completion [bash|zsh|fish|powershell|pwsh]\n```\n\nThen restart the terminal.\n\n**Note:** autocompletion might not be supported by all operating systems.\n\n## Documentation\n\nThe available commands with full documentation can be found in [docs](https://github.com/TRoboto/datacamp-downloader/blob/master/docs.md).\n\n## Getting Started\n\n### Login\n\n- To log in using your username and password, run:\n\n```\ndatacamp login -u [USERNAME] -p [PASSWORD]\n```\n\nor simply run:\n\n```\ndatacamp login\n```\n\n- To log in using your Datacamp authentication token, run:\n\n```\ndatacamp set-token [TOKEN]\n```\n\nThe Datacamp authentication token can be found in your browser's _cookies_ for the Datacamp website.\nTo get your Datacamp authentication token, follow these steps:\n\n**Firefox**\n\n1. 
Visit [datacamp.com](https://datacamp.com) and log in.\n2. Open the **Developer Tools** (press `Cmd + Opt + J` on macOS or `F12` on Windows).\n3. Go to the **Storage** tab, then **Cookies** > `https://www.datacamp.com`\n4. Find the `_dct` key; its **Value** is the Datacamp authentication token.\n\n**Chrome**\n\n1. Visit [datacamp.com](https://datacamp.com) and log in.\n2. Open the **Developer Tools** (press `Cmd + Opt + J` on macOS or `F12` on Windows).\n3. Go to the **Application** tab, then **Storage** > **Cookies** > `https://www.datacamp.com`\n4. Find the `_dct` key; its **Value** is the Datacamp authentication token.\n\n---\n\n**Security Note**\n\nThe Datacamp authentication token is a secret key and is unique to you. **You should not share it publicly**.\n\n---\n\nIf you provided valid credentials, you should see the following:\n\n```\nHi, YOUR_NAME\nActive subscription found\n```\n\n> An active subscription is no longer required.\n\n### Download\n\nFirst, you should list your completed courses/tracks.\n\nTo list your completed **courses**, run:\n\n```\ndatacamp courses\n```\n\nTo list your completed **tracks**, run:\n\n```\ndatacamp tracks\n```\n\nOutput similar to this should appear with your completed courses/tracks:\n\n```\n+--------+------------------------------------------+------------+------------+------------+\n| ID     | Title                                    | Datasets   | Exercises  | Videos     |\n+--------+------------------------------------------+------------+------------+------------+\n| 1      | Introduction to Python                   | 2          | 46         | 11         |\n+--------+------------------------------------------+------------+------------+------------+\n| 2      | Introduction to SQL                      | 1          | 40         | 1          |\n+--------+------------------------------------------+------------+------------+------------+\n| 3      | Intermediate Python                      | 3          | 69         | 18         
|\n+--------+------------------------------------------+------------+------------+------------+\n| 4      | Introduction to Data Science in Python   | 0          | 31         | 13         |\n+--------+------------------------------------------+------------+------------+------------+\n| 5      | Data Science for Everyone                | 0          | 33         | 15         |\n+--------+------------------------------------------+------------+------------+------------+\n| 6      | Joining Data in SQL                      | 3          | 40         | 13         |\n+--------+------------------------------------------+------------+------------+------------+\n| 7      | Data Manipulation with pandas            | 4          | 41         | 15         |\n+--------+------------------------------------------+------------+------------+------------+\n| 8      | Supervised Learning with scikit-learn    | 7          | 37         | 17         |\n+--------+------------------------------------------+------------+------------+------------+\n| 9      | Machine Learning for Everyone            | 0          | 25         | 12         |\n+--------+------------------------------------------+------------+------------+------------+\n| 10     | Python Data Science Toolbox (Part 1)     | 1          | 34         | 12         |\n+--------+------------------------------------------+------------+------------+------------+\n```\n\nNow, you can download any of the courses/tracks with:\n\n```\ndatacamp download id1 id2 id3\n```\n\nFor example, to download the first and second courses, run:\n\n```\ndatacamp download 1 2\n```\n\n- To download all your completed courses, run:\n\n```\ndatacamp download all\n```\n\n- To download all your completed tracks, run:\n\n```\ndatacamp download all-t\n```\n\nBy default, this will download **videos**, **slides**, **datasets**, **exercises**, **English subtitles** and **transcripts** in organized folders in the **current directory**.\n\nTo customize this behavior, see 
the `datacamp download` command in the [docs](https://github.com/TRoboto/datacamp-downloader/blob/master/docs.md).\n\n## User Privacy\n\n`datacamp` saves a session file containing your credentials in the temp folder. If you no longer need the tool, it is preferable to reset the session, which removes the saved file:\n\n```\ndatacamp reset\n```\n\n## Disclaimer\n\nThis CLI is provided to help you download Datacamp courses/tracks for personal use only. Sharing the content of the courses is strictly prohibited under [Datacamp's Terms of Use](https://www.datacamp.com/terms-of-use/).\n\nThe developers of this CLI are not responsible for any legal infringement committed by its users.\n"
  },
  {
    "path": "docs.md",
    "content": "# `datacamp`\n\n**Usage**:\n\n```console\n$ datacamp [OPTIONS] COMMAND [ARGS]...\n```\n\n**Options**:\n\n- `--version`: Show version.\n- `--install-completion`: Install completion for the current shell.\n- `--show-completion`: Show completion for the current shell, to copy it or customize the installation.\n- `--help`: Show this message and exit.\n\n**Commands**:\n\n- `courses`: List your completed courses.\n- `download`: Download courses/tracks given their ids.\n- `login`: Log in to Datacamp using your username and password\n- `reset`: Restart the session.\n- `set-token`: Log in to Datacamp using your token.\n- `tracks`: List your completed tracks.\n\n## `datacamp login`\n\nLog in to Datacamp using your username and password.\n\n**Usage**:\n\n```console\n$ datacamp login [OPTIONS]\n```\n\n**Options**:\n\n- `-u, --username TEXT`: [required]\n- `-p, --password TEXT`: [required]\n- `--help`: Show this message and exit.\n\n## `datacamp set-token`\n\nLog in to Datacamp using your token.\n\n**Usage**:\n\n```console\n$ datacamp set-token [OPTIONS] TOKEN\n```\n\n**Arguments**:\n\n- `TOKEN`: [required]\n\n**Options**:\n\n- `--help`: Show this message and exit.\n\n## `datacamp courses`\n\nList your completed courses.\n\n**Usage**:\n\n```console\n$ datacamp courses [OPTIONS]\n```\n\n**Options**:\n\n- `-r, --refresh`: Refresh completed courses. [default: False]\n- `--help`: Show this message and exit.\n\n## `datacamp tracks`\n\nList your completed tracks.\n\n**Usage**:\n\n```console\n$ datacamp tracks [OPTIONS]\n```\n\n**Options**:\n\n- `-r, --refresh`: Refresh completed tracks. 
[default: False]\n- `--help`: Show this message and exit.\n\n## `datacamp download`\n\nDownload courses/tracks given their IDs.\n\nExample: `datacamp download id1 id2 id3`\n\nTo download all your completed courses, run:\n`datacamp download all`\n\nTo download all your completed tracks, run:\n`datacamp download all-t`\n\n**Usage**:\n\n```console\n$ datacamp download [OPTIONS] IDS...\n```\n\n**Arguments**:\n\n- `IDS...`: IDs for courses/tracks to download or `all` to download all your completed courses or `all-t` to download all your completed tracks. [required]\n\n**Options**:\n\n- `-p, --path DIRECTORY`: Path to the download directory. [default: `current_directory/Datacamp`]\n- `--slides / --no-slides`: Download slides. [default: True]\n- `--datasets / --no-datasets`: Download datasets. [default: True]\n- `--videos / --no-videos`: Download videos. [default: True]\n- `--exercises / --no-exercises`: Download exercises. [default: True]\n- `-st, --subtitles [en|zh|fr|de|it|ja|ko|pt|ru|es|none]`: Choose subtitles to download. [default: en]\n- `--audios / --no-audios`: Download audio files. [default: False]\n- `--scripts, --transcript / --no-scripts, --no-transcript`: Download scripts or transcripts. [default: True]\n- `--python-file / --no-python-file`: Download your own solution as a python file if available. [default: True]\n- `--no-warnings`: Disable warnings. [default: True]\n- `-w, --overwrite`: Overwrite files if they exist. [default: False]\n- `--help`: Show this message and exit.\n\n## `datacamp reset`\n\nRestart the session.\n\n**Usage**:\n\n```console\n$ datacamp reset [OPTIONS]\n```\n\n**Options**:\n\n- `--help`: Show this message and exit.\n"
  },
  {
    "path": "pyproject.toml",
    "content": "[build-system]\nrequires = [\n    \"setuptools\",\n    \"wheel\"\n]\nbuild-backend = \"setuptools.build_meta\""
  },
  {
    "path": "requirements.txt",
    "content": "beautifulsoup4==4.13.5\nrequests==2.32.5\nselenium==4.35.0\nundetected-chromedriver==3.2.1\nwebdriver-manager==4.0.2\ntexttable==1.6.3\ntermcolor==1.1.0\ncolorama==0.4.4\ntomd==0.1.3\ntyper==0.3.2\nsetuptools==80.9.0"
  },
  {
    "path": "setup.py",
    "content": "from setuptools import find_packages, setup\n\nwith open(\"README.md\", \"r\", encoding=\"utf-8\") as fh:\n    long_description = fh.read()\n\nwith open(\"requirements.txt\", \"r\", encoding=\"utf-8\") as fh:\n    required = fh.read().splitlines()\n\nsetup(\n    name=\"datacamp-downloader\",\n        version=\"3.3\",\n    author=\"Mohammad Al-Fetyani\",\n    author_email=\"m4bh@hotmail.com\",\n    description=\"Download your completed courses on Datacamp easily!\",\n    long_description=long_description,\n    long_description_content_type=\"text/markdown\",\n    url=\"https://github.com/TRoboto/datacamp-downloader\",\n    project_urls={\n        \"Bug Tracker\": \"https://github.com/TRoboto/datacamp-downloader/issues\",\n    },\n    classifiers=[\n        \"Programming Language :: Python :: 3\",\n        \"License :: OSI Approved :: MIT License\",\n        \"Operating System :: OS Independent\",\n    ],\n    package_dir={\"\": \"src\"},\n    install_requires=required,\n    setup_requires=[\"setuptools-git\"],\n    packages=find_packages(where=\"src\"),\n    include_package_data=True,\n    python_requires=\">=3.6\",\n    entry_points={\"console_scripts\": [\"datacamp=datacamp_downloader.downloader:app\"]},\n)\n"
  },
  {
    "path": "src/datacamp_downloader/__init__.py",
    "content": "from colorama import init\n\nfrom .session import Session\n\n# use Colorama to make Termcolor work on Windows too\ninit()\n\nactive_session = Session()\ndatacamp = active_session.datacamp\n"
  },
  {
    "path": "src/datacamp_downloader/constants.py",
    "content": "import tempfile\n\nHOME_PAGE = \"https://www.datacamp.com/\"\nLOGIN_URL = \"https://www.datacamp.com/users/sign_in\"\nLOGIN_DETAILS_URL = \"https://www.datacamp.com/api/users/signed_in\"\n\nSESSION_FILE = tempfile.gettempdir() + \"/.datacamp.v3\"\n\nPROFILE_URL = \"https://www.datacamp.com/profile/{slug}\"\nPROFILE_DATA_URL = \"https://www.datacamp.com/api/public/users/{slug}\"\nCOURSE_DETAILS_API = \"https://campus-api.datacamp.com/api/courses/{id}/\"\nEXERCISE_DETAILS_API = \"https://campus-api.datacamp.com/api/exercise/{id}\"\nVIDEO_DETAILS_API = \"https://projector.datacamp.com/api/videos/{hash}\"\nPROGRESS_API = \"https://campus-api.datacamp.com/api/courses/{course_id}/chapters/{chapter_id}/progress\"\n\nLANGMAP = {\n    \"en\": \"English\",\n    \"zh\": \"Chinese simplified\",\n    \"fr\": \"French\",\n    \"de\": \"German\",\n    \"it\": \"Italian\",\n    \"ja\": \"Japanese\",\n    \"ko\": \"Korean\",\n    \"pt\": \"Portuguese\",\n    \"ru\": \"Russian\",\n    \"es\": \"Spanish\",\n}\n"
  },
  {
    "path": "src/datacamp_downloader/datacamp_utils.py",
    "content": "import re\nimport sys\nfrom pathlib import Path\nfrom selenium.webdriver.common.keys import Keys\nfrom selenium.webdriver.common.action_chains import ActionChains\nfrom selenium.webdriver.support.ui import WebDriverWait\nfrom selenium.webdriver.support import expected_conditions as EC\nfrom selenium.webdriver.common.by import By\nimport traceback\n\nfrom bs4 import BeautifulSoup\n\nimport datacamp_downloader.session as session\n\nfrom .constants import (\n    COURSE_DETAILS_API,\n    EXERCISE_DETAILS_API,\n    LANGMAP,\n    LOGIN_DETAILS_URL,\n    LOGIN_URL,\n    PROFILE_DATA_URL,\n    PROGRESS_API,\n    VIDEO_DETAILS_API,\n)\nfrom .helper import (\n    Logger,\n    animate_wait,\n    correct_path,\n    download_file,\n    fix_track_link,\n    get_table,\n    print_progress,\n    save_text,\n)\nfrom .templates.course import Chapter, Course\nfrom .templates.exercise import Exercise\nfrom .templates.track import Track\nfrom .templates.video import Video\n\n\ndef login_required(f):\n    def wrapper(*args, **kwargs):\n        self = args[0]\n        if not isinstance(self, Datacamp):\n            Logger.error(f\"{login_required.__name__} can only decorate Datacamp class.\")\n            return\n        if not self.loggedin:\n            Logger.error(\"Login first!\")\n            return\n        return f(*args, **kwargs)\n\n    return wrapper\n\n\ndef try_except_request(f):\n    def wrapper(*args, **kwargs):\n        self = args[0]\n        if not isinstance(self, Datacamp):\n            Logger.error(\n                f\"{try_except_request.__name__} can only decorate Datacamp class.\"\n            )\n            return\n\n        try:\n            return f(*args, **kwargs)\n        except Exception as e:\n            if str(e):\n                Logger.error(e)\n        return\n\n    return wrapper\n\n\nclass Datacamp:\n    def __init__(self, session: \"session.Session\") -> None:\n\n        self.session = session\n        self.init()\n\n    def 
init(self):\n        self.username = None\n        self.password = None\n        self.token = None\n        self.has_active_subscription = False\n        self.loggedin = False\n        self.login_data = None\n        self.profile_data = None\n\n        self.courses = []\n        self.tracks = []\n\n        self.not_found_courses = set()\n\n\n    @animate_wait\n    @try_except_request\n    def login(self, username, password):\n        # quick guard\n        if username == self.username and self.password == password and self.loggedin:\n            Logger.info(\"Already logged in!\")\n            return\n\n        self.init()\n        self.username = username\n        self.password = password\n\n        # open signin page (this calls self.session.start() internally)\n        req = self.session.get(LOGIN_URL)\n        if not req:\n            Logger.error(\"Cannot access datacamp website!\")\n            return\n\n        try:\n            # Wait for the email input to be present and clickable\n            wd = WebDriverWait(self.session.driver, 15)\n            wd.until(EC.element_to_be_clickable((By.CSS_SELECTOR, \"#user_email\")))\n\n            email = self.session.driver.find_element(By.ID, \"user_email\")\n            email.clear()\n            email.click()\n            email.send_keys(username)\n            Logger.info(\"Filled email\")\n\n        except Exception as e:\n            Logger.error(f\"Cannot find/fill email field: {e}\")\n            # save screenshot for debugging\n            try:\n                self.session.driver.save_screenshot(\"login_error_email.png\")\n            except Exception:\n                pass\n            return\n\n        # Click the next/continue button (try a couple of selectors)\n        try:\n            try:\n                next_button = self.session.driver.find_element(By.XPATH, '//button[@tabindex=\"2\"]')\n            except Exception:\n                # fallback: any submit button in a form\n                
next_button = self.session.driver.find_element(By.CSS_SELECTOR, \"button[type='submit'], input[type='submit']\")\n            next_button.click()\n        except Exception as e:\n            Logger.error(f\"Cannot click next/continue button: {e}\")\n            try:\n                self.session.driver.save_screenshot(\"login_error_next.png\")\n            except Exception:\n                pass\n            return\n\n        # Wait for password input to be clickable\n        try:\n            wd = WebDriverWait(self.session.driver, 15)\n            password_field = wd.until(EC.element_to_be_clickable((By.ID, \"user_password\")))\n        except Exception as e:\n            Logger.error(f\"Password field not found or not clickable (maybe SSO-only login?): {e}\")\n            try:\n                self.session.driver.save_screenshot(\"login_error_no_password.png\")\n            except Exception:\n                pass\n            return\n\n        # Try to enter password robustly: ActionChains -> direct send_keys -> JS fallback\n        try:\n            # ActionChains to focus and type\n            ActionChains(self.session.driver).move_to_element(password_field).click().send_keys(password).perform()\n            Logger.info(\"Password typed via ActionChains\")\n        except Exception as e1:\n            try:\n                password_field.clear()\n                password_field.send_keys(password)\n                Logger.info(\"Password typed via send_keys\")\n            except Exception as e2:\n                # Last resort: set value via JS\n                try:\n                    self.session.driver.execute_script(\"arguments[0].value = arguments[1]; arguments[0].dispatchEvent(new Event('input'));\", password_field, password)\n                    Logger.info(\"Password set via JS\")\n                except Exception as e3:\n                    Logger.error(\"Cannot type password into the field. 
Details:\\n\" + \"\\n\".join(map(str, [e1, e2, e3])))\n                    try:\n                        self.session.driver.save_screenshot(\"login_error_password.png\")\n                    except Exception:\n                        pass\n                    return\n\n        # Submit the form (try button or ENTER)\n        try:\n            # Try to find the submit button\n            try:\n                submit_button = self.session.driver.find_element(By.XPATH, '//input[@tabindex=\"4\"]')\n                submit_button.click()\n            except Exception:\n                # fallback: hit Enter on password field\n                password_field.send_keys(Keys.RETURN)\n            Logger.info(\"Submitted login form, waiting for result...\")\n        except Exception as e:\n            Logger.error(f\"Cannot submit login form: {e}\")\n            try:\n                self.session.driver.save_screenshot(\"login_error_submit.png\")\n            except Exception:\n                pass\n            return\n\n        # wait for page to load and check result\n        try:\n            # wait for either the profile element, or error/flash messages\n            WebDriverWait(self.session.driver, 10).until(\n                lambda d: \"/users/sign_up\" not in d.page_source and \"Invalid\" not in d.page_source\n            )\n        except Exception:\n            # Not a fatal error here, proceed to check token / page content\n            pass\n\n        # obtain token cookie if login succeeded\n        try:\n            token_cookie = self.session.driver.get_cookie(\"_dct\")\n            if not token_cookie:\n                Logger.error(\"Login did not produce a _dct cookie (likely login failed or SSO-only).\")\n                try:\n                    self.session.driver.save_screenshot(\"login_no_token.png\")\n                except Exception:\n                    pass\n                return\n            self.token = token_cookie[\"value\"]\n            
self._set_profile()\n            Logger.info(\"Login flow completed\")\n        except Exception as e:\n            Logger.error(\"Error after login attempt: \" + str(e))\n            try:\n                self.session.driver.save_screenshot(\"login_error_final.png\")\n            except Exception:\n                pass\n            return\n\n\n    @animate_wait\n    @try_except_request\n    def set_token(self, token):\n        if self.token == token and self.loggedin:\n            Logger.info(\"Already logged in!\")\n            return\n\n        self.init()\n        self.session.start()\n\n        self.token = token\n        self.session.add_token(token)\n        self._set_profile()\n\n    def get_profile_data(self):\n        if not self.profile_data:\n            self.profile_data = self.session.get_json(\n                PROFILE_DATA_URL.format(slug=self.login_data[\"slug\"])\n            )\n            self.session.driver.minimize_window()\n        return self.profile_data\n\n    @login_required\n    @animate_wait\n    def list_completed_tracks(self, refresh):\n        table = get_table()\n        table.set_cols_width([6, 40, 10])\n        table.add_row([\"ID\", \"Title\", \"Courses\"])\n        table_so_far = table.draw()\n        Logger.clear_and_print(table_so_far)\n        for track in self.get_completed_tracks(refresh):\n            table.add_row([track.id, track.title, len(track.courses)])\n            table_str = table.draw()\n            Logger.clear_and_print(table_str.replace(table_so_far, \"\").strip())\n            table_so_far = table_str\n\n    @login_required\n    @animate_wait\n    def list_completed_courses(self, refresh):\n        table = get_table()\n        table.set_cols_width([6, 40, 10, 10, 10])\n        table.add_row([\"ID\", \"Title\", \"Datasets\", \"Exercises\", \"Videos\"])\n        table_so_far = table.draw()\n        Logger.clear_and_print(table_so_far)\n        for i, course in enumerate(self.get_completed_courses(refresh), 1):\n 
           all_exercises_count = sum([c.nb_exercises for c in course.chapters])\n            videos_count = sum([c.number_of_videos for c in course.chapters])\n            course.order = i\n            table.add_row(\n                [\n                    i,\n                    course.title,\n                    len(course.datasets),\n                    all_exercises_count - videos_count,\n                    videos_count,\n                ]\n            )\n            table_str = table.draw()\n            Logger.clear_and_print(table_str.replace(table_so_far, \"\").strip())\n            table_so_far = table_str\n\n    @login_required\n    def download(self, ids, directory, **kwargs):\n        self.overwrite = kwargs.get(\"overwrite\")\n        if \"all-t\" in ids:\n            if not self.tracks:\n                Logger.error(\n                    \"No tracks to download! Maybe run `datacamp tracks` first!\"\n                )\n                return\n            to_download = self.tracks\n        elif \"all\" in ids:\n            if not self.courses:\n                Logger.error(\n                    \"No courses to download! Maybe run `datacamp courses` first!\"\n                )\n                return\n            to_download = self.courses\n        else:\n            to_download = []\n            for id in ids:\n                if \"t\" in id:\n                    track = self.get_track(id)\n                    if not track:\n                        Logger.warning(f\"Track {id} is not fetched. Ignoring it.\")\n                        continue\n                    to_download.append(track)\n                elif id.isnumeric():\n                    course = self.get_course_by_order(int(id))\n                    if not course:\n                        Logger.warning(f\"Course {id} is not fetched. 
Ignoring it.\")\n                        continue\n                    to_download.append(course)\n\n        if not to_download:\n            Logger.error(\"No courses/tracks to download!\")\n            return\n\n        path = Path(directory) if not isinstance(directory, Path) else directory\n\n        self.session.start()\n        self.session.driver.minimize_window()\n\n        for i, material in enumerate(to_download, 1):\n            if not material:\n                continue\n            Logger.info(\n                f\"[{i}/{len(to_download)}] Start to download ({material.id}) {material.title}\"\n            )\n            if isinstance(material, Course):\n                self.download_course(material, path, **kwargs)\n            else:\n                self.download_track(material, path, **kwargs)\n\n    def download_normal_exercise(\n        self, exercise: Exercise, path: Path, include_last_attempt: bool = False\n    ):\n        save_text(path, str(exercise), self.overwrite)\n        if include_last_attempt and exercise.is_python and exercise.last_attempt:\n            save_text(\n                path.parent / (path.name[:-3] + f\".py\"),\n                exercise.last_attempt,\n                self.overwrite,\n            )\n        subexs = exercise.data.subexercises\n        if subexs:\n            for i, subexercise in enumerate(subexs, 1):\n                exercise = self._get_exercise(subexercise)\n                self.download_normal_exercise(\n                    exercise, path.parent / (path.name[:-3] + f\"_sub{i}.md\")\n                )\n\n    def download_track(self, track: Track, path: Path, **kwargs):\n        path = path / correct_path(track.title)\n        for i, course in enumerate(track.courses, 1):\n            Logger.info(\n                f\"[{i}/{len(track.courses)}] Download ({course.id}) {course.title} from ({track.title} Track)\"\n            )\n            self.download_course(course, path, f\"{i}-\", **kwargs)\n\n    def 
download_course(self, course: Course, path: Path, index=\"\", **kwargs):\n        download_path = path / (\n            index + correct_path(course.slug or course.title.lower().replace(\" \", \"-\"))\n        )\n        if kwargs.get(\"datasets\") and course.datasets:\n            for i, dataset in enumerate(course.datasets, 1):\n                print_progress(i, len(course.datasets), f\"datasets\")\n                if dataset.asset_url:\n                    download_file(\n                        dataset.asset_url,\n                        download_path\n                        / \"datasets\"\n                        / correct_path(dataset.asset_url.split(\"/\")[-1]),\n                        False,\n                        overwrite=self.overwrite,\n                    )\n            sys.stdout.write(\"\\n\")\n        for chapter in course.chapters:\n            cpath = download_path / self._get_chapter_name(chapter)\n            if kwargs.get(\"slides\") and chapter.slides_link:\n                download_file(\n                    chapter.slides_link,\n                    cpath / correct_path(chapter.slides_link.split(\"/\")[-1]),\n                    overwrite=self.overwrite,\n                )\n            if (\n                kwargs.get(\"exercises\")\n                or kwargs.get(\"videos\")\n                or kwargs.get(\"audios\")\n                or kwargs.get(\"scripts\")\n            ):\n                self.download_others(course.id, chapter, cpath, **kwargs)\n\n    def download_others(self, course_id, chapter: Chapter, path: Path, **kwargs):\n        exercises = kwargs.get(\"exercises\")\n        videos = kwargs.get(\"videos\")\n        audios = kwargs.get(\"audios\")\n        scripts = kwargs.get(\"scripts\")\n        subtitles = kwargs.get(\"subtitles\")\n        last_attempt = kwargs.get(\"last_attempt\")\n        ids = self._get_exercises_ids(course_id, chapter.id)\n        last_attempts = self.get_exercises_last_attempt(course_id, 
chapter.id)\n        exercise_counter = 1\n        video_counter = 1\n        for i, id in enumerate(ids, 1):\n            print_progress(i, len(ids), f\"chapter {chapter.number}\")\n            exercise = self._get_exercise(id)\n            if not exercise:\n                continue\n            # Attach the last attempt only after confirming the exercise was fetched.\n            exercise.last_attempt = (last_attempts or {}).get(id)\n            if exercises and not exercise.is_video:\n                self.download_normal_exercise(\n                    exercise,\n                    path / \"exercises\" / f\"ex{exercise_counter}.md\",\n                    last_attempt,\n                )\n                exercise_counter += 1\n            if exercise.is_video:\n                video = self._get_video(exercise.data.get(\"projector_key\"))\n                if not video:\n                    continue\n                video_path = path / \"videos\" / f\"ch{chapter.number}_{video_counter}\"\n                if videos and video.video_mp4_link:\n                    download_file(\n                        video.video_mp4_link,\n                        video_path.with_suffix(\".mp4\"),\n                        overwrite=self.overwrite,\n                    )\n                if audios and video.audio_link:\n                    download_file(\n                        video.audio_link,\n                        path / \"audios\" / f\"ch{chapter.number}_{video_counter}.mp3\",\n                        False,\n                        overwrite=self.overwrite,\n                    )\n                if scripts and video.script_link:\n                    download_file(\n                        video.script_link,\n                        path / \"scripts\" / (video_path.name + \"_script.md\"),\n                        False,\n                        overwrite=self.overwrite,\n                    )\n                if subtitles and video.subtitles:\n                    for sub in subtitles:\n                        subtitle = self._get_subtitle(sub, video)\n                      
  if not subtitle:\n                            continue\n                        download_file(\n                            subtitle.link,\n                            video_path.parent / (video_path.name + f\"_{sub}.vtt\"),\n                            False,\n                            overwrite=self.overwrite,\n                        )\n                video_counter += 1\n            print_progress(i, len(ids), f\"chapter {chapter.number}\")\n        sys.stdout.write(\"\\n\")\n\n    def get_completed_tracks(self, refresh=False):\n        if self.tracks and not refresh:\n            yield from self.tracks\n            return\n\n        self.tracks = []\n\n        data = self.get_profile_data()\n        completed_tracks = data[\"completed_tracks\"]\n        for i, track in enumerate(completed_tracks, 1):\n            self.tracks.append(Track(f\"t{i}\", track[\"title\"].strip(), track[\"url\"]))\n        all_courses = set()\n        # add courses\n        for track in self.tracks:\n            courses = list(self._get_courses_from_link(fix_track_link(track.link)))\n            if not courses:\n                continue\n            track.courses = courses\n            all_courses.update(track.courses)\n            yield track\n        # add to courses\n        current_ids = [c.id for c in self.courses]\n        for course in all_courses:\n            if course.id not in current_ids:\n                self.courses.append(course)\n\n        self.session.save()\n\n    def get_completed_courses(self, refresh=False):\n        if self.courses and not refresh:\n            yield from self.courses\n            return\n\n        self.courses = []\n\n        data = self.get_profile_data()\n        completed_courses = data[\"completed_courses\"]\n        for course in completed_courses:\n            fetched_course = self.get_course(course[\"id\"])\n            if not fetched_course:\n                continue\n            self.session.driver.minimize_window()\n            
self.courses.append(fetched_course)\n            yield fetched_course\n\n        if not self.courses:\n            return []\n\n        self.session.save()\n\n    def get_course(self, id):\n        if id in self.not_found_courses:\n            return\n        for course in self.courses:\n            if course.id == id:\n                return course\n        return self._get_course(id)\n\n    def get_course_by_order(self, order):\n        for course in self.courses:\n            if course.order == order and course.id not in self.not_found_courses:\n                return course\n\n    @try_except_request\n    def get_exercises_last_attempt(self, course_id, chapter_id):\n        data = self.session.get_json(\n            PROGRESS_API.format(course_id=course_id, chapter_id=chapter_id)\n        )\n        if \"error\" in data:\n            raise ValueError(\n                f\"Cannot get exercises for course {course_id}, chapter {chapter_id}.\"\n            )\n        last_attempt = {e[\"exercise_id\"]: e[\"last_attempt\"] for e in data}\n        return last_attempt\n\n    def get_track(self, id):\n        for track in self.tracks:\n            if track.id == id:\n                return track\n\n    @try_except_request\n    def _get_courses_from_link(self, link: str):\n        html = self.session.get(link)\n        self.session.driver.minimize_window()\n\n        soup = BeautifulSoup(html, \"html.parser\")\n        courses_ids = soup.findAll(\"article\", {\"class\": re.compile(\"^js-async\")})\n        for i, id_tag in enumerate(courses_ids, 1):\n            id = id_tag.get(\"data-id\")\n            if not id:\n                continue\n            course = self.get_course(int(id))\n            if course:\n                yield course\n\n    def _get_chapter_name(self, chapter: Chapter):\n        if chapter.title and chapter.title_meta:\n            return correct_path(chapter.slug)\n        if chapter.title:\n            return correct_path(\n                
f\"chapter-{chapter.number}-{chapter.title.replace(' ', '-').lower()}\"\n            )\n        return f\"chapter-{chapter.number}\"\n\n    def _set_profile(self):\n        try:\n            data = self.session.get_json(LOGIN_DETAILS_URL)\n        except Exception as e:\n            Logger.error(\"Incorrect input token!\")\n            return\n\n        Logger.info(\"Hi, \" + (data.get(\"first_name\") or data.get(\"last_name\") or data.get(\"email\")))\n\n        # New API: 'has_active_subscription' may not exist anymore\n        has_sub = False\n        if \"has_active_subscription\" in data:\n            has_sub = data[\"has_active_subscription\"]\n        elif \"active_products\" in data:\n            has_sub = len(data[\"active_products\"]) > 0\n\n        if has_sub:\n            Logger.info(\"Active subscription found\")\n        else:\n            Logger.warning(\"No active subscription found\")\n\n        self.loggedin = True\n        self.login_data = data\n        self.has_active_subscription = has_sub\n\n        self.session.save()\n\n    def _get_subtitle(self, sub, video: Video):\n        if not LANGMAP.get(sub):\n            return\n        for subtitle in video.subtitles:\n            if subtitle.language == LANGMAP[sub]:\n                return subtitle\n\n    @try_except_request\n    def _get_video(self, id):\n        if not id:\n            raise ValueError(\"ID tag not found.\")\n        res = self.session.get_json(VIDEO_DETAILS_API.format(hash=id))\n        if \"error\" in res:\n            raise ValueError()\n        return Video(**res)\n\n    @try_except_request\n    def _get_exercises_ids(self, course_id, chapter_id):\n        if not course_id or not chapter_id:\n            raise ValueError(\"ID tags not found.\")\n        data = self.session.get_json(\n            PROGRESS_API.format(course_id=course_id, chapter_id=chapter_id)\n        )\n        if \"error\" in data:\n            raise ValueError(\n                f\"Cannot get exercises 
for course {course_id}, chapter {chapter_id}.\"\n            )\n        ids = [e[\"exercise_id\"] for e in data]\n        return ids\n\n    @try_except_request\n    def _get_exercise(self, id):\n        if not id:\n            raise ValueError(\"ID tag not found.\")\n        res = self.session.get_json(EXERCISE_DETAILS_API.format(id=id))\n        if \"error\" in res:\n            raise ValueError(f\"Cannot get exercise with id: {id}.\")\n        return Exercise(**res)\n\n    @try_except_request\n    def _get_course(self, id):\n        if not id:\n            self.not_found_courses.add(id)\n            raise ValueError(\"ID tag not found.\")\n        res = self.session.get_json(COURSE_DETAILS_API.format(id=id))\n        if \"error\" in res:\n            self.not_found_courses.add(id)\n            raise ValueError()\n\n        # Normalize time field\n        time_needed = res.get(\"time_needed\")\n        if not time_needed and res.get(\"time_needed_in_hours\") is not None:\n            time_needed = f\"{res['time_needed_in_hours']} hours\"\n        elif not time_needed and res.get(\"duration_minutes\") is not None:\n            hours = res[\"duration_minutes\"] / 60\n            time_needed = f\"{hours:.1f} hours\"\n\n        return Course(\n            id=res[\"id\"],\n            title=res[\"title\"],\n            description=res.get(\"description\", \"\"),\n            slug=res.get(\"slug\"),\n            datasets=res.get(\"datasets\", []),\n            chapters=res.get(\"chapters\", []),\n            time_needed=time_needed,\n        )\n\n"
  },
  {
    "path": "src/datacamp_downloader/downloader.py",
    "content": "import os\nfrom pathlib import Path\nfrom typing import List, Optional\n\nimport typer\n\nfrom . import active_session, datacamp\nfrom .helper import Logger\nfrom .templates.lang import Language\n\n__version__ = \"3.3.0\"\n\n\ndef version_callback(value: bool):\n    if value:\n        typer.echo(f\"Datacamp Downloader CLI Version: {__version__}\")\n        raise typer.Exit()\n\n\ndef main(\n    version: Optional[bool] = typer.Option(\n        None,\n        \"--version\",\n        callback=version_callback,\n        is_eager=True,\n        help=\"Show version.\",\n    ),\n):\n    pass\n\n\napp = typer.Typer(callback=main)\n\n\n@app.command()\ndef login(\n    username: str = typer.Option(..., \"-u\", \"--username\", prompt=True),\n    password: str = typer.Option(..., \"-p\", \"--password\", prompt=True, hide_input=True),\n):\n    \"\"\"Log in to Datacamp using your username and password.\"\"\"\n    datacamp.login(username, password)\n\n\n@app.command()\ndef set_token(token: str = typer.Argument(...)):\n    \"\"\"Log in to Datacamp using your token.\"\"\"\n    datacamp.set_token(token)\n\n\n@app.command()\ndef tracks(\n    refresh: Optional[bool] = typer.Option(\n        False, \"--refresh\", \"-r\", is_flag=True, help=\"Refresh completed tracks.\"\n    )\n):\n    \"\"\"List your completed tracks.\"\"\"\n    datacamp.list_completed_tracks(refresh)\n\n\n@app.command()\ndef courses(\n    refresh: Optional[bool] = typer.Option(\n        False, \"--refresh\", \"-r\", is_flag=True, help=\"Refresh completed courses.\"\n    )\n):\n    \"\"\"List your completed courses.\"\"\"\n    datacamp.list_completed_courses(refresh)\n\n\n@app.command()\ndef download(\n    ids: List[str] = typer.Argument(\n        ...,\n        help=\"IDs for courses/tracks to download or `all` to download all your completed courses or `all-t` to download all your completed tracks.\",\n    ),\n    path: Path = typer.Option(\n        Path(os.getcwd() + \"/Datacamp\"),\n        
\"--path\",\n        \"-p\",\n        help=\"Path to the download directory.\",\n        dir_okay=True,\n        file_okay=False,\n    ),\n    slides: Optional[bool] = typer.Option(\n        True,\n        \"--slides/--no-slides\",\n        help=\"Download slides.\",\n    ),\n    datasets: Optional[bool] = typer.Option(\n        True,\n        \"--datasets/--no-datasets\",\n        help=\"Download datasets.\",\n    ),\n    videos: Optional[bool] = typer.Option(\n        True,\n        \"--videos/--no-videos\",\n        help=\"Download videos.\",\n    ),\n    exercises: Optional[bool] = typer.Option(\n        True,\n        \"--exercises/--no-exercises\",\n        help=\"Download exercises.\",\n    ),\n    subtitles: Optional[List[Language]] = typer.Option(\n        [Language.EN.value],\n        \"--subtitles\",\n        \"-st\",\n        help=\"Choose subtitles to download.\",\n        case_sensitive=False,\n    ),\n    audios: Optional[bool] = typer.Option(\n        False,\n        \"--audios/--no-audios\",\n        help=\"Download audio files.\",\n    ),\n    scripts: Optional[bool] = typer.Option(\n        True,\n        \"--scripts/--no-scripts\",\n        \"--transcript/--no-transcript\",\n        show_default=True,\n        help=\"Download scripts or transcripts.\",\n    ),\n    python_file: Optional[bool] = typer.Option(\n        True,\n        \"--python-file/--no-python-file\",\n        show_default=True,\n        help=\"Download your own solution as a python file if available.\",\n    ),\n    warnings: Optional[bool] = typer.Option(\n        True,\n        \"--no-warnings\",\n        flag_value=False,\n        is_flag=True,\n        help=\"Disable warnings.\",\n    ),\n    overwrite: Optional[bool] = typer.Option(\n        False,\n        \"--overwrite\",\n        \"-w\",\n        flag_value=True,\n        is_flag=True,\n        help=\"Overwrite files if exist.\",\n    ),\n):\n    \"\"\"Download courses/tracks given their ids.\n\n    Example: `datacamp 
download id1 id2 id3`\\n\n    To download all your completed courses run:\n    \\t`datacamp download all`\\n\n    To download all your completed tracks run:\n    \\t`datacamp download all-t`\n    \"\"\"\n    Logger.show_warnings = warnings\n    datacamp.download(\n        ids,\n        path,\n        slides=slides,\n        datasets=datasets,\n        videos=videos,\n        exercises=exercises,\n        subtitles=subtitles,\n        audios=audios,\n        scripts=scripts,\n        overwrite=overwrite,\n        last_attempt=python_file,\n    )\n\n\n@app.command()\ndef reset():\n    \"\"\"Restart the session.\"\"\"\n    active_session.reset()\n"
  },
  {
    "path": "src/datacamp_downloader/helper.py",
    "content": "import itertools\nimport re\nimport sys\nimport threading\nimport time\nfrom pathlib import Path\n\nimport requests\nfrom termcolor import colored\nfrom texttable import Texttable\n\n\nclass Logger:\n    show_warnings = True\n    is_writing = False\n\n    @classmethod\n    def error(cls, text):\n        Logger.print(text, \"ERROR:\", \"red\")\n\n    @classmethod\n    def clear(cls):\n        sys.stdout.write(\"\\r\" + \" \" * 100 + \"\\r\")\n\n    @classmethod\n    def warning(cls, text):\n        if cls.show_warnings:\n            Logger.print(text, \"WARNING:\", \"yellow\")\n\n    @classmethod\n    def info(cls, text):\n        Logger.print(text, \"INFO:\", \"green\")\n\n    @classmethod\n    def print(cls, text, head, color=None, background=None, end=\"\\n\"):\n        cls.is_writing = True\n        Logger.clear()\n        print(colored(f\"{head}\", color, background), text, end=end, flush=True)\n        cls.is_writing = False\n\n    @classmethod\n    def clear_and_print(cls, text):\n        cls.is_writing = True\n        Logger.clear()\n        print(text, flush=True)\n        cls.is_writing = False\n\n\ndef get_table():\n    table = Texttable()\n    return table\n\n\ndef animate_wait(f):\n    done = False\n\n    def animate():\n        for c in itertools.cycle(list(\"/—\\|\")):\n            if done:\n                Logger.clear()\n                break\n            if not Logger.is_writing:\n                print(\"\\rPlease wait \" + c, end=\"\", flush=True)\n            time.sleep(0.1)\n\n    def wrapper(*args):\n        nonlocal done\n        done = False\n        t = threading.Thread(target=animate)\n        t.daemon = True\n        t.start()\n        output = f(*args)\n        done = True\n        return output\n\n    return wrapper\n\n\ndef correct_path(path: str):\n    return re.sub(\"[^-a-zA-Z0-9_.() /]+\", \"\", path)\n\n\ndef download_file(link: str, path: Path, progress=True, max_retry=10, overwrite=False):\n    # start = 
time.clock()\n    if not overwrite and path.exists():\n        Logger.warning(f\"{path.absolute()} is already downloaded\")\n        return\n\n    # Retry the request up to max_retry times; response stays None if every attempt fails.\n    response = None\n    for i in range(max_retry):\n        try:\n            response = requests.get(link, stream=True)\n            break\n        except Exception:\n            Logger.print(\"\", f\"Retry [{i+1}/{max_retry}]\", \"magenta\", end=\"\")\n\n    if response is None:\n        Logger.error(f\"Failed to download {link}\")\n        return\n\n    path.parent.mkdir(exist_ok=True, parents=True)\n    total_length = response.headers.get(\"content-length\")\n\n    with path.open(\"wb\") as f:\n        if total_length is None:  # no content length header\n            f.write(response.content)\n        else:\n            dl = 0\n            total_length = int(total_length)\n            for data in response.iter_content(chunk_size=1024 * 1024):  # 1MB\n                dl += len(data)\n                f.write(data)\n                if progress:\n                    print_progress(dl, total_length, path.name)\n    if progress:\n        sys.stdout.write(\"\\n\")\n\n\ndef print_progress(progress, total, name, max=50):\n    # done is the number of filled bar cells out of max; the percentage is derived from it.\n    done = int(max * progress / total)\n    Logger.print(\n        \"[%s%s] %d%%\" % (\"=\" * done, \" \" * (max - done), done * 100 // max),\n        f\"Downloading [{name}]\",\n        \"blue\",\n        end=\"\\r\",\n    )\n    sys.stdout.flush()\n\n\ndef save_text(path: Path, content: str, overwrite=False):\n    # Reject paths that exist but are not regular files (e.g. directories).\n    if path.exists() and not path.is_file():\n        Logger.error(f\"{path.absolute()} isn't a file\")\n        return\n    if not overwrite and path.exists():\n        Logger.warning(f\"{path.absolute()} is already downloaded\")\n        return\n    path.parent.mkdir(exist_ok=True, parents=True)\n    path.write_text(content, encoding=\"utf8\")\n    # Logger.info(f\"{path.name} has been saved.\")\n\n\ndef fix_track_link(link):\n    if \"?\" in link:\n        link += \"&embedded=true\"\n    else:\n        link += \"?embedded=true\"\n    
return link\n"
  },
  {
    "path": "src/datacamp_downloader/session.py",
    "content": "import json\nimport os\nimport pickle\nimport json\nfrom webdriver_manager.chrome import ChromeDriverManager\nimport re\nfrom bs4 import BeautifulSoup\nimport os\nfrom pathlib import Path\n\n# Prefer top-level undetected_chromedriver (works with Selenium 4); fallback to v2.\ntry:\n    import undetected_chromedriver as uc\nexcept Exception:\n    import undetected_chromedriver.v2 as uc\n\n# Selenium helper imports (we use these to create Service/options safely)\nfrom selenium import webdriver\nfrom selenium.webdriver.chrome.service import Service as ChromeService\nfrom selenium.webdriver.chrome.options import Options as ChromeOptions\nfrom selenium.webdriver.common.by import By\nfrom selenium.webdriver.remote.webelement import WebElement\nfrom selenium.webdriver.support import expected_conditions as EC\nfrom selenium.webdriver.support.ui import WebDriverWait\n\nfrom .constants import HOME_PAGE, SESSION_FILE\nfrom .datacamp_utils import Datacamp\n\n\nclass Session:\n    def __init__(self) -> None:\n        self.savefile = Path(SESSION_FILE)\n        self.datacamp = self.load_datacamp()\n\n    def save(self):\n        self.datacamp.session = None\n        pickled = pickle.dumps(self.datacamp)\n        self.savefile.write_bytes(pickled)\n\n    def load_datacamp(self):\n        if self.savefile.exists():\n            datacamp = pickle.load(self.savefile.open(\"rb\"))\n            datacamp.session = self\n            return datacamp\n        return Datacamp(self)\n\n    def reset(self):\n        try:\n            os.remove(SESSION_FILE)\n        except:\n            pass\n\n    def _setup_driver(self, headless=True):\n        try:\n            options = uc.ChromeOptions()\n        except Exception:\n            options = ChromeOptions()\n\n        try:\n            options.headless = headless\n        except Exception:\n            if headless:\n                options.add_argument(\"--headless=new\")\n\n        # existing flags...\n        
options.add_argument(\"--no-first-run\")\n        options.add_argument(\"--no-service-autorun\")\n        options.add_argument(\"--password-store=basic\")\n        options.add_argument(\"--disable-extensions\")\n        options.add_argument(\"--disable-browser-side-navigation\")\n        options.add_argument(\"--disable-infobars\")\n        options.add_argument(\"--disable-popup-blocking\")\n        options.add_argument(\"--disable-gpu\")\n        options.add_argument(\"--disable-notifications\")\n        options.add_argument(\"--content-shell-hide-toolbar\")\n        options.add_argument(\"--top-controls-hide-threshold\")\n        options.add_argument(\"--force-app-mode\")\n        options.add_argument(\"--hide-scrollbars\")\n        options.add_argument(\"--no-sandbox\")\n        options.add_argument(\"--disable-dev-shm-usage\")\n\n        # get the absolute path of the installed package\n        package_dir = os.path.dirname(os.path.abspath(__file__))\n        \n        # create a chrome profile folder inside the package directory\n        profile_dir = os.path.join(package_dir, \"dc_chrome_profile\")\n\n        # make sure it exists\n        os.makedirs(profile_dir, exist_ok=True)\n\n        # tell Chrome to use it\n        options.add_argument(f\"--user-data-dir={profile_dir}\")\n\n\n        service = ChromeService(executable_path=ChromeDriverManager().install())\n        try:\n            self.driver = uc.Chrome(service=service, options=options)\n            return\n        except Exception:\n            self.driver = webdriver.Chrome(service=service, options=options)\n\n    def start(self, headless=False):\n        if hasattr(self, \"driver\"):\n            return\n        self._setup_driver(headless)\n        self.driver.get(HOME_PAGE)\n        self.bypass_cloudflare(HOME_PAGE)\n        if self.datacamp.token:\n            self.add_token(self.datacamp.token)\n\n    def bypass_cloudflare(self, url):\n        try:\n            
self.get_element_by_id(\"cf-spinner-allow-5-secs\")\n            with self.driver:\n                self.driver.get(url)\n        except Exception:\n            pass\n\n    def get(self, url):\n        self.start()\n        self.driver.get(url)\n        self.bypass_cloudflare(url)\n        return self.driver.page_source\n\n    def get_json(self, url):\n        page = self.get(url).strip()\n\n        # When the browser wraps the response in a <pre> element, keep only its\n        # text; otherwise assume the page is already raw JSON.\n        soup = BeautifulSoup(page, \"html.parser\")\n        pre = soup.find(\"pre\")\n        if pre:\n            page = pre.text\n\n        return json.loads(page)\n\n    def to_json(self, page: str):\n        return json.loads(page)\n\n    def get_element_by_id(self, id: str) -> WebElement:\n        return self.driver.find_element(By.ID, id)\n\n    def get_element_by_xpath(self, xpath: str) -> WebElement:\n        return self.driver.find_element(By.XPATH, xpath)\n\n    def click_element(self, id: str):\n        self.get_element_by_id(id).click()\n\n    def wait_for_element_by_css_selector(self, *css: str, timeout: int = 10):\n        WebDriverWait(self.driver, timeout).until(\n            EC.visibility_of_any_elements_located((By.CSS_SELECTOR, \",\".join(css)))\n        )\n\n    def add_token(self, token: str):\n        cookie = {\n            \"name\": \"_dct\",\n            \"value\": token,\n            \"domain\": \".datacamp.com\",\n            \"secure\": True,\n        }\n        self.driver.add_cookie(cookie)\n        return self\n"
  },
  {
    "path": "src/datacamp_downloader/templates/course.py",
    "content": "# Generated by https://quicktype.io\n\nfrom enum import Enum\nfrom typing import Any, List, Optional\n\n\nclass TypeEnum(Enum):\n    MULTIPLE_CHOICE_EXERCISE = \"MultipleChoiceExercise\"\n    NORMAL_EXERCISE = \"NormalExercise\"\n    VIDEO_EXERCISE = \"VideoExercise\"\n\n\nclass Exercise:\n    type: TypeEnum\n    title: str\n    aggregate_xp: int\n    number: int\n    url: str\n\n    def __init__(\n        self,\n        type: TypeEnum,\n        title: str,\n        aggregate_xp: int,\n        number: int,\n        url: str,\n        **kwargs\n    ) -> None:\n        self.type = type\n        self.title = title\n        self.aggregate_xp = aggregate_xp\n        self.number = number\n        self.url = url\n\n\nclass Chapter:\n    id: int\n    title_meta: str\n    title: str\n    description: str\n    number: int\n    slug: str\n    nb_exercises: int\n    badge_completed_url: str\n    badge_uncompleted_url: str\n    last_updated_on: str\n    slides_link: str\n    free_preview: Optional[bool]\n    xp: int\n    number_of_videos: int\n    exercises: List[Exercise]\n\n    def __init__(\n        self,\n        id: int,\n        title_meta: str,\n        title: str,\n        description: str,\n        number: int,\n        slug: str,\n        nb_exercises: int,\n        badge_completed_url: str,\n        badge_uncompleted_url: str,\n        last_updated_on: str,\n        slides_link: str,\n        free_preview: Optional[bool],\n        xp: int,\n        number_of_videos: int,\n        exercises: List[Exercise],\n        **kwargs\n    ) -> None:\n        self.id = id\n        self.title_meta = title_meta\n        self.title = title\n        self.description = description\n        self.number = number\n        self.slug = slug\n        self.nb_exercises = nb_exercises\n        self.badge_completed_url = badge_completed_url\n        self.badge_uncompleted_url = badge_uncompleted_url\n        self.last_updated_on = last_updated_on\n        self.slides_link = 
slides_link\n        self.free_preview = free_preview\n        self.xp = xp\n        self.number_of_videos = number_of_videos\n        self.exercises = [Exercise(**c) for c in exercises]\n\n\nclass Collaborator:\n    avatar_url: str\n    full_name: str\n\n    def __init__(self, avatar_url: str, full_name: str) -> None:\n        self.avatar_url = avatar_url\n        self.full_name = full_name\n\n\nclass Dataset:\n    asset_url: str\n    name: str\n\n    def __init__(self, asset_url: str, name: str) -> None:\n        self.asset_url = asset_url\n        self.name = name\n\n\nclass Instructor:\n    id: int\n    marketing_biography: str\n    biography: str\n    avatar_url: str\n    full_name: str\n    instructor_path: str\n\n    def __init__(\n        self,\n        id: int,\n        marketing_biography: str,\n        biography: str,\n        avatar_url: str,\n        full_name: str,\n        instructor_path: str,\n        **kwargs\n    ) -> None:\n        self.id = id\n        self.marketing_biography = marketing_biography\n        self.biography = biography\n        self.avatar_url = avatar_url\n        self.full_name = full_name\n        self.instructor_path = instructor_path\n\n\nclass SharingLinks:\n    twitter: str\n    facebook: str\n\n    def __init__(self, twitter: str, facebook: str) -> None:\n        self.twitter = twitter\n        self.facebook = facebook\n\n\nclass Track:\n    path: str\n    title_with_subtitle: str\n\n    def __init__(self, path: str, title_with_subtitle: str) -> None:\n        self.path = path\n        self.title_with_subtitle = title_with_subtitle\n\n\nclass Course:\n    def __init__(self,\n                 id: int,\n                 title: str,\n                 description: str = \"\",\n                 slug: str = None,\n                 chapters: List[dict] = None,\n                 datasets: List[dict] = None,\n                 time_needed_in_hours: int = None,\n                 **kwargs) -> None:\n        \"\"\"\n        Flexible 
Course constructor that works with the new API.\n        Extra fields are captured by **kwargs so we don't break.\n        \"\"\"\n\n        self.id = id\n        self.title = title\n        self.description = description\n        self.slug = slug or str(id)\n\n        # build nested objects safely\n        self.chapters = [Chapter(**c) for c in (chapters or [])]\n        self.datasets = [Dataset(**c) for c in (datasets or [])]\n\n        # support both old/new API keys\n        self.time_needed = kwargs.get(\"time_needed\") or time_needed_in_hours\n        self.xp = kwargs.get(\"xp\", 0)\n        self.difficulty_level = kwargs.get(\"difficulty_level\", None)\n        self.state = kwargs.get(\"state\", \"unknown\")\n\n        # optional stuff\n        self.short_description = kwargs.get(\"short_description\", \"\")\n        self.slug = kwargs.get(\"slug\", slug or str(id))\n        self.image_url = kwargs.get(\"image_url\", \"\")\n        self.image_thumbnail_url = kwargs.get(\"image_thumbnail_url\", \"\")\n        self.last_updated_on = kwargs.get(\"last_updated_on\", \"\")\n        self.link = kwargs.get(\"link\", \"\")\n        self.programming_language = kwargs.get(\"programming_language\", \"unknown\")\n\n        # fallback empty lists\n        self.instructors = [Instructor(**c) for c in kwargs.get(\"instructors\", [])]\n        self.collaborators = [Collaborator(**c) for c in kwargs.get(\"collaborators\", [])]\n        self.tracks = [Track(**c) for c in kwargs.get(\"tracks\", [])]\n\n        # absorb anything else without crashing\n        self.extra = kwargs"
  },
  {
    "path": "src/datacamp_downloader/templates/exercise.py",
    "content": "# Generated by https://quicktype.io\n\nfrom typing import Any, List, Optional\n\nimport tomd\n\nfrom .course import TypeEnum\n\n\nclass Data:\n    id: int\n    type: str\n    assignment: Optional[str]\n    title: Optional[str]\n    sample_code: str\n    instructions: Optional[str]\n    number: int\n    sct: str\n    pre_exercise_code: str\n    solution: str\n    hint: Optional[str]\n    attachments: None\n    xp: int\n    possible_answers: List[Any]\n    feedbacks: List[Any]\n    question: str\n    subexercises: Optional[List[\"Data\"]]\n    course_id: Optional[int]\n    chapter_id: Optional[int]\n    runtime_config: Optional[str]\n    language: Optional[str]\n\n    def __init__(\n        self,\n        id: int,\n        type: str,\n        assignment: Optional[str] = None,\n        title: Optional[str] = None,\n        number: int = None,\n        hint: Optional[str] = None,\n        xp: int = None,\n        possible_answers: List[Any] = None,\n        feedbacks: List[Any] = None,\n        course_id: Optional[int] = None,\n        chapter_id: Optional[int] = None,\n        runtime_config: Optional[str] = None,\n        language: Optional[str] = None,\n        subexercises: Optional[List[\"Data\"]] = None,\n        instructions: Optional[str] = None,\n        attachments: None = None,\n        sample_code: str = None,\n        pre_exercise_code: str = None,\n        solution: str = None,\n        sct: str = None,\n        question: str = None,\n        **kwargs,\n    ) -> None:\n        self.id = id\n        self.type = type\n        self.assignment = assignment\n        self.title = title\n        self.sample_code = sample_code\n        self.instructions = instructions\n        self.number = number\n        self.sct = sct\n        self.pre_exercise_code = pre_exercise_code\n        self.solution = solution\n        self.hint = hint\n        self.attachments = attachments\n        self.xp = xp\n        self.possible_answers = possible_answers\n      
  self.feedbacks = feedbacks\n        self.question = question\n        self.subexercises = (\n            [e.get(\"id\") for e in subexercises] if subexercises else None\n        )\n        self.course_id = course_id\n        self.chapter_id = chapter_id\n        self.runtime_config = runtime_config\n        self.language = language\n\n\nclass Exercise:\n    data: Any\n    id: int\n    type: str\n    version: str\n    last_attempt: Optional[str]\n\n    def __init__(\n        self,\n        data: Data,\n        id: int,\n        type: str,\n        version: str,\n        last_attempt: str = None,\n        **kwargs,\n    ) -> None:\n        self.id = id\n        self.type = type\n        self.version = version\n        if not self.is_video:\n            self.data = Data(**data)\n        else:\n            self.data = data\n        self.last_attempt = last_attempt\n\n    @property\n    def is_video(self):\n        return self.type == TypeEnum.VIDEO_EXERCISE.value\n\n    @property\n    def is_python(self):\n        return self.data.language == \"python\"\n\n    def __str__(self) -> str:\n        html = (\n            \"<h1> {}</h1>\\n<pre><code>Exercise ID {}</code></pre>\\n<h2> Assignment </h2>{}\\n\".format(\n                self.data.title, self.id, self.data.assignment\n            )\n            + self.get_pre_exercise_code()\n            + self.get_instructions()\n            + self.get_sample_code()\n            + self.get_anwsers()\n            + self.get_hints()\n            + self.get_solution()\n        )\n        return tomd.convert(html)\n\n    def get_hints(self):\n        code = \"<h2> Hints </h2> {}<p></p>\"\n        if self.data.hint:\n            return code.format(self.data.hint)\n        return \"\"\n\n    def get_anwsers(self):\n        code = \"<h2> Answers </h2>{}<p></p>\"\n        if self.data.possible_answers:\n            return code.format(self._get_ordered_list(self.data.possible_answers))\n        # return code.format(\"No answers were 
found.\")\n        return \"\"\n\n    def get_instructions(self):\n        code = \"<h2> Instructions </h2>{}<p></p>\"\n        if self.data.instructions:\n            return code.format(self.data.instructions)\n        return \"\"\n\n    def _get_ordered_list(self, list):\n        return \"<ol>{}</ol>\".format(\"\\n\".join(f\"<li>{i}</li>\" for i in list))\n\n    def get_solution(self):\n        code = \"<h2> Solution </h2>{}<p></p>\"\n        if self.data.feedbacks:\n            return code.format(self._get_ordered_list(self.data.feedbacks))\n        if self.data.solution:\n            return code.format(self._get_code(self.data.solution))\n        return code.format(\"<p>No solution was found.</p>\")\n\n    def get_sample_code(self):\n        if self.data.sample_code:\n            return self._get_code(self.data.sample_code)\n        return \"\"\n\n    def get_pre_exercise_code(self):\n        code = \"<h2> Pre exercise code </h2> {}<p></p>\"\n        if self.data.pre_exercise_code:\n            return code.format(self._get_code(self.data.pre_exercise_code))\n        return \"\"\n\n    def _get_code(self, code):\n        return f\"<pre><code>{code}</code></pre>\"\n"
  },
  {
    "path": "src/datacamp_downloader/templates/lang.py",
    "content": "from enum import Enum\n\n\nclass Language(str, Enum):\n    \"\"\"Language codes as string-valued enum members.\"\"\"\n\n    EN = \"en\"\n    ZH = \"zh\"\n    FR = \"fr\"\n    DE = \"de\"\n    IT = \"it\"\n    JA = \"ja\"\n    KO = \"ko\"\n    PT = \"pt\"\n    RU = \"ru\"\n    ES = \"es\"\n    NONE = \"none\"\n"
  },
  {
    "path": "src/datacamp_downloader/templates/track.py",
    "content": "from typing import List\n\nfrom .course import Course\n\n\nclass Track:\n    \"\"\"A track with an id, title and link, plus its list of courses.\"\"\"\n\n    id: int\n    title: str\n    link: str\n    courses: List[Course]\n\n    def __init__(self, id: int, title: str, link: str) -> None:\n        self.id = id\n        self.title = title\n        self.link = link\n        # Courses are not passed in; the list starts empty.\n        self.courses = []\n"
  },
  {
    "path": "src/datacamp_downloader/templates/video.py",
    "content": "# Generated by https://quicktype.io\n\nfrom enum import Enum\nfrom typing import Any, List, Optional\n\n\nclass TypeEnum(Enum):\n    FINAL_SLIDE = \"FinalSlide\"\n    FULL_SLIDE = \"FullSlide\"\n    TITLE_SLIDE = \"TitleSlide\"\n\n\nclass Structure:\n    number: int\n    type: TypeEnum\n    key: str\n    script: str\n    title: str\n    instructor_name: Optional[str]\n    instructor_title: Optional[str]\n    technology: Optional[str]\n    citations: List[Any]\n    code_zoom: int\n    disable_transition: bool\n    hide_slide_in_video: bool\n    hide_title: bool\n    use_full_width: bool\n    part1: Optional[str]\n\n    def __init__(\n        self,\n        number: int,\n        type: TypeEnum,\n        key: str,\n        script: str,\n        title: str,\n        instructor_name: Optional[str],\n        instructor_title: Optional[str],\n        technology: Optional[str],\n        citations: List[Any],\n        code_zoom: int,\n        disable_transition: bool,\n        hide_slide_in_video: bool,\n        hide_title: bool,\n        use_full_width: bool,\n        part1: Optional[str],\n        **kwargs\n    ) -> None:\n        self.number = number\n        self.type = type\n        self.key = key\n        self.script = script\n        self.title = title\n        self.instructor_name = instructor_name\n        self.instructor_title = instructor_title\n        self.technology = technology\n        self.citations = citations\n        self.code_zoom = code_zoom\n        self.disable_transition = disable_transition\n        self.hide_slide_in_video = hide_slide_in_video\n        self.hide_title = hide_title\n        self.use_full_width = use_full_width\n        self.part1 = part1\n\n\nclass SlideDeck:\n    key: str\n    plain_video_hls_link: str\n    plain_video_mp4_link: str\n    plain_video_raw_link: None\n    structure: List[Structure]\n    timings: str\n    title: str\n    transformations: str\n\n    def __init__(\n        self,\n        key: str,\n     
   plain_video_hls_link: str,\n        plain_video_mp4_link: str,\n        plain_video_raw_link: None,\n        structure: List[Structure],\n        timings: str,\n        title: str,\n        transformations: str,\n    ) -> None:\n        self.key = key\n        self.plain_video_hls_link = plain_video_hls_link\n        self.plain_video_mp4_link = plain_video_mp4_link\n        self.plain_video_raw_link = plain_video_raw_link\n        self.structure = [Structure(**s) for s in structure]\n        self.timings = timings\n        self.title = title\n        self.transformations = transformations\n\n\nclass Subtitle:\n    language: str\n    link: str\n\n    def __init__(self, language: str, link: str) -> None:\n        self.language = language\n        self.link = link\n\n\nclass Video:\n    audio_link: str\n    key: str\n    render_dynamically: int\n    script_link: str\n    slide_deck: SlideDeck\n    slides_link: str\n    subtitle_vtt_link: str\n    subtitles: List[Subtitle]\n    thumbnail_link: None\n    transcript_timings: None\n    type: str\n    video_hls_link: None\n    video_mp4_link: str\n    video_raw_link: None\n\n    def __init__(\n        self,\n        audio_link: str,\n        key: str,\n        render_dynamically: int,\n        script_link: str,\n        slide_deck: SlideDeck,\n        slides_link: str,\n        subtitle_vtt_link: str,\n        subtitles: List[Subtitle],\n        thumbnail_link: None,\n        transcript_timings: None,\n        type: str,\n        video_hls_link: None,\n        video_mp4_link: str,\n        video_raw_link: None,\n    ) -> None:\n        self.audio_link = audio_link\n        self.key = key\n        self.render_dynamically = render_dynamically\n        self.script_link = script_link\n        self.slide_deck = slide_deck\n        self.slides_link = slides_link\n        self.subtitle_vtt_link = subtitle_vtt_link\n        self.subtitles = [Subtitle(**s) for s in subtitles]\n        self.thumbnail_link = thumbnail_link\n       
 self.transcript_timings = transcript_timings\n        self.type = type\n        self.video_hls_link = video_hls_link\n        self.video_mp4_link = video_mp4_link\n        self.video_raw_link = video_raw_link\n"
  }
]