Repository: autoscrape-labs/pydoll Branch: main Commit: 9e236b887ab7 Files: 464 Total size: 5.0 MB Directory structure: gitextract_ulcsxv0t/ ├── .github/ │ ├── FUNDING.yml │ ├── ISSUE_TEMPLATE/ │ │ ├── bug_report.yml │ │ ├── config.yml │ │ ├── documentation.yml │ │ ├── feature_request.yml │ │ └── refactoring.yml │ ├── PULL_REQUEST_TEMPLATE/ │ │ ├── bug_fix.md │ │ ├── refactoring.md │ │ └── release.md │ ├── pull_request_template.md │ └── workflows/ │ ├── deploy-docs.yml │ ├── mypy.yml │ ├── publish.yml │ ├── release.yml │ ├── ruff-ci.yml │ └── tests.yml ├── .gitignore ├── .python-version ├── CHANGELOG.md ├── CONTRIBUTING.md ├── LICENSE ├── README.md ├── README_zh.md ├── SPONSORS.md ├── codecov.yml ├── cz.yaml ├── docs/ │ ├── en/ │ │ ├── api/ │ │ │ ├── browser/ │ │ │ │ ├── chrome.md │ │ │ │ ├── edge.md │ │ │ │ ├── managers.md │ │ │ │ ├── options.md │ │ │ │ ├── requests.md │ │ │ │ └── tab.md │ │ │ ├── commands/ │ │ │ │ ├── browser.md │ │ │ │ ├── dom.md │ │ │ │ ├── fetch.md │ │ │ │ ├── index.md │ │ │ │ ├── input.md │ │ │ │ ├── network.md │ │ │ │ ├── page.md │ │ │ │ ├── runtime.md │ │ │ │ ├── storage.md │ │ │ │ └── target.md │ │ │ ├── connection/ │ │ │ │ ├── connection.md │ │ │ │ └── managers.md │ │ │ ├── core/ │ │ │ │ ├── constants.md │ │ │ │ ├── exceptions.md │ │ │ │ └── utils.md │ │ │ ├── elements/ │ │ │ │ ├── mixins.md │ │ │ │ ├── shadow_root.md │ │ │ │ └── web_element.md │ │ │ ├── index.md │ │ │ └── protocol/ │ │ │ ├── base.md │ │ │ ├── browser.md │ │ │ ├── dom.md │ │ │ ├── fetch.md │ │ │ ├── input.md │ │ │ ├── network.md │ │ │ ├── page.md │ │ │ ├── runtime.md │ │ │ ├── storage.md │ │ │ └── target.md │ │ ├── deep-dive/ │ │ │ ├── architecture/ │ │ │ │ ├── browser-domain.md │ │ │ │ ├── browser-requests-architecture.md │ │ │ │ ├── event-architecture.md │ │ │ │ ├── find-elements-mixin.md │ │ │ │ ├── index.md │ │ │ │ ├── shadow-dom.md │ │ │ │ ├── tab-domain.md │ │ │ │ └── webelement-domain.md │ │ │ ├── fingerprinting/ │ │ │ │ ├── behavioral-fingerprinting.md │ │ │ │ ├── 
browser-fingerprinting.md │ │ │ │ ├── evasion-techniques.md │ │ │ │ ├── index.md │ │ │ │ └── network-fingerprinting.md │ │ │ ├── fundamentals/ │ │ │ │ ├── cdp.md │ │ │ │ ├── connection-layer.md │ │ │ │ ├── iframes-and-contexts.md │ │ │ │ ├── index.md │ │ │ │ └── typing-system.md │ │ │ ├── guides/ │ │ │ │ ├── index.md │ │ │ │ └── selectors-guide.md │ │ │ ├── index.md │ │ │ └── network/ │ │ │ ├── build-proxy.md │ │ │ ├── http-proxies.md │ │ │ ├── index.md │ │ │ ├── network-fundamentals.md │ │ │ ├── proxy-detection.md │ │ │ ├── proxy-legal.md │ │ │ └── socks-proxies.md │ │ ├── features/ │ │ │ ├── advanced/ │ │ │ │ ├── behavioral-captcha-bypass.md │ │ │ │ ├── decorators.md │ │ │ │ ├── event-system.md │ │ │ │ └── remote-connections.md │ │ │ ├── automation/ │ │ │ │ ├── file-operations.md │ │ │ │ ├── human-interactions.md │ │ │ │ ├── iframes.md │ │ │ │ ├── keyboard-control.md │ │ │ │ ├── mouse-control.md │ │ │ │ └── screenshots-and-pdfs.md │ │ │ ├── browser-management/ │ │ │ │ ├── contexts.md │ │ │ │ ├── cookies-sessions.md │ │ │ │ └── tabs.md │ │ │ ├── configuration/ │ │ │ │ ├── browser-options.md │ │ │ │ ├── browser-preferences.md │ │ │ │ └── proxy.md │ │ │ ├── core-concepts.md │ │ │ ├── element-finding.md │ │ │ ├── index.md │ │ │ └── network/ │ │ │ ├── http-requests.md │ │ │ ├── interception.md │ │ │ ├── monitoring.md │ │ │ └── network-recording.md │ │ └── index.md │ ├── pt/ │ │ ├── api/ │ │ │ ├── browser/ │ │ │ │ ├── chrome.md │ │ │ │ ├── edge.md │ │ │ │ ├── managers.md │ │ │ │ ├── options.md │ │ │ │ ├── requests.md │ │ │ │ └── tab.md │ │ │ ├── commands/ │ │ │ │ ├── browser.md │ │ │ │ ├── dom.md │ │ │ │ ├── fetch.md │ │ │ │ ├── index.md │ │ │ │ ├── input.md │ │ │ │ ├── network.md │ │ │ │ ├── page.md │ │ │ │ ├── runtime.md │ │ │ │ ├── storage.md │ │ │ │ └── target.md │ │ │ ├── connection/ │ │ │ │ ├── connection.md │ │ │ │ └── managers.md │ │ │ ├── core/ │ │ │ │ ├── constants.md │ │ │ │ ├── exceptions.md │ │ │ │ └── utils.md │ │ │ ├── elements/ │ │ │ │ ├── mixins.md │ │ 
│ │ ├── shadow_root.md │ │ │ │ └── web_element.md │ │ │ ├── index.md │ │ │ └── protocol/ │ │ │ ├── base.md │ │ │ ├── browser.md │ │ │ ├── dom.md │ │ │ ├── fetch.md │ │ │ ├── input.md │ │ │ ├── network.md │ │ │ ├── page.md │ │ │ ├── runtime.md │ │ │ ├── storage.md │ │ │ └── target.md │ │ ├── deep-dive/ │ │ │ ├── architecture/ │ │ │ │ ├── browser-domain.md │ │ │ │ ├── browser-requests-architecture.md │ │ │ │ ├── event-architecture.md │ │ │ │ ├── find-elements-mixin.md │ │ │ │ ├── index.md │ │ │ │ ├── shadow-dom.md │ │ │ │ ├── tab-domain.md │ │ │ │ └── webelement-domain.md │ │ │ ├── fingerprinting/ │ │ │ │ ├── behavioral-fingerprinting.md │ │ │ │ ├── browser-fingerprinting.md │ │ │ │ ├── evasion-techniques.md │ │ │ │ ├── index.md │ │ │ │ └── network-fingerprinting.md │ │ │ ├── fundamentals/ │ │ │ │ ├── cdp.md │ │ │ │ ├── connection-layer.md │ │ │ │ ├── iframes-and-contexts.md │ │ │ │ ├── index.md │ │ │ │ └── typing-system.md │ │ │ ├── guides/ │ │ │ │ ├── index.md │ │ │ │ └── selectors-guide.md │ │ │ ├── index.md │ │ │ └── network/ │ │ │ ├── build-proxy.md │ │ │ ├── http-proxies.md │ │ │ ├── index.md │ │ │ ├── network-fundamentals.md │ │ │ ├── proxy-detection.md │ │ │ ├── proxy-legal.md │ │ │ └── socks-proxies.md │ │ ├── features/ │ │ │ ├── advanced/ │ │ │ │ ├── behavioral-captcha-bypass.md │ │ │ │ ├── decorators.md │ │ │ │ ├── event-system.md │ │ │ │ └── remote-connections.md │ │ │ ├── automation/ │ │ │ │ ├── file-operations.md │ │ │ │ ├── human-interactions.md │ │ │ │ ├── iframes.md │ │ │ │ ├── keyboard-control.md │ │ │ │ ├── mouse-control.md │ │ │ │ └── screenshots-and-pdfs.md │ │ │ ├── browser-management/ │ │ │ │ ├── contexts.md │ │ │ │ ├── cookies-sessions.md │ │ │ │ └── tabs.md │ │ │ ├── configuration/ │ │ │ │ ├── browser-options.md │ │ │ │ ├── browser-preferences.md │ │ │ │ └── proxy.md │ │ │ ├── core-concepts.md │ │ │ ├── element-finding.md │ │ │ ├── index.md │ │ │ └── network/ │ │ │ ├── http-requests.md │ │ │ ├── interception.md │ │ │ ├── monitoring.md │ │ │ 
└── network-recording.md │ │ └── index.md │ ├── resources/ │ │ ├── scripts/ │ │ │ ├── extra.js │ │ │ └── termynal.js │ │ └── stylesheets/ │ │ ├── extra.css │ │ └── termynal.css │ └── zh/ │ ├── api/ │ │ ├── browser/ │ │ │ ├── chrome.md │ │ │ ├── edge.md │ │ │ ├── managers.md │ │ │ ├── options.md │ │ │ ├── requests.md │ │ │ └── tab.md │ │ ├── commands/ │ │ │ ├── browser.md │ │ │ ├── dom.md │ │ │ ├── fetch.md │ │ │ ├── index.md │ │ │ ├── input.md │ │ │ ├── network.md │ │ │ ├── page.md │ │ │ ├── runtime.md │ │ │ ├── storage.md │ │ │ └── target.md │ │ ├── connection/ │ │ │ ├── connection.md │ │ │ └── managers.md │ │ ├── core/ │ │ │ ├── constants.md │ │ │ ├── exceptions.md │ │ │ └── utils.md │ │ ├── elements/ │ │ │ ├── mixins.md │ │ │ ├── shadow_root.md │ │ │ └── web_element.md │ │ ├── index.md │ │ └── protocol/ │ │ ├── base.md │ │ ├── browser.md │ │ ├── dom.md │ │ ├── fetch.md │ │ ├── input.md │ │ ├── network.md │ │ ├── page.md │ │ ├── runtime.md │ │ ├── storage.md │ │ └── target.md │ ├── deep-dive/ │ │ ├── architecture/ │ │ │ ├── browser-domain.md │ │ │ ├── browser-requests-architecture.md │ │ │ ├── event-architecture.md │ │ │ ├── find-elements-mixin.md │ │ │ ├── index.md │ │ │ ├── shadow-dom.md │ │ │ ├── tab-domain.md │ │ │ └── webelement-domain.md │ │ ├── fingerprinting/ │ │ │ ├── behavioral-fingerprinting.md │ │ │ ├── browser-fingerprinting.md │ │ │ ├── evasion-techniques.md │ │ │ ├── index.md │ │ │ └── network-fingerprinting.md │ │ ├── fundamentals/ │ │ │ ├── cdp.md │ │ │ ├── connection-layer.md │ │ │ ├── iframes-and-contexts.md │ │ │ ├── index.md │ │ │ └── typing-system.md │ │ ├── guides/ │ │ │ ├── index.md │ │ │ └── selectors-guide.md │ │ ├── index.md │ │ └── network/ │ │ ├── build-proxy.md │ │ ├── http-proxies.md │ │ ├── index.md │ │ ├── network-fundamentals.md │ │ ├── proxy-detection.md │ │ ├── proxy-legal.md │ │ └── socks-proxies.md │ ├── features/ │ │ ├── advanced/ │ │ │ ├── behavioral-captcha-bypass.md │ │ │ ├── decorators.md │ │ │ ├── event-system.md │ │ │ 
└── remote-connections.md │ │ ├── automation/ │ │ │ ├── file-operations.md │ │ │ ├── human-interactions.md │ │ │ ├── iframes.md │ │ │ ├── keyboard-control.md │ │ │ ├── mouse-control.md │ │ │ └── screenshots-and-pdfs.md │ │ ├── browser-management/ │ │ │ ├── contexts.md │ │ │ ├── cookies-sessions.md │ │ │ └── tabs.md │ │ ├── configuration/ │ │ │ ├── browser-options.md │ │ │ ├── browser-preferences.md │ │ │ └── proxy.md │ │ ├── core-concepts.md │ │ ├── element-finding.md │ │ ├── index.md │ │ └── network/ │ │ ├── http-requests.md │ │ ├── interception.md │ │ ├── monitoring.md │ │ └── network-recording.md │ └── index.md ├── examples/ │ └── cloudflare_bypass.py ├── mkdocs.yml ├── public/ │ ├── index.html │ ├── robots.txt │ ├── script.js │ ├── scripts/ │ │ ├── extra.js │ │ └── termynal.js │ ├── sitemap.xml │ └── stylesheets/ │ ├── extra.css │ └── termynal.css ├── pydoll/ │ ├── __init__.py │ ├── browser/ │ │ ├── __init__.py │ │ ├── chromium/ │ │ │ ├── __init__.py │ │ │ ├── base.py │ │ │ ├── chrome.py │ │ │ └── edge.py │ │ ├── interfaces.py │ │ ├── managers/ │ │ │ ├── __init__.py │ │ │ ├── browser_options_manager.py │ │ │ ├── browser_process_manager.py │ │ │ ├── proxy_manager.py │ │ │ └── temp_dir_manager.py │ │ ├── options.py │ │ ├── requests/ │ │ │ ├── __init__.py │ │ │ ├── har_recorder.py │ │ │ ├── request.py │ │ │ └── response.py │ │ └── tab.py │ ├── commands/ │ │ ├── __init__.py │ │ ├── browser_commands.py │ │ ├── dom_commands.py │ │ ├── emulation_commands.py │ │ ├── fetch_commands.py │ │ ├── input_commands.py │ │ ├── network_commands.py │ │ ├── page_commands.py │ │ ├── runtime_commands.py │ │ ├── storage_commands.py │ │ └── target_commands.py │ ├── connection/ │ │ ├── __init__.py │ │ ├── connection_handler.py │ │ └── managers/ │ │ ├── __init__.py │ │ ├── commands_manager.py │ │ └── events_manager.py │ ├── constants.py │ ├── decorators.py │ ├── elements/ │ │ ├── __init__.py │ │ ├── mixins/ │ │ │ ├── __init__.py │ │ │ └── find_elements_mixin.py │ │ ├── shadow_root.py │ │ 
├── utils/ │ │ │ ├── __init__.py │ │ │ └── selector_parser.py │ │ └── web_element.py │ ├── exceptions.py │ ├── interactions/ │ │ ├── __init__.py │ │ ├── iframe.py │ │ ├── keyboard.py │ │ ├── mouse.py │ │ ├── scroll.py │ │ └── utils.py │ ├── protocol/ │ │ ├── __init__.py │ │ ├── base.py │ │ ├── browser/ │ │ │ ├── __init__.py │ │ │ ├── events.py │ │ │ ├── methods.py │ │ │ └── types.py │ │ ├── debugger/ │ │ │ └── types.py │ │ ├── dom/ │ │ │ ├── __init__.py │ │ │ ├── events.py │ │ │ ├── methods.py │ │ │ └── types.py │ │ ├── emulation/ │ │ │ ├── __init__.py │ │ │ ├── methods.py │ │ │ └── types.py │ │ ├── fetch/ │ │ │ ├── __init__.py │ │ │ ├── events.py │ │ │ ├── methods.py │ │ │ └── types.py │ │ ├── input/ │ │ │ ├── __init__.py │ │ │ ├── events.py │ │ │ ├── methods.py │ │ │ └── types.py │ │ ├── io/ │ │ │ └── types.py │ │ ├── network/ │ │ │ ├── __init__.py │ │ │ ├── events.py │ │ │ ├── har_types.py │ │ │ ├── methods.py │ │ │ └── types.py │ │ ├── page/ │ │ │ ├── __init__.py │ │ │ ├── events.py │ │ │ ├── methods.py │ │ │ └── types.py │ │ ├── runtime/ │ │ │ ├── __init__.py │ │ │ ├── events.py │ │ │ ├── methods.py │ │ │ └── types.py │ │ ├── security/ │ │ │ └── types.py │ │ ├── storage/ │ │ │ ├── __init__.py │ │ │ ├── events.py │ │ │ ├── methods.py │ │ │ └── types.py │ │ └── target/ │ │ ├── __init__.py │ │ ├── events.py │ │ ├── methods.py │ │ └── types.py │ ├── py.typed │ └── utils/ │ ├── __init__.py │ ├── bundle.py │ ├── general.py │ ├── socks5_proxy_forwarder.py │ └── user_agent_parser.py ├── pyproject.toml └── tests/ ├── conftest.py ├── pages/ │ ├── oopif/ │ │ ├── oopif_content.html │ │ ├── oopif_main.html │ │ ├── oopif_nested.html │ │ └── oopif_shadow_iframe.html │ ├── shadow_dom_test.html │ ├── test_children.html │ ├── test_click_nested.html │ ├── test_click_nested_iframe_content.html │ ├── test_core_simple.html │ ├── test_frame_content.html │ ├── test_frameset.html │ ├── test_har_recording.html │ ├── test_iframe_content.html │ ├── test_iframe_nested.html │ ├── 
test_iframe_nested_level.html │ ├── test_iframe_parent_level.html │ ├── test_iframe_simple.html │ └── test_multiple_iframes.html ├── test_browser/ │ ├── test_browser_base.py │ ├── test_browser_chrome.py │ ├── test_browser_edge.py │ ├── test_browser_options.py │ ├── test_browser_tab.py │ ├── test_har_recorder.py │ ├── test_requests_request.py │ ├── test_requests_response.py │ └── test_tab_request_integration.py ├── test_click_nested_integration.py ├── test_commands/ │ ├── test_browser_commands.py │ ├── test_dom_commands.py │ ├── test_emulation_commands.py │ ├── test_fetch_commands.py │ ├── test_input_commands.py │ ├── test_network_commands.py │ ├── test_page_commands.py │ ├── test_runtime_commands.py │ ├── test_storage_commands.py │ └── test_target_commands.py ├── test_connection_handler.py ├── test_core_integration.py ├── test_decorators.py ├── test_events.py ├── test_exceptions.py ├── test_find_elements_mixin.py ├── test_har_recording_integration.py ├── test_iframe_integration.py ├── test_interactions/ │ ├── __init__.py │ ├── test_iframe.py │ ├── test_keyboard.py │ ├── test_mouse.py │ └── test_scroll.py ├── test_managers/ │ ├── test_browser_managers.py │ └── test_connection_managers.py ├── test_nested_oopif_integration.py ├── test_shadow_root.py ├── test_shadow_root_integration.py ├── test_socks5_proxy_forwarder.py ├── test_user_agent_parser.py ├── test_utils.py └── test_web_element.py ================================================ FILE CONTENTS ================================================ ================================================ FILE: .github/FUNDING.yml ================================================ # These are supported funding model platforms github: [thalissonvs] ================================================ FILE: .github/ISSUE_TEMPLATE/bug_report.yml ================================================ name: Bug Report description: Report a bug in pydoll title: "[Bug]: " labels: ["bug", "needs-triage"] body: - type: markdown attributes: value: 
| # pydoll Bug Report Thank you for taking the time to report a bug. This form will guide you through providing the information needed to address the issue effectively. - type: checkboxes id: checklist attributes: label: Checklist before reporting description: Please make sure you've completed the following steps before submitting a bug report. options: - label: I have searched for [similar issues](https://github.com/thalissonvs/pydoll/issues) and didn't find a duplicate. required: true - label: I have updated to the latest version of pydoll to verify the issue still exists. required: true - type: input id: version attributes: label: pydoll Version description: What version of pydoll are you using when encountering this bug? placeholder: e.g., 1.3.2 validations: required: true - type: input id: python_version attributes: label: Python Version description: What version of Python are you using? placeholder: e.g., 3.10.4 validations: required: true - type: dropdown id: os attributes: label: Operating System description: What operating system are you using? options: - Windows - macOS - Linux - Other (specify in environment details) validations: required: true - type: textarea id: description attributes: label: Bug Description description: A clear and concise description of what the bug is. placeholder: When I try to use X feature, the library fails with error message Y... validations: required: true - type: textarea id: reproduction_steps attributes: label: Steps to Reproduce description: Step by step instructions to reproduce the bug. placeholder: | 1. Import the library using `import pydoll` 2. Set up the client with `...` 3. Call method X with parameters Y 4. See error validations: required: true - type: textarea id: code_example attributes: label: Code Example description: | A minimal, self-contained code example that demonstrates the issue. This will be automatically formatted into code, so no need for backticks. 
render: python placeholder: | from pydoll import Client client = Client(...) # Code that triggers the bug result = client.some_method(...) print(result) validations: required: true - type: textarea id: expected_behavior attributes: label: Expected Behavior description: A clear and concise description of what you expected to happen. placeholder: The method should return X or perform Y... validations: required: false - type: textarea id: actual_behavior attributes: label: Actual Behavior description: What actually happened instead? Include full error messages and stack traces if applicable. placeholder: The method raised an exception... validations: required: false - type: textarea id: logs attributes: label: Relevant Log Output description: | If applicable, include any logs or error messages. This will be automatically formatted, so no need for backticks. render: shell placeholder: | Traceback (most recent call last): File "example.py", line 10, in ... File ".../pydoll/...", line N, in some_method ... SomeError: Error message - type: textarea id: additional_context attributes: label: Additional Context description: Add any other context about the problem here (environment details, potential causes, solutions you've tried, etc.) placeholder: I've tried reinstalling the package and using a different Python version, but the issue persists... ================================================ FILE: .github/ISSUE_TEMPLATE/config.yml ================================================ blank_issues_enabled: true contact_links: - name: Questions & Discussions url: https://github.com/thalissonvs/pydoll/discussions about: Please ask and answer questions here. 
================================================ FILE: .github/ISSUE_TEMPLATE/documentation.yml ================================================ name: Documentation Issue description: Report missing, incorrect, or unclear documentation title: "[Docs]: " labels: ["documentation", "needs-triage"] body: - type: markdown attributes: value: | # pydoll Documentation Issue Thank you for helping us improve the documentation. This form will guide you through providing the information needed to address documentation issues effectively. - type: checkboxes id: checklist attributes: label: Checklist before reporting description: Please make sure you've completed the following steps before submitting a documentation issue. options: - label: I have searched for [similar documentation issues](https://github.com/thalissonvs/pydoll/issues) and didn't find a duplicate. required: true - label: I have checked the latest documentation to verify this issue still exists. required: true - type: dropdown id: type attributes: label: Type of Documentation Issue description: What type of documentation issue are you reporting? options: - Missing documentation (information does not exist) - Incorrect documentation (information is wrong) - Unclear documentation (information is confusing or ambiguous) - Outdated documentation (information is no longer valid) - Other (please specify in description) validations: required: true - type: input id: location attributes: label: Documentation Location description: Where is the documentation with issues located? Provide URLs, file paths, or section names. placeholder: e.g., https://docs.example.com/pydoll/api.html#section, README.md, API Reference for Client class validations: required: true - type: textarea id: description attributes: label: Issue Description description: Describe the issue with the documentation in detail. 
placeholder: | The documentation for the `Client.connect()` method doesn't mention the timeout parameter, which I discovered by looking at the source code. validations: required: true - type: textarea id: suggested_fix attributes: label: Suggested Fix description: If you have a suggestion for how to fix the documentation, please provide it here. placeholder: | Add the following to the `Client.connect()` documentation: ``` Parameters: timeout (float, optional): Connection timeout in seconds. Defaults to 30. ``` - type: textarea id: additional_info attributes: label: Additional Information description: Any additional context or information that might help address this documentation issue. placeholder: | I found this issue when trying to implement a connection with a shorter timeout for my specific use case. - type: dropdown id: contribution attributes: label: Contribution description: Would you be willing to contribute a fix for this documentation? options: - Yes, I'd be willing to submit a PR with the fix - No, I don't have the capacity to fix this validations: required: true ================================================ FILE: .github/ISSUE_TEMPLATE/feature_request.yml ================================================ name: Feature Request description: Suggest a new feature or enhancement for pydoll title: "[Feature Request]: " labels: ["enhancement", "needs-triage"] body: - type: markdown attributes: value: | # pydoll Feature Request Thank you for taking the time to suggest a new feature. This form will guide you through providing the information needed to consider your suggestion effectively. - type: checkboxes id: checklist attributes: label: Checklist before requesting description: Please make sure you've completed the following steps before submitting a feature request. options: - label: I have searched for [similar feature requests](https://github.com/thalissonvs/pydoll/issues) and didn't find a duplicate. 
required: true - label: I have checked the documentation to confirm this feature doesn't already exist. required: true - type: textarea id: problem attributes: label: Problem Statement description: Is your feature request related to a problem? Please describe what you're trying to accomplish. placeholder: I'm trying to accomplish X, but I'm unable to because Y... validations: required: true - type: textarea id: solution attributes: label: Proposed Solution description: Describe the solution you'd like to see implemented. Be as specific as possible. placeholder: | I would like to see a new method/class that can... Example usage might look like: ```python client.new_feature(param1, param2) ``` validations: required: true - type: textarea id: alternatives attributes: label: Alternatives Considered description: Describe any alternative solutions or features you've considered. placeholder: I've tried accomplishing this using X and Y approaches, but they don't work well because... - type: textarea id: context attributes: label: Additional Context description: Add any other context, code examples, or references that might help explain your feature request. placeholder: | Other libraries like X and Y have similar features that work like... This would help users who need to... - type: dropdown id: importance attributes: label: Importance description: How important is this feature to your use case? options: - Nice to have - Important - Critical (blocking my usage) validations: required: true - type: dropdown id: contribution attributes: label: Contribution description: Would you be willing to contribute this feature yourself? 
options: - Yes, I'd be willing to implement this feature - I could help with parts of the implementation - No, I don't have the capacity to implement this ================================================ FILE: .github/ISSUE_TEMPLATE/refactoring.yml ================================================ name: Refactoring Request description: Suggest code refactoring to improve pydoll's quality, performance, or maintainability title: "[Refactor]: " labels: ["refactor", "needs-triage"] body: - type: markdown attributes: value: | # pydoll Refactoring Request Thank you for suggesting improvements to our codebase. This form will guide you through providing the information needed to consider your refactoring suggestion effectively. - type: checkboxes id: checklist attributes: label: Checklist before suggesting refactoring description: Please make sure you've completed the following steps before submitting a refactoring request. options: - label: I have searched for [similar refactoring requests](https://github.com/thalissonvs/pydoll/issues) and didn't find a duplicate. required: true - label: I have reviewed the current implementation to ensure my understanding is accurate. required: true - type: textarea id: current_implementation attributes: label: Current Implementation description: Describe the current implementation and its limitations. Include file paths if known. placeholder: | The current implementation in `pydoll/module/file.py` has the following issues: 1. It uses an inefficient algorithm for... 2. The code structure makes it difficult to maintain because... validations: required: true - type: textarea id: proposed_changes attributes: label: Proposed Changes description: Describe the changes you're suggesting. Be as specific as possible. placeholder: | I suggest refactoring this code to: 1. Replace the current algorithm with X, which would improve performance by... 2. Restructure the class hierarchy to better separate concerns by... 
Example code sketch (if applicable): ```python def improved_method(): # better implementation ``` validations: required: true - type: textarea id: benefits attributes: label: Benefits description: Explain the benefits of this refactoring. placeholder: | This refactoring would: - Improve performance by X% - Make the code more maintainable by... - Reduce code complexity by... - Fix potential bugs such as... validations: required: true - type: dropdown id: impact attributes: label: API Impact description: Would this refactoring change the public API? options: - No API changes (internal refactoring only) - Minor API changes (backward compatible) - Breaking API changes validations: required: true - type: textarea id: testing_approach attributes: label: Testing Approach description: How can we verify that the refactoring doesn't break existing functionality? placeholder: | The refactoring can be tested by: - Running the existing test suite - Adding new tests for edge cases such as... - Benchmarking performance before and after - type: dropdown id: contribution attributes: label: Contribution description: Would you be willing to contribute this refactoring yourself? options: - Yes, I'd be willing to implement this refactoring - I could help with parts of the implementation - No, I don't have the capacity to implement this ================================================ FILE: .github/PULL_REQUEST_TEMPLATE/bug_fix.md ================================================ # Bug Fix Pull Request ## Related Issue(s) ## Bug Description ## Root Cause ## Solution ## Verification Steps 1. 2. 3. 
## Code Example ```python # Example code showing the fix ``` ## Before / After ## Testing ## Testing Checklist - [ ] Added regression test that would have caught this bug - [ ] Modified existing tests to account for this fix - [ ] All tests pass - [ ] Edge cases have been tested ## Impact - [ ] Low (isolated fix with no side effects) - [ ] Medium (might affect closely related functionality) - [ ] High (affects multiple areas or changes core behavior) ## Backwards Compatibility - [ ] This change is fully backward compatible - [ ] This change introduces backward incompatibilities (explain below) ## Checklist before requesting a review - [ ] My code follows the style guidelines of this project - [ ] I have performed a self-review of my code - [ ] I have added test cases that prove my fix is effective - [ ] I have run `poetry run task lint` and fixed any issues - [ ] I have run `poetry run task test` and all tests pass - [ ] My commits follow the [conventional commits](https://www.conventionalcommits.org/) style with message explaining the fix ================================================ FILE: .github/PULL_REQUEST_TEMPLATE/refactoring.md ================================================ # Refactoring Pull Request ## Refactoring Scope ## Related Issue(s) ## Description ## Motivation ## Before / After ### Before ```python # Original code ``` ### After ```python # Refactored code ``` ## Performance Impact - [ ] Performance improved - [ ] Performance potentially decreased - [ ] No significant performance change - [ ] Performance impact unknown ## Technical Debt ## API Changes - [ ] No changes to public API - [ ] Public API changed, but backward compatible - [ ] Breaking changes to public API ## Testing Strategy ## Testing Checklist - [ ] Existing tests updated - [ ] New tests added for previously uncovered cases - [ ] All tests pass - [ ] Code coverage maintained or improved ## Risks and Mitigations ## Checklist before requesting a review - [ ] My code follows the style 
guidelines of this project - [ ] I have performed a thorough self-review of the refactored code - [ ] I have commented my code, particularly in complex areas - [ ] I have updated documentation if needed - [ ] I have run `poetry run task lint` and fixed any issues - [ ] I have run `poetry run task test` and all tests pass - [ ] My commits follow the [conventional commits](https://www.conventionalcommits.org/) style ================================================ FILE: .github/PULL_REQUEST_TEMPLATE/release.md ================================================ # Release Pull Request ## Version ## Release Date ## Release Type - [ ] Major (breaking changes) - [ ] Minor (new features, non-breaking) - [ ] Patch (bug fixes, non-breaking) ## Change Summary ## Key Changes ## Breaking Changes ## Dependencies ## Deprecations - While `get_element_text()` is still supported, it is **recommended** to use the new async property `element.text`. ## Documentation ## Release Checklist - [ ] Version number updated in pyproject.toml - [ ] Version number updated in cz.yaml - [ ] CHANGELOG.md updated with all changes - [ ] All tests passing - [ ] Documentation updated - [ ] API reference updated - [ ] Breaking changes documented - [ ] Migration guides prepared (if applicable) ## Additional Release Notes ================================================ FILE: .github/pull_request_template.md ================================================ # Pull Request ## Description ## Related Issue(s) ## Type of Change - [ ] Bug fix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected) - [ ] Documentation update - [ ] Refactoring (no functional changes, no API changes) - [ ] Performance improvement - [ ] Tests (adding missing tests or correcting existing tests) - [ ] Build or CI/CD related changes ## How Has This Been Tested? 
```python # Include code examples if relevant ``` ## Testing Checklist - [ ] Unit tests added/updated - [ ] Integration tests added/updated - [ ] All existing tests pass ## Screenshots ## Implementation Details ## API Changes ## Additional Info ## Checklist before requesting a review - [ ] My code follows the style guidelines of this project - [ ] I have performed a self-review of my code - [ ] I have commented my code, particularly in hard-to-understand areas - [ ] I have made corresponding changes to the documentation - [ ] My changes generate no new warnings - [ ] I have added tests that prove my fix is effective or that my feature works - [ ] New and existing unit tests pass locally with my changes - [ ] I have run `poetry run task lint` and fixed any issues - [ ] I have run `poetry run task test` and all tests pass - [ ] My commits follow the [conventional commits](https://www.conventionalcommits.org/) style ================================================ FILE: .github/workflows/deploy-docs.yml ================================================ name: Deploy site + docs on: push: branches: [main] jobs: deploy: runs-on: ubuntu-latest steps: - name: Code Checkout uses: actions/checkout@v4 - name: Setup Python uses: actions/setup-python@v5 with: python-version: '3.x' - name: Install Dependencies run: | python -m pip install --upgrade pip pip install mkdocs mkdocs-material pymdown-extensions mkdocstrings[python] mkdocs-static-i18n # Build MkDocs into a temporary folder - name: Build MkDocs into temp folder run: mkdocs build --site-dir temp_docs # Create the final site structure - name: Prepare final site run: | mkdir -p site/docs mkdir -p site/images cp -r temp_docs/* site/docs/ cp -r public/* site/ - name: Deploy to GitHub Pages uses: peaceiris/actions-gh-pages@v3 with: github_token: ${{ secrets.GITHUB_TOKEN }} publish_dir: ./site cname: pydoll.tech ================================================ FILE: .github/workflows/mypy.yml 
================================================ name: MyPy CI on: push: branches: - '*' # matches every branch that doesn't contain a '/' - '*/*' # matches every branch containing a single '/' - '**' # matches every branch pull_request: jobs: build: runs-on: ubuntu-latest strategy: max-parallel: 4 matrix: python-version: ["3.11"] steps: - uses: actions/checkout@v2 - name: Set up Python ${{ matrix.python-version }} uses: actions/setup-python@v2 with: python-version: ${{ matrix.python-version }} - name: Install Dependencies run: | python -m pip install --upgrade pip python -m pip install mypy python -m pip install -e . python -m mypy --install-types --non-interactive pydoll - name: mypy run: python -m mypy . ================================================ FILE: .github/workflows/publish.yml ================================================ name: Publish to PyPI (Poetry) on: workflow_dispatch jobs: deploy: runs-on: ubuntu-latest steps: - name: Checkout code uses: actions/checkout@v3 - name: Set up Python uses: actions/setup-python@v4 with: python-version: "3.10" - name: Install Poetry run: | python -m pip install --upgrade pip pip install poetry - name: Configure Poetry run: poetry config pypi-token.pypi ${{ secrets.PYPI_API_TOKEN }} - name: Install dependencies run: poetry install - name: Build package run: poetry build - name: Publish to PyPI run: poetry publish ================================================ FILE: .github/workflows/release.yml ================================================ name: Release on: workflow_dispatch jobs: version-cz: runs-on: ubuntu-latest name: "Version CZ" outputs: version: ${{ steps.cz.outputs.version }} steps: - name: Checkout uses: actions/checkout@v4 with: fetch-depth: 0 token: ${{ secrets.GITHUB_TOKEN }} - id: cz name: Create bump and changelog uses: commitizen-tools/commitizen-action@master with: github_token: ${{ secrets.GITHUB_TOKEN }} - name: Print Version run: echo "Bumped to version ${{ steps.cz.outputs.version }}" 
version-pyproject: runs-on: ubuntu-latest name: "Version Pyproject" needs: version-cz outputs: version: ${{ needs.version-cz.outputs.version }} steps: - name: Checkout uses: actions/checkout@v4 with: fetch-depth: 0 token: ${{ secrets.GITHUB_TOKEN }} - name: Install Poetry run: | curl -sSL https://install.python-poetry.org | python3 - export PATH="$HOME/.local/bin:$PATH" - name: Update Poetry version in pyproject.toml run: | git config --global user.name "github-actions[bot]" git config --global user.email "github-actions[bot]@users.noreply.github.com" poetry version "${{ needs.version-cz.outputs.version }}" git add pyproject.toml git commit -m "Update pyproject.toml to version ${{ needs.version-cz.outputs.version }}" git pull --rebase git push - name: Update poetry.lock continue-on-error: true run: | git config --global user.name "github-actions[bot]" git config --global user.email "github-actions[bot]@users.noreply.github.com" poetry lock git add poetry.lock git commit -m "Update poetry.lock" git pull --rebase git push release: name: Release needs: version-pyproject runs-on: ubuntu-latest steps: - name: Create Release uses: softprops/action-gh-release@v1 with: draft: false prerelease: false generate_release_notes: true tag_name: ${{ needs.version-pyproject.outputs.version }} env: GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} ================================================ FILE: .github/workflows/ruff-ci.yml ================================================ name: Ruff CI on: push: branches: - '*' # matches every branch that doesn't contain a '/' - '*/*' # matches every branch containing a single '/' - '**' # matches every branch pull_request: jobs: build: runs-on: ubuntu-latest strategy: max-parallel: 4 matrix: python-version: ["3.11"] steps: - uses: actions/checkout@v2 - name: Set up Python ${{ matrix.python-version }} uses: actions/setup-python@v2 with: python-version: ${{ matrix.python-version }} - name: Install Dependencies run: | python -m pip install --upgrade 
pip python -m pip install ruff==0.7.1 - name: ruff run: python -m ruff check . ================================================ FILE: .github/workflows/tests.yml ================================================ name: PyDoll Tests on: push: pull_request: jobs: tests: strategy: fail-fast: false matrix: os: [ubuntu-latest, windows-latest] python-version: ["3.10", "3.11", "3.12", "3.13"] runs-on: ${{ matrix.os }} steps: - uses: actions/checkout@v3 - name: Set up Python ${{ matrix.python-version }} uses: actions/setup-python@v4 with: python-version: ${{ matrix.python-version }} - name: Install dependencies run: | python -m pip install poetry poetry install - name: Install Chrome uses: browser-actions/setup-chrome@v1 with: chrome-version: 132 - name: Run tests with coverage run: | poetry run pytest -s -x --cov=pydoll -vv --cov-report=xml - name: Upload coverage to Codecov uses: codecov/codecov-action@v5 with: file: ./coverage.xml flags: tests name: PyDoll Tests fail_ci_if_error: true token: ${{ secrets.CODECOV_TOKEN }} ================================================ FILE: .gitignore ================================================ __pycache__/ *.py[cod] *$py.class # C extensions *.so # Distribution / packaging .Python build/ develop-eggs/ dist/ downloads/ eggs/ .eggs/ lib/ lib64/ parts/ sdist/ var/ wheels/ share/python-wheels/ *.egg-info/ .installed.cfg *.egg MANIFEST # PyInstaller # Usually these files are written by a python script from a template # before PyInstaller builds the exe, so as to inject date/other infos into it. 
*.manifest *.spec # Installer logs pip-log.txt pip-delete-this-directory.txt # Unit test / coverage reports htmlcov/ .tox/ .nox/ .coverage .coverage.* .cache nosetests.xml coverage.xml *.cover *.py,cover .hypothesis/ .pytest_cache/ cover/ # Translations *.mo *.pot # Django stuff: *.log local_settings.py db.sqlite3 db.sqlite3-journal # Flask stuff: instance/ .webassets-cache # Scrapy stuff: .scrapy # Sphinx documentation docs/_build/ # PyBuilder .pybuilder/ # Jupyter Notebook .ipynb_checkpoints # IPython profile_default/ ipython_config.py # pyenv # For a library or package, you might want to ignore these files since the code is # intended to run in multiple environments; otherwise, check them in: # .python-version # pipenv # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. # However, in case of collaboration, if having platform-specific dependencies or dependencies # having no cross-platform support, pipenv may install dependencies that don't work, or not # install all needed dependencies. #Pipfile.lock # poetry # Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control. # This is especially recommended for binary packages to ensure reproducibility, and is more # commonly ignored for libraries. # https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control #poetry.lock # pdm # Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control. #pdm.lock # pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it # in version control. # https://pdm.fming.dev/latest/usage/project/#working-with-version-control .pdm.toml .pdm-python .pdm-build/ # PEP 582; used by e.g. 
github.com/David-OConnor/pyflow and github.com/pdm-project/pdm __pypackages__/ # Celery stuff celerybeat-schedule celerybeat.pid # SageMath parsed files *.sage.py # Environments .env .venv env/ venv/ ENV/ env.bak/ venv.bak/ # Spyder project settings .spyderproject .spyproject # Rope project settings .ropeproject # mkdocs documentation /site # mypy .mypy_cache/ .dmypy.json dmypy.json # Pyre type checker .pyre/ # pytype static type analyzer .pytype/ # Cython debug symbols cython_debug/ # PyCharm # JetBrains specific template is maintained in a separate JetBrains.gitignore that can # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore # and can be added to the global gitignore or merged into this file. For a more nuclear # option (not recommended) you can uncomment the following to ignore the entire idea folder. #.idea/ .czrc .ruff_cache/ # Dev test file dev_test_file.py ================================================ FILE: .python-version ================================================ 3.12.5 ================================================ FILE: CHANGELOG.md ================================================ ## 2.21.3 (2026-03-14) ### Fix - **test**: improve OOPIF integration test reliability - **iframe**: resolve nested OOPIF iframes inside shadow roots ## 2.21.2 (2026-03-12) ### Fix - release commit ## 2.21.1 (2026-03-03) ### Fix - **keyboard**: send correct key, code and keycode in type_text - **elements**: fix humanized interactions inside iframes - humanized scroll overshoot correction causes infinite loop ## 2.21.0 (2026-03-01) ### Feat - **interactions**: change humanize default from True to False ### Fix - **elements**: forward humanize flag to click in type_text ## 2.20.2 (2026-02-18) ### Fix - **command**: increase default command timeout from 10s to 60s across multiple components - **tab**: remove temporary flag to avoid duplicate callback removal ## 2.20.1 (2026-02-16) ### Fix - **tab**: replace readyState polling with 
CDP events in navigation ## 2.20.0 (2026-02-13) ### Feat - **mouse**: add timing property for runtime configuration - **requests**: add record() and replay() to Request class - **requests**: add HAR network recorder - **protocol**: add HAR 1.2 type definitions ### Fix - **requests**: use surgical callback removal instead of nuclear clear_callbacks ### Refactor - **tab**: extract bundle static methods to utils module ## 2.19.0 (2026-02-12) ### Feat - **interactions**: default humanize=True for keyboard type_text - **elements**: integrate Mouse API into WebElement.click() - **interactions**: add Mouse API with humanized simulation - **browser**: add webrtc_leak_protection property to ChromiumOptions - **browser**: add automatic User-Agent consistency override ### Fix - **utils**: harden SOCKS5 proxy forwarder security and robustness ## 2.18.0 (2026-02-11) ### Feat - **utils**: add SOCKS5 proxy forwarder and convert utils to package - **elements**: add cross-iframe selector support for XPath and CSS ## 2.17.0 (2026-02-08) ### Feat - **tab**: refactor cloudflare bypass to use shadow root traversal - **elements**: add shadow root timeout, CSS restriction and context propagation - **tab**: add find_shadow_roots with OOPIF traversal and timeout - **elements**: add shadow DOM support ### Fix - **docs**: replace shadow.find() with query() in all documentation - **tests**: replace shadow.find() with query() in integration tests - **elements**: use float timeout and add contextual WaitElementTimeout messages ## 2.16.0 (2026-02-06) ### Feat - add clear method for input and enhance page load state handling ### Fix - **browser**: support secure websocket connections ## 2.15.1 (2026-01-04) ### Fix - filter Symbol properties from element query results ## 2.15.0 (2025-12-24) ### Feat - Implement incognito mode cookie retrieval for `tab.get_cookies()` and update related documentation ### Fix - inconsistence in type checking - Dispatch `KEY_DOWN` and `KEY_UP` events for character 
typing ## 2.14.0 (2025-12-10) ### Feat - get_tab_by_target method added - get_tab_by_target method added ### Fix - adding type: ignore in JavascriptDialogOpeningEvent object - adding type: ignore in JavascriptDialogOpeningEvent object ## 2.13.1 (2025-12-07) ### Fix - add stuck scroll detection and minimum flick distance to humanized scroll, and correct scroll distance calculation. ## 2.13.0 (2025-12-07) ### Feat - Implement humanized keyboard typing and physics-based scroll, and add iframe interaction support. ## 2.12.4 (2025-11-29) ### Fix - optimize iframe resolution logic by adjusting backend node ID checks and enhancing child frame handling - refine OOPIF resolution and frame attachment logic for improved handling of backend node IDs - enhance OOPIF target attachment logic for improved session handling ## 2.12.3 (2025-11-27) ### Fix - improve frame retrieval logic for better session handling ## 2.12.2 (2025-11-19) ### Fix - adjust find_elements_mixin.py to refine return types and defaults ## 2.12.1 (2025-11-14) ### Fix - continue cleanup process if temporary directory still exists - adjust sleep duration for Windows and enhance temp dir cleanup - enhance error handling for locked files on Windows systems - remove unnecessary retry_times parameter in file processing - ensure temp directory cleanup handles Chromium locked files - enhance element selection and text extraction for better stability - handle oopif targets - change way to interact with iframes ### Refactor - refactor iframe context handling in FindElementsMixin class ### Perf - update Chrome options for better memory management and stability ## 2.12.0 (2025-11-04) ### Feat - **execute_script**: validate element argument usage - **tab,element,chrome**: revert arguments and add Chromium paths - add a retry decorator for handling function execution failures ### Fix - import TopLevelTargetRequired in test_browser_tab.py - allow one additional retry attempt in the retry decorator ### Refactor - 
**tab,element**: simplify execute_script parameters - **element**: move and enhance execute_script from tab - **tab**: separate execute_script concerns and enhance with comprehensive options ## 2.11.0 (2025-11-02) ### Feat - add input handling functions and key constants for editing - add KeyboardAPI for simulating keyboard input actions - add KeyboardAPI integration for enhanced keyboard control ### Fix - enhance text insertion and deprecate legacy key methods ## 2.10.0 (2025-11-01) ### Feat - add ScrollAPI for enhanced page scrolling capabilities ## 2.9.3 (2025-10-30) ### Refactor - keep take_screenshot consistent - refactor type hints for better clarity and future compatibility ## 2.9.2 (2025-10-19) ### Fix - update process creation to capture output and clean proxy format - preserve query and fragment in WebSocket URL for tabs ### Refactor - remove debug logging for request status and network events - refactor logger messages to use consistent single quotes - fix merge conflicts - add logging for browser lifecycle and context management events - refactor proxy parsing logic for improved clarity and efficiency ## 2.9.1 (2025-10-15) ### Fix - change download event handling to use PageEvent instead of BrowserEvent ### Refactor - use early return in setup proxy method ## 2.9.0 (2025-10-05) ### Feat - add configurable page load state ## 2.8.2 (2025-10-03) ### Fix - implement proxy authentication handling for browser tabs - map exception when try to take screenshot of an iframe ## 2.8.1 (2025-09-27) ### Fix - store the opened tab in the _tabs_opened dictionary - **elements**: correctly detect parenthesized XPath expressions ### Refactor - simplify FindElementsMixin._get_expression_type startswith checks into single tuple ## 2.8.0 (2025-08-28) ### Feat - adding get_siblings_elements method - adding get_children_elements method - refactor Tab class to support optional WebSocket address handling - add WebSocket connection support for existing browser instances - add 
optional WebSocket address support in connection handler ### Fix - add get siblings and get childen methods a raise_exc option - improving children and parent retrive docstring and creating a private generic method for then - using new execute_script public method - solving conflicts - rename pages fixtures files and adding a error test ### Refactor - refactor Tab class to improve initialization and error handling - refactor Browser class to manage opened tabs and WebSocket setup - add new exception classes for connection and WebSocket errors ## 2.7.0 (2025-08-22) ### Feat - refactor WebElement methods to use a unified naming convention - add Response type and new bring_to_front method to Tab class - improve element interactability scripts ### Fix - **browser**: add google-chrome-stable path for Arch Linux AUR package - run actions to fix badges - enforce combined condition logic in wait_until - **web_element**: raise WaitElementTimeout on wait_until timeout ### Refactor - update command responses to use Response for empty responses - **webelement**: simplify wait_until condition mapping ## 2.6.0 (2025-08-10) ### Feat - add DownloadTimeout exception for file download timeouts - add context manager for handling file downloads in Tab class ### Refactor - add type checking for connection handler in mixin class - add type overloads for event callback in Browser class ## 2.5.0 (2025-08-07) ### Feat - add HTTP client functionality using the browser's fetch API - add HTTP response object for browser-based fetch requests - implement Request class for HTTP requests using fetch API - add Request handling and improve network log retrieval methods ### Fix - reject cookies with empty names during parsing in Request class - refactor imports to include NotRequired and TypedDict from typing_extensions - update imports to use typing_extensions for compatibility reasons - check for None in events_enabled before updating params - remove unused event type aliases and clean up imports 
### Refactor - depreciating headless argument in start method and adding it in to browser options properties - add asynchronous function for makeRequest in JavaScript - refactor imports for cleaner organization and improved clarity - refactor type hints in FindElementsMixin for clarity and type safety - refactor type hints and improve command method signatures - refactor event handling to use specific event types for clarity - refactor connection handler to use CDPEvent and typed commands - refactor storage command methods to return specific command types - refactor target command methods to use specific command types - refactor command return types to specific command classes - refactor page commands to use specific command types directly - refactor network commands to use specific command types - refactor input command methods to return specific command types - refactor fetch_commands to use updated type definitions - refactor enums to inherit from str for better compatibility - refactor DOM command types for improved code clarity and structure - refactor command and event parameter types for better typing - refactor command responses to use EmptyResponse where applicable - improve protocol types for target domain - improve protocol types for storage domain - refactor command response types for improved readability and consistency - improve protocol types for page domain - add IncludeWhitespace and RelationType enums to DOM types - improve protocol types for input domain - refactor AuthChallengeResponse and remove legacy definitions - remove legacy WindowBoundsDict for cleaner type definitions - add new TypedDicts and enums for runtime event parameters - refactor DOM event types and methods for better clarity and structure - refactor fetch command return types for better clarity and structure - enhance browser command functionality with new methods and types - add TypedDict and Enum definitions for emulation and debugging - improve protocol types for network 
domain ## 2.4.0 (2025-08-01) ### Feat - changing bool prefs to properties and adding support to user-data-dir preferences - adding prefs options customization - add overloads for find and query methods in FindElementsMixin - add method to retrieve parent element and its attributes - implements start_timeout option ### Fix - adding typehint and fixing some codes - removing options preferences private attributes - set default URL to 'about:blank' in create_target method - change navigation when creating a new tab - add type hinting support and update project description ### Refactor - remove redundant asterisk from find method overloads and reorganize query method overloads - refine type hint for response parameter and improve key check ## 2.3.1 (2025-07-12) ### Fix - refactor click_option_tag to use direct script reference - update script to use closest for more reliable DOM selection - improve selection script for higher accuracy - use correct class name and id selector in query() - add fetch command methods to handle request processing ### Refactor - change body type from dict to string in fetch command parameters - refactor continue_request and fulfill_request to use options - enhance continue_request and fulfill_request with new options ## 2.3.0 (2025-06-25) ### Feat - **connection**: Upgrade adapt websockets version to 14.0 ### Fix - refine selector condition to include attributes check ## 2.2.3 (2025-06-20) ### Fix - fix contextmanager for file upload ## 2.2.2 (2025-06-18) ### Fix - fix call_function_on parameters order ### Refactor - replace BeautifulSoup with custom HTML text extractor ## 2.2.1 (2025-06-16) ### Fix - fix call parameters order in call_function_on method ## 2.2.0 (2025-06-15) ### Feat - add method to retrieve non-extension opened tabs as Tab instances ### Refactor - refactor attribute assignments to include type annotations - implement singleton pattern for Tab instances by target_id ## 2.1.0 (2025-06-14) ### Feat - add new script-related 
exception classes for better handling - add functions to clean scripts and check return statements - add methods to retrieve network response body and logs ### Fix - click in the input before typing and fix documentation ### Refactor - add overloads for execute_script to improve type safety ## 2.0.1 (2025-06-08) ### Fix - fix private proxy configuration ## 2.0.0 (2025-06-08) ### BREAKING CHANGE - pydoll v2 finished ### Feat - intuitive way to interact with iframes - refactor Keys class to Key and add utility methods for enums - add Event TypedDict for standardized event structure - add TargetEvent enum for Chrome DevTools Protocol events - add StorageEvent enumeration for Chrome DevTools Protocol events - add RuntimeEvent enumeration for Chrome DevTools Protocol events - add PageEvent enumeration for Chrome DevTools Protocol events - add NetworkEvent enumeration for Chrome DevTools Protocol events - add InputEvent enum for Chrome DevTools input events - add FetchEvent enumeration for Chrome DevTools Protocol events - add DomEvent enumeration for Chrome DevTools Protocol events - add BrowserEvent enum for Chrome DevTools protocol events - add methods to enable and disable the runtime domain commands - add new enums for whitespace, axes, pseudo types, and modes - add DOM response types and corresponding response classes - add DOM command types and parameter definitions for pydoll - add enums for key, mouse, touch, and drag event types - add input command types for touch, mouse, and keyboard events - enhance TargetCommands class with new methods for targets management - add TypedDicts for target response types and browser contexts - add TypedDict definitions for target command parameters - add storage-related enumerations for bucket durability and types - enhance StorageCommands with new methods for data management - add storage response types and related classes for handling data - add storage command types using TypedDict for structured params - add new enumeration 
classes for serialization and object types - add runtime response types for handling various object previews - add initial runtime command types for protocol handling - add constants for various encoding, formats, and policies - add TypedDict definitions for page response types and results - add typed dictionaries for various page command parameters - add new command parameter classes for network resource handling - add TypedDict definitions for network response types - organize command types into structured imports and exports - add network command types and parameters for cookie management - add enums for cookie priorities, connection types, and encodings - add response classes for browser window target retrieval - setup mkdocs and install related packages - add async text property for retrieving element text ### Fix - remove target directory from .gitignore file - fix typo in USB_UNRESTRICTED constant for consistency - add new network command parameters and methods for cookies - change postData type from dict to string in ContinueRequestParams ### Refactor - refactor screenshot path handling and enhance error checking - refactor type hints from List to built-in list for consistency - refine XPath condition handling and ensure integer coordinates - refactor condition checks to ensure against None values - refactor exception handling and add browser path validation function - rename BrowserOptionsManager to ChromiumOptionsManager - refactor Edge class to use ChromiumOptionsManager and simplify path validation - refactor Chrome class to use Chromium-specific options manager - refactor Browser class to use options manager and improve methods - refactor Options class to ChromiumOptions and use type hints - refactor to create ChromiumOptionsManager for better clarity - add abstract base classes for browser options management - use `message.get('id')` for safer ID checks in response - refactor message handling to support multiple message types - refactor element 
finding methods for enhanced flexibility and clarity - rename method for better clarity in captcha element handling - refactor type hints for event callback parameters and options - simplify ping call by inlining WebSocketClientProtocol cast - refactor EventsManager to use typed Event objects consistently - add runtime events management to the Tab class functionality - update event callback signatures for better type handling - remove unused import of Response in runtime_commands.py - add Response import to page_commands for improved functionality - refactor response classes to use TypedDict for better typing - refactor WebElement class to organize exception imports clearly - refactor exception handling in FindElementsMixin class - refactor exception handling to use custom timeout and connection errors - remove unused import statements in events_manager.py - refactor error handling to use specific exceptions for clarity - refactor error handling to use custom exception for arguments - fix PermissionError raising in TempDirectoryManager class - refactor error handling to use specific exceptions for clarity - handle unsupported OS with a custom exception in Edge class - raise UnsupportedOS exception for unsupported operating systems - refactor browser error handling and improve method return types - refactor exception classes to improve organization and clarity - refactor element finding methods to use updated command structure - refactor WebElement class for improved structure and clarity - refactor import statements and clean up code formatting - refactor command imports and enhance download behavior method - refactor Tab import and update FetchCommands method calls - refactor ConnectionHandler docstrings for clarity and conciseness - refactor command and event managers for improved type safety - refactor ConnectionHandler to improve WebSocket management and clarity - add Tab class for managing browser tabs via CDP integration - enhance TempDirectoryManager with 
detailed docstrings and type hints - refactor ProxyManager to enhance proxy credential handling - refactor Browser class to enhance automation capabilities and structure - move commands to a different module - define base structures for commands and responses in protocol - import Rect from dom_commands_types for response handling - refactor cookie-related types for improved clarity and consistency - remove unnecessary whitespace in docstring of InputCommands class - refactor DOM commands to improve structure and add functionality - refactor InputCommands to enhance user input simulation methods - add CookieParam TypedDict to define cookie attributes - add new runtime command methods for JavaScript bindings and promises - remove unused method to clear accepted encodings in network commands - update ResetPermissionsParams to use NotRequired for context ID - refactor PageCommands to improve structure and add type hints - simplify import statements by using wildcard imports for responses - add new response types and update existing response classes - consolidate command imports using wildcard imports for clarity - correct post_data type from dict to str in FetchCommands class - refactor NetworkCommands to use structured command parameters - refactor fetch command methods to use static methods directly - refactor BrowserCommands to use static methods and improve clarity - refactor response imports and update __all__ definitions - refactor import statements for better readability and structure - refactor import statements for consistency in response types - refactor import and rename EnableParams to FetchEnableParams - refactor import statement for CommandParams module path - refactor fetch command templates to use Command class - add enums for window states, download behaviors, and permissions - remove unused enum imports and rename base_types module - refactor command structures for better organization and clarity - rename command and response modules for better 
clarity - refactor imports for better organization and readability - add browser command methods for version, permissions, and downloads - add command and response types for protocol implementation - refactor execute_command to use type annotations for clarity - refactor command methods to specify response types in BrowserCommands - refactor command structures and introduce base CommandParams class - refactor browser command constants to use Command class type - refactor connection imports and rename manager files for clarity - refactor BrowserType import to a common constants module - refactor browser modules to use the new chromium structure - refactor element imports and remove deprecated element file - refactor import paths to use the protocol submodule structure - move command files to the protocol directory for better structure - rename insert_text to paste_text and remove unused files - refactor the `InputCommands` class to enhance clarity and simplicity in its operations - add deprecation warning to get_element_text() ## 1.7.0 (2025-04-06) ### Feat - refactor captcha handling with adjustable wait times and parameters ## 1.6.0 (2025-04-06) ### Feat - add connect method to handle existing port scenarios - create enable_auto_solve_cloudflare_captcha method - add context manager to bypass Cloudflare Turnstile captcha ## 1.5.1 (2025-03-31) ### Fix - handle headers input as list or dictionary in fetch command ## 1.5.0 (2025-03-26) ### Feat - add flag to run browser on headless mode on start function ### Fix - Wait for the file `CrashpadMetrics-active.pma` to be deoccupied and cleaned up - Catch websockets.ConnectionClosed errors on duplicate close() - move connection closed log inside if statement ## 1.4.0 (2025-03-23) ### Feat - Update initialize_options method to allow optional browser_type parameter - Refactor Edge browser options handling to use EdgeOptions class - Supports initialization options based on browser type - Edge browser constructors to support 
optional connection port parameters - Add Microsoft Edge browser support - Add default user data directory support for the Edge browser - Add Microsoft Edge browser support ### Refactor - Clean up imports and improve code formatting across browser modules - Simplify user data directory setup and enhance Edge browser path handling ## 1.3.3 (2025-03-18) ### Fix - solve browser invalid domain events issue - improve process termination - improve process management and deactivate websockets connection size limit ### Refactor - import commands and events from __init__.py ## 1.3.2 (2025-03-13) ### Fix - fixed the tests and used lint for the OS multi path support - support multiple default Chrome paths on each OS ## 1.3.1 (2025-03-12) ### Fix - remove unnecessary encoding from screenshot response data ## 1.3.0 (2025-03-12) ### Feat - add method to retrieve screenshot as base64 encoded string ## 1.2.4 (2025-03-11) ### Fix - refactor Chrome constructor to use Optional for parameters ## 1.2.3 (2025-03-11) ### Fix - refactor proxy configuration retrieval for cleaner code flow ## 1.2.2 (2025-03-10) ### Fix - Get file extension from file path and changes use of reserved word 'format' to 'fmt' ## 1.2.1 (2025-03-09) ### Fix - resolve issue #29 where browser path was not found on macOS - Quickstart code given in README is wrong ## 1.2.0 (2025-02-11) ### Feat - add close method and command to Page class functionality ## 1.1.0 (2025-02-11) ### Feat - add method to retrieve Page instance by its ID in Browser class ## 1.0.1 (2025-02-10) ### Fix - add dialog property to ConnectionHandler and manage dialog state ## 1.0.0 (2025-02-05) ### BREAKING CHANGE - now you'll have to use By.CSS_SELECTOR instead of By.CSS ### Feat - refactor import and export statements for better readability - update changelog for version 0.7.0 and fix dependency versions - add ping method to ConnectionHandler for browser connectivity check - add tests for BrowserCommands in test_browser_commands.py ### Fix - add initial module files for commands, 
connection, events, and mixins - add connection port parameter to Chrome browser initialization - use deepcopy for templates to prevent mutation issues ### Refactor - rename constant CSS to CSS_SELECTOR - add command imports and remove obsolete connection handler code - refactor methods to be static in ConnectionHandler class - refactor proxy configuration and cleanup logic in Browser class - refactor ConnectionHandler to improve WebSocket management logic - refactor Browser class initialization for better clarity and structure - refactor Browser initialization to enhance flexibility and defaults - refactor import statement for ConnectionHandler module - refactor import paths for ConnectionHandler in browser modules - implement ConnectionHandler for WebSocket browser automation - implement command and event management for asynchronous processing - remove unnecessary logging for WebSocket address fetching - refactor Chrome class to use BrowserOptionsManager for path validation - implement proxy and browser management in the new managers module - refactor Browser class to use manager classes for better structure - refactor DOM command scripts for clarity and efficiency ## 0.7.0 (2024-12-09) ### Feat - autoremove dialog from connection_handler when closed - add handle_dialog method to PageCommands class - add dialog handling methods to Page class - add support for handling JavaScript dialog opening events - refactor network response handling for base64 encoding support - add clipping option for screenshots and implement element capture ### Fix - index error on method get_dialog_message - update screenshot format from 'jpg' to 'jpeg' for consistency - handle potential IndexError when retrieving valid page targetId - filter valid pages using URL condition instead of title check ### Refactor - run ruff formatter to ensure code consistency - run ruff formatter to ensure code consistency - change screenshot format from PNG to JPG in commands and element ## 0.6.0 
(2024-11-18) ### Feat - add callback ID handling for page load events in Page class - update event registration to return callback IDs and add removal - refactor DOM commands to use object_id instead of node_id ### Fix - refactor page navigation and loading logic for efficiency - add page reload after navigating to a new URL in Page class - refactor URL navigation to use evaluate_script for efficiency - implement page refresh on URL unchanged and add navigation event - update object ID reference in Page class for clarity - refactor element search logic to simplify error handling - DomCommands using `object_id` instead of `node_id` to prevent bugs - handle OSError when cleaning up temporary directories in Browser ### Refactor - change error log to warning for missing callback ID - refactor DOM command scripts for improved readability and reuse - rename methods for clarity and consistency in WebElement class - refactor parameter names for consistency in target methods - normalize variable naming for consistency in fetch commands ## 0.5.1 (2024-11-12) ### Fix - simplify outer HTML retrieval for consistent object handling - refactor click method to check option tag earlier in flow - refactor bounding box retrieval to access nested response value - handle KeyError instead of IndexError for element bounds retrieval - enhance DOM command methods and rename for clarity and consistency - add JavaScript bounding box retrieval for web elements - remove redundant top-checks for element clicks in WebElement ## 0.5.0 (2024-11-11) ### Feat - add method to generate command for calling a function on an object - implement script execution and visibility checks in click method - add JavaScript functions for element visibility and interaction ### Refactor - enhance exception classes with descriptive error messages - simplify command creation by using RuntimeCommands.evaluate_script - refactor JavaScript execution and introduce runtime commands ## 0.4.4 (2024-11-11) ### Fix - remove 
redundant DOM content loaded event handling logic

## 0.4.3 (2024-11-11)

### Fix

- rename event variables for clarity and improve timeout handling

### Refactor

- remove debug print statement from connection event handling

## 0.4.2 (2024-11-11)

### Fix

- update event handling to use DOM_CONTENT_LOADED for page load
- convert Browser context management to async methods

### Refactor

- fix string formatting in logger info message for clarity

## 0.4.1 (2024-11-08)

### Fix

- fix workflow by removing unnecessary hyphen
- reduce sleep duration in key press handling for improved speed

## 0.4.0 (2024-11-08)

### Feat

- add type_keys method for realistic key input simulation

## 0.3.1 (2024-11-08)

### Fix

- add new package version
- remove UTF-8 encoding in get_pdf_base64

## 0.3.0 (2024-11-08)

### Feat

- set_download_path added in browser class methods

## 0.2.0 (2024-11-08)

### Feat

- dynamic lib version using pyproject

## 0.1.1 (2024-11-07)

### Fix

- ensure browser process terminates after executing close command

## 0.1.0 (2024-11-07)

### Feat

- add method to delete all cookies from the browser session
- add is_enabled property to check element's enabled status
- add option to raise exception in wait_element method
- add method to set browser download path via command
- refactor text extraction using BeautifulSoup for accuracy
- add method to get properties and improve XPath handling
- refactor text retrieval methods and improve code readability
- add timeout parameter to page navigation and loading methods
- add cookie management and scroll into view functionality
- add method to retrieve page PDF data as base64 string
- add async property to retrieve inner HTML of the element
- add async page_source property to retrieve page source code
- add async property to retrieve the current page URL
- add method to find multiple DOM elements using selectors
- refactor WebElement to use FindElementsMixin for clarity
- add FindElementsMixin for asynchronous DOM element handling
- add
methods to retrieve network response bodies from logs - add method to retrieve matching network logs from the page - add cookie management methods to the Browser class - add ElementNotFound exception to handle missing elements - add value property and handle option tag clicks in WebElement - rename FIND_ELEMENT_XPATH_TEMPLATE to EVALUATE_TEMPLATE - add exception handling for element not found in find_element method - downgrade Python version requirement to 3.10 in pyproject.toml - add async function to fetch browser WebSocket address - simplify text input handling by using insert_text command - add TargetCommands class for managing target operations - add method to generate command for disabling the Page domain - add method to generate text insertion commands for inputs - add Page class to manage browser page interactions and events - add page management methods to the Browser class - add detailed logging for command responses and event handling - add event classes for browser, DOM, fetch, and network actions - add NetworkCommands class for managing network operations - implement fetch command methods for handling requests and responses - add method to enable DOM domain events in DomCommands class - add proxy configuration and fetch event handling to Browser - refactor connection errors to use custom exceptions for clarity - add methods to clear callbacks and close WebSocket connection - remove unnecessary newline at the end of PageEvents class file - add context managers and async file handling for efficiency - implement singleton pattern and prevent multiple initializations - add dynamic connection port handling for browser instance - add temporary directory management for browser session storage - add logging for connection events and command executions - add PageEvents class with PAGE_LOADED event constant - add temporary callback option to event registration method - add page event handling and improve loading timeout management - add utility function to 
decode base64 images to bytes
- add WebElement class for handling browser elements asynchronously
- add enumeration for selector types in constants module
- add PageCommands class for browser page control functions
- add InputCommands class for handling mouse and keyboard events
- implement DOM commands for interacting with web elements
- refactor BrowserCommands to include new window management methods
- implement some basic methods to navigate and control the browser instance
- enhance ConnectionHandler with detailed docstrings for methods
- add .gitignore, .python-version, and poetry.lock files

### Fix

- browser context now uses the storage commands to get cookies, while the page context uses the network commands
- update cookie retrieval to use NetworkCommands for consistency
- remove download path method from Browser and add to Page class
- add options to disable first-run and browser check flags
- handle KeyError when retrieving network response bodies
- use get() to safely retrieve attributes in WebElement class
- rename class attribute retrieval for clarity and consistency
- enhance get_properties and simplify text retrieval method
- enhance create_web_element call with additional value parameter
- fix incorrect key access in JavaScript evaluation result
- update cookie management to clear browser cookies correctly
- filter pages by title instead of URL in Browser class
- filter out non-page entries when fetching valid page IDs
- xpath element solved
- refactor event callback storage to use unique callback IDs
- add JavaScript execution method and enhance click offsets
- simplify response handling and improve event callback structure
- reorder page event enabling to ensure proper browser startup
- add JSON handling and improve WebSocket command execution

### Refactor

- improve WebElement representation and handle None for nodeValue
- add newline at end of file for ElementNotFound exception class
- remove unused aiohttp import and clean up
whitespace - remove unnecessary blank lines in storage.py for clarity - fix missing newline at the end of the file in page.py - remove unnecessary whitespace in InputCommands class methods - refactor DOM command methods for improved clarity and usability - refactor Page class to inherit from FindElementsMixin - refactor code to remove duplicate import of StorageCommands - clarify error messages for command and callback validation - refactor ConnectionHandler to simplify initialization and connect logic - remove unnecessary whitespace in element.py for cleaner code - refactor WebElement to enhance attribute retrieval methods - refactor connection handling and improve error messaging - refactor Browser class to use abstract base class and commands ================================================ FILE: CONTRIBUTING.md ================================================ # Contributing Guide Thank you for your interest in contributing to the project! This document provides guidelines and instructions to help you contribute effectively. ## Table of Contents - [Environment Setup](#environment-setup) - [Development Workflow](#development-workflow) - [Code Standards](#code-standards) - [Testing](#testing) - [Commit Messages](#commit-messages) - [Pull Request Process](#pull-request-process) ## Environment Setup ### Prerequisites - Python 3.10 or higher - [Poetry](https://python-poetry.org/docs/#installation) for dependency management ### Installation 1. Clone the repository: ```bash git clone [REPOSITORY_URL] cd pydoll ``` 2. Install dependencies using Poetry: ```bash poetry install ``` 3. Activate the virtual environment: ```bash poetry shell ``` ## Development Workflow 1. Create a new branch for your contribution: ```bash git checkout -b feature/your-feature-name ``` or ```bash git checkout -b fix/your-fix-name ``` 2. Make your changes following the code and testing guidelines. 3. Check your code using the linter: ```bash poetry run task lint ``` 4. 
Format your code:
   ```bash
   poetry run task format
   ```

5. Run the tests to ensure everything is working:
   ```bash
   poetry run task test
   ```

6. Commit your changes following the commit conventions (see below).

7. Push your changes and open a Pull Request.

## Code Standards

This project uses [Ruff](https://github.com/charliermarsh/ruff) for linting and code formatting. The code standards are defined in the `pyproject.toml` file.

### Linting and Formatting

To check if your code follows the standards:

```bash
poetry run task lint
```

To automatically fix some issues and format your code:

```bash
poetry run task format
```

**Important:** Make sure to resolve all linting issues before submitting your changes. Code that doesn't pass the linting checks will not be accepted.

## Testing

### Writing Tests

For each new feature or modification, it is **mandatory** to write corresponding tests. We use `pytest` for testing.

- Tests should be placed in the `tests/` directory
- Test file names should start with `test_`
- Test function names should start with `test_`

### Running Tests

To run all tests:

```bash
poetry run task test
```

This will also generate a code coverage report (HTML) that can be viewed in the `htmlcov/` folder.

## Commit Messages

This project follows the [Conventional Commits](https://www.conventionalcommits.org/) standard for commit messages. We use the `commitizen` tool to facilitate the creation of standardized commits.

### Commit Message Structure

```
<type>[optional scope]: <description>

[optional body]

[optional footer(s)]
```

### Commit Types

- **feat**: A new feature
- **fix**: A bug fix
- **docs**: Documentation-only changes
- **style**: Changes that do not affect the meaning of the code (whitespace, formatting, etc.)
- **refactor**: A code change that neither fixes a bug nor adds a feature - **perf**: A code change that improves performance - **test**: Adding or correcting tests - **build**: Changes that affect the build system or external dependencies - **ci**: Changes to CI configuration files - **chore**: Other changes that don't modify src or test files ### Examples of Good Commit Messages ``` feat(parser): add ability to parse arrays ``` ``` fix(networking): resolve connection timeout issue A problem was identified in the networking library that caused unexpected timeouts. This change increases the default timeout from 10s to 30s. ``` ## Pull Request Process 1. Verify that your code passes all tests and linting checks. 2. Push your branch to the repository. 3. Open a Pull Request to the main branch. 4. In the PR description, clearly explain what was changed and why. 5. Link any related issues to your PR. 6. Wait for the code review. Read the comments and make necessary changes. ## Questions? If you have questions or need help, open an issue in the repository or contact the project maintainers. --- We appreciate your contributions to make this project better! ================================================ FILE: LICENSE ================================================ The MIT License (MIT) Copyright © 2025 AutoscrapeLabs Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. 
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ================================================ FILE: README.md ================================================


Async-native, fully typed, built for evasion and performance.

Python >= 3.10


Pydoll automates Chromium-based browsers (Chrome, Edge) by connecting directly to the Chrome DevTools Protocol over WebSocket. No WebDriver binary, no `navigator.webdriver` flag, no compatibility issues. It combines a high-level API for common tasks with low-level CDP access for fine-grained control over network, fingerprinting, and browser behavior. The entire codebase is async-native and fully type-checked with mypy. ### Top Sponsors The Web Scraping Club Read a full review of Pydoll on The Web Scraping Club, the #1 newsletter dedicated to web scraping. ### Sponsors
Thordata CapSolver LambdaTest
[Learn more about our sponsors](SPONSORS.md) · [Become a sponsor](https://github.com/sponsors/thalissonvs) ### Why Pydoll - **Stealth-first**: Human-like mouse movement, realistic typing, and granular [browser preference](https://pydoll.tech/docs/features/configuration/browser-preferences/) control for fingerprint management. - **Async and typed**: Built on `asyncio` from the ground up, 100% type-checked with `mypy`. Full IDE autocompletion and static error checking. - **Network control**: [Intercept](https://pydoll.tech/docs/features/network/interception/) requests to block ads/trackers, [monitor](https://pydoll.tech/docs/features/network/monitoring/) traffic for API discovery, and make [authenticated HTTP requests](https://pydoll.tech/docs/features/network/http-requests/) that inherit the browser session. - **Shadow DOM and iframes**: Full support for [shadow roots](https://pydoll.tech/docs/deep-dive/architecture/shadow-dom/) (including closed) and cross-origin iframes. Discover, query, and interact with elements inside them using the same API. - **Ergonomic API**: `tab.find()` for most cases, `tab.query()` for complex [CSS/XPath selectors](https://pydoll.tech/docs/deep-dive/guides/selectors-guide/). ## Installation ```bash pip install pydoll-python ``` No WebDriver binaries or external dependencies required. ## What's New
### HAR Network Recording
Record network activity during a browser session and export as HAR 1.2. Replay recorded requests to reproduce exact API sequences.

```python
from pydoll.browser.chromium import Chrome

async with Chrome() as browser:
    tab = await browser.start()

    async with tab.request.record() as capture:
        await tab.go_to('https://example.com')

    capture.save('flow.har')
    print(f'Captured {len(capture.entries)} requests')

    responses = await tab.request.replay('flow.har')
```

Filter by resource type:

```python
from pydoll.protocol.network.types import ResourceType

async with tab.request.record(
    resource_types=[ResourceType.FETCH, ResourceType.XHR]
) as capture:
    await tab.go_to('https://example.com')
```

[HAR Recording Docs](https://pydoll.tech/docs/features/network/network-recording/)
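Because a HAR 1.2 file is plain JSON with its request/response pairs stored under `log.entries`, a saved capture can be inspected with the standard library alone. The following is a minimal sketch under that assumption; the helper name `summarize_har` and the two-entry sample are illustrative and not part of Pydoll.

```python
import json
from collections import Counter


def summarize_har(har_text: str) -> Counter:
    """Count recorded requests per HTTP method in a HAR 1.2 document."""
    har = json.loads(har_text)
    # HAR 1.2 keeps every request/response pair under log.entries.
    entries = har.get('log', {}).get('entries', [])
    return Counter(entry['request']['method'] for entry in entries)


# Hypothetical two-entry capture, standing in for the contents of 'flow.har'.
sample = json.dumps({
    'log': {
        'version': '1.2',
        'entries': [
            {'request': {'method': 'GET', 'url': 'https://example.com/'}},
            {'request': {'method': 'POST', 'url': 'https://example.com/api'}},
        ],
    }
})

print(summarize_har(sample))
```

For a real capture, the same function could be fed the text read from the saved `.har` file.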
### Page Bundles
Save the current page and all its assets (CSS, JS, images, fonts) as a `.zip` bundle for offline viewing. Optionally inline everything into a single HTML file.

```python
await tab.save_bundle('page.zip')
await tab.save_bundle('page-inline.zip', inline_assets=True)
```

[Screenshots, PDFs & Bundles Docs](https://pydoll.tech/docs/features/automation/screenshots-and-pdfs/)
### Shadow DOM Support
Full Shadow DOM support, including closed shadow roots. Because Pydoll operates at the CDP level (below JavaScript), the `closed` mode restriction doesn't apply.

```python
shadow = await element.get_shadow_root()
button = await shadow.query('.internal-btn')
await button.click()

# Discover all shadow roots on the page
shadow_roots = await tab.find_shadow_roots()
for sr in shadow_roots:
    checkbox = await sr.query('input[type="checkbox"]', raise_exc=False)
    if checkbox:
        await checkbox.click()
```

Highlights:

- Closed shadow roots work without workarounds
- `find_shadow_roots()` discovers every shadow root on the page
- `timeout` parameter for polling until shadow roots appear
- `deep=True` traverses cross-origin iframes (OOPIFs)
- Standard `find()`, `query()`, `click()` API inside shadow roots

```python
# Cloudflare Turnstile inside a cross-origin iframe
shadow_roots = await tab.find_shadow_roots(deep=True, timeout=10)
for sr in shadow_roots:
    checkbox = await sr.query('input[type="checkbox"]', raise_exc=False)
    if checkbox:
        await checkbox.click()
```

[Shadow DOM Docs](https://pydoll.tech/docs/deep-dive/architecture/shadow-dom/)
### Humanized Mouse Movement
Mouse operations produce human-like cursor movement by default:

- **Bezier curve paths** with asymmetric control points
- **Fitts's Law timing**: duration scales with distance
- **Minimum-jerk velocity**: bell-shaped speed profile
- **Physiological tremor**: Gaussian noise scaled with velocity
- **Overshoot correction**: ~70% chance on fast movements, then corrects back

```python
await tab.mouse.move(500, 300)
await tab.mouse.click(500, 300)
await tab.mouse.drag(100, 200, 500, 400)

button = await tab.find(id='submit')
await button.click()

# Opt out when speed matters
await tab.mouse.click(500, 300, humanize=False)
```

[Mouse Control Docs](https://pydoll.tech/docs/features/automation/mouse-control/)
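Two of the ingredients listed above, a cubic Bezier path and a minimum-jerk speed profile, can be sketched in plain Python. This illustrates the general technique only and is not Pydoll's internal implementation: `min_jerk`, `bezier_point`, `mouse_path`, and the `spread` parameter are hypothetical names, and Fitts's Law timing, tremor, and overshoot are left out.

```python
import random


def min_jerk(tau: float) -> float:
    """Minimum-jerk progress curve: 0 at tau=0, 1 at tau=1, bell-shaped velocity."""
    return 10 * tau**3 - 15 * tau**4 + 6 * tau**5


def bezier_point(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    x = u**3 * p0[0] + 3 * u**2 * t * p1[0] + 3 * u * t**2 * p2[0] + t**3 * p3[0]
    y = u**3 * p0[1] + 3 * u**2 * t * p1[1] + 3 * u * t**2 * p2[1] + t**3 * p3[1]
    return (x, y)


def mouse_path(start, end, steps=20, spread=80.0, seed=None):
    """Return steps + 1 points from start to end along a randomized curve."""
    rng = random.Random(seed)
    dx, dy = end[0] - start[0], end[1] - start[1]
    # Asymmetric control points: one early with a wide offset, one late and tighter.
    c1 = (start[0] + 0.3 * dx + rng.uniform(-spread, spread),
          start[1] + 0.3 * dy + rng.uniform(-spread, spread))
    c2 = (start[0] + 0.8 * dx + rng.uniform(-spread / 2, spread / 2),
          start[1] + 0.8 * dy + rng.uniform(-spread / 2, spread / 2))
    # Sampling the curve at min_jerk(i / steps) places points close together
    # near both ends and far apart in the middle: slow start, fast middle, slow stop.
    return [bezier_point(start, c1, c2, end, min_jerk(i / steps))
            for i in range(steps + 1)]


path = mouse_path((100, 200), (500, 300), steps=20, seed=1)
print(path[0], path[-1])
```

If the points are emitted at a fixed time interval, the spacing alone yields the bell-shaped speed profile.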
## Getting Started

```python
import asyncio
from pydoll.browser import Chrome
from pydoll.constants import Key

async def google_search(query: str):
    async with Chrome() as browser:
        tab = await browser.start()
        await tab.go_to('https://www.google.com')

        search_box = await tab.find(tag_name='textarea', name='q')
        await search_box.insert_text(query)
        await tab.keyboard.press(Key.ENTER)

        first_result = await tab.find(
            tag_name='h3',
            text='autoscrape-labs/pydoll',
            timeout=10,
        )
        await first_result.click()

        await tab.find(id='repository-container-header', timeout=10)
        print(f"Page loaded: {await tab.title}")

asyncio.run(google_search('pydoll site:github.com'))
```

## Features
### Hybrid Automation (UI + API)
Use UI automation to pass login flows (CAPTCHAs, JS challenges), then switch to `tab.request` for fast API calls that inherit the full browser session: cookies, headers, and all.

```python
# Log in via UI
await tab.go_to('https://my-site.com/login')
await (await tab.find(id='username')).type_text('user')
await (await tab.find(id='password')).type_text('pass123')
await (await tab.find(id='login-btn')).click()

# Make authenticated API calls using the browser session
response = await tab.request.get('https://my-site.com/api/user/profile')
user_data = response.json()
```

[Hybrid Automation Docs](https://pydoll.tech/docs/features/network/http-requests/)
### Network Interception and Monitoring
Monitor traffic for API discovery or intercept requests to block ads, trackers, and unnecessary resources.

```python
import asyncio
from pydoll.browser.chromium import Chrome
from pydoll.protocol.fetch.events import FetchEvent, RequestPausedEvent
from pydoll.protocol.network.types import ErrorReason

async def block_images():
    async with Chrome() as browser:
        tab = await browser.start()

        async def block_resource(event: RequestPausedEvent):
            request_id = event['params']['requestId']
            resource_type = event['params']['resourceType']

            if resource_type in ['Image', 'Stylesheet']:
                await tab.fail_request(request_id, ErrorReason.BLOCKED_BY_CLIENT)
            else:
                await tab.continue_request(request_id)

        await tab.enable_fetch_events()
        await tab.on(FetchEvent.REQUEST_PAUSED, block_resource)

        await tab.go_to('https://example.com')
        await asyncio.sleep(3)
        await tab.disable_fetch_events()

asyncio.run(block_images())
```

[Network Monitoring](https://pydoll.tech/docs/features/network/monitoring/) | [Request Interception](https://pydoll.tech/docs/features/network/interception/)
### Browser Fingerprint Control
Granular control over [browser preferences](https://pydoll.tech/docs/features/configuration/browser-preferences/): hundreds of internal Chrome settings for building consistent fingerprints.

```python
from pydoll.browser.options import ChromiumOptions

options = ChromiumOptions()

options.browser_preferences = {
    'profile': {
        'default_content_setting_values': {
            'notifications': 2,
            'geolocation': 2,
        },
        'password_manager_enabled': False
    },
    'intl': {
        'accept_languages': 'en-US,en',
    },
    'browser': {
        'check_default_browser': False,
    }
}
```

[Browser Preferences Guide](https://pydoll.tech/docs/features/configuration/browser-preferences/)
### Concurrency, Contexts and Remote Connections
Manage [multiple tabs](https://pydoll.tech/docs/features/browser-management/tabs/) and [browser contexts](https://pydoll.tech/docs/features/browser-management/contexts/) (isolated sessions) concurrently. Connect to browsers running in Docker or remote servers.

```python
import asyncio
from pydoll.browser.chromium import Chrome

async def scrape_page(url, tab):
    await tab.go_to(url)
    return await tab.title

async def concurrent_scraping():
    async with Chrome() as browser:
        tab_google = await browser.start()
        tab_ddg = await browser.new_tab()

        results = await asyncio.gather(
            scrape_page('https://google.com/', tab_google),
            scrape_page('https://duckduckgo.com/', tab_ddg)
        )
        print(results)
```

[Multi-Tab Management](https://pydoll.tech/docs/features/browser-management/tabs/) | [Remote Connections](https://pydoll.tech/docs/features/advanced/remote-connections/)
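When the number of pages grows beyond a handful, it can help to cap how many coroutines run at once. The sketch below uses only `asyncio` from the standard library; `gather_limited` and `fake_scrape` are hypothetical names, with `fake_scrape` standing in for real tab work so the example runs without a browser.

```python
import asyncio


async def gather_limited(coros, limit: int):
    """Run coroutines concurrently, at most `limit` at a time, keeping input order."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(coro):
        # Only `limit` coroutines can hold the semaphore at the same time.
        async with semaphore:
            return await coro

    return await asyncio.gather(*(guarded(c) for c in coros))


async def fake_scrape(url: str) -> str:
    # Stand-in for real page work such as navigating a tab and reading its title.
    await asyncio.sleep(0.01)
    return f'title of {url}'


async def main():
    urls = [f'https://example.com/{i}' for i in range(5)]
    return await gather_limited([fake_scrape(u) for u in urls], limit=2)


results = asyncio.run(main())
print(results)
```

`asyncio.gather` returns results in the order the coroutines were passed in, so the output lines up with the input URLs even though completion order may differ.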
### Retry Decorator
The `@retry` decorator supports custom recovery logic between attempts (e.g., refreshing the page, rotating proxies) and exponential backoff.

```python
from pydoll.decorators import retry
from pydoll.exceptions import ElementNotFound, NetworkError

@retry(
    max_retries=3,
    exceptions=[ElementNotFound, NetworkError],
    on_retry=my_recovery_function,
    exponential_backoff=True
)
async def scrape_product(self, url: str):
    # scraping logic
    ...
```

[Retry Decorator Docs](https://pydoll.tech/docs/features/advanced/decorators/)
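To make the retry pattern concrete, here is a minimal, self-contained sketch of an async retry decorator with exponential backoff and a recovery hook. It is not Pydoll's implementation: `retry_sketch` is a hypothetical name, and treating `max_retries` as the total number of attempts is an assumption made for this example.

```python
import asyncio
import functools


def retry_sketch(max_retries=3, exceptions=(Exception,), delay=0.01,
                 exponential_backoff=False, on_retry=None):
    """Retry an async function on the given exceptions (illustrative only)."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            for attempt in range(1, max_retries + 1):
                try:
                    return await func(*args, **kwargs)
                except tuple(exceptions):
                    if attempt == max_retries:
                        raise  # out of attempts: surface the last error
                    if on_retry is not None:
                        await on_retry()  # custom recovery between attempts
                    # Double the wait after each failure when backoff is enabled.
                    wait = delay * 2 ** (attempt - 1) if exponential_backoff else delay
                    await asyncio.sleep(wait)
        return wrapper
    return decorator


calls = {'count': 0}


@retry_sketch(max_retries=3, exceptions=[ValueError], exponential_backoff=True)
async def flaky():
    # Fails twice, then succeeds on the third attempt.
    calls['count'] += 1
    if calls['count'] < 3:
        raise ValueError('temporary failure')
    return 'ok'


result = asyncio.run(flaky())
print(result, calls['count'])
```

Exceptions that are not listed propagate immediately, which keeps genuine bugs from being hidden behind retries.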
--- ## Contributing Contributions are welcome. See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines. ## Support If you find Pydoll useful, consider [sponsoring the project on GitHub](https://github.com/sponsors/thalissonvs). ## License [MIT License](LICENSE) ================================================ FILE: README_zh.md ================================================


Pydoll: Automate the Web, Naturally



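- [English](README.md)

Imagine this scenario: you need to automate browser tasks, whether testing a web application, collecting data from websites, or batch-processing repetitive workflows. Traditional approaches often require configuring external drivers and complex system setup, and can run into many compatibility issues.

**Pydoll was created to solve these problems.**

Pydoll was built from scratch with a new design: it connects directly to the Chrome DevTools Protocol (CDP), with no dependency on external drivers. This lean implementation, combined with highly realistic clicking, navigation, and element interaction, makes its behavior almost indistinguishable from that of a real user.

We believe that a truly powerful automation tool should not trap users in tedious configuration, nor leave them worn out from fighting anti-bot systems. With Pydoll you can focus on your core business logic instead of getting tangled in low-level technical details or protection mechanisms.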

Be a good human. Give us a star ⭐

No stars, no bug fixes. Just kidding (maybe)
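## 🌟 Pydoll's Core Advantages

- **Zero WebDriver dependency**: no more driver compatibility headaches
- **Human-like interaction engine**: able to pass behavioral CAPTCHAs such as reCAPTCHA v3 or Turnstile, depending on IP reputation and interaction patterns
- **Asynchronous high performance**: supports high-speed automation and parallel multitasking
- **Realistic interactions**: faithfully reproduces real user behavior patterns
- **Minimal setup**: install and start automating right away

## What's New

### Human-like Page Scrolling: Scroll Like a Real User!

You can now control page scrolling, with smooth animation and automatic waiting for completion:

```python
from pydoll.constants import ScrollPosition

# Scroll down with smooth animation (waits for completion)
await tab.scroll.by(ScrollPosition.DOWN, 500, smooth=True)

# Navigate to specific positions
await tab.scroll.to_bottom(smooth=True)
await tab.scroll.to_top(smooth=True)

# Instant scrolling when speed matters
await tab.scroll.by(ScrollPosition.UP, 300, smooth=False)
```

Unlike `execute_script("window.scrollBy(...)")`, which returns immediately, the scroll API uses CDP's `awaitPromise` to wait for the browser's `scrollend` event, so subsequent actions run only after scrolling has fully finished. This is well suited to taking screenshots, loading lazy content, or creating realistic reading patterns.

### Keyboard API: Full Control over Keyboard Input

The new `KeyboardAPI` provides a clean, centralized interface for all page-level keyboard interactions:

```python
from pydoll.constants import Key

# Press a single key
await tab.keyboard.press(Key.ENTER)
await tab.keyboard.press(Key.TAB)

# Use hotkeys / key combinations (up to 3 keys)
await tab.keyboard.hotkey(Key.CONTROL, Key.A)  # Select all (works!)
await tab.keyboard.hotkey(Key.CONTROL, Key.C)  # Copy (works!)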
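await tab.keyboard.hotkey(Key.CONTROL, Key.SHIFT, Key.ARROWRIGHT)  # Select the word to the right

# Manual control for complex sequences
await tab.keyboard.down(Key.SHIFT)
await tab.keyboard.press(Key.ARROWRIGHT)  # Select text while holding Shift
await tab.keyboard.up(Key.SHIFT)
```

**Key improvements:**

- **Centralized**: all keyboard operations are accessed through `tab.keyboard`
- **Smart modifier detection**: hotkeys automatically detect and apply modifier keys (Ctrl, Shift, Alt, Meta)
- **Complete key support**: 26 letters (A-Z), 10 digits (0-9), all function keys, the numpad, and special keys
- **Page-level shortcuts**: works for Ctrl+C, Ctrl+V, Ctrl+A, and so on (browser UI shortcuts do not work because of CDP limitations)

> **⚠️ CDP limitation:** Browser UI shortcuts (such as Ctrl+T to open a new tab or F12 to open DevTools) are not available through CDP. Use Pydoll's methods instead: `await browser.new_tab()`, `await tab.close()`.

### Retry Decorator: Production-Grade Error Recovery

Use the `@retry` decorator to turn fragile scripts into robust, production-grade scrapers. With exponential backoff and custom recovery strategies, it recovers automatically from network failures, timeouts, and transient errors:

```python
import asyncio
from pydoll.browser.chromium import Chrome
from pydoll.decorators import retry
from pydoll.exceptions import ElementNotFound, NetworkError

class ProductScraper:
    def __init__(self):
        self.tab = None
        self.retry_count = 0

    # Recovery callback executed before each retry
    async def recover_from_failure(self):
        self.retry_count += 1
        print(f"Attempt {self.retry_count} failed. Recovering...")

        # Refresh the page and restore state
        if self.tab:
            await self.tab.refresh()
            await asyncio.sleep(2)

    @retry(
        max_retries=3,
        exceptions=[ElementNotFound, NetworkError],
        on_retry=recover_from_failure,  # Run the recovery logic
        delay=2.0,
        exponential_backoff=True
    )
    async def scrape_product(self, url: str):
        if not self.tab:
            browser = Chrome()
            self.tab = await browser.start()

        await self.tab.go_to(url)
        title = await self.tab.find(class_name='product-title', timeout=5)
        return await title.text
```

**Powerful features:**

- **Smart retry logic**: retries only on the specific exceptions you define
- **Exponential backoff**: progressively increases the wait time (1s → 2s → 4s → 8s)
- **Recovery callbacks**: run custom logic between retries (refresh the page, switch proxies, restart the browser)
- **Production-proven**: handle the chaos of real-world scraping with confidence

Well suited to rate limiting, unstable networks, dynamic content loading, and CAPTCHA detection. Turn unreliable scrapers into bulletproof automation.

[**📖 Full documentation**](https://pydoll.tech/docs/zh/features/advanced/decorators/)

### Remote Connections via WebSocket: Control Browsers from Anywhere!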
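You can now connect directly to an already running instance using the browser's WebSocket address and immediately use the full Pydoll API:

```python
from pydoll.browser.chromium import Chrome

chrome = Chrome()
tab = await chrome.connect('ws://YOUR_HOST:9222/devtools/browser/XXXX')

# Get straight to work: navigation, element automation, requests, events…
await tab.go_to('https://example.com')
title = await tab.execute_script('return document.title')
print(title)
```

This makes it easy to attach to remote/CI browsers, containers, or shared debugging targets: no local launch is needed, just point at the WS endpoint and automate.

### Traverse the DOM Like a Pro: get_children_elements() and get_siblings_elements()

Two small helpers that make traversing complex layouts more elegant:

```python
# Get the direct children of a container
container = await tab.find(id='cards')
cards = await container.get_children_elements(max_depth=1)

# Want to go deeper? This returns the children of children (and so on)
elements = await container.get_children_elements(max_depth=2)

# Walk through siblings in a horizontal list without pain
active = await tab.find(class_name='item--active')
siblings = await active.get_siblings_elements()

print(len(cards), len(siblings))
```

Express more intent with less boilerplate. These helpers are especially suited to dynamic grids, lists, and menus, and keep scraping and automation logic clearer and more readable.

### WebElement: State Waiting and New Public APIs

- New `wait_until(...)` for waiting on element state, with simpler usage:

```python
# Wait for the element to become visible, until the timeout
await element.wait_until(is_visible=True, timeout=5)

# Wait for the element to become interactable (visible, on top, and able to receive events)
await element.wait_until(is_interactable=True, timeout=10)
```

- The following `WebElement` methods are now public:
  - `is_visible()`
    - Checks whether the element has a visible area and is not hidden by CSS, scrolling it into the viewport when needed. Useful for quick checks before interacting.
  - `is_interactable()`
    - "Clickable" state: combines visibility, enabled state, and pointer-event hit testing. Suited to building more robust interaction flows.
  - `is_on_top()`
    - Checks whether the element is the topmost hit target at the click position, avoiding clicks that fail because the element is covered.
  - `execute_script(script: str, return_by_value: bool = False)`
    - Executes JavaScript in the element's context (`this` refers to the element), handy for fine-grained adjustments and quick inspection.

```python
# Highlight the element with JS
await element.execute_script("this.style.outline='2px solid #22d3ee'")

# Check state
visible = await element.is_visible()
interactable = await element.is_interactable()
on_top = await element.is_on_top()
```

These additions simplify "wait + verify" scenarios, reduce flakiness during automation, and make test cases more predictable.

### Browser-Context HTTP Requests: A Game Changer for Hybrid Automation!

Have you ever wished you could make HTTP requests that automatically inherit all of the browser's session state? **Now you can!**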
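The `tab.request` property gives you a clean `requests`-style interface that performs HTTP calls directly in the browser's JavaScript context. This means every request automatically gets cookies, authentication headers, CORS policies, and session state, just as if the browser itself had made the request.

**Perfect for hybrid automation:**

```python
# Navigate to the site and log in normally with Pydoll
await tab.go_to('https://example.com/login')
await (await tab.find(id='username')).type_text('user@example.com')
await (await tab.find(id='password')).type_text('password')
await (await tab.find(id='login-btn')).click()

# Now make API calls that inherit the logged-in session!
response = await tab.request.get('https://example.com/api/user/profile')
user_data = response.json()

# POST data while staying authenticated
response = await tab.request.post(
    'https://example.com/api/settings',
    json={'theme': 'dark', 'notifications': True}
)

# Access the response content in different formats
raw_data = response.content
text_data = response.text
json_data = response.json()

# Inspect the cookies that were set
for cookie in response.cookies:
    print(f"Cookie: {cookie['name']} = {cookie['value']}")

# Add custom headers to your requests
headers = [
    {'name': 'X-Custom-Header', 'value': 'my-value'},
    {'name': 'X-API-Version', 'value': '2.0'}
]

await tab.request.get('https://api.example.com/data', headers=headers)
```

**Why this is great:**

- **No session juggling**: requests automatically inherit browser cookies
- **CORS just works**: requests follow the browser's security policies
- **Perfect for modern SPAs**: seamlessly mix UI automation with API calls
- **Authentication made simple**: log in once through the UI, then call the API
- **Hybrid workflows**: use the best tool for each step (UI or API)

This opens up many possibilities for automation scenarios that need both browser interaction and API efficiency!

### Full Browser Control with Custom Preferences! (thanks to [@LucasAlvws](https://github.com/LucasAlvws))

Want to fully customize Chrome's behavior? **Now you can control everything!**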
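The new `browser_preferences` system gives you access to hundreds of internal Chrome settings that previously could not be changed programmatically. This is deep browser customization that goes well beyond command-line flags!

**The possibilities are endless:**

```python
from pydoll.browser.options import ChromiumOptions

options = ChromiumOptions()

# Build the ideal automation environment
options.browser_preferences = {
    'download': {
        'default_directory': '/tmp/downloads',
        'prompt_for_download': False,
        'directory_upgrade': True,
        'extensions_to_open': ''  # Do not auto-open any downloads
    },
    'profile': {
        'default_content_setting_values': {
            'notifications': 2,        # Block all notifications
            'geolocation': 2,          # Block location requests
            'media_stream_camera': 2,  # Block camera access
            'media_stream_mic': 2,     # Block microphone access
            'popups': 1                # Allow popups (useful for automation)
        },
        'password_manager_enabled': False,  # Disable password prompts
        'exit_type': 'Normal'               # Always exit cleanly
    },
    'intl': {
        'accept_languages': 'zh-CN,zh,en-US,en',
        'charset_default': 'UTF-8'
    },
    'browser': {
        'check_default_browser': False,  # Do not ask about the default browser
        'show_update_promotion_infobar': False
    }
}

# Or use the convenient helper methods
options.set_default_download_directory('/tmp/downloads')
options.set_accept_languages('zh-CN,zh,en-US,en')
options.prompt_for_download = False
```

**Practical examples of what this enables:**

- **Silent downloads** - no prompts, no dialogs, just automated downloads
- **Block every distraction** - notifications, popups, camera requests, and more
- **Ideal for CI/CD** - disable update checks, default-browser prompts, and crash reporting
- **Multi-region testing** - change language, timezone, and locale settings on the fly
- **Security hardening** - lock down permissions and disable unnecessary features
- **Advanced fingerprint control** - modify the browser's install date, engagement history, and behavior patterns

**Fingerprint customization for stealth automation:**

```python
import time

# Simulate a browser that has already been in use for a while
fake_engagement_time = int(time.time()) - (7 * 24 * 60 * 60)  # 7 days ago

options.browser_preferences = {
    'settings': {
        'touchpad': {
            'natural_scroll': True,
        }
    },
    'profile': {
        'last_engagement_time': fake_engagement_time,
        'exit_type': 'Normal',
        'exited_cleanly': True
    },
    'newtab_page_location_override': 'https://www.google.com',
    'session': {
        'restore_on_startup': 1,  # Restore the last session
        'startup_urls': ['https://www.google.com']
    }
}
```

This level of control used to be available only to Chrome extension developers. Now it is part of your automation toolkit!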
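Because preferences are plain nested dictionaries, one convenient pattern is to keep a shared baseline and layer per-job overrides on top of it. The following is a minimal sketch using only the standard library; `merge_preferences` and `BASE_PREFERENCES` are hypothetical names introduced for illustration and are not part of Pydoll.

```python
from copy import deepcopy
from typing import Any


def merge_preferences(base: dict[str, Any], overrides: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict in which nested keys from `overrides` win over `base`."""
    merged = deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge recursively instead of replacing
            merged[key] = merge_preferences(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical shared baseline (keys taken from the preferences example)
BASE_PREFERENCES = {
    'download': {'default_directory': '/tmp/downloads', 'prompt_for_download': False},
    'profile': {'default_content_setting_values': {'notifications': 2, 'popups': 1}},
}

# Per-job tweaks: block popups and set languages, keep everything else
job_preferences = merge_preferences(
    BASE_PREFERENCES,
    {
        'profile': {'default_content_setting_values': {'popups': 2}},
        'intl': {'accept_languages': 'zh-CN,zh,en-US,en'},
    },
)
# job_preferences could then be assigned to options.browser_preferences
```

The baseline is deep-copied, so it stays untouched and can be reused across jobs.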
See the [documentation](https://pydoll.tech/docs/zh/features/#custom-browser-preferences/) for more details.

### New `get_parent_element()` method

Retrieve the parent of any WebElement, which makes navigating the DOM structure easier:

```python
element = await tab.find(id='button')
parent = await element.get_parent_element()
```

### New start_timeout option (thanks to [@j0j1j2](https://github.com/j0j1j2))

Added to ChromiumOptions to control how long the browser is allowed to take to start. Useful on slower machines or in CI environments.

```python
options = ChromiumOptions()
options.start_timeout = 20  # wait up to 20 seconds
```

### New expect_download() context manager - robust, elegant file downloads!

Tired of flaky download flows, missing files, or tangled event listeners? Meet `tab.expect_download()`: a reliable, concise way to download files.

- Configures the browser's download behavior automatically
- Supports a custom download directory or a temporary one (cleaned up automatically!)
- Built-in timeout while waiting, so jobs do not hang
- Provides a convenient handle: read bytes/base64, get the `file_path`

A small example that works out of the box:

```python
import asyncio
from pathlib import Path
from pydoll.browser import Chrome

async def download_report():
    async with Chrome() as browser:
        tab = await browser.start()
        await tab.go_to('https://example.com/reports')

        target_dir = Path('/tmp/my-downloads')
        async with tab.expect_download(keep_file_at=target_dir, timeout=10) as dl:
            # Trigger the download on the page (button, link, etc.)
            await (await tab.find(text='Download latest report')).click()
            # Wait for completion and read the content
            data = await dl.read_bytes()
            print(f"Downloaded {len(data)} bytes, saved to: {dl.file_path}")

asyncio.run(download_report())
```

Want cleanup at no cost? Just omit `keep_file_at`: a temporary directory is created and removed automatically when the context exits. This is very convenient for tests.

## 📦 Installation

```bash
pip install pydoll-python
```

That's it! Install and start automating right away.

## 🚀 Getting Started

### Your first automation

Let's start with a practical example: an automation that runs a Google search and clicks the first result. It shows how the library works and how you can start automating everyday tasks.

```python
import asyncio

from pydoll.browser import Chrome
from pydoll.constants import Key

async def google_search(query: str):
    async with Chrome() as browser:
        tab = await browser.start()
        await tab.go_to('https://www.google.com')
        search_box = await tab.find(tag_name='textarea', name='q')
        await search_box.insert_text(query)
        await tab.keyboard.press(Key.ENTER)
        await (await tab.find(
            tag_name='h3',
            text='autoscrape-labs/pydoll',
            timeout=10,
        )).click()
        await tab.find(id='repository-container-header', timeout=10)

asyncio.run(google_search('pydoll site:github.com'))
```

With no configuration at all, one simple script performs a complete Google search!

Now let's see how to extract data from the page, continuing from the previous example. Assume that in the code below we are already on the Pydoll project page. We want to extract the following information:

- Project description
- Number of stars
- Number of forks
- Number of issues
- Number of pull requests

To get the project description we will use an XPath query. You can consult the documentation to learn how to build your own queries.

```python
description = await (await tab.query(
    '//h2[contains(text(), "About")]/following-sibling::p',
    timeout=10,
)).text
```

Here is what this query does:

1. `//h2[contains(text(), "About")]` - selects the first `<h2>` tag that contains "About"
2. `/following-sibling::p` - selects the first `<p>` tag that follows that `<h2>` tag

Then you can fetch the rest of the data:

```python
number_of_stars = await (await tab.find(
    id='repo-stars-counter-star'
)).text

number_of_forks = await (await tab.find(
    id='repo-network-counter'
)).text

number_of_issues = await (await tab.find(
    id='issues-repo-tab-count',
)).text

number_of_pull_requests = await (await tab.find(
    id='pull-requests-repo-tab-count',
)).text

data = {
    'description': description,
    'number_of_stars': number_of_stars,
    'number_of_forks': number_of_forks,
    'number_of_issues': number_of_issues,
    'number_of_pull_requests': number_of_pull_requests,
}
print(data)
```

The recording below shows the speed and the result of this automation. (For demonstration purposes, the browser window is not shown.)

![google_search](./docs/images/google-search-example.gif)

In just 5 seconds we extracted the data we needed! That is the kind of speed you can expect from automation with Pydoll.

### A more complex example

Next, let's look at a scenario you will probably run into often: a Cloudflare-style captcha. Pydoll offers a way to handle it, but note that, as mentioned earlier, its effectiveness depends on several factors. The code below is a complete example of handling a Cloudflare captcha.

```python
import asyncio

from pydoll.browser import Chrome

async def cloudflare_example():
    async with Chrome() as browser:
        tab = await browser.start()
        async with tab.expect_and_bypass_cloudflare_captcha():
            await tab.go_to('https://2captcha.com/demo/cloudflare-turnstile')
        print('Captcha handled, continuing...')
        await asyncio.sleep(5)  # just to see the result :)

asyncio.run(cloudflare_example())
```

The result looks like this:

![cloudflare_example](./docs/images/cloudflare-example.gif)

With only a few lines of code we got past one of the toughest captcha protections. And this is just one of the many powerful features Pydoll provides. There is a lot more!
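Since the captcha handling is not guaranteed to succeed on every run, one option is to wrap the whole flow in a bounded retry with a per-attempt timeout. The sketch below uses only `asyncio`; `with_retries` is a hypothetical helper introduced for illustration, not a Pydoll API, and the default values are arbitrary.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def with_retries(
    make_attempt: Callable[[], Awaitable[Any]],
    attempts: int = 3,
    timeout: float = 30.0,
    delay: float = 1.0,
) -> Any:
    """Run `make_attempt()` until it succeeds or `attempts` is exhausted."""
    if attempts < 1:
        raise ValueError('attempts must be at least 1')
    last_error: Exception | None = None
    for attempt in range(1, attempts + 1):
        try:
            # Each attempt gets its own time budget
            return await asyncio.wait_for(make_attempt(), timeout=timeout)
        except Exception as exc:  # includes asyncio.TimeoutError
            last_error = exc
            if attempt < attempts:
                await asyncio.sleep(delay)
    assert last_error is not None
    raise last_error


# Usage sketch, assuming a coroutine function such as cloudflare_example:
# asyncio.run(with_retries(cloudflare_example, attempts=3, timeout=60))
```

Each attempt starts from a fresh call to the coroutine function, so a run that failed halfway does not leak state into the next one.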
### Custom configuration

Sometimes we need finer control over the browser. Pydoll provides a flexible way to configure it. Here is an example:

```python
import asyncio

from pydoll.browser import Chrome
from pydoll.browser.options import ChromiumOptions as Options

async def custom_automation():
    # Configure browser options
    options = Options()
    options.add_argument('--proxy-server=username:password@ip:port')
    options.add_argument('--window-size=1920,1080')
    options.binary_location = '/path/to/your/browser'
    options.start_timeout = 20

    async with Chrome(options=options) as browser:
        tab = await browser.start()
        # Your automation code here
        await tab.go_to('https://example.com')
        # The browser is now using your custom settings

asyncio.run(custom_automation())
```

In this example the browser is configured to use a proxy server and a 1920x1080 window. It also sets a custom path to the Chrome binary, for cases where your installation lives somewhere other than the usual default location, and raises the start timeout to 20 seconds.

## ⚡ Advanced Features

Pydoll offers a set of advanced features for power users.

### Advanced element finding

There are several ways to locate elements on a page. Whichever approach you prefer, there is one that fits:

```python
import asyncio
from pydoll.browser import Chrome

async def element_finding_examples():
    async with Chrome() as browser:
        tab = await browser.start()
        await tab.go_to('https://example.com')

        # Find by attributes (most intuitive)
        submit_btn = await tab.find(
            tag_name='button',
            class_name='btn-primary',
            text='Submit'
        )
        # Find by ID
        username_field = await tab.find(id='username')
        # Find multiple elements
        all_links = await tab.find(tag_name='a', find_all=True)
        # CSS selectors and XPath
        nav_menu = await tab.query('nav.main-menu')
        specific_item = await tab.query('//div[@data-testid="item-123"]')
        # With timeout and error handling
        delayed_element = await tab.find(
            class_name='dynamic-content',
            timeout=10,
            raise_exc=False  # Returns None if not found
        )
        # Advanced: Custom attributes
        custom_element = await tab.find(
            data_testid='submit-button',
            aria_label='Submit form'
        )

asyncio.run(element_finding_examples())
```

The `find` method is the friendlier of the two. It looks elements up by common attributes such as `id`, `tag_name`, and `class_name`, and it also supports custom attributes (for example `data-testid`).

If that is not enough, the `query` method locates elements with CSS selectors or XPath expressions. Pydoll detects automatically which kind of query is being used.

### Concurrent automation

One of Pydoll's main strengths is its ability to run several tasks concurrently thanks to its asynchronous implementation. We can automate multiple browser tabs at the same time! Here is an example:

```python
import asyncio
from pydoll.browser import Chrome

async def scrape_page(url, tab):
    await tab.go_to(url)
    title = await tab.execute_script('return document.title')
    links = await tab.find(tag_name='a', find_all=True)
    return {
        'url': url,
        'title': title,
        'link_count': len(links)
    }

async def concurrent_scraping():
    browser = Chrome()
    tab_google = await browser.start()
    tab_duckduckgo = await browser.new_tab()
    tasks = [
        scrape_page('https://google.com/', tab_google),
        scrape_page('https://duckduckgo.com/', tab_duckduckgo)
    ]
    results = await asyncio.gather(*tasks)
    print(results)
    await browser.stop()

asyncio.run(concurrent_scraping())
```

The recording below shows how fast this runs:

![concurrent_example](./docs/images/concurrent-example.gif)

In this example we extracted data from two pages at the same time.

There are many more powerful features: an event system for reactive automation, request interception and modification, and so on. Check out the documentation!

## 🔧 Quick Troubleshooting

**Browser not found?**

```python
from pydoll.browser import Chrome
from pydoll.browser.options import ChromiumOptions

options = ChromiumOptions()
options.binary_location = '/path/to/your/chrome'
browser = Chrome(options=options)
```

**Browser starts only after a FailedToStartBrowser error?**

```python
from pydoll.browser import Chrome
from pydoll.browser.options import ChromiumOptions

options = ChromiumOptions()
options.start_timeout = 20  # the default is 10 seconds
browser = Chrome(options=options)
```

**Need a proxy?**

```python
options.add_argument('--proxy-server=your-proxy:port')
```

**Running in Docker?**

```python
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
```

## 📚 Documentation

The complete Pydoll documentation, with detailed examples and an in-depth look at every feature, is available here: [official documentation](https://autoscrape-labs.github.io/pydoll/).

The documentation includes the following sections:

- **Getting Started Guide** - step-by-step tutorials
- **API Reference** - complete method documentation
- **Advanced Techniques** - network interception, event handling, performance optimization

> The Chinese version of this README is [here](README_zh.md).

## 🤝 Contributing

We would love your help making Pydoll better! See our [contributing guide](CONTRIBUTING.md) to get started. Whether you fix bugs, add features, or improve the documentation, all contributions are welcome!

Please make sure to:

- Write tests for new features and bug fixes
- Follow the code style and conventions
- Use conventional commits for pull requests
- Run lint checks and tests before submitting

## 💖 Support My Work

If you find Pydoll useful, please consider [supporting me on GitHub](https://github.com/sponsors/thalissonvs).
You will get exclusive benefits such as priority support, custom features, and more!
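Can't sponsor right now? No problem, you can still help a great deal by:

- Starring the repository
- Sharing it on social media
- Writing articles or tutorials
- Giving feedback or reporting issues

Every bit of support counts!

## 💬 Spread the Word

If Pydoll saved you time, your sanity, or a keyboard from being smashed, give it a ⭐, share it, or tell your weird developer friends.

## 📄 License

Pydoll is licensed under the [MIT License](LICENSE).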

Pydoll: making browser automation magical!

================================================ FILE: SPONSORS.md ================================================ # Sponsors Pydoll is supported by these amazing sponsors. Their contributions help keep the project maintained and growing. ## Top Sponsors The Web Scraping Club Read a full review of Pydoll on **[The Web Scraping Club](https://substack.thewebscraping.club/p/pydoll-webdriver-scraping?utm_source=github&utm_medium=repo&utm_campaign=pydoll)**, the #1 newsletter dedicated to web scraping. --- ## Sponsors Thordata Pydoll is proudly sponsored by **[Thordata](https://www.thordata.com/?ls=github&lk=pydoll)**: a residential proxy network built for serious web scraping and automation. With **190+ real residential and ISP locations**, fully encrypted connections, and infrastructure optimized for high-performance workflows, Thordata is an excellent choice for scaling your Pydoll automations. **[Sign up through our link](https://www.thordata.com/?ls=github&lk=pydoll)** to support the project and get **1GB free** to get started. --- CapSolver Pydoll excels at behavioral evasion, but it doesn't solve captchas. That's where **[CapSolver](https://dashboard.capsolver.com/passport/register?inviteCode=WPhTbOsbXEpc)** comes in. An AI-powered service that handles reCAPTCHA, Cloudflare challenges, and more, seamlessly integrating with your automation workflows. **[Register with our invite code](https://dashboard.capsolver.com/passport/register?inviteCode=WPhTbOsbXEpc)** and use code **PYDOLL** to get an extra **6% balance bonus**. --- Interested in sponsoring Pydoll? [Become a sponsor](https://github.com/sponsors/thalissonvs). 
================================================ FILE: codecov.yml ================================================ coverage: status: project: default: target: 90% threshold: 0% base: auto ================================================ FILE: cz.yaml ================================================ --- commitizen: name: cz_conventional_commits tag_format: $version version: 2.21.3 ================================================ FILE: docs/en/api/browser/chrome.md ================================================ # Chrome Browser ::: pydoll.browser.chromium.Chrome options: show_root_heading: true show_source: false heading_level: 2 ================================================ FILE: docs/en/api/browser/edge.md ================================================ # Edge Browser ::: pydoll.browser.chromium.Edge options: show_root_heading: true show_source: false heading_level: 2 ================================================ FILE: docs/en/api/browser/managers.md ================================================ # Browser Managers The managers module provides specialized classes for managing different aspects of browser lifecycle and configuration. ## Overview Browser managers handle specific responsibilities in browser automation: ::: pydoll.browser.managers options: show_root_heading: true show_source: false heading_level: 2 filters: - "!^_" - "!^__" ## Manager Classes ### Browser Process Manager Manages the browser process lifecycle, including starting, stopping, and monitoring browser processes. ::: pydoll.browser.managers.browser_process_manager options: show_root_heading: true show_source: false heading_level: 3 ### Browser Options Manager Handles browser configuration options and command-line arguments. ::: pydoll.browser.managers.browser_options_manager options: show_root_heading: true show_source: false heading_level: 3 ### Proxy Manager Manages proxy configuration and authentication for browser instances. 
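To give a rough idea of the kind of work proxy authentication involves, a proxy string that embeds credentials (for example `username:password@ip:port`) has to be separated into a credential-free server address and a username/password pair. The sketch below does this with the standard library only. `split_proxy_credentials` is a hypothetical helper for illustration and does not describe how `ProxyManager` is actually implemented; IPv6 literals and percent-encoded passwords are not handled.

```python
from typing import Optional
from urllib.parse import urlsplit


def split_proxy_credentials(proxy: str) -> tuple[str, Optional[str], Optional[str]]:
    """Split '[scheme://]user:pass@host:port' into (server, username, password)."""
    # urlsplit only recognises the network location when it is introduced by '//'
    has_scheme = '://' in proxy
    parts = urlsplit(proxy if has_scheme else f'//{proxy}')
    host = parts.hostname or ''
    server = f'{host}:{parts.port}' if parts.port else host
    if has_scheme:
        server = f'{parts.scheme}://{server}'
    return server, parts.username, parts.password


server, username, password = split_proxy_credentials('user:secret@10.0.0.1:8080')
# `server` no longer contains credentials and could be passed to --proxy-server
```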
::: pydoll.browser.managers.proxy_manager options: show_root_heading: true show_source: false heading_level: 3 ### Temporary Directory Manager Handles creation and cleanup of temporary directories used by browser instances. ::: pydoll.browser.managers.temp_dir_manager options: show_root_heading: true show_source: false heading_level: 3 ## Usage Managers are typically used internally by browser classes like `Chrome` and `Edge`. They provide modular functionality that can be composed together: ```python from pydoll.browser.managers.proxy_manager import ProxyManager from pydoll.browser.managers.temp_dir_manager import TempDirManager # Managers are used internally by browser classes # Direct usage is for advanced scenarios only proxy_manager = ProxyManager() temp_manager = TempDirManager() ``` !!! note "Internal Usage" These managers are primarily used internally by the browser classes. Direct usage is recommended only for advanced scenarios or when extending the library. ================================================ FILE: docs/en/api/browser/options.md ================================================ # Browser Options ## ChromiumOptions ::: pydoll.browser.options.ChromiumOptions options: show_root_heading: true show_source: false heading_level: 3 ## Options Interface ::: pydoll.browser.interfaces.Options options: show_root_heading: true show_source: false heading_level: 3 ## BrowserOptionsManager Interface ::: pydoll.browser.interfaces.BrowserOptionsManager options: show_root_heading: true show_source: false heading_level: 3 ================================================ FILE: docs/en/api/browser/requests.md ================================================ # Browser Requests The requests module provides HTTP request capabilities within the browser context, enabling seamless API calls that inherit the browser's session state, cookies, and authentication. 
## Overview The browser requests module offers a `requests`-like interface for making HTTP calls directly within the browser's JavaScript context. This approach provides several advantages over traditional HTTP libraries: - **Session inheritance**: Automatic cookie, authentication, and CORS handling - **Browser context**: Requests execute in the same security context as the page - **No session juggling**: Eliminate the need to transfer cookies and tokens between automation and API calls - **SPA compatibility**: Perfect for Single Page Applications with complex authentication flows ## Request Class The main interface for making HTTP requests within the browser context. ::: pydoll.browser.requests.request.Request options: show_root_heading: true show_source: false heading_level: 3 group_by_category: true members_order: source filters: - "!^__" ## Response Class Represents the response from HTTP requests, providing a familiar interface similar to the `requests` library. ::: pydoll.browser.requests.response.Response options: show_root_heading: true show_source: false heading_level: 3 group_by_category: true members_order: source filters: - "!^__" ## Usage Examples ### Basic HTTP Methods ```python from pydoll.browser.chromium import Chrome async with Chrome() as browser: tab = await browser.start() await tab.go_to("https://api.example.com") # GET request response = await tab.request.get("/users/123") user_data = await response.json() # POST request response = await tab.request.post("/users", json={ "name": "John Doe", "email": "john@example.com" }) # PUT request with headers response = await tab.request.put("/users/123", json={"name": "Jane Doe"}, headers={"Authorization": "Bearer token123"} ) ``` ### Response Handling ```python # Check response status if response.ok: print(f"Success: {response.status_code}") else: print(f"Error: {response.status_code}") response.raise_for_status() # Raises HTTPError for 4xx/5xx # Access response data text_data = response.text json_data 
= await response.json() raw_bytes = response.content # Inspect headers and cookies print("Response headers:", response.headers) print("Request headers:", response.request_headers) for cookie in response.cookies: print(f"Cookie: {cookie.name}={cookie.value}") ``` ### Advanced Features ```python # Request with custom headers and parameters response = await tab.request.get("/search", params={"q": "python", "limit": 10}, headers={ "User-Agent": "Custom Bot 1.0", "Accept": "application/json" } ) # File upload simulation response = await tab.request.post("/upload", data={"description": "Test file"}, files={"file": ("test.txt", "file content", "text/plain")} ) # Form data submission response = await tab.request.post("/login", data={"username": "user", "password": "pass"} ) ``` ## Integration with Tab The request functionality is accessed through the `tab.request` property, which provides a singleton `Request` instance for each tab: ```python # Each tab has its own request instance tab1 = await browser.get_tab(0) tab2 = await browser.new_tab() # These are separate Request instances request1 = tab1.request # Request bound to tab1 request2 = tab2.request # Request bound to tab2 # Requests inherit the tab's context await tab1.go_to("https://site1.com") await tab2.go_to("https://site2.com") # These requests will have different cookie/session contexts response1 = await tab1.request.get("/api/data") # Uses site1.com cookies response2 = await tab2.request.get("/api/data") # Uses site2.com cookies ``` !!! tip "Hybrid Automation" This module is particularly powerful for hybrid automation scenarios where you need to combine UI interactions with API calls. For example, log in through the UI, then use the authenticated session for API calls without manually handling cookies or tokens. 
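The examples on this page pass headers as a plain mapping, while other examples in the project pass a list of `{'name': ..., 'value': ...}` entries. If the list form is what your installed version expects (an assumption worth checking against the `Request` reference above), a mapping can be converted with a small helper. `to_header_entries` is a hypothetical name used for illustration and is not part of Pydoll.

```python
from typing import Mapping


def to_header_entries(headers: Mapping[str, str]) -> list[dict[str, str]]:
    """Convert {'Name': 'value'} into [{'name': 'Name', 'value': 'value'}, ...]."""
    return [{'name': name, 'value': value} for name, value in headers.items()]


entries = to_header_entries({
    'X-Custom-Header': 'my-value',
    'X-API-Version': '2.0',
})
# `entries` could then be passed as headers=... to tab.request.get(...)
```

Keeping the conversion in one place means call sites can stay with ordinary dictionaries whichever form the request methods accept.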
================================================ FILE: docs/en/api/browser/tab.md ================================================ # Tab ::: pydoll.browser.tab.Tab options: show_root_heading: true show_source: false heading_level: 2 ================================================ FILE: docs/en/api/commands/browser.md ================================================ # Browser Commands Browser commands provide low-level control over browser instances and their configuration. ## Overview The browser commands module handles browser-level operations such as version information, target management, and browser-wide settings. ::: pydoll.commands.browser_commands options: show_root_heading: true show_source: false heading_level: 2 filters: - "!^_" - "!^__" ## Usage Browser commands are typically used internally by browser classes to manage browser instances: ```python from pydoll.commands.browser_commands import get_version from pydoll.connection.connection_handler import ConnectionHandler # Get browser version information connection = ConnectionHandler() version_info = await get_version(connection) ``` ## Available Commands The browser commands module provides functions for: - Getting browser version and user agent information - Managing browser targets (tabs, windows) - Controlling browser-wide settings and permissions - Handling browser lifecycle events !!! note "Internal Usage" These commands are primarily used internally by the `Chrome` and `Edge` browser classes. Direct usage is recommended only for advanced scenarios. ================================================ FILE: docs/en/api/commands/dom.md ================================================ # DOM Commands DOM commands provide comprehensive functionality for interacting with the Document Object Model of web pages. ## Overview The DOM commands module is one of the most important modules in Pydoll, providing all the functionality needed to find, interact with, and manipulate HTML elements on web pages. 
::: pydoll.commands.dom_commands options: show_root_heading: true show_source: false heading_level: 2 filters: - "!^_" - "!^__" ## Usage DOM commands are used extensively by the `WebElement` class and element finding methods: ```python from pydoll.commands.dom_commands import query_selector, get_attributes from pydoll.connection.connection_handler import ConnectionHandler # Find element and get its attributes connection = ConnectionHandler() node_id = await query_selector(connection, selector="#username") attributes = await get_attributes(connection, node_id=node_id) ``` ## Key Functionality The DOM commands module provides functions for: ### Element Finding - `query_selector()` - Find single element by CSS selector - `query_selector_all()` - Find multiple elements by CSS selector - `get_document()` - Get the document root node ### Element Interaction - `click_element()` - Click on elements - `focus_element()` - Focus elements - `set_attribute_value()` - Set element attributes - `get_attributes()` - Get element attributes ### Element Information - `get_box_model()` - Get element positioning and dimensions - `describe_node()` - Get detailed element information - `get_outer_html()` - Get element HTML content ### DOM Manipulation - `remove_node()` - Remove elements from DOM - `set_node_value()` - Set element values - `request_child_nodes()` - Get child elements !!! tip "High-Level APIs" While these commands provide powerful low-level access, most users should use the higher-level `WebElement` class methods like `click()`, `type_text()`, and `get_attribute()` which use these commands internally. ================================================ FILE: docs/en/api/commands/fetch.md ================================================ # Fetch Commands Fetch commands provide advanced network request handling and interception capabilities using the Fetch API domain. 
## Overview The fetch commands module enables sophisticated network request management, including request modification, response interception, and authentication handling. ::: pydoll.commands.fetch_commands options: show_root_heading: true show_source: false heading_level: 2 filters: - "!^_" - "!^__" ## Usage Fetch commands are used for advanced network interception and request handling: ```python from pydoll.commands.fetch_commands import enable, request_paused, continue_request from pydoll.connection.connection_handler import ConnectionHandler # Enable fetch domain connection = ConnectionHandler() await enable(connection, patterns=[{ "urlPattern": "*", "requestStage": "Request" }]) # Handle paused requests async def handle_paused_request(request_id, request): # Modify request or continue as-is await continue_request(connection, request_id=request_id) ``` ## Key Functionality The fetch commands module provides functions for: ### Request Interception - `enable()` - Enable fetch domain with patterns - `disable()` - Disable fetch domain - `continue_request()` - Continue intercepted requests - `fail_request()` - Fail requests with specific errors ### Request Modification - Modify request headers - Change request URLs - Alter request methods (GET, POST, etc.) 
- Modify request bodies ### Response Handling - `fulfill_request()` - Provide custom responses - `get_response_body()` - Get response content - Response header modification - Response status code control ### Authentication - `continue_with_auth()` - Handle authentication challenges - Basic authentication support - Custom authentication flows ## Advanced Features ### Pattern-Based Interception ```python # Intercept specific URL patterns patterns = [ {"urlPattern": "*/api/*", "requestStage": "Request"}, {"urlPattern": "*.js", "requestStage": "Response"}, {"urlPattern": "https://example.com/*", "requestStage": "Request"} ] await enable(connection, patterns=patterns) ``` ### Request Modification ```python # Modify intercepted requests async def modify_request(request_id, request): # Add authentication header headers = request.headers.copy() headers["Authorization"] = "Bearer token123" # Continue with modified headers await continue_request( connection, request_id=request_id, headers=headers ) ``` ### Response Mocking ```python # Mock API responses await fulfill_request( connection, request_id=request_id, response_code=200, response_headers=[ {"name": "Content-Type", "value": "application/json"}, {"name": "Access-Control-Allow-Origin", "value": "*"} ], body='{"status": "success", "data": {"mocked": true}}' ) ``` ### Authentication Handling ```python # Handle authentication challenges await continue_with_auth( connection, request_id=request_id, auth_challenge_response={ "response": "ProvideCredentials", "username": "user", "password": "pass" } ) ``` ## Request Stages Fetch commands can intercept requests at different stages: | Stage | Description | Use Cases | |-------|-------------|-----------| | Request | Before request is sent | Modify headers, URL, method | | Response | After response received | Mock responses, modify content | ## Error Handling ```python # Fail requests with specific errors await fail_request( connection, request_id=request_id, 
error_reason="ConnectionRefused" # or "AccessDenied", "TimedOut", etc. ) ``` ## Integration with Network Commands Fetch commands work alongside network commands but provide more granular control: - **Network Commands**: Broader network monitoring and control - **Fetch Commands**: Specific request/response interception and modification !!! tip "Performance Considerations" Fetch interception can impact page loading performance. Use specific URL patterns and disable when not needed to minimize overhead. ================================================ FILE: docs/en/api/commands/index.md ================================================ # Commands Overview The Commands module provides high-level interfaces for interacting with Chrome DevTools Protocol (CDP) domains. Each command module corresponds to a specific CDP domain and provides methods to execute various browser operations. ## Available Command Modules ### Browser Commands - **Module**: `browser_commands.py` - **Purpose**: Browser-level operations and window management - **Documentation**: [Browser Commands](browser.md) ### DOM Commands - **Module**: `dom_commands.py` - **Purpose**: DOM tree manipulation and element operations - **Documentation**: [DOM Commands](dom.md) ### Input Commands - **Module**: `input_commands.py` - **Purpose**: Input event simulation (keyboard, mouse, touch) - **Documentation**: [Input Commands](input.md) ### Network Commands - **Module**: `network_commands.py` - **Purpose**: Network monitoring and request interception - **Documentation**: [Network Commands](network.md) ### Page Commands - **Module**: `page_commands.py` - **Purpose**: Page lifecycle management and navigation - **Documentation**: [Page Commands](page.md) ### Runtime Commands - **Module**: `runtime_commands.py` - **Purpose**: JavaScript execution and runtime management - **Documentation**: [Runtime Commands](runtime.md) ### Storage Commands - **Module**: `storage_commands.py` - **Purpose**: Browser storage access (cookies, 
local storage, etc.) - **Documentation**: [Storage Commands](storage.md) ### Target Commands - **Module**: `target_commands.py` - **Purpose**: Target management and tab operations - **Documentation**: [Target Commands](target.md) ### Fetch Commands - **Module**: `fetch_commands.py` - **Purpose**: Network request interception and modification - **Documentation**: [Fetch Commands](fetch.md) ## Usage Pattern Commands are typically accessed through the browser or tab instances: ```python from pydoll.browser.chromium import Chrome # Initialize browser browser = Chrome() await browser.start() # Get active tab tab = await browser.get_active_tab() # Use commands through the tab await tab.navigate("https://example.com") element = await tab.find(id="button") await element.click() ``` ## Command Structure Each command module follows a consistent pattern: - **Static methods**: For direct command execution - **Type hints**: Full type safety with protocol types - **Error handling**: Proper exception handling for CDP errors - **Documentation**: Comprehensive docstrings with examples ================================================ FILE: docs/en/api/commands/input.md ================================================ # Input Commands Input commands handle mouse and keyboard interactions, providing human-like input simulation. ## Overview The input commands module provides functionality for simulating user input including mouse movements, clicks, keyboard typing, and key presses. 
::: pydoll.commands.input_commands options: show_root_heading: true show_source: false heading_level: 2 filters: - "!^_" - "!^__" ## Usage Input commands are used by element interaction methods and can be used directly for advanced input scenarios: ```python from pydoll.commands.input_commands import dispatch_mouse_event, dispatch_key_event from pydoll.connection.connection_handler import ConnectionHandler # Simulate mouse click connection = ConnectionHandler() await dispatch_mouse_event( connection, type="mousePressed", x=100, y=200, button="left" ) # Simulate keyboard typing await dispatch_key_event( connection, type="keyDown", key="Enter" ) ``` ## Key Functionality The input commands module provides functions for: ### Mouse Events - `dispatch_mouse_event()` - Mouse clicks, movements, and wheel events - Mouse button states (left, right, middle) - Coordinate-based positioning - Drag and drop operations ### Keyboard Events - `dispatch_key_event()` - Key press and release events - `insert_text()` - Direct text insertion - Special key handling (Enter, Tab, Arrow keys, etc.) - Modifier keys (Ctrl, Alt, Shift) ### Touch Events - Touch screen simulation - Multi-touch gestures - Touch coordinates and pressure ## Human-like Behavior The input commands support human-like behavior patterns: - Natural mouse movement curves - Realistic typing speeds and patterns - Random micro-delays between actions - Pressure-sensitive touch events !!! tip "Element Methods" For most use cases, use the higher-level element methods like `element.click()` and `element.type_text()` which provide a more convenient API and handle common scenarios automatically. ================================================ FILE: docs/en/api/commands/network.md ================================================ # Network Commands Network commands provide comprehensive control over network requests, responses, and browser networking behavior. 
## Overview The network commands module enables request interception, response modification, cookie management, and network monitoring capabilities. ::: pydoll.commands.network_commands options: show_root_heading: true show_source: false heading_level: 2 filters: - "!^_" - "!^__" ## Usage Network commands are used for advanced scenarios like request interception and network monitoring: ```python from pydoll.commands.network_commands import enable, set_request_interception from pydoll.connection.connection_handler import ConnectionHandler # Enable network monitoring connection = ConnectionHandler() await enable(connection) # Enable request interception await set_request_interception(connection, patterns=[{"urlPattern": "*"}]) ``` ## Key Functionality The network commands module provides functions for: ### Request Management - `enable()` / `disable()` - Enable/disable network monitoring - `set_request_interception()` - Intercept and modify requests - `continue_intercepted_request()` - Continue or modify intercepted requests - `get_request_post_data()` - Get request body data ### Response Handling - `get_response_body()` - Get response content - `fulfill_request()` - Provide custom responses - `fail_request()` - Simulate network failures ### Cookie Management - `get_cookies()` - Get browser cookies - `set_cookies()` - Set browser cookies - `delete_cookies()` - Delete specific cookies - `clear_browser_cookies()` - Clear all cookies ### Cache Control - `clear_browser_cache()` - Clear browser cache - `set_cache_disabled()` - Disable browser cache - `get_response_body_for_interception()` - Get cached responses ### Security & Headers - `set_user_agent_override()` - Override user agent - `set_extra_http_headers()` - Add custom headers - `emulate_network_conditions()` - Simulate network conditions ## Advanced Use Cases ### Request Interception ```python # Intercept and modify requests await set_request_interception(connection, patterns=[ {"urlPattern": "*/api/*", 
"requestStage": "Request"} ]) # Handle intercepted request async def handle_request(request): if "api/login" in request.url: # Modify request headers headers = request.headers.copy() headers["Authorization"] = "Bearer token" await continue_intercepted_request( connection, request_id=request.request_id, headers=headers ) ``` ### Response Mocking ```python # Mock API responses await fulfill_request( connection, request_id=request_id, response_code=200, response_headers={"Content-Type": "application/json"}, body='{"status": "success"}' ) ``` !!! warning "Performance Impact" Network interception can impact page loading performance. Use selectively and disable when not needed. ================================================ FILE: docs/en/api/commands/page.md ================================================ # Page Commands Page commands handle page navigation, lifecycle events, and page-level operations. ## Overview The page commands module provides functionality for navigating between pages, managing page lifecycle, handling JavaScript execution, and controlling page behavior. 
::: pydoll.commands.page_commands options: show_root_heading: true show_source: false heading_level: 2 filters: - "!^_" - "!^__" ## Usage Page commands are used extensively by the `Tab` class for navigation and page management: ```python from pydoll.commands.page_commands import navigate, reload, enable from pydoll.connection.connection_handler import ConnectionHandler # Navigate to a URL connection = ConnectionHandler() await enable(connection) # Enable page events await navigate(connection, url="https://example.com") # Reload the page await reload(connection) ``` ## Key Functionality The page commands module provides functions for: ### Navigation - `navigate()` - Navigate to URLs - `reload()` - Reload current page - `go_back()` - Navigate back in history - `go_forward()` - Navigate forward in history - `stop_loading()` - Stop page loading ### Page Lifecycle - `enable()` / `disable()` - Enable/disable page events - `get_frame_tree()` - Get page frame structure - `get_navigation_history()` - Get navigation history ### Content Management - `get_resource_content()` - Get page resource content - `search_in_resource()` - Search within page resources - `set_document_content()` - Set page HTML content ### Screenshots & PDF - `capture_screenshot()` - Take page screenshots - `print_to_pdf()` - Generate PDF from page - `capture_snapshot()` - Capture page snapshots ### JavaScript Execution - `add_script_to_evaluate_on_new_document()` - Add startup scripts - `remove_script_to_evaluate_on_new_document()` - Remove startup scripts ### Page Settings - `set_lifecycle_events_enabled()` - Control lifecycle events - `set_ad_blocking_enabled()` - Enable/disable ad blocking - `set_bypass_csp()` - Bypass Content Security Policy ## Advanced Features ### Frame Management ```python # Get all frames in the page frame_tree = await get_frame_tree(connection) for frame in frame_tree.child_frames: print(f"Frame: {frame.frame.url}") ``` ### Resource Interception ```python # Get resource content 
content = await get_resource_content( connection, frame_id=frame_id, url="https://example.com/script.js" ) ``` ### Page Events The page commands work with various page events: - `Page.loadEventFired` - Page load completed - `Page.domContentEventFired` - DOM content loaded - `Page.frameNavigated` - Frame navigation - `Page.frameStartedLoading` - Frame loading started !!! tip "Tab Class Integration" Most page operations are available through the `Tab` class methods like `tab.go_to()`, `tab.reload()`, and `tab.screenshot()` which provide a more convenient API. ================================================ FILE: docs/en/api/commands/runtime.md ================================================ # Runtime Commands Runtime commands provide JavaScript execution capabilities and runtime environment management. ## Overview The runtime commands module enables JavaScript code execution, object inspection, and runtime environment control within browser contexts. ::: pydoll.commands.runtime_commands options: show_root_heading: true show_source: false heading_level: 2 filters: - "!^_" - "!^__" ## Usage Runtime commands are used for JavaScript execution and runtime management: ```python from pydoll.commands.runtime_commands import evaluate, enable from pydoll.connection.connection_handler import ConnectionHandler # Enable runtime events connection = ConnectionHandler() await enable(connection) # Execute JavaScript result = await evaluate( connection, expression="document.title", return_by_value=True ) print(result.value) # Page title ``` ## Key Functionality The runtime commands module provides functions for: ### JavaScript Execution - `evaluate()` - Execute JavaScript expressions - `call_function_on()` - Call functions on objects - `compile_script()` - Compile JavaScript for reuse - `run_script()` - Run compiled scripts ### Object Management - `get_properties()` - Get object properties - `release_object()` - Release object references - `release_object_group()` - Release object 
groups

### Runtime Control

- `enable()` / `disable()` - Enable/disable runtime events
- `discard_console_entries()` - Clear console entries
- `set_custom_object_formatter_enabled()` - Enable custom formatters

### Exception Handling

- `set_async_call_stack_depth()` - Set call stack depth
- Exception capture and reporting
- Error object inspection

## Advanced Usage

### Complex JavaScript Execution

```python
# Execute complex JavaScript with error handling.
# The body is wrapped in an immediately invoked arrow function because a
# bare top-level `return` is a syntax error in an evaluated expression.
script = """
(() => {
    try {
        const elements = document.querySelectorAll('.item');
        return Array.from(elements).map(el => ({
            text: el.textContent,
            href: el.href
        }));
    } catch (error) {
        return { error: error.message };
    }
})()
"""

result = await evaluate(
    connection,
    expression=script,
    return_by_value=True,
    await_promise=True
)
```

### Object Inspection

```python
# Get detailed object properties
properties = await get_properties(
    connection,
    object_id=object_id,
    own_properties=True,
    accessor_properties_only=False
)

for prop in properties:
    print(f"{prop.name}: {prop.value}")
```

### Console Integration

Runtime commands integrate with the browser console:

- Console messages and errors
- Console API method calls
- Custom console formatters

!!! note "Performance Considerations"
    JavaScript execution through runtime commands can be slower than native browser execution. Use judiciously for complex operations.

================================================
FILE: docs/en/api/commands/storage.md
================================================
# Storage Commands

Storage commands provide comprehensive browser storage management including cookies, localStorage, sessionStorage, and IndexedDB.

## Overview

The storage commands module enables management of all browser storage mechanisms, providing functionality for data persistence and retrieval.
::: pydoll.commands.storage_commands options: show_root_heading: true show_source: false heading_level: 2 filters: - "!^_" - "!^__" ## Usage Storage commands are used for managing browser storage across different mechanisms: ```python from pydoll.commands.storage_commands import get_cookies, set_cookies, clear_data_for_origin from pydoll.connection.connection_handler import ConnectionHandler # Get cookies for a domain connection = ConnectionHandler() cookies = await get_cookies(connection, urls=["https://example.com"]) # Set a new cookie await set_cookies(connection, cookies=[{ "name": "session_id", "value": "abc123", "domain": "example.com", "path": "/", "httpOnly": True, "secure": True }]) # Clear all storage for an origin await clear_data_for_origin( connection, origin="https://example.com", storage_types="all" ) ``` ## Key Functionality The storage commands module provides functions for: ### Cookie Management - `get_cookies()` - Get cookies by URL or domain - `set_cookies()` - Set new cookies - `delete_cookies()` - Delete specific cookies - `clear_cookies()` - Clear all cookies ### Local Storage - `get_dom_storage_items()` - Get localStorage items - `set_dom_storage_item()` - Set localStorage item - `remove_dom_storage_item()` - Remove localStorage item - `clear_dom_storage()` - Clear localStorage ### Session Storage - Session storage operations (similar to localStorage) - Session-specific data management - Tab-isolated storage ### IndexedDB - `get_database_names()` - Get IndexedDB databases - `request_database()` - Access database structure - `request_data()` - Query database data - `clear_object_store()` - Clear object stores ### Cache Storage - `request_cache_names()` - Get cache names - `request_cached_response()` - Get cached responses - `delete_cache()` - Delete cache entries ### Application Cache (Deprecated) - Legacy application cache support - Manifest-based caching ## Advanced Features ### Bulk Operations ```python # Clear all storage types for 
multiple origins origins = ["https://example.com", "https://api.example.com"] for origin in origins: await clear_data_for_origin( connection, origin=origin, storage_types="cookies,local_storage,session_storage,indexeddb" ) ``` ### Storage Quotas ```python # Get storage quota information quota_info = await get_usage_and_quota(connection, origin="https://example.com") print(f"Used: {quota_info.usage} bytes") print(f"Quota: {quota_info.quota} bytes") ``` ### Cross-Origin Storage ```python # Manage storage across different origins await set_cookies(connection, cookies=[{ "name": "cross_site_token", "value": "token123", "domain": ".example.com", # Applies to all subdomains "sameSite": "None", "secure": True }]) ``` ## Storage Types The module supports various storage mechanisms: | Storage Type | Persistence | Scope | Capacity | |--------------|-------------|-------|----------| | Cookies | Persistent | Domain/Path | ~4KB per cookie | | localStorage | Persistent | Origin | ~5-10MB | | sessionStorage | Session | Tab | ~5-10MB | | IndexedDB | Persistent | Origin | Large (GB+) | | Cache API | Persistent | Origin | Large | !!! warning "Privacy Considerations" Storage operations can affect user privacy. Always handle storage data responsibly and in compliance with privacy regulations. ================================================ FILE: docs/en/api/commands/target.md ================================================ # Target Commands Target commands manage browser targets including tabs, windows, and other browsing contexts. ## Overview The target commands module provides functionality for creating, managing, and controlling browser targets such as tabs, popup windows, and service workers. 
::: pydoll.commands.target_commands options: show_root_heading: true show_source: false heading_level: 2 filters: - "!^_" - "!^__" ## Usage Target commands are used internally by browser classes to manage tabs and windows: ```python from pydoll.commands.target_commands import get_targets, create_target, close_target from pydoll.connection.connection_handler import ConnectionHandler # Get all browser targets connection = ConnectionHandler() targets = await get_targets(connection) # Create a new tab new_target = await create_target(connection, url="https://example.com") # Close a target await close_target(connection, target_id=new_target.target_id) ``` ## Key Functionality The target commands module provides functions for: ### Target Management - `get_targets()` - List all browser targets - `create_target()` - Create new tabs or windows - `close_target()` - Close specific targets - `activate_target()` - Bring target to foreground ### Target Information - `get_target_info()` - Get detailed target information - Target types: page, background_page, service_worker, browser - Target states: attached, detached, crashed ### Session Management - `attach_to_target()` - Attach to target for control - `detach_from_target()` - Detach from target - `send_message_to_target()` - Send commands to targets ### Browser Context - `create_browser_context()` - Create isolated browser context - `dispose_browser_context()` - Remove browser context - `get_browser_contexts()` - List browser contexts ## Target Types Different types of targets can be managed: ### Page Targets ```python # Create a new tab page_target = await create_target( connection, url="https://example.com", width=1920, height=1080, browser_context_id=None # Default context ) ``` ### Popup Windows ```python # Create a popup window popup_target = await create_target( connection, url="https://popup.example.com", width=800, height=600, new_window=True ) ``` ### Incognito Contexts ```python # Create incognito browser context 
incognito_context = await create_browser_context(connection) # Create tab in incognito context incognito_tab = await create_target( connection, url="https://private.example.com", browser_context_id=incognito_context.browser_context_id ) ``` !!! info "Headless vs Headed: how contexts show up" Browser contexts are isolated logical environments. In headed mode, the first page created inside a new context will usually open in a new OS window. In headless mode, no window is shown — the isolation remains purely logical (cookies, storage, cache and auth state are still separate per context). Prefer contexts in headless/CI pipelines for performance and clean isolation. ## Advanced Features ### Target Events Target commands work with various target events: - `Target.targetCreated` - New target created - `Target.targetDestroyed` - Target closed - `Target.targetInfoChanged` - Target information updated - `Target.targetCrashed` - Target crashed ### Multi-Target Coordination ```python # Manage multiple tabs targets = await get_targets(connection) page_targets = [t for t in targets if t.type == "page"] for target in page_targets: # Perform operations on each tab await activate_target(connection, target_id=target.target_id) # ... do work in this tab ``` ### Target Isolation ```python # Create isolated browser context for testing test_context = await create_browser_context(connection) # All targets in this context are isolated test_tab1 = await create_target( connection, url="https://test1.com", browser_context_id=test_context.browser_context_id ) test_tab2 = await create_target( connection, url="https://test2.com", browser_context_id=test_context.browser_context_id ) ``` !!! note "Browser Integration" Target commands are primarily used internally by the `Chrome` and `Edge` browser classes. The high-level browser APIs provide more convenient methods for tab management. 
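The multi-target filtering and context isolation described on this page can be sketched on plain data, without a running browser. The example below assumes a result shaped like the CDP `Target.getTargets` response (a `targetInfos` list whose entries carry `targetId`, `type`, and an optional `browserContextId`). The sample IDs and URLs are made up, and `pages_by_context` is a hypothetical helper, not part of Pydoll.

```python
from collections import defaultdict

# Hypothetical sample shaped like a CDP `Target.getTargets` result.
# All IDs and URLs are invented for illustration.
SAMPLE_RESULT = {
    "targetInfos": [
        {"targetId": "A1", "type": "page", "url": "https://example.com"},
        {"targetId": "B2", "type": "service_worker", "url": "https://example.com/sw.js"},
        {"targetId": "C3", "type": "page", "url": "https://test1.com", "browserContextId": "CTX-1"},
        {"targetId": "D4", "type": "page", "url": "https://test2.com", "browserContextId": "CTX-1"},
    ]
}


def pages_by_context(result):
    """Group page target IDs by browser context ID.

    Entries without a `browserContextId` key are grouped under None,
    which this sketch treats as the default context.
    """
    grouped = defaultdict(list)
    for info in result.get("targetInfos", []):
        if info.get("type") != "page":
            continue  # skip service workers, background pages, etc.
        grouped[info.get("browserContextId")].append(info["targetId"])
    return dict(grouped)


print(pages_by_context(SAMPLE_RESULT))
```

Grouping by context first makes it straightforward to operate on one isolated context at a time, for example closing every tab that belongs to a test context while leaving the default context untouched.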
================================================ FILE: docs/en/api/connection/connection.md ================================================ # Connection Handler ::: pydoll.connection.connection_handler.ConnectionHandler options: show_root_heading: true show_source: false heading_level: 2 ================================================ FILE: docs/en/api/connection/managers.md ================================================ # Connection Managers ## CommandsManager ::: pydoll.connection.managers.commands_manager.CommandsManager options: show_root_heading: true show_source: false heading_level: 3 ## EventsManager ::: pydoll.connection.managers.events_manager.EventsManager options: show_root_heading: true show_source: false heading_level: 3 ================================================ FILE: docs/en/api/core/constants.md ================================================ # Constants This section documents all constants, enums, and configuration values used throughout Pydoll. ::: pydoll.constants options: show_root_heading: true show_source: false heading_level: 2 group_by_category: true members_order: source ================================================ FILE: docs/en/api/core/exceptions.md ================================================ # Exceptions This section documents all custom exceptions that can be raised by Pydoll operations. ::: pydoll.exceptions options: show_root_heading: true show_source: false heading_level: 2 group_by_category: true members_order: source ================================================ FILE: docs/en/api/core/utils.md ================================================ # Utilities This section documents utility functions and helper classes used throughout Pydoll. 
::: pydoll.utils options: show_root_heading: true show_source: false heading_level: 2 group_by_category: true members_order: source ================================================ FILE: docs/en/api/elements/mixins.md ================================================ # Element Mixins The mixins module provides reusable functionality that can be mixed into element classes to extend their capabilities. ## Find Elements Mixin The `FindElementsMixin` provides element finding capabilities to classes that include it. ::: pydoll.elements.mixins.find_elements_mixin options: show_root_heading: true show_source: false heading_level: 2 filters: - "!^_" - "!^__" ## Usage Mixins are typically used internally by the library to compose functionality. The `FindElementsMixin` is used by classes like `Tab` and `WebElement` to provide element finding methods: ```python # These methods come from FindElementsMixin element = await tab.find(id="username") elements = await tab.find(class_name="item", find_all=True) element = await tab.query("#submit-button") ``` ## Available Methods The `FindElementsMixin` provides several methods for finding elements: - `find()` - Modern element finding with keyword arguments - `query()` - CSS selector and XPath queries - `find_element()` - Legacy element finding method - `find_elements()` - Legacy method for finding multiple elements !!! tip "Modern vs Legacy" The `find()` method is the modern, recommended approach for finding elements. The `find_element()` and `find_elements()` methods are maintained for backward compatibility. 
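To make the mixin idea concrete, here is a toy illustration of how one mixin can give several host classes the same keyword-based lookup interface. It is a sketch of the pattern only, not Pydoll's implementation: `FindMixin`, `build_selector`, `FakeTab`, and `FakeElement` are hypothetical names, and the real `FindElementsMixin` may resolve keyword arguments differently.

```python
class FindMixin:
    """Toy mixin: turns keyword arguments into a CSS selector string."""

    def build_selector(self, tag_name="", id=None, class_name=None, **attrs):
        selector = tag_name
        if id:
            selector += f"#{id}"
        if class_name:
            selector += f".{class_name}"
        # Any remaining keyword becomes an attribute filter, in sorted order
        for name, value in sorted(attrs.items()):
            selector += f'[{name}="{value}"]'
        return selector or "*"


class FakeTab(FindMixin):
    """Hypothetical host class standing in for a tab."""


class FakeElement(FindMixin):
    """Hypothetical host class standing in for an element."""


print(FakeTab().build_selector(id="username"))
print(FakeElement().build_selector(tag_name="button", class_name="submit"))
```

Because the logic lives in the mixin, both host classes expose the same method without inheriting from each other, which is the property the library relies on when `Tab` and `WebElement` share their finding methods.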
================================================ FILE: docs/en/api/elements/shadow_root.md ================================================ # ShadowRoot ::: pydoll.elements.shadow_root.ShadowRoot options: show_root_heading: true show_source: false heading_level: 2 members_order: source group_by_category: true ================================================ FILE: docs/en/api/elements/web_element.md ================================================ # WebElement ::: pydoll.elements.web_element.WebElement options: show_root_heading: true show_source: false heading_level: 2 members_order: source group_by_category: true ================================================ FILE: docs/en/api/index.md ================================================ # API Reference Welcome to the Pydoll API Reference! This section provides comprehensive documentation for all classes, methods, and functions available in the Pydoll library. ## Overview Pydoll is organized into several key modules, each serving a specific purpose in browser automation: ### Browser Module The browser module contains classes for managing browser instances and their lifecycle. - **[Chrome](browser/chrome.md)** - Chrome browser automation - **[Edge](browser/edge.md)** - Microsoft Edge browser automation - **[Options](browser/options.md)** - Browser configuration options - **[Tab](browser/tab.md)** - Tab management and interaction - **[Requests](browser/requests.md)** - HTTP requests within browser context - **[Managers](browser/managers.md)** - Browser lifecycle managers ### Elements Module The elements module provides classes for interacting with web page elements. - **[WebElement](elements/web_element.md)** - Individual element interaction - **[Mixins](elements/mixins.md)** - Reusable element functionality ### Connection Module The connection module handles communication with the browser through the Chrome DevTools Protocol. 
- **[Connection Handler](connection/connection.md)** - WebSocket connection management - **[Managers](connection/managers.md)** - Connection lifecycle managers ### Commands Module The commands module provides low-level Chrome DevTools Protocol command implementations. - **[Commands Overview](commands/index.md)** - CDP command implementations by domain ### Protocol Module The protocol module implements the Chrome DevTools Protocol commands and events. - **[Base Types](protocol/base.md)** - Base types for Chrome DevTools Protocol - **[Browser](protocol/browser.md)** - Browser domain commands and events - **[DOM](protocol/dom.md)** - DOM domain commands and events - **[Fetch](protocol/fetch.md)** - Fetch domain commands and events - **[Input](protocol/input.md)** - Input domain commands and events - **[Network](protocol/network.md)** - Network domain commands and events - **[Page](protocol/page.md)** - Page domain commands and events - **[Runtime](protocol/runtime.md)** - Runtime domain commands and events - **[Storage](protocol/storage.md)** - Storage domain commands and events - **[Target](protocol/target.md)** - Target domain commands and events ### Core Module The core module contains fundamental utilities, constants, and exceptions. 
- **[Constants](core/constants.md)** - Library constants and enums - **[Exceptions](core/exceptions.md)** - Custom exception classes - **[Utils](core/utils.md)** - Utility functions ## Quick Navigation ### Most Common Classes | Class | Purpose | Module | |-------|---------|--------| | `Chrome` | Chrome browser automation | `pydoll.browser.chromium` | | `Edge` | Edge browser automation | `pydoll.browser.chromium` | | `Tab` | Tab interaction and control | `pydoll.browser.tab` | | `WebElement` | Element interaction | `pydoll.elements.web_element` | | `ChromiumOptions` | Browser configuration | `pydoll.browser.options` | ### Key Enums and Constants | Name | Purpose | Module | |------|---------|--------| | `By` | Element selector strategies | `pydoll.constants` | | `Key` | Keyboard key constants | `pydoll.constants` | | `PermissionType` | Browser permission types | `pydoll.constants` | ### Common Exceptions | Exception | When Raised | Module | |-----------|-------------|--------| | `ElementNotFound` | Element not found in DOM | `pydoll.exceptions` | | `WaitElementTimeout` | Element wait timeout | `pydoll.exceptions` | | `BrowserNotStarted` | Browser not started | `pydoll.exceptions` | ## Usage Patterns ### Basic Browser Automation ```python from pydoll.browser.chromium import Chrome async with Chrome() as browser: tab = await browser.start() await tab.go_to("https://example.com") element = await tab.find(id="my-element") await element.click() ``` ### Element Finding ```python # Using the modern find() method element = await tab.find(id="username") element = await tab.find(tag_name="button", class_name="submit") # Using CSS selectors or XPath element = await tab.query("#username") element = await tab.query("//button[@class='submit']") ``` ### Event Handling ```python await tab.enable_page_events() await tab.on('Page.loadEventFired', handle_page_load) ``` ## Type Hints Pydoll is fully typed and provides comprehensive type hints for better IDE support and code safety. 
All public APIs include proper type annotations.

```python
from typing import Optional, List
from pydoll.elements.web_element import WebElement

# Methods return properly typed objects
element: Optional[WebElement] = await tab.find(id="test", raise_exc=False)
elements: List[WebElement] = await tab.find(class_name="item", find_all=True)
```

## Async/Await Support

All Pydoll operations are asynchronous and must be used with `async`/`await`:

```python
import asyncio

from pydoll.browser.chromium import Chrome


async def main():
    # All Pydoll operations are async
    async with Chrome() as browser:
        tab = await browser.start()
        await tab.go_to("https://example.com")

asyncio.run(main())
```

Browse the sections below to explore the complete API documentation for each module.

================================================
FILE: docs/en/api/protocol/base.md
================================================
# Protocol Base Types

Base types and structures for Chrome DevTools Protocol commands, responses, and events.

## Base Types

::: pydoll.protocol.base
    options:
      show_root_heading: true
      show_source: false
      heading_level: 3
      group_by_category: true
      members_order: source
      filters:
        - "!^__"

================================================
FILE: docs/en/api/protocol/browser.md
================================================
# Browser Protocol

Browser domain commands, events and types for Chrome DevTools Protocol.

## Methods

::: pydoll.protocol.browser.methods
    options:
      show_root_heading: true
      show_source: false
      heading_level: 2

## Events

::: pydoll.protocol.browser.events
    options:
      show_root_heading: true
      show_source: false
      heading_level: 2

## Types

::: pydoll.protocol.browser.types
    options:
      show_root_heading: true
      show_source: false
      heading_level: 2

================================================
FILE: docs/en/api/protocol/dom.md
================================================
# DOM Protocol

DOM domain commands and events for Chrome DevTools Protocol.
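For orientation, a Chrome DevTools Protocol command on the wire is a JSON object with an `id`, a `method` such as `DOM.querySelector`, and optional `params`. The sketch below builds such a message with a `TypedDict`. The `Command` type and `make_command` helper are hypothetical names used only for illustration, and the types Pydoll defines in `pydoll.protocol` may be organized differently.

```python
import json
from typing import Any, TypedDict


class Command(TypedDict, total=False):
    """Shape of a CDP command message (illustrative, not Pydoll's type)."""

    id: int
    method: str
    params: dict[str, Any]


def make_command(message_id: int, method: str, **params: Any) -> Command:
    """Build a CDP command; `params` is omitted when no arguments are given."""
    command: Command = {"id": message_id, "method": method}
    if params:
        command["params"] = params
    return command


cmd = make_command(1, "DOM.querySelector", nodeId=1, selector="#username")
print(json.dumps(cmd))
```

The browser answers with a message carrying the same `id`, which is how a client matches responses to the commands it sent.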
## Methods ::: pydoll.protocol.dom.methods options: show_root_heading: true show_source: false heading_level: 2 ## Events ::: pydoll.protocol.dom.events options: show_root_heading: true show_source: false heading_level: 2 ## Types ::: pydoll.protocol.dom.types options: show_root_heading: true show_source: false heading_level: 2 ================================================ FILE: docs/en/api/protocol/fetch.md ================================================ # Fetch Protocol Fetch domain commands, events and types for Chrome DevTools Protocol. ## Methods ::: pydoll.protocol.fetch.methods options: show_root_heading: true show_source: false heading_level: 2 ## Events ::: pydoll.protocol.fetch.events options: show_root_heading: true show_source: false heading_level: 2 ## Types ::: pydoll.protocol.fetch.types options: show_root_heading: true show_source: false heading_level: 2 ================================================ FILE: docs/en/api/protocol/input.md ================================================ # Input Protocol Input domain commands, events and types for Chrome DevTools Protocol. ## Methods ::: pydoll.protocol.input.methods options: show_root_heading: true show_source: false heading_level: 2 ## Events ::: pydoll.protocol.input.events options: show_root_heading: true show_source: false heading_level: 2 ## Types ::: pydoll.protocol.input.types options: show_root_heading: true show_source: false heading_level: 2 ================================================ FILE: docs/en/api/protocol/network.md ================================================ # Network Protocol Network domain commands and events for Chrome DevTools Protocol. 
## Methods ::: pydoll.protocol.network.methods options: show_root_heading: false show_source: false heading_level: 2 ## Events ::: pydoll.protocol.network.events options: show_root_heading: false show_source: false heading_level: 2 ## Types ::: pydoll.protocol.network.types options: show_root_heading: false show_source: false heading_level: 2 ================================================ FILE: docs/en/api/protocol/page.md ================================================ # Page Protocol Page domain commands, events and types for Chrome DevTools Protocol. ## Methods ::: pydoll.protocol.page.methods options: show_root_heading: true show_source: false heading_level: 2 ## Events ::: pydoll.protocol.page.events options: show_root_heading: true show_source: false heading_level: 2 ## Types ::: pydoll.protocol.page.types options: show_root_heading: true show_source: false heading_level: 2 ================================================ FILE: docs/en/api/protocol/runtime.md ================================================ # Runtime Protocol Runtime domain commands, events and types for Chrome DevTools Protocol. ## Methods ::: pydoll.protocol.runtime.methods options: show_root_heading: true show_source: false heading_level: 2 ## Events ::: pydoll.protocol.runtime.events options: show_root_heading: true show_source: false heading_level: 2 ## Types ::: pydoll.protocol.runtime.types options: show_root_heading: true show_source: false heading_level: 2 ================================================ FILE: docs/en/api/protocol/storage.md ================================================ # Storage Protocol Storage domain commands, events and types for Chrome DevTools Protocol. 
## Methods ::: pydoll.protocol.storage.methods options: show_root_heading: true show_source: false heading_level: 2 ## Events ::: pydoll.protocol.storage.events options: show_root_heading: true show_source: false heading_level: 2 ## Types ::: pydoll.protocol.storage.types options: show_root_heading: true show_source: false heading_level: 2 ================================================ FILE: docs/en/api/protocol/target.md ================================================ # Target Protocol Target domain commands and events for Chrome DevTools Protocol. ## Methods ::: pydoll.protocol.target.methods options: show_root_heading: true show_source: false heading_level: 2 ## Events ::: pydoll.protocol.target.events options: show_root_heading: true show_source: false heading_level: 2 ## Types ::: pydoll.protocol.target.types options: show_root_heading: true show_source: false heading_level: 2 ================================================ FILE: docs/en/deep-dive/architecture/browser-domain.md ================================================ # Browser Domain Architecture The Browser domain represents the highest level of Pydoll's automation hierarchy, managing the browser process lifecycle, CDP connections, context isolation, and global browser operations. This document explores the internal architecture, design decisions, and technical implementation of browser-level control. !!! info "Practical Usage Guide" For practical examples and usage patterns, see the [Browser Management](../features/browser-management/tabs.md) and [Browser Contexts](../features/browser-management/contexts.md) guides. ## Architectural Overview The Browser domain sits at the intersection of process management, protocol communication, and resource coordination. 
It orchestrates multiple specialized components to provide a unified interface for browser automation:

```mermaid
graph LR
    Browser[Browser Instance]
    Browser --> ProcessManager[Process Manager]
    Browser --> ProxyManager[Proxy Manager]
    Browser --> TempDirManager[Temp Directory Manager]
    Browser --> TabRegistry[Tab Registry]
    Browser --> ConnectionHandler[Connection Handler]

    ProcessManager --> |Manages| BrowserProcess[Browser Process]
    ConnectionHandler <--> |WebSocket| CDP[Chrome DevTools Protocol]
    TabRegistry --> |Manages| Tabs[Tab Instances]
    CDP <--> BrowserProcess
```

### Hierarchy and Abstraction

The Browser domain is implemented as an **abstract base class** that defines the contract for all browser implementations:

```python
from abc import ABC, abstractmethod


class Browser(ABC):
    """Abstract base class for browser automation via CDP."""

    @abstractmethod
    def _get_default_binary_location(self) -> str:
        """Subclasses must provide browser-specific executable path."""

    # "Tab" is quoted so this excerpt stands alone without importing the Tab class
    async def start(self, headless: bool = False) -> "Tab":
        """Concrete implementation shared by all browsers."""
        # 1. Resolve binary location
        # 2. Setup user data directory
        # 3. Start browser process
        # 4. Verify CDP connection
        # 5. Configure proxy (if needed)
        # 6. Return initial tab
```

This design enables **polymorphism**: Chrome, Edge, and other Chromium-based browsers share almost all of their code, differing only in executable paths and minor flag variations.

## Component Architecture

The Browser class coordinates several specialized managers, each responsible for a specific aspect of browser automation. Understanding these components is key to understanding Pydoll's design.

### Connection Handler

The ConnectionHandler is the **communication bridge** between Pydoll and the browser process.
It manages: - **WebSocket lifecycle**: Connection establishment, keep-alive, reconnection - **Command execution**: Sending CDP commands and awaiting responses - **Event dispatching**: Routing CDP events to registered callbacks - **Callback registry**: Maintaining event listeners per connection ```python class Browser: def __init__(self, ...): # ConnectionHandler is initialized with port or WebSocket address self._connection_handler = ConnectionHandler(self._connection_port) async def _execute_command(self, command, timeout=10): """All CDP commands flow through the connection handler.""" return await self._connection_handler.execute_command(command, timeout) ``` !!! info "Connection Layer Deep Dive" For detailed information on WebSocket communication, command/response flow, and async patterns, see [Connection Layer Architecture](./connection-layer.md). ### Process Manager The BrowserProcessManager handles **operating system process lifecycle**: ```python class BrowserProcessManager: def start_browser_process(self, binary, port, arguments): """ 1. Constructs command-line with binary path + arguments 2. Spawns subprocess with proper stdio handling 3. Monitors process startup 4. Stores process handle for later termination """ def stop_process(self): """ 1. Attempts graceful termination (SIGTERM) 2. Waits for process exit 3. Force-kills if timeout exceeded (SIGKILL) 4. 
Cleans up process resources """ ``` **Why separate process management?** - **Testability**: Process manager can be mocked for unit tests - **Cross-platform**: Encapsulates OS-specific process handling - **Reliability**: Handles edge cases like zombie processes, orphaned children ### Tab Registry The Browser maintains a **registry of Tab instances** to ensure singleton behavior per target: ```python class Browser: def __init__(self, ...): self._tabs_opened: dict[str, Tab] = {} async def new_tab(self, url='', browser_context_id=None) -> Tab: # Create target via CDP response = await self._execute_command( TargetCommands.create_target(browser_context_id=browser_context_id) ) target_id = response['result']['targetId'] # Check if tab already exists in registry if target_id in self._tabs_opened: return self._tabs_opened[target_id] # Create new Tab instance and register it tab = Tab(self, target_id=target_id, ...) self._tabs_opened[target_id] = tab return tab ``` **Why singleton Tab instances?** - **State consistency**: Multiple references to same tab share state (enabled domains, callbacks) - **Memory efficiency**: Prevents duplicate Tab instances for same target - **Event routing**: Ensures events route to correct Tab instance ### Proxy Authentication Architecture Pydoll implements **automatic proxy authentication** via the Fetch domain to avoid exposing credentials in CDP commands. 
The implementation uses **two distinct mechanisms** depending on proxy scope: #### Mechanism 1: Browser-Level Proxy Auth (Global Proxy) When a proxy is configured via `ChromiumOptions` (applies to all tabs in the default context): ```python # In Browser.start() -> _configure_proxy() async def _configure_proxy(self, private_proxy, proxy_credentials): # Enable Fetch AT BROWSER LEVEL await self.enable_fetch_events(handle_auth_requests=True) # Register callbacks AT BROWSER LEVEL (affects ALL tabs) await self.on(FetchEvent.REQUEST_PAUSED, self._continue_request_callback, temporary=True) await self.on(FetchEvent.AUTH_REQUIRED, partial(self._continue_request_with_auth_callback, proxy_username=proxy_credentials[0], proxy_password=proxy_credentials[1]), temporary=True) ``` **Scope:** Browser-wide WebSocket connection → affects **all tabs in default context** #### Mechanism 2: Tab-Level Proxy Auth (Per-Context Proxy) When a proxy is configured per-context via `create_browser_context(proxy_server=...)`: ```python # Store credentials per context async def create_browser_context(self, proxy_server, ...): sanitized_proxy, extracted_auth = self._sanitize_proxy_and_extract_auth(proxy_server) response = await self._execute_command( TargetCommands.create_browser_context(proxy_server=sanitized_proxy) ) context_id = response['result']['browserContextId'] if extracted_auth: self._context_proxy_auth[context_id] = extracted_auth # Store per context return context_id # Setup auth for EACH tab in that context async def _setup_context_proxy_auth_for_tab(self, tab, browser_context_id): creds = self._context_proxy_auth.get(browser_context_id) if not creds: return # Enable Fetch ON THE TAB (tab-level WebSocket) await tab.enable_fetch_events(handle_auth=True) # Register callbacks ON THE TAB (affects only this tab) await tab.on(FetchEvent.REQUEST_PAUSED, partial(self._tab_continue_request_callback, tab=tab), temporary=True) await tab.on(FetchEvent.AUTH_REQUIRED, 
partial(self._tab_continue_request_with_auth_callback, tab=tab, proxy_username=creds[0], proxy_password=creds[1]), temporary=True) ``` **Scope:** Tab-level WebSocket connection → affects **only that specific tab** #### Why Two Mechanisms? | Aspect | Browser-Level | Tab-Level | |--------|---------------|-----------| | **Trigger** | Proxy in `ChromiumOptions` | Proxy in `create_browser_context()` | | **WebSocket** | Browser-level connection | Tab-level connection | | **Scope** | All tabs in default context | Only tabs in that context | | **Efficiency** | One listener for all tabs | One listener per tab | | **Isolation** | No context separation | Each context has different credentials | **Design rationale for tab-level auth:** - **Context isolation**: Each context can have a **different proxy** with **different credentials** - **CDP limitation**: Fetch domain cannot be scoped to a specific context at browser level - **Tradeoff**: Slightly less efficient (one listener per tab), but necessary for per-context proxy support This architecture ensures **credentials never appear in CDP logs** and authentication is handled transparently. !!! warning "Fetch Domain Side Effects" - **Browser-level Fetch**: Temporarily pauses **all requests across all tabs** in the default context until auth completes - **Tab-level Fetch**: Temporarily pauses **all requests in that specific tab** until auth completes This is a CDP limitation - Fetch enables request interception. After authentication completes, Fetch is disabled to minimize overhead. ## Initialization and Lifecycle ### Constructor Design The Browser constructor initializes all internal components but **does not start the browser process**. This separation allows configuration before launch: ```python class Browser(ABC): def __init__( self, options_manager: BrowserOptionsManager, connection_port: Optional[int] = None, ): # 1. Validate parameters self._validate_connection_port(connection_port) # 2. 
Initialize options via manager self.options = options_manager.initialize_options() # 3. Determine CDP port (random if not specified) self._connection_port = connection_port or randint(9223, 9322) # 4. Initialize specialized managers self._proxy_manager = ProxyManager(self.options) self._browser_process_manager = BrowserProcessManager() self._temp_directory_manager = TempDirectoryManager() self._connection_handler = ConnectionHandler(self._connection_port) # 5. Initialize state tracking self._tabs_opened: dict[str, Tab] = {} self._context_proxy_auth: dict[str, tuple[str, str]] = {} self._ws_address: Optional[str] = None ``` **Key design decisions:** - **Lazy process start**: Constructor is synchronous; `start()` is async - **Port flexibility**: Random port prevents collisions in parallel automation - **Options manager pattern**: Strategy pattern for browser-specific configuration - **Component composition**: Specialized managers instead of monolithic class ### Start Sequence The `start()` method orchestrates browser launch and connection: ```python async def start(self, headless: bool = False) -> Tab: # 1. Resolve binary location binary_location = self.options.binary_location or self._get_default_binary_location() # 2. Setup user data directory (temp or persistent) self._setup_user_dir() # 3. Extract proxy credentials (if private proxy) proxy_config = self._proxy_manager.get_proxy_credentials() # 4. Start browser process with arguments self._browser_process_manager.start_browser_process( binary_location, self._connection_port, self.options.arguments ) # 5. Verify CDP endpoint is responsive await self._verify_browser_running() # 6. Configure proxy authentication (via Fetch domain) await self._configure_proxy(proxy_config[0], proxy_config[1]) # 7. 
Get first valid target and create Tab valid_tab_id = await self._get_valid_tab_id(await self.get_targets()) tab = Tab(self, target_id=valid_tab_id, connection_port=self._connection_port) self._tabs_opened[valid_tab_id] = tab return tab ``` !!! tip "Why start() Returns a Tab" This is a **design compromise** for ergonomics. Ideally, `start()` would only launch the browser, and users would call `new_tab()` separately. However, returning the initial tab reduces boilerplate for the 90% use case (single-tab automation). The tradeoff: the initial tab cannot be avoided even in multi-tab scenarios. ### Context Manager Protocol The Browser implements `__aenter__` and `__aexit__` for automatic cleanup: ```python async def __aexit__(self, exc_type, exc_val, exc_tb): # 1. Restore backup preferences (if modified) if self._backup_preferences_dir: shutil.copy2(self._backup_preferences_dir, ...) # 2. Check if browser is still running if await self._is_browser_running(timeout=2): await self.stop() # 3. Close WebSocket connection await self._connection_handler.close() ``` This ensures proper cleanup even if exceptions occur during automation. ## Browser Context Architecture Browser contexts are Pydoll's most sophisticated isolation mechanism, providing **complete browsing environment separation** within a single browser process. Understanding their architecture is essential for advanced automation. ### CDP Hierarchy: Browser, Context, Target CDP organizes browser structure into three levels: ```mermaid graph TB Browser[Browser Process] Browser --> DefaultContext[Default BrowserContext] Browser --> Context1[BrowserContext ID: abc-123] Browser --> Context2[BrowserContext ID: def-456] DefaultContext --> Target1[Target/Page ID: page-1] DefaultContext --> Target2[Target/Page ID: page-2] Context1 --> Target3[Target/Page ID: page-3] Context2 --> Target4[Target/Page ID: page-4] Context2 --> Target5[Target/Page ID: page-5] ``` **Key concepts:** 1. 
**Browser Process**: Single Chromium instance with one CDP endpoint 2. **BrowserContext**: Isolated storage/cache/permission boundary (similar to incognito mode) 3. **Target**: Individual page, popup, worker, or background target ### Context Isolation Boundaries Each browser context maintains **strict isolation** for: | Resource | Isolation Level | Implementation | |----------|----------------|----------------| | Cookies | Full | Separate cookie jar per context | | localStorage | Full | Separate storage per origin per context | | IndexedDB | Full | Separate database per origin per context | | Cache | Full | Independent HTTP cache per context | | Permissions | Full | Context-specific permission grants | | Network proxy | Full | Per-context proxy configuration | | Authentication | Full | Independent auth state per context | !!! info "Why Contexts Are Lightweight" Unlike launching multiple browser processes, contexts share the **rendering engine, GPU process, and network stack**. Only storage and state are isolated. This makes contexts 10-100x faster to create than new browser instances. ### Context Creation and Target Binding Creating a context and target involves two CDP commands: ```python # Step 1: Create isolated browsing context response = await self._execute_command( TargetCommands.create_browser_context( proxy_server='http://proxy.example.com:8080', proxy_bypass_list='localhost,127.0.0.1' ) ) context_id = response['result']['browserContextId'] # Step 2: Create target (page) within that context response = await self._execute_command( TargetCommands.create_target( browser_context_id=context_id # Binds target to context ) ) target_id = response['result']['targetId'] ``` **Critical detail:** The `browser_context_id` parameter **binds the target to the context's isolation boundary**. Without it, the target is created in the default context. 
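To make the two-step binding concrete, the sketch below builds the raw CDP payloads that the two commands above would plausibly serialize to. The method and parameter names (`Target.createBrowserContext`, `Target.createTarget`, `proxyServer`, `proxyBypassList`, `browserContextId`) come from the CDP Target domain; the helper functions are hypothetical and are not Pydoll's `TargetCommands` API, and the message `id` field is omitted for brevity.

```python
from typing import Any, Optional


def create_browser_context_payload(
    proxy_server: Optional[str] = None,
    proxy_bypass_list: Optional[str] = None,
) -> dict[str, Any]:
    """Build a Target.createBrowserContext message (hypothetical helper)."""
    params: dict[str, Any] = {}
    if proxy_server:
        params['proxyServer'] = proxy_server
    if proxy_bypass_list:
        params['proxyBypassList'] = proxy_bypass_list
    return {'method': 'Target.createBrowserContext', 'params': params}


def create_target_payload(
    url: str = 'about:blank',
    browser_context_id: Optional[str] = None,
) -> dict[str, Any]:
    """Build a Target.createTarget message (hypothetical helper)."""
    params: dict[str, Any] = {'url': url}
    # Only an explicit context ID binds the target to an isolated context;
    # when the key is absent, the browser uses the default context.
    if browser_context_id:
        params['browserContextId'] = browser_context_id
    return {'method': 'Target.createTarget', 'params': params}


isolated = create_target_payload(browser_context_id='abc-123')
default = create_target_payload()
```

Here `isolated` carries `browserContextId` and `default` does not, which is the only difference between a target in an isolated context and one in the default context.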
### Window Materialization in Headed Mode In **headed mode** (visible UI), browser contexts have an important physical constraint: - A context initially exists only **in memory** (no window) - The **first target** created in a context **must** open a top-level window - **Subsequent targets** can open as tabs within that window This is a **CDP/Chromium limitation**, not a Pydoll design choice: ```python # First target in context: MUST create window tab1 = await browser.new_tab(browser_context_id=context_id) # Opens new window # Subsequent targets: CAN open as tabs in existing window tab2 = await browser.new_tab(browser_context_id=context_id) # Opens as tab ``` **Why does this matter?** - In **headless mode**: Completely irrelevant (no windows rendered) - In **headed mode**: First target per context will open a visible window - In **test environments**: Multiple contexts → multiple windows (can be confusing) !!! tip "Headless Contexts Are Cleaner" For CI/CD, scraping, or batch automation, use headless mode. Context isolation works identically, but without window materialization overhead. ### Context Deletion and Cleanup Deleting a context **immediately closes all targets** within it: ```python await browser.delete_browser_context(context_id) # All tabs in this context are now closed # All storage for this context is cleared # Context cannot be reused (ID is invalid) ``` **Cleanup sequence:** 1. CDP sends `Target.disposeBrowserContext` command 2. Browser closes all targets in that context 3. Browser clears all storage for that context 4. Browser invalidates the context ID 5. Pydoll removes context from internal registries ## Event System at Browser Level The Browser domain supports **browser-wide event listeners** that operate across all tabs and contexts. This is distinct from tab-level events. 
### Browser vs Tab Event Scope ```python # Browser-level event: applies to ALL tabs await browser.on('Target.targetCreated', handle_new_target) # Tab-level event: applies to ONE tab await tab.on('Page.loadEventFired', handle_page_load) ``` **Architectural difference:** - **Browser events** use the **browser-level WebSocket connection** (port-based or `ws://host/devtools/browser/...`) - **Tab events** use **tab-level WebSocket connections** (`ws://host/devtools/page/`) ### Fetch Domain: Global Request Interception The Fetch domain can be enabled at **both** browser and tab levels, with different scopes: ```python # Browser-level Fetch: intercepts requests for ALL tabs await browser.enable_fetch_events(handle_auth_requests=True) await browser.on('Fetch.requestPaused', handle_request) # Tab-level Fetch: intercepts requests for ONE tab await tab.enable_fetch_events(handle_auth_requests=True) await tab.on('Fetch.requestPaused', handle_request) ``` **When to use each:** | Use Case | Level | Reason | |----------|-------|--------| | Proxy authentication | Browser | Applies globally to all contexts | | Ad blocking | Browser | Block ads across all tabs | | API mocking | Tab | Mock specific API for specific test | | Request logging | Tab | Log only relevant tab's requests | !!! warning "Fetch Performance Impact" Enabling Fetch at the browser level **pauses all requests** across all tabs until callbacks execute. This adds latency to every request. Use tab-level Fetch when possible to minimize impact. 
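The scope difference between browser-level and tab-level listeners follows from each connection keeping its own callback registry. The toy model below illustrates that idea, including one-shot (`temporary=True`) callbacks like those used for proxy authentication. `EventRegistry` is a hypothetical class written for this illustration; it is not Pydoll's connection handler, and real events arrive over a WebSocket rather than through direct `dispatch()` calls.

```python
import asyncio
from collections import defaultdict
from typing import Any, Awaitable, Callable

Callback = Callable[[dict[str, Any]], Awaitable[None]]


class EventRegistry:
    """Per-connection callback registry (illustrative only)."""

    def __init__(self) -> None:
        self._callbacks: dict[str, list[tuple[Callback, bool]]] = defaultdict(list)

    def on(self, event_name: str, callback: Callback, temporary: bool = False) -> None:
        self._callbacks[event_name].append((callback, temporary))

    async def dispatch(self, event: dict[str, Any]) -> None:
        name = event['method']
        # Iterate over a copy so temporary entries can be removed safely.
        for entry in list(self._callbacks[name]):
            callback, temporary = entry
            if temporary:
                self._callbacks[name].remove(entry)
            await callback(event)


async def demo() -> list[str]:
    seen: list[str] = []

    async def record(event: dict[str, Any]) -> None:
        seen.append(event['method'])

    browser_events = EventRegistry()  # stands in for the browser-level connection
    tab_events = EventRegistry()      # stands in for one tab-level connection
    browser_events.on('Target.targetCreated', record)
    tab_events.on('Page.loadEventFired', record, temporary=True)

    await browser_events.dispatch({'method': 'Target.targetCreated', 'params': {}})
    await tab_events.dispatch({'method': 'Page.loadEventFired', 'params': {}})
    # The temporary callback has already been removed, so this is ignored.
    await tab_events.dispatch({'method': 'Page.loadEventFired', 'params': {}})
    return seen
```

Because the two registries are independent, a callback registered on the tab never sees browser-level events, and the reverse also holds.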
### Command Routing All CDP commands flow through the Browser's connection handler: ```python async def _execute_command(self, command, timeout=10): """ Routes command to appropriate connection: - Browser-level commands → browser WebSocket - Tab-level commands → delegated to Tab instance """ return await self._connection_handler.execute_command(command, timeout) ``` This centralized routing enables: - **Request/response correlation**: Match responses to requests via ID - **Timeout management**: Cancel commands that exceed timeout - **Error handling**: Convert CDP errors to Python exceptions ## Resource Management ### Cookie and Storage Operations The Browser domain exposes **browser-wide** and **context-specific** storage operations: ```python # Browser-level operations (all contexts) await browser.set_cookies(cookies) await browser.get_cookies() await browser.delete_all_cookies() # Context-specific operations await browser.set_cookies(cookies, browser_context_id=context_id) await browser.get_cookies(browser_context_id=context_id) await browser.delete_all_cookies(browser_context_id=context_id) ``` These operations use the **Storage domain** under the hood: - `Storage.getCookies`: Retrieve cookies for context or all contexts - `Storage.setCookies`: Set cookies with domain/path/expiry - `Storage.clearCookies`: Clear cookies for context or all contexts !!! info "Browser vs Tab Storage Scope" - **Browser-level**: Operates on entire browser or specific context - **Tab-level**: Scoped to tab's current origin Use browser-level for global cookie management (e.g., setting session cookies for all domains). Use tab-level for origin-specific operations (e.g., clearing cookies after logout). 
### Permission Grants The Browser domain provides **programmatic permission control**, bypassing browser prompts: ```python await browser.grant_permissions( [PermissionType.GEOLOCATION, PermissionType.NOTIFICATIONS], origin='https://example.com', browser_context_id=context_id ) ``` **Architecture:** - Permissions are granted via the `Browser.grantPermissions` CDP command - Permissions are **context-specific** (isolated per context) - Grants override default prompt behavior - `reset_permissions()` reverts to default behavior ### Download Management Download behavior is configured via the `Browser.setDownloadBehavior` command: ```python await browser.set_download_behavior( behavior=DownloadBehavior.ALLOW, download_path='/path/to/downloads', events_enabled=True, # Emit download progress events browser_context_id=context_id ) ``` **Options:** - `ALLOW`: Save to specified path - `DENY`: Cancel all downloads - `DEFAULT`: Show browser's default download UI ### Window Management Window operations apply to the **physical OS window** of a target: ```python window_id = await browser.get_window_id_for_target(target_id) await browser.set_window_bounds({ 'left': 100, 'top': 100, 'width': 1920, 'height': 1080, 'windowState': 'normal' # or 'minimized', 'maximized', 'fullscreen' }) ``` **Implementation details:** - Uses `Browser.getWindowForTarget` to resolve window ID from target ID - `Browser.setWindowBounds` modifies window geometry - **Headless mode**: Window operations are no-ops (no physical windows exist) ## Architectural Insights and Design Tradeoffs ### Singleton Tab Registry: Why? The tab registry pattern (`_tabs_opened: dict[str, Tab]`) ensures that: 1. **Event routing works correctly**: CDP events contain a `targetId` but no Tab reference. The registry maps `targetId` → `Tab` for correct callback dispatch. 2. **State consistency**: Multiple code paths that reference the same target get the **same Tab instance**, preventing state divergence. 3. 
**Memory efficiency**: Without the registry, `get_opened_tabs()` would create duplicate Tab instances for every call. **Tradeoff:** Memory usage grows with tab count, but this is unavoidable for stateful Tab instances. ### Why start() Returns a Tab This design decision sacrifices purity for **ergonomics**: - **Downside**: Initial tab cannot be avoided, even in multi-tab automation - **Upside**: 90% of users (single-tab scripts) don't need boilerplate: ```python # With start() returning Tab tab = await browser.start() # Without (pure design) await browser.start() tab = await browser.new_tab() ``` **Alternative explored:** Auto-close initial tab in `new_tab()`. Rejected because it's surprising behavior (implicit side effects). ### Proxy Authentication: Two-Level Architecture Tradeoff Pydoll's proxy authentication uses two different Fetch domain strategies: **Browser-Level (Global Proxy):** - **Security benefit**: Credentials never logged in CDP traces - **Performance cost**: Fetch pauses **all requests across all tabs** until auth completes - **Efficiency**: Single listener for all tabs in default context - **Mitigation**: Fetch is disabled after first auth, minimizing overhead **Tab-Level (Per-Context Proxy):** - **Security benefit**: Credentials never logged in CDP traces - **Performance cost**: Fetch pauses **all requests in that tab** until auth completes - **Efficiency**: Separate listener per tab (less efficient, but necessary for isolation) - **Isolation benefit**: Each context can have different proxy credentials - **Mitigation**: Fetch is disabled after first auth per tab **Why not use Browser.setProxyAuth?** This CDP command doesn't exist. Fetch is the only mechanism for programmatic auth. **Why tab-level for contexts?** CDP's Fetch domain cannot be scoped to a specific BrowserContext. Since each context can have a different proxy with different credentials, Pydoll must handle auth at the tab level to respect context boundaries. 
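Both mechanisms rely on credentials being separated from the proxy URL before the URL reaches the browser. The source names a `_sanitize_proxy_and_extract_auth` step but does not show its body, so the function below is only a hypothetical stand-in built on the standard library. It assumes the proxy string includes a scheme (for example `http://`) and that credentials may be percent-encoded.

```python
from typing import Optional
from urllib.parse import unquote, urlsplit


def split_proxy_credentials(
    proxy_server: str,
) -> tuple[str, Optional[tuple[str, str]]]:
    """Return (proxy URL without credentials, (username, password) or None).

    Hypothetical helper for illustration; not Pydoll's implementation.
    """
    parts = urlsplit(proxy_server)
    if parts.username is None:
        # No embedded credentials: pass the value through unchanged.
        return proxy_server, None

    host = parts.hostname or ''
    if ':' in host:
        # IPv6 literals need brackets when rebuilt into a URL.
        host = f'[{host}]'
    netloc = host if parts.port is None else f'{host}:{parts.port}'
    sanitized = f'{parts.scheme}://{netloc}'
    credentials = (unquote(parts.username), unquote(parts.password or ''))
    return sanitized, credentials
```

Only the sanitized URL would then be passed to the browser, while the credentials stay in Python until an auth challenge arrives.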
### Port Randomization Strategy Random CDP ports (9223-9322) prevent collisions when running parallel browser instances: ```python self._connection_port = connection_port or randint(9223, 9322) ``` **Why not increment from 9222?** - Race conditions in multi-process environments (e.g., pytest-xdist) - Collision with user's manual port selection **Tradeoff:** Random ports are harder to debug (can't hardcode). Solution: `browser._connection_port` exposes the chosen port. ### Component Separation: Why Managers? The Browser class delegates to specialized managers (ProcessManager, ProxyManager, TempDirManager, ConnectionHandler) for: 1. **Testability**: Managers can be mocked independently 2. **Reusability**: ProxyManager logic shared across Browser implementations 3. **Maintainability**: Each manager has single responsibility 4. **Cross-platform**: OS-specific logic isolated in ProcessManager **Tradeoff:** More indirection, but significantly better code organization at scale. ## Key Takeaways 1. **Browser is a coordinator**, not a monolith. It orchestrates managers and handles CDP communication. 2. **Tab registry ensures singleton instances** per target, critical for event routing and state consistency. 3. **Browser contexts are lightweight isolation**, sharing browser process but separating storage/cache/auth. 4. **Proxy auth via Fetch** is a security tradeoff - hides credentials but adds latency. 5. **Event system has two levels**: Browser-wide and tab-specific, with different WebSocket connections. 6. **Component separation** (managers) improves testability and cross-platform support. 
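As an illustration of the port randomization strategy described earlier in this section, the sketch below picks a random port in the same 9223-9322 range. The availability check via `socket.bind` is an addition for this example and is not described in the source; `pick_cdp_port` and `port_is_free` are hypothetical names.

```python
import socket
from random import randint
from typing import Optional


def port_is_free(port: int, host: str = '127.0.0.1') -> bool:
    """Return True if the port can currently be bound on the given host."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def pick_cdp_port(
    requested: Optional[int] = None,
    low: int = 9223,
    high: int = 9322,
    attempts: int = 20,
) -> int:
    """Honor an explicit port, otherwise try random ports in [low, high]."""
    if requested is not None:
        return requested
    for _ in range(attempts):
        candidate = randint(low, high)
        if port_is_free(candidate):
            return candidate
    raise RuntimeError(f'no free port found in {low}-{high}')
```

A bind check narrows the collision window but cannot close it, since another process may take the port between the check and the browser launch.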
## Related Documentation For deeper understanding of related architectural components: - **[Connection Layer](./connection-layer.md)**: WebSocket communication, command/response flow, async patterns - **[Event Architecture](./event-architecture.md)**: Event dispatch, callback management, domain enabling - **[Tab Domain](./tab-domain.md)**: Tab-level operations, page navigation, element finding - **[CDP Deep Dive](./cdp.md)**: Chrome DevTools Protocol fundamentals - **[Proxy Architecture](./proxy-architecture.md)**: Network-level proxy concepts and implementation For practical usage patterns: - **[Tab Management](../features/browser-management/tabs.md)**: Multi-tab automation patterns - **[Browser Contexts](../features/browser-management/contexts.md)**: Context isolation in practice - **[Proxy Configuration](../features/configuration/proxy.md)**: Setting up proxies and authentication ================================================ FILE: docs/en/deep-dive/architecture/browser-requests-architecture.md ================================================ # Browser-Context Requests Architecture This document explores the architectural design of Pydoll's browser-context HTTP request system, which enables making HTTP requests that seamlessly inherit the browser's session state, cookies, and authentication. !!! info "Practical Guide Available" This is the architectural deep dive. For practical examples and use cases, see [HTTP Requests Guide](../features/network/http-requests.md). ## Architectural Overview Browser-context requests solve a fundamental problem in hybrid automation: maintaining session continuity between UI interactions and API calls. Traditional approaches require manually extracting cookies and headers, creating fragile coupling between browser and HTTP client. 
Pydoll's architecture eliminates this complexity by executing HTTP requests **inside** the browser's JavaScript context, while leveraging CDP network events to capture comprehensive metadata that JavaScript alone cannot provide. ### Why This Architecture? | Traditional Approach | Pydoll Architecture | |---------------------|---------------------| | Separate HTTP client (requests, aiohttp) | Unified browser-based execution | | Manual cookie extraction and sync | Automatic cookie inheritance | | Two separate session states | Single session state | | Limited CORS handling | Browser-native CORS enforcement | | Complex authentication flows | Transparent auth preservation | ## Component Architecture The browser-context request system consists of two primary classes that work together with Pydoll's event system: ```mermaid classDiagram class Tab { +request: Request +enable_network_events() +disable_network_events() +get_network_response_body() +on(event_name, callback) +clear_callbacks() } class Request { -tab: Tab -_network_events_enabled: bool -_requests_sent: list -_requests_received: list +get(url, params, kwargs) +post(url, data, json, kwargs) +put(url, data, json, kwargs) +patch(url, data, json, kwargs) +delete(url, kwargs) +head(url, kwargs) +options(url, kwargs) -_execute_fetch_request() -_register_callbacks() -_extract_headers() -_extract_cookies() } class Response { -_status_code: int -_content: bytes -_text: str -_json: dict -_response_headers: list -_request_headers: list -_cookies: list -_url: str +ok: bool +status_code: int +text: str +content: bytes +url: str +headers: list +request_headers: list +cookies: list +json() +raise_for_status() } Tab *-- Request Request ..> Response : creates Request ..> Tab : uses events ``` ### Request Class The `Request` class serves as the interface layer, providing a familiar `requests`-like API while orchestrating the complex interaction between JavaScript execution and network event monitoring. 
**Key Responsibilities:** - Translate Python method calls to Fetch API JavaScript - Manage temporary network event listeners - Accumulate network events during request execution - Extract metadata from CDP events - Construct Response objects with complete information ### Response Class The `Response` class provides a `requests.Response`-compatible interface, making migration from traditional HTTP clients seamless. **Key Features:** - Multiple content accessors (text, bytes, JSON) - Lazy JSON parsing with caching - Comprehensive header information (both sent and received) - Cookie extraction from Set-Cookie headers - Final URL after redirects ## Execution Flow The request execution follows a six-phase pipeline: ```mermaid flowchart TD Start([tab.request.get#40;url#41;]) --> Phase1[1. Preparation<br/>Build URL + options] Phase1 --> Phase2[2. Event Registration<br/>Enable network events<br/>Register callbacks] Phase2 --> Phase3[3. JavaScript Execution<br/>Runtime.evaluate(fetch)] Phase3 --> Phase4{4. Network Activity} Phase4 -->|Request sent| Event1[REQUEST_WILL_BE_SENT] Phase4 -->|Response received| Event2[RESPONSE_RECEIVED] Phase4 -->|Extra info| Event3[*_EXTRA_INFO events] Event1 --> Collect[Collect metadata] Event2 --> Collect Event3 --> Collect Collect --> Phase5[5. Construction<br/>Extract headers/cookies<br/>Build Response object] Phase5 --> Phase6[6. Cleanup<br/>Clear callbacks<br/>Disable events] Phase6 --> End([Return Response]) ``` ### Phase Details | Phase | Layer | Key Operations | Asynchronous | |-------|-------|----------------|--------------| | **1. Preparation** | Request | URL building, options formatting | No | | **2. Event Registration** | Tab | Enable events, register callbacks | Yes | | **3. JavaScript Execution** | CDP/Browser | Execute fetch() in browser context | Yes | | **4. Network Activity** | Browser/CDP | HTTP request, emit CDP events | Yes (parallel) | | **5. Construction** | Request | Parse events, build Response | No | | **6. Cleanup** | Tab | Remove callbacks, disable events | Yes | ## Event System Integration Browser-context requests are tightly integrated with Pydoll's event system architecture. Understanding this relationship is crucial. ### Temporary Event Lifecycle ```mermaid stateDiagram-v2 [*] --> NoEvents: Request starts NoEvents --> EventsEnabled: Enable network events EventsEnabled --> CallbacksRegistered: Register callbacks CallbacksRegistered --> ExecutingRequest: Execute fetch ExecutingRequest --> CapturingEvents: Events fire CapturingEvents --> ExecutingRequest: More events ExecutingRequest --> CleaningUp: Fetch completes CleaningUp --> CallbacksRemoved: Clear callbacks CallbacksRemoved --> EventsDisabled: Disable if needed EventsDisabled --> [*]: Request complete ``` ### Why Both JavaScript and Events? A common question: if JavaScript can execute the request, why use network events?
| Information Source | JavaScript (Fetch API) | Network Events (CDP) | |-------------------|------------------------|----------------------| | Response status | Available | Available | | Response body | Available | Not available | | Response headers | Partial (CORS restricted) | Complete | | Request headers | Not accessible | Complete | | Set-Cookie headers | Hidden by browser | Available | | Timing information | Limited | Comprehensive | | Redirect chain | Only final URL | Full chain | **The Solution:** Combine both sources for complete information. !!! tip "Complementary Technologies" JavaScript provides the response body and triggers the request in the browser's context (with cookies, auth). Network events provide the metadata that JavaScript security policies hide. ### CDP Network Event Types The architecture uses four CDP event types to capture complete metadata: | Event | Purpose | Key Information | |-------|---------|----------------| | `REQUEST_WILL_BE_SENT` | Main outgoing request | URL, method, standard headers | | `REQUEST_WILL_BE_SENT_EXTRA_INFO` | Additional request metadata | Associated cookies, raw headers | | `RESPONSE_RECEIVED` | Main response received | Status, headers, MIME type, timing | | `RESPONSE_RECEIVED_EXTRA_INFO` | Additional response metadata | Set-Cookie headers, security info | !!! info "Event Multiplicity" A single HTTP request generates multiple CDP events. The Request class accumulates all related events and extracts non-duplicate information during the construction phase. ## Header and Cookie Architecture ### Header Extraction Strategy Headers exist in multiple CDP events with potential duplication. 
The architecture uses a deduplication strategy: ```mermaid flowchart TD A[Network Events] --> B{Event Type} B -->|REQUEST events| C[Extract Sent Headers] B -->|RESPONSE events| D[Extract Received Headers] C --> E[Deduplicate by name+value] D --> F[Deduplicate by name+value] E --> G[Request Headers List] F --> H[Response Headers List] G --> I[Response Object] H --> I ``` **Deduplication Logic:** 1. Events are processed in order 2. Each header is identified by `(name, value)` tuple 3. Only first occurrence of each tuple is kept 4. Result: unique, non-redundant header list ### Cookie Parsing Architecture Cookies require special handling because they come from `Set-Cookie` headers in `RESPONSE_RECEIVED_EXTRA_INFO` events: ```mermaid flowchart TD A[RESPONSE_RECEIVED_EXTRA_INFO] --> B[Extract Set-Cookie headers] B --> C{Multi-line header?} C -->|Yes| D[Split by newline] C -->|No| E[Parse single cookie] D --> F[Parse each line] F --> G[Extract name=value] E --> G G --> H{Valid name?} H -->|Yes| I[Create CookieParam] H -->|No| J[Discard] I --> K[Add to cookie list] K --> L[Deduplicate] L --> M[Response Object] ``` **Cookie Extraction Principles:** - Only `EXTRA_INFO` events contain `Set-Cookie` headers - Cookie attributes (Path, Domain, Secure, HttpOnly) are ignored - Browser manages cookie attributes internally - Only name-value pairs are extracted for informational purposes !!! warning "Cookie Scope" The `Response.cookies` property contains only **new or updated** cookies from this specific response. Existing browser cookies are managed automatically and not exposed through this interface. 
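The deduplication and cookie-parsing rules above can be sketched in a few lines. This is a minimal illustration under stated assumptions: events are modelled as plain dicts with a `headers` mapping (a simplification of the CDP event shape), multiple `Set-Cookie` values are assumed to arrive joined by newlines as the flowchart describes, and `dedupe_headers` and `parse_set_cookie` are hypothetical names rather than the `Request` class's private methods.

```python
from typing import Any


def dedupe_headers(events: list[dict[str, Any]]) -> list[dict[str, str]]:
    """Keep the first occurrence of each (name, value) pair, in event order."""
    seen: set[tuple[str, str]] = set()
    unique: list[dict[str, str]] = []
    for event in events:
        for name, value in event.get('headers', {}).items():
            key = (name, value)
            if key not in seen:
                seen.add(key)
                unique.append({'name': name, 'value': value})
    return unique


def parse_set_cookie(header_value: str) -> list[dict[str, str]]:
    """Extract name/value pairs from a possibly multi-line Set-Cookie value."""
    cookies: list[dict[str, str]] = []
    for line in header_value.split('\n'):
        # Attributes such as Path, Secure, and HttpOnly follow the first ';'.
        first_pair = line.split(';', 1)[0]
        name, separator, value = first_pair.partition('=')
        name = name.strip()
        if not separator or not name:
            continue  # discard entries without a valid name
        cookie = {'name': name, 'value': value.strip()}
        if cookie not in cookies:
            cookies.append(cookie)
    return cookies
```

Note that two headers with the same name but different values both survive, because the deduplication key is the full `(name, value)` tuple.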
## JavaScript Execution Context The Fetch API execution happens in the browser's JavaScript context, which is key to the architecture's power: ### Fetch API Integration The request is translated to JavaScript: ```javascript // Simplified representation (async () => { const response = await fetch(url, { method: 'GET', headers: {'X-Custom': 'value'}, // Browser automatically adds: // - Cookie header // - Authorization if set // - Standard headers (User-Agent, Accept, etc.) }); return { status: response.status, url: response.url, // Final URL after redirects text: await response.text(), content: new Uint8Array(await response.arrayBuffer()), json: response.headers.get('Content-Type')?.includes('application/json') ? await response.clone().json() : null }; })() ``` ### Browser Context Benefits Executing in the browser context provides: | Benefit | Description | |---------|-------------| | **Automatic Cookie Inclusion** | Browser sends all applicable cookies automatically | | **Auth State Preservation** | Authentication headers maintained from browser session | | **CORS Enforcement** | Browser applies same CORS policies as user interactions | | **TLS/SSL Handling** | Browser's certificate validation and security policies apply | | **Compression** | Automatic handling of gzip, br, deflate | | **Redirects** | Browser follows redirects transparently | | **Same Security Context** | Request appears identical to user-initiated requests | !!! info "Anti-Bot Detection" Requests executed in the browser context are indistinguishable from user-initiated requests, making them effective against anti-bot systems that analyze request patterns. 
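The translation step described under Fetch API Integration, turning Python arguments into a JavaScript expression, might look roughly like the sketch below. `build_fetch_script` is a hypothetical helper, not the `Request` class's actual code; it relies on `json.dumps` to emit string and object literals so that quotes in URLs or header values cannot break out of the generated script.

```python
import json
from typing import Optional


def build_fetch_script(
    url: str,
    method: str = 'GET',
    headers: Optional[dict[str, str]] = None,
    body: Optional[str] = None,
) -> str:
    """Return a JavaScript async IIFE that performs the request via fetch()."""
    options: dict[str, object] = {
        'method': method.upper(),
        'headers': headers or {},
    }
    if body is not None:
        options['body'] = body
    return (
        '(async () => {\n'
        f'  const response = await fetch({json.dumps(url)}, {json.dumps(options)});\n'
        '  return {\n'
        '    status: response.status,\n'
        '    url: response.url,\n'
        '    text: await response.text()\n'
        '  };\n'
        '})()'
    )


script = build_fetch_script(
    'https://example.com/api',
    method='post',
    headers={'X-Custom': 'value'},
    body='{"a": 1}',
)
```

A string like `script` would then be handed to the browser for evaluation (phase 3 of the execution flow); cookies and other session headers are added by the browser itself, not by this code.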
## Performance Considerations ### Event Overhead Network events add overhead to request execution: | Scenario | Overhead | Recommendation | |----------|----------|----------------| | Single request | Low | Acceptable | | Multiple sequential requests | Moderate | Enable events once | | Bulk requests (100+) | High | Consider enabling events at tab level | | Long-running automation | Memory concern | Disable when done | ### Optimization Pattern ```python # Inefficient - events enabled/disabled repeatedly for url in urls: response = await tab.request.get(url) # Efficient - events enabled once await tab.enable_network_events() for url in urls: response = await tab.request.get(url) await tab.disable_network_events() ``` !!! tip "Automatic Optimization" The Request class checks if network events are already enabled and skips redundant enable/disable operations automatically. ### JSON Parsing Strategy Response JSON parsing uses lazy evaluation with caching: 1. First call to `response.json()`: Parse and cache 2. Subsequent calls: Return cached result 3. If JSON pre-parsed during construction: Use that This prevents redundant parsing overhead. 
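
The lazy-parse-and-cache strategy described above can be sketched as follows. This is an illustrative sketch, not Pydoll's `Response` class: `CachedJsonBody`, its `pre_parsed` argument, and the `parse_count` counter are hypothetical names introduced only to make the caching behavior observable.

```python
import json
from typing import Any

# Sentinel distinguishing "not parsed yet" from a legitimately parsed `None`.
_UNSET = object()


class CachedJsonBody:
    """Hypothetical response body that parses JSON at most once."""

    def __init__(self, text: str, pre_parsed: Any = _UNSET):
        self._text = text
        self._json = pre_parsed  # Reuse JSON parsed during construction, if any
        self.parse_count = 0     # Demo-only counter of real parse operations

    def json(self) -> Any:
        if self._json is _UNSET:
            # First call: parse and cache
            self.parse_count += 1
            self._json = json.loads(self._text)
        # Subsequent calls: return the cached result
        return self._json
```

A sentinel is used instead of `None` because `null` is a valid JSON document; caching on `None` would re-parse such bodies on every call.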
## Security Architecture

### CORS Policy Enforcement

Browser-context requests respect CORS policies:

```mermaid
flowchart TD
    A["tab.request.get(url)"] --> B{Same Origin?}
    B -->|Yes| C[Request Allowed]
    B -->|No| D{CORS Headers Present?}
    D -->|Yes| E[Request Allowed]
    D -->|No| F[Request Blocked]

    C --> G[Response Returned]
    E --> G
    F --> H[CORS Error]
```

**CORS Behavior:**

- Requests to same origin: Always allowed
- Cross-origin requests: Require CORS headers from server
- Opaque responses: May be blocked by browser

**Workaround for CORS Issues:**

Navigate to the domain first to establish same-origin context:

```python
await tab.go_to('https://different-domain.com')
response = await tab.request.get('https://different-domain.com/api')
```

### Cookie Security

Cookies with security flags (`HttpOnly`, `Secure`, `SameSite`) are handled by the browser:

- **HttpOnly cookies**: Sent automatically but not readable from page JavaScript (`document.cookie`)
- **Secure cookies**: Only sent over HTTPS
- **SameSite cookies**: Browser enforces SameSite policies

The `Response.cookies` property may not show all cookies due to these security restrictions.

### TLS/SSL Validation

The browser validates SSL certificates. Self-signed or invalid certificates cause requests to fail unless certificate errors are explicitly ignored:

```python
from pydoll.browser.chromium import Chrome
from pydoll.browser.options import ChromiumOptions

options = ChromiumOptions()
options.add_argument('--ignore-certificate-errors')
browser = Chrome(options=options)
```

!!! warning "Security Trade-off"
    Disabling certificate validation reduces security. Only use in controlled environments.

## Limitations and Design Decisions

### Request Body Size

Very large request bodies (files, large datasets) are limited by JavaScript memory constraints. For file uploads, use `WebElement.set_input_files()` or the file chooser interceptor instead.

### Binary Response Handling

Binary responses are converted through JavaScript's `ArrayBuffer` and `Uint8Array`, which adds some overhead for very large responses (>100MB).

### Redirect Transparency

The Fetch API follows redirects automatically.
Only the final URL is captured. If you need the redirect chain, use network monitoring separately. ### Event Timing Events must be registered **before** executing the fetch. The architecture ensures this through the registration phase, but manual event handling requires careful timing. ## Architectural Principles The browser-context request architecture adheres to these principles: 1. **Session Continuity**: Never break the browser's session state 2. **Zero Manual Sync**: No cookie/header extraction required 3. **Complete Information**: Combine JavaScript + events for full metadata 4. **Automatic Cleanup**: Resources freed after each request 5. **Familiar Interface**: `requests`-compatible API for easy adoption 6. **Performance Conscious**: Optimize for common use cases 7. **Security Aware**: Respect browser security policies ## Integration with Other Systems ### Event System Dependency Browser-context requests depend on the event system architecture: - Leverages `Tab.on()` for callback registration - Uses `Tab.clear_callbacks()` for cleanup - Respects existing network event enablement - Integrates with event lifecycle management See [Event System Architecture](event-architecture.md) for details. ### Type System Integration The architecture uses Python's type system extensively: - `HeaderEntry` TypedDict for headers - `CookieParam` TypedDict for cookies - Event type definitions from `pydoll.protocol.network.events` - Provides IDE autocomplete and type safety See [Typing System](typing-system.md) for details. 
## Further Reading - **[HTTP Requests Guide](../features/network/http-requests.md)** - Practical examples and use cases - **[Event System Architecture](event-architecture.md)** - Event system internal design - **[Network Monitoring](../features/network/monitoring.md)** - Passive network observation - **[Request Interception](../features/network/interception.md)** - Active request modification - **[Typing System](typing-system.md)** - Type system integration ## Summary Pydoll's browser-context request architecture achieves seamless HTTP communication by combining JavaScript Fetch API execution with CDP network event monitoring. This hybrid approach provides: - **Complete metadata** from both JavaScript and CDP events - **Automatic session continuity** through browser context execution - **Familiar interface** compatible with the requests library - **Performance optimization** through event reuse - **Security compliance** with browser policies The architecture demonstrates how combining complementary technologies (JavaScript + CDP events) can solve complex problems elegantly, providing power and convenience without compromising on completeness or security. ================================================ FILE: docs/en/deep-dive/architecture/event-architecture.md ================================================ # Event System Architecture This document explores the internal architecture of Pydoll's event system, covering WebSocket communication, event flow, callback management, and performance considerations. !!! info "Practical Usage Guide" For practical examples and usage patterns, see the [Event System Guide](../features/advanced/event-system.md). ## WebSocket Communication and CDP At the core of Pydoll's event system is the Chrome DevTools Protocol (CDP), which provides a structured way to interact with and monitor browser activities over WebSocket connections. 
This bidirectional communication channel allows your code to both send commands to the browser and receive events back. ```mermaid sequenceDiagram participant Client as Pydoll Code participant Connection as ConnectionHandler participant WebSocket participant Browser Client->>Connection: Register callback for event Connection->>Connection: Store callback in registry Client->>Connection: Enable event domain Connection->>WebSocket: Send CDP command to enable domain WebSocket->>Browser: Forward command Browser-->>WebSocket: Acknowledge domain enabled WebSocket-->>Connection: Forward response Connection-->>Client: Domain enabled Browser->>WebSocket: Event occurs, sends CDP event message WebSocket->>Connection: Forward event message Connection->>Connection: Look up callbacks for this event Connection->>Client: Execute registered callback ``` ### WebSocket Communication Model The WebSocket connection between Pydoll and the browser follows this pattern: 1. **Connection Establishment**: When the browser starts, a WebSocket server is created, and Pydoll establishes a connection to it 2. **Bidirectional Messaging**: Both Pydoll and the browser can send messages at any time 3. 
**Message Types**: - **Commands**: Sent from Pydoll to the browser (e.g., navigation, DOM manipulation) - **Command Responses**: Sent from the browser to Pydoll in response to commands - **Events**: Sent from the browser to Pydoll when something happens (e.g., page load, network activity) ### Chrome DevTools Protocol Structure CDP organizes its functionality into domains, each responsible for a specific area of browser functionality: | Domain | Responsibility | Typical Events | |--------|----------------|----------------| | Page | Page lifecycle | Load events, navigation, dialogs | | Network | Network activity | Request/response monitoring, WebSockets | | DOM | Document structure | DOM changes, attribute modifications | | Fetch | Request interception | Request paused, authentication required | | Runtime | JavaScript execution | Console messages, exceptions | | Browser | Browser management | Window creation, tabs, contexts | Each domain must be explicitly enabled before it will emit events, which helps manage performance by only processing events that are actually needed. ## Domain Architecture ### The Enable/Disable Pattern The explicit enable/disable pattern serves several important architectural purposes: 1. **Performance Optimization**: By only enabling domains you're interested in, you reduce the overhead of event processing 2. **Resource Management**: Some event domains (like Network or DOM monitoring) can generate large volumes of events that consume memory 3. **Protocol Compliance**: CDP requires explicit domain enabling before events are emitted 4. **Controlled Cleanup**: Explicitly disabling domains ensures proper cleanup when events are no longer needed ```mermaid stateDiagram-v2 [*] --> Disabled: Initial State Disabled --> Enabled: enable_xxx_events() Enabled --> Disabled: disable_xxx_events() Enabled --> [*]: Tab Closed Disabled --> [*]: Tab Closed ``` !!! 
warning "Event Leak Prevention" Failing to disable event domains when they're no longer needed can lead to memory leaks and performance degradation, especially in long-running automation. Always disable event domains when you're done with them, particularly for high-volume events like network monitoring. ### Domain-Specific Enabling Methods Different domains are enabled through specific methods on the appropriate objects: | Domain | Enable Method | Disable Method | Available On | |--------|--------------|----------------|--------------| | Page | `enable_page_events()` | `disable_page_events()` | Tab | | Network | `enable_network_events()` | `disable_network_events()` | Tab | | DOM | `enable_dom_events()` | `disable_dom_events()` | Tab | | Fetch | `enable_fetch_events()` | `disable_fetch_events()` | Tab, Browser | | File Chooser | `enable_intercept_file_chooser_dialog()` | `disable_intercept_file_chooser_dialog()` | Tab | !!! info "Domain Ownership" Events belong to specific domains based on their functionality. Some domains are only available at certain levels - for instance, Page events are available on the Tab instance but not directly at the Browser level. ## Event Registration System ### The `on()` Method The central method for subscribing to events is the `on()` method, available on both Tab and Browser instances: ```python async def on( self, event_name: str, callback: callable, temporary: bool = False ) -> int: """ Registers an event listener. Args: event_name (str): The event name to listen for. callback (callable): The callback function to execute when the event is triggered. temporary (bool): If True, the callback will be removed after it's triggered once. Defaults to False. Returns: int: The ID of the registered callback. """ ``` This method returns a callback ID that can be used to remove the callback later if needed. 
### Callback Registry

Internally, the `ConnectionHandler` maintains a callback registry. Conceptually, callbacks are grouped by event name:

```python
# Conceptual view; each entry is (callback_id, callback, temporary)
{
    'Page.loadEventFired': [
        (callback_id_1, callback_function_1, False),
        (callback_id_2, callback_function_2, True),
    ],
    'Network.requestWillBeSent': [
        (callback_id_3, callback_function_3, False),
    ],
}
```

When an event arrives via WebSocket:

1. The event name is extracted from the message
2. The registry is queried for matching callbacks
3. Each callback is executed with the event data
4. Temporary callbacks are removed after execution

### Async Callback Handling

Callbacks can be either synchronous or asynchronous. The event system handles both:

```python
async def _trigger_callbacks(self, event_name: str, event_data: dict):
    for cb_id, cb_data in self._event_callbacks.items():
        if cb_data['event'] == event_name:
            # Await coroutine functions; call plain functions directly
            if asyncio.iscoroutinefunction(cb_data['callback']):
                await cb_data['callback'](event_data)
            else:
                cb_data['callback'](event_data)
```

Asynchronous callbacks are awaited sequentially. This means each callback completes before the next one executes, which is important for:

- **Predictable Execution Order**: Callbacks execute in registration order
- **Error Handling**: Exceptions in one callback don't prevent others from executing
- **State Consistency**: Callbacks can rely on sequential state changes

!!! info "Sequential vs Concurrent Execution"
    Callbacks execute sequentially within the same event. However, different events can be processed concurrently since the event loop handles multiple connections simultaneously.
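
The dispatch behavior described above (sequential execution, mixed sync and async callbacks, one-shot temporary callbacks, and a failure in one callback not blocking the rest) can be sketched as below. This is a hedged illustration rather than Pydoll's code: `MiniEventsManager` and `dispatch` are hypothetical names, and the `try`/`except` isolation is an assumption about how the error-handling guarantee could be met.

```python
import inspect
import logging

logger = logging.getLogger(__name__)


class MiniEventsManager:
    """Illustrative registry keyed by callback ID (hypothetical class)."""

    def __init__(self):
        self._event_callbacks = {}  # callback ID -> callback data
        self._callback_id = 0

    def register_callback(self, event_name, callback, temporary=False):
        self._callback_id += 1
        self._event_callbacks[self._callback_id] = {
            'event': event_name,
            'callback': callback,
            'temporary': temporary,
        }
        return self._callback_id

    def remove_callback(self, callback_id):
        return self._event_callbacks.pop(callback_id, None) is not None

    async def dispatch(self, message):
        """Route a CDP event message to matching callbacks, one at a time."""
        event_name = message['method']
        to_remove = []
        # Iterate over a snapshot so callbacks may modify the registry safely.
        for cb_id, cb_data in list(self._event_callbacks.items()):
            if cb_data['event'] != event_name:
                continue
            try:
                result = cb_data['callback'](message)
                if inspect.isawaitable(result):
                    await result  # Async callbacks finish before the next one runs
            except Exception:
                # Assumed isolation: log and continue with remaining callbacks
                logger.exception('Callback %s failed', cb_id)
            if cb_data['temporary']:
                to_remove.append(cb_id)
        # Remove temporary callbacks only after every callback has run
        for cb_id in to_remove:
            self.remove_callback(cb_id)
```

Checking the return value with `inspect.isawaitable` instead of inspecting the function up front also covers callables that return a coroutine without being declared `async`, such as `functools.partial` wrappers.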
## Event Flow and Lifecycle The event lifecycle follows these steps: ```mermaid flowchart TD A[Browser Activity] -->|Generates| B[CDP Event] B -->|Sent via WebSocket| C[ConnectionHandler] C -->|Filters by Event Name| D{Registered Callbacks?} D -->|Yes| E[Process Event] D -->|No| F[Discard Event] E -->|For Each Callback| G[Execute Callback] G -->|If Temporary| H[Remove Callback] G -->|If Permanent| I[Retain for Future Events] ``` ### Detailed Flow 1. **Browser Activity**: Something happens in the browser (page loads, request sent, DOM changes) 2. **CDP Event Generation**: Browser generates a CDP event message 3. **WebSocket Transmission**: Message is sent over WebSocket to Pydoll 4. **Event Reception**: The ConnectionHandler receives the event 5. **Callback Lookup**: ConnectionHandler checks its registry for callbacks matching the event name 6. **Callback Execution**: If callbacks exist, each is executed with the event data 7. **Temporary Removal**: If a callback was registered as temporary, it's removed after execution ## Browser-Level vs. Tab-Level Events Pydoll's event system operates at both the browser and tab levels, with important distinctions: ```mermaid graph TD Browser[Browser Instance] -->|"Global Events (e.g., Target events)"| BrowserCallbacks[Browser-Level Callbacks] Browser -->|"Creates"| Tab1[Tab Instance 1] Browser -->|"Creates"| Tab2[Tab Instance 2] Tab1 -->|"Tab-Specific Events"| Tab1Callbacks[Tab 1 Callbacks] Tab2 -->|"Tab-Specific Events"| Tab2Callbacks[Tab 2 Callbacks] ``` ### Browser-Level Events Browser-level events operate globally across all tabs. These are limited to specific domains like: - **Target Events**: Tab creation, destruction, crash - **Browser Events**: Window management, download coordination ```python # Browser-level event registration await browser.on('Target.targetCreated', handle_new_target) ``` Browser-level event domains are limited, and trying to use tab-specific events will raise an exception. 
### Tab-Level Events Tab-level events are specific to an individual tab: ```python # Each tab has its own event context tab1 = await browser.start() tab2 = await browser.new_tab() await tab1.enable_page_events() await tab1.on(PageEvent.LOAD_EVENT_FIRED, handle_tab1_load) await tab2.enable_page_events() await tab2.on(PageEvent.LOAD_EVENT_FIRED, handle_tab2_load) ``` This architecture allows for: - **Isolated Event Handling**: Events in one tab don't affect others - **Per-Tab Configuration**: Different tabs can monitor different event types - **Resource Efficiency**: Only enable events on tabs that need them !!! info "Domain-Specific Scope" Not all event domains are available at both levels: - **Fetch Events**: Available at both browser and tab levels - **Page Events**: Available only at the tab level - **Target Events**: Available only at the browser level ## Performance Architecture ### Event System Overhead The event system adds overhead to browser automation, especially for high-frequency events: | Event Domain | Typical Event Volume | Performance Impact | |--------------|---------------------|-------------------| | Page | Low | Minimal | | Network | High | Moderate to High | | DOM | Very High | High | | Fetch | Moderate | Moderate (higher if intercepting) | ### Performance Optimization Strategies 1. **Selective Domain Enabling**: Only enable event domains you're actively using 2. **Strategic Scoping**: Use browser-level events only for truly browser-wide concerns 3. **Timely Disabling**: Always disable event domains when you're finished with them 4. **Early Filtering**: In callbacks, filter out irrelevant events as early as possible 5. **Temporary Callbacks**: Use the `temporary=True` flag for one-time events ### Memory Management The event system manages memory through several mechanisms: 1. **Callback Registry Cleanup**: Removing callbacks frees their references 2. **Temporary Auto-Removal**: Temporary callbacks are automatically cleaned up 3. 
**Domain Disabling**: Disabling a domain stops event generation 4. **Tab Closure**: When a tab closes, all its callbacks are automatically removed !!! warning "Memory Leak Prevention" In long-running automation, always clean up callbacks and disable domains when done. High-frequency events (especially DOM) can accumulate significant memory if left enabled. ## Connection Handler Architecture The `ConnectionHandler` is the central component managing WebSocket communication and event dispatching. ### Key Responsibilities 1. **WebSocket Management**: Establishing and maintaining the WebSocket connection 2. **Message Routing**: Distinguishing between command responses and events 3. **Callback Registry**: Maintaining the mapping of event names to callbacks 4. **Event Dispatching**: Executing registered callbacks when events arrive 5. **Cleanup**: Removing callbacks and closing connections ### Internal Structure ```python class ConnectionHandler: def __init__(self, ...): self._events_handler = EventsManager() self._websocket = None # ... 
other attributes async def register_callback(self, event_name, callback, temporary): return self._events_handler.register_callback(event_name, callback, temporary) class EventsManager: def __init__(self): self._event_callbacks = {} # Callback ID -> callback data self._callback_id = 0 def register_callback(self, event_name, callback, temporary): self._callback_id += 1 self._event_callbacks[self._callback_id] = { 'event': event_name, 'callback': callback, 'temporary': temporary } return self._callback_id async def _trigger_callbacks(self, event_name, event_data): callbacks_to_remove = [] for cb_id, cb_data in self._event_callbacks.items(): if cb_data['event'] == event_name: # Execute callback (await if async, call directly if sync) if asyncio.iscoroutinefunction(cb_data['callback']): await cb_data['callback'](event_data) else: cb_data['callback'](event_data) # Mark temporary callbacks for removal if cb_data['temporary']: callbacks_to_remove.append(cb_id) # Remove temporary callbacks after all callbacks executed for cb_id in callbacks_to_remove: self.remove_callback(cb_id) ``` This architecture ensures: - **Simple Lookup**: Each dispatch scans the registry and runs only the callbacks whose event name matches - **Minimal Overhead**: Only registered events are processed - **Automatic Cleanup**: Temporary callbacks are removed after execution - **Async Safety**: Dispatch runs on the asyncio event loop, and temporary callbacks are removed only after iteration finishes ## Event Message Format CDP events follow a standardized message format: ```json { "method": "Network.requestWillBeSent", "params": { "requestId": "1234.56", "loaderId": "7890.12", "documentURL": "https://example.com", "request": { "url": "https://api.example.com/data", "method": "GET", "headers": {...} }, "timestamp": 123456.789, "wallTime": 1234567890.123, "initiator": {...}, "type": "XHR" } } ``` Key components: - **`method`**: The event name in `Domain.eventName` format - **`params`**: Event-specific data, varies by event type - **No `id` field**: Unlike commands, events don't have request IDs The event system extracts
the `method` field to route to the appropriate callbacks, passing the entire message to each callback. ## Multi-Tab Event Coordination Pydoll's architecture supports sophisticated multi-tab event coordination: ### Independent Tab Contexts Each tab maintains its own: - Event domain enablement state - Callback registry - Event communication channel - Network logs (if network events enabled) !!! info "Communication Architecture" Each tab has its own event communication channel to the browser. For technical details on how WebSocket connections and target IDs work at the protocol level, see [Browser Domain Architecture](./browser-domain.md). ### Shared Browser Context Multiple tabs can share: - Browser-level event listeners - Cookie storage - Cache - Browser process This architecture allows for: 1. **Parallel Event Processing**: Multiple tabs can process events simultaneously 2. **Isolated Failures**: Issues in one tab don't affect others 3. **Resource Sharing**: Common browser features are shared efficiently 4. **Coordinated Actions**: Browser-level events can coordinate cross-tab activities ## Conclusion Pydoll's event system architecture is designed for: - **Performance**: Minimal overhead through selective domain enabling and efficient callback dispatch - **Flexibility**: Support for both browser-level and tab-level events - **Scalability**: Handle multiple tabs with independent event contexts - **Reliability**: Automatic cleanup and memory management Understanding this architecture helps you: - **Optimize Performance**: Know which domains have high overhead - **Debug Issues**: Understand the event flow when things don't work as expected - **Design Better Automation**: Leverage the architecture for efficient event-driven workflows - **Avoid Pitfalls**: Prevent memory leaks and performance degradation For practical usage patterns and examples, see the [Event System Guide](../features/advanced/event-system.md). 
================================================ FILE: docs/en/deep-dive/architecture/find-elements-mixin.md ================================================ # FindElements Mixin Architecture The FindElementsMixin represents a critical architectural decision in Pydoll: using **composition over inheritance** to share element-finding capabilities between `Tab` and `WebElement` without coupling them through a common base class. This document explores the mixin pattern, its implementation, and the internal mechanics of element location. !!! info "Practical Usage Guide" For practical examples and usage patterns, see the [Element Finding Guide](../features/automation/element-finding.md) and [Selectors Guide](./selectors-guide.md). ## Mixin Pattern: Design Philosophy ### What is a Mixin? A mixin is a class designed to **provide methods to other classes** without being a base class in a traditional inheritance hierarchy. Unlike standard inheritance (which models "is-a" relationships), mixins model **"can-do" capabilities**. ```python # Traditional inheritance: "is-a" class Animal: def breathe(self): ... class Dog(Animal): # Dog IS-A Animal def bark(self): ... # Mixin pattern: "can-do" class FlyableMixin: def fly(self): ... class Bird(Animal, FlyableMixin): # Bird IS-A Animal, CAN fly pass ``` ### Why Mixins Over Inheritance? Pydoll faces a specific architectural challenge: - **`Tab`** needs to find elements in the **document context** - **`WebElement`** needs to find elements **relative to itself** (child elements) - Both need **identical selector logic** (CSS, XPath, attribute building) **Option 1: Shared Base Class** ```python class ElementLocator: def find(...): ... 
class Tab(ElementLocator): pass class WebElement(ElementLocator): pass ``` **Problems:** - Tight coupling: `Tab` and `WebElement` now share inheritance hierarchy - Violates Single Responsibility: `Tab` shouldn't inherit from same class as `WebElement` - Hard to extend: Adding new capabilities requires modifying base class **Option 2: Mixin Pattern (Chosen Approach)** ```python class FindElementsMixin: def find(...): ... def query(...): ... class Tab(FindElementsMixin): # Tab-specific logic pass class WebElement(FindElementsMixin): # WebElement-specific logic pass ``` **Benefits:** - **Decoupling**: `Tab` and `WebElement` remain independent - **Reusability**: Same element-finding logic in both classes - **Composability**: Can add other mixins without conflicts - **Testability**: Mixin can be tested in isolation !!! tip "Mixin Characteristics" 1. **Stateless**: Mixins don't maintain their own state (no `__init__`) 2. **Dependency Injection**: Assumes consuming class provides dependencies (e.g., `_connection_handler`) 3. **Single Purpose**: Each mixin provides one cohesive capability 4. **Not Instantiable**: Never create `FindElementsMixin()` directly ## Mixin Implementation in Pydoll ### Class Structure The FindElementsMixin uses **dependency injection** to work with any class that provides a `_connection_handler`: ```python class FindElementsMixin: """ Mixin providing element finding capabilities. Assumes the consuming class has: - _connection_handler: ConnectionHandler instance for CDP commands - _object_id: Optional[str] for context-relative searches (WebElement only) """ if TYPE_CHECKING: _connection_handler: ConnectionHandler # Type hint, not actual attribute async def find(self, ...): # Implementation uses self._connection_handler # Checks for self._object_id to determine context ``` **Key insight:** The mixin doesn't define `_connection_handler` or `_object_id`. It **assumes** they exist via duck typing. 
### How Tab and WebElement Use the Mixin ```python # Tab: Document-level searches class Tab(FindElementsMixin): def __init__(self, browser, target_id, connection_port): self._connection_handler = ConnectionHandler(connection_port) # No _object_id → searches from document root # WebElement: Element-relative searches class WebElement(FindElementsMixin): def __init__(self, object_id, connection_handler, ...): self._object_id = object_id # CDP object ID self._connection_handler = connection_handler # Has _object_id → searches relative to this element ``` **Critical distinction:** - **Tab**: `hasattr(self, '_object_id')` → `False` → uses `RuntimeCommands.evaluate()` (document context) - **WebElement**: `hasattr(self, '_object_id')` → `True` → uses `RuntimeCommands.call_function_on()` (element context) ### Context Detection The mixin dynamically detects search context: ```python async def _find_element(self, by, value, raise_exc=True): if hasattr(self, '_object_id'): # Relative search: call JavaScript function on THIS element command = self._get_find_element_command(by, value, self._object_id) else: # Document search: evaluate JavaScript in global context command = self._get_find_element_command(by, value) response = await self._execute_command(command) # ... ``` This single implementation handles both: - `tab.find(id='submit')` → searches entire document - `form_element.find(id='submit')` → searches within `form_element` !!! warning "Mixin Dependency Coupling" The mixin is **tightly coupled** to CDP's object model. It assumes: - Elements are represented by `objectId` strings - `Runtime.evaluate()` for document searches - `Runtime.callFunctionOn()` for element-relative searches This is acceptable because Pydoll is **CDP-specific**. A more generic design would require abstraction layers. 
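
The `hasattr`-based context detection described above can be illustrated with a small, self-contained sketch. The class and method names (`FindSketchMixin`, `_build_css_query`, `FakeTab`, `FakeElement`) are hypothetical and the command dictionaries are simplified; only the CDP method names `Runtime.evaluate` and `Runtime.callFunctionOn` and their `expression`, `objectId`, and `functionDeclaration` parameters come from the protocol itself.

```python
import json


class FindSketchMixin:
    """Hypothetical mixin: picks the CDP method from the consumer's attributes."""

    def _build_css_query(self, selector: str) -> dict:
        quoted = json.dumps(selector)  # Safely quote the selector for JavaScript
        if hasattr(self, '_object_id'):
            # Element context: run a function with `this` bound to the element
            return {
                'method': 'Runtime.callFunctionOn',
                'params': {
                    'objectId': self._object_id,
                    'functionDeclaration': (
                        f'function() {{ return this.querySelector({quoted}); }}'
                    ),
                },
            }
        # Document context: evaluate an expression in the global scope
        return {
            'method': 'Runtime.evaluate',
            'params': {'expression': f'document.querySelector({quoted})'},
        }


class FakeTab(FindSketchMixin):
    """Stand-in for Tab: defines no _object_id, so searches are document-wide."""


class FakeElement(FindSketchMixin):
    """Stand-in for WebElement: carries the CDP object ID of its node."""

    def __init__(self, object_id: str):
        self._object_id = object_id
```

The mixin never imports or names either consumer class, which is what keeps `Tab` and `WebElement` decoupled in this design.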
## Public API Design The mixin exposes two high-level methods with distinct design philosophies: ### find(): Attribute-Based Selection ```python @overload async def find(self, find_all: Literal[False], ...) -> WebElement: ... @overload async def find(self, find_all: Literal[True], ...) -> list[WebElement]: ... async def find( self, id: Optional[str] = None, class_name: Optional[str] = None, name: Optional[str] = None, tag_name: Optional[str] = None, text: Optional[str] = None, timeout: int = 0, find_all: bool = False, raise_exc: bool = True, **attributes, ) -> Union[WebElement, list[WebElement], None]: ``` **Design decisions:** 1. **Kwargs over positional By enum**: ```python # Pydoll (intuitive) await tab.find(id='submit', class_name='primary') # Selenium (verbose) driver.find_element(By.ID, 'submit') # Can't combine attributes easily ``` 2. **Auto-resolution to optimal selector**: - Single attribute → uses `By.ID`, `By.CLASS_NAME`, etc. (fastest) - Multiple attributes → builds XPath (flexible but slower) 3. **`**attributes` for extensibility**: ```python await tab.find(data_testid='submit-btn', aria_label='Submit form') # Builds: //*[@data-testid='submit-btn' and @aria-label='Submit form'] ``` ### query(): Expression-Based Selection ```python @overload async def query(self, expression, find_all: Literal[False], ...) -> WebElement: ... @overload async def query(self, expression, find_all: Literal[True], ...) -> list[WebElement]: ... async def query( self, expression: str, timeout: int = 0, find_all: bool = False, raise_exc: bool = True ) -> Union[WebElement, list[WebElement], None]: ``` **Design decisions:** 1. **Auto-detect CSS vs XPath**: ```python # XPath detection (starts with / or ./) await tab.query("//div[@id='content']") # CSS detection (default) await tab.query("div#content > p.intro") ``` 2. **Single expression parameter** (unlike `find()`): - Assumes user knows selector syntax - No abstraction overhead 3. 
**Direct passthrough to browser**: - `querySelector()` / `querySelectorAll()` for CSS - `document.evaluate()` for XPath ### Overload Pattern for Type Safety Both methods use `@overload` to provide **precise return types**: ```python # IDE knows return type is WebElement element = await tab.find(id='submit') # IDE knows return type is list[WebElement] elements = await tab.find(class_name='item', find_all=True) # IDE knows return type is Optional[WebElement] maybe_element = await tab.find(id='optional', raise_exc=False) ``` This is critical for IDE autocomplete and type checking. See [Type System Deep Dive](./typing-system.md) for details. ## Selector Resolution Architecture The mixin converts user input into CDP commands through a resolution pipeline: | Stage | Input | Output | Key Decision | |-------|-------|--------|-------------| | **1. Method Selection** | `find()` kwargs or `query()` expression | Selector strategy | Attribute-based vs expression-based | | **2. Strategy Resolution** | Attributes or expression | `By` enum + value | Single attr → native method, Multiple → XPath | | **3. Context Detection** | `By` + value + `hasattr(_object_id)` | CDP command type | Document vs element-relative search | | **4. Command Generation** | CDP command type + selector | JavaScript + CDP method | `evaluate()` vs `callFunctionOn()` | | **5. Execution** | CDP command | `objectId` or array of `objectId`s | Via ConnectionHandler | | **6. WebElement Creation** | `objectId` + attributes | `WebElement` instance(s) | Factory function to avoid circular imports | ### Key Architectural Decisions **1. 
Single vs Multiple Attributes** ```python # Single attribute → Direct selector (fast) await tab.find(id='username') # Uses By.ID → getElementById() # Multiple attributes → XPath (flexible) await tab.find(tag_name='input', type='password', name='pwd') # → //input[@type='password' and @name='pwd'] ``` **Why this matters:** - Native methods (`getElementById`, `getElementsByClassName`) are 10-50% faster than XPath - XPath overhead is acceptable when combining attributes (no alternative) **2. Auto-Detection of Selector Type** ```python await tab.query("//div") # Starts with / → XPath await tab.query("#login") # Default → CSS ``` **Implementation:** ```python if expression.startswith(('./', '/', '(/')): return By.XPATH return By.CSS_SELECTOR ``` Heuristic is **unambiguous** - CSS selectors cannot start with `/`. **3. XPath Relative Path Adjustment** For element-relative searches, absolute XPath must be converted: ```python # User provides: //div # For WebElement: .//div (relative to element, not document) def _ensure_relative_xpath(xpath): return f'.{xpath}' if not xpath.startswith('.') else xpath ``` Without this, `element.find()` would search from document root. ## CDP Command Generation The mixin routes to different CDP methods based on search context: | Context | Selector Type | CDP Method | JavaScript Equivalent | |---------|--------------|------------|---------------------| | Document | CSS | `Runtime.evaluate` | `document.querySelector()` | | Document | XPath | `Runtime.evaluate` | `document.evaluate()` | | Element | CSS | `Runtime.callFunctionOn` | `this.querySelector()` | | Element | XPath | `Runtime.callFunctionOn` | `document.evaluate(..., this)` | **Key insight:** `Runtime.callFunctionOn` requires an `objectId` (the element to call on), while `Runtime.evaluate` executes in global scope. 
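
The selector-resolution rules described above (prefix-based XPath detection, relative-path adjustment, and combining several attributes into one XPath) can be sketched together. The function names here are hypothetical, and the builder is a simplification: it assumes attribute values contain no single quotes and, following the `data_testid` example earlier in this document, maps underscores in keyword names to hyphens.

```python
from typing import Optional


def detect_selector_type(expression: str) -> str:
    """Classify an expression as XPath or CSS from its prefix."""
    if expression.startswith(('./', '/', '(/')):
        return 'xpath'
    return 'css'


def ensure_relative_xpath(xpath: str) -> str:
    """Prefix '.' so the expression is evaluated relative to an element."""
    return xpath if xpath.startswith('.') else f'.{xpath}'


def build_xpath(tag_name: Optional[str] = None, **attributes: str) -> str:
    """Combine a tag name and attribute filters into a single XPath.

    Assumption: values contain no single quotes (no escaping is done here).
    """
    tag = tag_name or '*'
    conditions = [
        f"@{name.replace('_', '-')}='{value}'"
        for name, value in attributes.items()
    ]
    if not conditions:
        return f'//{tag}'
    return f"//{tag}[{' and '.join(conditions)}]"
```

For an element-relative search, the output of `build_xpath` would be passed through `ensure_relative_xpath` so that `//input[...]` becomes `.//input[...]` and matches only descendants of the element.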
### JavaScript Templates

Pydoll uses pre-defined templates for consistency and performance:

```python
# CSS selectors
Scripts.QUERY_SELECTOR = 'document.querySelector("{selector}")'
Scripts.RELATIVE_QUERY_SELECTOR = 'this.querySelector("{selector}")'

# XPath expressions
Scripts.FIND_XPATH_ELEMENT = '''
    document.evaluate("{escaped_value}", document, null,
        XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue
'''
```

Templates avoid runtime string concatenation and centralize JavaScript code.

## Object ID Resolution and WebElement Creation

CDP represents DOM nodes as **`objectId` strings**. The mixin abstracts this:

**Single element flow:**

1. Execute CDP command → Extract `objectId` from response
2. Call `DOM.describeNode(objectId)` → Get attributes, tag name
3. Create `WebElement(objectId, connection_handler, attributes)`

**Multiple elements flow:**

1. Execute CDP command → Returns **array as single remote object**
2. Call `Runtime.getProperties(array_objectId)` → Enumerate array indices
3. Extract individual `objectId` for each element
4. Describe and create `WebElement` for each

**Why `Runtime.getProperties`?** CDP doesn't return arrays directly - it returns a **reference to an array object**. We must enumerate its properties to extract individual elements.

## Architectural Insights and Design Tradeoffs

### Why Kwargs Instead of By Enum?

**Pydoll's choice:**

```python
await tab.find(id='submit', class_name='primary')
```

**Selenium's approach:**

```python
driver.find_element(By.ID, 'submit')  # Can't combine attributes
```

**Rationale:**

- **Discoverability**: IDE autocomplete shows all available parameters
- **Composability**: Can combine multiple attributes in one call
- **Readability**: `id='submit'` is more intuitive than `(By.ID, 'submit')`

**Tradeoff:** Kwargs are less explicit about selector strategy. Solved by documentation and logging.

### Why Auto-Detect CSS vs XPath?
The `_get_expression_type()` heuristic eliminates user burden:

```python
await tab.query("//div")    # Auto: XPath
await tab.query("#login")   # Auto: CSS
await tab.query("div > p")  # Auto: CSS
```

**Benefits:**

- **Ergonomics**: Users don't need to specify selector type
- **Correctness**: Impossible to misuse (XPath with CSS method, vice versa)

**Limitation:** No way to force CSS interpretation of ambiguous selectors (rare edge case).

### Circular Import Prevention: create_web_element()

The mixin uses a **factory function** to avoid circular imports:

```python
def create_web_element(*args, **kwargs):
    """Dynamically import WebElement at runtime."""
    from pydoll.elements.web_element import WebElement  # Late import
    return WebElement(*args, **kwargs)
```

**Why needed?**

- `FindElementsMixin` → needs to create `WebElement`
- `WebElement` → inherits from `FindElementsMixin`
- Circular dependency!

**Solution:** Late import inside factory function. Import only executes when function is called, breaking the cycle.

### hasattr() for Context Detection: Elegant or Hacky?

The mixin uses `hasattr(self, '_object_id')` to detect Tab vs WebElement:

```python
if hasattr(self, '_object_id'):
    ...  # WebElement: element-relative search
else:
    ...  # Tab: document-level search
```

**Is this "hacky"?**

- **No**: It's **duck typing** (Pythonic idiom)
- Mixin doesn't need to know class hierarchy
- Both Tab and WebElement provide `_connection_handler`
- WebElement additionally provides `_object_id`

**Alternative approaches:**

1. **Type checking**: `if isinstance(self, WebElement)` → Couples mixin to WebElement
2. **Abstract method**: Requires Tab/WebElement to implement `get_search_context()` → More boilerplate
3. **Dependency injection**: Pass context as parameter → Breaks API ergonomics

**Verdict:** `hasattr()` is the best solution for this use case.

## Key Takeaways

1. **Mixins enable code sharing** without making `Tab` and `WebElement` inherit from one another
2. **Context detection via duck typing** (`hasattr`) keeps mixin decoupled from class hierarchy
3. **Auto-resolution optimizes performance** by using native methods for single attributes
4. **XPath building provides composability** for multi-attribute queries
5. **Polling-based waiting** trades CPU cycles for implementation simplicity
6. **CDP object model complexity** is hidden behind WebElement abstraction
7. **Type safety via overloads** provides precise return types for IDE support

## Related Documentation

For deeper understanding of related architectural components:

- **[Type System](./typing-system.md)**: Overload pattern, TypedDict, Generic types
- **[WebElement Domain](./webelement-domain.md)**: WebElement architecture and interaction methods
- **[Selectors Guide](./selectors-guide.md)**: CSS vs XPath syntax and best practices
- **[Tab Domain](./tab-domain.md)**: Tab-level operations and context management

For practical usage patterns:

- **[Element Finding Guide](../features/automation/element-finding.md)**: Practical examples and patterns
- **[Human-Like Interactions](../features/automation/human-interactions.md)**: Realistic element interaction

================================================
FILE: docs/en/deep-dive/architecture/index.md
================================================
# Internal Architecture

**Understand the design, then break the rules intentionally.**

Most documentation shows you **what** a framework does. This section reveals **how** and **why** Pydoll is architected the way it is: the design patterns, architectural decisions, and tradeoffs that shape every line of code.

## Why Architecture Matters

You can use Pydoll effectively without understanding its internal architecture.
But when you need to:

- **Debug** complex issues that span multiple components
- **Optimize** performance bottlenecks in large-scale automation
- **Extend** Pydoll with custom functionality
- **Contribute** improvements to the codebase
- **Build** similar tools for different use cases

...architectural knowledge becomes **indispensable**.

!!! quote "Architecture as Language"
    **"Architecture is frozen music."** - Johann Wolfgang von Goethe

    Good architecture isn't just about making code work; it's about making code **understandable**, **maintainable**, and **extensible**. Understanding Pydoll's architecture teaches you patterns you'll apply to every project.

## The Six Architectural Domains

Pydoll's architecture is organized into **six cohesive domains**, each with clear responsibilities and interfaces:

### 1. Browser Domain

**[→ Explore Browser Architecture](./browser-domain.md)**

**The orchestrator: managing processes, contexts, and global state.**

The Browser domain sits at the top of the hierarchy, coordinating:

- **Process management**: Launching/terminating browser executables
- **Browser contexts**: Isolated environments (like incognito windows)
- **Tab registry**: Singleton pattern for Tab instances
- **Proxy authentication**: Automatic auth via Fetch domain
- **Global operations**: Downloads, permissions, window management

**Key architectural patterns**:

- **Abstract base class** for Chrome/Edge/other Chromium browsers
- **Manager pattern** (ProcessManager, ProxyManager, TempDirManager)
- **Singleton registry** for Tab instances (prevents duplicates)
- **Context manager protocol** for automatic cleanup

**Critical insight**: The Browser doesn't directly manipulate pages; it **coordinates** lower-level components. This separation of concerns enables multi-browser support and concurrent tab operations.

---

### 2. Tab Domain

**[→ Explore Tab Architecture](./tab-domain.md)**

**The workhorse: executing commands, managing state, coordinating automation.**

The Tab domain is Pydoll's primary interface, handling:

- **Navigation**: Page loading with configurable wait states
- **Element finding**: Delegated to FindElementsMixin
- **JavaScript execution**: Both page and element contexts
- **Event coordination**: Tab-specific event listeners
- **Network monitoring**: Request/response capture and analysis
- **IFrame handling**: Nested context management

**Key architectural patterns**:

- **Façade pattern**: Simplified interface to complex CDP operations
- **Mixin composition**: FindElementsMixin for element location
- **Per-tab WebSocket**: Independent connections for parallelism
- **State flags**: Track enabled domains (network_events_enabled, etc.)
- **Lazy initialization**: Request object created on first access

**Critical insight**: Each Tab owns its **own ConnectionHandler**, enabling true parallel operations across tabs without contention or state leakage.

---

### 3. WebElement Domain

**[→ Explore WebElement Architecture](./webelement-domain.md)**

**The interactor: bridging Python code and DOM elements.**

The WebElement domain represents **individual DOM elements**, providing:

- **Interaction methods**: Click, type, scroll, select
- **Property access**: Text, HTML, bounds, attributes
- **State queries**: Visibility, enabled status, value
- **Screenshots**: Element-specific image capture
- **Child finding**: Relative element location (also via FindElementsMixin)

**Key architectural patterns**:

- **Proxy pattern**: Python object representing remote browser element
- **Object ID abstraction**: CDP's objectId hidden behind Python API
- **Hybrid properties**: Sync (attributes) vs async (dynamic state)
- **Command pattern**: Interaction methods wrap CDP commands
- **Fallback strategies**: Multiple approaches for robustness

**Critical insight**: WebElement maintains **both cached attributes** (from creation) and **dynamic state** (fetched on demand), balancing performance with freshness.

---

### 4. FindElements Mixin

**[→ Explore FindElements Architecture](./find-elements-mixin.md)**

**The locator: translating selectors into DOM queries.**

The FindElementsMixin provides element-finding capabilities to both Tab and WebElement through a shared **mixin** rather than a common parent class for the two:

- **Attribute-based finding**: `find(id='submit', class_name='btn')`
- **Expression-based querying**: `query('div.container > p')`
- **Strategy resolution**: Optimal selector for single vs. multiple attributes
- **Waiting mechanisms**: Polling with configurable timeouts
- **Context detection**: Document vs.
element-relative searches

**Key architectural patterns**:

- **Mixin pattern**: Shared capability without a deep inheritance hierarchy
- **Strategy pattern**: Different selector strategies based on input
- **Template method**: Common flow, strategy-specific implementation
- **Factory function**: Late import to avoid circular dependencies
- **Overload pattern**: Type-safe return types (WebElement vs list)

**Critical insight**: The mixin uses **duck typing** (`hasattr(self, '_object_id')`) to detect Tab vs WebElement, enabling code reuse without tight coupling.

---

### 5. Event Architecture

**[→ Explore Event Architecture](./event-architecture.md)**

**The dispatcher: routing browser events to Python callbacks.**

The Event Architecture enables reactive automation through:

- **Event registration**: `on()` method for subscribing to CDP events
- **Callback dispatch**: Async execution without blocking
- **Domain management**: Explicit enable/disable for performance
- **Temporary callbacks**: Auto-removal after first invocation
- **Multi-level scope**: Browser-wide vs tab-specific events

**Key architectural patterns**:

- **Observer pattern**: Subscribe/notify for event-driven code
- **Registry pattern**: Event name → callback list mapping
- **Wrapper pattern**: Auto-wrap sync callbacks for async execution
- **Cleanup protocol**: Automatic callback removal on tab close
- **Scope isolation**: Independent event contexts per tab

**Critical insight**: Events are **push-based** (browser notifies Python), not poll-based, enabling low-latency reactive automation without busy-waiting.

---

### 6. Browser Requests Architecture

**[→ Explore Requests Architecture](./browser-requests-architecture.md)**

**The hybrid: HTTP requests with browser session state.**

The Browser Requests system bridges HTTP and browser automation:

- **Session continuity**: Cookies and auth automatically included
- **Dual data sources**: JavaScript Fetch API + CDP network events
- **Complete metadata**: Headers, cookies, timing (not all available via JavaScript)
- **`requests`-like API**: Familiar interface with browser power

**Key architectural patterns**:

- **Hybrid execution**: JavaScript for body, CDP for metadata
- **Temporary event registration**: Enable/capture/disable pattern
- **Lazy property initialization**: Request object created on first use
- **Adapter pattern**: Requests-compatible interface to browser fetch

**Critical insight**: Browser requests combine **two information sources** (JavaScript and CDP events). JavaScript provides the response body, CDP provides headers and cookies that JavaScript security policies hide.

---

## Architectural Principles

These six domains follow consistent principles:

### 1. Separation of Concerns

Each domain has a **single, well-defined responsibility**:

- Browser → Process/context management
- Tab → Command execution and state
- WebElement → Element interaction
- FindElements → Element location
- Events → Reactive dispatch
- Requests → HTTP in browser context

**Benefit**: Changes in one domain rarely require changes in others.

### 2. Composition Over Inheritance

Instead of deep inheritance hierarchies, Pydoll uses:

- **Mixins** (FindElementsMixin shared by Tab and WebElement)
- **Managers** (ProcessManager, ProxyManager, TempDirManager)
- **Dependency injection** (ConnectionHandler passed to components)

**Benefit**: Flexible component reuse without tight coupling.

### 3. Async by Default

All I/O operations are `async def` and must be `await`ed:

- WebSocket communication
- CDP command execution
- Event callback dispatch
- Network requests

**Benefit**: Enables true concurrency with multiple tabs, parallel operations, and non-blocking I/O.

### 4. Type Safety

Every public API has type annotations:

- Function parameters and return types
- CDP responses as `TypedDict`
- Event types for callback parameters
- Overloads for polymorphic methods

**Benefit**: IDE autocomplete, static type checking, self-documenting code.

### 5. Resource Management

Context managers ensure cleanup:

- `async with Browser()` → closes browser on exit
- `async with tab.expect_file_chooser()` → disables interceptor
- `async with tab.expect_download()` → cleans temp files

**Benefit**: Automatic resource cleanup, prevents leaks even on exceptions.

## Component Interaction

Understanding how domains interact is key:

```mermaid
graph TB
    User[Your Python Code]

    User --> Browser[Browser Domain]
    User --> Tab[Tab Domain]
    User --> Element[WebElement Domain]

    Browser --> ProcessMgr[Process Manager]
    Browser --> ContextMgr[Context Manager]
    Browser --> TabRegistry[Tab Registry]

    Tab --> ConnHandler[Connection Handler]
    Tab --> FindMixin[FindElements Mixin]
    Tab --> EventSystem[Event System]
    Tab --> RequestSystem[Request System]

    Element --> ConnHandler2[Connection Handler]
    Element --> FindMixin2[FindElements Mixin]

    ConnHandler --> WebSocket[WebSocket to CDP]
    ConnHandler2 --> WebSocket

    EventSystem --> ConnHandler
    RequestSystem --> ConnHandler
    RequestSystem --> EventSystem

    WebSocket --> Chrome[Chrome Browser]
```

**Key interactions**:

1. **Browser creates Tabs** → Tabs stored in registry
2. **Tab and WebElement both use FindElementsMixin** → Shared element location
3. **Each Tab owns a ConnectionHandler** → Independent WebSocket connections
4. **Request system uses Event system** → Network events capture metadata
5. **All components use ConnectionHandler** → Centralized CDP communication

## Prerequisites

To fully benefit from this section:

- **[Core Fundamentals](../fundamentals/cdp.md)** - Understand CDP, async, and types
- **Python design patterns** - Familiarity with common patterns
- **OOP concepts** - Classes, inheritance, composition, interfaces
- **Async Python** - Comfortable with `async def` and `await`

**If you haven't read Fundamentals**, start there first. Architecture builds on those concepts.

## Beyond Architecture

After mastering internal architecture, you'll be ready for:

- **Contributing code**: Understand where new features fit
- **Performance optimization**: Identify bottlenecks and inefficiencies
- **Custom extensions**: Build on Pydoll's patterns
- **Similar tools**: Apply these patterns to other projects

## Philosophy of Design

Good architecture is **invisible**: it shouldn't get in your way. Pydoll's architecture prioritizes:

1. **Simplicity**: Each component does one thing well
2. **Consistency**: Similar operations have similar patterns
3. **Explicitness**: No magic, no hidden behavior
4. **Type safety**: Catch errors at design time, not runtime
5. **Performance**: Async by default, parallelism without locks

These aren't arbitrary choices; they're **battle-tested principles** from decades of software engineering.

---

## Ready to Understand the Design?

Start with **[Browser Domain](./browser-domain.md)** to understand how process management and context isolation work, then progress through the domains in order.

**This is where usage becomes mastery.**

---

!!! success "After Completing Architecture"
    Once you understand these patterns, you'll see them everywhere in software engineering, not just Pydoll. These are **universal patterns** applied to browser automation:

    - Façade (Tab simplifies CDP complexity)
    - Observer (Event system for reactive code)
    - Mixin (FindElementsMixin for code reuse)
    - Registry (Browser tracks Tab instances)
    - Strategy (FindElements resolves optimal selectors)

    Good architecture is **timeless knowledge**.

================================================
FILE: docs/en/deep-dive/architecture/shadow-dom.md
================================================
# Shadow DOM Architecture

The Shadow DOM is one of the most challenging aspects of modern web automation. Elements inside shadow trees are invisible to regular DOM queries, which breaks traditional automation approaches. This document explains how Shadow DOM works at the browser level, why conventional tools fail with closed shadow roots, and how Pydoll bypasses these restrictions through direct CDP access.

!!! info "Practical Usage Guide"
    For usage examples and quick-start patterns, see the [Element Finding Guide: Shadow DOM section](../../features/element-finding.md#shadow-dom-support).

## What is Shadow DOM?

Shadow DOM is a web standard that enables **DOM encapsulation**. It allows a component to have its own isolated DOM tree (the "shadow tree") attached to a regular DOM element (the "shadow host"). Elements inside a shadow tree are hidden from the main document's queries.
```mermaid
graph TB
    subgraph "Main DOM (Light DOM)"
        Document["document"]
        Host["div#my-component\n(shadow host)"]
        Other["p.normal-content"]
    end

    subgraph "Shadow Tree (Encapsulated)"
        SR["#shadow-root (open)"]
        Style["style"]
        Button["button.internal"]
        Input["input.private"]
    end

    Document --> Host
    Document --> Other
    Host -.->|"attachShadow()"| SR
    SR --> Style
    SR --> Button
    SR --> Input
```

### Shadow Root Modes

When a component creates a shadow root via `attachShadow()`, it specifies a **mode**:

| Mode | JavaScript Access | CDP Access | Common Usage |
|------|-------------------|------------|--------------|
| `open` | `element.shadowRoot` returns the root | Full access via `backendNodeId` | Custom web components (Lit, Stencil) |
| `closed` | `element.shadowRoot` returns `null` | Full access via `backendNodeId` | Security-sensitive components, payment forms |
| `user-agent` | Not accessible via JS | Limited access | Browser-internal UI (input placeholders, video controls) |

This distinction is critical: **JavaScript-level access is restricted by mode, but CDP-level access is not.**

### Why Regular Automation Fails

Traditional automation tools rely on JavaScript execution in the page context:

```javascript
// WebDriver / Selenium approach
document.querySelector('#my-component')         // ✓ Finds the host
document.querySelector('#my-component button')  // ✗ Cannot cross shadow boundary
element.shadowRoot                              // ✗ Returns null for closed roots
```

The shadow boundary is enforced by the browser's JavaScript engine. Any automation tool that executes JavaScript to find elements will hit this wall. This includes Selenium, Playwright's `page.evaluate()`, and any tool using `Runtime.evaluate()` with `document.querySelector()` at the document level.

## How Pydoll Bypasses Shadow Boundaries

Pydoll's approach works at a layer **below JavaScript**: the Chrome DevTools Protocol. CDP has direct access to the browser's internal DOM representation, which ignores shadow mode restrictions entirely.
### The CDP Advantage

```mermaid
sequenceDiagram
    participant User as User Code
    participant SR as ShadowRoot
    participant CH as ConnectionHandler
    participant CDP as Chrome CDP
    participant DOM as Browser DOM

    User->>SR: shadow_root.query('.btn')
    SR->>SR: _get_find_element_command(object_id)
    SR->>CH: execute_command(Runtime.callFunctionOn)
    CH->>CDP: WebSocket send
    CDP->>DOM: Execute querySelector on shadow root object
    DOM-->>CDP: Element result
    CDP-->>CH: Response with objectId
    CH-->>SR: Element data
    SR-->>User: WebElement instance
```

The key insight is in **how the shadow root object is obtained** and **how queries are executed against it**:

1. **Discovery**: `DOM.describeNode` with `pierce=true` returns shadow root nodes with their `backendNodeId`, regardless of mode
2. **Resolution**: `DOM.resolveNode` converts a `backendNodeId` to a JavaScript `objectId` that references the shadow root directly
3. **Querying**: `Runtime.callFunctionOn` executes `this.querySelector()` on the shadow root's `objectId`; this works because the call is made **on the shadow root object itself**, not from the document context

### Step-by-Step: Shadow Root Access

```mermaid
flowchart TD
    A["WebElement\n(shadow host)"]
    B["shadowRoots[] with\nbackendNodeId"]
    C["JavaScript objectId\nfor shadow root"]
    D["ShadowRoot instance"]
    E["WebElement\n(inside shadow)"]

    A -->|"DOM.describeNode\ndepth=1, pierce=true"| B
    B -->|"DOM.resolveNode\nbackendNodeId"| C
    C -->|"Create ShadowRoot\nwith objectId"| D
    D -->|"find() / query()\nvia callFunctionOn"| E
```

#### Step 1: Describe the Host Node

```python
# Pydoll sends this CDP command:
{
    "method": "DOM.describeNode",
    "params": {
        "objectId": "",
        "depth": 1,
        "pierce": true  # ← This is the key flag
    }
}
```

The `pierce` parameter tells CDP to traverse shadow boundaries when describing the node.
The response includes shadow root information regardless of the shadow root mode:

```json
{
    "result": {
        "node": {
            "nodeName": "DIV",
            "shadowRoots": [
                {
                    "nodeId": 0,
                    "backendNodeId": 5,
                    "shadowRootType": "closed",
                    "childNodeCount": 4
                }
            ]
        }
    }
}
```

!!! warning "nodeId vs backendNodeId"
    When the DOM domain is not explicitly enabled (which is Pydoll's default to minimize overhead), `nodeId` is always `0`. The `backendNodeId` is the stable, always-available identifier. Pydoll uses `backendNodeId` exclusively for shadow root resolution, which is why it works without requiring `DOM.enable()`.

#### Step 2: Resolve to JavaScript Object

```python
# Convert backendNodeId to a usable objectId:
{
    "method": "DOM.resolveNode",
    "params": {
        "backendNodeId": 5
    }
}
```

The response provides an `objectId`, a handle to the shadow root in JavaScript's object space:

```json
{
    "result": {
        "object": {
            "objectId": "-2296764575741119861.1.3"
        }
    }
}
```

#### Step 3: Query Within the Shadow Root

With the shadow root's `objectId`, Pydoll leverages `FindElementsMixin`'s existing relative search mechanism:

```python
# When ShadowRoot.query('.btn') is called:
{
    "method": "Runtime.callFunctionOn",
    "params": {
        "functionDeclaration": "function() { return this.querySelector(\".btn\"); }",
        "objectId": "-2296764575741119861.1.3"
    }
}
```

The function runs with `this` bound to the shadow root object. Since shadow roots implement the `querySelector()` and `querySelectorAll()` interfaces natively, CSS selectors work naturally within the shadow boundary.
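The three steps can be tied together in a short sketch that builds each payload and threads the identifiers from one response into the next request. The helper names are hypothetical and nothing is sent over a WebSocket here; the sample responses are the ones shown above, and the host `objectId` value is a placeholder.

```python
import json
from typing import Any, Optional


def describe_host(host_object_id: str) -> dict[str, Any]:
    """Step 1: describe the host node, piercing shadow boundaries."""
    return {
        'method': 'DOM.describeNode',
        'params': {'objectId': host_object_id, 'depth': 1, 'pierce': True},
    }


def pick_backend_node_id(describe_response: dict[str, Any]) -> Optional[int]:
    """Read the first shadow root's backendNodeId, or None if the host has none."""
    roots = describe_response['result']['node'].get('shadowRoots') or []
    return roots[0]['backendNodeId'] if roots else None


def resolve_node(backend_node_id: int) -> dict[str, Any]:
    """Step 2: turn the backendNodeId into a JavaScript object handle."""
    return {'method': 'DOM.resolveNode', 'params': {'backendNodeId': backend_node_id}}


def query_shadow(shadow_object_id: str, css_selector: str) -> dict[str, Any]:
    """Step 3: run querySelector with `this` bound to the shadow root."""
    declaration = f'function() {{ return this.querySelector({json.dumps(css_selector)}); }}'
    return {
        'method': 'Runtime.callFunctionOn',
        'params': {'functionDeclaration': declaration, 'objectId': shadow_object_id},
    }


# Sample responses copied from the steps above.
describe_response = {
    'result': {'node': {'nodeName': 'DIV', 'shadowRoots': [
        {'nodeId': 0, 'backendNodeId': 5, 'shadowRootType': 'closed', 'childNodeCount': 4},
    ]}}
}
resolve_response = {'result': {'object': {'objectId': '-2296764575741119861.1.3'}}}

step1 = describe_host('host-object-id')  # placeholder objectId
backend_id = pick_backend_node_id(describe_response)
step2 = resolve_node(backend_id)
step3 = query_shadow(resolve_response['result']['object']['objectId'], '.btn')
```

Note that `nodeId` is never read: only `backendNodeId` carries over from step 1 to step 2, which is what lets the flow work without `DOM.enable()`.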
## ShadowRoot Architecture

### Design Decision: Reuse FindElementsMixin

The most critical architectural decision was making `ShadowRoot` inherit from `FindElementsMixin`:

```python
class ShadowRoot(FindElementsMixin):
    def __init__(self, object_id, connection_handler, mode, host_element):
        self._object_id = object_id                    # Shadow root CDP reference
        self._connection_handler = connection_handler  # For CDP communication
        self._mode = mode                              # ShadowRootType enum
        self._host_element = host_element              # Back-reference to host
```

**Why this works**: `FindElementsMixin._find_element()` checks `hasattr(self, '_object_id')`. When present, it uses `RELATIVE_QUERY_SELECTOR`, which calls `this.querySelector()` on the referenced object. Since shadow roots support `querySelector()` natively, `query()` with CSS selectors works automatically without any shadow-specific code.

```python
# This single line in FindElementsMixin enables shadow root searches:
elif hasattr(self, '_object_id'):
    command = self._get_find_element_command(by, value, self._object_id)
```

`ShadowRoot` inherits `query()` and `find_or_wait_element()` from `FindElementsMixin`. However, `find()` and XPath-based `query()` are explicitly **blocked** on `ShadowRoot` (via the `_css_only` class flag) because shadow roots only support `querySelector()` / `querySelectorAll()`; XPath does not work inside shadow boundaries.

!!! tip "Architectural Consistency"
    This is the same mechanism that makes `WebElement.find()` search within an element's children: the `_object_id` attribute signals "search relative to me" rather than "search the whole document." `ShadowRoot`, `WebElement`, and `Tab` all share element-finding behavior through `FindElementsMixin`, with `ShadowRoot` restricted to CSS selectors only.
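The interplay between `_object_id` and `_css_only` can be illustrated with a stripped-down model. The classes below are hypothetical stand-ins, not Pydoll's classes: they perform no CDP calls and only report which scope and selector type a lookup would use.

```python
class FindMixinSketch:
    """Toy stand-in for the mixin; subclasses opt into CSS-only mode."""

    _css_only = False

    def search_scope(self) -> str:
        # Duck typing: an _object_id means "search relative to me".
        return 'element' if hasattr(self, '_object_id') else 'document'

    def query(self, expression: str) -> tuple[str, str, str]:
        is_xpath = expression.startswith(('./', '/', '(/'))
        if is_xpath and self._css_only:
            raise NotImplementedError('XPath is not supported inside shadow roots')
        return (self.search_scope(), 'xpath' if is_xpath else 'css', expression)


class TabSketch(FindMixinSketch):
    """No _object_id: searches are document-wide."""


class WebElementSketch(FindMixinSketch):
    def __init__(self, object_id: str) -> None:
        self._object_id = object_id


class ShadowRootSketch(FindMixinSketch):
    _css_only = True  # shadow roots only offer querySelector()/querySelectorAll()

    def __init__(self, object_id: str) -> None:
        self._object_id = object_id
```

With this model, `TabSketch().query('//div')` resolves to a document-level XPath search, `WebElementSketch('e1').query('//div')` to an element-relative one, and `ShadowRootSketch('s1').query('//div')` raises `NotImplementedError`, while `ShadowRootSketch('s1').query('.btn')` is accepted.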
### Class Relationships

| Class | Has `_object_id` | Has `_connection_handler` | Find Scope |
|-------|:-:|:-:|---|
| `Tab` | No | Yes | Entire document |
| `WebElement` | Yes | Yes | Within element's subtree |
| `ShadowRoot` | Yes | Yes | Within shadow tree |

All three inherit from `FindElementsMixin`. The presence or absence of `_object_id` determines whether searches are document-global or scoped to a specific node.

### Resolving Shadow Roots: backendNodeId Strategy

Pydoll deliberately uses `backendNodeId` instead of `nodeId` for shadow root resolution:

| Property | `nodeId` | `backendNodeId` |
|----------|----------|-----------------|
| Requires `DOM.enable()` | Yes | No |
| Stable across describe calls | No (0 when DOM not enabled) | Yes |
| Works for shadow root resolution | Only when DOM enabled | Always |
| Performance overhead | Higher (DOM domain tracking) | None |

By relying on `backendNodeId`, Pydoll avoids the overhead of enabling the DOM domain while maintaining reliable shadow root access. This is a pragmatic choice: most automation scenarios don't need the DOM domain's event stream, and enabling it adds memory and processing overhead for tracking every DOM mutation.

## Closed Shadow Roots: Why CDP Access Works

This is the most commonly asked question: **if `element.shadowRoot` returns `null` for closed shadow roots in JavaScript, how can CDP access them?**

The answer lies in understanding the browser's architecture:

```mermaid
graph TB
    subgraph "JavaScript Runtime"
        JS["JavaScript Code"]
        API["Web APIs\n(shadowRoot property)"]
    end

    subgraph "Browser Internals"
        CDP_Layer["CDP Protocol Layer"]
        DOM_Internal["Internal DOM Tree"]
    end

    JS -->|"element.shadowRoot"| API
    API -->|"mode == 'closed'\n→ return null"| JS
    CDP_Layer -->|"DOM.describeNode\npierce=true"| DOM_Internal
    DOM_Internal -->|"Always returns\nfull shadow tree"| CDP_Layer
```

**JavaScript access** goes through the Web API layer, which enforces the shadow mode restriction.
When `mode='closed'`, the API returns `null`; this is an intentional access control boundary for web page code.

**CDP access** operates below the Web API layer. It communicates directly with the browser's internal DOM representation. The `closed` mode restriction is a **JavaScript-level policy**, not a **DOM-level restriction**. The shadow tree still exists in the DOM; it's just hidden from JavaScript's view.

!!! info "Security Implications"
    This is by design in the DevTools Protocol. CDP is intended for debugging and automation tools that need full DOM access. The `closed` mode protects shadow contents from other scripts on the same page (e.g., third-party scripts), not from the browser's debugging interface. This is the same reason browser DevTools can inspect closed shadow roots in the Elements panel.

### Practical Verification

You can verify this behavior yourself:

```python
import asyncio
from pydoll.browser.chromium import Chrome
from pydoll.protocol.dom.types import ShadowRootType

async def verify_closed_access():
    async with Chrome() as browser:
        tab = await browser.start()
        await tab.go_to('about:blank')

        # Create a closed shadow root via JavaScript
        await tab.execute_script("""
            const host = document.createElement('div');
            host.id = 'test-host';
            document.body.appendChild(host);
            const shadow = host.attachShadow({ mode: 'closed' });
            shadow.innerHTML = '<p class="secret">Hidden content</p>';
        """)

        # JavaScript cannot access it:
        result = await tab.execute_script(
            "return document.getElementById('test-host').shadowRoot",
            return_by_value=True,
        )
        js_value = result['result']['result'].get('value')
        print(f"JS shadowRoot: {js_value}")  # None

        # But Pydoll can:
        host = await tab.find(id='test-host')
        shadow = await host.get_shadow_root()
        print(f"Shadow mode: {shadow.mode}")  # ShadowRootType.CLOSED

        secret = await shadow.query('.secret')
        text = await secret.text
        print(f"Content: {text}")  # "Hidden content"

asyncio.run(verify_closed_access())
```

## Nested Shadow Roots

Web components frequently compose other web components, creating multi-level shadow trees:

```mermaid
graph TB
    subgraph "Light DOM"
        Host1["outer-component\n(shadow host)"]
    end

    subgraph "Outer Shadow Tree"
        SR1["#shadow-root (open)"]
        Host2["inner-component\n(shadow host)"]
        P1["p.outer-text"]
    end

    subgraph "Inner Shadow Tree"
        SR2["#shadow-root (closed)"]
        Button["button.deep-btn"]
        P2["p.inner-text"]
    end

    Host1 -.-> SR1
    SR1 --> P1
    SR1 --> Host2
    Host2 -.-> SR2
    SR2 --> P2
    SR2 --> Button
```

Pydoll handles this naturally by chaining `get_shadow_root()` calls. Each `ShadowRoot` produces `WebElement` instances that can themselves have shadow roots:

```python
outer_host = await tab.find(tag_name='outer-component')
outer_shadow = await outer_host.get_shadow_root()           # open

inner_host = await outer_shadow.query('inner-component')
inner_shadow = await inner_host.get_shadow_root()           # closed, still works

deep_button = await inner_shadow.query('.deep-btn')
await deep_button.click()
```

Each level follows the same CDP resolution flow: `describeNode` then `resolveNode` then `ShadowRoot` with `_object_id` then `querySelector` via `callFunctionOn`.

## Shadow Roots Inside IFrames

A common real-world scenario involves shadow roots inside cross-origin iframes, for example Cloudflare Turnstile captchas. This combines two isolation mechanisms: the iframe boundary and the shadow boundary.
```mermaid
graph TB
    subgraph "Main Page"
        Host["div.widget\n(shadow host)"]
    end

    subgraph "Shadow Tree"
        SR1["#shadow-root"]
        IFrame["iframe\n(cross-origin)"]
    end

    subgraph "IFrame (OOPIF)"
        Body["body"]
    end

    subgraph "IFrame Shadow Tree"
        SR2["#shadow-root"]
        Button["label.checkbox"]
    end

    Host -.-> SR1
    SR1 --> IFrame
    IFrame -.->|"separate process"| Body
    Body -.-> SR2
    SR2 --> Button
```

Pydoll handles this transparently through **iframe context propagation**. When a `ShadowRoot` is created, it inherits the iframe routing context from its host element:

```python
# The full chain: main page → shadow root → iframe → shadow root → element
shadow_host = await tab.find(id='widget-container')
first_shadow = await shadow_host.get_shadow_root()

iframe = await first_shadow.query('iframe')
body = await iframe.find(tag_name='body')
second_shadow = await body.get_shadow_root()

# click() works correctly: mouse events route through the OOPIF session
button = await second_shadow.query('label.checkbox')
await button.click()
```

### How Context Propagation Works

Cross-origin iframes run in a separate browser process (Out-of-Process IFrame, or OOPIF). CDP commands for these iframes must be routed through a dedicated `sessionId`. Pydoll propagates this routing context automatically through the entire chain:

1. **IFrame resolves its context**: `iframe.find()` establishes an `IFrameContext` with `session_id` and `session_handler` for the OOPIF
2. **Child elements inherit context**: Elements found inside the iframe receive the `IFrameContext`
3. **Shadow roots inherit from host**: `ShadowRoot` copies its host element's `_iframe_context`
4. **Elements in shadow inherit from shadow root**: Elements found via `shadow.query()` receive the propagated context
5. **Commands route correctly**: `_execute_command()` detects the inherited context and routes CDP commands (including `Input.dispatchMouseEvent` for `click()`) through the OOPIF session

This means coordinates from `DOM.getBoxModel` (which are relative to the iframe viewport) are correctly paired with mouse events dispatched to the same OOPIF session.

## Finding Shadow Roots: find_shadow_roots()

`Tab.find_shadow_roots()` traverses the entire DOM tree to collect all shadow roots found on the page.

### How It Works

```
Tab.find_shadow_roots()
├─ DOM.getDocument(depth=-1, pierce=true)
│   └─ Returns full DOM tree with shadowRoots arrays
├─ Recursive tree walk: _collect_shadow_roots_from_tree()
│   ├─ Collects shadowRoots entries with host backendNodeId
│   ├─ Traverses children recursively
│   └─ Traverses contentDocument (same-origin iframes)
├─ For each shadow root entry:
│   ├─ DOM.resolveNode(backendNodeId) → objectId
│   └─ Resolve host element (best-effort)
└─ Returns list[ShadowRoot] with host references
```

### Timeout: Waiting for Shadow Roots

Shadow hosts are often injected asynchronously. `Tab.find_shadow_roots()` accepts a `timeout` parameter that polls every 0.5s until at least one shadow root is found or the timeout expires (raises `WaitElementTimeout`). Similarly, `WebElement.get_shadow_root()` also supports `timeout` for waiting on a specific element's shadow root:

```python
# Wait up to 10 seconds for shadow roots to appear
shadow_roots = await tab.find_shadow_roots(timeout=10)

# Wait for a shadow root on a specific element
shadow = await element.get_shadow_root(timeout=5)
```

### Key Details

- **`pierce=True`** in `DOM.getDocument` causes the browser to include `shadowRoots` arrays in node descriptions, allowing discovery of all shadow roots without navigating to each host individually.
- **Same-origin iframe content** is included in the tree via `contentDocument` nodes. The traversal handles these.
- Each returned `ShadowRoot` has a reference to its `host_element` (resolved best-effort via `DOM.resolveNode`).

### Deep Traversal: Cross-Origin IFrames (OOPIFs)

By default, cross-origin iframes (OOPIFs) are **not** included in the DOM tree, because their content lives in a separate browser process. Pass `deep=True` to also discover shadow roots inside OOPIFs:

```python
shadow_roots = await tab.find_shadow_roots(deep=True, timeout=10)
```

When `deep=True` is set, the method performs additional steps:

```
Tab.find_shadow_roots(deep=True)
├─ ... (main document traversal as above) ...
└─ _collect_oopif_shadow_roots()
    ├─ Browser-level ConnectionHandler (no page_id → browser endpoint)
    ├─ Target.getTargets() → filter type='iframe'
    └─ For each iframe target:
        ├─ Target.attachToTarget(targetId, flatten=True) → sessionId
        ├─ DOM.getDocument(depth=-1, pierce=True) with sessionId
        ├─ _collect_shadow_roots_from_tree() on OOPIF DOM
        └─ For each shadow root found:
            ├─ DOM.resolveNode(backendNodeId) with sessionId
            ├─ Resolve host element (best-effort) with sessionId
            ├─ Create IFrameContext(frame_id, session_handler, session_id)
            └─ Set IFrameContext on host element (or ShadowRoot directly)
```

The returned `ShadowRoot` objects carry the OOPIF routing context (`IFrameContext`), so elements found via `shadow_root.query()` will automatically route CDP commands through the correct OOPIF session. This is critical for scenarios like Cloudflare Turnstile captchas, where the checkbox lives inside a closed shadow root within a cross-origin iframe.

## Limitations and Edge Cases

### Selector Strategies Inside Shadow Roots

!!! warning "CSS Selectors Only Inside Shadow Roots"
    `find()` and XPath are **not supported** on `ShadowRoot` and will raise `NotImplementedError`. Always use `query()` with CSS selectors to search inside shadow roots.

Shadow roots natively implement `querySelector()` and `querySelectorAll()`, but **not** XPath evaluation.
Pydoll enforces this by blocking `find()` (which may generate XPath internally) and XPath-based `query()` on `ShadowRoot`:

| Method | Inside Shadow Root | Notes |
|--------|:--:|---|
| `query('css-selector')` | Supported | The only supported approach |
| `find(...)` | Not supported | Raises `NotImplementedError` |
| `query('//xpath')` | Not supported | Raises `NotImplementedError` |

```python
shadow = await host.get_shadow_root()

# Supported: query() with CSS selectors
button = await shadow.query('button.submit')
email = await shadow.query('#email-input')
items = await shadow.query('.item', find_all=True)

# Not supported: find() and XPath raise NotImplementedError
# shadow.find(id='email-input')  # NotImplementedError
# shadow.query('.//button')      # NotImplementedError
```

### XPath Cannot Cross Shadow Boundaries

XPath expressions from the document root cannot traverse shadow boundaries. This is a fundamental limitation of XPath, which was designed before Shadow DOM existed:

```python
# Won't find shadow content: document-level XPath cannot cross the boundary
element = await tab.find(xpath='//div[@id="host"]//button')
```

### User-Agent Shadow Roots

Browser-internal shadow roots (e.g., `` placeholder styling, `