Full Code of soxoj/maigret for AI

Repository: soxoj/maigret
Branch: main
Commit: 83a9dafe55cd
Files: 98
Total size: 2.1 MB

Directory structure:
gitextract_52ofrfho/
├── .dockerignore
├── .githooks/
│   └── pre-commit
├── .github/
│   ├── FUNDING.yml
│   ├── ISSUE_TEMPLATE/
│   │   ├── add-a-site.md
│   │   ├── bug.md
│   │   └── report-false-result.md
│   ├── dependabot.yml
│   └── workflows/
│       ├── build-docker-image.yml
│       ├── codeql-analysis.yml
│       ├── pyinstaller.yml
│       ├── python-package.yml
│       ├── python-publish.yml
│       └── update-site-data.yml
├── .gitignore
├── .readthedocs.yaml
├── CHANGELOG.md
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── Dockerfile
├── Installer.bat
├── LICENSE
├── MANIFEST.in
├── Makefile
├── README.md
├── docs/
│   ├── Makefile
│   ├── make.bat
│   ├── requirements.txt
│   └── source/
│       ├── command-line-options.rst
│       ├── conf.py
│       ├── development.rst
│       ├── features.rst
│       ├── index.rst
│       ├── installation.rst
│       ├── philosophy.rst
│       ├── quick-start.rst
│       ├── settings.rst
│       ├── supported-identifier-types.rst
│       ├── tags.rst
│       └── usage-examples.rst
├── maigret/
│   ├── __init__.py
│   ├── __main__.py
│   ├── __version__.py
│   ├── activation.py
│   ├── checking.py
│   ├── errors.py
│   ├── executors.py
│   ├── maigret.py
│   ├── notify.py
│   ├── permutator.py
│   ├── report.py
│   ├── resources/
│   │   ├── data.json
│   │   ├── simple_report.tpl
│   │   ├── simple_report_pdf.css
│   │   └── simple_report_pdf.tpl
│   ├── result.py
│   ├── settings.py
│   ├── sites.py
│   ├── submit.py
│   ├── types.py
│   ├── utils.py
│   └── web/
│       ├── app.py
│       └── templates/
│           ├── base.html
│           ├── index.html
│           ├── results.html
│           └── status.html
├── pyinstaller/
│   ├── maigret_standalone.py
│   ├── maigret_standalone.spec
│   └── requirements.txt
├── pyproject.toml
├── pytest.ini
├── sites.md
├── snapcraft.yaml
├── static/
│   ├── recursive_search.md
│   └── report_alexaimephotographycars.html
├── tests/
│   ├── __init__.py
│   ├── conftest.py
│   ├── db.json
│   ├── local.json
│   ├── test_activation.py
│   ├── test_checking.py
│   ├── test_cli.py
│   ├── test_data.py
│   ├── test_errors.py
│   ├── test_executors.py
│   ├── test_maigret.py
│   ├── test_notify.py
│   ├── test_permutator.py
│   ├── test_report.py
│   ├── test_sites.py
│   ├── test_submit.py
│   └── test_utils.py
├── utils/
│   ├── __init__.py
│   ├── add_tags.py
│   ├── check_engines.py
│   ├── import_sites.py
│   ├── sites_diff.py
│   └── update_site_data.py
└── wizard.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .dockerignore
================================================
.git/
.vscode/
static/
tests/
*.txt
!/requirements.txt
venv/



================================================
FILE: .githooks/pre-commit
================================================
#!/bin/sh
echo 'Activating update_sitesmd hook script...'
poetry run update_sitesmd

================================================
FILE: .github/FUNDING.yml
================================================
# These are supported funding model platforms

patreon: soxoj
github: soxoj
buy_me_a_coffee: soxoj

================================================
FILE: .github/ISSUE_TEMPLATE/add-a-site.md
================================================
---
name: Add a site
about: I want to add a new site for Maigret checks
title: New site
labels: new-site
assignees: soxoj

---

Link to the site main page: https://example.com
Link to an existing account: https://example.com/users/john
Link to a nonexistent account: https://example.com/users/noonewouldeverusethis7
Tags: photo, us, ...


================================================
FILE: .github/ISSUE_TEMPLATE/bug.md
================================================
---
name: Maigret bug report
about: I want to report a bug in Maigret functionality
title: ''
labels: bug
assignees: soxoj

---

## Checklist

- [ ] I'm reporting a bug in Maigret functionality
- [ ] I've checked for similar bug reports including closed ones
- [ ] I've checked for pull requests that attempt to fix this bug

## Description

Information about the Maigret version you are running and your environment (`--version` output, operating system, ISP):
<INSERT VERSION INFO HERE>

How to reproduce this bug (commandline options / conditions):
<INSERT EXAMPLE OF CLI COMMAND HERE>

<DESCRIPTION>

<PASTE SCREENSHOT>

<ATTACH LOG FILE>


================================================
FILE: .github/ISSUE_TEMPLATE/report-false-result.md
================================================
---
name: Report invalid result
about: I want to report an invalid result of a Maigret search
title: Invalid result
labels: false-result
assignees: soxoj

---

Invalid link: <INSERT LINK HERE>

<!--

Put x into the box

[ ] ==> [x]

-->

- [ ] I'm sure that the link leads to a "not found" page


================================================
FILE: .github/dependabot.yml
================================================
version: 2
updates:
  - package-ecosystem: "pip"
    directory: "/"
    schedule:
      interval: "daily"


================================================
FILE: .github/workflows/build-docker-image.yml
================================================
name: Build docker image and push to DockerHub

on:
  push:
    branches: [ main ]

jobs:
  docker:
    runs-on: ubuntu-latest
    steps:
      -
        name: Set up QEMU
        uses: docker/setup-qemu-action@v1
      -
        name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v1
      -
        name: Login to DockerHub
        uses: docker/login-action@v1
        with:
          username: ${{ secrets.DOCKER_HUB_USERNAME }}
          password: ${{ secrets.DOCKER_HUB_ACCESS_TOKEN }}
      -
        name: Build and push
        id: docker_build
        uses: docker/build-push-action@v2
        with:
          push: true
          tags: ${{ secrets.DOCKER_HUB_USERNAME }}/maigret:latest
          platforms: linux/amd64,linux/arm64
      -
        name: Image digest
        run: echo ${{ steps.docker_build.outputs.digest }}


================================================
FILE: .github/workflows/codeql-analysis.yml
================================================
# For most projects, this workflow file will not need changing; you simply need
# to commit it to your repository.
#
# You may wish to alter this file to override the set of languages analyzed,
# or to provide custom queries or build logic.
#
# ******** NOTE ********
# We have attempted to detect the languages in your repository. Please check
# the `language` matrix defined below to confirm you have the correct set of
# supported CodeQL languages.
#
name: "CodeQL"

on:
  push:
    branches: [ main ]
  schedule:
    - cron: '23 6 * * 6'

jobs:
  analyze:
    name: Analyze
    runs-on: ubuntu-latest
    permissions:
      actions: read
      contents: read
      security-events: write

    strategy:
      fail-fast: false
      matrix:
        language: [ 'python' ]
        # CodeQL supports [ 'cpp', 'csharp', 'go', 'java', 'javascript', 'python', 'ruby' ]
        # Learn more about CodeQL language support at https://git.io/codeql-language-support

    steps:
    - name: Checkout repository
      uses: actions/checkout@v2

    # Initializes the CodeQL tools for scanning.
    - name: Initialize CodeQL
      uses: github/codeql-action/init@v1
      with:
        languages: ${{ matrix.language }}
        # If you wish to specify custom queries, you can do so here or in a config file.
        # By default, queries listed here will override any specified in a config file.
        # Prefix the list here with "+" to use these queries and those in the config file.
        # queries: ./path/to/local/query, your-org/your-repo/queries@main

    # Autobuild attempts to build any compiled languages (C/C++, C#, or Java).
    # If this step fails, then you should remove it and run the build manually (see below)
    - name: Autobuild
      uses: github/codeql-action/autobuild@v1

    # ℹ️ Command-line programs to run using the OS shell.
    # 📚 https://git.io/JvXDl

    # ✏️ If the Autobuild fails above, remove it and uncomment the following three lines
    #    and modify them (or add more) to build your code if your project
    #    uses a compiled language

    #- run: |
    #   make bootstrap
    #   make release

    - name: Perform CodeQL Analysis
      uses: github/codeql-action/analyze@v1


================================================
FILE: .github/workflows/pyinstaller.yml
================================================
name: Package exe with PyInstaller - Windows

on:
  push:
    branches: [ main, dev ]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - name: Checkout
      uses: actions/checkout@v4

    - name: PyInstaller Windows Build
      uses: JackMcKew/pyinstaller-action-windows@main
      with:
        path: pyinstaller

    - name: Upload PyInstaller Binary to Workflow as Artifact
      uses: actions/upload-artifact@v4
      with:
        name: maigret_standalone_win32
        path: pyinstaller/dist/windows

    - name: Download PyInstaller Binary
      uses: actions/download-artifact@v4
      with:
        name: maigret_standalone_win32

    - name: Create New Release and Upload PyInstaller Binary to Release
      uses: ncipollo/release-action@v1.14.0
      id: create_release
      with:
        allowUpdates: true
        draft: false
        prerelease: false
        artifactErrorsFailBuild: true
        makeLatest: true
        replacesArtifacts: true
        artifacts: maigret_standalone.exe
        name: Development Windows Release [${{ github.ref_name }}]
        tag: ${{ github.ref_name }}
        body: |
          This is a development release built from the **${{ github.ref_name }}** branch.

          Keep in mind that `dev` releases may be unstable.
          Please use [the development release](https://github.com/soxoj/maigret/releases/tag/main) built from the **main** branch.

          Instructions:
          - Download the attached file `maigret_standalone.exe` to get the Windows executable.
          - Video guide on how to run it: https://youtu.be/qIgwTZOmMmM
          - For detailed documentation, visit: https://maigret.readthedocs.io/en/latest/

      env:
        GITHUB_TOKEN: ${{ github.token }}


================================================
FILE: .github/workflows/python-package.yml
================================================
name: Linting and testing

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
    types: [opened, synchronize, reopened]

jobs:
  build:

    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10", "3.11", "3.12", "3.13"]

    steps:
    - name: Checkout
      uses: actions/checkout@v2
    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v2
      with:
        python-version: ${{ matrix.python-version }}
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        python -m pip install poetry
        python -m poetry install --with dev
    - name: Test with Coverage and Pytest (Fail if coverage is low)
      run: |
        poetry run coverage run --source=./maigret -m pytest --reruns 3 --reruns-delay 5 tests
        poetry run coverage report --fail-under=60
        poetry run coverage html
    - name: Upload coverage report
      uses: actions/upload-artifact@v4
      with:
        name: htmlcov-${{ strategy.job-index }}
        path: htmlcov

================================================
FILE: .github/workflows/python-publish.yml
================================================
name: Upload Python Package to PyPI when a Release is Created
on:
  release:
    types: [created]
  push:
    tags:
      - "v*"
permissions:
  id-token: write
  contents: read
jobs:
  build-and-publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v3
      - run: uv build
      - name: Publish to PyPI (Trusted Publishing)
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
          packages-dir: dist

================================================
FILE: .github/workflows/update-site-data.yml
================================================
name: Update sites rating and statistics

on:
  pull_request:
    branches: [ dev ]
    types: [opened, synchronize]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - name: Checkout repository
      uses: actions/checkout@v2.3.2
      with:
        ref: ${{ github.event.pull_request.head.sha }}
        fetch-depth: 0 # otherwise, there would be errors pushing refs to the destination repository.

    - name: build application
      run: |
        pip3 install .
        python3 ./utils/update_site_data.py --empty-only

    - name: Commit and push changes
      run: |
        git config --global user.name "Maigret autoupdate"
        git config --global user.email "soxoj@protonmail.com"
        echo "$(git name-rev ${{ github.event.pull_request.head.sha }} --name-only)"
        export BRANCH="$(git name-rev ${{ github.event.pull_request.head.sha }} --name-only | sed 's/remotes\/origin\///')"
        echo "$BRANCH"
        git remote -v
        git checkout "$BRANCH"
        git add sites.md
        git commit -m "Updated site list and statistics"
        git push origin "$BRANCH"

================================================
FILE: .gitignore
================================================
# Virtual Environment
venv/
.venv/

# Editor Configurations
.vscode/
.idea/

# Python
__pycache__/

# Pip
src/

# Jupyter Notebook
.ipynb_checkpoints
*.ipynb

# Logs and backups
*.log
*.bak

# Output files, except requirements.txt
*.txt
!requirements.txt

# Comma-Separated Values (CSV) Reports
*.csv

# MacOS Folder Metadata File
.DS_Store
/reports/

# Testing
.coverage
dist/
htmlcov/
/test_*

# Maigret files
settings.json

# other
*.egg-info
build

================================================
FILE: .readthedocs.yaml
================================================
version: 2

build:
  os: ubuntu-22.04
  tools:
    python: "3.10"

sphinx:
  configuration: docs/source/conf.py

formats:
  - pdf

python:
  install:
    - requirements: docs/requirements.txt


================================================
FILE: CHANGELOG.md
================================================
# Changelog

## [0.5.0] - 2025-08-10
* Site Supression by @C3n7ral051nt4g3ncy in https://github.com/soxoj/maigret/pull/627
* Bump yarl from 1.7.2 to 1.8.1 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/626
* Streaming sites by @soxoj in https://github.com/soxoj/maigret/pull/628
* Mirrors by @fen0s in https://github.com/soxoj/maigret/pull/630
* Added Instagram scrapers by @soxoj in https://github.com/soxoj/maigret/pull/633
* Bump psutil from 5.9.1 to 5.9.2 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/624
* Bump pypdf2 from 2.10.4 to 2.10.5 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/625
* Invalid results fixes by @soxoj in https://github.com/soxoj/maigret/pull/634
* Bump pytest-httpserver from 1.0.5 to 1.0.6 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/638
* Bump pypdf2 from 2.10.5 to 2.10.8 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/641
* Bump certifi from 2022.6.15 to 2022.9.14 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/644
* Bump idna from 3.3 to 3.4 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/640
* fix false positives from bot by @fen0s in https://github.com/soxoj/maigret/pull/663
* Add pre commit hook by @fen0s in https://github.com/soxoj/maigret/pull/664
* site deletion by @C3n7ral051nt4g3ncy in https://github.com/soxoj/maigret/pull/648
* Changed docker run to interactive and remove on exit by @dr-BEat in https://github.com/soxoj/maigret/pull/675
* Corrected grammar in README.md by @Trkzi-Omar in https://github.com/soxoj/maigret/pull/674
* fix sites from issues by @fen0s in https://github.com/soxoj/maigret/pull/680
* correct username in usage examples by @LeonGr in https://github.com/soxoj/maigret/pull/673
* Update README.md by @johanburati in https://github.com/soxoj/maigret/pull/669
* Fix typos by @LorenzoSapora in https://github.com/soxoj/maigret/pull/681
* Build docker images for arm64 and amd64 by @krydos in https://github.com/soxoj/maigret/pull/687
* Bump certifi from 2022.9.14 to 2022.9.24 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/652
* Bump aiohttp from 3.8.1 to 3.8.3 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/651
* Bump arabic-reshaper from 2.1.3 to 2.1.4 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/650
* Update README.md, Repl.it -> Replit with new badge by @PeterDaveHello in https://github.com/soxoj/maigret/pull/692
* Refactor Dockerfile with best practices by @PeterDaveHello in https://github.com/soxoj/maigret/pull/691
* Improve README.md Installation section by @PeterDaveHello in https://github.com/soxoj/maigret/pull/690
* Bump pytest-cov from 3.0.0 to 4.0.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/688
* Bump stem from 1.8.0 to 1.8.1 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/689
* Bump typing-extensions from 4.3.0 to 4.4.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/698
* Typo fixes in error.py by @Ben-Chapman in https://github.com/soxoj/maigret/pull/711
* Fixed docs about tags by @soxoj in https://github.com/soxoj/maigret/pull/715
* Fixed lightstalking.com by @soxoj in https://github.com/soxoj/maigret/pull/716
* Fixed YouTube by @soxoj in https://github.com/soxoj/maigret/pull/717
* Bump pytest-asyncio from 0.19.0 to 0.20.1 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/732
* Updated snapcraft yaml by @kz6fittycent in https://github.com/soxoj/maigret/pull/720
* Bump colorama from 0.4.5 to 0.4.6 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/733
* Bump pytest from 7.1.3 to 7.2.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/734
* disable not working sites by @fen0s in https://github.com/soxoj/maigret/pull/739
* disable broken sites by @fen0s in https://github.com/soxoj/maigret/pull/756
* Bump cloudscraper from 1.2.64 to 1.2.66 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/769
* fix opensea and shutterstock, disable a few dead sites by @fen0s in https://github.com/soxoj/maigret/pull/798
* Fixed documentation URL by @soxoj in https://github.com/soxoj/maigret/pull/799
* Small readme fix by @soxoj in https://github.com/soxoj/maigret/pull/857
* docs spelling error by @Nadeem-05 in https://github.com/soxoj/maigret/pull/866
* Fix Pinterest false positive by @therealchiendat in https://github.com/soxoj/maigret/pull/862
* Added new Websites by @codyMar30 in https://github.com/soxoj/maigret/pull/838
* Update "future" package to v0.18.3 by @PeterDaveHello in https://github.com/soxoj/maigret/pull/834
* Bump certifi from 2022.9.24 to 2022.12.7 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/793
* Update dependency - networkx from v2.5.1 to v2.6 by @PeterDaveHello in https://github.com/soxoj/maigret/pull/738
* Bump reportlab from 3.6.11 to 3.6.12 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/735
* Bump typing-extensions from 4.4.0 to 4.5.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/888
* Bump psutil from 5.9.2 to 5.9.4 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/741
* Bump attrs from 22.1.0 to 22.2.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/892
* Bump multidict from 6.0.2 to 6.0.4 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/891
* Fixed false positives, updated networkx dep, some lint fixes by @soxoj in https://github.com/soxoj/maigret/pull/894
* Bump lxml from 4.9.1 to 4.9.2 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/900
* Bump yarl from 1.8.1 to 1.8.2 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/899
* Fixed false positives on Mastodon sites by @soxoj in https://github.com/soxoj/maigret/pull/901
* Added valid regex for Mastodon instances (#848) by @soxoj in https://github.com/soxoj/maigret/pull/906
* Fix missing Mastodon Regex on #906 by @therealchiendat in https://github.com/soxoj/maigret/pull/908
* Bump tqdm from 4.64.1 to 4.65.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/905
* Bump requests from 2.28.1 to 2.28.2 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/904
* Bump psutil from 5.9.4 to 5.9.5 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/910
* fix deployment of tests by @noraj in https://github.com/soxoj/maigret/pull/933
* Added 26 ENS and similar domains with tag `crypto` by @soxoj in https://github.com/soxoj/maigret/pull/942
* Bump requests from 2.28.2 to 2.31.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/957
* Update wizard.py by @engNoori in https://github.com/soxoj/maigret/pull/1016
* Improved search through UnstoppableDomains by @soxoj in https://github.com/soxoj/maigret/pull/1040
* Added memory.lol (Twitter usernames archive) by @soxoj in https://github.com/soxoj/maigret/pull/1067
* Disabled and fixed several sites by @soxoj in https://github.com/soxoj/maigret/pull/1132
* Fixed some sites (again) by @soxoj in https://github.com/soxoj/maigret/pull/1133
* fix(sec): upgrade reportlab to 3.6.13 by @realize096 in https://github.com/soxoj/maigret/pull/1051
* Add compatibility with pytest >= 7.3.0 by @tjni in https://github.com/soxoj/maigret/pull/1117
* Additionally fixed sites, win32 build fix by @soxoj in https://github.com/soxoj/maigret/pull/1148
* Sites fixes 250823 by @soxoj in https://github.com/soxoj/maigret/pull/1149
* Bump reportlab from 3.6.12 to 4.0.4 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1160
* Bump certifi from 2022.12.7 to 2023.7.22 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1070
* fix(sec): upgrade certifi to 2022.12.07 by @realize096 in https://github.com/soxoj/maigret/pull/1173
* Bump cloudscraper from 1.2.66 to 1.2.71 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/914
* Some sites fixed & cloudflare detection by @soxoj in https://github.com/soxoj/maigret/pull/1178
* EasyInstaller because everyone likes saving time :) by @CatchySmile in https://github.com/soxoj/maigret/pull/1212
* Tests fixes + last updates by @soxoj in https://github.com/soxoj/maigret/pull/1228
* Bump pypdf2 from 2.10.8 to 3.0.1 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/815
* Bump pyvis from 0.2.1 to 0.3.2 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/861
* Bump xhtml2pdf from 0.2.8 to 0.2.11 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/935
* Bump flake8 from 5.0.4 to 6.1.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1091
* Bump aiohttp from 3.8.3 to 3.8.6 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1222
* Specified pyinstaller version by @soxoj in https://github.com/soxoj/maigret/pull/1230
* Pyinstaller fix by @soxoj in https://github.com/soxoj/maigret/pull/1231
* Test pyinstaller on dev branch by @soxoj in https://github.com/soxoj/maigret/pull/1233
* Update main from dev again by @soxoj in https://github.com/soxoj/maigret/pull/1234
* Bump typing-extensions from 4.5.0 to 4.8.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1239
* Bump pytest-rerunfailures from 10.2 to 12.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1237
* Bump async-timeout from 4.0.2 to 4.0.3 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1238
* Changed pyinstaller dir by @soxoj in https://github.com/soxoj/maigret/pull/1245
* Bump tqdm from 4.65.0 to 4.66.1 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1235
* Updating site checkers, disabling suspended sites by @MeowyPouncer in https://github.com/soxoj/maigret/pull/1266
* Updated site statistics by @soxoj in https://github.com/soxoj/maigret/pull/1273
* Compat RegataOS (Opensuse) by @Jeiel0rbit in https://github.com/soxoj/maigret/pull/1308
* fix reddit by @hhhtylerw in https://github.com/soxoj/maigret/pull/1296
* Added Telegram bot link by @soxoj in https://github.com/soxoj/maigret/pull/1321
* Added SOWEL classification by @soxoj in https://github.com/soxoj/maigret/pull/1453
* Bump jinja2 from 3.1.2 to 3.1.3 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1358
* Fixed/Disabled sites. Update requirements.txt by @rly0nheart in https://github.com/soxoj/maigret/pull/1517
* Fixed 4 sites, added 6 sites, disabled 27 sites by @rly0nheart in https://github.com/soxoj/maigret/pull/1536
* Fixed 3 sites, disabed 3, added  by @rly0nheart in https://github.com/soxoj/maigret/pull/1539
* Bump socid-extractor from 0.0.24 to 0.0.26 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1546
* Added code conventions to CONTRIBUTING.md by @Lord-Topa in https://github.com/soxoj/maigret/pull/1589
* Readme by @Lord-Topa in https://github.com/soxoj/maigret/pull/1588
* Update data.json by @ranlo in https://github.com/soxoj/maigret/pull/1559
* Adding permutator feature for usernames by @balestek in https://github.com/soxoj/maigret/pull/1575
* Alik.cz indirectly requests removal by @ppfeister in https://github.com/soxoj/maigret/pull/1671
* Fixed 1 site, PyInstaller workflow, Google Colab example by @Ixve in https://github.com/soxoj/maigret/pull/1558
* Bump soupsieve from 2.5 to 2.6 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1708
* Added dev documentation, fixed some sites, removed GitHub issue links… by @soxoj in https://github.com/soxoj/maigret/pull/1869
* Bump cryptography from 42.0.7 to 43.0.1 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1870
* Bump requests-futures from 1.0.1 to 1.0.2 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1868
* Bump werkzeug from 3.0.3 to 3.0.6 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1846
* Added .readthedocs.yaml, fixed Pyinstaller and Docker workflows by @soxoj in https://github.com/soxoj/maigret/pull/1874
* Added GitHub and BuyMeACoffee sponsorships by @soxoj in https://github.com/soxoj/maigret/pull/1875
* Bump psutil from 5.9.5 to 6.1.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1839
* Bump flake8 from 6.1.0 to 7.1.1 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1692
* Bump future from 0.18.3 to 1.0.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1545
* Bump urllib3 from 2.2.1 to 2.2.2 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1600
* Bump certifi from 2023.11.17 to 2024.8.30 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1840
* Fixed test for aiohttp 3.10 by @soxoj in https://github.com/soxoj/maigret/pull/1876
* Bump aiohttp from 3.9.5 to 3.10.5 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1721
* Added new badges to README by @soxoj in https://github.com/soxoj/maigret/pull/1877
* Show detailed error statistics for `-v` by @soxoj in https://github.com/soxoj/maigret/pull/1879
* Disabled unavailable sites by @soxoj in https://github.com/soxoj/maigret/pull/1880
* Added 7 sites, implemented integration with Marple, docs update by @soxoj in https://github.com/soxoj/maigret/pull/1881
* Bump pefile from 2022.5.30 to 2024.8.26 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1883
* Bump lxml from 4.9.4 to 5.3.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1884
* New sites added by @soxoj in https://github.com/soxoj/maigret/pull/1888
* Improved self-check mode, added 15 sites by @soxoj in https://github.com/soxoj/maigret/pull/1887
* Bump pyinstaller from 6.1 to 6.11.1 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1882
* Bump pytest-asyncio from 0.23.7 to 0.23.8 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1885
* Pyinstaller bump & pefile fix by @soxoj in https://github.com/soxoj/maigret/pull/1890
* Bump python-bidi from 0.4.2 to 0.6.3 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1886
* Sites checks fixes by @soxoj in https://github.com/soxoj/maigret/pull/1896
* Parallel execution optimization by @soxoj in https://github.com/soxoj/maigret/pull/1897
* Maigret bot support (custom progress function fixed) by @soxoj in https://github.com/soxoj/maigret/pull/1898
* Bump markupsafe from 2.1.5 to 3.0.2 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1895
* Retries set to 0 by default, refactored code of executor with progress by @soxoj in https://github.com/soxoj/maigret/pull/1899
* Bump aiohttp-socks from 0.7.1 to 0.9.1 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1900
* Bump pycountry from 23.12.11 to 24.6.1 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1903
* Bump pytest-cov from 4.1.0 to 6.0.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1902
* Bump pyvis from 0.2.1 to 0.3.2 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1893
* Close http connections (#1595) by @soxoj in https://github.com/soxoj/maigret/pull/1905
* New logo by @soxoj in https://github.com/soxoj/maigret/pull/1906
* Fixed dateutil parsing error for CDT timezone by @soxoj in https://github.com/soxoj/maigret/pull/1907
* Bump alive-progress from 2.4.1 to 3.2.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1910
* Permutator output and documentation updates by @soxoj in https://github.com/soxoj/maigret/pull/1914
* Bump aiohttp from 3.11.7 to 3.11.8 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1912
* Bump async-timeout from 4.0.3 to 5.0.1 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1909
* An recursive search animation in README has been updated by @soxoj in https://github.com/soxoj/maigret/pull/1915
* Bump pytest-rerunfailures from 12.0 to 15.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1911
* Bump attrs from 22.2.0 to 24.2.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1913
* Sites fixes by @soxoj in https://github.com/soxoj/maigret/pull/1917
* Update README.md by @soxoj in https://github.com/soxoj/maigret/pull/1919
* Refactored sites module, updated documentation by @soxoj in https://github.com/soxoj/maigret/pull/1918
* Bump aiohttp from 3.11.8 to 3.11.9 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1920
* Bump pytest from 7.4.4 to 8.3.4 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1923
* Bump yarl from 1.18.0 to 1.18.3 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1922
* Bump pytest-asyncio from 0.23.8 to 0.24.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1925
* Documentation update by @soxoj in https://github.com/soxoj/maigret/pull/1926
* Bump mock from 4.0.3 to 5.1.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1921
* Bump pywin32-ctypes from 0.2.1 to 0.2.3 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1924
* Installation docs update by @soxoj in https://github.com/soxoj/maigret/pull/1927
* Disabled Figma check by @soxoj in https://github.com/soxoj/maigret/pull/1928
* Put Windows executable in Releases for each dev and main commit by @soxoj in https://github.com/soxoj/maigret/pull/1929
* Updated PyInstaller workflow by @soxoj in https://github.com/soxoj/maigret/pull/1930
* Documentation update by @soxoj in https://github.com/soxoj/maigret/pull/1931
* Fixed Figma check and some bugs by @soxoj in https://github.com/soxoj/maigret/pull/1932
* Bump six from 1.16.0 to 1.17.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1933
* Activation mechanism documentation added by @soxoj in https://github.com/soxoj/maigret/pull/1935
* Readme/docs update based on GH discussions by @soxoj in https://github.com/soxoj/maigret/pull/1936
* Bump aiohttp from 3.11.9 to 3.11.10 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1937
* Weibo site check fix, activation mechanism added by @soxoj in https://github.com/soxoj/maigret/pull/1938
* Fixed Ebay and BongaCams checks by @soxoj in https://github.com/soxoj/maigret/pull/1939
* Sites fixes by @soxoj in https://github.com/soxoj/maigret/pull/1940
* Fixed Linktr and discourse.mozilla.org by @soxoj in https://github.com/soxoj/maigret/pull/1941
* Refactored self-check method, code formatting, small lint fixes by @soxoj in https://github.com/soxoj/maigret/pull/1942
* Refactoring, test coverage increased to 60% by @soxoj in https://github.com/soxoj/maigret/pull/1943
* Added a test for submitter by @soxoj in https://github.com/soxoj/maigret/pull/1944
* Update README.md by @soxoj in https://github.com/soxoj/maigret/pull/1949
* Updated OP.GG checks by @soxoj in https://github.com/soxoj/maigret/pull/1950
* Fixed ProductHunt check by @soxoj in https://github.com/soxoj/maigret/pull/1951
* Improved check feature extraction function, added tests by @soxoj in https://github.com/soxoj/maigret/pull/1952
* Submit improvements and site check fixes by @soxoj in https://github.com/soxoj/maigret/pull/1956
* chore: update submit.py by @eltociear in https://github.com/soxoj/maigret/pull/1957
* Fixed Gravatar parsing (socid_extractor) by @soxoj in https://github.com/soxoj/maigret/pull/1958
* Site check fixes by @soxoj in https://github.com/soxoj/maigret/pull/1962
* fix bad linux filename generation by @overcuriousity in https://github.com/soxoj/maigret/pull/1961
* Bump pytest-asyncio from 0.24.0 to 0.25.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1963
* Fixed flaky tests to check cookies by @soxoj in https://github.com/soxoj/maigret/pull/1965
* Preparation of 0.5.0 alpha version by @soxoj in https://github.com/soxoj/maigret/pull/1966
* Created web frontend launched via --web flag by @overcuriousity in https://github.com/soxoj/maigret/pull/1967
* Bump certifi from 2024.8.30 to 2024.12.14 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1969
* Bump attrs from 24.2.0 to 24.3.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1970
* Added web interface docs by @soxoj in https://github.com/soxoj/maigret/pull/1972
* Small docs and parameters fixes for web interface mode by @soxoj in https://github.com/soxoj/maigret/pull/1973
* [ImgBot] Optimize images by @imgbot[bot] in https://github.com/soxoj/maigret/pull/1974
* Improving the web interface by @overcuriousity in https://github.com/soxoj/maigret/pull/1975
* make graph more meaningful by @overcuriousity in https://github.com/soxoj/maigret/pull/1977
* Async generator-executor for site checks by @soxoj in https://github.com/soxoj/maigret/pull/1978
* Bump aiohttp from 3.11.10 to 3.11.11 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1979
* Bump psutil from 6.1.0 to 6.1.1 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1980
* Bump aiohttp-socks from 0.9.1 to 0.10.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1985
* Bump mypy from 1.13.0 to 1.14.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1983
* Bump aiohttp-socks from 0.10.0 to 0.10.1 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1987
* Bump jinja2 from 3.1.4 to 3.1.5 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1982
* Bump coverage from 7.6.9 to 7.6.10 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1986
* Bump pytest-asyncio from 0.25.0 to 0.25.1 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1989
* Bump mypy from 1.14.0 to 1.14.1 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1988
* Bump pytest-asyncio from 0.25.1 to 0.25.2 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/1990
* docs: update usage-examples.rst by @eltociear in https://github.com/soxoj/maigret/pull/1996
* upload-artifact action in python test workflow updated to v4 by @soxoj in https://github.com/soxoj/maigret/pull/2024
* Pass db_file configuration to web interface by @pykereaper in https://github.com/soxoj/maigret/pull/2019
* Fix usage of data.json files from web by @pykereaper in https://github.com/soxoj/maigret/pull/2020
* Bump black from 24.10.0 to 25.1.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/2001
* Important Update Installer.bat by @CatchySmile in https://github.com/soxoj/maigret/pull/1994
* Bump cryptography from 44.0.0 to 44.0.1 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/2005
* Bump jinja2 from 3.1.5 to 3.1.6 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/2011
* [#2010] Add 6 more websites to manage by @pylapp in https://github.com/soxoj/maigret/pull/2009
* Bump flask from 3.1.0 to 3.1.1 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/2028
* Bump requests from 2.32.3 to 2.32.4 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/2026
* Bump pycares from 4.5.0 to 4.9.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/2025
* Bump pytest-asyncio from 0.25.2 to 0.26.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/2016
* Bump urllib3 from 2.2.3 to 2.5.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/2027
* Disable ICQ site by @Echo-Darlyson in https://github.com/soxoj/maigret/pull/1993
* Bump attrs from 24.3.0 to 25.3.0 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/2014
* Bump certifi from 2024.12.14 to 2025.1.31 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/2004
* Bump typing-extensions from 4.12.2 to 4.14.1 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/2038
* Disable AskFM by @MR-VL in https://github.com/soxoj/maigret/pull/2037
* Bump platformdirs from 4.3.6 to 4.3.8 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/2033
* Bump coverage from 7.6.10 to 7.9.2 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/2039
* Bump aiohttp from 3.11.11 to 3.12.14 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/2041
* Bump yarl from 1.18.3 to 1.20.1 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/2032
* Fixed test dialog_adds_site_negative by @soxoj in https://github.com/soxoj/maigret/pull/2107
* Bump reportlab from 4.2.5 to 4.4.3 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/2063
* Bump asgiref from 3.8.1 to 3.9.1 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/2040
* Bump multidict from 6.1.0 to 6.6.3 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/2034
* Bump pytest-rerunfailures from 15.0 to 15.1 by @dependabot[bot] in https://github.com/soxoj/maigret/pull/2030

**Full Changelog**: https://github.com/soxoj/maigret/compare/v0.4.4...v0.5.0

## [0.4.4] - 2022-09-03
* Fixed some false positives by @soxoj in https://github.com/soxoj/maigret/pull/433
* Drop Python 3.6 support by @soxoj in https://github.com/soxoj/maigret/pull/434
* Bump xhtml2pdf from 0.2.5 to 0.2.7 by @dependabot in https://github.com/soxoj/maigret/pull/409
* Bump reportlab from 3.6.6 to 3.6.9 by @dependabot in https://github.com/soxoj/maigret/pull/403
* Bump markupsafe from 2.0.1 to 2.1.1 by @dependabot in https://github.com/soxoj/maigret/pull/389
* Bump pycountry from 22.1.10 to 22.3.5 by @dependabot in https://github.com/soxoj/maigret/pull/384
* Bump pypdf2 from 1.26.0 to 1.27.4 by @dependabot in https://github.com/soxoj/maigret/pull/438
* Update GH actions by @soxoj in https://github.com/soxoj/maigret/pull/439
* Bump tqdm from 4.63.0 to 4.64.0 by @dependabot in https://github.com/soxoj/maigret/pull/440
* Bump jinja2 from 3.0.3 to 3.1.1 by @dependabot in https://github.com/soxoj/maigret/pull/441
* Bump soupsieve from 2.3.1 to 2.3.2 by @dependabot in https://github.com/soxoj/maigret/pull/436
* Bump pypdf2 from 1.26.0 to 1.27.4 by @dependabot in https://github.com/soxoj/maigret/pull/442
* Bump pyvis from 0.1.9 to 0.2.0 by @dependabot in https://github.com/soxoj/maigret/pull/443
* Bump pypdf2 from 1.27.4 to 1.27.6 by @dependabot in https://github.com/soxoj/maigret/pull/448
* Bump typing-extensions from 4.1.1 to 4.2.0 by @dependabot in https://github.com/soxoj/maigret/pull/447
* Bump soupsieve from 2.3.2 to 2.3.2.post1 by @dependabot in https://github.com/soxoj/maigret/pull/444
* Bump pypdf2 from 1.27.6 to 1.27.7 by @dependabot in https://github.com/soxoj/maigret/pull/449
* Bump pypdf2 from 1.27.7 to 1.27.8 by @dependabot in https://github.com/soxoj/maigret/pull/450
* XMind 8 report warning and some docs update by @soxoj in https://github.com/soxoj/maigret/pull/452
* False positive fixes 24.04.22 by @soxoj in https://github.com/soxoj/maigret/pull/455
* Bump pypdf2 from 1.27.8 to 1.27.9 by @dependabot in https://github.com/soxoj/maigret/pull/456
* Bump pytest from 7.0.1 to 7.1.2 by @dependabot in https://github.com/soxoj/maigret/pull/457
* Bump jinja2 from 3.1.1 to 3.1.2 by @dependabot in https://github.com/soxoj/maigret/pull/460
* Ubisoft forums addition by @fen0s in https://github.com/soxoj/maigret/pull/461
* Add BYOND, Figma, BeatStars by @fen0s in https://github.com/soxoj/maigret/pull/462
* fix Figma username definition, add a bunch of sites by @fen0s in https://github.com/soxoj/maigret/pull/464
* Bump pypdf2 from 1.27.9 to 1.27.10 by @dependabot in https://github.com/soxoj/maigret/pull/465
* Bump pypdf2 from 1.27.10 to 1.27.12 by @dependabot in https://github.com/soxoj/maigret/pull/466
* Sites fixes 05 05 22 by @soxoj in https://github.com/soxoj/maigret/pull/469
* Bump pyvis from 0.2.0 to 0.2.1 by @dependabot in https://github.com/soxoj/maigret/pull/472
* Social analyzer websites, also fixing presence strs by @fen0s in https://github.com/soxoj/maigret/pull/471
* Updated logic of false positive risk estimating by @soxoj in https://github.com/soxoj/maigret/pull/475
* Improved usability of external progressbar func by @soxoj in https://github.com/soxoj/maigret/pull/476
* New sites added, some tags/rank update by @soxoj in https://github.com/soxoj/maigret/pull/477
* Added new sites by @soxoj in https://github.com/soxoj/maigret/pull/480
* Added new forums, updated ranks, some utils improvements by @soxoj in https://github.com/soxoj/maigret/pull/481
* Disabled sites with false positives results by @soxoj in https://github.com/soxoj/maigret/pull/482
* Bump certifi from 2021.10.8 to 2022.5.18.1 by @dependabot in https://github.com/soxoj/maigret/pull/488
* Bump psutil from 5.9.0 to 5.9.1 by @dependabot in https://github.com/soxoj/maigret/pull/490
* Bump pypdf2 from 1.27.12 to 1.28.1 by @dependabot in https://github.com/soxoj/maigret/pull/491
* Bump pypdf2 from 1.28.1 to 1.28.2 by @dependabot in https://github.com/soxoj/maigret/pull/493
* added and fixed some websites in data.json by @kustermariocoding in https://github.com/soxoj/maigret/pull/494
* Bump pypdf2 from 1.28.2 to 2.0.0 by @dependabot in https://github.com/soxoj/maigret/pull/504
* Bump pefile from 2021.9.3 to 2022.5.30 by @dependabot in https://github.com/soxoj/maigret/pull/499
* Updated sites list, added disabled Anilist by @soxoj in https://github.com/soxoj/maigret/pull/502
* Bump lxml from 4.8.0 to 4.9.0 by @dependabot in https://github.com/soxoj/maigret/pull/503
* Compatibility with Python 3.10 by @soxoj in https://github.com/soxoj/maigret/pull/509
* feat: add .log & .bak files to gitignore in https://github.com/soxoj/maigret/pull/511
* fix some sites and delete abandoned by @fen0s in https://github.com/soxoj/maigret/pull/526
* Fixesjulyfirst by @fen0s in https://github.com/soxoj/maigret/pull/533
* yazbel, aboutcar, zhihu by @fen0s in https://github.com/soxoj/maigret/pull/531
* Fixes july third by @fen0s in https://github.com/soxoj/maigret/pull/535
* Update data.json by @fen0s in https://github.com/soxoj/maigret/pull/539
* Update data.json by @fen0s in https://github.com/soxoj/maigret/pull/540
* Bump reportlab from 3.6.9 to 3.6.11 by @dependabot in https://github.com/soxoj/maigret/pull/543
* Bump requests from 2.27.1 to 2.28.1 by @dependabot in https://github.com/soxoj/maigret/pull/530
* Bump pypdf2 from 2.0.0 to 2.5.0 by @dependabot in https://github.com/soxoj/maigret/pull/542
* Bump xhtml2pdf from 0.2.7 to 0.2.8 by @dependabot in https://github.com/soxoj/maigret/pull/522
* Bump lxml from 4.9.0 to 4.9.1 by @dependabot in https://github.com/soxoj/maigret/pull/538
* disable yandex music + set utf8 encoding by @fen0s in https://github.com/soxoj/maigret/pull/562
* fix false positives by @fen0s in https://github.com/soxoj/maigret/pull/577
* disable Instagram, fix two false positives by @fen0s in https://github.com/soxoj/maigret/pull/578
* Bump certifi from 2022.5.18.1 to 2022.6.15 by @dependabot in https://github.com/soxoj/maigret/pull/551
* August15 by @fen0s in https://github.com/soxoj/maigret/pull/591
* Bump pytest-httpserver from 1.0.4 to 1.0.5 by @dependabot in https://github.com/soxoj/maigret/pull/583
* Bump typing-extensions from 4.2.0 to 4.3.0 by @dependabot in https://github.com/soxoj/maigret/pull/549
* Bump colorama from 0.4.4 to 0.4.5 by @dependabot in https://github.com/soxoj/maigret/pull/548
* Bump chardet from 4.0.0 to 5.0.0 by @dependabot in https://github.com/soxoj/maigret/pull/550
* Bump cloudscraper from 1.2.60 to 1.2.63 by @dependabot in https://github.com/soxoj/maigret/pull/600
* Bump flake8 from 4.0.1 to 5.0.4 by @dependabot in https://github.com/soxoj/maigret/pull/598
* Bump attrs from 21.4.0 to 22.1.0 by @dependabot in https://github.com/soxoj/maigret/pull/597
* Bump pytest-asyncio from 0.18.2 to 0.19.0 by @dependabot in https://github.com/soxoj/maigret/pull/601
* Bump pypdf2 from 2.5.0 to 2.10.4 by @dependabot in https://github.com/soxoj/maigret/pull/606
* Bump pytest from 7.1.2 to 7.1.3 by @dependabot in https://github.com/soxoj/maigret/pull/613
* Update sites.md -Gitmemory.com suppression by @C3n7ral051nt4g3ncy in https://github.com/soxoj/maigret/pull/610
* Bump cloudscraper from 1.2.63 to 1.2.64 by @dependabot in https://github.com/soxoj/maigret/pull/614
* Bump pycountry from 22.1.10 to 22.3.5 by @dependabot in https://github.com/soxoj/maigret/pull/607
* add ProtonMail, disable 3 broken sites by @fen0s in https://github.com/soxoj/maigret/pull/619
* Bump tqdm from 4.64.0 to 4.64.1 by @dependabot in https://github.com/soxoj/maigret/pull/618

**Full Changelog**: https://github.com/soxoj/maigret/compare/v0.4.3...v0.4.4

## [0.4.3] - 2022-04-13
* Added Sites to data.json by @kustermariocoding in https://github.com/soxoj/maigret/pull/386
* added new Websites to data.json by @kustermariocoding in https://github.com/soxoj/maigret/pull/390
* Skipped broken tests by @soxoj in https://github.com/soxoj/maigret/pull/397
* Added new Websites to data.json by @kustermariocoding in https://github.com/soxoj/maigret/pull/401
* Added new Websites to data.json by @kustermariocoding in https://github.com/soxoj/maigret/pull/404
* Updated statistics by @soxoj in https://github.com/soxoj/maigret/pull/406
* Added new Websites to data.json by @kustermariocoding in https://github.com/soxoj/maigret/pull/413
* Disabled houzz.com, updated sites statistics by @soxoj in https://github.com/soxoj/maigret/pull/422
* Fixed last false positives by @soxoj in https://github.com/soxoj/maigret/pull/424
* Fixed actual false positives by @soxoj in https://github.com/soxoj/maigret/pull/431

**Full Changelog**: https://github.com/soxoj/maigret/compare/v0.4.2...v0.4.3

## [0.4.2] - 2022-03-07
* [ImgBot] Optimize images by @imgbot in https://github.com/soxoj/maigret/pull/319
* Bump pytest-asyncio from 0.17.0 to 0.17.1 by @dependabot in https://github.com/soxoj/maigret/pull/321
* Bump pytest-asyncio from 0.17.1 to 0.17.2 by @dependabot in https://github.com/soxoj/maigret/pull/323
* Disabled Ruboard by @soxoj in https://github.com/soxoj/maigret/pull/327
* Disable kinooh, sites list update workflow added by @soxoj in https://github.com/soxoj/maigret/pull/329
* Bump multidict from 5.2.0 to 6.0.1 by @dependabot in https://github.com/soxoj/maigret/pull/332
* Bump multidict from 6.0.1 to 6.0.2 by @dependabot in https://github.com/soxoj/maigret/pull/333
* Bump pytest-httpserver from 1.0.3 to 1.0.4 by @dependabot in https://github.com/soxoj/maigret/pull/334
* Bump pytest from 6.2.5 to 7.0.0 by @dependabot in https://github.com/soxoj/maigret/pull/339
* Bump pytest-asyncio from 0.17.2 to 0.18.0 by @dependabot in https://github.com/soxoj/maigret/pull/340
* Bump pytest-asyncio from 0.18.0 to 0.18.1 by @dependabot in https://github.com/soxoj/maigret/pull/343
* Bump pytest from 7.0.0 to 7.0.1 by @dependabot in https://github.com/soxoj/maigret/pull/345
* Bump typing-extensions from 4.0.1 to 4.1.1 by @dependabot in https://github.com/soxoj/maigret/pull/346
* Bump lxml from 4.7.1 to 4.8.0 by @dependabot in https://github.com/soxoj/maigret/pull/350
* Pin reportlab version by @cyb3rk0tik in https://github.com/soxoj/maigret/pull/351
* Fix reportlab not only for testing by @cyb3rk0tik in https://github.com/soxoj/maigret/pull/352
* Added some scripts by @soxoj in https://github.com/soxoj/maigret/pull/355
* Added package publishing instruction by @soxoj in https://github.com/soxoj/maigret/pull/356
* Added DB statistics autoupdate and write to sites.md by @soxoj in https://github.com/soxoj/maigret/pull/357
* CI autoupdate by @soxoj in https://github.com/soxoj/maigret/pull/359
* Op.gg fixes by @soxoj in https://github.com/soxoj/maigret/pull/363
* Wikipedia fix by @soxoj in https://github.com/soxoj/maigret/pull/365
* Disabled Netvibes and LeetCode by @soxoj in https://github.com/soxoj/maigret/pull/366
* Fixed several false positives, improved statistics info by @soxoj in https://github.com/soxoj/maigret/pull/368
* Fix false positives by @soxoj in https://github.com/soxoj/maigret/pull/370
* Fixed the rest of false positives for now by @soxoj in https://github.com/soxoj/maigret/pull/371
* Fix false positive and CI by @soxoj in https://github.com/soxoj/maigret/pull/372
* Added new sites to data.json by @kustermariocoding in https://github.com/soxoj/maigret/pull/375
* Fixed issue with str alexaRank by @soxoj in https://github.com/soxoj/maigret/pull/382
* Bump tqdm from 4.62.3 to 4.63.0 by @dependabot in https://github.com/soxoj/maigret/pull/374
* Bump pytest-asyncio from 0.18.1 to 0.18.2 by @dependabot in https://github.com/soxoj/maigret/pull/380
* @imgbot made their first contribution in https://github.com/soxoj/maigret/pull/319
* @kustermariocoding made their first contribution in https://github.com/soxoj/maigret/pull/375

**Full Changelog**: https://github.com/soxoj/maigret/compare/v0.4.1...v0.4.2

## [0.4.1] - 2022-01-15
* Added a dozen sites, improved submit mode by @soxoj in https://github.com/soxoj/maigret/pull/288
* Bump requests from 2.26.0 to 2.27.0 by @dependabot in https://github.com/soxoj/maigret/pull/292
* changed Bayoushooter to use XenForo and foursquare to use correct checkType by @antomarsi in https://github.com/soxoj/maigret/pull/289
* Bump requests from 2.27.0 to 2.27.1 by @dependabot in https://github.com/soxoj/maigret/pull/293
* Added aparat.com by @soxoj in https://github.com/soxoj/maigret/pull/294
* Fixed BongaCams, links parsing improved by @soxoj in https://github.com/soxoj/maigret/pull/297
* Temporary fix for Twitter (#299) by @soxoj in https://github.com/soxoj/maigret/pull/300
* Fixed TikTok checks (#303) by @soxoj in https://github.com/soxoj/maigret/pull/306
* Bump pycountry from 20.7.3 to 22.1.10 by @dependabot in https://github.com/soxoj/maigret/pull/313
* Pornhub search improved by @soxoj in https://github.com/soxoj/maigret/pull/315
* Codecademy fixed by @soxoj in https://github.com/soxoj/maigret/pull/316
* Bump pytest-asyncio from 0.16.0 to 0.17.0 by @dependabot in https://github.com/soxoj/maigret/pull/314

**Full Changelog**: https://github.com/soxoj/maigret/compare/v0.4.0...v0.4.1

## [0.4.0] - 2022-01-03
* Delayed import of requests module, speed check command, reqs updated by @soxoj in https://github.com/soxoj/maigret/pull/189
* Snapcraft yaml added by @soxoj in https://github.com/soxoj/maigret/pull/190
* Create codeql-analysis.yml by @soxoj in https://github.com/soxoj/maigret/pull/191
* Move wiki pages to ReadTheDocs by @egornagornov in https://github.com/soxoj/maigret/pull/194
* Created ReadTheDocs requirements file by @soxoj in https://github.com/soxoj/maigret/pull/195
* Fix incompatible version requirements by @JasperJuergensen in https://github.com/soxoj/maigret/pull/196
* Added link to documentation by @soxoj in https://github.com/soxoj/maigret/pull/198
* Upgraded base docker image by @soxoj in https://github.com/soxoj/maigret/pull/199
* Run CodeQL only after merge and each Saturday by @soxoj in https://github.com/soxoj/maigret/pull/201
* Added cascade settings loading from /.maigret/settings.json and ./settings.json by @soxoj in https://github.com/soxoj/maigret/pull/200
* Documentation and settings improved by @soxoj in https://github.com/soxoj/maigret/pull/203
* New config options added by @soxoj in https://github.com/soxoj/maigret/pull/204
* Added export of cli entrypoint by @soxoj in https://github.com/soxoj/maigret/pull/207
* Removed redundant logging by @soxoj in https://github.com/soxoj/maigret/pull/210
* PyInstaller workflow by @soxoj in https://github.com/soxoj/maigret/pull/206
* Create bug.md by @soxoj in https://github.com/soxoj/maigret/pull/213
* Fixed path and names of report files by @soxoj in https://github.com/soxoj/maigret/pull/216
* Box drawing logic improved, added new settings by @soxoj in https://github.com/soxoj/maigret/pull/217
* Fixes for win32 release by @soxoj in https://github.com/soxoj/maigret/pull/218
* Bump six from 1.15.0 to 1.16.0 by @dependabot in https://github.com/soxoj/maigret/pull/221
* Bump flake8 from 3.8.4 to 4.0.1 by @dependabot in https://github.com/soxoj/maigret/pull/219
* Bump aiohttp from 3.7.4 to 3.8.0 by @dependabot in https://github.com/soxoj/maigret/pull/220
* Bump aiohttp-socks from 0.5.5 to 0.6.0 by @dependabot in https://github.com/soxoj/maigret/pull/222
* Bump typing-extensions from 3.7.4.3 to 3.10.0.2 by @dependabot in https://github.com/soxoj/maigret/pull/224
* Bump multidict from 5.1.0 to 5.2.0 by @dependabot in https://github.com/soxoj/maigret/pull/225
* Bump idna from 2.10 to 3.3 by @dependabot in https://github.com/soxoj/maigret/pull/228
* Bump pytest-cov from 2.10.1 to 3.0.0 by @dependabot in https://github.com/soxoj/maigret/pull/227
* Bump mock from 4.0.2 to 4.0.3 by @dependabot in https://github.com/soxoj/maigret/pull/226
* Bump certifi from 2020.12.5 to 2021.10.8 by @dependabot in https://github.com/soxoj/maigret/pull/233
* Bump pytest-httpserver from 1.0.0 to 1.0.2 by @dependabot in https://github.com/soxoj/maigret/pull/232
* Bump lxml from 4.6.3 to 4.6.4 by @dependabot in https://github.com/soxoj/maigret/pull/231
* Bump pefile from 2019.4.18 to 2021.9.3 by @dependabot in https://github.com/soxoj/maigret/pull/229
* Bump pytest-rerunfailures from 9.1.1 to 10.2 by @dependabot in https://github.com/soxoj/maigret/pull/230
* Bump yarl from 1.6.3 to 1.7.2 by @dependabot in https://github.com/soxoj/maigret/pull/237
* Bump async-timeout from 4.0.0 to 4.0.1 by @dependabot in https://github.com/soxoj/maigret/pull/236
* Bump psutil from 5.7.0 to 5.8.0 by @dependabot in https://github.com/soxoj/maigret/pull/234
* Bump jinja2 from 3.0.2 to 3.0.3 by @dependabot in https://github.com/soxoj/maigret/pull/235
* Bump pytest from 6.2.4 to 6.2.5 by @dependabot in https://github.com/soxoj/maigret/pull/238
* Bump tqdm from 4.55.0 to 4.62.3 by @dependabot in https://github.com/soxoj/maigret/pull/242
* Bump arabic-reshaper from 2.1.1 to 2.1.3 by @dependabot in https://github.com/soxoj/maigret/pull/243
* Bump pytest-asyncio from 0.14.0 to 0.16.0 by @dependabot in https://github.com/soxoj/maigret/pull/240
* Bump chardet from 3.0.4 to 4.0.0 by @dependabot in https://github.com/soxoj/maigret/pull/241
* Bump soupsieve from 2.1 to 2.3.1 by @dependabot in https://github.com/soxoj/maigret/pull/239
* Bump aiohttp from 3.8.0 to 3.8.1 by @dependabot in https://github.com/soxoj/maigret/pull/246
* Bump typing-extensions from 3.10.0.2 to 4.0.0 by @dependabot in https://github.com/soxoj/maigret/pull/245
* Bump aiohttp-socks from 0.6.0 to 0.6.1 by @dependabot in https://github.com/soxoj/maigret/pull/249
* Bump aiohttp-socks from 0.6.1 to 0.7.1 by @dependabot in https://github.com/soxoj/maigret/pull/250
* Bump typing-extensions from 4.0.0 to 4.0.1 by @dependabot in https://github.com/soxoj/maigret/pull/253
* Fixed some false positives by @soxoj in https://github.com/soxoj/maigret/pull/254
* Disabled non-working sites by @soxoj in https://github.com/soxoj/maigret/pull/255
* Added false results buttons to reports, fixed some falses by @soxoj in https://github.com/soxoj/maigret/pull/256
* Fixed xHamster, added support of proxies to self-check mode by @soxoj in https://github.com/soxoj/maigret/pull/259
* Disabled non-working sites, updated public sites list by @soxoj in https://github.com/soxoj/maigret/pull/263
* Bump lxml from 4.6.4 to 4.6.5 by @dependabot in https://github.com/soxoj/maigret/pull/266
* Bump lxml from 4.6.5 to 4.7.1 by @dependabot in https://github.com/soxoj/maigret/pull/269
* Bump pytest-httpserver from 1.0.2 to 1.0.3 by @dependabot in https://github.com/soxoj/maigret/pull/270
* Fixed failed tests (thx to Meta aka Facebook) by @soxoj in https://github.com/soxoj/maigret/pull/273
* Fixed votetags, updated issue template by @soxoj in https://github.com/soxoj/maigret/pull/278
* Bump async-timeout from 4.0.1 to 4.0.2 by @dependabot in https://github.com/soxoj/maigret/pull/275
* Fixed some false positives by @soxoj in https://github.com/soxoj/maigret/pull/280
* Bump attrs from 21.2.0 to 21.3.0 by @dependabot in https://github.com/soxoj/maigret/pull/281
* Bump psutil from 5.8.0 to 5.9.0 by @dependabot in https://github.com/soxoj/maigret/pull/282
* Bump attrs from 21.3.0 to 21.4.0 by @dependabot in https://github.com/soxoj/maigret/pull/283

**Full Changelog**: https://github.com/soxoj/maigret/compare/v0.3.1...v0.4.0

## [0.3.1] - 2021-10-31
* fixed false positives
* made maigret start 3 times faster

## [0.3.0] - 2021-06-02
* added support for Tor and I2P sites
* added experimental DNS checking feature
* implemented sorting by data points for reports
* reports fixes

## [0.2.4] - 2021-05-18
* cli output report
* various improvements

## [0.2.3] - 2021-05-12
* added Yelp and yelp_userid support
* tags markup stabilization
* improved error detection

## [0.2.2] - 2021-05-07
* improved ID extractors
* updated sites and engines
* updated CLI options

## [0.2.1] - 2021-05-02
* fixed json reports generation bug, added tests

## [0.2.0] - 2021-05-02
* added `--retries` option
* added `source` feature for sites' mirrors
* improved `submit` mode
* lots of style and logic fixes

## [0.1.20] - 2021-05-02 [YANKED]

## [0.1.19] - 2021-04-14
* added `--no-progressbar` option
* fixed ascii tree bug
* fixed `python -m maigret` run
* fixed requests freeze with timeout async tasks

## [0.1.18] - 2021-03-30
* some API improvements

## [0.1.17] - 2021-03-30
* simplified maigret search API
* improved documentation
* fixed 403 response code ignoring bug

## [0.1.16] - 2021-03-21
* improved URL parsing mode
* improved sites submit mode
* added uID.me uguid support
* improved requests processing

## [0.1.15] - 2021-03-14
* improved HTML reports
* fixed python-3.6-specific error
* false positives fixes

## [0.1.14] - 2021-02-25
* added JSON export formats
* improved tags markup
* implemented username detection in userinfo links
* added DB stats CLI option
* added site submit logic and CLI option
* added Spotify parsing activation
* main logic refactoring
* fixed Dockerfile
* fixed requirements

## [0.1.13] - 2021-02-06
* improved sites list filtering
* pretty console messages
* Yandex services updates
* false positives fixes

## [0.1.12] - 2021-01-28
* added support for custom cookies
* fixed lots of false positives

## [0.1.11] - 2021-01-16
* tags and custom data checks bugfixes
* added parsing activation logic

## [0.1.10] - 2021-01-13
* added report static resources into package

## [0.1.9] - 2021-01-11
* added HTML and PDF report export
* fixed support of Python 3.6
* fixed tags filtering and ranking
* more than 2000 sites supported
* refactored sites and engines logic
* added tests

## [0.1.8] - 2020-12-31
* added XMind export
* more than 1500 sites supported
* parallel processing of requests

## [0.1.7] - 2020-12-11
* fixed proxies support
* fixed aiohttp stuff to prevent python 3.7 bugs
* fixed self-checking database saving error

## [0.1.6] - 2020-12-05
* fixed Dockerfile and README

## [0.1.5] - 2020-12-05 [YANKED]

## [0.1.4] - 2020-12-05 [YANKED]

## [0.1.3] - 2020-12-05 [YANKED]

## [0.1.2] - 2020-12-05 [YANKED]

## [0.1.1] - 2020-12-05 [YANKED]

## [0.1.0] - 2020-12-05
* initial release

================================================
FILE: CODE_OF_CONDUCT.md
================================================
# Contributor Covenant Code of Conduct

## Our Pledge

We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, religion, or sexual identity
and orientation.

We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.

## Our Standards

Examples of behavior that contributes to a positive environment for our
community include:

* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
  and learning from the experience
* Focusing on what is best not just for us as individuals, but for the
  overall community

Examples of unacceptable behavior include:

* The use of sexualized language or imagery, and sexual attention or
  advances of any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email
  address, without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
  professional setting

## Enforcement Responsibilities

Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.

Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.

## Scope

This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official e-mail address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.

## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at
https://t.me/soxoj.
All complaints will be reviewed and investigated promptly and fairly.

All community leaders are obligated to respect the privacy and security of the
reporter of any incident.

## Enforcement Guidelines

Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:

### 1. Correction

**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.

**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.

### 2. Warning

**Community Impact**: A violation through a single incident or series
of actions.

**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or
permanent ban.

### 3. Temporary Ban

**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.

**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.

### 4. Permanent Ban

**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.

**Consequence**: A permanent ban from any sort of public interaction within
the community.

## Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.0, available at
https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.

Community Impact Guidelines were inspired by [Mozilla's code of conduct
enforcement ladder](https://github.com/mozilla/diversity).

[homepage]: https://www.contributor-covenant.org

For answers to common questions about this code of conduct, see the FAQ at
https://www.contributor-covenant.org/faq. Translations are available at
https://www.contributor-covenant.org/translations.


================================================
FILE: CONTRIBUTING.md
================================================
# How to contribute

Hey! I'm really glad you're reading this. Maigret contains a lot of sites, and it is very hard to keep all the sites operational. That's why any fix is important. 

## Code of Conduct

Please read and follow the [Code of Conduct](CODE_OF_CONDUCT.md) to foster a welcoming and inclusive community.

## How to add a new site

#### Beginner level

You can use Maigret **submit mode** (`maigret --submit URL`) to add a new site or update an existing site. In this mode, Maigret automatically analyzes the given account URL or site main page URL to determine the site engine and the methods for checking account presence. After the check, Maigret asks whether you want to add the site; answering y/Y rewrites the local database.

#### Advanced level

You can edit [the database JSON file](https://github.com/soxoj/maigret/blob/main/maigret/resources/data.json) (`./maigret/resources/data.json`) manually.
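
Before editing the file by hand, it can help to sanity-check a new entry. The sketch below is only an illustration: the field names are the ones visible in existing site records, but treating them as required is an assumption (entries that rely on an `engine` may legitimately omit some of them), and `find_entry_problems` and the sample entry are hypothetical, not part of Maigret.

```python
# Hypothetical helper, not part of Maigret: sanity-check a site entry
# before pasting it into data.json.
REQUIRED_FIELDS = ("url", "urlMain", "checkType", "usernameClaimed", "usernameUnclaimed")
KNOWN_CHECK_TYPES = ("message", "status_code", "response_url")


def find_entry_problems(entry: dict) -> list[str]:
    """Return human-readable problems; an empty list means the entry looks sane."""
    problems = [f"missing field: {field}" for field in REQUIRED_FIELDS if field not in entry]
    if entry.get("checkType") not in KNOWN_CHECK_TYPES:
        problems.append(f"unknown checkType: {entry.get('checkType')!r}")
    if "{username}" not in entry.get("url", ""):
        problems.append("url has no {username} placeholder")
    # A message check needs at least one marker string to look for.
    if entry.get("checkType") == "message" and not (
        entry.get("presenceStrs") or entry.get("absenceStrs")
    ):
        problems.append("message check needs presenceStrs or absenceStrs")
    return problems


# Made-up entry for a site that does not exist.
entry = {
    "url": "https://example.com/users/{username}",
    "urlMain": "https://example.com/",
    "checkType": "message",
    "absenceStrs": ["User not found"],
    "usernameClaimed": "blue",
    "usernameUnclaimed": "noonewouldeverusethis7",
}
print(find_entry_problems(entry))
```

A check like this does not replace `maigret --submit` or the CI checks; it only catches obvious omissions early.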

## Testing

Every PR to the Maigret repository goes through CI checks, but it is better to run `make format`, `make lint`, and `make test` locally first to make sure your changes are correct.

## Submitting changes

To submit your changes, [send a GitHub PR](https://github.com/soxoj/maigret/pulls) to the Maigret project.
Always write a clear log message for your commits. One-line messages are fine for small changes, but bigger changes should look like this:

    $ git commit -m "A brief summary of the commit
    > 
    > A paragraph describing what changed and its impact."

## Coding conventions

### General Guidelines

- Try to follow [PEP 8](https://www.python.org/dev/peps/pep-0008/) for Python code style.
- Ensure your code passes all tests before submitting a pull request.

### Code Style

- **Indentation**: Use 4 spaces per indentation level.
- **Imports**: 
  - Standard library imports should be placed at the top.
  - Third-party imports should follow.
  - Group imports logically.

### Naming Conventions

- **Variables and Functions**: Use `snake_case`.
- **Classes**: Use `CamelCase`.
- **Constants**: Use `UPPER_CASE`.
  
Start reading the code and you'll get the hang of it. ;)

================================================
FILE: Dockerfile
================================================
FROM python:3.11-slim
LABEL maintainer="Soxoj <soxoj@protonmail.com>"
WORKDIR /app
RUN pip install --no-cache-dir --upgrade pip
RUN apt-get update && \
    apt-get install --no-install-recommends -y \
      gcc \
      musl-dev \
      libxml2 \
      libxml2-dev \
      libxslt-dev \
    && \
    rm -rf /var/lib/apt/lists/* /tmp/*
COPY . .
RUN YARL_NO_EXTENSIONS=1 python3 -m pip install --no-cache-dir .
# For production use, set FLASK_HOST to a specific IP address for security
ENV FLASK_HOST=0.0.0.0
ENTRYPOINT ["maigret"]


================================================
FILE: Installer.bat
================================================
@echo off
goto check_Permissions

:check_Permissions
net session >nul 2>&1
if %errorLevel% == 0 (
    echo Success: Elevated permissions granted.
) else (
    echo Failure: Requires elevated permissions.
    pause >nul
)

cls
echo --------------------------------------------------------
echo          Python 3.8 or higher and pip3 required.
echo --------------------------------------------------------
echo             Press [I] to begin installation.
echo             Press [R] If already installed.
echo --------------------------------------------------------
choice /c IR
if %errorlevel%==1 goto check_python
if %errorlevel%==2 goto after

:check_python
cls
for /f "tokens=2 delims= " %%i in ('python --version 2^>nul') do (
    for /f "tokens=1,2 delims=." %%j in ("%%i") do (
        if %%j GEQ 3 (
            if %%k GEQ 8 (
                goto check_pip
            )
        )
    )
)
echo Python 3.8 or higher is required. Please install it first.
pause
exit /b

:check_pip
pip --version 2>nul | findstr /r /c:"pip" >nul
if %errorlevel% neq 0 (
    echo pip is required. Please install it first.
    pause
    exit /b
)
goto install1

:install1
cls
echo ========================================================
echo                    Maigret Installation
echo ========================================================
echo.
echo --------------------------------------------------------
echo   If your pip installation is outdated, it could cause
echo         cryptography to fail on installation.
echo --------------------------------------------------------
echo          Check for and install pip 23.3.2 now?
echo --------------------------------------------------------
choice /c YN
if %errorlevel%==1 goto install2
if %errorlevel%==2 goto install3

:install2
cls
python -m pip install --upgrade pip==23.3.2
if %errorlevel% neq 0 (
    echo Failed to update pip to version 23.3.2. Please check your installation.
    pause
    exit /b
)
goto install3

:install3
cls
echo ========================================================
echo                   Maigret Installation
echo ========================================================
echo.
echo --------------------------------------------------------
echo Installing Maigret...
python -m pip install maigret
if %errorlevel% neq 0 (
    echo Failed to install Maigret. Please check your installation.
    pause
    exit /b
)
echo.
echo +------------------------------------------------------+
echo              Maigret installed successfully.           
echo +------------------------------------------------------+
pause
goto after

:after
cls
echo ========================================================
echo                     Maigret Usage
echo ========================================================
echo.
echo +--------------------------------------------------------+
echo To use Maigret, you can run the following command:
echo.
echo     maigret [options] [username]
echo.
echo For example, to search for a username:
echo.
echo     maigret example_username
echo.
echo For more options and usage details, refer to the Maigret documentation.
echo.
echo https://github.com/soxoj/maigret/blob/5b3b81b4822f6deb2e9c31eb95039907f25beb5e/README.md
echo +--------------------------------------------------------+
echo.
cmd
pause
exit /b
exit /b


================================================
FILE: LICENSE
================================================
MIT License

Copyright (c) 2019 Sherlock Project
Copyright (c) 2020-2021 Soxoj

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


================================================
FILE: MANIFEST.in
================================================
include LICENSE
include README.md
include requirements.txt
include maigret/resources/*


================================================
FILE: Makefile
================================================
LINT_FILES=maigret wizard.py tests

test:
	coverage run --source=./maigret,./maigret/web -m pytest tests
	coverage report -m
	coverage html

rerun-tests:
	pytest --lf -vv

lint:
	@echo 'syntax errors or undefined names'
	flake8 --count --select=E9,F63,F7,F82 --show-source --statistics ${LINT_FILES}

	@echo 'warning'
	flake8 --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics --ignore=E731,W503,E501 ${LINT_FILES}

	@echo 'mypy'
	mypy --check-untyped-defs ${LINT_FILES}

speed:
	time python3 -m maigret --version
	python3 -c "import timeit; t = timeit.Timer('import maigret'); print(t.timeit(number = 1000000))"
	python3 -X importtime -c "import maigret" 2> maigret-import.log
	python3 -m tuna maigret-import.log

format:
	@echo 'black'
	black --skip-string-normalization ${LINT_FILES}

pull:
	git stash
	git checkout main
	git pull origin main
	git stash pop

clean:
	rm -rf reports htmlcov dist

install:
	pip3 install .


================================================
FILE: README.md
================================================
# Maigret

<p align="center">
  <p align="center">
    <a href="https://pypi.org/project/maigret/">
        <img alt="PyPI version badge for Maigret" src="https://img.shields.io/pypi/v/maigret?style=flat-square" />
    </a>
    <a href="https://pypi.org/project/maigret/">  
        <img alt="PyPI download count for Maigret" src="https://img.shields.io/pypi/dw/maigret?style=flat-square" />
    </a>
    <a href="https://github.com/soxoj/maigret">
        <img alt="Minimum Python version required: 3.10+" src="https://img.shields.io/badge/Python-3.10%2B-brightgreen?style=flat-square" />
    </a>
    <a href="https://github.com/soxoj/maigret/blob/main/LICENSE">
        <img alt="License badge for Maigret" src="https://img.shields.io/github/license/soxoj/maigret?style=flat-square" />
    </a>
    <a href="https://github.com/soxoj/maigret">
        <img alt="View count for Maigret project" src="https://komarev.com/ghpvc/?username=maigret&color=brightgreen&label=views&style=flat-square" />
    </a>
  </p>
  <p align="center">
    <img src="https://raw.githubusercontent.com/soxoj/maigret/main/static/maigret.png" height="300"/>
  </p>
</p>

<i>The Commissioner Jules Maigret is a fictional French police detective, created by Georges Simenon. His investigation method is based on understanding the personality of different people and their interactions.</i>

<b>👉👉👉 [Online Telegram bot](https://t.me/osint_maigret_bot)</b>

## About

**Maigret** collects a dossier on a person **by username only**, checking for accounts on a huge number of sites and gathering all the available information from web pages. No API keys are required. Maigret is an easy-to-use and powerful fork of [Sherlock](https://github.com/sherlock-project/sherlock).

Maigret currently supports more than 3000 sites ([full list](https://github.com/soxoj/maigret/blob/main/sites.md)); by default, the search runs against the 500 most popular sites, in descending order of popularity. Checking Tor sites, I2P sites, and domains (via DNS resolving) is also supported.
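
The default selection of the most popular sites can be pictured roughly as follows. This is only a sketch: the `alexaRank` field name appears in the site records of `data.json`, while `top_sites`, the handling of unranked sites, and the sample entries are assumptions made for illustration and not Maigret's actual code.

```python
# Hypothetical helper (not part of Maigret's API): pick the most popular sites.
def top_sites(sites: dict[str, dict], limit: int = 500) -> list[str]:
    """Return up to `limit` site names, most popular (lowest rank) first."""
    # Sites without a rank sort last; this is an assumption of the sketch.
    ranked = sorted(sites, key=lambda name: sites[name].get("alexaRank", float("inf")))
    return ranked[:limit]


# Made-up sample data used only to exercise the helper.
sample = {
    "ExampleForum": {"alexaRank": 90210},
    "Vimeo": {"alexaRank": 148},
    "NoRankSite": {},
}
print(top_sites(sample, limit=2))
```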

## Powered By Maigret

These are professional tools for social media content analysis and OSINT investigations that use Maigret (banners are clickable).

<a href="https://github.com/SocialLinks-IO/sociallinks-api"><img height="60" alt="Social Links API" src="https://github.com/user-attachments/assets/789747b2-d7a0-4d4e-8868-ffc4427df660"></a>
<a href="https://sociallinks.io/products/sl-crimewall"><img height="60" alt="Social Links Crimewall" src="https://github.com/user-attachments/assets/0b18f06c-2f38-477b-b946-1be1a632a9d1"></a>
<a href="https://usersearch.ai/"><img height="60" alt="UserSearch" src="https://github.com/user-attachments/assets/66daa213-cf7d-40cf-9267-42f97cf77580"></a>

## Main features

* Profile page parsing, [extraction](https://github.com/soxoj/socid_extractor) of personal info, links to other profiles, etc.
* Recursive search by new usernames and other IDs found
* Search by tags (site categories, countries)
* Censorship and captcha detection
* Request retries

See the full description of Maigret features [in the documentation](https://maigret.readthedocs.io/en/latest/features.html).

## Installation

‼️ Maigret is available online via [official Telegram bot](https://t.me/osint_maigret_bot). Consider using it if you don't want to install anything.

### Windows

Standalone EXE binaries for Windows are located in the [Releases section](https://github.com/soxoj/maigret/releases) of the GitHub repository.

Video guide on how to run it: https://youtu.be/qIgwTZOmMmM.

### Installation in Cloud Shells

You can launch Maigret using cloud shells and Jupyter notebooks. Press one of the buttons below and follow the instructions to launch it in your browser.

[![Open in Cloud Shell](https://user-images.githubusercontent.com/27065646/92304704-8d146d80-ef80-11ea-8c29-0deaabb1c702.png)](https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/soxoj/maigret&tutorial=README.md)
<a href="https://repl.it/github/soxoj/maigret"><img src="https://replit.com/badge/github/soxoj/maigret" alt="Run on Replit" height="50"></a>

<a href="https://colab.research.google.com/gist/soxoj/879b51bc3b2f8b695abb054090645000/maigret-collab.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab" height="45"></a>
<a href="https://mybinder.org/v2/gist/soxoj/9d65c2f4d3bec5dd25949197ea73cf3a/HEAD"><img src="https://mybinder.org/badge_logo.svg" alt="Open In Binder" height="45"></a>

### Local installation

Maigret can be installed using pip or Docker, or simply launched from the cloned repo.


**NOTE**: Python 3.10 or higher and pip are required; **Python 3.11 is recommended.**

```bash
# install from pypi
pip3 install maigret

# usage
maigret username
```

### Cloning a repository

```bash
# or clone and install manually
git clone https://github.com/soxoj/maigret && cd maigret

# build and install
pip3 install .

# usage
maigret username
```

### Docker

```bash
# official image
docker pull soxoj/maigret

# usage
docker run -v /mydir:/app/reports soxoj/maigret:latest username --html

# manual build
docker build -t maigret .
```

## Usage examples

```bash
# make HTML, PDF, and Xmind8 reports
maigret user --html
maigret user --pdf
maigret user --xmind #Output not compatible with xmind 2022+

# search on sites marked with tags photo & dating
maigret user --tags photo,dating

# search on sites marked with tag us
maigret user --tags us

# search for three usernames on all available sites
maigret user1 user2 user3 -a
```

Use `maigret --help` to get the full description of the options. The options are also [documented online](https://maigret.readthedocs.io/en/latest/command-line-options.html).

### Web interface

You can run Maigret with a web interface, where you can view the graph with results and download reports of all formats on a single page.

<details>
<summary>Web Interface Screenshots</summary>

![Web interface: how to start](https://raw.githubusercontent.com/soxoj/maigret/main/static/web_interface_screenshot_start.png)

![Web interface: results](https://raw.githubusercontent.com/soxoj/maigret/main/static/web_interface_screenshot.png)

</details>

Instructions:

1. Run Maigret with the ``--web`` flag and specify the port number.

```console
maigret --web 5000
```
2. Open http://127.0.0.1:5000 in your browser and enter one or more usernames to make a search.

3. Wait a bit for the search to complete and view the graph with results, the table with all accounts found, and download reports of all formats.

## Contributing

Maigret is open source, so you can contribute your own sites by adding them to the `data.json` file, or contribute changes to its code!

For more information about development and contribution, please read the [development documentation](https://maigret.readthedocs.io/en/latest/development.html).

## Demo with page parsing and recursive username search

### Video (asciinema)

<a href="https://asciinema.org/a/Ao0y7N0TTxpS0pisoprQJdylZ">
  <img src="https://asciinema.org/a/Ao0y7N0TTxpS0pisoprQJdylZ.svg" alt="asciicast" width="600">
</a>

### Reports

[PDF report](https://raw.githubusercontent.com/soxoj/maigret/main/static/report_alexaimephotographycars.pdf), [HTML report](https://htmlpreview.github.io/?https://raw.githubusercontent.com/soxoj/maigret/main/static/report_alexaimephotographycars.html)

![HTML report screenshot](https://raw.githubusercontent.com/soxoj/maigret/main/static/report_alexaimephotography_html_screenshot.png)

![XMind 8 report screenshot](https://raw.githubusercontent.com/soxoj/maigret/main/static/report_alexaimephotography_xmind_screenshot.png)

[Full console output](https://raw.githubusercontent.com/soxoj/maigret/main/static/recursive_search.md)

## Disclaimer

**This tool is intended for educational and lawful purposes only.** The developers do not endorse or encourage any illegal activities or misuse of this tool. Regulations regarding the collection and use of personal data vary by country and region, including but not limited to GDPR in the EU, CCPA in the USA, and similar laws worldwide.

It is your sole responsibility to ensure that your use of this tool complies with all applicable laws and regulations in your jurisdiction. Any illegal use of this tool is strictly prohibited, and you are fully accountable for your actions.

The authors and developers of this tool bear no responsibility for any misuse or unlawful activities conducted by its users.

## Feedback

If you have any questions, suggestions, or feedback, please feel free to [open an issue](https://github.com/soxoj/maigret/issues), create a [GitHub discussion](https://github.com/soxoj/maigret/discussions), or contact the author directly via [Telegram](https://t.me/soxoj).

## SOWEL classification

This tool uses the following OSINT techniques:
- [SOTL-2.2. Search For Accounts On Other Platforms](https://sowel.soxoj.com/other-platform-accounts)
- [SOTL-6.1. Check Logins Reuse To Find Another Account](https://sowel.soxoj.com/logins-reuse)
- [SOTL-6.2. Check Nicknames Reuse To Find Another Account](https://sowel.soxoj.com/nicknames-reuse) 

## License

MIT © [Maigret](https://github.com/soxoj/maigret)<br/>
MIT © [Sherlock Project](https://github.com/sherlock-project/)<br/>
Original Creator of Sherlock Project - [Siddharth Dushantha](https://github.com/sdushantha)


================================================
FILE: docs/Makefile
================================================
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS    ?=
SPHINXBUILD   ?= sphinx-build
SOURCEDIR     = source
BUILDDIR      = build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)


================================================
FILE: docs/make.bat
================================================
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
	set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=source
set BUILDDIR=build

if "%1" == "" goto help

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
	echo.
	echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
	echo.installed, then set the SPHINXBUILD environment variable to point
	echo.to the full path of the 'sphinx-build' executable. Alternatively you
	echo.may add the Sphinx directory to PATH.
	echo.
	echo.If you don't have Sphinx installed, grab it from
	echo.http://sphinx-doc.org/
	exit /b 1
)

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd


================================================
FILE: docs/requirements.txt
================================================
sphinx-copybutton
sphinx_rtd_theme

================================================
FILE: docs/source/command-line-options.rst
================================================
.. _command-line-options:

Command line options
====================

Usernames
---------

``maigret username1 username2 ...``

You can specify several usernames separated by spaces. Usernames are
**not** mandatory, as there are other operation modes (see below).

Parsing of account pages and online documents
---------------------------------------------

``maigret --parse URL``

Maigret will try to extract information about the document/account owner
(including username and other ids) and will make a search by the
extracted username and ids. See examples in the :ref:`extracting-information-from-pages` section.

Main options
------------

Options are also configurable through settings files, see
:doc:`settings section <settings>`.

``--tags`` - Filter the sites to search by tags: site categories and
two-letter country codes (**not language codes!**). E.g. photo, dating, sport; jp, us, global.
Multiple tags can be associated with one site. **Warning**: the tag markup is
not stable yet. Read more :doc:`in the separate section <tags>`.

``-n``, ``--max-connections`` - Allowed number of concurrent connections
**(default: 100)**.

``-a``, ``--all-sites`` - Use all sites for scan **(default: top 500)**.

``--top-sites`` - Number of sites to scan, ranked by Alexa Top
**(default: top 500)**.

``--timeout`` - Time (in seconds) to wait for responses from sites
**(default: 30)**. A longer timeout makes it more likely to get results
from slow sites, but it can also make the whole scan take much longer.
Choose the timeout with the bandwidth of your Internet connection in
mind.
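
How a connection limit and a per-request timeout interact can be sketched
with plain ``asyncio``. This is an illustration only, not Maigret's
implementation: ``check_site``, ``run_checks``, and the simulated delays
are hypothetical stand-ins for real HTTP requests.

.. code-block:: python

    import asyncio


    async def check_site(name: str, delay: float) -> str:
        """Stand-in for one HTTP check; `delay` simulates the response time."""
        await asyncio.sleep(delay)
        return f"{name}: ok"


    async def run_checks(sites: dict[str, float], max_connections: int, timeout: float) -> list[str]:
        semaphore = asyncio.Semaphore(max_connections)  # caps concurrent checks

        async def limited(name: str, delay: float) -> str:
            async with semaphore:
                try:
                    # Give up on this site once `timeout` seconds have passed.
                    return await asyncio.wait_for(check_site(name, delay), timeout)
                except asyncio.TimeoutError:
                    return f"{name}: timeout"

        return await asyncio.gather(*(limited(n, d) for n, d in sites.items()))


    # The "slow" site answers after 0.5 s, longer than the 0.1 s timeout.
    results = asyncio.run(
        run_checks({"fast": 0.01, "slow": 0.5}, max_connections=2, timeout=0.1)
    )
    print(results)

In this sketch, a slow site costs at most ``timeout`` seconds of one
connection slot, which is why a large timeout combined with a small
connection limit can stretch the total scan time.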

``--cookies-jar-file`` - File with custom cookies in Netscape format
(aka cookies.txt). You can install a browser extension to
export your own cookies (`Chrome <https://chrome.google.com/webstore/detail/get-cookiestxt/bgaddhkoddajcdgocldbbfleckgcbcid>`_, `Firefox <https://addons.mozilla.org/en-US/firefox/addon/cookies-txt/>`_).
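
To see what a Netscape-format cookie file looks like, and to verify that a
file is well-formed before passing it to Maigret, you can load it with
Python's standard library. The sketch below says nothing about how
Maigret itself reads the file; the sample cookie is made up.

.. code-block:: python

    import os
    import tempfile
    from http.cookiejar import MozillaCookieJar

    # Made-up sample cookie; fields are tab-separated:
    # domain, include-subdomains flag, path, secure flag, expiry (Unix time), name, value
    sample = (
        "# Netscape HTTP Cookie File\n"
        ".example.com\tTRUE\t/\tFALSE\t4102444800\tsessionid\tabc123\n"
    )

    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write(sample)
        path = f.name

    jar = MozillaCookieJar()
    jar.load(path)  # raises http.cookiejar.LoadError on a malformed file
    os.unlink(path)

    cookie_names = [cookie.name for cookie in jar]
    print(cookie_names)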

``--no-recursion`` - Disable parsing pages for other usernames and
recursive search by them.

``--use-disabled-sites`` - Use disabled sites to search (may cause many
false positives).

``--id-type`` - Specify the identifier type(s) (default: username).
Supported types: gaia_id, vk_id, yandex_public_id, ok_id, wikimapia_uid.
Currently, you must add the ``-a`` flag to run a scan on sites with custom
id types; the sites will be filtered automatically.

``--ignore-ids`` - Do not search by the specified usernames or other
ids. Useful for repeated scans when some irrelevant usernames are already known.

``--db`` - Load the Maigret database from a local JSON file or from a
valid JSON file available online.

``--retries RETRIES`` - Number of attempts to retry requests that failed
temporarily.

Reports
-------

``-P``, ``--pdf`` - Generate a PDF report (general report on all
usernames).

``-H``, ``--html`` - Generate an HTML report file (general report on all
usernames).

``-X``, ``--xmind`` - Generate an XMind 8 mindmap (one report per
username).

``-C``, ``--csv`` - Generate a CSV report (one report per username).

``-T``, ``--txt`` - Generate a TXT report (one report per username).

``-J``, ``--json`` - Generate a JSON report of specific type: simple,
ndjson (one report per username). E.g. ``--json ndjson``

``-fo``, ``--folderoutput`` - Results will be saved to this folder,
``results`` by default. The folder is created if it doesn't exist.

Output options
--------------

``-v``, ``--verbose`` - Display extra information and metrics.
*(loglevel=WARNING)*

``-vv``, ``--info`` - Display service information. *(loglevel=INFO)*

``-vvv``, ``--debug``, ``-d`` - Display debugging information and site
responses. *(loglevel=DEBUG)*

``--print-not-found`` - Print sites where the username was not found.

``--print-errors`` - Print error messages: connection, captcha, site
country ban, etc.

Other operation modes
---------------------

``--version`` - Display version information and dependencies.

``--self-check`` - Run a self-check of the sites and the database, and
disable non-working sites **for the current search session** by default.
This is useful for testing a new internet connection, since which sites
show a censorship stub or a captcha depends on the provider or hosting.
After the check, Maigret asks whether you want to save the updates;
answering y/Y rewrites the local database.

``--submit URL`` - Automatically analyze the given account URL or
site main page URL to determine the site engine and the methods for
checking account presence. After the check, Maigret asks whether you want
to add the site; answering y/Y rewrites the local database.




================================================
FILE: docs/source/conf.py
================================================
# Configuration file for the Sphinx documentation builder.

# -- Project information

project = 'Maigret'
copyright = '2025, soxoj'
author = 'soxoj'

release = '0.5.0'
version = '0.5'

# -- General configuration

extensions = [
    'sphinx.ext.duration',
    'sphinx.ext.doctest',
    'sphinx.ext.autodoc',
    'sphinx.ext.autosummary',
    'sphinx.ext.intersphinx',
    'sphinx_copybutton'
]

intersphinx_mapping = {
    'python': ('https://docs.python.org/3/', None),
    'sphinx': ('https://www.sphinx-doc.org/en/master/', None),
}
intersphinx_disabled_domains = ['std']

templates_path = ['_templates']

# -- Options for HTML output

html_theme = 'sphinx_rtd_theme'

# -- Options for EPUB output
epub_show_urls = 'footnote'


================================================
FILE: docs/source/development.rst
================================================
.. _development:

Development
==============

Frequently Asked Questions
--------------------------

1. Where to find the list of supported sites?

The human-readable list of supported sites is available in the `sites.md <https://github.com/soxoj/maigret/blob/main/sites.md>`_ file in the repository.
It is generated automatically from the main JSON file with the list of supported sites.

The machine-readable JSON file with the list of supported sites is available in the
`data.json <https://github.com/soxoj/maigret/blob/main/maigret/resources/data.json>`_ file in the directory `resources`.

2. Which methods to check the account presence are supported?

The supported methods (``checkType`` values in ``data.json``) are:

- ``message`` - the most reliable method, checks if any string from ``presenceStrs`` is present and none of the strings from ``absenceStrs`` are present in the HTML response
- ``status_code`` - checks that the status code of the response is 2XX
- ``response_url`` - checks that there is no redirect and the response is 2XX

See the details of check mechanisms in the `checking.py <https://github.com/soxoj/maigret/blob/main/maigret/checking.py#L339>`_ file.
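
As a rough mental model, the three methods can be sketched as a single
function. This follows the descriptions above literally and is not the
code from ``checking.py``, which handles more cases; ``account_exists``
and its parameters are hypothetical names chosen for this illustration.

.. code-block:: python

    def account_exists(check_type, status, html="", requested_url="", final_url="",
                       presence_strs=(), absence_strs=()):
        """Decide whether an account is present, per the checkType descriptions."""
        is_2xx = 200 <= status < 300
        if check_type == "message":
            # Some presence marker must appear and no absence marker may appear.
            has_presence = any(s in html for s in presence_strs)
            has_absence = any(s in html for s in absence_strs)
            return has_presence and not has_absence
        if check_type == "status_code":
            return is_2xx
        if check_type == "response_url":
            # "No redirect" is modelled as the final URL equal to the requested one.
            return is_2xx and final_url == requested_url
        raise ValueError(f"unknown checkType: {check_type!r}")


    # Made-up responses for a site that does not exist.
    print(account_exists("message", 200, html="<h1>Profile of blue</h1>",
                         presence_strs=["Profile of"], absence_strs=["User not found"]))
    print(account_exists("response_url", 200,
                         requested_url="https://example.com/blue",
                         final_url="https://example.com/login"))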

Testing
-------

It is recommended to use Python 3.10 for testing.

Install test requirements:

.. code-block:: console

  poetry install --with dev


Use the following commands to check Maigret:

.. code-block:: console

  # run linter and typing checks
  # order of checks:
  # - critical syntax errors or undefined names
  # - flake checks
  # - mypy checks
  make lint

  # run black formatter
  make format

  # run testing with coverage html report
  # current test coverage is 58%
  make test

  # open html report
  open htmlcov/index.html

  # get flamechart of imports to estimate startup time
  make speed


How to fix false-positives
-----------------------------------------------

If you want to work with the sites database, don't forget to activate the git hook that updates the statistics: ``git config --local core.hooksPath .githooks/``.

Make your git commits from your Maigret repository folder; otherwise the hook will not find the statistics update script.

1. Determine the problematic site.

If you already know which site has a false-positive and want to fix it specifically, go to the next step.

Otherwise, simply run a search with a random username (e.g. ``laiuhi3h4gi3u4hgt``) and check the results.
Alternatively, you can use `the Telegram bot <https://t.me/osint_maigret_bot>`_.

2. Open the account link in your browser and check:

- If the site is completely gone, remove it from the list
- If the site still works but looks different, update how it is checked in ``data.json``
- If the site requires login to view profiles, disable checking it

3. Find the site in the `data.json <https://github.com/soxoj/maigret/blob/main/maigret/resources/data.json>`_ file.

If the ``checkType`` method is not ``message`` and you are going to fix the check, update it:

- put ``message`` in ``checkType``
- put in ``absenceStrs`` a keyword that is present in the HTML response for a non-existing account
- put in ``presenceStrs`` a keyword that is present in the HTML response for an existing account

If you have trouble determining the right keywords, you can use automatic detection by passing the account URL with the ``--submit`` option:

.. code-block:: console

  maigret --submit https://my.mail.ru/bk/alex

To disable checking, set ``disabled`` to ``true`` or simply run:

.. code-block:: console

  maigret --self-check --site My.Mail.ru@bk.ru

To debug the check method using the response HTML, you can run:

.. code-block:: console

  maigret soxoj --site My.Mail.ru@bk.ru -d 2> response.txt

There are a few options for site entries in ``data.json`` that are helpful in various cases:

- ``engine`` - a predefined check for the sites of certain type (e.g. forums), see the ``engines`` section in the JSON file
- ``headers`` - a dictionary of additional headers to be sent to the site
- ``requestHeadOnly`` - set to ``true`` if it's enough to make a HEAD request to the site
- ``regexCheck`` - a regex to check if the username is valid, in case of frequent false-positives
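
As a rough illustration of how two of these options could be applied before a request is made, here is a minimal sketch. The site record values and both helper functions are hypothetical and only show the intent of ``regexCheck`` and ``requestHeadOnly``; Maigret's own logic may differ.

.. code-block:: python

    import re

    # Hypothetical site record; the values are made up for this example.
    site = {
        "checkType": "status_code",
        "requestHeadOnly": True,
        "headers": {"Accept-Language": "en-US"},
        "regexCheck": r"^[a-zA-Z0-9_]{3,20}$",
    }


    def username_is_valid(site, username):
        """Skip the request when the username cannot exist on the site."""
        pattern = site.get("regexCheck")
        return pattern is None or re.search(pattern, username) is not None


    def http_method(site):
        """Use HEAD when the response body is not needed for the check."""
        return "head" if site.get("requestHeadOnly") else "get"


    print(username_is_valid(site, "soxoj"), http_method(site))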

.. _activation-mechanism:

Activation mechanism
--------------------

The activation mechanism helps make requests to sites requiring additional authentication like cookies, JWT tokens, or custom headers.

Let's study the Vimeo site check record from the Maigret database:

.. code-block:: json

      "Vimeo": {
          "tags": [
              "us",
              "video"
          ],
          "headers": {
              "Authorization": "jwt eyJ0..."
          },
          "activation": {
              "url": "https://vimeo.com/_rv/viewer",
              "marks": [
                  "Something strange occurred. Please get in touch with the app's creator."
              ],
              "method": "vimeo"
          },
          "urlProbe": "https://api.vimeo.com/users/{username}?fields=name...",
          "checkType": "status_code",
          "alexaRank": 148,
          "urlMain": "https://vimeo.com/",
          "url": "https://vimeo.com/{username}",
          "usernameClaimed": "blue",
          "usernameUnclaimed": "noonewouldeverusethis7"
      },

The activation method is:

.. code-block:: python

    def vimeo(site, logger, cookies={}):
        headers = dict(site.headers)
        if "Authorization" in headers:
            del headers["Authorization"]
        import requests

        r = requests.get(site.activation["url"], headers=headers)
        jwt_token = r.json()["jwt"]
        site.headers["Authorization"] = "jwt " + jwt_token

Here's how the activation process works when a JWT token becomes invalid:

1. The site check makes an HTTP request to ``urlProbe`` with the invalid token
2. The response contains an error message specified in the ``activation``/``marks`` field
3. When this error is detected, the ``vimeo`` activation function is triggered
4. The activation function obtains a new JWT token and updates it in the site check record
5. On the next site check (either through retry or a new Maigret run), the valid token is used and the check succeeds
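
The steps above can be sketched without any network access. Everything here is a stand-in under stated assumptions: ``FakeSite``, ``Activators``, ``fake_request`` and ``check_with_activation`` are hypothetical names, the server response is simulated, and the real activation function fetches the token over HTTP.

.. code-block:: python

    class FakeSite:
        """Stand-in for a site record; only the fields used below."""

        def __init__(self):
            self.headers = {"Authorization": "jwt expired"}
            self.activation = {
                "method": "vimeo",
                "marks": ["Something strange occurred."],
            }


    class Activators:
        @staticmethod
        def vimeo(site):
            # The real function requests a fresh token; it is faked here.
            site.headers["Authorization"] = "jwt fresh"


    def fake_request(site):
        """Pretend server: rejects the expired token, accepts the fresh one."""
        if site.headers.get("Authorization") == "jwt fresh":
            return '{"name": "blue"}'
        return "Something strange occurred."


    def check_with_activation(site, retries=1):
        body = fake_request(site)
        for _ in range(retries):
            # Steps 2-3: an activation mark in the response triggers activation.
            if not any(mark in body for mark in site.activation["marks"]):
                break
            # Step 4: look up the activation function by name and refresh the token.
            getattr(Activators, site.activation["method"])(site)
            # Step 5: the retry uses the updated headers.
            body = fake_request(site)
        return body


    print(check_with_activation(FakeSite()))

With ``retries=0`` the function returns the error body unchanged, which mirrors why a retry or a new run is needed after activation.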

Examples of activation mechanism implementations are available in the `activation.py <https://github.com/soxoj/maigret/blob/main/maigret/activation.py>`_ file.

How to publish new version of Maigret
-------------------------------------

**Collaborator rights are required; write to Soxoj to get them**.

To publish a new version, first create a new branch in the repository
with a bumped version number and an up-to-date changelog. Then
create a release, and a GitHub action will automatically build and publish
a new PyPI package.

- New branch example: https://github.com/soxoj/maigret/commit/e520418f6a25d7edacde2d73b41a8ae7c80ddf39
- Release example: https://github.com/soxoj/maigret/releases/tag/v0.4.1

1. Make a new branch locally with a new version name. Check the current version number here: https://pypi.org/project/maigret/.
**Increase only the patch version (third number)** if there are no breaking changes.

.. code-block:: console

  git checkout -b 0.4.0

2. Update the Maigret version manually in the following files:

- pyproject.toml
- maigret/__version__.py 
- docs/source/conf.py
- snapcraft.yaml

3. Create a new empty text section at the beginning of the file `CHANGELOG.md` with the current date:

.. code-block:: console

  ## [0.4.0] - 2022-01-03

4. Get auto-generated release notes:

- Open https://github.com/soxoj/maigret/releases/new
- Click `Choose a tag`, enter `v0.4.0` (your version)
- Click `Create new tag`
- Press `+ Auto-generate release notes`
- Copy all the text from the description field below
- Paste it into the empty text section in `CHANGELOG.md`
- Remove the redundant `## What's Changed` line and the `## New Contributors` section, if it exists
- *Close the new release page*

5. Commit all the changes, push, make pull request

.. code-block:: console

  git add -p
  git commit -m 'Bump to YOUR VERSION'
  git push origin HEAD


6. Merge pull request

7. Create new release

- Open https://github.com/soxoj/maigret/releases/new again
- Click `Choose a tag`
- Enter actual version in format `v0.4.0`
- Also enter actual version in the field `Release title` 
- Click `Create new tag`
- Press `+ Auto-generate release notes`
- **Press "Publish release" button**

8. That's all. Now simply wait for the package to be pushed to PyPI. You can monitor the progress on the Actions page: https://github.com/soxoj/maigret/actions/workflows/python-publish.yml
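
The version arithmetic from steps 1 and 3 is simple enough to script. The sketch below is only a convenience illustration; ``bump_patch`` and ``changelog_heading`` are hypothetical helpers and are not part of the Maigret repository.

.. code-block:: python

    import datetime


    def bump_patch(version):
        """Increase only the third number of a MAJOR.MINOR.PATCH version."""
        major, minor, patch = (int(part) for part in version.split("."))
        return f"{major}.{minor}.{patch + 1}"


    def changelog_heading(version, day):
        """Build the heading of a new CHANGELOG.md section."""
        return f"## [{version}] - {day.isoformat()}"


    print(bump_patch("0.4.0"))
    print(changelog_heading("0.4.0", datetime.date(2022, 1, 3)))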

Documentation updates
---------------------

Documentation is auto-generated and auto-deployed from the ``docs`` directory.

To manually update documentation:

1. Change something in the ``.rst`` files in the ``docs/source`` directory.
2. Install the dependencies with ``pip install -r requirements.txt`` in the docs directory.
3. Run ``make singlehtml`` in the terminal in the docs directory.
4. Open ``build/singlehtml/index.html`` in your browser to see the result.
5. If everything is ok, commit and push your changes to GitHub. 

Roadmap
-------

.. warning::
   This roadmap requires updating to reflect the current project status and future plans.

.. figure:: https://i.imgur.com/kk8cFdR.png   
   :target: https://i.imgur.com/kk8cFdR.png
   :align: center


================================================
FILE: docs/source/features.rst
================================================
.. _features:

Features
========

This is the list of Maigret features.

.. _web-interface:

Web Interface
-------------

You can run Maigret with a web interface, where you can view the graph with results and download reports of all formats on a single page.


.. image:: https://raw.githubusercontent.com/soxoj/maigret/main/static/web_interface_screenshot_start.png
   :alt: Web interface: how to start


.. image:: https://raw.githubusercontent.com/soxoj/maigret/main/static/web_interface_screenshot.png
   :alt: Web interface: results


Instructions:

1. Run Maigret with the ``--web`` flag and specify the port number.

.. code-block:: console

  maigret --web 5000

2. Open http://127.0.0.1:5000 in your browser and enter one or more usernames to make a search.

3. Wait a bit for the search to complete and view the graph with results, the table with all accounts found, and download reports of all formats.

Personal info gathering
-----------------------

Maigret `parses account webpages and extracts <https://github.com/soxoj/socid-extractor>`_ personal info, links to other profiles, etc.
The extracted info is displayed as an additional result in the CLI output and as tables in HTML and PDF reports.
Maigret also uses the ids and usernames found in links to start a recursive search.

Enabled by default, can be disabled with ``--no-extracting``.

.. code-block:: text

    $ python3 -m maigret soxoj --timeout 5
        [-] Starting a search on top 500 sites from the Maigret database...
        [!] You can run search by full list of sites with flag `-a`
        [*] Checking username soxoj on:
        ...
        [+] GitHub: https://github.com/soxoj
                ├─uid: 31013580
                ├─image: https://avatars.githubusercontent.com/u/31013580?v=4
                ├─created_at: 2017-08-14T17:03:07Z
                ├─location: Amsterdam, Netherlands
                ├─follower_count: 1304
                ├─following_count: 54
                ├─fullname: Soxoj
                ├─public_gists_count: 3
                ├─public_repos_count: 88
                ├─twitter_username: sox0j
                ├─bio: Head of OSINT Center of Excellence in @SocialLinks-IO
                ├─is_company: Social Links
                └─blog_url: soxoj.com
        ...

Recursive search
----------------

Maigret has the ability to scan account pages for :ref:`common identifiers <supported-identifier-types>` and usernames found in links.
When people include links to their other social media accounts, Maigret can automatically detect and initiate new searches for those profiles.
Any information discovered through this process will be shown in both the command-line interface output and generated reports.

Enabled by default, can be disabled with ``--no-recursion``.


.. code-block:: text

    $ python3 -m maigret soxoj --timeout 5
        [-] Starting a search on top 500 sites from the Maigret database...
        [!] You can run search by full list of sites with flag `-a`
        [*] Checking username soxoj on:
        ...
        [+] GitHub: https://github.com/soxoj
                ├─uid: 31013580
                ├─image: https://avatars.githubusercontent.com/u/31013580?v=4
                ├─created_at: 2017-08-14T17:03:07Z
                ├─location: Amsterdam, Netherlands
                ├─follower_count: 1304
                ├─following_count: 54
                ├─fullname: Soxoj
                ├─public_gists_count: 3
                ├─public_repos_count: 88
                ├─twitter_username: sox0j     <===== another username found here
                ├─bio: Head of OSINT Center of Excellence in @SocialLinks-IO
                ├─is_company: Social Links
                └─blog_url: soxoj.com
        ...
        Searching |████████████████████████████████████████| 500/500 [100%] in 9.1s (54.85/s)
        [-] You can see detailed site check errors with a flag `--print-errors`
        [*] Checking username sox0j on:
        [+] Telegram: https://t.me/sox0j
            ├─fullname: @Sox0j
            ...
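
The recursion itself can be pictured as a breadth-first walk over usernames. The following is a minimal sketch under stated assumptions: ``recursive_search`` and the ``FOUND_ON_PAGES`` lookup table are hypothetical, and the real search extracts identifiers from live pages rather than from a dictionary.

.. code-block:: python

    from collections import deque

    # Hypothetical data: usernames discovered on the pages found for each username.
    FOUND_ON_PAGES = {
        "soxoj": ["sox0j"],
        "sox0j": ["soxoj"],  # links back; must not cause an endless loop
    }


    def recursive_search(start, lookup):
        """Breadth-first search over usernames, visiting each one once."""
        seen = {start}
        queue = deque([start])
        order = []
        while queue:
            username = queue.popleft()
            order.append(username)
            for found in lookup(username):
                if found not in seen:
                    seen.add(found)
                    queue.append(found)
        return order


    print(recursive_search("soxoj", lambda name: FOUND_ON_PAGES.get(name, [])))

Tracking the already seen usernames is what keeps two profiles that link to each other from triggering repeated searches.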

Username permutations
---------------------

Maigret can generate permutations of usernames. Just pass a few usernames in the CLI and use the ``--permute`` flag.
Thanks to `@balestek <https://github.com/balestek>`_ for the idea and implementation.

.. code-block:: text

    $ python3 -m maigret --permute hope dream --timeout 5
    [-] 12 permutations from hope dream to check...
        ├─ hopedream
        ├─ _hopedream 
        ├─ hopedream_
        ├─ hope_dream
        ├─ hope-dream
        ├─ hope.dream
        ├─ dreamhope
        ├─ _dreamhope
        ├─ dreamhope_
        ├─ dream_hope
        ├─ dream-hope
        └─ dream.hope
    [-] Starting a search on top 500 sites from the Maigret database...
    [!] You can run search by full list of sites with flag `-a`
    [*] Checking username hopedream on:
    ...
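
The twelve variants in the output above follow a regular pattern that can be reproduced with ``itertools.permutations``. This is a sketch inferred from that output only; ``permute_usernames`` is a hypothetical function and not necessarily how ``permutator.py`` is written.

.. code-block:: python

    from itertools import permutations


    def permute_usernames(names):
        """Build six variants for every ordered pair of the given names."""
        result = []
        for first, second in permutations(names, 2):
            joined = first + second
            result += [
                joined,
                "_" + joined,
                joined + "_",
                f"{first}_{second}",
                f"{first}-{second}",
                f"{first}.{second}",
            ]
        return result


    variants = permute_usernames(["hope", "dream"])
    print(len(variants))
    print(variants[:6])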

Reports 
-------

Maigret currently supports HTML, PDF, TXT, XMind 8 mindmap, and JSON reports.

HTML/PDF reports contain:

- profile photo
- all the gathered personal info
- additional information about supposed personal data (full name, gender, location), resulting from statistics of all found accounts

Also, there is a short text report in the CLI output after the search phase ends.

.. warning::
   XMind 8 mindmaps are incompatible with XMind 2022!

Tags
----

The Maigret sites database is very big (and will grow further), so running a search across all the sites may be excessive.
It is also often hard to tell which sites are the most relevant for a particular person.

Tags markup allows selecting a subset of sites by interests (photo, messaging, finance, etc.) or by country. The tags of found accounts are grouped and displayed in the reports.

See full description :doc:`in the Tags Wiki page <tags>`.

Censorship and captcha detection
--------------------------------

Maigret can detect common errors such as censorship stub pages, CloudFlare captcha pages, and others. 
If errors of a certain type exceed 3% of the checks in a session, a warning message appears in the CLI output with recommendations to improve performance and avoid problems.
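
The threshold rule can be sketched as a share computed per error type. The ``error_warnings`` function, the error names, and the choice of the total number of checks as the denominator are assumptions made for this example.

.. code-block:: python

    from collections import Counter

    THRESHOLD = 0.03  # the 3% mentioned above


    def error_warnings(error_types, total_checks):
        """Return the error types whose share of all checks exceeds the threshold."""
        if total_checks == 0:
            return []
        counts = Counter(error_types)
        return sorted(
            name for name, count in counts.items()
            if count / total_checks > THRESHOLD
        )


    # 20 captcha pages and 5 censorship stubs out of 500 checks.
    errors = ["Captcha"] * 20 + ["Censorship"] * 5
    print(error_warnings(errors, 500))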

Retries
-------

Maigret retries requests that fail with temporary errors (connection failures, proxy errors, etc.).

One attempt by default; this can be changed with the ``--retries N`` option.
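
A retry loop of this kind can be sketched as follows. The names ``with_retries`` and ``FlakyRequest`` are hypothetical, ``ConnectionError`` stands in for any temporary error, and treating ``retries`` as the number of extra attempts after the first one is an assumption of this example.

.. code-block:: python

    def with_retries(request, retries=1):
        """Call ``request`` once, plus up to ``retries`` extra attempts."""
        attempts = 0
        while True:
            attempts += 1
            try:
                return request(), attempts
            except ConnectionError:
                # Give up once the extra attempts are used up.
                if attempts > retries:
                    raise


    class FlakyRequest:
        """Fails a fixed number of times, then succeeds."""

        def __init__(self, failures):
            self.failures = failures

        def __call__(self):
            if self.failures > 0:
                self.failures -= 1
                raise ConnectionError("temporary failure")
            return "ok"


    print(with_retries(FlakyRequest(1), retries=1))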

Archives and mirrors checking
-----------------------------

The Maigret database contains not only the original websites, but also mirrors, archives, and aggregators. For example:

- `Picuki <https://www.picuki.com/>`_, Instagram mirror
- (no longer available) `Reddit BigData search <https://camas.github.io/reddit-search/>`_
- (no longer available) `Twitter shadowban <https://shadowban.eu/>`_ checker

This makes it possible to get additional info about the person and to check whether the account exists even if the main site is unavailable (bot protection, captcha, etc.).

Activation
----------
The activation mechanism helps make requests to sites requiring additional authentication like cookies, JWT tokens, or custom headers.

It works by implementing a custom function that:

1. Makes a specialized HTTP request to a specific website endpoint
2. Processes the response
3. Updates the headers/cookies for that site in the local Maigret database

Since activation only triggers after encountering specific errors, a retry (or another Maigret run) is needed to obtain a valid response with the updated authentication.

The activation mechanism is enabled by default, and cannot be disabled at the moment.

For more details, see the Development section: :ref:`activation-mechanism`.

.. _extracting-information-from-pages:

Extraction of information from account pages
--------------------------------------------

Maigret can parse URLs and the content of the web pages behind them to extract info about the account owner and other meta information.

Specify the URL with the ``--parse`` option; it can be a link to an account or to an online document. See the `list of supported sites <https://github.com/soxoj/socid-extractor#sites>`_.

After the parsing phase ends, Maigret starts the search phase using the :doc:`supported identifiers <supported-identifier-types>` found (usernames, ids, etc.).

.. code-block:: console

  $ maigret --parse https://docs.google.com/spreadsheets/d/1HtZKMLRXNsZ0HjtBmo0Gi03nUPiJIA4CC4jTYbCAnXw/edit\#gid\=0

  Scanning webpage by URL https://docs.google.com/spreadsheets/d/1HtZKMLRXNsZ0HjtBmo0Gi03nUPiJIA4CC4jTYbCAnXw/edit#gid=0...
  ┣╸org_name: Gooten
  ┗╸mime_type: application/vnd.google-apps.ritz
  Scanning webpage by URL https://clients6.google.com/drive/v2beta/files/1HtZKMLRXNsZ0HjtBmo0Gi03nUPiJIA4CC4jTYbCAnXw?fields=alternateLink%2CcopyRequiresWriterPermission%2CcreatedDate%2Cdescription%2CdriveId%2CfileSize%2CiconLink%2Cid%2Clabels(starred%2C%20trashed)%2ClastViewedByMeDate%2CmodifiedDate%2Cshared%2CteamDriveId%2CuserPermission(id%2Cname%2CemailAddress%2Cdomain%2Crole%2CadditionalRoles%2CphotoLink%2Ctype%2CwithLink)%2Cpermissions(id%2Cname%2CemailAddress%2Cdomain%2Crole%2CadditionalRoles%2CphotoLink%2Ctype%2CwithLink)%2Cparents(id)%2Ccapabilities(canMoveItemWithinDrive%2CcanMoveItemOutOfDrive%2CcanMoveItemOutOfTeamDrive%2CcanAddChildren%2CcanEdit%2CcanDownload%2CcanComment%2CcanMoveChildrenWithinDrive%2CcanRename%2CcanRemoveChildren%2CcanMoveItemIntoTeamDrive)%2Ckind&supportsTeamDrives=true&enforceSingleParent=true&key=AIzaSyC1eQ1xj69IdTMeii5r7brs3R90eck-m7k...
  ┣╸created_at: 2016-02-16T18:51:52.021Z
  ┣╸updated_at: 2019-10-23T17:15:47.157Z
  ┣╸gaia_id: 15696155517366416778
  ┣╸fullname: Nadia Burgess
  ┣╸email: nadia@gooten.com
  ┣╸image: https://lh3.googleusercontent.com/a-/AOh14GheZe1CyNa3NeJInWAl70qkip4oJ7qLsD8vDy6X=s64
  ┗╸email_username: nadia

.. code-block:: console

  $ maigret --parse https://steamcommunity.com/profiles/76561199113454789
  Scanning webpage by URL https://steamcommunity.com/profiles/76561199113454789...
  ┣╸steam_id: 76561199113454789
  ┣╸nickname: Pok
  ┗╸username: Machine42


Simple API
----------

Maigret can be easily integrated into other projects using the Python package `maigret <https://pypi.org/project/maigret/>`_.

Example: the official `Telegram bot <https://github.com/soxoj/maigret-tg-bot>`_


================================================
FILE: docs/source/index.rst
================================================
.. _index:

Welcome to the Maigret docs!
============================

**Maigret** is an easy-to-use and powerful OSINT tool for collecting a dossier on a person by a username (alias) only.

This is achieved by checking for accounts on a huge number of sites and gathering all the available information from web pages.

The project's main goal is to give OSINT researchers and pentesters a **universal tool** to get maximum information
about a person of interest by a username and to integrate it with other tools in automation pipelines.

.. warning::
   **This tool is intended for educational and lawful purposes only.**
   The developers do not endorse or encourage any illegal activities or misuse of this tool.
   Regulations regarding the collection and use of personal data vary by country and region,
   including but not limited to GDPR in the EU, CCPA in the USA, and similar laws worldwide.

   It is your sole responsibility to ensure that your use of this tool complies with all applicable laws
   and regulations in your jurisdiction. Any illegal use of this tool is strictly prohibited,
   and you are fully accountable for your actions.

   The authors and developers of this tool bear no responsibility for any misuse
   or unlawful activities conducted by its users.

You may be interested in:
-------------------------
- :doc:`Quick start <quick-start>`
- :doc:`Usage examples <usage-examples>`
- :doc:`Command line options <command-line-options>`
- :doc:`Features list <features>`

.. toctree::
   :hidden:
   :caption: Sections

   quick-start
   installation
   usage-examples
   command-line-options
   features
   philosophy
   supported-identifier-types
   tags
   settings
   development


================================================
FILE: docs/source/installation.rst
================================================
.. _installation:

Installation
============

Maigret can be installed using pip or Docker, or simply launched from the cloned repo.
It is also available online via the `official Telegram bot <https://t.me/osint_maigret_bot>`_;
the bot's source code is `available on GitHub <https://github.com/soxoj/maigret-tg-bot>`_.

Windows Standalone EXE-binaries
-------------------------------

Standalone EXE-binaries for Windows are located in the `Releases section <https://github.com/soxoj/maigret/releases>`_ of the GitHub repository.

Currently, a new binary is created automatically after each commit to the **main** and **dev** branches.

Video guide on how to run it: https://youtu.be/qIgwTZOmMmM.


Cloud Shells and Jupyter notebooks
----------------------------------

In case you don't want to install Maigret locally, you can use cloud shells and Jupyter notebooks.
Press one of the buttons below and follow the instructions to launch it in your browser.

.. image:: https://user-images.githubusercontent.com/27065646/92304704-8d146d80-ef80-11ea-8c29-0deaabb1c702.png
   :target: https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/soxoj/maigret&tutorial=README.md
   :alt: Open in Cloud Shell

.. image:: https://replit.com/badge/github/soxoj/maigret
   :target: https://repl.it/github/soxoj/maigret
   :alt: Run on Replit
   :height: 50

.. image:: https://colab.research.google.com/assets/colab-badge.svg
   :target: https://colab.research.google.com/gist/soxoj/879b51bc3b2f8b695abb054090645000/maigret-collab.ipynb
   :alt: Open In Colab
   :height: 45

.. image:: https://mybinder.org/badge_logo.svg
   :target: https://mybinder.org/v2/gist/soxoj/9d65c2f4d3bec5dd25949197ea73cf3a/HEAD
   :alt: Open In Binder
   :height: 45

Local installation from PyPi
----------------------------

Please note that the sites database in the PyPI package may be outdated.
If you encounter frequent false positive results, we recommend installing the latest development version from GitHub instead.

.. note::
   Python 3.10 or higher and pip are required; **Python 3.11 is recommended.**

.. code-block:: bash

   # install from pypi
   pip3 install maigret

   # usage
   maigret username

Development version (GitHub)
----------------------------

.. code-block:: bash

   git clone https://github.com/soxoj/maigret && cd maigret
   pip3 install .

   # OR
   pip3 install git+https://github.com/soxoj/maigret.git

   # usage
   maigret username

   # OR use poetry in case you plan to develop Maigret
   pip3 install poetry
   poetry run maigret

Docker
------

.. code-block:: bash

   # official image of the development version, updated from the github repo
   docker pull soxoj/maigret

   # usage
   docker run -v /mydir:/app/reports soxoj/maigret:latest username --html

   # manual build
   docker build -t maigret .


================================================
FILE: docs/source/philosophy.rst
================================================
.. _philosophy:

Philosophy
==========

TL;DR: Username => Dossier

Maigret is designed to gather all the available information about a person by their username.

What kind of information is this? First, links to the person's accounts. Second, all the machine-extractable
pieces of info, such as other usernames, full name, URLs to the person's images, birthday, location (country,
city, etc.), and gender.

All this information forms a dossier, but it is also useful for other tools and analytical purposes.
Each collected piece of data has a label of a certain format (for example, ``follower_count`` for the number
of subscribers or ``created_at`` for account creation time) so that it can be parsed and analyzed by various
systems and stored in databases.


================================================
FILE: docs/source/quick-start.rst
================================================
.. _quick-start:

Quick start
===========

After :doc:`installing Maigret <installation>`, you can begin searching by providing one or more usernames to look up:

``maigret username1 username2 ...``

Maigret will search for accounts with the specified usernames across a vast number of websites. It will provide you with a list 
of URLs to any discovered accounts, along with relevant information extracted from those profiles.

.. image:: maigret_screenshot.png
   :alt: Maigret search results screenshot
   :align: center


================================================
FILE: docs/source/settings.rst
================================================
.. _settings:

Settings
==============

.. warning::
   The settings system is under development and may be subject to change.

Options are also configurable through settings files. See
`settings JSON file <https://github.com/soxoj/maigret/blob/main/maigret/resources/settings.json>`_
for the list of currently supported options.

On startup, Maigret tries to load configuration from the following sources, in exactly this order:

.. code-block:: console

  # relative path, based on installed package path
  resources/settings.json

  # absolute path, configuration file in home directory
  ~/.maigret/settings.json

  # relative path, based on current working directory
  settings.json

A missing file is not an error.
If a later settings file contains an option that is already set,
that option is overwritten. This makes it possible to keep
custom configurations for different users and directories.
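
The layering described above amounts to merging JSON files in order, with later files winning. A minimal sketch, assuming flat JSON objects: ``load_settings`` is a hypothetical helper, and the option names ``timeout`` and ``retries_count`` are made up for the demonstration.

.. code-block:: python

    import json
    import tempfile
    from pathlib import Path


    def load_settings(paths):
        """Merge settings files in order; later files override earlier ones."""
        settings = {}
        for path in paths:
            path = Path(path).expanduser()
            if not path.is_file():
                continue  # a missing file is not an error
            with path.open(encoding="utf-8") as f:
                settings.update(json.load(f))
        return settings


    # Temporary files stand in for the package-level and home-directory settings.
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp) / "package_settings.json"
        home = Path(tmp) / "home_settings.json"
        base.write_text(json.dumps({"timeout": 30, "retries_count": 1}))
        home.write_text(json.dumps({"timeout": 10}))
        merged = load_settings([base, home, Path(tmp) / "missing.json"])

    print(merged)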


================================================
FILE: docs/source/supported-identifier-types.rst
================================================
.. _supported-identifier-types:

Supported identifier types
==========================

Maigret can search not only by ordinary usernames, but also by certain common identifiers. Here is the list of all currently supported identifiers.

- **gaia_id** - Google internal numeric user identifier; it used to appear in Google Plus account URLs.
- **steam_id** - Steam internal numeric user identifier.
- **wikimapia_uid** - Wikimapia.org internal numeric user identifier.
- **uidme_uguid** - uID.me internal numeric user identifier.
- **yandex_public_id** - Yandex sites internal letter-based user identifier. See also: `YaSeeker <https://github.com/HowToFind-bot/YaSeeker>`_.
- **vk_id** - VK.com internal numeric user identifier.
- **ok_id** - OK.ru internal numeric user identifier.
- **yelp_userid** - Yelp internal user identifier.


================================================
FILE: docs/source/tags.rst
================================================
.. _tags:

Tags
====

Tags allow you to select a subset of sites from the large Maigret DB for a search.

.. warning::
   Tags markup is still not stable.

There are several types of tags:

1. **Country codes**: ``us``, ``jp``, ``br``... (`ISO 3166-1 alpha-2 <https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2>`_). These tags reflect the site language and regional origin of its users and are then used to locate the owner of a username. If the regional origin is difficult to establish or a site is positioned as worldwide, `no country code is given`. There could be multiple country code tags for one site.

2. **Site engines**. Most of them are forum engines now: ``uCoz``, ``vBulletin``, ``XenForo``, etc. The full list of engines is stored in the Maigret database.

3. **Sites' subject/type and interests of its users**. For now, the full list of "standard" tags is `available only in the source code <https://github.com/soxoj/maigret/blob/main/maigret/sites.py#L13>`_.

Usage
-----
``--tags us,jp`` -- search on US and Japanese sites (actually marked as such in the Maigret database)

``--tags coding`` -- search on sites related to software development.

``--tags ucoz`` -- search on uCoz sites only (mostly CIS countries)
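
Conceptually, tag selection is a filter over site records. The sketch below uses a made-up three-site database; ``filter_by_tags`` is a hypothetical function, and matching the engine name case-insensitively alongside ordinary tags is an assumption drawn from the ``--tags ucoz`` example.

.. code-block:: python

    # Hypothetical records; names, tags, and engines are made up.
    SITES = {
        "CodeSite": {"tags": ["coding"]},
        "JapanArt": {"tags": ["jp", "art"]},
        "UcozForum": {"tags": ["ru"], "engine": "uCoz"},
    }


    def filter_by_tags(sites, tags):
        """Select sites that carry at least one of the requested tags."""
        wanted = {tag.lower() for tag in tags}
        selected = []
        for name, site in sites.items():
            labels = {tag.lower() for tag in site.get("tags", [])}
            # Assumption: the engine name also counts as a tag.
            if "engine" in site:
                labels.add(site["engine"].lower())
            if labels & wanted:
                selected.append(name)
        return selected


    print(filter_by_tags(SITES, ["us", "jp"]))
    print(filter_by_tags(SITES, ["ucoz"]))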


================================================
FILE: docs/source/usage-examples.rst
================================================
.. _usage-examples:

Usage examples
==============

You can use Maigret as:

- a command line tool: the initial and default mode
- a :ref:`web interface <web-interface>`: view the graph with results and download all report formats on a single page
- a library: integrate Maigret into your own project

Use Cases
---------


1. Search for accounts with username ``machine42`` on top 500 sites (by default, according to Alexa rank) from the Maigret DB.

.. code-block:: console

  maigret machine42

2. Search for accounts with username ``machine42`` on **all sites** from the Maigret DB.

.. code-block:: console

  maigret machine42 -a

.. note::
   Maigret will search for accounts on a huge number of sites,
   and some of them may return false positive results. At the moment, we are working on autorepair mode to deliver 
   the most accurate results. 
   
   If you experience many false positives, you can do the following:

   - Install the latest development version of Maigret from GitHub
   - Run Maigret with the ``--self-check`` flag and agree to disable the problematic sites

3. Search for accounts with username ``machine42`` and generate HTML and PDF reports.

.. code-block:: console

  maigret machine42 -HP

or

.. code-block:: console

  maigret machine42 -a --html --pdf


4. Search for accounts with username ``machine42`` on Facebook only.

.. code-block:: console

  maigret machine42 --site Facebook

5. Extract information from the Steam page by URL and start a search for accounts with found username ``machine42``.

.. code-block:: console

  maigret --parse https://steamcommunity.com/profiles/76561199113454789 

6. Search for accounts with username ``machine42`` only on US and Japanese sites.

.. code-block:: console

  maigret machine42 --tags us,jp

7. Search for accounts with username ``machine42`` only on sites related to software development.

.. code-block:: console

  maigret machine42 --tags coding

8. Search for accounts with username ``machine42`` on uCoz sites only (mostly CIS countries).

.. code-block:: console

  maigret machine42 --tags ucoz



================================================
FILE: maigret/__init__.py
================================================
"""Maigret"""

__title__ = 'Maigret'
__package__ = 'maigret'
__author__ = 'Soxoj'
__author_email__ = 'soxoj@protonmail.com'


from .__version__ import __version__
from .checking import maigret as search
from .maigret import main as cli
from .sites import MaigretEngine, MaigretSite, MaigretDatabase
from .notify import QueryNotifyPrint as Notifier


================================================
FILE: maigret/__main__.py
================================================
#! /usr/bin/env python3

"""
Maigret entrypoint
"""

import asyncio

from .maigret import main

if __name__ == "__main__":
    asyncio.run(main())


================================================
FILE: maigret/__version__.py
================================================
"""Maigret version file"""

__version__ = '0.5.0'


================================================
FILE: maigret/activation.py
================================================
import json
from http.cookiejar import MozillaCookieJar
from http.cookies import Morsel

from aiohttp import CookieJar


class ParsingActivator:
    @staticmethod
    def twitter(site, logger, cookies={}):
        headers = dict(site.headers)
        # Drop the stale token if present; pop() avoids a KeyError when it is missing
        headers.pop("x-guest-token", None)
        import requests

        r = requests.post(site.activation["url"], headers=headers)
        logger.info(r)
        j = r.json()
        guest_token = j[site.activation["src"]]
        site.headers["x-guest-token"] = guest_token

    @staticmethod
    def vimeo(site, logger, cookies={}):
        headers = dict(site.headers)
        if "Authorization" in headers:
            del headers["Authorization"]
        import requests

        r = requests.get(site.activation["url"], headers=headers)
        logger.debug(f"Vimeo viewer activation: {json.dumps(r.json(), indent=4)}")
        jwt_token = r.json()["jwt"]
        site.headers["Authorization"] = "jwt " + jwt_token

    @staticmethod
    def spotify(site, logger, cookies={}):
        headers = dict(site.headers)
        if "Authorization" in headers:
            del headers["Authorization"]
        import requests

        r = requests.get(site.activation["url"])
        bearer_token = r.json()["accessToken"]
        site.headers["authorization"] = f"Bearer {bearer_token}"

    @staticmethod
    def weibo(site, logger):
        headers = dict(site.headers)
        import requests

        session = requests.Session()
        # 1 stage: get the redirect URL
        r = session.get(
            "https://weibo.com/clairekuo", headers=headers, allow_redirects=False
        )
        logger.debug(
            f"1 stage: {'success' if r.status_code == 302 else 'no 302 redirect, fail!'}"
        )
        location = r.headers.get("Location")

        # 2 stage: go to passport visitor page
        headers["Referer"] = location
        r = session.get(location, headers=headers)
        logger.debug(
            f"2 stage: {'success' if r.status_code == 200 else 'no 200 response, fail!'}"
        )

        # 3 stage: gen visitor token
        headers["Referer"] = location
        r = session.post(
            "https://passport.weibo.com/visitor/genvisitor2",
            headers=headers,
            data={'cb': 'visitor_gray_callback', 'tid': '', 'from': 'weibo'},
        )
        cookies = r.headers.get('set-cookie')
        logger.debug(
            f"3 stage: {'success' if r.status_code == 200 and cookies else 'no 200 response and cookies, fail!'}"
        )
        site.headers["Cookie"] = cookies


def import_aiohttp_cookies(cookiestxt_filename):
    cookies_obj = MozillaCookieJar(cookiestxt_filename)
    cookies_obj.load(ignore_discard=True, ignore_expires=True)

    cookies = CookieJar()

    cookies_list = []
    for domain in cookies_obj._cookies.values():
        for key, cookie in list(domain.values())[0].items():
            c = Morsel()
            c.set(key, cookie.value, cookie.value)
            c["domain"] = cookie.domain
            c["path"] = cookie.path
            cookies_list.append((key, c))

    cookies.update_cookies(cookies_list)

    return cookies


================================================
FILE: maigret/checking.py
================================================
# Standard library imports
import ast
import asyncio
import logging
import random
import re
import ssl
import sys
from typing import Dict, List, Optional, Tuple
from urllib.parse import quote

# Third party imports
import aiodns
from alive_progress import alive_bar
from aiohttp import ClientSession, TCPConnector, http_exceptions
from aiohttp.client_exceptions import ClientConnectorError, ServerDisconnectedError
from python_socks import _errors as proxy_errors
from socid_extractor import extract

try:
    from mock import Mock
except ImportError:
    from unittest.mock import Mock

# Local imports
from . import errors
from .activation import ParsingActivator, import_aiohttp_cookies
from .errors import CheckError
from .executors import AsyncioQueueGeneratorExecutor
from .result import MaigretCheckResult, MaigretCheckStatus
from .sites import MaigretDatabase, MaigretSite
from .types import QueryOptions, QueryResultWrapper
from .utils import ascii_data_display, get_random_user_agent


SUPPORTED_IDS = (
    "username",
    "yandex_public_id",
    "gaia_id",
    "vk_id",
    "ok_id",
    "wikimapia_uid",
    "steam_id",
    "uidme_uguid",
    "yelp_userid",
)

BAD_CHARS = "#"


class CheckerBase:
    pass


class SimpleAiohttpChecker(CheckerBase):
    def __init__(self, *args, **kwargs):
        self.proxy = kwargs.get('proxy')
        self.cookie_jar = kwargs.get('cookie_jar')
        self.logger = kwargs.get('logger', Mock())
        self.url = None
        self.headers = None
        self.allow_redirects = True
        self.timeout = 0
        self.method = 'get'

    def prepare(self, url, headers=None, allow_redirects=True, timeout=0, method='get'):
        self.url = url
        self.headers = headers
        self.allow_redirects = allow_redirects
        self.timeout = timeout
        self.method = method
        return None

    async def close(self):
        pass

    async def _make_request(
        self, session, url, headers, allow_redirects, timeout, method, logger
    ) -> Tuple[Optional[str], int, Optional[CheckError]]:
        try:
            request_method = session.get if method == 'get' else session.head
            async with request_method(
                url=url,
                headers=headers,
                allow_redirects=allow_redirects,
                timeout=timeout,
            ) as response:
                status_code = response.status
                response_content = await response.content.read()
                charset = response.charset or "utf-8"
                decoded_content = response_content.decode(charset, "ignore")

                error = CheckError("Connection lost") if status_code == 0 else None
                logger.debug(decoded_content)

                return decoded_content, status_code, error

        except asyncio.TimeoutError as e:
            return None, 0, CheckError("Request timeout", str(e))
        except ClientConnectorError as e:
            return None, 0, CheckError("Connecting failure", str(e))
        except ServerDisconnectedError as e:
            return None, 0, CheckError("Server disconnected", str(e))
        except http_exceptions.BadHttpMessage as e:
            return None, 0, CheckError("HTTP", str(e))
        except proxy_errors.ProxyError as e:
            return None, 0, CheckError("Proxy", str(e))
        except KeyboardInterrupt:
            return None, 0, CheckError("Interrupted")
        except ssl.SSLError as e:
            # ssl.SSLCertVerificationError is a subclass of ssl.SSLError,
            # so a single clause covers both and needs no version check
            return None, 0, CheckError("SSL", str(e))
        except Exception as e:
            logger.debug(e, exc_info=True)
            return None, 0, CheckError("Unexpected", str(e))

    async def check(self) -> Tuple[str, int, Optional[CheckError]]:
        from aiohttp_socks import ProxyConnector

        connector = (
            ProxyConnector.from_url(self.proxy)
            if self.proxy
            else TCPConnector(ssl=False)
        )
        connector.verify_ssl = False

        async with ClientSession(
            connector=connector,
            trust_env=True,
            # TODO: tests
            cookie_jar=self.cookie_jar if self.cookie_jar else None,
        ) as session:
            html_text, status_code, error = await self._make_request(
                session,
                self.url,
                self.headers,
                self.allow_redirects,
                self.timeout,
                self.method,
                self.logger,
            )

            if error and str(error) == "Invalid proxy response":
                self.logger.debug(error, exc_info=True)

            return str(html_text) if html_text else '', status_code, error


class ProxiedAiohttpChecker(SimpleAiohttpChecker):
    def __init__(self, *args, **kwargs):
        # Reuse the parent initializer so that url, headers, timeout and the
        # other request attributes get their defaults even if prepare()
        # is never called before check().
        super().__init__(*args, **kwargs)


class AiodnsDomainResolver(CheckerBase):
    if sys.platform == 'win32':  # Temporary workaround for Windows
        asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())

    def __init__(self, *args, **kwargs):
        loop = asyncio.get_event_loop()
        self.logger = kwargs.get('logger', Mock())
        self.resolver = aiodns.DNSResolver(loop=loop)

    def prepare(self, url, headers=None, allow_redirects=True, timeout=0, method='get'):
        self.url = url
        return None

    async def check(self) -> Tuple[str, int, Optional[CheckError]]:
        status = 404
        error = None
        text = ''

        try:
            res = await self.resolver.query(self.url, 'A')
            text = str(res[0].host)
            status = 200
        except aiodns.error.DNSError:
            pass
        except Exception as e:
            self.logger.error(e, exc_info=True)
            error = CheckError('DNS resolve error', str(e))

        return text, status, error


class CheckerMock:
    def __init__(self, *args, **kwargs):
        pass

    def prepare(self, url, headers=None, allow_redirects=True, timeout=0, method='get'):
        return None

    async def check(self) -> Tuple[str, int, Optional[CheckError]]:
        await asyncio.sleep(0)
        return '', 0, None

    async def close(self):
        return


# TODO: move to separate class
def detect_error_page(
    html_text, status_code, fail_flags, ignore_403
) -> Optional[CheckError]:
    # Detect service restrictions such as a country restriction
    for flag, msg in fail_flags.items():
        if flag in html_text:
            return CheckError("Site-specific", msg)

    # Detect common restrictions such as provider censorship and bot protection
    err = errors.detect(html_text)
    if err:
        return err

    # Detect common site errors
    if status_code == 403 and not ignore_403:
        return CheckError("Access denied", "403 status code, use proxy/vpn")

    elif status_code >= 500:
        return CheckError("Server", f"{status_code} status code")

    return None


def debug_response_logging(url, html_text, status_code, check_error):
    with open("debug.log", "a") as f:
        status = status_code or "No response"
        f.write(f"url: {url}\nerror: {check_error}\nr: {status}\n")
        if html_text:
            f.write(f"code: {status}\nresponse: {str(html_text)}\n")


def process_site_result(
    response, query_notify, logger, results_info: QueryResultWrapper, site: MaigretSite
):
    if not response:
        return results_info

    fulltags = site.tags

    # Retrieve other site information again
    username = results_info["username"]
    is_parsing_enabled = results_info["parsing_enabled"]
    url = results_info.get("url_user")
    logger.info(url)

    status = results_info.get("status")
    if status is not None:
        # We have already determined the user doesn't exist here
        return results_info

    # Get the expected check type
    check_type = site.check_type

    # TODO: refactor
    if not response:
        logger.error(f"No response for {site.name}")
        return results_info

    html_text, status_code, check_error = response

    # TODO: add elapsed request time counting
    response_time = None

    if logger.level == logging.DEBUG:
        debug_response_logging(url, html_text, status_code, check_error)

    # additional check for errors
    if status_code and not check_error:
        check_error = detect_error_page(
            html_text, status_code, site.errors_dict, site.ignore403
        )

    # parsing activation
    is_need_activation = any(
        [s for s in site.activation.get("marks", []) if s in html_text]
    )

    if site.activation and html_text and is_need_activation:
        logger.debug(f"Activation for {site.name}")
        method = site.activation["method"]
        try:
            activate_fun = getattr(ParsingActivator(), method)
            # TODO: async call
            activate_fun(site, logger)
        except AttributeError as e:
            logger.warning(
                f"Activation method {method} for site {site.name} not found!",
                exc_info=True,
            )
        except Exception as e:
            logger.warning(
                f"Failed activation {method} for site {site.name}: {str(e)}",
                exc_info=True,
            )
        # TODO: temporary check error

    site_name = site.pretty_name
    # Presence flags: if the site defines none, any non-empty response
    # counts as presence detected.
    presense_flags = site.presense_strs
    is_presense_detected = False

    if html_text:
        if not presense_flags:
            is_presense_detected = True
            site.stats["presense_flag"] = None
        else:
            for presense_flag in presense_flags:
                if presense_flag in html_text:
                    is_presense_detected = True
                    site.stats["presense_flag"] = presense_flag
                    logger.debug(presense_flag)
                    break

    def build_result(status, **kwargs):
        return MaigretCheckResult(
            username,
            site_name,
            url,
            status,
            query_time=response_time,
            tags=fulltags,
            **kwargs,
        )

    if check_error:
        logger.warning(check_error)
        result = MaigretCheckResult(
            username,
            site_name,
            url,
            MaigretCheckStatus.UNKNOWN,
            query_time=response_time,
            error=check_error,
            context=str(check_error),
            tags=fulltags,
        )
    elif check_type == "message":
        # Checks if the error message is in the HTML
        is_absence_detected = any(
            [(absence_flag in html_text) for absence_flag in site.absence_strs]
        )
        if not is_absence_detected and is_presense_detected:
            result = build_result(MaigretCheckStatus.CLAIMED)
        else:
            result = build_result(MaigretCheckStatus.AVAILABLE)
    elif check_type == "status_code":
        # Checks if the status code of the response is 2XX
        if 200 <= status_code < 300:
            result = build_result(MaigretCheckStatus.CLAIMED)
        else:
            result = build_result(MaigretCheckStatus.AVAILABLE)
    elif check_type == "response_url":
        # For this detection method, we have turned off the redirect.
        # So, there is no need to check the response URL: it will always
        # match the request.  Instead, we will ensure that the response
        # code indicates that the request was successful (i.e. no 404, or
        # forward to some odd redirect).
        if 200 <= status_code < 300 and is_presense_detected:
            result = build_result(MaigretCheckStatus.CLAIMED)
        else:
            result = build_result(MaigretCheckStatus.AVAILABLE)
    else:
        # It should be impossible to ever get here...
        raise ValueError(
            f"Unknown check type '{check_type}' for site '{site.name}'"
        )

    extracted_ids_data = {}

    if is_parsing_enabled and result.status == MaigretCheckStatus.CLAIMED:
        extracted_ids_data = extract_ids_data(html_text, logger, site)
        if extracted_ids_data:
            new_usernames = parse_usernames(extracted_ids_data, logger)
            results_info = update_results_info(
                results_info, extracted_ids_data, new_usernames
            )
            result.ids_data = extracted_ids_data

    # Save status of request
    results_info["status"] = result

    # Save results from request
    results_info["http_status"] = status_code
    results_info["is_similar"] = site.similar_search
    # results_site['response_text'] = html_text
    results_info["rank"] = site.alexa_rank
    return results_info


def make_site_result(
    site: MaigretSite, username: str, options: QueryOptions, logger, *args, **kwargs
) -> QueryResultWrapper:
    results_site: QueryResultWrapper = {}

    # Record URL of main site and username
    results_site["site"] = site
    results_site["username"] = username
    results_site["parsing_enabled"] = options["parsing"]
    results_site["url_main"] = site.url_main
    results_site["cookies"] = (
        options.get("cookie_jar")
        and options["cookie_jar"].filter_cookies(site.url_main)
        or None
    )

    headers = {
        "User-Agent": get_random_user_agent(),
        # tell server that we want to close connection after request
        "Connection": "close",
    }

    headers.update(site.headers)

    if "url" not in site.__dict__:
        logger.error("No URL for site %s", site.name)

    if kwargs.get('retry') and hasattr(site, "mirrors"):
        site.url_main = random.choice(site.mirrors)
        logger.info(f"Use {site.url_main} as a main url of site {site}")

    # URL of user on site (if it exists)
    url = site.url.format(
        urlMain=site.url_main, urlSubpath=site.url_subpath, username=quote(username)
    )

    # workaround to prevent slash errors
    url = re.sub("(?<!:)/+", "/", url)

    # pick the checker registered for the site protocol ('' means clearweb)
    checker = options["checkers"][site.protocol]

    # site check is disabled
    if site.disabled and not options['forced']:
        logger.debug(f"Site {site.name} is disabled, skipping...")
        results_site["status"] = MaigretCheckResult(
            username,
            site.name,
            url,
            MaigretCheckStatus.ILLEGAL,
            error=CheckError("Check is disabled"),
        )
    # current username type could not be applied
    elif site.type != options["id_type"]:
        results_site["status"] = MaigretCheckResult(
            username,
            site.name,
            url,
            MaigretCheckStatus.ILLEGAL,
            error=CheckError('Unsupported identifier type', f'Want "{site.type}"'),
        )
    # username is not allowed.
    elif site.regex_check and re.search(site.regex_check, username) is None:
        results_site["status"] = MaigretCheckResult(
            username,
            site.name,
            url,
            MaigretCheckStatus.ILLEGAL,
            error=CheckError(
                'Unsupported username format', f'Want "{site.regex_check}"'
            ),
        )
        results_site["url_user"] = ""
        results_site["http_status"] = ""
        results_site["response_text"] = ""
        # query_notify.update(results_site["status"])
    else:
        # URL of user on site (if it exists)
        results_site["url_user"] = url
        url_probe = site.url_probe
        if url_probe is None:
            # Probe URL is normal one seen by people out on the web.
            url_probe = url
        else:
            # There is a special URL for probing existence separate
            # from where the user profile normally can be found.
            url_probe = url_probe.format(
                urlMain=site.url_main,
                urlSubpath=site.url_subpath,
                username=username,
            )

        for k, v in site.get_params.items():
            url_probe += f"&{k}={v}"

        if site.check_type == "status_code" and site.request_head_only:
            # In most cases when we are detecting by status code,
            # it is not necessary to get the entire body:  we can
            # detect fine with just the HEAD response.
            request_method = 'head'
        else:
            # Either this detect method needs the content associated
            # with the GET response, or this specific website will
            # not respond properly unless we request the whole page.
            request_method = 'get'

        if site.check_type == "response_url":
            # Site forwards request to a different URL if username not
            # found.  Disallow the redirect so we can capture the
            # http status from the original URL request.
            allow_redirects = False
        else:
            # Allow whatever redirect that the site wants to do.
            # The final result of the request will be what is available.
            allow_redirects = True

        future = checker.prepare(
            method=request_method,
            url=url_probe,
            headers=headers,
            allow_redirects=allow_redirects,
            timeout=options['timeout'],
        )

        # Store future request object in the results object
        results_site["future"] = future

    results_site["checker"] = checker

    return results_site


async def check_site_for_username(
    site, username, options: QueryOptions, logger, query_notify, *args, **kwargs
) -> Tuple[str, QueryResultWrapper]:
    default_result = make_site_result(
        site, username, options, logger, retry=kwargs.get('retry')
    )
    # future = default_result.get("future")
    # if not future:
    # return site.name, default_result

    checker = default_result.get("checker")
    if not checker:
        logger.error(f"No checker for {site.name}")
        return site.name, default_result

    response = await checker.check()

    response_result = process_site_result(
        response, query_notify, logger, default_result, site
    )

    query_notify.update(response_result['status'], site.similar_search)

    return site.name, response_result


async def debug_ip_request(checker, logger):
    checker.prepare(url="https://icanhazip.com")
    ip, status, check_error = await checker.check()
    if ip:
        logger.debug(f"My IP is: {ip.strip()}")
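    elif check_error:
        logger.debug(f"IP requesting {check_error.type}: {check_error.desc}")
    else:
        # check() may return an empty body without an error object
        logger.debug("IP requesting failed: empty response")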


def get_failed_sites(results: Dict[str, QueryResultWrapper]) -> List[str]:
    sites = []
    for sitename, r in results.items():
        status = r.get('status', {})
        if status and status.error:
            if errors.is_permanent(status.error.type):
                continue
            sites.append(sitename)
    return sites


async def maigret(
    username: str,
    site_dict: Dict[str, MaigretSite],
    logger,
    query_notify=None,
    proxy=None,
    tor_proxy=None,
    i2p_proxy=None,
    timeout=3,
    is_parsing_enabled=False,
    id_type="username",
    debug=False,
    forced=False,
    max_connections=100,
    no_progressbar=False,
    cookies=None,
    retries=0,
    check_domains=False,
    *args,
    **kwargs,
) -> Dict[str, QueryResultWrapper]:
    """Main search func

    Checks for the existence of a username on the given sites.

    Keyword Arguments:
    username               -- Username string to search for.
    site_dict              -- Dictionary mapping site names to MaigretSite objects.
    logger                 -- Standard Python logger object.
    query_notify           -- Object with base type of QueryNotify().
                              This will be used to notify the caller about
                              query results.
    proxy                  -- Proxy URL used for clearweb requests.
    tor_proxy              -- Proxy URL used for sites with the 'tor' protocol.
    i2p_proxy              -- Proxy URL used for sites with the 'i2p' protocol.
    timeout                -- Time in seconds to wait before timing out request.
                              Default is 3 seconds.
    is_parsing_enabled     -- Extract additional info from account pages.
    id_type                -- Type of identifier to search.
                              Default is 'username', see all supported here:
                              https://maigret.readthedocs.io/en/latest/supported-identifier-types.html
    forced                 -- Check sites even if they are marked as disabled.
    max_connections        -- Maximum number of concurrent connections allowed.
                              Default is 100.
    no_progressbar         -- Disable the ASCII progress bar shown during scanning.
    cookies                -- Filename of a cookie jar file to use for each request.
    retries                -- Number of extra attempts for sites that failed
                              with a non-permanent error. Default is 0.
    check_domains          -- Enable DNS resolving for sites with the 'dns' protocol.

    Return Value:
    Dictionary containing results from report. Key of dictionary is the name
    of the social network site, and the value is another dictionary with
    the following keys:
        url_main:      URL of main site.
        url_user:      URL of user on site (if account exists).
        status:        MaigretCheckResult object indicating results of test
                       for account existence.
        http_status:   HTTP status code of query which checked for existence on
                       site.
        response_text: Text that came back from request.  May be None if
                       there was an HTTP error when checking for existence.
    """

    # notify caller that we are starting the query.
    if not query_notify:
        query_notify = Mock()

    query_notify.start(username, id_type)

    cookie_jar = None
    if cookies:
        logger.debug(f"Using cookies jar file {cookies}")
        cookie_jar = import_aiohttp_cookies(cookies)

    clearweb_checker = SimpleAiohttpChecker(
        proxy=proxy, cookie_jar=cookie_jar, logger=logger
    )

    # TODO
    tor_checker = CheckerMock()
    if tor_proxy:
        tor_checker = ProxiedAiohttpChecker(  # type: ignore
            proxy=tor_proxy, cookie_jar=cookie_jar, logger=logger
        )

    # TODO
    i2p_checker = CheckerMock()
    if i2p_proxy:
        i2p_checker = ProxiedAiohttpChecker(  # type: ignore
            proxy=i2p_proxy, cookie_jar=cookie_jar, logger=logger
        )

    # TODO
    dns_checker = CheckerMock()
    if check_domains:
        dns_checker = AiodnsDomainResolver(logger=logger)  # type: ignore

    if logger.level == logging.DEBUG:
        await debug_ip_request(clearweb_checker, logger)

    # setup parallel executor
    executor = AsyncioQueueGeneratorExecutor(
        logger=logger,
        in_parallel=max_connections,
        timeout=timeout + 0.5,
        *args,
        **kwargs,
    )

    # make options objects for all the requests
    options: QueryOptions = {}
    options["cookies"] = cookie_jar
    options["checkers"] = {
        '': clearweb_checker,
        'tor': tor_checker,
        'dns': dns_checker,
        'i2p': i2p_checker,
    }
    options["parsing"] = is_parsing_enabled
    options["timeout"] = timeout
    options["id_type"] = id_type
    options["forced"] = forced

    # results from analysis of all sites
    all_results: Dict[str, QueryResultWrapper] = {}

    sites = list(site_dict.keys())

    attempts = retries + 1
    while attempts:
        tasks_dict = {}

        for sitename, site in site_dict.items():
            if sitename not in sites:
                continue
            default_result: QueryResultWrapper = {
                'site': site,
                'status': MaigretCheckResult(
                    username,
                    sitename,
                    '',
                    MaigretCheckStatus.UNKNOWN,
                    error=CheckError('Request failed'),
                ),
            }
            tasks_dict[sitename] = (
                check_site_for_username,
                [site, username, options, logger, query_notify],
                {
                    'default': (sitename, default_result),
                    'retry': retries - attempts + 1,
                },
            )

        cur_results = []
        with alive_bar(
            len(tasks_dict), title="Searching", force_tty=True, disable=no_progressbar
        ) as progress:
            async for result in executor.run(tasks_dict.values()):
                cur_results.append(result)
                progress()

        all_results.update(cur_results)

        # rerun for failed sites
        sites = get_failed_sites(dict(cur_results))
        attempts -= 1

        if not sites:
            break

        if attempts:
            query_notify.warning(
                f'Restarting checks for {len(sites)} sites... ({attempts} attempts left)'
            )

    # closing http client session
    await clearweb_checker.close()
    await tor_checker.close()
    await i2p_checker.close()

    # notify caller that all queries are finished
    query_notify.finish()

    return all_results


def timeout_check(value):
    """Check Timeout Argument.

    Checks timeout for validity.

    Keyword Arguments:
    value                  -- Time in seconds to wait before timing out request.

    Return Value:
    Floating point number representing the time (in seconds) that should be
    used for the timeout.

    NOTE:  Will raise an exception if the timeout is invalid.
    """
    from argparse import ArgumentTypeError

    try:
        timeout = float(value)
    except ValueError:
        raise ArgumentTypeError(f"Timeout '{value}' must be a number.")
    if timeout <= 0:
        raise ArgumentTypeError(f"Timeout '{value}' must be greater than 0.0s.")
    return timeout


async def site_self_check(
    site: MaigretSite,
    logger: logging.Logger,
    semaphore,
    db: MaigretDatabase,
    silent=False,
    proxy=None,
    tor_proxy=None,
    i2p_proxy=None,
    skip_errors=False,
    cookies=None,
):
    changes = {
        "disabled": False,
    }

    check_data = [
        (site.username_claimed, MaigretCheckStatus.CLAIMED),
        (site.username_unclaimed, MaigretCheckStatus.AVAILABLE),
    ]

    logger.info(f"Checking {site.name}...")

    for username, status in check_data:
        async with semaphore:
            results_dict = await maigret(
                username=username,
                site_dict={site.name: site},
                logger=logger,
                timeout=30,
                id_type=site.type,
                forced=True,
                no_progressbar=True,
                retries=1,
                proxy=proxy,
                tor_proxy=tor_proxy,
                i2p_proxy=i2p_proxy,
                cookies=cookies,
            )

            # don't disable entries with other ids types
            # TODO: make normal checking
            if site.name not in results_dict:
                logger.info(results_dict)
                changes["disabled"] = True
                continue

            logger.debug(results_dict)

            result = results_dict[site.name]["status"]

        if result.error and 'Cannot connect to host' in result.error.desc:
            changes["disabled"] = True

        site_status = result.status

        if site_status != status:
            if site_status == MaigretCheckStatus.UNKNOWN:
                msgs = site.absence_strs
                etype = site.check_type
                logger.warning(
                    f"Error while searching {username} in {site.name}: {result.context}, {msgs}, type {etype}"
                )
                # don't disable sites after the error
                # meaning that the site could be available, but returned error for the check
                # e.g. many sites protected by cloudflare and available in general
                if skip_errors:
                    pass
                # don't disable in case of available username
                elif status == MaigretCheckStatus.CLAIMED:
                    changes["disabled"] = True
            elif status == MaigretCheckStatus.CLAIMED:
                logger.warning(
                    f"Not found `{username}` in {site.name}, must be claimed"
                )
                logger.info(results_dict[site.name])
                changes["disabled"] = True
            else:
                logger.warning(f"Found `{username}` in {site.name}, must be available")
                logger.info(results_dict[site.name])
                changes["disabled"] = True

    logger.info(f"Site {site.name} checking is finished")

    if changes["disabled"] != site.disabled:
        site.disabled = changes["disabled"]
        logger.info(f"Switching property 'disabled' for {site.name} to {site.disabled}")
        db.update_site(site)
        if not silent:
            action = "Disabled" if site.disabled else "Enabled"
            print(f"{action} site {site.name}...")

    # remove service tag "unchecked"
    if "unchecked" in site.tags:
        site.tags.remove("unchecked")
        db.update_site(site)

    return changes


async def self_check(
    db: MaigretDatabase,
    site_data: dict,
    logger: logging.Logger,
    silent=False,
    max_connections=10,
    proxy=None,
    tor_proxy=None,
    i2p_proxy=None,
) -> bool:
    sem = asyncio.Semaphore(max_connections)
    tasks = []
    all_sites = site_data

    def disabled_count(lst):
        return len(list(filter(lambda x: x.disabled, lst)))

    unchecked_old_count = len(
        [site for site in all_sites.values() if "unchecked" in site.tags]
    )
    disabled_old_count = disabled_count(all_sites.values())

    for _, site in all_sites.items():
        check_coro = site_self_check(
            site, logger, sem, db, silent, proxy, tor_proxy, i2p_proxy, skip_errors=True
        )
        future = asyncio.ensure_future(check_coro)
        tasks.append(future)

    if tasks:
        with alive_bar(len(tasks), title='Self-checking', force_tty=True) as progress:
            for f in asyncio.as_completed(tasks):
                await f
                progress()  # Update the progress bar

    unchecked_new_count = len(
        [site for site in all_sites.values() if "unchecked" in site.tags]
    )
    disabled_new_count = disabled_count(all_sites.values())
    total_disabled = disabled_new_count - disabled_old_count

    if total_disabled:
        if total_disabled > 0:
            message = "Disabled"
        else:
            message = "Enabled"
            total_disabled *= -1

        if not silent:
            print(
                f"{message} {total_disabled} ({disabled_old_count} => {disabled_new_count}) checked sites. "
                "Run with `--info` flag to get more information"
            )

    if unchecked_new_count != unchecked_old_count:
        print(f"Unchecked sites verified: {unchecked_old_count - unchecked_new_count}")

    return total_disabled != 0 or unchecked_new_count != unchecked_old_count


def extract_ids_data(html_text, logger, site) -> Dict:
    try:
        return extract(html_text)
    except Exception as e:
        logger.warning(f"Error while parsing {site.name}: {e}", exc_info=True)
        return {}


def parse_usernames(extracted_ids_data, logger) -> Dict:
    new_usernames = {}
    for k, v in extracted_ids_data.items():
        if "username" in k and "usernames" not in k:
            new_usernames[v] = "username"
        elif "usernames" in k:
            try:
                tree = ast.literal_eval(v)
                if isinstance(tree, list):
                    for n in tree:
                        new_usernames[n] = "username"
            except Exception as e:
                logger.warning(e)
        if k in SUPPORTED_IDS:
            new_usernames[v] = k
    return new_usernames


def update_results_info(results_info, extracted_ids_data, new_usernames):
    results_info["ids_usernames"] = new_usernames
    links = ascii_data_display(extracted_ids_data.get("links", "[]"))
    if "website" in extracted_ids_data:
        links.append(extracted_ids_data["website"])
    results_info["ids_links"] = links
    return results_info


================================================
FILE: maigret/errors.py
================================================
from typing import Dict, List, Any, Tuple

from .result import MaigretCheckResult
from .types import QueryResultWrapper


# Error obtained as the result of a completed search query
class CheckError:
    _type = 'Unknown'
    _desc = ''

    def __init__(self, typename, desc=''):
        self._type = typename
        self._desc = desc

    def __str__(self):
        if not self._desc:
            return f'{self._type} error'

        return f'{self._type} error: {self._desc}'

    @property
    def type(self):
        return self._type

    @property
    def desc(self):
        return self._desc


COMMON_ERRORS = {
    '<title>Attention Required! | Cloudflare</title>': CheckError(
        'Captcha', 'Cloudflare'
    ),
    'Please stand by, while we are checking your browser': CheckError(
        'Bot protection', 'Cloudflare'
    ),
    '<span data-translate="checking_browser">Checking your browser before accessing</span>': CheckError(
        'Bot protection', 'Cloudflare'
    ),
    'This website is using a security service to protect itself from online attacks.': CheckError(
        'Access denied', 'Cloudflare'
    ),
    '<title>Доступ ограничен</title>': CheckError('Censorship', 'Rostelecom'),
    'document.getElementById(\'validate_form_submit\').disabled=true': CheckError(
        'Captcha', 'Mail.ru'
    ),
    'Verifying your browser, please wait...<br>DDoS Protection by</font> Blazingfast.io': CheckError(
        'Bot protection', 'Blazingfast'
    ),
    '404</h1><p class="error-card__description">Мы&nbsp;не&nbsp;нашли страницу': CheckError(
        'Resolving', 'MegaFon 404 page'
    ),
    'Доступ к информационному ресурсу ограничен на основании Федерального закона': CheckError(
        'Censorship', 'MGTS'
    ),
    'Incapsula incident ID': CheckError('Bot protection', 'Incapsula'),
    'Сайт заблокирован хостинг-провайдером': CheckError(
        'Site-specific', 'Site is disabled (Beget)'
    ),
    'Generated by cloudfront (CloudFront)': CheckError('Request blocked', 'CloudFront'),
    '/cdn-cgi/challenge-platform/h/b/orchestrate/chl_page': CheckError(
        'Just a moment: bot redirect challenge', 'Cloudflare'
    ),
}

ERRORS_TYPES = {
    'Captcha': 'Try to switch to another IP address or to use service cookies',
    'Bot protection': 'Try to switch to another IP address',
    'Censorship': 'Switch to another internet service provider',
    'Request timeout': 'Try to increase timeout or to switch to another internet service provider',
    'Connecting failure': 'Try to decrease number of parallel connections (e.g. -n 10)',
}

# TODO: checking for reason
ERRORS_REASONS = {
    'Login required': 'Add authorization cookies through `--cookies-jar-file` (see cookies.txt)',
}

TEMPORARY_ERRORS_TYPES = [
    'Request timeout',
    'Unknown',
    'Request failed',
    'Connecting failure',
    'HTTP',
    'Proxy',
    'Interrupted',
    'Connection lost',
]

THRESHOLD = 3  # percent


def is_important(err_data):
    return err_data['perc'] >= THRESHOLD


def is_permanent(err_type):
    return err_type not in TEMPORARY_ERRORS_TYPES


def detect(text):
    for flag, err in COMMON_ERRORS.items():
        if flag in text:
            return err
    return None


def solution_of(err_type) -> str:
    return ERRORS_TYPES.get(err_type, '')


def extract_and_group(search_res: QueryResultWrapper) -> List[Dict[str, Any]]:
    errors_counts: Dict[str, int] = {}
    for r in search_res.values():
        if r and isinstance(r, dict) and r.get('status'):
            if not isinstance(r['status'], MaigretCheckResult):
                continue

            err = r['status'].error
            if not err:
                continue
            errors_counts[err.type] = errors_counts.get(err.type, 0) + 1

    counts = []
    for err, count in sorted(errors_counts.items(), key=lambda x: x[1], reverse=True):
        counts.append(
            {
                'err': err,
                'count': count,
                'perc': round(count / len(search_res), 2) * 100,
            }
        )

    return counts


def notify_about_errors(
    search_results: QueryResultWrapper, query_notify, show_statistics=False
) -> List[Tuple]:
    """
    Prepare error notifications in search results, text + symbol,
    to be displayed by notify object.

    Example:
    [
        ('Too many errors of type "timeout" (50.0%)', '!'),
        ('Verbose error statistics:', '-'),
    ]
    """
    results = []

    errs = extract_and_group(search_results)
    was_errs_displayed = False
    for e in errs:
        if not is_important(e):
            continue
        text = f'Too many errors of type "{e["err"]}" ({round(e["perc"],2)}%)'
        solution = solution_of(e['err'])
        if solution:
            text = '. '.join([text, solution.capitalize()])

        results.append((text, '!'))
        was_errs_displayed = True

    if show_statistics:
        results.append(('Verbose error statistics:', '-'))
        for e in errs:
            text = f'{e["err"]}: {round(e["perc"],2)}%'
            results.append((text, '!'))

    if was_errs_displayed:
        results.append(
            ('You can see detailed site check errors with a flag `--print-errors`', '-')
        )

    return results
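

# Illustrative sketch, not part of upstream maigret: a standalone version of the
# arithmetic behind extract_and_group() and is_important() above. The name
# `_sketch_error_shares` and its parameters are hypothetical and exist only for
# this example. It takes a flat list of error type names plus the total number
# of checks, and returns (type, percent, is_important) tuples, most frequent
# first. The default threshold of 3 percent mirrors THRESHOLD.
def _sketch_error_shares(error_types, total_checks, threshold=3):
    counts = {}
    for err_type in error_types:
        counts[err_type] = counts.get(err_type, 0) + 1

    shares = []
    for err_type, count in sorted(counts.items(), key=lambda x: x[1], reverse=True):
        # Share of all checks (not only of failed ones), in percent
        perc = round(count / total_checks * 100, 2)
        shares.append((err_type, perc, perc >= threshold))
    return shares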


================================================
FILE: maigret/executors.py
================================================
import asyncio
import sys
import time
from typing import Any, Iterable, List, Callable

import alive_progress
from alive_progress import alive_bar

from .types import QueryDraft


def create_task_func():
    if sys.version_info >= (3, 7):
        create_asyncio_task = asyncio.create_task
    else:
        loop = asyncio.get_event_loop()
        create_asyncio_task = loop.create_task
    return create_asyncio_task


class AsyncExecutor:
    # Deprecated: will be removed soon, don't use it
    def __init__(self, *args, **kwargs):
        self.logger = kwargs['logger']

    async def run(self, tasks: Iterable[QueryDraft]):
        start_time = time.time()
        results = await self._run(tasks)
        self.execution_time = time.time() - start_time
        self.logger.debug(f'Spent time: {self.execution_time}')
        return results

    async def _run(self, tasks: Iterable[QueryDraft]):
        await asyncio.sleep(0)


class AsyncioSimpleExecutor(AsyncExecutor):
    # Deprecated: will be removed soon, don't use it
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.semaphore = asyncio.Semaphore(kwargs.get('in_parallel', 100))

    async def _run(self, tasks: Iterable[QueryDraft]):
        async def sem_task(f, args, kwargs):
            async with self.semaphore:
                return await f(*args, **kwargs)

        futures = [sem_task(f, args, kwargs) for f, args, kwargs in tasks]
        return await asyncio.gather(*futures)


class AsyncioProgressbarExecutor(AsyncExecutor):
    # Deprecated: will be removed soon, don't use it
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    async def _run(self, tasks: Iterable[QueryDraft]):
        futures = [f(*args, **kwargs) for f, args, kwargs in tasks]
        total_tasks = len(futures)
        results = []

        # Use alive_bar for progress tracking
        with alive_bar(total_tasks, title='Searching', force_tty=True) as progress:
            # Wrap each coroutine so the bar advances as soon as it completes
            async def track_task(task):
                result = await task
                progress()  # Update progress bar once task completes
                return result

            # Use gather to run tasks concurrently and track progress
            results = await asyncio.gather(*(track_task(f) for f in futures))

        return results


class AsyncioProgressbarSemaphoreExecutor(AsyncExecutor):
    # Deprecated: will be removed soon, don't use it
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.semaphore = asyncio.Semaphore(kwargs.get('in_parallel', 1))

    async def _run(self, tasks: Iterable[QueryDraft]):
        async def _wrap_query(q: QueryDraft):
            async with self.semaphore:
                f, args, kwargs = q
                return await f(*args, **kwargs)

        async def semaphore_gather(tasks: Iterable[QueryDraft]):
            coros = [_wrap_query(q) for q in tasks]
            results = []

            # Use alive_bar correctly as a context manager
            with alive_bar(len(coros), title='Searching', force_tty=True) as progress:
                for f in asyncio.as_completed(coros):
                    results.append(await f)
                    progress()  # Update the progress bar
            return results

        return await semaphore_gather(tasks)


class AsyncioProgressbarQueueExecutor(AsyncExecutor):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.workers_count = kwargs.get('in_parallel', 10)
        self.queue = asyncio.Queue(self.workers_count)
        self.timeout = kwargs.get('timeout')
        # Pass a progress function; alive_bar by default
        self.progress_func = kwargs.get('progress_func', alive_bar)
        self.progress = None

    # TODO: tests
    async def increment_progress(self, count):
        """Update progress by calling the provided progress function."""
        if self.progress:
            if asyncio.iscoroutinefunction(self.progress):
                await self.progress(count)
            else:
                self.progress(count)
                await asyncio.sleep(0)

    # TODO: tests
    async def stop_progress(self):
        """Stop the progress tracking."""
        if hasattr(self.progress, "close") and self.progress:
            close_func = self.progress.close
            if asyncio.iscoroutinefunction(close_func):
                await close_func()
            else:
                close_func()
                await asyncio.sleep(0)

    async def worker(self):
        """Consume tasks from the queue and process them."""
        while True:
            try:
                f, args, kwargs = self.queue.get_nowait()
            except asyncio.QueueEmpty:
                return

            query_future = f(*args, **kwargs)
            query_task = create_task_func()(query_future)
            try:
                result = await asyncio.wait_for(query_task, timeout=self.timeout)
            except asyncio.TimeoutError:
                result = kwargs.get('default')

            self.results.append(result)

            if self.progress:
                await self.increment_progress(1)

            self.queue.task_done()

    async def _run(self, queries: Iterable[QueryDraft]):
        """Main runner function to execute tasks with progress tracking."""
        self.results: List[Any] = []
        queries_list = list(queries)
        min_workers = min(len(queries_list), self.workers_count)
        workers = [create_task_func()(self.worker()) for _ in range(min_workers)]

        # Initialize the progress bar
        if self.progress_func:
            with self.progress_func(
                len(queries_list), title="Searching", force_tty=True
            ) as bar:
                self.progress = bar  # Assign alive_bar's callable to self.progress

                # Add tasks to the queue
                for t in queries_list:
                    await self.queue.put(t)

                # Wait for tasks to complete
                await self.queue.join()

                # Cancel any remaining workers
                for w in workers:
                    w.cancel()

        return self.results


class AsyncioQueueGeneratorExecutor:
    # Deprecated: will be removed soon, don't use it
    def __init__(self, *args, **kwargs):
        self.workers_count = kwargs.get('in_parallel', 10)
        self.queue = asyncio.Queue()
        self.timeout = kwargs.get('timeout')
        self.logger = kwargs['logger']
        self._results = asyncio.Queue()
        self._stop_signal = object()

    async def worker(self):
        """Process tasks from the queue and put results into the results queue."""
        while True:
            task = await self.queue.get()
            if task is self._stop_signal:
                self.queue.task_done()
                break

            try:
                f, args, kwargs = task
                query_future = f(*args, **kwargs)
                query_task = create_task_func()(query_future)

                try:
                    result = await asyncio.wait_for(query_task, timeout=self.timeout)
                except asyncio.TimeoutError:
                    result = kwargs.get('default')
                await self._results.put(result)
            except Exception as e:
                self.logger.error(f"Error in worker: {e}")
            finally:
                self.queue.task_done()

    async def run(self, queries: Iterable[QueryDraft]):
        """Run workers to process queries in parallel."""
        start_time = time.time()

        # Add tasks to the queue
        for t in queries:
            await self.queue.put(t)

        # Create workers
        workers = [
            asyncio.create_task(self.worker()) for _ in range(self.workers_count)
        ]

        # Add stop signals
        for _ in range(self.workers_count):
            await self.queue.put(self._stop_signal)

        try:
            while any(not w.done() for w in workers) or not self._results.empty():
                try:
                    result = await asyncio.wait_for(self._results.get(), timeout=1)
                    yield result
                except asyncio.TimeoutError:
                    pass
        finally:
            # Ensure all workers are awaited
            await asyncio.gather(*workers)
            self.execution_time = time.time() - start_time
            self.logger.debug(f"Spent time: {self.execution_time}")
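

# Illustrative sketch, not part of upstream maigret: the worker-pool pattern of
# AsyncioProgressbarQueueExecutor reduced to plain asyncio, without a progress
# bar. The names `_sketch_run_queue` and `_sketch_query` are hypothetical and
# exist only for this example. As in the executors above, each query is an
# (f, args, kwargs) tuple, and a query that exceeds the timeout yields
# kwargs.get('default'). Simplification: the queue here is unbounded and is
# filled before any worker starts.
async def _sketch_run_queue(queries, in_parallel=10, timeout=None):
    queries = list(queries)
    queue = asyncio.Queue()
    for q in queries:
        queue.put_nowait(q)
    results = []

    async def worker():
        # Each worker drains the queue until nothing is left
        while True:
            try:
                f, args, kwargs = queue.get_nowait()
            except asyncio.QueueEmpty:
                return
            try:
                result = await asyncio.wait_for(f(*args, **kwargs), timeout=timeout)
            except asyncio.TimeoutError:
                result = kwargs.get('default')
            results.append(result)
            queue.task_done()

    # Never start more workers than there are queries
    workers_count = min(len(queries), in_parallel)
    await asyncio.gather(*(worker() for _ in range(workers_count)))
    return results


# Stand-in for a site check. It accepts `default` because kwargs are passed
# through to the query function unchanged.
async def _sketch_query(delay, default=None):
    await asyncio.sleep(delay)
    return f'done after {delay}s'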


================================================
FILE: maigret/maigret.py
================================================
"""
Maigret main module
"""

import ast
import asyncio
import logging
import os
import sys
import platform
import re
from argparse import ArgumentParser, RawDescriptionHelpFormatter
from typing import List, Tuple
import os.path as path

from socid_extractor import extract, parse

from .__version__ import __version__
from .checking import (
    timeout_check,
    SUPPORTED_IDS,
    self_check,
    BAD_CHARS,
    maigret,
)
from . import errors
from .notify import QueryNotifyPrint
from .report import (
    save_csv_report,
    save_xmind_report,
    save_html_report,
    save_pdf_report,
    generate_report_context,
    save_txt_report,
    SUPPORTED_JSON_REPORT_FORMATS,
    save_json_report,
    get_plaintext_report,
    sort_report_by_data_points,
    save_graph_report,
)
from .sites import MaigretDatabase
from .submit import Submitter
from .types import QueryResultWrapper
from .utils import get_dict_ascii_tree
from .settings import Settings
from .permutator import Permute


def extract_ids_from_page(url, logger, timeout=5) -> dict:
    results = {}
    # url, headers
    reqs: List[Tuple[str, set]] = [(url, set())]
    try:
        # temporary workaround for URL mutations MVP
        from socid_extractor import mutate_url

        reqs += list(mutate_url(url))
    except Exception as e:
        logger.warning(e)

    for req in reqs:
        url, headers = req
        print(f'Scanning webpage by URL {url}...')
        page, _ = parse(url, cookies_str='', headers=headers, timeout=timeout)
        logger.debug(page)
        info = extract(page)
        if not info:
            print('Nothing extracted')
        else:
            print(get_dict_ascii_tree(info.items(), new_line=False), ' ')
        for k, v in info.items():
            # TODO: merge with the same functionality in checking module
            if 'username' in k and 'usernames' not in k:
                results[v] = 'username'
            elif 'usernames' in k:
                try:
                    tree = ast.literal_eval(v)
                    if isinstance(tree, list):
                        for n in tree:
                            results[n] = 'username'
                except Exception as e:
                    logger.warning(e)
            if k in SUPPORTED_IDS:
                results[v] = k

    return results


def extract_ids_from_results(results: QueryResultWrapper, db: MaigretDatabase) -> dict:
    ids_results = {}
    for website_name in results:
        dictionary = results[website_name]
        # TODO: fix no site data issue
        if not dictionary:
            continue

        new_usernames = dictionary.get('ids_usernames')
        if new_usernames:
            for u, utype in new_usernames.items():
                ids_results[u] = utype

        for url in dictionary.get('ids_links', []):
            ids_results.update(db.extract_ids_from_url(url))

    return ids_results


def setup_arguments_parser(settings: Settings):
    from aiohttp import __version__ as aiohttp_version
    from requests import __version__ as requests_version
    from socid_extractor import __version__ as socid_version

    version_string = '\n'.join(
        [
            f'%(prog)s {__version__}',
            f'Socid-extractor:  {socid_version}',
            f'Aiohttp:  {aiohttp_version}',
            f'Requests:  {requests_version}',
            f'Python:  {platform.python_version()}',
        ]
    )

    parser = ArgumentParser(
        formatter_class=RawDescriptionHelpFormatter,
        description=f"Maigret v{__version__}\n"
        "Documentation: https://maigret.readthedocs.io/\n"
        "All settings are also configurable through files, see docs.",
    )
    parser.add_argument(
        "username",
        nargs='*',
        metavar="USERNAMES",
        help="One or more usernames to search by.",
    )
    parser.add_argument(
        "--version",
        action="version",
        version=version_string,
        help="Display version information and dependencies.",
    )
    parser.add_argument(
        "--timeout",
        action="store",
        metavar='TIMEOUT',
        dest="timeout",
        type=timeout_check,
        default=settings.timeout,
        help="Time in seconds to wait for response to requests "
        f"(default {settings.timeout}s). "
        "A longer timeout will be more likely to get results from slow sites. "
        "On the other hand, this may cause a long delay to gather all results. ",
    )
    parser.add_argument(
        "--retries",
        action="store",
        type=int,
        metavar='RETRIES',
        default=settings.retries_count,
        help="Attempts to restart temporarily failed requests.",
    )
    parser.add_argument(
        "-n",
        "--max-connections",
        action="store",
        type=int,
        dest="connections",
        default=settings.max_connections,
        help=f"Allowed number of concurrent connections (default {settings.max_connections}).",
    )
    parser.add_argument(
        "--no-recursion",
        action="store_true",
        dest="disable_recursive_search",
        default=(not settings.recursive_search),
        help="Disable recursive search by additional data extracted from pages.",
    )
    parser.add_argument(
        "--no-extracting",
        action="store_true",
        dest="disable_extracting",
        default=(not settings.info_extracting),
        help="Disable parsing pages for additional data and other usernames.",
    )
    parser.add_argument(
        "--id-type",
        dest="id_type",
        default='username',
        choices=SUPPORTED_IDS,
        help="Specify identifier(s) type (default: username).",
    )
    parser.add_argument(
        "--permute",
        action="store_true",
        default=False,
        help="Permute at least 2 usernames to generate more possible usernames.",
    )
    parser.add_argument(
        "--db",
        metavar="DB_FILE",
        dest="db_file",
        default=settings.sites_db_path,
        help="Load Maigret database from a JSON file or HTTP web resource.",
    )
    parser.add_argument(
        "--cookies-jar-file",
        metavar="COOKIE_FILE",
        dest="cookie_file",
        default=settings.cookie_jar_file,
        help="File with cookies.",
    )
    parser.add_argument(
        "--ignore-ids",
        action="append",
        metavar='IGNORED_IDS',
        dest="ignore_ids_list",
        default=settings.ignore_ids_list,
        help="Do not search by the specified usernames or other IDs.",
    )
    # reports options
    parser.add_argument(
        "--folderoutput",
        "-fo",
        dest="folderoutput",
        default=settings.reports_path,
        metavar="PATH",
        help="If using multiple usernames, the output of the results will be saved to this folder.",
    )
    parser.add_argument(
        "--proxy",
        "-p",
        metavar='PROXY_URL',
        action="store",
        dest="proxy",
        default=settings.proxy_url,
        help="Make requests over a proxy. e.g. socks5://127.0.0.1:1080",
    )
    parser.add_argument(
        "--tor-proxy",
        metavar='TOR_PROXY_URL',
        action="store",
        default=settings.tor_proxy_url,
        help="Specify URL of your Tor gateway. Default is socks5://127.0.0.1:9050",
    )
    parser.add_argument(
        "--i2p-proxy",
        metavar='I2P_PROXY_URL',
        action="store",
        default=settings.i2p_proxy_url,
        help="Specify URL of your I2P gateway. Default is http://127.0.0.1:4444",
    )
    parser.add_argument(
        "--with-domains",
        action="store_true",
        default=settings.domain_search,
        help="Enable (experimental) feature of checking domains on usernames.",
    )

    filter_group = parser.add_argument_group(
        'Site filtering', 'Options to set site search scope'
    )
    filter_group.add_argument(
        "-a",
        "--all-sites",
        action="store_true",
        dest="all_sites",
        default=settings.scan_all_sites,
        help="Use all sites for scan.",
    )
    filter_group.add_argument(
        "--top-sites",
        action="store",
        default=settings.top_sites_count,
        metavar="N",
        type=int,
        help=f"Count of sites for scan ranked by Alexa Top (default: {settings.top_sites_count}).",
    )
    filter_group.add_argument(
        "--tags", dest="tags", default='', help="Specify tags of sites (see `--stats`)."
    )
    filter_group.add_argument(
        "--site",
        action="append",
        metavar='SITE_NAME',
        dest="site_list",
        default=settings.scan_sites_list,
        help="Limit analysis to just the specified sites (can be used multiple times).",
    )
    filter_group.add_argument(
        "--use-disabled-sites",
        action="store_true",
        default=settings.scan_disabled_sites,
        help="Use disabled sites to search (may cause many false positives).",
    )

    modes_group = parser.add_argument_group(
        'Operating modes',
        'Functions other than the default search by username. '
        'Modes are executed sequentially in the order of declaration.',
    )
    modes_group.add_argument(
        "--parse",
        dest="parse_url",
        default='',
        metavar='URL',
        help="Parse page by URL and extract username and IDs to use for search.",
    )
    modes_group.add_argument(
        "--submit",
        metavar='URL',
        type=str,
        dest="new_site_to_submit",
        default=False,
        help="URL of an existing profile on a new site to submit.",
    )
    modes_group.add_argument(
        "--self-check",
        action="store_true",
        default=settings.self_check_enabled,
        help="Do self check for sites and database and disable non-working ones.",
    )
    modes_group.add_argument(
        "--stats",
        action="store_true",
        default=False,
        help="Show database statistics (most frequent site engines and tags).",
    )
    modes_group.add_argument(
        "--web",
        metavar='PORT',
        type=int,
        nargs='?',  # Optional PORT value
        const=5000,  # Default PORT if `--web` is provided without a value
        default=None,  # Explicitly set default to None
        help="Launch the web interface on the specified port (default: 5000 if no PORT is provided).",
    )
    output_group = parser.add_argument_group(
        'Output options', 'Options to change verbosity and view of the console output'
    )
    output_group.add_argument(
        "--print-not-found",
        action="store_true",
        dest="print_not_found",
        default=settings.print_not_found,
        help="Print sites where the username was not found.",
    )
    output_group.add_argument(
        "--print-errors",
        action="store_true",
        dest="print_check_errors",
        default=settings.print_check_errors,
        help="Print error messages: connection, captcha, site country ban, etc.",
    )
    output_group.add_argument(
        "--verbose",
        "-v",
        action="store_true",
        dest="verbose",
        default=False,
        help="Display extra information and metrics.",
    )
    output_group.add_argument(
        "--info",
        "-vv",
        action="store_true",
        dest="info",
        default=False,
        help="Display extra/service information and metrics.",
    )
    output_group.add_argument(
        "--debug",
        "-vvv",
        "-d",
        action="store_true",
        dest="debug",
        default=False,
        help="Display extra/service/debug information and metrics, save responses in debug.log.",
    )
    output_group.add_argument(
        "--no-color",
        action="store_true",
        dest="no_color",
        default=(not settings.colored_print),
        help="Don't color terminal output.",
    )
    output_group.add_argument(
        "--no-progressbar",
        action="store_true",
        dest="no_progressbar",
        default=(not settings.show_progressbar),
        help="Don't show progressbar.",
    )

    report_group = parser.add_argument_group(
        'Report formats', 'Supported formats of report files'
    )
    report_group.add_argument(
        "-T",
        "--txt",
        action="store_true",
        dest="txt",
        default=settings.txt_report,
        help="Create a TXT report (one report per username).",
    )
    report_group.add_argument(
        "-C",
        "--csv",
        action="store_true",
        dest="csv",
        default=settings.csv_report,
        help="Create a CSV report (one report per username).",
    )
    report_group.add_argument(
        "-H",
        "--html",
        action="store_true",
        dest="html",
        default=settings.html_report,
        help="Create an HTML report file (general report on all usernames).",
    )
    report_group.add_argument(
        "-X",
        "--xmind",
        action="store_true",
        dest="xmind",
        default=settings.xmind_report,
        help="Generate an XMind 8 mindmap report (one report per username).",
    )
    report_group.add_argument(
        "-P",
        "--pdf",
        action="store_true",
        dest="pdf",
        default=settings.pdf_report,
        help="Generate a PDF report (general report on all usernames).",
    )
    report_group.add_argument(
        "-G",
        "--graph",
        action="store_true",
        dest="graph",
        default=settings.graph_report,
        help="Generate a graph report (general report on all usernames).",
    )
    report_group.add_argument(
        "-J",
        "--json",
        action="store",
        metavar='TYPE',
        dest="json",
        default=settings.json_report_type,
        choices=SUPPORTED_JSON_REPORT_FORMATS,
        help=f"Generate a JSON report of specific type: {', '.join(SUPPORTED_JSON_REPORT_FORMATS)}"
        " (one report per username).",
    )

    parser.add_argument(
        "--reports-sorting",
        default=settings.report_sorting,
        choices=('default', 'data'),
        help="Method of results sorting in reports (default: in order of getting the result)",
    )
    return parser


async def main():
    # Logging
    log_level = logging.ERROR
    logging.basicConfig(
        format='[%(filename)s:%(lineno)d] %(levelname)-3s  %(asctime)s %(message)s',
        datefmt='%H:%M:%S',
        level=log_level,
    )
    logger = logging.getLogger('maigret')
    logger.setLevel(log_level)

    # Load settings
    settings = Settings()
    settings_loaded, err = settings.load()

    if not settings_loaded:
        logger.error(err)
        sys.exit(3)

    arg_parser = setup_arguments_parser(settings)
    args = arg_parser.parse_args()

    # Re-set logging level based on args
    if args.debug:
        log_level = logging.DEBUG
    elif args.info:
        log_level = logging.INFO
    elif args.verbose:
        log_level = logging.WARNING
    logger.setLevel(log_level)

    # Usernames initial list
    usernames = {
        u: args.id_type
        for u in args.username
        if u and u not in ['-'] and u not in args.ignore_ids_list
    }
    original_usernames = ""
    if args.permute and len(usernames) > 1 and args.id_type == 'username':
        original_usernames = " ".join(usernames.keys())
        usernames = Permute(usernames).gather(method='strict')

    parsing_enabled = not args.disable_extracting
    recursive_search_enabled = not args.disable_recursive_search

    # Make prompts
    if args.proxy is not None:
        print("Using the proxy: " + args.proxy)

    if args.parse_url:
        extracted_ids = extract_ids_from_page(
            args.parse_url, logger, timeout=args.timeout
        )
        usernames.update(extracted_ids)

    if args.tags:
        args.tags = list(set(str(args.tags).split(',')))

    db_file = args.db_file \
        if (args.db_file.startswith("http://") or args.db_file.startswith("https://")) \
        else path.join(path.dirname(path.realpath(__file__)), args.db_file)

    if args.top_sites == 0 or args.all_sites:
        args.top_sites = sys.maxsize

    # Create notify object for query results.
    query_notify = QueryNotifyPrint(
        result=None,
        verbose=args.verbose,
        print_found_only=not args.print_not_found,
        skip_check_errors=not args.print_check_errors,
        color=not args.no_color,
    )

    # Create object with all information about sites we are aware of.
    db = MaigretDatabase().load_from_path(db_file)
    get_top_sites_for_id = lambda x: db.ranked_sites_dict(
        top=args.top_sites,
        tags=args.tags,
        names=args.site_list,
        disabled=args.use_disabled_sites,
        id_type=x,
    )

    site_data = get_top_sites_for_id(args.id_type)

    if args.new_site_to_submit:
        submitter = Submitter(db=db, logger=logger, settings=settings, args=args)
        is_submitted = await submitter.dialog(args.new_site_to_submit, args.cookie_file)
        if is_submitted:
            db.save_to_file(db_file)
        await submitter.close()

    # Database self-checking
    if args.self_check:
        if len(site_data) == 0:
            query_notify.warning(
                'No sites to self-check with the current filters! Exiting...'
            )
            return

        query_notify.success(
            f'Maigret sites database self-check started for {len(site_data)} sites...'
        )
        is_need_update = await self_check(
            db,
            site_data,
            logger,
            proxy=args.proxy,
            max_connections=args.connections,
            tor_proxy=args.tor_proxy,
            i2p_proxy=args.i2p_proxy,
        )
        if is_need_update:
            if input('Do you want to save changes permanently? [Yn]\n').lower() in (
                'y',
                '',
            ):
                db.save_to_file(db_file)
                print('Database was successfully updated.')
            else:
                print('Updates will be applied only for current search session.')

        if args.verbose or args.debug:
            query_notify.info(
                'Scan sessions flags stats: ' + str(db.get_scan_stats(site_data))
            )

    # Database statistics
    if args.stats:
        print(db.get_db_stats())

    report_dir = path.join(os.getcwd(), args.folderoutput)

    # Create the reports folder if it does not exist
    os.makedirs(report_dir, exist_ok=True)

    # Define one report filename template
    report_filepath_tpl = path.join(report_dir, 'report_{username}{postfix}')

    # Web interface
    if args.web is not None:
        from maigret.web.app import app

        app.config["MAIGRET_DB_FILE"] = db_file

        port = (
            args.web if args.web else 5000
        )  # args.web is either the specified port or 5000 by default

        # Host configuration: secure by default, but allow override via environment
        host = os.getenv('FLASK_HOST', '127.0.0.1')
        app.run(host=host, port=port)
        return

    if usernames == {}:
        # magic params to exit after init
        query_notify.warning('No usernames to check, exiting.')
        sys.exit(0)

    if len(usernames) > 1 and args.permute and args.id_type == 'username':
        query_notify.warning(
            f"{len(usernames)} permutations from {original_usernames} to check..."
            + get_dict_ascii_tree(usernames, prepend="\t")
        )

    if not site_data:
        query_notify.warning('No sites to check, exiting!')
        sys.exit(2)

    query_notify.warning(
        f'Starting a search on top {len(site_data)} sites from the Maigret database...'
    )
    if not args.all_sites:
        query_notify.warning(
            'You can run search by full list of sites with flag `-a`', '!'
        )

    already_checked = set()
    general_results = []

    while usernames:
        username, id_type = list(usernames.items())[0]
        del usernames[username]

        if username.lower() in already_checked:
            continue

        already_checked.add(username.lower())

        if username in args.ignore_ids_list:
            query_notify.warning(
                f"Skipping search by username {username} because it's marked as ignored."
            )
            continue

        # check for characters generally not supported by sites
        found_unsupported_chars = set(BAD_CHARS).intersection(set(username))
        if found_unsupported_chars:
            pretty_chars_str = ','.join(
                map(lambda s: f'"{s}"', found_unsupported_chars)
            )
            query_notify.warning(
                f'Found unsupported URL characters: {pretty_chars_str}, skip search by username "{username}"'
            )
            continue

        sites_to_check = get_top_sites_for_id(id_type)

        results = await maigret(
            username=username,
            site_dict=dict(sites_to_check),
            query_notify=query_notify,
            proxy=args.proxy,
            tor_proxy=args.tor_proxy,
            i2p_proxy=args.i2p_proxy,
            timeout=args.timeout,
            is_parsing_enabled=parsing_enabled,
            id_type=id_type,
            debug=args.verbose,
            logger=logger,
            cookies=args.cookie_file,
            forced=args.use_disabled_sites,
            max_connections=args.connections,
            no_progressbar=args.no_progressbar,
            retries=args.retries,
            check_domains=args.with_domains,
        )

        errs = errors.notify_about_errors(
            results, query_notify, show_statistics=args.verbose
        )
        for e in errs:
            query_notify.warning(*e)

        if args.reports_sorting == "data":
            results = sort_report_by_data_points(results)

        general_results.append((username, id_type, results))

        # TODO: tests
        if recursive_search_enabled:
            extracted_ids = extract_ids_from_results(results, db)
            query_notify.warning(f'Extracted IDs: {extracted_ids}')
            usernames.update(extracted_ids)

        # reporting for a single username
        if args.xmind:
            username = username.replace('/', '_')
            filename = report_filepath_tpl.format(username=username, postfix='.xmind')
            save_xmind_report(filename, username, results)
            query_notify.warning(f'XMind report for {username} saved in {filename}')

        if args.csv:
            username = username.replace('/', '_')
            filename = report_filepath_tpl.format(username=username, postfix='.csv')
            save_csv_report(filename, username, results)
            query_notify.warning(f'CSV report for {username} saved in {filename}')

        if args.txt:
            username = username.replace('/', '_')
            filename = report_filepath_tpl.format(username=username, postfix='.txt')
            save_txt_report(filename, username, results)
            query_notify.warning(f'TXT report for {username} saved in {filename}')

        if args.json:
            username = username.replace('/', '_')
            filename = report_filepath_tpl.format(
                username=username, postfix=f'_{args.json}.json'
            )
            save_json_report(filename, username, results, report_type=args.json)
            query_notify.warning(
                f'JSON {args.json} report for {username} saved in {filename}'
            )

    # reporting for all results
    if general_results:
        if args.html or args.pdf:
            query_notify.warning('Generating report info...')
        report_context = generate_report_context(general_results)
        # determine main username
        username = report_context['username']

        if args.html:
            username = username.replace('/', '_')
            filename = report_filepath_tpl.format(
                username=username, postfix='_plain.html'
            )
            save_html_report(filename, report_context)
            query_notify.warning(f'HTML report on all usernames saved in {filename}')

        if args.pdf:
            username = username.replace('/', '_')
            filename = report_filepath_tpl.format(username=username, postfix='.pdf')
            save_pdf_report(filename, report_context)
            query_notify.warning(f'PDF report on all usernames saved in {filename}')

        if args.graph:
            username = username.replace('/', '_')
            filename = report_filepath_tpl.format(
                username=username, postfix='_graph.html'
            )
            save_graph_report(filename, general_results, db)
            query_notify.warning(f'Graph report on all usernames saved in {filename}')

        text_report = get_plaintext_report(report_context)
        if text_report:
            query_notify.info('Short text report:')
            print(text_report)

    # update database
    db.save_to_file(db_file)


def run():
    try:
        if sys.version_info >= (3, 10):
            asyncio.run(main())
        else:
            loop = asyncio.get_event_loop()
            loop.run_until_complete(main())
    except KeyboardInterrupt:
        print('Maigret is interrupted.')
        sys.exit(1)


if __name__ == "__main__":
    run()


================================================
FILE: maigret/notify.py
================================================
"""Sherlock Notify Module

This module defines the objects for notifying the caller about the
results of queries.
"""

import sys

from colorama import Fore, Style, init

from .result import MaigretCheckStatus
from .utils import get_dict_ascii_tree


class QueryNotify:
    """Query Notify Object.

    Base class that describes methods available to notify the results of
    a query.
    It is intended that other classes inherit from this base class and
    override the methods to implement specific functionality.
    """

    def __init__(self, result=None):
        """Create Query Notify Object.

        Contains information about a specific method of notifying the results
        of a query.

        Keyword Arguments:
        self                   -- This object.
        result                 -- Object of type QueryResult() containing
                                  results for this query.

        Return Value:
        Nothing.
        """

        self.result = result

        return

    def start(self, message=None, id_type="username"):
        """Notify Start.

        Notify method for start of query.  This method will be called before
        any queries are performed.  This method will typically be
        overridden by higher level classes that will inherit from it.

        Keyword Arguments:
        self                   -- This object.
        message                -- Object that is used to give context to start
                                  of query.
                                  Default is None.
        id_type                -- Type of identifier being queried.
                                  Default is "username".

        Return Value:
        Nothing.
        """

        return

    def update(self, result):
        """Notify Update.

        Notify method for query result.  This method will typically be
        overridden by higher level classes that will inherit from it.

        Keyword Arguments:
        self                   -- This object.
        result                 -- Object of type QueryResult() containing
                                  results for this query.

        Return Value:
        Nothing.
        """

        self.result = result

        return

    def finish(self, message=None):
        """Notify Finish.

        Notify method for finish of query.  This method will be called after
        all queries have been performed.  This method will typically be
        overridden by higher level classes that will inherit from it.

        Keyword Arguments:
        self                   -- This object.
        message                -- Object that is used to give context to finish
                                  of query.
                                  Default is None.

        Return Value:
        Nothing.
        """

        return

    def __str__(self):
        """Convert Object To String.

        Keyword Arguments:
        self                   -- This object.

        Return Value:
        Nicely formatted string to get information about this object.
        """
        result = str(self.result)

        return result


class QueryNotifyPrint(QueryNotify):
    """Query Notify Print Object.

    Query notify class that prints results.
    """

    def __init__(
        self,
        result=None,
        verbose=False,
        print_found_only=False,
        skip_check_errors=False,
        col
SYMBOL INDEX (311 symbols across 34 files)

FILE: maigret/activation.py
  class ParsingActivator (line 8) | class ParsingActivator:
    method twitter (line 10) | def twitter(site, logger, cookies={}):
    method vimeo (line 22) | def vimeo(site, logger, cookies={}):
    method spotify (line 34) | def spotify(site, logger, cookies={}):
    method weibo (line 45) | def weibo(site, logger):
  function import_aiohttp_cookies (line 80) | def import_aiohttp_cookies(cookiestxt_filename):

FILE: maigret/checking.py
  class CheckerBase (line 51) | class CheckerBase:
  class SimpleAiohttpChecker (line 55) | class SimpleAiohttpChecker(CheckerBase):
    method __init__ (line 56) | def __init__(self, *args, **kwargs):
    method prepare (line 66) | def prepare(self, url, headers=None, allow_redirects=True, timeout=0, ...
    method close (line 74) | async def close(self):
    method _make_request (line 77) | async def _make_request(
    method check (line 120) | async def check(self) -> Tuple[str, int, Optional[CheckError]]:
  class ProxiedAiohttpChecker (line 152) | class ProxiedAiohttpChecker(SimpleAiohttpChecker):
    method __init__ (line 153) | def __init__(self, *args, **kwargs):
  class AiodnsDomainResolver (line 159) | class AiodnsDomainResolver(CheckerBase):
    method __init__ (line 163) | def __init__(self, *args, **kwargs):
    method prepare (line 168) | def prepare(self, url, headers=None, allow_redirects=True, timeout=0, ...
    method check (line 172) | async def check(self) -> Tuple[str, int, Optional[CheckError]]:
  class CheckerMock (line 190) | class CheckerMock:
    method __init__ (line 191) | def __init__(self, *args, **kwargs):
    method prepare (line 194) | def prepare(self, url, headers=None, allow_redirects=True, timeout=0, ...
    method check (line 197) | async def check(self) -> Tuple[str, int, Optional[CheckError]]:
    method close (line 201) | async def close(self):
  function detect_error_page (line 206) | def detect_error_page(
  function debug_response_logging (line 229) | def debug_response_logging(url, html_text, status_code, check_error):
  function process_site_result (line 237) | def process_site_result(
  function make_site_result (line 396) | def make_site_result(
  function check_site_for_username (line 528) | async def check_site_for_username(
  function debug_ip_request (line 554) | async def debug_ip_request(checker, logger):
  function get_failed_sites (line 563) | def get_failed_sites(results: Dict[str, QueryResultWrapper]) -> List[str]:
  function maigret (line 574) | async def maigret(
  function timeout_check (line 755) | def timeout_check(value):
  function site_self_check (line 780) | async def site_self_check(
  function self_check (line 880) | async def self_check(
  function extract_ids_data (line 940) | def extract_ids_data(html_text, logger, site) -> Dict:
  function parse_usernames (line 948) | def parse_usernames(extracted_ids_data, logger) -> Dict:
  function update_results_info (line 966) | def update_results_info(results_info, extracted_ids_data, new_usernames):

FILE: maigret/errors.py
  class CheckError (line 8) | class CheckError:
    method __init__ (line 12) | def __init__(self, typename, desc=''):
    method __str__ (line 16) | def __str__(self):
    method type (line 23) | def type(self):
    method desc (line 27) | def desc(self):
  function is_important (line 94) | def is_important(err_data):
  function is_permanent (line 98) | def is_permanent(err_type):
  function detect (line 102) | def detect(text):
  function solution_of (line 109) | def solution_of(err_type) -> str:
  function extract_and_group (line 113) | def extract_and_group(search_res: QueryResultWrapper) -> List[Dict[str, ...
  function notify_about_errors (line 138) | def notify_about_errors(

FILE: maigret/executors.py
  function create_task_func (line 12) | def create_task_func():
  class AsyncExecutor (line 21) | class AsyncExecutor:
    method __init__ (line 23) | def __init__(self, *args, **kwargs):
    method run (line 26) | async def run(self, tasks: Iterable[QueryDraft]):
    method _run (line 33) | async def _run(self, tasks: Iterable[QueryDraft]):
  class AsyncioSimpleExecutor (line 37) | class AsyncioSimpleExecutor(AsyncExecutor):
    method __init__ (line 39) | def __init__(self, *args, **kwargs):
    method _run (line 43) | async def _run(self, tasks: Iterable[QueryDraft]):
  class AsyncioProgressbarExecutor (line 52) | class AsyncioProgressbarExecutor(AsyncExecutor):
    method __init__ (line 54) | def __init__(self, *args, **kwargs):
    method _run (line 57) | async def _run(self, tasks: Iterable[QueryDraft]):
  class AsyncioProgressbarSemaphoreExecutor (line 76) | class AsyncioProgressbarSemaphoreExecutor(AsyncExecutor):
    method __init__ (line 78) | def __init__(self, *args, **kwargs):
    method _run (line 82) | async def _run(self, tasks: Iterable[QueryDraft]):
  class AsyncioProgressbarQueueExecutor (line 102) | class AsyncioProgressbarQueueExecutor(AsyncExecutor):
    method __init__ (line 103) | def __init__(self, *args, **kwargs):
    method increment_progress (line 113) | async def increment_progress(self, count):
    method stop_progress (line 123) | async def stop_progress(self):
    method worker (line 133) | async def worker(self):
    method _run (line 155) | async def _run(self, queries: Iterable[QueryDraft]):
  class AsyncioQueueGeneratorExecutor (line 183) | class AsyncioQueueGeneratorExecutor:
    method __init__ (line 185) | def __init__(self, *args, **kwargs):
    method worker (line 193) | async def worker(self):
    method run (line 216) | async def run(self, queries: Iterable[Callable[..., Any]]):

FILE: maigret/maigret.py
  function extract_ids_from_page (line 49) | def extract_ids_from_page(url, logger, timeout=5) -> dict:
  function extract_ids_from_results (line 89) | def extract_ids_from_results(results: QueryResultWrapper, db: MaigretDat...
  function setup_arguments_parser (line 108) | def setup_arguments_parser(settings: Settings):
  function main (line 465) | async def main():
  function run (line 779) | def run():
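
The search loop inside `main` treats the usernames dict as a worklist: it pops one entry at a time, skips identifiers it has already seen (compared case-insensitively), skips ignored or malformed ones, and queues any identifiers extracted from the results. The sketch below isolates that pattern so it can be read and run on its own. It is a simplified illustration, not Maigret's code: `process_worklist`, the `search` callback, and the `BAD_CHARS` value are assumptions made for this example, and the real loop only queues extracted identifiers when recursive search is enabled.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

# Hypothetical stand-in: the real BAD_CHARS constant is defined elsewhere in Maigret.
BAD_CHARS = "#"


def process_worklist(
    usernames: Dict[str, str],
    search: Callable[[str, str], Dict[str, str]],
    ignored: Iterable[str] = (),
) -> List[Tuple[str, str]]:
    """Check every identifier once; `search` may return newly found identifiers."""
    pending = dict(usernames)  # copy so the caller's dict is left untouched
    ignored_set = set(ignored)
    already_checked: Set[str] = set()
    processed: List[Tuple[str, str]] = []

    while pending:
        # Take the oldest pending entry (dicts preserve insertion order).
        username, id_type = next(iter(pending.items()))
        del pending[username]

        # Deduplicate case-insensitively, as the loop in main() does.
        if username.lower() in already_checked:
            continue
        already_checked.add(username.lower())

        if username in ignored_set:
            continue
        if set(BAD_CHARS) & set(username):
            continue

        processed.append((username, id_type))
        # Identifiers extracted from the results are queued for later passes.
        pending.update(search(username, id_type))

    return processed


def fake_search(username: str, id_type: str) -> Dict[str, str]:
    """Toy search: pretend that checking 'alice' reveals two more identifiers."""
    if username == "alice":
        return {"Alice": "username", "bob": "username"}
    return {}


print(process_worklist({"alice": "username"}, fake_search))
```

Because "Alice" differs from "alice" only by case, it is dropped by the `already_checked` test, while "bob" is searched in a later pass.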

FILE: maigret/notify.py
  class QueryNotify (line 15) | class QueryNotify:
    method __init__ (line 24) | def __init__(self, result=None):
    method start (line 43) | def start(self, message=None, id_type="username"):
    method update (line 62) | def update(self, result):
    method finish (line 81) | def finish(self, message=None):
    method __str__ (line 100) | def __str__(self):
  class QueryNotifyPrint (line 114) | class QueryNotifyPrint(QueryNotify):
    method __init__ (line 120) | def __init__(
    method make_colored_terminal_notify (line 156) | def make_colored_terminal_notify(
    method make_simple_terminal_notify (line 166) | def make_simple_terminal_notify(
    method make_terminal_notify (line 171) | def make_terminal_notify(self, *args):
    method start (line 177) | def start(self, message, id_type):
    method _colored_print (line 209) | def _colored_print(self, fore_color, msg):
    method success (line 215) | def success(self, message, symbol="+"):
    method warning (line 219) | def warning(self, message, symbol="-"):
    method info (line 223) | def info(self, message, symbol="*"):
    method update (line 227) | def update(self, result, is_similar=False):
    method __str__ (line 299) | def __str__(self):
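
The `QueryNotify` docstring says the base class is meant to be inherited from, with `start`, `update`, and `finish` overridden. A minimal sketch of such a subclass follows. To keep it runnable without Maigret installed, it redefines a stand-in base class with the same three hooks as shown in the source. `CollectingNotify` and its `events` attribute are hypothetical names for this example. Note that `main()` also calls `warning`, `success`, and `info`, which are defined on `QueryNotifyPrint`, so a real replacement would likely need those as well.

```python
class QueryNotify:
    """Stand-in mirroring the base-class interface from maigret/notify.py."""

    def __init__(self, result=None):
        self.result = result

    def start(self, message=None, id_type="username"):
        return

    def update(self, result):
        self.result = result

    def finish(self, message=None):
        return


class CollectingNotify(QueryNotify):
    """Hypothetical subclass that records events in memory instead of printing."""

    def __init__(self, result=None):
        super().__init__(result)
        self.events = []

    def start(self, message=None, id_type="username"):
        self.events.append(("start", message, id_type))

    def update(self, result):
        super().update(result)  # keep the base behaviour: remember the latest result
        self.events.append(("update", result))

    def finish(self, message=None):
        self.events.append(("finish", message))


notifier = CollectingNotify()
notifier.start("alice")
notifier.update("result-1")
notifier.finish()
print(notifier.events)
```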

FILE: maigret/permutator.py
  class Permute (line 5) | class Permute:
    method __init__ (line 6) | def __init__(self, elements: dict):
    method gather (line 10) | def gather(self, method: str = "strict" or "all") -> dict:
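
A note on the `gather` signature listed above: in Python, `"strict" or "all"` is an ordinary expression that evaluates to its first truthy operand, so the default value of `method` is simply `"strict"`. The sketch below demonstrates this with a free function that reuses the same default, and shows one possible alternative that spells out the allowed values with `typing.Literal`. Both functions are hypothetical illustrations and are not part of Maigret.

```python
import inspect
from typing import Literal


def gather_as_listed(method: str = "strict" or "all") -> str:
    """Same default expression as in the listed signature."""
    return method


def gather_explicit(method: Literal["strict", "all"] = "strict") -> str:
    """Alternative sketch: the annotation documents the allowed values."""
    if method not in ("strict", "all"):
        raise ValueError(f"unknown method: {method!r}")
    return method


# The default stored on the function is just the string "strict".
default = inspect.signature(gather_as_listed).parameters["method"].default
print(default)
```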

FILE: maigret/report.py
  function filter_supposed_data (line 32) | def filter_supposed_data(data):
  function sort_report_by_data_points (line 43) | def sort_report_by_data_points(results):
  function save_csv_report (line 60) | def save_csv_report(filename: str, username: str, results: dict):
  function save_txt_report (line 65) | def save_txt_report(filename: str, username: str, results: dict):
  function save_html_report (line 70) | def save_html_report(filename: str, context: dict):
  function save_pdf_report (line 77) | def save_pdf_report(filename: str, context: dict):
  function save_json_report (line 88) | def save_json_report(filename: str, username: str, results: dict, report...
  class MaigretGraph (line 93) | class MaigretGraph:
    method __init__ (line 98) | def __init__(self, graph):
    method add_node (line 101) | def add_node(self, key, value, color=None):
    method link (line 117) | def link(self, node1_name, node2_name):
  function save_graph_report (line 121) | def save_graph_report(filename: str, username_results: list, db: Maigret...
  function get_plaintext_report (line 249) | def get_plaintext_report(context: dict) -> str:
  function generate_report_template (line 265) | def generate_report_template(is_pdf: bool):
  function generate_report_context (line 288) | def generate_report_context(username_results: list):
  function generate_csv_report (line 424) | def generate_csv_report(username: str, results: dict, csvfile):
  function generate_txt_report (line 446) | def generate_txt_report(username: str, results: dict, file):
  function generate_json_report (line 462) | def generate_json_report(username: str, results: dict, file, report_type):
  function save_xmind_report (line 497) | def save_xmind_report(filename, username, results):
  function add_xmind_subtopic (line 506) | def add_xmind_subtopic(userlink, k, v, supposed_data):
  function design_xmind_sheet (line 515) | def design_xmind_sheet(sheet, username, results):

FILE: maigret/result.py
  class MaigretCheckStatus (line 9) | class MaigretCheckStatus(Enum):
    method __str__ (line 20) | def __str__(self):
  class MaigretCheckResult (line 32) | class MaigretCheckResult:
    method __init__ (line 37) | def __init__(
    method json (line 85) | def json(self):
    method is_found (line 95) | def is_found(self):
    method __repr__ (line 98) | def __repr__(self):
    method __str__ (line 101) | def __str__(self):

FILE: maigret/settings.py
  class Settings (line 13) | class Settings:
    method __init__ (line 51) | def __init__(self):
    method load (line 54) | def load(self, paths=None):
    method json (line 85) | def json(self):

FILE: maigret/sites.py
  class MaigretEngine (line 11) | class MaigretEngine:
    method __init__ (line 14) | def __init__(self, name, data):
    method json (line 19) | def json(self):
  class MaigretSite (line 23) | class MaigretSite:
    method __init__ (line 96) | def __init__(self, name, information):
    method __str__ (line 109) | def __str__(self):
    method __is_equal_by_url_or_name (line 112) | def __is_equal_by_url_or_name(self, url_or_name_str: str):
    method __eq__ (line 126) | def __eq__(self, other):
    method update_detectors (line 160) | def update_detectors(self):
    method detect_username (line 172) | def detect_username(self, url: str) -> Optional[str]:
    method extract_id_from_url (line 180) | def extract_id_from_url(self, url: str) -> Optional[Tuple[str, str]]:
    method pretty_name (line 198) | def pretty_name(self):
    method json (line 204) | def json(self):
    method errors_dict (line 219) | def errors_dict(self) -> dict:
    method get_url_template (line 226) | def get_url_template(self) -> str:
    method update (line 237) | def update(self, updates: "dict") -> "MaigretSite":
    method update_from_engine (line 243) | def update_from_engine(self, engine: MaigretEngine) -> "MaigretSite":
    method strip_engine_data (line 261) | def strip_engine_data(self) -> "MaigretSite":
  class MaigretDatabase (line 293) | class MaigretDatabase:
    method __init__ (line 294) | def __init__(self):
    method sites (line 300) | def sites(self):
    method sites_dict (line 304) | def sites_dict(self):
    method has_site (line 307) | def has_site(self, site: MaigretSite):
    method __contains__ (line 313) | def __contains__(self, site):
    method ranked_sites_dict (line 316) | def ranked_sites_dict(
    method engines (line 377) | def engines(self):
    method engines_dict (line 381) | def engines_dict(self):
    method update_site (line 384) | def update_site(self, site: MaigretSite) -> "MaigretDatabase":
    method save_to_file (line 393) | def save_to_file(self, filename: str) -> "MaigretDatabase":
    method load_from_json (line 410) | def load_from_json(self, json_data: dict) -> "MaigretDatabase":
    method load_from_str (line 438) | def load_from_str(self, db_str: "str") -> "MaigretDatabase":
    method load_from_path (line 449) | def load_from_path(self, path: str) -> "MaigretDatabase":
    method load_from_http (line 455) | def load_from_http(self, url: str) -> "MaigretDatabase":
    method load_from_file (line 486) | def load_from_file(self, filename: "str") -> "MaigretDatabase":
    method get_scan_stats (line 503) | def get_scan_stats(self, sites_dict):
    method extract_ids_from_url (line 513) | def extract_ids_from_url(self, url: str) -> dict:
    method get_db_stats (line 523) | def get_db_stats(self, is_markdown=False):
    method _format_top_items (line 591) | def _format_top_items(

FILE: maigret/submit.py
  class CloudflareSession (line 22) | class CloudflareSession:
    method __init__ (line 23) | def __init__(self):
    method get (line 26) | async def get(self, *args, **kwargs):
    method status_code (line 33) | def status_code(self):
    method text (line 36) | async def text(self):
    method close (line 40) | async def close(self):
  class Submitter (line 44) | class Submitter:
    method __init__ (line 55) | def __init__(self, db: MaigretDatabase, settings: Settings, logger, ar...
    method close (line 77) | async def close(self):
    method get_alexa_rank (line 81) | def get_alexa_rank(site_url_main):
    method extract_mainpage_url (line 98) | def extract_mainpage_url(url):
    method site_self_check (line 101) | async def site_self_check(self, site, semaphore, silent=False):
    method generate_additional_fields_dialog (line 116) | def generate_additional_fields_dialog(self, engine: MaigretEngine, dia...
    method detect_known_engine (line 128) | async def detect_known_engine(
    method extract_username_dialog (line 183) | def extract_username_dialog(url):
    method get_html_response_to_compare (line 193) | async def get_html_response_to_compare(
    method check_features_manually (line 209) | async def check_features_manually(
    method add_site (line 321) | async def add_site(self, site):
    method dialog (line 393) | async def dialog(self, url_exists, cookie_file):

FILE: maigret/utils.py
  class CaseConverter (line 15) | class CaseConverter:
    method camel_to_snake (line 17) | def camel_to_snake(camelcased_string: str) -> str:
    method snake_to_camel (line 21) | def snake_to_camel(snakecased_string: str) -> str:
    method snake_to_title (line 27) | def snake_to_title(snakecased_string: str) -> str:
  function is_country_tag (line 33) | def is_country_tag(tag: str) -> bool:
  function enrich_link_str (line 38) | def enrich_link_str(link: str) -> str:
  class URLMatcher (line 45) | class URLMatcher:
    method extract_main_part (line 51) | def extract_main_part(self, url: str) -> str:
    method make_profile_url_regexp (line 59) | def make_profile_url_regexp(self, url: str, username_regexp: str = ""):
  function ascii_data_display (line 73) | def ascii_data_display(data: str) -> Any:
  function get_dict_ascii_tree (line 77) | def get_dict_ascii_tree(items, prepend="", new_line=True):
  function get_random_user_agent (line 106) | def get_random_user_agent():
  function get_match_ratio (line 110) | def get_match_ratio(base_strs: list):
  function generate_random_username (line 125) | def generate_random_username():

FILE: maigret/web/app.py
  function setup_logger (line 36) | def setup_logger(log_level, name):
  function maigret_search (line 42) | async def maigret_search(username, options):
  function search_multiple_usernames (line 87) | async def search_multiple_usernames(usernames, options):
  function process_search_task (line 98) | def process_search_task(usernames, options, timestamp):
  function index (line 193) | def index():
  function search (line 213) | def search():
  function status (line 265) | def status(timestamp):
  function results (line 296) | def results(session_id):
  function download_report (line 322) | def download_report(filename):

FILE: tests/conftest.py
  function by_slow_marker (line 44) | def by_slow_marker(item):
  function pytest_collection_modifyitems (line 48) | def pytest_collection_modifyitems(items):
  function get_test_reports_filenames (line 52) | def get_test_reports_filenames():
  function remove_test_reports (line 56) | def remove_test_reports():
  function default_db (line 64) | def default_db():
  function test_db (line 69) | def test_db():
  function local_test_db (line 74) | def local_test_db():
  function reports_autoclean (line 79) | def reports_autoclean():
  function settings (line 86) | def settings():
  function argparser (line 93) | def argparser():
  function httpserver_listen_address (line 100) | def httpserver_listen_address():
  function cookie_test_server (line 105) | async def cookie_test_server():

FILE: tests/test_activation.py
  function test_vimeo_activation (line 30) | def test_vimeo_activation(default_db):
  function test_import_aiohttp_cookies (line 42) | async def test_import_aiohttp_cookies(cookie_test_server):

FILE: tests/test_checking.py
  function site_result_except (line 7) | def site_result_except(server, username, **kwargs):
  function test_checking_by_status_code (line 14) | async def test_checking_by_status_code(httpserver, local_test_db):
  function test_checking_by_message_positive_full (line 29) | async def test_checking_by_message_positive_full(httpserver, local_test_...
  function test_checking_by_message_positive_part (line 44) | async def test_checking_by_message_positive_part(httpserver, local_test_...
  function test_checking_by_message_negative (line 59) | async def test_checking_by_message_negative(httpserver, local_test_db):

FILE: tests/test_cli.py
  function test_args_search_mode (line 51) | def test_args_search_mode(argparser):
  function test_args_search_mode_several_usernames (line 63) | def test_args_search_mode_several_usernames(argparser):
  function test_args_self_check_mode (line 75) | def test_args_self_check_mode(argparser):
  function test_args_multiple_sites (line 91) | def test_args_multiple_sites(argparser):

FILE: tests/test_data.py
  function test_tags_validity (line 8) | def test_tags_validity(default_db):

FILE: tests/test_errors.py
  function test_notify_about_errors (line 7) | def test_notify_about_errors():

FILE: tests/test_executors.py
  function func (line 17) | async def func(n):
  function test_simple_asyncio_executor (line 23) | async def test_simple_asyncio_executor():
  function test_asyncio_progressbar_executor (line 32) | async def test_asyncio_progressbar_executor():
  function test_asyncio_progressbar_semaphore_executor (line 43) | async def test_asyncio_progressbar_semaphore_executor():
  function test_asyncio_progressbar_queue_executor (line 55) | async def test_asyncio_progressbar_queue_executor():
  function test_asyncio_queue_generator_executor (line 83) | async def test_asyncio_queue_generator_executor():

FILE: tests/test_maigret.py
  function test_self_check_db (line 21) | async def test_self_check_db(test_db):
  function test_maigret_results (line 40) | def test_maigret_results(test_db):
  function test_extract_ids_from_url (line 88) | def test_extract_ids_from_url(default_db):
  function test_extract_ids_from_page (line 105) | def test_extract_ids_from_page(test_db):
  function test_extract_ids_from_results (line 112) | def test_extract_ids_from_results(test_db):

FILE: tests/test_notify.py
  function test_notify_illegal (line 6) | def test_notify_illegal():
  function test_notify_claimed (line 22) | def test_notify_claimed():
  function test_notify_available (line 38) | def test_notify_available():
  function test_notify_unknown (line 54) | def test_notify_unknown():

FILE: tests/test_permutator.py
  function test_gather_strict (line 5) | def test_gather_strict():
  function test_gather_all (line 26) | def test_gather_all():

FILE: tests/test_report.py
  function test_generate_report_template (line 266) | def test_generate_report_template():
  function test_generate_csv_report (line 278) | def test_generate_csv_report():
  function test_generate_csv_report_broken (line 291) | def test_generate_csv_report_broken():
  function test_generate_txt_report (line 304) | def test_generate_txt_report():
  function test_generate_txt_report_broken (line 317) | def test_generate_txt_report_broken():
  function test_generate_json_simple_report (line 329) | def test_generate_json_simple_report():
  function test_generate_json_simple_report_broken (line 342) | def test_generate_json_simple_report_broken():
  function test_generate_json_ndjson_report (line 355) | def test_generate_json_ndjson_report():
  function test_save_xmind_report (line 368) | def test_save_xmind_report():
  function test_save_xmind_report_broken (line 388) | def test_save_xmind_report_broken():
  function test_html_report (line 402) | def test_html_report():
  function test_html_report_broken (line 414) | def test_html_report_broken():
  function test_pdf_report (line 430) | def test_pdf_report():
  function test_text_report (line 438) | def test_text_report():
  function test_text_report_broken (line 448) | def test_text_report_broken():

FILE: tests/test_sites.py
  function test_load_empty_db_from_str (line 32) | def test_load_empty_db_from_str():
  function test_load_valid_db (line 40) | def test_load_valid_db():
  function test_site_json_dump (line 51) | def test_site_json_dump():
  function test_site_correct_initialization (line 62) | def test_site_correct_initialization():
  function test_site_strip_engine_data (line 75) | def test_site_strip_engine_data():
  function test_site_strip_engine_data_with_site_prior_updates (line 85) | def test_site_strip_engine_data_with_site_prior_updates():
  function test_saving_site_error (line 97) | def test_saving_site_error():
  function test_site_url_detector (line 113) | def test_site_url_detector():
  function test_ranked_sites_dict (line 127) | def test_ranked_sites_dict():
  function test_ranked_sites_dict_names (line 154) | def test_ranked_sites_dict_names():
  function test_ranked_sites_dict_disabled (line 165) | def test_ranked_sites_dict_disabled():
  function test_ranked_sites_dict_id_type (line 174) | def test_ranked_sites_dict_id_type():
  function test_get_url_template (line 185) | def test_get_url_template():
  function test_has_site_url_or_name (line 208) | def test_has_site_url_or_name(default_db):

FILE: tests/test_submit.py
  function test_detect_known_engine (line 11) | async def test_detect_known_engine(test_db, local_test_db):
  function test_check_features_manually_success (line 58) | async def test_check_features_manually_success(settings):
  function test_check_features_manually_success (line 112) | async def test_check_features_manually_success(settings):
  function test_dialog_adds_site_positive (line 146) | async def test_dialog_adds_site_positive(settings):
  function test_dialog_replace_site (line 194) | async def test_dialog_replace_site(settings, test_db):
  function test_dialog_adds_site_negative (line 248) | async def test_dialog_adds_site_negative(settings):

FILE: tests/test_utils.py
  function test_case_convert_camel_to_snake (line 16) | def test_case_convert_camel_to_snake():
  function test_case_convert_snake_to_camel (line 23) | def test_case_convert_snake_to_camel():
  function test_case_convert_snake_to_title (line 30) | def test_case_convert_snake_to_title():
  function test_case_convert_camel_with_digits_to_snake (line 37) | def test_case_convert_camel_with_digits_to_snake():
  function test_is_country_tag (line 44) | def test_is_country_tag():
  function test_enrich_link_str (line 54) | def test_enrich_link_str():
  function test_url_extract_main_part_negative (line 62) | def test_url_extract_main_part_negative():
  function test_url_extract_main_part (line 67) | def test_url_extract_main_part():
  function test_url_make_profile_url_regexp (line 86) | def test_url_make_profile_url_regexp():
  function test_get_dict_ascii_tree (line 106) | def test_get_dict_ascii_tree():
  function test_get_match_ratio (line 143) | def test_get_match_ratio():

FILE: utils/add_tags.py
  function update_tags (line 9) | def update_tags(site):

FILE: utils/check_engines.py
  function check_engine_of_site (line 14) | async def check_engine_of_site(site_name, sites_with_engines, future, en...
  function update_site_data (line 106) | async def update_site_data(site_name, site_data, all_sites, logger, no_p...

FILE: utils/import_sites.py
  function maigret_check (line 18) | async def maigret_check(site, site_data, username, status, logger):
  function check_and_add_maigret_site (line 57) | async def check_and_add_maigret_site(site_data, semaphore, logger, ok_us...
  function create_site_from_engine (line 236) | def create_site_from_engine(sitename, data, e):

FILE: utils/sites_diff.py
  function get_match_ratio (line 26) | def get_match_ratio(x):

FILE: utils/update_site_data.py
  function get_rank (line 30) | def get_rank(domain_to_query, site, print_errors=True):
  function get_step_rank (line 59) | def get_step_rank(rank):
  function main (line 70) | def main():

FILE: wizard.py
  function main (line 11) | def main():
Condensed preview — 98 files, each showing path, character count, and a content snippet. Download the .json file or copy for the full structured content (2,394K chars).
[
  {
    "path": ".dockerignore",
    "chars": 62,
    "preview": ".git/\n.vscode/\nstatic/\ntests/\n*.txt\n!/requirements.txt\nvenv/\n\n"
  },
  {
    "path": ".githooks/pre-commit",
    "chars": 83,
    "preview": "#!/bin/sh\necho 'Activating update_sitesmd hook script...'\npoetry run update_sitesmd"
  },
  {
    "path": ".github/FUNDING.yml",
    "chars": 98,
    "preview": "# These are supported funding model platforms\n\npatreon: soxoj\ngithub: soxoj\nbuy_me_a_coffee: soxoj"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/add-a-site.md",
    "chars": 337,
    "preview": "---\nname: Add a site\nabout: I want to add a new site for Maigret checks\ntitle: New site\nlabels: new-site\nassignees: soxo"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/bug.md",
    "chars": 629,
    "preview": "---\nname: Maigret bug report\nabout: I want to report a bug in Maigret functionality\ntitle: ''\nlabels: bug\nassignees: sox"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/report-false-result.md",
    "chars": 288,
    "preview": "---\nname: Report invalid result\nabout: I want to report invalid result of Maigret search\ntitle: Invalid result\nlabels: f"
  },
  {
    "path": ".github/dependabot.yml",
    "chars": 106,
    "preview": "version: 2\nupdates:\n  - package-ecosystem: \"pip\"\n    directory: \"/\"\n    schedule:\n      interval: \"daily\"\n"
  },
  {
    "path": ".github/workflows/build-docker-image.yml",
    "chars": 852,
    "preview": "name: Build docker image and push to DockerHub\n\non:\n  push:\n    branches: [ main ]\n\njobs:\n  docker:\n    runs-on: ubuntu-"
  },
  {
    "path": ".github/workflows/codeql-analysis.yml",
    "chars": 2219,
    "preview": "# For most projects, this workflow file will not need changing; you simply need\n# to commit it to your repository.\n#\n# Y"
  },
  {
    "path": ".github/workflows/pyinstaller.yml",
    "chars": 1755,
    "preview": "name: Package exe with PyInstaller - Windows\n\non:\n  push:\n    branches: [ main, dev ]\n\njobs:\n  build:\n    runs-on: ubunt"
  },
  {
    "path": ".github/workflows/python-package.yml",
    "chars": 1080,
    "preview": "name: Linting and testing\n\non:\n  push:\n    branches: [ main ]\n  pull_request:\n    branches: [ main ]\n    types: [opened,"
  },
  {
    "path": ".github/workflows/python-publish.yml",
    "chars": 480,
    "preview": "name: Upload Python Package to PyPI when a Release is Created\non:\n  release:\n    types: [created]\n  push:\n    tags:\n    "
  },
  {
    "path": ".github/workflows/update-site-data.yml",
    "chars": 1091,
    "preview": "name: Update sites rating and statistics\n\non:\n  pull_request:\n    branches: [ dev ]\n    types: [opened, synchronize]\n\njo"
  },
  {
    "path": ".gitignore",
    "chars": 451,
    "preview": "# Virtual Environment\nvenv/\n.venv/\n\n# Editor Configurations\n.vscode/\n.idea/\n\n# Python\n__pycache__/\n\n# Pip\nsrc/\n\n# Jupyte"
  },
  {
    "path": ".readthedocs.yaml",
    "chars": 192,
    "preview": "version: 2\n\nbuild:\n  os: ubuntu-22.04\n  tools:\n    python: \"3.10\"\n\nsphinx:\n  configuration: docs/source/conf.py\n\nformats"
  },
  {
    "path": "CHANGELOG.md",
    "chars": 46156,
    "preview": "# Changelog\n\n## [0.5.0] - 2025-08-10\n* Site Supression by @C3n7ral051nt4g3ncy in https://github.com/soxoj/maigret/pull/6"
  },
  {
    "path": "CODE_OF_CONDUCT.md",
    "chars": 5220,
    "preview": "# Contributor Covenant Code of Conduct\n\n## Our Pledge\n\nWe as members, contributors, and leaders pledge to make participa"
  },
  {
    "path": "CONTRIBUTING.md",
    "chars": 2118,
    "preview": "# How to contribute\n\nHey! I'm really glad you're reading this. Maigret contains a lot of sites, and it is very hard to k"
  },
  {
    "path": "Dockerfile",
    "chars": 529,
    "preview": "FROM python:3.11-slim\nLABEL maintainer=\"Soxoj <soxoj@protonmail.com>\"\nWORKDIR /app\nRUN pip install --no-cache-dir --upgr"
  },
  {
    "path": "Installer.bat",
    "chars": 3431,
    "preview": "@echo off\r\ngoto check_Permissions\r\n\r\n:check_Permissions\r\nnet session >nul 2>&1\r\nif %errorLevel% == 0 (\r\n    echo Success"
  },
  {
    "path": "LICENSE",
    "chars": 1103,
    "preview": "MIT License\n\nCopyright (c) 2019 Sherlock Project\nCopyright (c) 2020-2021 Soxoj\n\nPermission is hereby granted, free of ch"
  },
  {
    "path": "MANIFEST.in",
    "chars": 87,
    "preview": "include LICENSE\ninclude README.md\ninclude requirements.txt\ninclude maigret/resources/*\n"
  },
  {
    "path": "Makefile",
    "chars": 947,
    "preview": "LINT_FILES=maigret wizard.py tests\n\ntest:\n\tcoverage run --source=./maigret,./maigret/web -m pytest tests\n\tcoverage repor"
  },
  {
    "path": "README.md",
    "chars": 9337,
    "preview": "# Maigret\n\n<p align=\"center\">\n  <p align=\"center\">\n    <a href=\"https://pypi.org/project/maigret/\">\n        <img alt=\"Py"
  },
  {
    "path": "docs/Makefile",
    "chars": 638,
    "preview": "# Minimal makefile for Sphinx documentation\n#\n\n# You can set these variables from the command line, and also\n# from the "
  },
  {
    "path": "docs/make.bat",
    "chars": 764,
    "preview": "@ECHO OFF\n\npushd %~dp0\n\nREM Command file for Sphinx documentation\n\nif \"%SPHINXBUILD%\" == \"\" (\n\tset SPHINXBUILD=sphinx-bu"
  },
  {
    "path": "docs/requirements.txt",
    "chars": 34,
    "preview": "sphinx-copybutton\nsphinx_rtd_theme"
  },
  {
    "path": "docs/source/command-line-options.rst",
    "chars": 4591,
    "preview": ".. _command-line-options:\n\nCommand line options\n====================\n\nUsernames\n---------\n\n``maigret username1 username2"
  },
  {
    "path": "docs/source/conf.py",
    "chars": 728,
    "preview": "# Configuration file for the Sphinx documentation builder.\n\n# -- Project information\n\nproject = 'Maigret'\ncopyright = '2"
  },
  {
    "path": "docs/source/development.rst",
    "chars": 9246,
    "preview": ".. _development:\n\nDevelopment\n==============\n\nFrequently Asked Questions\n--------------------------\n\n1. Where to find th"
  },
  {
    "path": "docs/source/features.rst",
    "chars": 10245,
    "preview": ".. _features:\n\nFeatures\n========\n\nThis is the list of Maigret features.\n\n.. _web-interface:\n\nWeb Interface\n-------------"
  },
  {
    "path": "docs/source/index.rst",
    "chars": 1715,
    "preview": ".. _index:\n\nWelcome to the Maigret docs!\n============================\n\n**Maigret** is an easy-to-use and powerful OSINT "
  },
  {
    "path": "docs/source/installation.rst",
    "chars": 2839,
    "preview": ".. _installation:\n\nInstallation\n============\n\nMaigret can be installed using pip, Docker, or simply can be launched from"
  },
  {
    "path": "docs/source/philosophy.rst",
    "chars": 745,
    "preview": ".. _philosophy:\n\nPhilosophy\n==========\n\nTL;DR: Username => Dossier\n\nMaigret is designed to gather all the available info"
  },
  {
    "path": "docs/source/quick-start.rst",
    "chars": 524,
    "preview": ".. _quick-start:\n\nQuick start\n===========\n\nAfter :doc:`installing Maigret <installation>`, you can begin searching by pr"
  },
  {
    "path": "docs/source/settings.rst",
    "chars": 907,
    "preview": ".. _settings:\n\nSettings\n==============\n\n.. warning::\n   The settings system is under development and may be subject to c"
  },
  {
    "path": "docs/source/supported-identifier-types.rst",
    "chars": 824,
    "preview": ".. _supported-identifier-types:\n\nSupported identifier types\n==========================\n\nMaigret can search against not o"
  },
  {
    "path": "docs/source/tags.rst",
    "chars": 1230,
    "preview": ".. _tags:\n\nTags\n====\n\nThe use of tags allows you to select a subset of the sites from big Maigret DB for search.\n\n.. war"
  },
  {
    "path": "docs/source/usage-examples.rst",
    "chars": 2087,
    "preview": ".. _usage-examples:\n\nUsage examples\n==============\n\nYou can use Maigret as:\n\n- a command line tool: initial and a defaul"
  },
  {
    "path": "maigret/__init__.py",
    "chars": 348,
    "preview": "\"\"\"Maigret\"\"\"\n\n__title__ = 'Maigret'\n__package__ = 'maigret'\n__author__ = 'Soxoj'\n__author_email__ = 'soxoj@protonmail.c"
  },
  {
    "path": "maigret/__main__.py",
    "chars": 147,
    "preview": "#! /usr/bin/env python3\n\n\"\"\"\nMaigret entrypoint\n\"\"\"\n\nimport asyncio\n\nfrom .maigret import main\n\nif __name__ == \"__main__"
  },
  {
    "path": "maigret/__version__.py",
    "chars": 50,
    "preview": "\"\"\"Maigret version file\"\"\"\n\n__version__ = '0.5.0'\n"
  },
  {
    "path": "maigret/activation.py",
    "chars": 3157,
    "preview": "import json\nfrom http.cookiejar import MozillaCookieJar\nfrom http.cookies import Morsel\n\nfrom aiohttp import CookieJar\n\n"
  },
  {
    "path": "maigret/checking.py",
    "chars": 31935,
    "preview": "# Standard library imports\nimport ast\nimport asyncio\nimport logging\nimport random\nimport re\nimport ssl\nimport sys\nfrom t"
  },
  {
    "path": "maigret/errors.py",
    "chars": 5248,
    "preview": "from typing import Dict, List, Any, Tuple\n\nfrom .result import MaigretCheckResult\nfrom .types import QueryResultWrapper\n"
  },
  {
    "path": "maigret/executors.py",
    "chars": 8587,
    "preview": "import asyncio\nimport sys\nimport time\nfrom typing import Any, Iterable, List, Callable\n\nimport alive_progress\nfrom alive"
  },
  {
    "path": "maigret/maigret.py",
    "chars": 25249,
    "preview": "\"\"\"\nMaigret main module\n\"\"\"\n\nimport ast\nimport asyncio\nimport logging\nimport os\nimport sys\nimport platform\nimport re\nfro"
  },
  {
    "path": "maigret/notify.py",
    "chars": 9093,
    "preview": "\"\"\"Sherlock Notify Module\n\nThis module defines the objects for notifying the caller about the\nresults of queries.\n\"\"\"\n\ni"
  },
  {
    "path": "maigret/permutator.py",
    "chars": 1201,
    "preview": "# License MIT. by balestek https://github.com/balestek\nfrom itertools import permutations\n\n\nclass Permute:\n    def __ini"
  },
  {
    "path": "maigret/report.py",
    "chars": 19624,
    "preview": "import ast\nimport csv\nimport io\nimport json\nimport logging\nimport os\nfrom datetime import datetime\nfrom typing import Di"
  },
  {
    "path": "maigret/resources/data.json",
    "chars": 1244159,
    "preview": "{\n    \"sites\": {\n        \"0-3.RU\": {\n            \"tags\": [\n                \"forum\",\n                \"ru\"\n            ],\n"
  },
  {
    "path": "maigret/resources/simple_report.tpl",
    "chars": 5549,
    "preview": "<html>\n<head>\n    <meta charset=\"utf-8\" />\n</head>\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0,"
  },
  {
    "path": "maigret/resources/simple_report_pdf.css",
    "chars": 575,
    "preview": "h2 {\n  font-size: 30px;\n  width: 100%;\n  display:block;\n}\nh3 {\n  font-size: 25px;\n  width: 100%;\n  display:block;\n}\nh4 {"
  },
  {
    "path": "maigret/resources/simple_report_pdf.tpl",
    "chars": 5420,
    "preview": "<html>type=\"text/css\"\n<head>\n    <meta charset=\"utf-8\" />\n</head>\n<meta name=\"viewport\" content=\"width=device-width, ini"
  },
  {
    "path": "maigret/result.py",
    "chars": 3678,
    "preview": "\"\"\"Maigret Result Module\n\nThis module defines various objects for recording the results of queries.\n\"\"\"\n\nfrom enum impor"
  },
  {
    "path": "maigret/settings.py",
    "chars": 2103,
    "preview": "import os\nimport os.path as path\nimport json\nfrom typing import List\n\nSETTINGS_FILES_PATHS = [\n    path.join(path.dirnam"
  },
  {
    "path": "maigret/sites.py",
    "chars": 20544,
    "preview": "# ****************************** -*-\n\"\"\"Maigret Sites Information\"\"\"\nimport copy\nimport json\nimport sys\nfrom typing impo"
  },
  {
    "path": "maigret/submit.py",
    "chars": 24120,
    "preview": "import asyncio\nimport json\nimport re\nimport os\nimport logging\nfrom typing import Any, Dict, List, Optional, Tuple\n\nfrom "
  },
  {
    "path": "maigret/types.py",
    "chars": 211,
    "preview": "from typing import Callable, List, Dict, Tuple, Any\n\n\n# search query\nQueryDraft = Tuple[Callable, List, Dict]\n\n# options"
  },
  {
    "path": "maigret/utils.py",
    "chars": 3729,
    "preview": "# coding: utf8\nimport ast\nimport difflib\nimport re\nimport random\nimport string\nfrom typing import Any\n\n\nDEFAULT_USER_AGE"
  },
  {
    "path": "maigret/web/app.py",
    "chars": 12216,
    "preview": "from flask import (\n    Flask,\n    render_template,\n    request,\n    send_file,\n    Response,\n    flash,\n    redirect,\n "
  },
  {
    "path": "maigret/web/templates/base.html",
    "chars": 3250,
    "preview": "<!DOCTYPE html>\n<html lang=\"en\" data-bs-theme=\"dark\">\n\n<head>\n    <meta charset=\"UTF-8\">\n    <meta name=\"viewport\" conte"
  },
  {
    "path": "maigret/web/templates/index.html",
    "chars": 15418,
    "preview": "{% extends \"base.html\" %}\n\n{% block content %}\n<style>\n    .tag-cloud {\n        display: flex;\n        flex-wrap: wrap;\n"
  },
  {
    "path": "maigret/web/templates/results.html",
    "chars": 4999,
    "preview": "{% extends \"base.html\" %}\n{% block content %}\n<style>\n    .tag-badge {\n       background-color: #214e7b;\n       padding:"
  },
  {
    "path": "maigret/web/templates/status.html",
    "chars": 565,
    "preview": "{% extends \"base.html\" %}\n{% block content %}\n<div class=\"container mt-4 text-center\">\n    <h2>Search in progress...</h2"
  },
  {
    "path": "pyinstaller/maigret_standalone.py",
    "chars": 112,
    "preview": "#!/usr/bin/env python3\nimport asyncio\n\nimport maigret\n\nif __name__ == \"__main__\":\n    asyncio.run(maigret.cli())"
  },
  {
    "path": "pyinstaller/maigret_standalone.spec",
    "chars": 1585,
    "preview": "# -*- mode: python ; coding: utf-8 -*-\nfrom PyInstaller.utils.hooks import collect_all\n\ndatas = []\nbinaries = []\nhiddeni"
  },
  {
    "path": "pyinstaller/requirements.txt",
    "chars": 207,
    "preview": "maigret @ https://github.com/soxoj/maigret/archive/refs/heads/main.zip\npefile==2023.2.7 # do not bump while pyinstaller "
  },
  {
    "path": "pyproject.toml",
    "chars": 2485,
    "preview": "[build-system]\nrequires = [\"poetry-core\"]\nbuild-backend = \"poetry.core.masonry.api\"\n\n[tool.poetry]\nname = \"maigret\"\nvers"
  },
  {
    "path": "pytest.ini",
    "chars": 90,
    "preview": "# pytest.ini\n[pytest]\nfilterwarnings =\n    error\n    ignore::UserWarning\nasyncio_mode=auto"
  },
  {
    "path": "sites.md",
    "chars": 480598,
    "preview": "\n## List of supported sites (search methods): total 3143\n\nRank data fetched from Alexa by domains.\n\n1. ![](https://www.g"
  },
  {
    "path": "snapcraft.yaml",
    "chars": 1015,
    "preview": "title: Maigret\nicon: static/maigret.png\nname: maigret\nsummary: 🕵️‍♂️ Collect a dossier on a person by username from thou"
  },
  {
    "path": "static/recursive_search.md",
    "chars": 5113,
    "preview": "## Demo with page parsing and recursive username search\n\n```bash\n$ maigret.py alexaimephotographycars\nSites in database,"
  },
  {
    "path": "static/report_alexaimephotographycars.html",
    "chars": 90104,
    "preview": "<html>\n<head>\n    <meta charset=\"utf-8\" />\n</head>\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0,"
  },
  {
    "path": "tests/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "tests/conftest.py",
    "chars": 2973,
    "preview": "import glob\nimport logging\nimport os\n\nimport pytest\nfrom _pytest.mark import Mark\n\nfrom maigret.sites import MaigretData"
  },
  {
    "path": "tests/db.json",
    "chars": 2275,
    "preview": "{\n    \"engines\": {\n        \"Discourse\": {\n            \"name\": \"Discourse\",\n            \"site\": {\n                \"presen"
  },
  {
    "path": "tests/local.json",
    "chars": 679,
    "preview": "{\n    \"engines\": {},\n    \"sites\": {\n        \"StatusCode\": {\n            \"checkType\": \"status_code\",\n            \"url\": \""
  },
  {
    "path": "tests/test_activation.py",
    "chars": 1816,
    "preview": "\"\"\"Maigret activation test functions\"\"\"\n\nimport json\nimport yarl\n\nimport aiohttp\nimport pytest\nfrom mock import Mock\n\nfr"
  },
  {
    "path": "tests/test_checking.py",
    "chars": 2491,
    "preview": "from mock import Mock\nimport pytest\n\nfrom maigret import search\n\n\ndef site_result_except(server, username, **kwargs):\n  "
  },
  {
    "path": "tests/test_cli.py",
    "chars": 2618,
    "preview": "\"\"\"Maigret command-line arguments parsing tests\"\"\"\n\nfrom argparse import Namespace\nfrom typing import Dict, Any\n\nDEFAULT"
  },
  {
    "path": "tests/test_data.py",
    "chars": 599,
    "preview": "\"\"\"Maigret data test functions\"\"\"\n\nimport pytest\nfrom maigret.utils import is_country_tag\n\n\n@pytest.mark.slow\ndef test_t"
  },
  {
    "path": "tests/test_errors.py",
    "chars": 1828,
    "preview": "import pytest\nfrom maigret.errors import notify_about_errors, CheckError\nfrom maigret.types import QueryResultWrapper\nfr"
  },
  {
    "path": "tests/test_executors.py",
    "chars": 3921,
    "preview": "\"\"\"Maigret checking logic test functions\"\"\"\n\nimport pytest\nimport asyncio\nimport logging\nfrom maigret.executors import ("
  },
  {
    "path": "tests/test_maigret.py",
    "chars": 3864,
    "preview": "\"\"\"Maigret main module test functions\"\"\"\n\nimport asyncio\nimport copy\n\nimport pytest\nfrom mock import Mock\n\nfrom maigret."
  },
  {
    "path": "tests/test_notify.py",
    "chars": 1703,
    "preview": "from maigret.errors import CheckError\nfrom maigret.notify import QueryNotifyPrint\nfrom maigret.result import MaigretChec"
  },
  {
    "path": "tests/test_permutator.py",
    "chars": 955,
    "preview": "import pytest\nfrom maigret.permutator import Permute\n\n\ndef test_gather_strict():\n    elements = {'a': 1, 'b': 2}\n    per"
  },
  {
    "path": "tests/test_report.py",
    "chars": 15392,
    "preview": "\"\"\"Maigret reports test functions\"\"\"\n\nimport copy\nimport json\nimport os\nimport pytest\nfrom io import StringIO\n\nimport xm"
  },
  {
    "path": "tests/test_sites.py",
    "chars": 6642,
    "preview": "\"\"\"Maigret Database test functions\"\"\"\n\nfrom maigret.sites import MaigretDatabase, MaigretSite\n\nEXAMPLE_DB = {\n    'engin"
  },
  {
    "path": "tests/test_submit.py",
    "chars": 8140,
    "preview": "import pytest\nfrom unittest.mock import MagicMock, patch\nfrom maigret.submit import Submitter\nfrom aiohttp import Client"
  },
  {
    "path": "tests/test_utils.py",
    "chars": 4505,
    "preview": "\"\"\"Maigret utils test functions\"\"\"\n\nimport itertools\nimport re\n\nfrom maigret.utils import (\n    CaseConverter,\n    is_co"
  },
  {
    "path": "utils/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "utils/add_tags.py",
    "chars": 1742,
    "preview": "#!/usr/bin/env python3\nimport random\nfrom argparse import ArgumentParser, RawDescriptionHelpFormatter\n\nfrom maigret.maig"
  },
  {
    "path": "utils/check_engines.py",
    "chars": 5368,
    "preview": "#!/usr/bin/env python3\n\"\"\"Maigret: Supported Site Listing with Alexa ranking and country tags\nThis module generates the "
  },
  {
    "path": "utils/import_sites.py",
    "chars": 9373,
    "preview": "#!/usr/bin/env python3\nimport json\nimport random\nimport re\n\nimport alive_progress\nfrom mock import Mock\nimport requests\n"
  },
  {
    "path": "utils/sites_diff.py",
    "chars": 787,
    "preview": "import sys\nimport difflib\nimport requests\n\n\na = requests.get(sys.argv[1]).text\nb = requests.get(sys.argv[2]).text\n\n\ntoke"
  },
  {
    "path": "utils/update_site_data.py",
    "chars": 5406,
    "preview": "#!/usr/bin/env python3\n\"\"\"Maigret: Supported Site Listing with Alexa ranking and country tags\nThis module generates the "
  },
  {
    "path": "wizard.py",
    "chars": 1791,
    "preview": "import asyncio\nimport logging\nimport maigret\n\n\nTOP_SITES_COUNT = 300\nTIMEOUT = 10\nMAX_CONNECTIONS = 50\n\n\ndef main():\n   "
  }
]
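
The listing above is an array of objects, each carrying a `path`, a `chars` count, and a `preview` snippet. As a minimal sketch of how such a listing could be summarized, the Python below totals `chars` per top-level directory. The function name `summarize_preview` is hypothetical, and the sample reuses three entries from the listing; it is not part of maigret or of the extraction tool.

```python
from collections import defaultdict


def summarize_preview(entries):
    """Sum the 'chars' field per top-level directory.

    `entries` is assumed to be a list of dicts shaped like the listing
    above: {"path": str, "chars": int, "preview": str}.
    Files at the repository root are grouped under ".".
    """
    totals = defaultdict(int)
    for entry in entries:
        path = entry["path"]
        # Everything before the first "/" is the top-level directory.
        top = path.split("/", 1)[0] if "/" in path else "."
        totals[top] += entry["chars"]
    return dict(totals)


# Three entries copied from the listing (previews shortened to "").
sample = [
    {"path": "maigret/__version__.py", "chars": 50, "preview": ""},
    {"path": "tests/__init__.py", "chars": 0, "preview": ""},
    {"path": "pytest.ini", "chars": 90, "preview": ""},
]

result = summarize_preview(sample)
print(result)
```

If the full listing is saved to a local file, the same function could be fed the result of `json.load` on that file, assuming the file keeps the array-of-objects shape shown here.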

About this extraction

This page contains the full source code of the soxoj/maigret GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 98 files (2.1 MB), approximately 562.4k tokens, and a symbol index with 311 extracted functions, classes, methods, constants, and types.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.