Repository: sensity-ai/dot
Branch: main
Commit: 64cb9db61047
Files: 127
Total size: 490.8 KB
Directory structure:
dot/
├── .flake8
├── .github/
│ ├── ISSUE_TEMPLATE/
│ │ ├── ask_a_question.md
│ │ ├── bug_report.md
│ │ ├── documentation.md
│ │ └── feature_request.md
│ ├── PULL_REQUEST_TEMPLATE.md
│ └── workflows/
│ ├── build_dot.yaml
│ └── code_check.yaml
├── .gitignore
├── .pre-commit-config.yaml
├── .yamllint
├── CHANGELOG.md
├── CONTRIBUTING.md
├── Dockerfile
├── LICENSE
├── README.md
├── configs/
│ ├── faceswap_cv2.yaml
│ ├── fomm.yaml
│ ├── simswap.yaml
│ └── simswaphq.yaml
├── docker-compose.yml
├── docs/
│ ├── create_executable.md
│ ├── profiling.md
│ └── run_without_camera.md
├── envs/
│ ├── environment-apple-m2.yaml
│ ├── environment-cpu.yaml
│ └── environment-gpu.yaml
├── notebooks/
│ └── colab_demo.ipynb
├── pyproject.toml
├── requirements-apple-m2.txt
├── requirements-dev.txt
├── requirements.txt
├── scripts/
│ ├── image_swap.py
│ ├── metadata_swap.py
│ ├── profile_simswap.py
│ └── video_swap.py
├── setup.cfg
├── src/
│ └── dot/
│ ├── __init__.py
│ ├── __main__.py
│ ├── commons/
│ │ ├── __init__.py
│ │ ├── cam/
│ │ │ ├── __init__.py
│ │ │ ├── cam.py
│ │ │ └── camera_selector.py
│ │ ├── camera_utils.py
│ │ ├── model_option.py
│ │ ├── pose/
│ │ │ └── head_pose.py
│ │ ├── utils.py
│ │ └── video/
│ │ ├── __init__.py
│ │ ├── video_utils.py
│ │ └── videocaptureasync.py
│ ├── dot.py
│ ├── faceswap_cv2/
│ │ ├── __init__.py
│ │ ├── generic.py
│ │ ├── option.py
│ │ └── swap.py
│ ├── fomm/
│ │ ├── __init__.py
│ │ ├── config/
│ │ │ └── vox-adv-256.yaml
│ │ ├── face_alignment.py
│ │ ├── modules/
│ │ │ ├── __init__.py
│ │ │ ├── dense_motion.py
│ │ │ ├── generator_optim.py
│ │ │ ├── keypoint_detector.py
│ │ │ └── util.py
│ │ ├── option.py
│ │ ├── predictor_local.py
│ │ └── sync_batchnorm/
│ │ ├── __init__.py
│ │ ├── batchnorm.py
│ │ └── comm.py
│ ├── gpen/
│ │ ├── __init__.py
│ │ ├── __init_paths.py
│ │ ├── align_faces.py
│ │ ├── face_enhancement.py
│ │ ├── face_model/
│ │ │ ├── __init__.py
│ │ │ ├── face_gan.py
│ │ │ ├── model.py
│ │ │ └── op/
│ │ │ ├── __init__.py
│ │ │ ├── fused_act.py
│ │ │ ├── fused_act_v2.py
│ │ │ ├── fused_bias_act.cpp
│ │ │ ├── fused_bias_act_kernel.cu
│ │ │ ├── upfirdn2d.cpp
│ │ │ ├── upfirdn2d.py
│ │ │ ├── upfirdn2d_kernel.cu
│ │ │ └── upfirdn2d_v2.py
│ │ └── retinaface/
│ │ ├── __init__.py
│ │ ├── data/
│ │ │ ├── FDDB/
│ │ │ │ └── img_list.txt
│ │ │ ├── __init__.py
│ │ │ ├── config.py
│ │ │ ├── data_augment.py
│ │ │ └── wider_face.py
│ │ ├── facemodels/
│ │ │ ├── __init__.py
│ │ │ ├── net.py
│ │ │ └── retinaface.py
│ │ ├── layers/
│ │ │ ├── __init__.py
│ │ │ ├── functions/
│ │ │ │ └── prior_box.py
│ │ │ └── modules/
│ │ │ ├── __init__.py
│ │ │ └── multibox_loss.py
│ │ ├── retinaface_detection.py
│ │ └── utils/
│ │ ├── __init__.py
│ │ ├── box_utils.py
│ │ ├── nms/
│ │ │ ├── __init__.py
│ │ │ └── py_cpu_nms.py
│ │ └── timer.py
│ ├── simswap/
│ │ ├── __init__.py
│ │ ├── configs/
│ │ │ ├── config.yaml
│ │ │ └── config_512.yaml
│ │ ├── fs_model.py
│ │ ├── mediapipe/
│ │ │ ├── __init__.py
│ │ │ ├── face_mesh.py
│ │ │ └── utils/
│ │ │ ├── face_align_ffhqandnewarc.py
│ │ │ └── mediapipe_landmarks.py
│ │ ├── models/
│ │ │ ├── __init__.py
│ │ │ ├── arcface_models.py
│ │ │ ├── base_model.py
│ │ │ ├── fs_networks.py
│ │ │ ├── fs_networks_512.py
│ │ │ └── models.py
│ │ ├── option.py
│ │ ├── parsing_model/
│ │ │ ├── __init__.py
│ │ │ ├── model.py
│ │ │ └── resnet.py
│ │ └── util/
│ │ ├── __init__.py
│ │ ├── norm.py
│ │ ├── reverse2original.py
│ │ └── util.py
│ └── ui/
│ └── ui.py
└── tests/
└── pipeline_test.py
================================================
FILE CONTENTS
================================================
================================================
FILE: .flake8
================================================
[flake8]
max-line-length = 120
extend-ignore = E203
per-file-ignores = __init__.py:F401
================================================
FILE: .github/ISSUE_TEMPLATE/ask_a_question.md
================================================
---
name: Ask a Question
about: Ask a question about using dot
labels: question
---
## :question: Ask a Question:
### Description:
<!-- A clear and concise description of your question. Ex. what is/how to [...] -->
================================================
FILE: .github/ISSUE_TEMPLATE/bug_report.md
================================================
---
name: Bug Report
about: Report bugs to improve dot
labels: bug
---
## :bug: Bug Report
<!-- Note: Remove sections from the template that are not relevant to the issue. -->
### Description:
#### Actual Behavior:
<!-- A clear and concise description of what the bug is, including steps for reproducing it. -->
#### Expected Behavior:
<!-- A clear and concise description of what you expected to happen. -->
================================================
FILE: .github/ISSUE_TEMPLATE/documentation.md
================================================
---
name: Documentation
about: Report an issue related to dot documentation
labels: documentation
---
## :memo: Documentation
<!-- Note: Remove sections from the template that are not relevant to the issue. -->
### Description:
<!-- A clear and concise description of what needs to be added, updated or removed from current documentation. -->
================================================
FILE: .github/ISSUE_TEMPLATE/feature_request.md
================================================
---
name: Feature Request
about: Submit a feature request for dot
labels: feature
---
## :sparkles: Feature Request
<!-- Note: Remove sections from the template that are not relevant to the issue. -->
### Description:
<!-- A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] -->
================================================
FILE: .github/PULL_REQUEST_TEMPLATE.md
================================================
<!-- Is this pull request ready for review? (if not, please submit in draft mode) -->
## Description
<!--
Please include a summary of the change and which issue is fixed.
Please also include relevant motivation and context.
List any dependencies that are required for this change.
-->
<!-- remove if not applicable -->
Fixes #(issue-number)
### Changelog:
<!--
Add changes in a list and add issue number in brackets, if required.
Remove sections which are not applicable and remember to update CHANGELOG.md as well.
-->
#### Added:
- ...
#### Updated:
- ...
#### Fixed:
- ...
#### Removed:
- ...
================================================
FILE: .github/workflows/build_dot.yaml
================================================
name: build-dot
on:
push:
branches:
- main
paths-ignore:
- "**.md"
pull_request:
types: [opened, synchronize, reopened, ready_for_review]
paths-ignore:
- "**.md"
jobs:
build-and-test:
if: github.event.pull_request.draft == false
runs-on: ubuntu-latest
steps:
- name: Code Checkout
uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: 3.8
cache: 'pip'
cache-dependency-path: 'requirements*.txt'
- name: Install dependencies
run: |
sudo apt-get update && sudo apt-get install -y ffmpeg libsndfile1-dev
pip install -r requirements.txt
pip install -e .
- name: Unit Tests
run: |
pip install -c requirements-dev.txt --force-reinstall pytest pytest-cov
pytest --cov=src --cov-report=term-missing:skip-covered --cov-fail-under=10
================================================
FILE: .github/workflows/code_check.yaml
================================================
name: code-check
on:
push:
branches:
- main
pull_request:
types: [opened, synchronize, reopened, ready_for_review]
jobs:
code-check:
if: github.event.pull_request.draft == false
runs-on: ubuntu-latest
steps:
- name: Code Checkout
uses: actions/checkout@v2
- name: Set up Python 3.8
uses: actions/setup-python@v2
with:
python-version: 3.8
cache: 'pip'
cache-dependency-path: 'requirements*.txt'
- uses: actions/cache@v3
with:
path: ~/.cache/pre-commit
key: ${{ runner.os }}-pre-commit-${{ hashFiles('.pre-commit-config.yaml') }}
- name: Code Check
run: |
pip install pre-commit
pre-commit run --all --show-diff-on-failure
================================================
FILE: .gitignore
================================================
# repo ignores
data/results/*
saved_models/*
*.patch
# Created by https://www.toptal.com/developers/gitignore/api/python,macos,windows,linux
# Edit at https://www.toptal.com/developers/gitignore?templates=python,macos,windows,linux
### Linux ###
*~
# temporary files which can be created if a process still has a handle open of a deleted file
.fuse_hidden*
# KDE directory preferences
.directory
# Linux trash folder which might appear on any partition or disk
.Trash-*
# .nfs files are created when an open file is removed but is still being accessed
.nfs*
### macOS ###
# General
.DS_Store
.AppleDouble
.LSOverride
# Icon must end with two \r
Icon
# Thumbnails
._*
# Files that might appear in the root of a volume
.DocumentRevisions-V100
.fseventsd
.Spotlight-V100
.TemporaryItems
.Trashes
.VolumeIcon.icns
.com.apple.timemachine.donotpresent
# Directories potentially created on remote AFP share
.AppleDB
.AppleDesktop
Network Trash Folder
Temporary Items
.apdisk
### macOS Patch ###
# iCloud generated files
*.icloud
### Python ###
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
.pybuilder/
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock
# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# pytype static type analyzer
.pytype/
# Cython debug symbols
cython_debug/
# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
### Windows ###
# Windows thumbnail cache files
Thumbs.db
Thumbs.db:encryptable
ehthumbs.db
ehthumbs_vista.db
# Dump file
*.stackdump
# Folder config file
[Dd]esktop.ini
# Recycle Bin used on file shares
$RECYCLE.BIN/
# Windows Installer files
*.cab
*.msi
*.msix
*.msm
*.msp
# Windows shortcuts
*.lnk
# End of https://www.toptal.com/developers/gitignore/api/python,macos,windows,linux
================================================
FILE: .pre-commit-config.yaml
================================================
default_language_version:
python: python3.8
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.3.0
hooks:
- id: check-json
- id: check-toml
- id: check-yaml
args: [--allow-multiple-documents]
- id: end-of-file-fixer
- id: mixed-line-ending
- id: trailing-whitespace
args: [--markdown-linebreak-ext=md]
exclude: "setup.cfg"
- repo: https://github.com/psf/black
rev: 22.6.0
hooks:
- id: black
- repo: https://github.com/PyCQA/flake8
rev: 6.0.0
hooks:
- id: flake8
args: [--max-line-length=150, --extend-ignore=E203]
- repo: https://github.com/PyCQA/isort
rev: 5.12.0
hooks:
- id: isort
args: ["--profile", "black"]
- repo: https://github.com/pre-commit/mirrors-mypy
rev: v0.961
hooks:
- id: mypy
files: ^dot/
args: [--ignore-missing, --no-strict-optional]
additional_dependencies: [types-pyyaml, types-requests]
================================================
FILE: .yamllint
================================================
---
yaml-files:
- '*.yaml'
- '*.yml'
- .yamllint
rules:
braces: enable
brackets: enable
colons: enable
commas: enable
comments:
level: warning
comments-indentation:
level: warning
document-end: disable
document-start: disable
empty-lines: enable
empty-values: disable
hyphens: enable
indentation: enable
key-duplicates: enable
key-ordering: disable
line-length: disable
new-line-at-end-of-file: enable
new-lines: enable
octal-values: disable
quoted-strings: disable
trailing-spaces: enable
truthy:
level: warning
================================================
FILE: CHANGELOG.md
================================================
# Changelog
All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [Unreleased]
* Fix fomm model download by @Ghassen-Chaabouni in https://github.com/sensity-ai/dot/pull/160
* Add video and image swap to the GUI by @Ghassen-Chaabouni in https://github.com/sensity-ai/dot/pull/116
## [1.3.0] - 2024-02-19
## What's Changed
* Trace error in CLI and UI by @Ghassen-Chaabouni in https://github.com/sensity-ai/dot/pull/137
* Update Windows executable by @Ghassen-Chaabouni in https://github.com/sensity-ai/dot/pull/133
* Update colab notebook by @Ghassen-Chaabouni in https://github.com/sensity-ai/dot/pull/128
* Add a Docker container for dot by @Ghassen-Chaabouni in https://github.com/sensity-ai/dot/pull/95
* Fix of cusolver error on GPU by @Ghassen-Chaabouni in https://github.com/sensity-ai/dot/pull/110
* Update the GUI, PyTorch and the documentation by @Ghassen-Chaabouni in https://github.com/sensity-ai/dot/pull/107
**Full Changelog**: https://github.com/sensity-ai/dot/compare/1.2.0...1.3.0
## [1.2.0] - 2023-07-20
## What's Changed
* Create a dot executable for windows by @Ghassen-Chaabouni in https://github.com/sensity-ai/dot/pull/92
* Add a graphical interface for dot by @Ghassen-Chaabouni in https://github.com/sensity-ai/dot/pull/85
* Update README and CONTRIBUTING by @giorgiop in https://github.com/sensity-ai/dot/pull/40
* Fix config paths in additional scripts under `scripts/` folder by @Ghassen-Chaabouni in https://github.com/sensity-ai/dot/pull/43
* Update README and add instructions for running dot with an Android emulator by @Ghassen-Chaabouni in https://github.com/sensity-ai/dot/pull/45
**Full Changelog**: https://github.com/sensity-ai/dot/compare/1.1.0...1.2.0
## [1.1.0] - 2022-07-27
## What's Changed
* Update readme by @giorgiop in https://github.com/sensity-ai/dot/pull/6
* Add more press on README.md by @giorgiop in https://github.com/sensity-ai/dot/pull/7
* [ImgBot] Optimize images by @imgbot in https://github.com/sensity-ai/dot/pull/8
* Update README to Download Models from Github Release Binaries by @ajndkr in https://github.com/sensity-ai/dot/pull/19
* Update README + Add Github Templates by @ajndkr in https://github.com/sensity-ai/dot/pull/16
* Verify camera ID when running dot in camera mode by @ajndkr in https://github.com/sensity-ai/dot/pull/18
* Add Feature to Use Config Files by @ajndkr in https://github.com/sensity-ai/dot/pull/17
* ⬆️ Bump numpy from 1.21.1 to 1.22.0 by @dependabot in https://github.com/sensity-ai/dot/pull/25
* Update python version to 3.8 by @vassilispapadop in https://github.com/sensity-ai/dot/pull/28
* Requirements changes now trigger CI by @giorgiop in https://github.com/sensity-ai/dot/pull/27
* Fix python3.8 pip cache location in CI by @ajndkr in https://github.com/sensity-ai/dot/pull/29
* Fix `--save_folder` CLI Option by @vassilispapadop and @Ghassen-Chaabouni in https://github.com/sensity-ai/dot/pull/26
* Add contributors list by @ajndkr in https://github.com/sensity-ai/dot/pull/31
* Add Google Colab demo notebook by @ajndkr in https://github.com/sensity-ai/dot/pull/33
* Speed up SimSwap's `reverse2original` by @ajndkr and @Ghassen-Chaabouni in https://github.com/sensity-ai/dot/pull/20
* Add `bumpversion` for semantic versioning by @ajndkr in https://github.com/sensity-ai/dot/pull/34
* Update README with speed metrics by @giorgiop in https://github.com/sensity-ai/dot/pull/37
## New Contributors
* @giorgiop made their first contribution in https://github.com/sensity-ai/dot/pull/6
* @ghassen1302 made their first contribution in https://github.com/sensity-ai/dot/pull/6
* @imgbot made their first contribution in https://github.com/sensity-ai/dot/pull/8
* @ajndkr made their first contribution in https://github.com/sensity-ai/dot/pull/19
* @dependabot made their first contribution in https://github.com/sensity-ai/dot/pull/25
* @vassilispapadop made their first contribution in https://github.com/sensity-ai/dot/pull/28
**Full Changelog**: https://github.com/sensity-ai/dot/compare/1.0.0...1.1.0
## [1.0.0] - 2022-06-04
* dot is open sourced
**Full Changelog**: https://github.com/sensity-ai/dot/commits/1.0.0
[Unreleased]: https://github.com/sensity-ai/dot/compare/1.3.0...HEAD
[1.3.0]: https://github.com/sensity-ai/dot/compare/1.2.0...1.3.0
[1.2.0]: https://github.com/sensity-ai/dot/compare/1.1.0...1.2.0
[1.1.0]: https://github.com/sensity-ai/dot/compare/1.0.0...1.1.0
[1.0.0]: https://github.com/sensity-ai/dot/releases/tag/1.0.0
================================================
FILE: CONTRIBUTING.md
================================================
# Contributing
When contributing to this repository, please refer to the following.
## Suggested Guidelines
1. When opening a pull request (PR), the title should be clear and concise in describing the changes. The PR description can include a more descriptive log of the changes.
2. If the pull request (PR) addresses a specific issue, link the PR to that issue. You can use the [Closing Keywords](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue) in the PR description to link the issue automatically; merging the PR will then close the linked issue.
3. This repository follows the [Google Python Style Guide](https://google.github.io/styleguide/pyguide.html) for code formatting.
4. If you are working on improving the speed of *dot*, please read first our guide on [code profiling](docs/profiling.md).
## Setup Dev-Tools
1. Install Dev Requirements
```bash
pip install -r requirements-dev.txt
```
2. Install Pre-Commit Hooks
```bash
pre-commit install
```
## CI/CD
Run Unit Tests (with coverage):
```bash
pytest --cov=src --cov-report=term-missing:skip-covered --cov-fail-under=10
```
Lock Base and Dev Requirements (pre-requisite: `pip install pip-tools==6.8.0`):
```bash
pip-compile setup.cfg
pip-compile --extra=dev --output-file=requirements-dev.txt --strip-extras setup.cfg
```
## Semantic Versioning
This repository follows the [Semantic Versioning](https://semver.org/) standard.
Bump a major release:
```bash
bumpversion major
```
Bump a minor release:
```bash
bumpversion minor
```
Bump a patch release:
```bash
bumpversion patch
```
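`bumpversion` reads the current version and the list of files to rewrite from a config file. A minimal sketch of such a config (hypothetical — the actual file name, version number, and file list in this repo may differ) looks like:

```ini
# .bumpversion.cfg (hypothetical example)
[bumpversion]
current_version = 1.3.0
commit = True
tag = True

# Files whose version strings should be rewritten on bump
[bumpversion:file:setup.cfg]
```

With this in place, `bumpversion minor` would rewrite the version in the listed files, commit the change, and tag the release.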
================================================
FILE: Dockerfile
================================================
FROM nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04
# copy repo codebase
COPY . ./dot
# set working directory
WORKDIR ./dot
ARG DEBIAN_FRONTEND=noninteractive
# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
# Needed by opencv
libglib2.0-0 libsm6 libgl1 \
libxext6 libxrender1 ffmpeg \
build-essential cmake wget unzip zip \
git libprotobuf-dev protobuf-compiler \
&& apt-get clean && rm -rf /var/lib/apt/lists/*
# Install Miniconda
RUN wget \
https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh \
&& mkdir /root/.conda \
&& bash Miniconda3-latest-Linux-x86_64.sh -b \
&& rm -f Miniconda3-latest-Linux-x86_64.sh
# Add Miniconda to the PATH environment variable
ENV PATH="/root/miniconda3/bin:${PATH}"
RUN conda --version
# Install requirements
RUN conda config --add channels conda-forge
RUN conda install python==3.8
RUN conda install pip==21.3
RUN pip install onnxruntime-gpu==1.9.0
RUN pip install -r requirements.txt
# Install pytorch
RUN pip install --no-cache-dir torch==2.0.1+cu118 torchvision==0.15.2+cu118 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118
# Install dot
RUN pip install -e .
# Download and extract the checkpoints
RUN pip install gdown
RUN gdown 1Qaf9hE62XSvgmxR43dfiwEPWWS_dXSCE
RUN unzip -o dot_model_checkpoints.zip
RUN rm -rf *.z*
ENTRYPOINT /bin/bash
================================================
FILE: LICENSE
================================================
Copyright (c) 2022, Sensity B.V.
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
================================================
FILE: README.md
================================================
<div align="center">
<h1> the Deepfake Offensive Toolkit </h1>
[](https://github.com/sensity-ai/dot/stargazers)
[](https://github.com/sensity-ai/dot/blob/main/LICENSE)
[](https://www.python.org/downloads/release/python-3812/)
[](https://github.com/sensity-ai/dot/actions/workflows/build_dot.yaml)
[](https://github.com/sensity-ai/dot/actions/workflows/code_check.yaml)
<a href="https://colab.research.google.com/github/sensity-ai/dot/blob/main/notebooks/colab_demo.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" height=20></a>
</div>
*dot* (aka Deepfake Offensive Toolkit) makes real-time, controllable deepfakes ready for virtual camera injection. *dot* was created for performing penetration tests against e.g. identity verification and video conferencing systems, for use by security analysts, Red Team members, and biometrics researchers.
If you want to learn more about how *dot* is used for penetration tests with deepfakes in the industry, read these articles by [The Verge](https://www.theverge.com/2022/5/18/23092964/deepfake-attack-facial-recognition-liveness-test-banks-sensity-report) and [Biometric Update](https://www.biometricupdate.com/202205/sensity-alleges-biometric-onboarding-providers-downplaying-deepfake-threat).
*dot is developed for research and demonstration purposes. As an end user, you have the responsibility to obey all applicable laws when using this program. Authors and contributing developers assume no liability and are not responsible for any misuse or damage caused by the use of this program.*
<p align="center">
<img src="./assets/dot_intro.gif" width="500"/>
</p>
## How it works
In a nutshell, *dot* works like this:
```mermaid
flowchart LR;
A(your webcam feed) --> B(suite of realtime deepfakes);
B(suite of realtime deepfakes) --> C(virtual camera injection);
```
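The three stages above can be sketched in plain Python (an illustrative sketch only — `read_frame`, `apply_deepfake`, and `inject_frame` are hypothetical stand-ins, not dot's actual API):

```python
# Illustrative sketch of dot's three-stage pipeline (webcam feed ->
# real-time deepfake -> virtual camera injection). All function names
# here are hypothetical stand-ins, NOT dot's actual API.

def read_frame(feed):
    """Stand-in for grabbing the next frame from the webcam feed."""
    return next(feed)

def apply_deepfake(frame, source_face):
    """Stand-in for a swap method (SimSwap, FOMM, ...) that blends
    the source face identity into the live frame."""
    return {"frame": frame, "identity": source_face}

def inject_frame(virtual_cam, frame):
    """Stand-in for writing the swapped frame to a virtual camera."""
    virtual_cam.append(frame)

webcam = iter(["frame-0", "frame-1", "frame-2"])
virtual_cam = []
for _ in range(3):
    swapped = apply_deepfake(read_frame(webcam), source_face="source.jpg")
    inject_frame(virtual_cam, swapped)
```

The key property the sketch illustrates is that the swap runs per frame on a live feed, so no per-identity training step sits between capture and injection.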
None of the deepfakes supported by *dot* require additional training. They can be used
in real time, on the fly, on a photo that becomes the target of face impersonation.
Supported methods:
- face swap (via [SimSwap](https://github.com/neuralchen/SimSwap)), at resolutions `224` and `512`
- with the option of face superresolution (via [GPen](https://github.com/yangxy/GPEN)) at resolutions `256` and `512`
- lower quality face swap (via OpenCV)
- [FOMM](https://github.com/AliaksandrSiarohin/first-order-model), First Order Motion Model for image animation
## Running dot
### Graphical interface
#### GUI Installation
Download and run the dot executable for your OS:
- Windows (Tested on Windows 10 and 11):
- Download `dot.zip` from [here](https://drive.google.com/file/d/1_duaEs2SAUGfAvr5oC4V3XR-ZzBtWQXo/view), unzip it and then run `dot.exe`
- Ubuntu:
- ToDo
- Mac (Tested on Apple M2 Sonoma 14.0):
- Download `dot-m2.zip` from [here](https://drive.google.com/file/d/1KTRzQrl_AVpiFIxUxW_k2F5EsosJJ_1Y/view?usp=sharing) and unzip it
- Open terminal and run `xattr -cr dot-executable.app` to remove any extended attributes
- In case of camera reading error:
- Right click and choose `Show Package Contents`
- Execute `dot-executable` from `Contents/MacOS` folder
#### GUI Usage
Usage example:
1. Specify the source image in the field `source`.
2. Specify the camera id number in the field `target`. In most cases, `0` is the correct camera id.
3. Specify the config file in the field `config_file`. Select a default configuration from the dropdown list or use a custom file.
4. (Optional) Check the field `use_gpu` to use the GPU.
5. Click on the `RUN` button to start the deepfake.
For more information about each field, click on the menu `Help/Usage`.
Watch the following demo video for a better understanding of the interface:
<p align="center">
<img src="./assets/gui_dot_demo.gif" width="500" height="406"/>
</p>
### Command Line
#### CLI Installation
##### Install Pre-requisites
- Linux
```bash
sudo apt install ffmpeg cmake
```
- MacOS
```bash
brew install ffmpeg cmake
```
- Windows
1. Download and install Visual Studio Community from [here](https://visualstudio.microsoft.com/vs/community/)
2. Install Desktop development with C++ from the Visual studio installer
##### Create Conda Environment
> The instructions assume that you have Miniconda installed on your machine. If you don't, you can refer to this [link](https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html) for installation instructions.
###### With GPU Support
```bash
conda env create -f envs/environment-gpu.yaml
conda activate dot
```
Install the `torch` and `torchvision` dependencies based on the CUDA version installed on your machine:
- Install CUDA 11.8 from [link](https://developer.nvidia.com/cuda-11-8-0-download-archive)
- Install `cudatoolkit` from `conda`: `conda install cudatoolkit=<cuda_version_no>` (replace `<cuda_version_no>` with the version on your machine)
- Install `torch` and `torchvision` dependencies: `pip install torch==2.0.1+<cuda_tag> torchvision==0.15.2+<cuda_tag> torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118`, where `<cuda_tag>` is the CUDA tag defined by Pytorch. For example, `pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118` for CUDA 11.8.
Note: `torch==1.9.0+cu111` can also be used.
To check that `torch` and `torchvision` are installed correctly, run the following command: `python -c "import torch; print(torch.cuda.is_available())"`. If the output is `True`, the dependencies are installed with CUDA support.
###### With MPS Support (Apple Silicon)
```bash
conda env create -f envs/environment-apple-m2.yaml
conda activate dot
```
To check that `torch` and `torchvision` are installed correctly, run the following command: `python -c "import torch; print(torch.backends.mps.is_available())"`. If the output is `True`, the dependencies are installed with Metal programming framework support.
###### With CPU Support (slow, not recommended)
```bash
conda env create -f envs/environment-cpu.yaml
conda activate dot
```
##### Install dot
```bash
pip install -e .
```
##### Download Models
- Download dot model checkpoints from [here](https://drive.google.com/file/d/1Y_11R66DL4N1WY8cNlXVNR3RkHnGDGWX/view)
- Unzip the downloaded file in the root of this project
#### CLI Usage
Run `dot --help` to get a full list of available options.
1. Simswap
```bash
dot -c ./configs/simswap.yaml --target 0 --source "./data" --use_gpu
```
2. SimSwapHQ
```bash
dot -c ./configs/simswaphq.yaml --target 0 --source "./data" --use_gpu
```
3. FOMM
```bash
dot -c ./configs/fomm.yaml --target 0 --source "./data" --use_gpu
```
4. FaceSwap CV2
```bash
dot -c ./configs/faceswap_cv2.yaml --target 0 --source "./data" --use_gpu
```
**Note**: To enable face superresolution, use the flag `--gpen_type gpen_256` or `--gpen_type gpen_512`. To use *dot* on CPU (not recommended), do not pass the `--use_gpu` flag.
#### Controlling dot with CLI
> **Disclaimer**: We use the `SimSwap` technique for the following demonstration
Running *dot* via any of the above methods generates a real-time deepfake on the input video feed, using source images from the `data/` folder.
<p align="center">
<img src="./assets/dot_run.gif" width="500"/>
</p>
When running *dot*, a list of available control options appears in the terminal window, as shown above. You can toggle through and select different source images by pressing the associated control key.
Watch the following demo video for a better understanding of the control options:
<p align="center">
<img src="./assets/dot_demo.gif" width="480"/>
</p>
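The control-key mechanism can be sketched as a simple key-to-image mapping (an illustrative sketch only — the bindings and file names below are hypothetical, not dot's actual ones):

```python
# Illustrative sketch of the control-key mechanism: each source image
# found in the data/ folder is bound to a key, and pressing that key
# switches the face being impersonated. Bindings and file names are
# hypothetical, not dot's actual ones.

def build_bindings(source_images):
    """Map keys "1", "2", ... to the available source images."""
    return {str(i + 1): img for i, img in enumerate(source_images)}

def handle_key(key, bindings, current):
    """Switch to the bound image if the key is known, else keep current."""
    return bindings.get(key, current)

bindings = build_bindings(["face_a.jpg", "face_b.jpg", "face_c.jpg"])
current = "face_a.jpg"
current = handle_key("3", bindings, current)  # press "3": switch to face_c.jpg
current = handle_key("x", bindings, current)  # unknown key: no change
```

Unknown keys leave the current source untouched, so stray keystrokes during a live session do not interrupt the swap.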
## Docker
### Setting up docker
- Build the container
```
docker-compose up --build -d
```
- Access the container
```
docker-compose exec dot "/bin/bash"
```
### Connect docker to the webcam
#### Ubuntu
1. Build the container
```
docker build -t dot -f Dockerfile .
```
2. Run the container
```
xhost +
docker run -ti --gpus all \
-e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
-e NVIDIA_VISIBLE_DEVICES=all \
-e PYTHONUNBUFFERED=1 \
-e DISPLAY \
-v .:/dot \
-v /tmp/.X11-unix:/tmp/.X11-unix:rw \
--runtime nvidia \
--entrypoint /bin/bash \
-p 8080:8080 \
--device=/dev/video0:/dev/video0 \
dot
```
#### Windows
1. Follow the instructions [here](https://medium.com/@jijupax/connect-the-webcam-to-docker-on-mac-or-windows-51d894c44468) under Windows to set up the webcam with docker.
2. Build the container
```
docker build -t dot -f Dockerfile .
```
3. Run the container
```
docker run -ti --gpus all \
-e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
-e NVIDIA_VISIBLE_DEVICES=all \
-e PYTHONUNBUFFERED=1 \
-e DISPLAY=192.168.99.1:0 \
-v "$(pwd)":/dot \
--runtime nvidia \
--entrypoint /bin/bash \
-p 8080:8080 \
--device=/dev/video0:/dev/video0 \
-v /tmp/.X11-unix:/tmp/.X11-unix \
dot
```
#### macOS
1. Follow the instructions [here](https://github.com/gzupark/boot2docker-webcam-mac/blob/master/README.md) to set up the webcam with docker.
2. Build the container
```
docker build -t dot -f Dockerfile .
```
3. Run the container
```
docker run -ti --gpus all \
-e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
-e NVIDIA_VISIBLE_DEVICES=all \
-e PYTHONUNBUFFERED=1 \
-e DISPLAY=$IP:0 \
-v "$(pwd)":/dot \
-v /tmp/.X11-unix:/tmp/.X11-unix \
--runtime nvidia \
--entrypoint /bin/bash \
-p 8080:8080 \
--device=/dev/video0:/dev/video0 \
dot
```
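The `$IP` in `-e DISPLAY=$IP:0` should be the host address that the X server (e.g. XQuartz) listens on. One way to look it up from Python (a sketch; the UDP "connect" trick picks the outbound interface without sending packets, and the fallback address is our own choice):

```python
import socket

def host_ip(fallback="127.0.0.1"):
    """Best-effort LAN IP of this machine; no packets are sent."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.connect(("8.8.8.8", 80))  # picks the default outbound interface
        return s.getsockname()[0]
    except OSError:
        return fallback  # offline: fall back to loopback
    finally:
        s.close()

print(f"DISPLAY={host_ip()}:0")
```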
## Virtual Camera Injection
Instructions vary depending on your operating system.
### Windows
- Install [OBS Studio](https://obsproject.com/).
- Run OBS Studio.
- In the Sources section, press the Add button ("+" sign),
select Windows Capture and press OK. In the window that appears,
choose "[python.exe]: fomm" in the Window drop-down menu and press OK.
Then select Edit -> Transform -> Fit to screen.
- In OBS Studio, go to Tools -> VirtualCam. Check AutoStart,
set Buffered Frames to 0 and press Start.
- Now `OBS-Camera` camera should be available in Zoom
(or other videoconferencing software).
### Ubuntu
```bash
sudo apt update
sudo apt install v4l-utils v4l2loopback-dkms v4l2loopback-utils
sudo modprobe v4l2loopback devices=1 card_label="OBS Cam" exclusive_caps=1
v4l2-ctl --list-devices
sudo add-apt-repository ppa:obsproject/obs-studio
sudo apt install obs-studio
```
Open `OBS Studio` and check if `tools --> v4l2sink` exists.
If it doesn't, follow these instructions:
```bash
mkdir -p ~/.config/obs-studio/plugins/v4l2sink/bin/64bit/
ln -s /usr/lib/obs-plugins/v4l2sink.so ~/.config/obs-studio/plugins/v4l2sink/bin/64bit/
```
Use the virtual camera with `OBS Studio`:
- Open `OBS Studio`
- Go to `tools --> v4l2sink`
- Select `/dev/video2` and `YUV420`
- Click on `start`
- Join a meeting and select `OBS Cam`
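After `modprobe v4l2loopback`, the loopback node shows up as an additional `/dev/video*` entry (which is what `v4l2-ctl --list-devices` reports). A quick sketch to list the device nodes from Python; the numbering varies per machine:

```python
import glob

def video_devices():
    """Sorted list of V4L2 device nodes; after modprobe, the
    v4l2loopback device is typically the newest /dev/video* entry."""
    return sorted(glob.glob("/dev/video*"))

print(video_devices())
```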
### MacOS
- Download and install OBS Studio for MacOS from [here](https://obsproject.com/)
- Open OBS and follow the first-time setup (you might be required to enable certain permissions in *System Preferences*)
- Run *dot* with `--use_cam` flag to enable camera feed
- Click the "+" button in the Sources section → select "Window Capture", create a new source and click "OK" → select the window with "python" in its name and click "OK"
- Click "Start Virtual Camera" button in the controls section
- Select "OBS Cam" as the default camera in the video settings of the application targeted by the injection
## Run dot with an Android emulator
If you are testing against a mobile app, virtual cameras are much harder to inject. An alternative is to run the app in an Android emulator and use virtual camera injection as usual.
- Run `dot`. Check [running dot](https://github.com/sensity-ai/dot#running-dot) for more information.
- Run `OBS Studio` and set up the virtual camera. Check [virtual-camera-injection](https://github.com/sensity-ai/dot#virtual-camera-injection) for more information.
- Download and Install [Genymotion](https://www.genymotion.com/download/).
- Open Genymotion and set up the Android emulator.
- Set up dot with the Android emulator:
- Open the Android emulator.
- Click on `camera` and select `OBS-Camera` as the front and back camera. A preview of the dot window should appear.
If there is no preview, restart `OBS` and the emulator and try again.
If that still doesn't work, use different virtual camera software such as `e2eSoft VCam` or `ManyCam`.
- The `dot` deepfake output should now be the emulator's phone camera.
## Speed
### With GPU
Tested on an AMD Ryzen 5 2600 Six-Core Processor with an NVIDIA GeForce RTX 2070:
```example
Simswap: FPS 13.0
Simswap + gpen 256: FPS 7.0
SimswapHQ: FPS 11.0
FOMM: FPS 31.0
```
### With Apple Silicon
Tested on a MacBook Air (M2, 2022) with 16GB of RAM:
```example
Simswap: FPS 3.2
Simswap + gpen 256: FPS 1.8
SimswapHQ: FPS 2.7
FOMM: FPS 2.0
```
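For reference, FPS figures like the ones above reduce to frames divided by wall-clock time. A minimal sketch with a stand-in workload (not the actual swap pipeline):

```python
import time

def measure_fps(process_frame, n_frames=100):
    """Average frames per second for `process_frame` over n_frames calls."""
    start = time.perf_counter()
    for _ in range(n_frames):
        process_frame()
    return n_frames / (time.perf_counter() - start)

# stand-in workload; in dot this would be the swap + render step
fps = measure_fps(lambda: sum(range(1000)), n_frames=200)
print(f"FPS {fps:.1f}")
```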
## License
*This is not a commercial Sensity product, and it is distributed freely with no warranties*
The software is distributed under [BSD 3-Clause](LICENSE).
*dot* utilizes several open source libraries. If you use *dot*, make sure you agree with their
licenses too. In particular, this codebase is built on top of the following research projects:
- <https://github.com/AliaksandrSiarohin/first-order-model>
- <https://github.com/alievk/avatarify-python>
- <https://github.com/neuralchen/SimSwap>
- <https://github.com/yangxy/GPEN>
## Contributing
If you have ideas for improving *dot*, feel free to open relevant Issues and PRs. Please read [CONTRIBUTING.md](./CONTRIBUTING.md) before contributing to the repository.
## Maintainers
- [@ghassen1302](https://github.com/ghassen1302)
- [@vassilispapadop](https://github.com/vassilispapadop)
- [@giorgiop](https://github.com/giorgiop)
- [@AjinkyaIndulkar](https://github.com/AjinkyaIndulkar)
- [@kjod](https://github.com/kjod)
## Contributors
<a href="https://github.com/sensity-ai/dot/graphs/contributors">
<img src="https://contrib.rocks/image?repo=sensity-ai/dot" />
</a>
## Run `dot` on pre-recorded image and video files
- [Run *dot* on image and video files instead of camera feed](docs/run_without_camera.md)
## FAQ
- **`dot` is very slow and I can't run it in real time**
Make sure that you are running it on a GPU by using the `--use_gpu` flag; CPU is not recommended.
If it is still too slow, you may be running it on an older GPU with less than 8GB of memory.
- **Does `dot` only work with a webcam feed or also with a pre-recorded video?**
You can run `dot` on a pre-recorded video file using [these scripts](docs/run_without_camera.md), or try it directly on [Colab](https://colab.research.google.com/github/sensity-ai/dot/blob/main/notebooks/colab_demo.ipynb).
================================================
FILE: configs/faceswap_cv2.yaml
================================================
---
swap_type: faceswap_cv2
model_path: saved_models/faceswap_cv/shape_predictor_68_face_landmarks.dat
================================================
FILE: configs/fomm.yaml
================================================
---
swap_type: fomm
model_path: saved_models/fomm/vox-adv-cpk.pth.tar
head_pose: true
================================================
FILE: configs/simswap.yaml
================================================
---
swap_type: simswap
parsing_model_path: saved_models/simswap/parsing_model/checkpoint/79999_iter.pth
arcface_model_path: saved_models/simswap/arcface_model/arcface_checkpoint.tar
checkpoints_dir: saved_models/simswap/checkpoints
================================================
FILE: configs/simswaphq.yaml
================================================
---
swap_type: simswap
parsing_model_path: saved_models/simswap/parsing_model/checkpoint/79999_iter.pth
arcface_model_path: saved_models/simswap/arcface_model/arcface_checkpoint.tar
checkpoints_dir: saved_models/simswap/checkpoints
crop_size: 512
================================================
FILE: docker-compose.yml
================================================
services:
dot:
build:
context: .
dockerfile: Dockerfile
# Set environment variables, if needed
environment:
- PYTHONUNBUFFERED=1
- NVIDIA_DRIVER_CAPABILITIES=compute,utility
- NVIDIA_VISIBLE_DEVICES=all
# Preserve files across container restarts
volumes:
- .:/dot
# Use NVIDIA runtime to enable GPU support in the container
runtime: nvidia
entrypoint: /bin/bash
ports:
- "8080:8080"
container_name: dot
stdin_open: true
tty: true
================================================
FILE: docs/create_executable.md
================================================
# Create executable
Create an executable of dot for different OS.
## Windows
Follow these steps to generate the executable for Windows.
1. Run these commands
```
cd path/to/dot
conda activate dot
```
2. Get the path of the `site-packages` by running this command
```
python -c "import site; print(''.join(site.getsitepackages()))"
```
3. Replace `path/to/site-packages` with the path of the `site-packages` and run this command
```
pyinstaller --noconfirm --onedir --name "dot" --add-data "src/dot/fomm/config;dot/fomm/config" --add-data "src/dot/simswap/models;dot/simswap/models" --add-data "path/to/site-packages;." --add-data "configs;configs/" --add-data "data;data/" --add-data "saved_models;saved_models/" src/dot/ui/ui.py
```
The executable files can be found under the folder `dist`.
## Ubuntu
ToDo
## Mac
Follow these steps to generate the executable for Mac.
1. Run these commands
```
cd path/to/dot
conda activate dot
```
2. Get the path of the `site-packages` by running this command
```
python -c "import site; print(''.join(site.getsitepackages()))"
```
3. Replace `path/to/site-packages` with the path of the `site-packages` and run this command
```
pyinstaller --noconfirm --onedir --name "dot" --add-data "src/dot/fomm/config:dot/fomm/config" --add-data "src/dot/simswap/models:dot/simswap/models" --add-data "path/to/site-packages:." --add-data "configs:configs/" --add-data "data:data/" --add-data "saved_models:saved_models/" src/dot/ui/ui.py
```
The executable files can be found under the folder `dist`.
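Note that the only difference between the Windows and Mac `pyinstaller` commands is the `--add-data` separator: `;` on Windows, `:` elsewhere, which matches Python's `os.pathsep`. A small sketch that builds the argument portably (the helper is illustrative, not part of dot):

```python
import os

def add_data(src, dest):
    """PyInstaller --add-data value with the platform separator:
    ';' on Windows, ':' on macOS/Linux (this is os.pathsep)."""
    return f"--add-data={src}{os.pathsep}{dest}"

print(add_data("configs", "configs/"))
```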
================================================
FILE: docs/profiling.md
================================================
# Profiling
Profiling should be carried out whenever significant changes are made to the pipeline. Profiling results are saved as `.txt` and `.prof` files.
## Scripts
### Profile SimSwap - `profile_simswap.py`
This script profiles SimSwap pipeline on a single image pair.
#### Basic Usage
```bash
python profile_simswap.py
```
## Visualisation Tools
Apart from analysing the `.txt` profiling data, we visualise and explore the `.prof` profiling data with:
* [snakeviz](#snakeviz): <https://jiffyclub.github.io/snakeviz/>
* [gprof2dot](#gprof2dot): <https://github.com/jrfonseca/gprof2dot>
* [flameprof](#flameprof): <https://github.com/baverman/flameprof>
### SnakeViz
#### Conda Installation
```bash
conda install -c conda-forge snakeviz
```
#### Basic Usage
```bash
snakeviz <path/to/profiling_data>.prof --server
```
### GProf2Dot
#### Conda Installation
```bash
conda install graphviz
conda install -c conda-forge gprof2dot
```
#### Basic Usage
```bash
python -m gprof2dot -f pstats <path/to/profiling_data>.prof | dot -Tpng -o <path/to/profiling_data>.png
```
### FlameProf
#### Pip Installation
```bash
pip install flameprof
```
#### Basic Usage
```bash
python -m flameprof <path/to/profiling_data>.prof > <path/to/profiling_data>.svg
```
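All three tools read the same `pstats`-format `.prof` file. If you want a sample file to try them on without running `profile_simswap.py`, the standard library can produce one; a sketch with a dummy workload:

```python
import cProfile
import os
import pstats
import tempfile

def workload():
    # dummy stand-in for the pipeline step being profiled
    return sum(i * i for i in range(100_000))

prof_path = os.path.join(tempfile.gettempdir(), "demo.prof")
pr = cProfile.Profile()
pr.enable()
workload()
pr.disable()
pr.dump_stats(prof_path)  # pstats-format file, readable by all three tools

pstats.Stats(prof_path).sort_stats("cumulative").print_stats(5)
```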
================================================
FILE: docs/run_without_camera.md
================================================
# Run dot on image and video files instead of camera feed
## Using Images
```bash
dot -c ./configs/simswap.yaml --target data/ --source "data/" --save_folder test_local/ --use_image --use_gpu
```
```bash
dot -c ./configs/faceswap_cv2.yaml --target data/ --source "data/" --save_folder test_local/ --use_image --use_gpu
```
## Using Videos
```bash
dot -c ./configs/simswap.yaml --target "/path/to/driving/video" --source "data/image.png" --save_folder test_local/ --use_gpu --use_video
```
```bash
dot -c ./configs/fomm.yaml --target "/path/to/driving/video" --source "data/image.png" --save_folder test_local/ --use_gpu --use_video
```
## Faceswap images from directory (Simswap)
You can pass a `--source` folder with images and some `--target` images. Faceswapped images will be generated at `--save_folder` including a metadata json file.
```bash
python scripts/image_swap.py --config <path_to_config/config.yaml> --source <path_to_source_images_folder> --target <path_to_target_images_folder> --save_folder <output_dir> --limit 100
```
## Faceswap images from metadata (SimSwap)
```bash
python scripts/metadata_swap.py --config <path_to_config/config.yaml> --local_root_path <path_to_root_directory> --metadata <path_to_metadata_file> --set <train_or_test_dataset> --save_folder <path_to_output_folder> --limit 100
```
## Faceswap on video files (SimSwap)
```bash
python scripts/video_swap.py -c <path_to_simswap_config/config.yaml> -s <path_to_source_images> -t <path_to_target_videos> -o <path_to_output_folder> -d 5 -l 5
```
`-d 5` optionally trims each video to 5 seconds.
`-l 5` optionally limits the total number of swaps to 5.
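Conceptually, these batch scripts pair every `--source` with every `--target` and stop once `--limit` swaps are reached. The pairing logic amounts to the following sketch (file names are illustrative):

```python
from itertools import islice, product

def swap_pairs(sources, targets, limit=None):
    """All (source, target) combinations, truncated to `limit` swaps."""
    pairs = product(sources, targets)
    return list(islice(pairs, limit))

pairs = swap_pairs(["a.png", "b.png"], ["t1.mp4", "t2.mp4"], limit=3)
print(pairs)  # [('a.png', 't1.mp4'), ('a.png', 't2.mp4'), ('b.png', 't1.mp4')]
```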
================================================
FILE: envs/environment-apple-m2.yaml
================================================
---
name: dot
channels:
- conda-forge
- defaults
dependencies:
- python=3.8
- pip=21.3
- pip:
- -r ../requirements-apple-m2.txt
================================================
FILE: envs/environment-cpu.yaml
================================================
---
name: dot
channels:
- conda-forge
- defaults
dependencies:
- python=3.8
- pip=21.3
- pip:
- -r ../requirements.txt
================================================
FILE: envs/environment-gpu.yaml
================================================
---
name: dot
channels:
- conda-forge
- defaults
dependencies:
- python=3.8
- pip=21.3
- pip:
- onnxruntime-gpu==1.18.0
- -r ../requirements.txt
================================================
FILE: notebooks/colab_demo.ipynb
================================================
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "view-in-github"
},
"source": [
"<a href=\"https://colab.research.google.com/github/sensity-ai/dot/blob/update-colab-notebook/notebooks/colab_demo.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "rOTJFaF9WIqg"
},
"source": [
"# Deepfake Offensive Toolkit\n",
"\n",
"> **Disclaimer**: This notebook is primarily used for demo purposes on Google Colab.\n",
"\n",
"**Note**: We recommend running this notebook on Google Colab with GPU enabled.\n",
"\n",
"To enable GPU, do the following:\n",
"\n",
"`Click \"Runtime\" tab > select \"Change runtime type\" option > set \"Hardware accelerator\" to \"GPU\"`\n",
"\n",
"### Install Notebook Pre-requisites:\n",
"\n",
"We install the following pre-requisites:\n",
"- `ffmpeg`\n",
"- `conda` (via [condacolab](https://github.com/conda-incubator/condacolab))\n",
"\n",
"Note: The notebook session will restart after installing the pre-requisites.\n",
"\n",
"**RUN THE BELOW CELL ONLY ONCE.**\n",
"\n",
"**ONCE THE NOTEBOOK SESSION RESTARTS, SKIP THIS CELL AND MOVE TO \"STEP 1\" SECTION OF THIS NOTEBOOK**"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "GnL7GZXGWIqo"
},
"outputs": [],
"source": [
"# install linux pre-requisites\n",
"!sudo apt install ffmpeg\n",
"\n",
"# install miniconda3\n",
"!pip install -q condacolab\n",
"import condacolab\n",
"condacolab.install_miniconda()\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "9oI_egyVWIqq"
},
"source": [
"## Step 1 - Clone Repository"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "LvZL-BD0WIqq"
},
"outputs": [],
"source": [
"import os\n",
"os.chdir('/content')\n",
"CODE_DIR = 'dot'\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "gTnnBM5xWIqr"
},
"outputs": [],
"source": [
"!git clone https://github.com/sensity-ai/dot.git $CODE_DIR\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Hgx6JdrrWIqr"
},
"outputs": [],
"source": [
"os.chdir(f'./{CODE_DIR}')\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Nb3q4HbSWIqs"
},
"source": [
"## Step 2 - Setup Conda Environment\n",
"\n",
"**ONCE THE INSTALLATION IS COMPLETE, RESTART THE NOTEBOOK AND MOVE TO \"STEP 2\" SECTION OF THIS NOTEBOOK**"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "VkLiUqtbWIqt"
},
"outputs": [],
"source": [
"# update base conda environment: install python=3.8 + cudatoolkit=11.8\n",
"!conda install python=3.8 cudatoolkit=11.8\n",
"\n",
"# install pip requirements\n",
"!pip install llvmlite==0.38.1 onnxruntime-gpu==1.9.0\n",
"!pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118\n",
"!pip install -r requirements.txt\n",
"\n",
"# install dot\n",
"!pip install -e .\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "cuCaEkOiWIqy"
},
"source": [
"## Step 2 - Download Pretrained models"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "RVQqmGmsWIqy"
},
"outputs": [],
"source": [
"%cd /content/dot\n",
"\n",
"# download binaries\n",
"!gdown 1Qaf9hE62XSvgmxR43dfiwEPWWS_dXSCE\n",
"\n",
"# unzip binaries\n",
"!unzip dot_model_checkpoints.zip\n",
"\n",
"# clean-up\n",
"!rm -rf *.z*\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "IEYtimAjWIqz"
},
"source": [
"## Step 3: Run dot on image and video files instead of camera feed\n",
"\n",
"### Using SimSwap on Images\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "cA0H6ynvWIq0"
},
"outputs": [],
"source": [
"!dot \\\n",
"-c ./configs/simswap.yaml \\\n",
"--target \"data/\" \\\n",
"--source \"data/\" \\\n",
"--save_folder \"image_simswap_output/\" \\\n",
"--use_image \\\n",
"--use_gpu\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "MKbRDeSAWIq0"
},
"source": [
"### Using SimSwap on Videos"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "rJqqmy2vD8uf"
},
"outputs": [],
"source": [
"!dot \\\n",
"-c ./configs/simswap.yaml \\\n",
"--source \"data/\" \\\n",
"--target \"data/\" \\\n",
"--save_folder \"video_simswap_output/\" \\\n",
"--limit 1 \\\n",
"--use_video \\\n",
"--use_gpu"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "oBJOJ2NWWIq1"
},
"outputs": [],
"source": [
"!python scripts/video_swap.py \\\n",
"-s \"data/\" \\\n",
"-t \"data/\" \\\n",
"-o \"video_simswap_output/\" \\\n",
"-d 5 \\\n",
"-l 1\n"
]
}
],
"metadata": {
"accelerator": "GPU",
"colab": {
"include_colab_link": true,
"provenance": []
},
"gpuClass": "standard",
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
================================================
FILE: pyproject.toml
================================================
[build-system]
requires = [
"setuptools>=42",
"wheel",
]
build-backend = "setuptools.build_meta"
[tool.pytest.ini_options]
filterwarnings = ["ignore:.*"]
================================================
FILE: requirements-apple-m2.txt
================================================
#
# This file is autogenerated by pip-compile with python 3.8
# To update, run:
#
# pip-compile setup.cfg
#
absl-py==1.1.0
# via mediapipe
attrs==21.4.0
# via mediapipe
certifi==2023.7.22
# via requests
chardet==4.0.0
# via requests
click==8.0.2
# via dot (setup.cfg)
cycler==0.11.0
# via matplotlib
dlib==19.19.0
# via dot (setup.cfg)
face-alignment==1.3.3
# via dot (setup.cfg)
flatbuffers==2.0
# via onnxruntime
fonttools==4.43.0
# via matplotlib
idna==2.10
# via requests
imageio==2.19.3
# via scikit-image
kiwisolver==1.4.3
# via matplotlib
kornia==0.6.5
# via dot (setup.cfg)
llvmlite==0.38.1
# via numba
matplotlib==3.5.2
# via mediapipe
mediapipe-silicon
# via dot (setup.cfg)
mediapipe==0.10.3
networkx==2.8.4
# via scikit-image
numba==0.55.2
# via face-alignment
numpy==1.22.0
# via
# dot (setup.cfg)
# face-alignment
# imageio
# matplotlib
# mediapipe
# numba
# onnxruntime
# opencv-contrib-python
# opencv-python
# pywavelets
# scikit-image
# scipy
# tifffile
# torchvision
onnxruntime==1.15.1
# via dot (setup.cfg)
opencv-contrib-python==4.5.5.62
# via
# dot (setup.cfg)
# mediapipe
opencv-python==4.5.5.62
# via
# dot (setup.cfg)
# face-alignment
packaging==21.3
# via
# kornia
# matplotlib
# scikit-image
pillow==10.0.1
# via
# dot (setup.cfg)
# imageio
# matplotlib
# scikit-image
# torchvision
protobuf==3.20.2
# via
# dot (setup.cfg)
# mediapipe
# onnxruntime
pyparsing==3.0.9
# via
# matplotlib
# packaging
python-dateutil==2.8.2
# via matplotlib
pywavelets==1.3.0
# via scikit-image
pyyaml==5.4.1
# via dot (setup.cfg)
requests==2.31.0
# via dot (setup.cfg)
scikit-image==0.19.1
# via
# dot (setup.cfg)
# face-alignment
scipy==1.10.1
# via
# dot (setup.cfg)
# face-alignment
# scikit-image
six==1.16.0
# via
# mediapipe
# python-dateutil
tifffile==2022.5.4
# via scikit-image
torch==2.0.1
# via
# dot (setup.cfg)
# face-alignment
# kornia
# torchvision
torchvision==0.15.2
# via dot (setup.cfg)
tqdm==4.64.0
# via face-alignment
typing-extensions==4.3.0
# via torch
urllib3==1.26.18
# via requests
wheel==0.38.1
# via mediapipe
# The following packages are considered to be unsafe in a requirements file:
# setuptools
================================================
FILE: requirements-dev.txt
================================================
#
# This file is autogenerated by pip-compile with Python 3.8
# by the following command:
#
# pip-compile --extra=dev --output-file=requirements-dev.txt --strip-extras setup.cfg
#
absl-py==1.1.0
# via mediapipe
altgraph==0.17.3
# via pyinstaller
asttokens==2.0.5
# via stack-data
atomicwrites==1.4.1
# via pytest
attrs==21.4.0
# via
# mediapipe
# pytest
backcall==0.2.0
# via ipython
black==22.3.0
# via dot (setup.cfg)
bump2version==1.0.1
# via bumpversion
bumpversion==0.6.0
# via dot (setup.cfg)
certifi==2023.7.22
# via requests
cffi==1.15.1
# via sounddevice
cfgv==3.3.1
# via pre-commit
charset-normalizer==3.2.0
# via requests
click==8.0.2
# via
# black
# dot (setup.cfg)
colorama==0.4.6
# via
# click
# ipython
# pytest
# tqdm
coloredlogs==15.0.1
# via onnxruntime-gpu
coverage==6.4.2
# via
# coverage
# pytest-cov
customtkinter==5.2.0
# via dot (setup.cfg)
cycler==0.11.0
# via matplotlib
darkdetect==0.8.0
# via customtkinter
decorator==5.1.1
# via
# ipdb
# ipython
distlib==0.3.4
# via virtualenv
dlib==19.19.0
# via dot (setup.cfg)
executing==0.8.3
# via stack-data
face-alignment==1.4.1
# via dot (setup.cfg)
filelock==3.7.1
# via
# torch
# virtualenv
flake8==3.9.2
# via dot (setup.cfg)
flatbuffers==2.0
# via
# mediapipe
# onnxruntime-gpu
fonttools==4.43.0
# via matplotlib
humanfriendly==10.0
# via coloredlogs
identify==2.5.1
# via pre-commit
idna==2.10
# via requests
imageio==2.19.3
# via scikit-image
iniconfig==1.1.1
# via pytest
ipdb==0.13.9
# via dot (setup.cfg)
ipython==8.10.0
# via
# dot (setup.cfg)
# ipdb
isort==5.12.0
# via dot (setup.cfg)
jedi==0.18.1
# via ipython
jinja2==3.1.3
# via torch
kiwisolver==1.4.3
# via matplotlib
kornia==0.6.5
# via dot (setup.cfg)
llvmlite==0.38.1
# via numba
markupsafe==2.1.3
# via jinja2
matplotlib==3.5.2
# via mediapipe
matplotlib-inline==0.1.3
# via ipython
mccabe==0.6.1
# via flake8
mediapipe==0.10.3
# via dot (setup.cfg)
mpmath==1.3.0
# via sympy
mypy-extensions==0.4.3
# via black
networkx==2.8.4
# via
# scikit-image
# torch
nodeenv==1.7.0
# via pre-commit
numba==0.55.2
# via face-alignment
numpy==1.22.0
# via
# dot (setup.cfg)
# face-alignment
# imageio
# matplotlib
# mediapipe
# numba
# onnxruntime-gpu
# opencv-contrib-python
# opencv-python
# pywavelets
# scikit-image
# scipy
# tifffile
# torchvision
onnxruntime-gpu==1.18.0
# via dot (setup.cfg)
opencv-contrib-python==4.5.5.62
# via
# dot (setup.cfg)
# mediapipe
opencv-python==4.5.5.62
# via
# dot (setup.cfg)
# face-alignment
packaging==21.3
# via
# kornia
# matplotlib
# onnxruntime-gpu
# pytest
# scikit-image
parso==0.8.3
# via jedi
pathspec==0.9.0
# via black
pefile==2023.2.7
# via pyinstaller
pickleshare==0.7.5
# via ipython
pillow==10.0.1
# via
# dot (setup.cfg)
# imageio
# matplotlib
# scikit-image
# torchvision
platformdirs==2.5.2
# via
# black
# virtualenv
pluggy==1.0.0
# via pytest
pre-commit==2.19.0
# via dot (setup.cfg)
prompt-toolkit==3.0.30
# via ipython
protobuf==3.20.2
# via
# dot (setup.cfg)
# mediapipe
# onnxruntime-gpu
pure-eval==0.2.2
# via stack-data
py==1.11.0
# via pytest
pycodestyle==2.7.0
# via flake8
pycparser==2.21
# via cffi
pyflakes==2.3.1
# via flake8
pygments==2.15.0
# via ipython
pyinstaller==5.13.1
# via dot (setup.cfg)
pyinstaller-hooks-contrib==2023.5
# via pyinstaller
pyparsing==3.0.9
# via
# matplotlib
# packaging
pyreadline3==3.4.1
# via humanfriendly
pytest==7.1.2
# via
# dot (setup.cfg)
# pytest-cov
pytest-cov==3.0.0
# via dot (setup.cfg)
python-dateutil==2.8.2
# via matplotlib
pywavelets==1.3.0
# via scikit-image
pywin32-ctypes==0.2.2
# via pyinstaller
pyyaml==5.4.1
# via
# dot (setup.cfg)
# pre-commit
requests==2.31.0
# via
# dot (setup.cfg)
# torchvision
scikit-image==0.19.1
# via
# dot (setup.cfg)
# face-alignment
scipy==1.10.0
# via
# dot (setup.cfg)
# face-alignment
# scikit-image
six==1.16.0
# via
# asttokens
# python-dateutil
# virtualenv
sounddevice==0.4.6
# via mediapipe
stack-data==0.3.0
# via ipython
sympy==1.12
# via
# onnxruntime-gpu
# torch
tifffile==2022.5.4
# via scikit-image
toml==0.10.2
# via
# ipdb
# pre-commit
tomli==2.0.1
# via
# black
# coverage
# pytest
torch==2.0.1
# via
# dot (setup.cfg)
# face-alignment
# kornia
# torchvision
torchvision==0.15.2
# via dot (setup.cfg)
tqdm==4.64.0
# via face-alignment
traitlets==5.3.0
# via
# ipython
# matplotlib-inline
types-pyyaml==6.0.10
# via dot (setup.cfg)
typing-extensions==4.3.0
# via
# black
# torch
urllib3==1.26.18
# via requests
virtualenv==20.15.1
# via pre-commit
wcwidth==0.2.5
# via prompt-toolkit
# The following packages are considered to be unsafe in a requirements file:
# setuptools
================================================
FILE: requirements.txt
================================================
#
# This file is autogenerated by pip-compile with Python 3.8
# by the following command:
#
# pip-compile setup.cfg
#
absl-py==1.1.0
# via mediapipe
attrs==21.4.0
# via mediapipe
certifi==2023.7.22
# via requests
cffi==1.15.1
# via sounddevice
charset-normalizer==3.2.0
# via requests
click==8.0.2
# via dot (setup.cfg)
colorama==0.4.6
# via
# click
# pytest
# tqdm
coloredlogs==15.0.1
# via onnxruntime-gpu
customtkinter==5.2.0
# via dot (setup.cfg)
cycler==0.11.0
# via matplotlib
darkdetect==0.8.0
# via customtkinter
dlib==19.19.0
# via dot (setup.cfg)
exceptiongroup==1.1.2
# via pytest
face-alignment==1.4.1
# via dot (setup.cfg)
filelock==3.12.2
# via torch
flatbuffers==2.0
# via
# mediapipe
# onnxruntime-gpu
fonttools==4.43.0
# via matplotlib
humanfriendly==10.0
# via coloredlogs
idna==2.10
# via requests
imageio==2.19.3
# via scikit-image
iniconfig==2.0.0
# via pytest
jinja2==3.1.3
# via torch
kiwisolver==1.4.3
# via matplotlib
kornia==0.6.5
# via dot (setup.cfg)
llvmlite==0.38.1
# via numba
markupsafe==2.1.3
# via jinja2
matplotlib==3.5.2
# via mediapipe
mediapipe==0.10.3
# via dot (setup.cfg)
mpmath==1.3.0
# via sympy
networkx==2.8.4
# via
# scikit-image
# torch
numba==0.55.2
# via face-alignment
numpy==1.22.0
# via
# dot (setup.cfg)
# face-alignment
# imageio
# matplotlib
# mediapipe
# numba
# onnxruntime-gpu
# opencv-contrib-python
# opencv-python
# pywavelets
# scikit-image
# scipy
# tifffile
# torchvision
onnxruntime-gpu==1.18.0
# via dot (setup.cfg)
opencv-contrib-python==4.5.5.62
# via
# dot (setup.cfg)
# mediapipe
opencv-python==4.5.5.62
# via
# dot (setup.cfg)
# face-alignment
packaging==21.3
# via
# kornia
# matplotlib
# onnxruntime-gpu
# pytest
# scikit-image
pillow==10.0.1
# via
# dot (setup.cfg)
# imageio
# matplotlib
# scikit-image
# torchvision
pluggy==1.2.0
# via pytest
protobuf==3.20.2
# via
# dot (setup.cfg)
# mediapipe
# onnxruntime-gpu
pycparser==2.21
# via cffi
pyparsing==3.0.9
# via
# matplotlib
# packaging
pyreadline3==3.4.1
# via humanfriendly
pytest==7.4.0
# via dot (setup.cfg)
python-dateutil==2.8.2
# via matplotlib
pywavelets==1.3.0
# via scikit-image
pyyaml==5.4.1
# via dot (setup.cfg)
requests==2.31.0
# via
# dot (setup.cfg)
# torchvision
scikit-image==0.19.1
# via
# dot (setup.cfg)
# face-alignment
scipy==1.10.0
# via
# dot (setup.cfg)
# face-alignment
# scikit-image
six==1.16.0
# via python-dateutil
sounddevice==0.4.6
# via mediapipe
sympy==1.12
# via
# onnxruntime-gpu
# torch
tifffile==2022.5.4
# via scikit-image
tomli==2.0.1
# via pytest
torch==2.0.1
# via
# dot (setup.cfg)
# face-alignment
# kornia
# torchvision
torchvision==0.15.2
# via dot (setup.cfg)
tqdm==4.64.0
# via face-alignment
typing-extensions==4.3.0
# via torch
urllib3==1.26.18
# via requests
# The following packages are considered to be unsafe in a requirements file:
# setuptools
================================================
FILE: scripts/image_swap.py
================================================
#!/usr/bin/env python3
"""
Copyright (c) 2022, Sensity B.V. All rights reserved.
licensed under the BSD 3-Clause "New" or "Revised" License.
"""
import glob
import json
import os
import click
import yaml
import dot
"""
Usage:
python image_swap.py
-c <path/to/config>
-s <path/to/source/images>
-t <path/to/target/images>
-o <path/to/output/folder>
-l 5 (optional: limit total swaps)
"""
@click.command()
@click.option("-c", "--config", default="./src/dot/simswap/configs/config.yaml")
@click.option("-s", "--source", required=True)
@click.option("-t", "--target", required=True)
@click.option("-o", "--save_folder", required=False)
@click.option("-l", "--limit", type=int, required=False)
def main(
    config: str, source: str, target: str, save_folder: str, limit: int = None
) -> None:
    """Performs face-swap given `source`/`target` image(s) and saves a JSON file of (un)successful swaps.
    Args:
        config (str): Path to DOT configuration yaml file.
        source (str): Path to a source image file or folder of source images.
        target (str): Path to a target image file or folder of target images.
        save_folder (str): Output folder to store face-swaps and metadata file.
        limit (int, optional): Number of desired face-swaps. If not specified,
            all possible combinations of source/target pairs are processed. Defaults to None.
    """
print(f"Loading config: {config}")
with open(config) as f:
config = yaml.safe_load(f)
_dot = dot.DOT(use_cam=False, use_video=False, save_folder=save_folder)
analysis_config = config["analysis"]["simswap"]
option = _dot.simswap(
use_gpu=analysis_config.get("use_gpu", False),
use_mask=analysis_config.get("opt_use_mask", False),
gpen_type=analysis_config.get("gpen", None),
gpen_path=analysis_config.get("gpen_path", None),
crop_size=analysis_config.get("opt_crop_size", 224),
)
swappedMD, rejectedMD = _dot.generate(
option, source=source, target=target, limit=limit, **analysis_config
)
# save metadata file
if swappedMD:
with open(os.path.join(save_folder, "metadata.json"), "a") as fp:
json.dump(swappedMD, fp, indent=4)
# save rejected face-swaps
if rejectedMD:
with open(os.path.join(save_folder, "rejected.json"), "a") as fp:
json.dump(rejectedMD, fp, indent=4)
def find_images_from_path(path):
    """Return a list of image paths under `path` (or `path` itself if it is a file).
    A purely numeric `path` is treated as a camera index and returned as an int."""
    if os.path.isfile(path):
        return [path]
    try:
        return int(path)
    except ValueError:
        # supported extensions
        files = []
        for ext in ("png", "jpg", "jpeg"):
            files.extend(glob.glob(path + "**/*." + ext, recursive=True))
        return files
if __name__ == "__main__":
main()
================================================
FILE: scripts/metadata_swap.py
================================================
#!/usr/bin/env python3
"""
Copyright (c) 2022, Sensity B.V. All rights reserved.
licensed under the BSD 3-Clause "New" or "Revised" License.
"""
import json
import os
import click
import numpy as np
import pandas as pd
import yaml
import dot
"""
Usage:
python metadata_swap.py \
--config <path_to_config/config.yaml> \
--local_root_path <path_to_root_directory> \
--metadata <path_to_metadata_file> \
--set <train_or_test_dataset> \
--save_folder <path_to_output_folder> \
--limit 100
"""
# common face identity features
face_identity_features = {
1: "ArchedEyebrows",
2: "Attractive",
3: "BagsUnderEyes",
6: "BigLips",
7: "BigNose",
12: "BushyEyebrows",
16: "Goatee",
18: "HeavyMakeup",
19: "HighCheekbones",
22: "Mustache",
23: "NarrowEyes",
24: "NoBeard",
27: "PointyNose",
}
@click.command()
@click.option("-c", "--config", default="./src/dot/simswap/configs/config.yaml")
@click.option("--local_root_path", required=True)
@click.option("--metadata", required=True)
@click.option("--set", required=True)
@click.option("-o", "--save_folder", required=False)
@click.option("--limit", type=int, required=False)
def main(
config: str,
local_root_path: str,
metadata: str,
set: str,
save_folder: str,
limit: int = None,
) -> None:
"""Script is tailored to dictionary format as shown below. `key` is the relative path to image,
`value` is a list of total 44 attributes.
[0:40] `Face attributes`: 50'ClockShadow, ArchedEyebrows, Attractive, BagsUnderEyes, Bald,Bangs,BigLips,
BigNose, BlackHair, BlondHair, Blurry, BrownHair, BushyEyebrows, Chubby, DoubleChin ,Eyeglasses,Goatee,
GrayHair, HeavyMakeup, HighCheekbones, Male, MouthSlightlyOpen, Mustache, NarrowEyes, NoBeard, OvalFace,
PaleSkin, PointyNose, RecedingHairline, RosyCheeks, Sideburns, Smiling, StraightHair, WavyHair, WearingEarrings,
WearingHat, WearingLipstick, WearingNecklace, WearingNecktie, Young.
[41] `Spoof type`: Live, Photo, Poster, A4, Face Mask, Upper Body Mask, Region Mask, PC, Pa, Phone, 3D Mask.
[42] `Illumination`: Live, Normal, Strong, Back, Dark.
[43] `Live/Spoof(binary)`: Live, Spoof.
{
"rel_path/img1.png": [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,2,2,1],
"rel_path/img2.png": [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,1,2,1]
....
}
It constructs a pd.DataFrame from `metadata` and filters rows where examples are under-aged(young==0).
Face-swaps are performed randomly based on gender. The `result-swap` image shares common attributes with
the `source` image which are defined in `face_identity_features` dict.
Spoof-type of swapped image is defined at index 40 of attributes list and set to 11.
Args:
config (str): Path to DOT configuration yaml file.
local_root_path (str): Root path of dataset.
metadata (str): JSON metadata file path of dataset.
set (str): Defines train/test dataset.
save_folder (str): Output folder to store face-swaps and metadata file.
limit (int, optional): Number of desired face-swaps. If not specified, will be set equal to DataFrame size.
Defaults to None.
"""
if limit and limit < 4:
print("Error: limit should be >= 4")
return
output_data_folder = os.path.join(save_folder, f"Data/{set}/swap/")
df = pd.read_json(metadata, orient="index")
mapping = {
df.columns[20]: "gender",
df.columns[26]: "pale_skin",
df.columns[39]: "young",
}
df = df.rename(columns=mapping)
df.head()
# keep only live images
df = df.loc[df.index.str.contains("live")]
# keep only adult images
df = df.loc[df["young"] == 0]
if not limit:
limit = df.shape[0]
print(f"Limit is set to: {limit}")
filters = ["gender==1", "gender==0"]
swaps = []
for filter in filters:
# sample n random rows matching the current gender filter
filtered = df.query(filter).sample(n=round(limit / len(filters)), replace=True)
# shuffle again, keep only indices and convert to list
filtered = filtered.sample(frac=1).index.tolist()
# append local_root_path
filtered = [os.path.join(local_root_path, p) for p in filtered]
# split into two lists roughly equal size
mid_index = round(len(filtered) / 2)
src = filtered[0:mid_index]
tar = filtered[mid_index:]
swaps.append((src, tar))
print(f"Loading config: {config}")
with open(config) as f:
config = yaml.safe_load(f)
analysis_config = config["analysis"]["simswap"]
_dot = dot.DOT(use_video=False, save_folder=output_data_folder)
_dot.use_cam = False
option = _dot.build_option(
swap_type="simswap",
use_gpu=analysis_config.get("use_gpu", False),
use_mask=analysis_config.get("opt_use_mask", False),
gpen_type=analysis_config.get("gpen", None),
gpen_path=analysis_config.get("gpen_path", None),
crop_size=analysis_config.get("opt_crop_size", 224),
)
total_succeed = {}
total_failed = {}
for swap in swaps:
source_list = swap[0]
target_list = swap[1]
# perform faceswap
for source, target in zip(source_list, target_list):
success, rejections = _dot.generate(
option,
source=source,
target=target,
duration=None,
**analysis_config,
)
total_succeed = {**total_succeed, **success}
total_failed = {**total_failed, **rejections}
# save successful face-swaps file
if total_succeed:
# append attribute list for source/target images
for key, value in total_succeed.items():
src_attr = (
df.loc[df.index == value["source"]["path"].replace(local_root_path, "")]
.iloc[0, 0:]
.tolist()
)
tar_attr = (
df.loc[df.index == value["target"]["path"].replace(local_root_path, "")]
.iloc[0, 0:]
.tolist()
)
total_succeed[key]["source"]["attr"] = src_attr
total_succeed[key]["target"]["attr"] = tar_attr
with open(os.path.join(save_folder, "swaps_succeed.json"), "w") as fp:
json.dump(total_succeed, fp)
# save failed face-swaps file
if total_failed:
with open(os.path.join(save_folder, "swaps_failed.json"), "w") as fp:
json.dump(total_failed, fp)
# format metadata to appropriate format
formatted = format_swaps(total_succeed)
# save file
if formatted:
with open(os.path.join(save_folder, f"{set}_label_swap.json"), "w") as fp:
json.dump(formatted, fp)
def format_swaps(succeeds):
formatted = {}
for key, value in succeeds.items():
# attributes of source image
src_attr = np.asarray(value["source"]["attr"])
# attributes of target image
tar_attr = np.asarray(value["target"]["attr"])
# attributes of swapped image, copied from the target image
swap_attr = tar_attr.copy()
# transfer facial attributes from source image
for idx in face_identity_features.keys():
swap_attr[idx] = src_attr[idx]
# swap-spoof-type-11, FaceSwap
swap_attr[40] = 11
# store in dict
formatted[key] = swap_attr.tolist()
return formatted
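As a worked example of the attribute-transfer rule in `format_swaps` above (toy vectors; the 44-element layout and the feature indices follow the docstring, and `.copy()` is used so the target vector stays untouched):

```python
import numpy as np

# subset of face_identity_features for illustration
face_identity_features = {1: "ArchedEyebrows", 7: "BigNose", 22: "Mustache"}

src_attr = np.zeros(44, dtype=int)
src_attr[[1, 7, 22]] = 1            # source has these identity features
tar_attr = np.zeros(44, dtype=int)  # target has none

swap_attr = tar_attr.copy()         # start from the target's attributes
for idx in face_identity_features:
    swap_attr[idx] = src_attr[idx]  # transfer identity features from source
swap_attr[40] = 11                  # mark as FaceSwap spoof type

assert swap_attr[[1, 7, 22]].tolist() == [1, 1, 1]
assert swap_attr[40] == 11
assert tar_attr[40] == 0            # original target vector is untouched
```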
if __name__ == "__main__":
main()
================================================
FILE: scripts/profile_simswap.py
================================================
#!/usr/bin/env python3
"""
Copyright (c) 2022, Sensity B.V. All rights reserved.
licensed under the BSD 3-Clause "New" or "Revised" License.
"""
import cProfile
import glob
import os
import pstats
import click
import yaml
import dot
# define globals
CONFIG = "./src/dot/simswap/configs/config.yaml"
SOURCE = "data/obama.jpg"
TARGET = "data/mona.jpg"
SAVE_FOLDER = "./profile_output/"
LIMIT = 1
@click.command()
@click.option("-c", "--config", default=CONFIG)
@click.option("--source", default=SOURCE)
@click.option("--target", default=TARGET)
@click.option("--save_folder", default=SAVE_FOLDER)
@click.option("--limit", type=int, default=LIMIT)
def main(
config=CONFIG, source=SOURCE, target=TARGET, save_folder=SAVE_FOLDER, limit=LIMIT
):
profiler = cProfile.Profile()
with open(config) as f:
config = yaml.safe_load(f)
analysis_config = config["analysis"]["simswap"]
_dot = dot.DOT(use_cam=False, use_video=False, save_folder=save_folder)
option = _dot.simswap(
use_gpu=config["analysis"]["simswap"]["use_gpu"],
gpen_type=config["analysis"]["simswap"]["gpen"],
gpen_path=config["analysis"]["simswap"]["gpen_path"],
use_mask=config["analysis"]["simswap"]["opt_use_mask"],
crop_size=config["analysis"]["simswap"]["opt_crop_size"],
)
option.create_model(**analysis_config)
profiler.enable()
swappedMD, rejectedMD = _dot.generate(
option,
source=source,
target=target,
limit=limit,
profiler=True,
**analysis_config
)
profiler.disable()
stats = pstats.Stats(profiler)
stats.dump_stats("SimSwap_profiler.prof")
def find_images_from_path(path):
if os.path.isfile(path):
return [path]
try:
return int(path)
except ValueError:
# supported extensions
ext = ["png", "jpg", "jpeg"]
files = []
for e in ext:
files.extend(glob.glob(path + "**/*." + e, recursive=True))
return files
if __name__ == "__main__":
main()
================================================
FILE: scripts/video_swap.py
================================================
#!/usr/bin/env python3
"""
Copyright (c) 2022, Sensity B.V. All rights reserved.
licensed under the BSD 3-Clause "New" or "Revised" License.
"""
import click
import yaml
import dot
"""
Usage:
python video_swap.py
-c <path/to/config>
-s <path/to/source/images>
-t <path/to/target/videos>
-o <path/to/output/folder>
-d 5 (optional: trim each target video to 5 seconds)
-l 5 (optional: limit total number of swaps)
"""
@click.command()
@click.option("-c", "--config", default="./src/dot/simswap/configs/config.yaml")
@click.option("-s", "--source_image_path", required=True)
@click.option("-t", "--target_video_path", required=True)
@click.option("-o", "--output", required=True)
@click.option("-d", "--duration_per_video", required=False)
@click.option("-l", "--limit", type=int, required=False)
def main(
config: str,
source_image_path: str,
target_video_path: str,
output: str,
duration_per_video: int,
limit: int = None,
):
"""Given `source` and `target` folders, performs face-swap on each video with randomly chosen
image found `source` path.
Supported image formats: `["jpg", "png", "jpeg"]`
Supported video formats: `["avi", "mp4", "mov", "MOV"]`
Args:
config (str): Path to configuration file.
source_image_path (str): Path to source images
target_video_path (str): Path to target videos
output (str): Output folder path.
duration_per_video (int): Trim duration of target video in seconds.
limit (int, optional): Limit number of video-swaps. Defaults to None.
"""
print(f"Loading config: {config}")
with open(config) as f:
config = yaml.safe_load(f)
_dot = dot.DOT(use_cam=False, use_video=True, save_folder=output)
analysis_config = config["analysis"]["simswap"]
option = _dot.simswap(
use_gpu=analysis_config.get("use_gpu", False),
use_mask=analysis_config.get("opt_use_mask", False),
gpen_type=analysis_config.get("gpen", None),
gpen_path=analysis_config.get("gpen_path", None),
crop_size=analysis_config.get("opt_crop_size", 224),
)
_dot.generate(
option=option,
source=source_image_path,
target=target_video_path,
duration=duration_per_video,
limit=limit,
**analysis_config,
)
if __name__ == "__main__":
main()
================================================
FILE: setup.cfg
================================================
[bumpversion]
current_version = 1.4.0
commit = True
tag = False
parse = (?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)?
serialize =
{major}.{minor}.{patch}
[bumpversion:file:src/dot/__init__.py]
search = __version__ = "{current_version}"
replace = __version__ = "{new_version}"
[metadata]
name = dot
version = attr: dot.__version__
author = attr: dot.__author__
description = attr: dot.__doc__
long_description = file: README.md
long_description_content_type = text/markdown
url = attr: dot.__url__
license = BSD 3-Clause License
classifiers =
Programming Language :: Python :: 3.8
[options]
package_dir =
= src
packages = find:
python_requires = >=3.8,<3.9
install_requires =
click
dlib
face_alignment==1.4.1
kornia
mediapipe
numpy
onnxruntime-gpu==1.18.0
opencv-contrib-python
opencv_python
Pillow
protobuf
PyYAML
requests
scikit_image
scipy
torch==2.0.1
torchvision==0.15.2
customtkinter
pytest
[options.extras_require]
dev =
black
bumpversion
flake8
ipdb
ipython
isort==5.12.0
pre-commit
pyinstaller
pytest
pytest-cov
types-PyYAML
[options.packages.find]
where = src
[options.entry_points]
console_scripts =
dot = dot.__main__:main
dot-ui = dot.ui.ui:main
================================================
FILE: src/dot/__init__.py
================================================
#!/usr/bin/env python3
"""
Copyright (c) 2022, Sensity B.V. All rights reserved.
licensed under the BSD 3-Clause "New" or "Revised" License.
"""
from .dot import DOT
__version__ = "1.4.0"
__author__ = "Sensity"
__url__ = "https://github.com/sensity-ai/dot/tree/main/dot"
__docs__ = "Deepfake offensive toolkit"
__all__ = ["DOT"]
================================================
FILE: src/dot/__main__.py
================================================
#!/usr/bin/env python3
"""
Copyright (c) 2022, Sensity B.V. All rights reserved.
licensed under the BSD 3-Clause "New" or "Revised" License.
"""
import traceback
from typing import Union
import click
import yaml
from .dot import DOT
def run(
swap_type: str,
source: str,
target: Union[int, str],
model_path: str = None,
parsing_model_path: str = None,
arcface_model_path: str = None,
checkpoints_dir: str = None,
gpen_type: str = None,
gpen_path: str = "saved_models/gpen",
crop_size: int = 224,
head_pose: bool = False,
save_folder: str = None,
show_fps: bool = False,
use_gpu: bool = False,
use_video: bool = False,
use_image: bool = False,
limit: int = None,
):
"""Builds a DOT object and runs it.
Args:
swap_type (str): The type of swap to run.
source (str): The source image or video.
target (Union[int, str]): The target image or video.
model_path (str, optional): The path to the model's weights. Defaults to None.
parsing_model_path (str, optional): The path to the parsing model. Defaults to None.
arcface_model_path (str, optional): The path to the arcface model. Defaults to None.
checkpoints_dir (str, optional): The path to the checkpoints directory. Defaults to None.
gpen_type (str, optional): The type of gpen model to use. Defaults to None.
gpen_path (str, optional): The path to the gpen models. Defaults to "saved_models/gpen".
crop_size (int, optional): The size to crop the images to. Defaults to 224.
head_pose (bool, optional): Pass flag to enable head-pose estimation. Defaults to False.
save_folder (str, optional): The path to the save folder. Defaults to None.
show_fps (bool, optional): Pass flag to show fps value. Defaults to False.
use_gpu (bool, optional): Pass flag to use GPU else use CPU. Defaults to False.
use_video (bool, optional): Pass flag to use video-swap pipeline. Defaults to False.
use_image (bool, optional): Pass flag to use image-swap pipeline. Defaults to False.
limit (int, optional): The number of frames to process. Defaults to None.
"""
try:
# initialize dot
_dot = DOT(use_video=use_video, use_image=use_image, save_folder=save_folder)
# build dot
option = _dot.build_option(
swap_type=swap_type,
use_gpu=use_gpu,
gpen_type=gpen_type,
gpen_path=gpen_path,
crop_size=crop_size,
)
# run dot
_dot.generate(
option=option,
source=source,
target=target,
show_fps=show_fps,
model_path=model_path,
limit=limit,
parsing_model_path=parsing_model_path,
arcface_model_path=arcface_model_path,
checkpoints_dir=checkpoints_dir,
opt_crop_size=crop_size,
head_pose=head_pose,
)
except: # noqa
print(traceback.format_exc())
@click.command()
@click.option(
"--swap_type",
"swap_type",
type=click.Choice(["fomm", "faceswap_cv2", "simswap"], case_sensitive=False),
)
@click.option(
"--source",
"source",
required=True,
help="Images to swap with target",
)
@click.option(
"--target",
"target",
required=True,
help="Cam ID or target media",
)
@click.option(
"--model_path",
"model_path",
default=None,
help="Path to 68-point facial landmark detector for FaceSwap-cv2 or to the model's weights for the FOM",
)
@click.option(
"--parsing_model_path",
"parsing_model_path",
default=None,
help="Path to the parsing model",
)
@click.option(
"--arcface_model_path",
"arcface_model_path",
default=None,
help="Path to arcface model",
)
@click.option(
"--checkpoints_dir",
"checkpoints_dir",
default=None,
help="models are saved here",
)
@click.option(
"--gpen_type",
"gpen_type",
default=None,
type=click.Choice(["gpen_256", "gpen_512"]),
)
@click.option(
"--gpen_path",
"gpen_path",
default="saved_models/gpen",
help="Path to gpen models.",
)
@click.option("--crop_size", "crop_size", type=int, default=224)
@click.option("--save_folder", "save_folder", type=str, default=None)
@click.option(
"--show_fps",
"show_fps",
type=bool,
default=False,
is_flag=True,
help="Pass flag to show fps value.",
)
@click.option(
"--use_gpu",
"use_gpu",
type=bool,
default=False,
is_flag=True,
help="Pass flag to use GPU else use CPU.",
)
@click.option(
"--use_video",
"use_video",
type=bool,
default=False,
is_flag=True,
help="Pass flag to use video-swap pipeline.",
)
@click.option(
"--use_image",
"use_image",
type=bool,
default=False,
is_flag=True,
help="Pass flag to use image-swap pipeline.",
)
@click.option("--limit", "limit", type=int, default=None)
@click.option(
"-c",
"--config",
"config_file",
help="Configuration file. Overrides duplicate options passed.",
required=False,
default=None,
)
def main(
swap_type: str,
source: str,
target: Union[int, str],
model_path: str = None,
parsing_model_path: str = None,
arcface_model_path: str = None,
checkpoints_dir: str = None,
gpen_type: str = None,
gpen_path: str = "saved_models/gpen",
crop_size: int = 224,
save_folder: str = None,
show_fps: bool = False,
use_gpu: bool = False,
use_video: bool = False,
use_image: bool = False,
limit: int = None,
config_file: str = None,
):
"""CLI entrypoint for dot."""
# load config, if provided
config = {}
if config_file is not None:
with open(config_file) as f:
config = yaml.safe_load(f)
# run dot
run(
swap_type=config.get("swap_type", swap_type),
source=source,
target=target,
model_path=config.get("model_path", model_path),
parsing_model_path=config.get("parsing_model_path", parsing_model_path),
arcface_model_path=config.get("arcface_model_path", arcface_model_path),
checkpoints_dir=config.get("checkpoints_dir", checkpoints_dir),
gpen_type=config.get("gpen_type", gpen_type),
gpen_path=config.get("gpen_path", gpen_path),
crop_size=config.get("crop_size", crop_size),
head_pose=config.get("head_pose", False),
save_folder=save_folder,
show_fps=show_fps,
use_gpu=use_gpu,
use_video=use_video,
use_image=use_image,
limit=limit,
)
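The config-override precedence applied in `main` reduces to `dict.get(key, cli_value)`; a minimal illustration (values are hypothetical):

```python
# Precedence sketch: a key present in the YAML config overrides the CLI value;
# when the config lacks the key, the CLI value is used.
config = {"swap_type": "simswap"}  # as loaded from --config
cli_swap_type = "fomm"             # as passed via --swap_type

assert config.get("swap_type", cli_swap_type) == "simswap"  # config wins
assert {}.get("swap_type", cli_swap_type) == "fomm"         # falls back to CLI
```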
if __name__ == "__main__":
main()
================================================
FILE: src/dot/commons/__init__.py
================================================
#!/usr/bin/env python3
"""
Copyright (c) 2022, Sensity B.V. All rights reserved.
licensed under the BSD 3-Clause "New" or "Revised" License.
"""
from .model_option import ModelOption
__all__ = ["ModelOption"]
================================================
FILE: src/dot/commons/cam/__init__.py
================================================
#!/usr/bin/env python3
================================================
FILE: src/dot/commons/cam/cam.py
================================================
#!/usr/bin/env python3
import glob
import os
import cv2
import numpy as np
import requests
import yaml
from ..utils import info, resize
from .camera_selector import query_cameras
def is_new_frame_better(log, source, driving, predictor):
global avatar_kp
global display_string
if avatar_kp is None:
display_string = "No face detected in avatar."
return False
if predictor.get_start_frame() is None:
display_string = "No frame to compare to."
return True
_ = resize(driving, (128, 128))[..., :3]
new_kp = predictor.get_frame_kp(driving)
if new_kp is not None:
new_norm = (np.abs(avatar_kp - new_kp) ** 2).sum()
old_norm = (np.abs(avatar_kp - predictor.get_start_frame_kp()) ** 2).sum()
out_string = "{0} : {1}".format(int(new_norm * 100), int(old_norm * 100))
display_string = out_string
log(out_string)
return new_norm < old_norm
else:
display_string = "No face found!"
return False
def load_stylegan_avatar(IMG_SIZE=256):
url = "https://thispersondoesnotexist.com/image"
r = requests.get(url, headers={"User-Agent": "My User Agent 1.0"}).content
image = np.frombuffer(r, np.uint8)
image = cv2.imdecode(image, cv2.IMREAD_COLOR)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
image = resize(image, (IMG_SIZE, IMG_SIZE))
return image
def load_images(log, opt_avatars, IMG_SIZE=256):
avatars = []
filenames = []
images_list = sorted(glob.glob(f"{opt_avatars}/*"))
for i, f in enumerate(images_list):
if f.endswith(".jpg") or f.endswith(".jpeg") or f.endswith(".png"):
img = cv2.imread(f)
if img is None:
log("Failed to open image: {}".format(f))
continue
if img.ndim == 2:
img = np.tile(img[..., None], [1, 1, 3])
img = img[..., :3][..., ::-1]
img = resize(img, (IMG_SIZE, IMG_SIZE))
avatars.append(img)
filenames.append(f)
return avatars, filenames
def draw_rect(img, rw=0.6, rh=0.8, color=(255, 0, 0), thickness=2):
h, w = img.shape[:2]
_l = w * (1 - rw) // 2
r = w - _l
u = h * (1 - rh) // 2
d = h - u
img = cv2.rectangle(img, (int(_l), int(u)), (int(r), int(d)), color, thickness)
def kp_to_pixels(arr):
"""Convert normalized landmark locations to screen pixels"""
return ((arr + 1) * 127).astype(np.int32)
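`kp_to_pixels` maps landmark coordinates from the normalized [-1, 1] range onto a 256-wide image; a quick check of the formula (reproduced here for illustration):

```python
import numpy as np


def kp_to_pixels(arr):
    # -1 -> 0, 0 -> 127, +1 -> 254 (same formula as above)
    return ((arr + 1) * 127).astype(np.int32)


kp = np.array([[-1.0, -1.0], [0.0, 0.0], [1.0, 1.0]])
assert kp_to_pixels(kp).tolist() == [[0, 0], [127, 127], [254, 254]]
```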
def draw_face_landmarks(LANDMARK_SLICE_ARRAY, img, face_kp, color=(20, 80, 255)):
if face_kp is not None:
img = cv2.polylines(
img, np.split(kp_to_pixels(face_kp), LANDMARK_SLICE_ARRAY), False, color
)
def print_help(avatar_names):
info("\n\n=== Control keys ===")
info("1-9: Change avatar")
for i, fname in enumerate(avatar_names):
key = i + 1
name = fname.split("/")[-1]
info(f"{key}: {name}")
info("W: Zoom camera in")
info("S: Zoom camera out")
info("A: Previous avatar in folder")
info("D: Next avatar in folder")
info("Q: Get random avatar")
info("X: Calibrate face pose")
info("I: Show FPS")
info("ESC: Quit")
info("\nFull key list: https://github.com/alievk/avatarify#controls")
info("\n\n")
def draw_fps(
frame,
fps,
timing,
x0=10,
y0=20,
ystep=30,
fontsz=0.5,
color=(255, 255, 255),
IMG_SIZE=256,
):
frame = frame.copy()
black = (0, 0, 0)
black_thick = 2
cv2.putText(
frame,
f"FPS: {fps:.1f}",
(x0, y0 + ystep * 0),
0,
fontsz * IMG_SIZE / 256,
(0, 0, 0),
black_thick,
)
cv2.putText(
frame,
f"FPS: {fps:.1f}",
(x0, y0 + ystep * 0),
0,
fontsz * IMG_SIZE / 256,
color,
1,
)
cv2.putText(
frame,
f"Model time (ms): {timing['predict']:.1f}",
(x0, y0 + ystep * 1),
0,
fontsz * IMG_SIZE / 256,
black,
black_thick,
)
cv2.putText(
frame,
f"Model time (ms): {timing['predict']:.1f}",
(x0, y0 + ystep * 1),
0,
fontsz * IMG_SIZE / 256,
color,
1,
)
cv2.putText(
frame,
f"Preproc time (ms): {timing['preproc']:.1f}",
(x0, y0 + ystep * 2),
0,
fontsz * IMG_SIZE / 256,
black,
black_thick,
)
cv2.putText(
frame,
f"Preproc time (ms): {timing['preproc']:.1f}",
(x0, y0 + ystep * 2),
0,
fontsz * IMG_SIZE / 256,
color,
1,
)
cv2.putText(
frame,
f"Postproc time (ms): {timing['postproc']:.1f}",
(x0, y0 + ystep * 3),
0,
fontsz * IMG_SIZE / 256,
black,
black_thick,
)
cv2.putText(
frame,
f"Postproc time (ms): {timing['postproc']:.1f}",
(x0, y0 + ystep * 3),
0,
fontsz * IMG_SIZE / 256,
color,
1,
)
return frame
def draw_landmark_text(frame, thk=2, fontsz=0.5, color=(0, 0, 255), IMG_SIZE=256):
frame = frame.copy()
cv2.putText(frame, "ALIGN FACES", (60, 20), 0, fontsz * IMG_SIZE / 255, color, thk)
cv2.putText(
frame, "THEN PRESS X", (60, 245), 0, fontsz * IMG_SIZE / 255, color, thk
)
return frame
def draw_calib_text(frame, thk=2, fontsz=0.5, color=(0, 0, 255), IMG_SIZE=256):
frame = frame.copy()
cv2.putText(
frame, "FIT FACE IN RECTANGLE", (40, 20), 0, fontsz * IMG_SIZE / 255, color, thk
)
cv2.putText(frame, "W - ZOOM IN", (60, 40), 0, fontsz * IMG_SIZE / 255, color, thk)
cv2.putText(frame, "S - ZOOM OUT", (60, 60), 0, fontsz * IMG_SIZE / 255, color, thk)
cv2.putText(
frame, "THEN PRESS X", (60, 245), 0, fontsz * IMG_SIZE / 255, color, thk
)
return frame
def select_camera(log, config):
cam_config = config["cam_config"]
cam_id = None
if os.path.isfile(cam_config):
with open(cam_config, "r") as f:
cam_config = yaml.load(f, Loader=yaml.FullLoader)
cam_id = cam_config["cam_id"]
else:
cam_frames = query_cameras(config["query_n_cams"])
if cam_frames:
if len(cam_frames) == 1:
cam_id = list(cam_frames)[0]
else:
from .camera_selector import select_camera as select_camera_ui  # avoid shadowing by this function
cam_id = select_camera_ui(cam_frames, window="CLICK ON YOUR CAMERA")
log(f"Selected camera {cam_id}")
with open(cam_config, "w") as f:
yaml.dump({"cam_id": cam_id}, f)
else:
log("No cameras are available")
return cam_id
================================================
FILE: src/dot/commons/cam/camera_selector.py
================================================
#!/usr/bin/env python3
import cv2
import numpy as np
import yaml
from ..utils import log
g_selected_cam = None
def query_cameras(n_cams):
cam_frames = {}
cap = None
for camid in range(n_cams):
log(f"Trying camera with id {camid}")
cap = cv2.VideoCapture(camid)
if not cap.isOpened():
log(f"Camera with id {camid} is not available")
continue
ret, frame = cap.read()
if not ret or frame is None:
log(f"Could not read from camera with id {camid}")
cap.release()
continue
for i in range(10):
ret, frame = cap.read()
cam_frames[camid] = frame.copy()
cap.release()
return cam_frames
def make_grid(images, cell_size=(320, 240), cols=2):
w0, h0 = cell_size
_rows = len(images) // cols + int(len(images) % cols > 0)
_cols = min(len(images), cols)
grid = np.zeros((h0 * _rows, w0 * _cols, 3), dtype=np.uint8)
for i, (camid, img) in enumerate(images.items()):
img = cv2.resize(img, (w0, h0))
# add rect
img = cv2.rectangle(img, (1, 1), (w0 - 1, h0 - 1), (0, 0, 255), 2)
# add id
img = cv2.putText(img, f"Camera {camid}", (10, 30), 0, 1, (0, 255, 0), 2)
c = i % cols
r = i // cols
grid[r * h0 : (r + 1) * h0, c * w0 : (c + 1) * w0] = img[..., :3]
return grid
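The grid geometry in `make_grid` is ceiling division over the column count; a small sketch of the same arithmetic (the helper name `grid_shape` is illustrative):

```python
def grid_shape(n_images, cols=2):
    # ceiling division for rows; columns capped at the image count
    rows = n_images // cols + int(n_images % cols > 0)
    return rows, min(n_images, cols)


assert grid_shape(1) == (1, 1)
assert grid_shape(4) == (2, 2)
assert grid_shape(5) == (3, 2)  # five frames -> 3x2 grid
```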
def mouse_callback(event, x, y, flags, userdata):
global g_selected_cam
if event == cv2.EVENT_LBUTTONDOWN:  # left-button click
cell_size, grid_cols, cam_frames = userdata
c = x // cell_size[0]
r = y // cell_size[1]
camid = r * grid_cols + c
if camid < len(cam_frames):
g_selected_cam = camid
def select_camera(cam_frames, window="Camera selector"):
cell_size = 320, 240
grid_cols = 2
grid = make_grid(cam_frames, cols=grid_cols)
# to fit the text if only one cam available
if grid.shape[1] == 320:
cell_size = 640, 480
grid = cv2.resize(grid, cell_size)
cv2.putText(
grid,
"Click on the web camera to use",
(10, grid.shape[0] - 30),
0,
0.7,
(200, 200, 200),
2,
)
cv2.namedWindow(window)
cv2.setMouseCallback(window, mouse_callback, (cell_size, grid_cols, cam_frames))
cv2.imshow(window, grid)
while True:
key = cv2.waitKey(10)
if g_selected_cam is not None:
break
if key == 27:
break
cv2.destroyAllWindows()
if g_selected_cam is not None:
return list(cam_frames)[g_selected_cam]
else:
return list(cam_frames)[0]
if __name__ == "__main__":
with open("config.yaml", "r") as f:
config = yaml.load(f, Loader=yaml.FullLoader)
cam_frames = query_cameras(config["query_n_cams"])
if cam_frames:
selected_cam = select_camera(cam_frames)
print(f"Selected camera {selected_cam}")
else:
log("No cameras are available")
================================================
FILE: src/dot/commons/camera_utils.py
================================================
#!/usr/bin/env python3
"""
Copyright (c) 2022, Sensity B.V. All rights reserved.
licensed under the BSD 3-Clause "New" or "Revised" License.
"""
from typing import Any, Callable, Dict, List, Union
import cv2
import numpy as np
from .cam.cam import draw_fps
from .utils import TicToc, find_images_from_path
from .video.videocaptureasync import VideoCaptureAsync
def fetch_camera(target: int) -> VideoCaptureAsync:
"""Fetches a VideoCaptureAsync object.
Args:
target (int): Camera ID descriptor.
Raises:
ValueError: If camera ID descriptor is not valid.
Returns:
VideoCaptureAsync: VideoCaptureAsync object.
"""
try:
return VideoCaptureAsync(target)
except RuntimeError:
raise ValueError(f"Camera {target} does not exist.")
def camera_pipeline(
cap: VideoCaptureAsync,
source: str,
target: int,
change_option: Callable[[np.ndarray], None],
process_image: Callable[[np.ndarray], np.ndarray],
post_process_image: Callable[[np.ndarray], np.ndarray],
crop_size: int = 224,
show_fps: bool = False,
**kwargs: Dict,
) -> None:
"""Open a webcam stream `target` and performs face-swap based on `source` image by frame.
Args:
cap (VideoCaptureAsync): VideoCaptureAsync object.
source (str): Path to source image folder.
target (int): Camera ID descriptor.
change_option (Callable[[np.ndarray], None]): Set `source` arg as faceswap source image.
process_image (Callable[[np.ndarray], np.ndarray]): Performs actual face swap.
post_process_image (Callable[[np.ndarray], np.ndarray]): Applies face restoration GPEN to result image.
crop_size (int, optional): Face crop size. Defaults to 224.
show_fps (bool, optional): Display FPS. Defaults to False.
"""
source = find_images_from_path(source)
print("=== Control keys ===")
print("1-9: Change avatar")
for i, fname in enumerate(source):
print(str(i + 1) + ": " + fname)
# Todo describe controls available
pic_a = source[0]
img_a_whole = cv2.imread(pic_a)
change_option(img_a_whole)
img_a_align_crop = process_image(img_a_whole)
img_a_align_crop = post_process_image(img_a_align_crop)
cap.start()
ret, frame = cap.read()
cv2.namedWindow("cam", cv2.WINDOW_GUI_NORMAL)
cv2.moveWindow("cam", 500, 250)
frame_index = -1
fps_hist: List = []
fps: Union[Any, float] = 0
show_self = False
while True:
frame_index += 1
ret, frame = cap.read()
frame = cv2.flip(frame, 1)
if ret:
tt = TicToc()
timing = {"preproc": 0, "predict": 0, "postproc": 0}
tt.tic()
key = cv2.waitKey(1)
if 48 < key < 58:
show_self = False
source_image_i = min(key - 49, len(source) - 1)
pic_a = source[source_image_i]
img_a_whole = cv2.imread(pic_a)
change_option(img_a_whole, **kwargs)
elif key == ord("y"):
show_self = True
elif key == ord("q"):
break
elif key == ord("i"):
show_fps = not show_fps
if not show_self:
result_frame = process_image(frame, crop_size=crop_size, **kwargs) # type: ignore
timing["postproc"] = tt.toc()
result_frame = post_process_image(result_frame, **kwargs)
if show_fps:
result_frame = draw_fps(np.array(result_frame), fps, timing)
fps_hist.append(tt.toc(total=True))
if len(fps_hist) == 10:
fps = 10 / (sum(fps_hist) / 1000)
fps_hist = []
cv2.imshow("cam", result_frame)
else:
cv2.imshow("cam", frame)
else:
break
cap.stop()
cv2.destroyAllWindows()
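The keyboard handling in `camera_pipeline` maps ASCII codes 49-57 (keys `1`-`9`) to a source-image index, clamped to the last available image; a standalone sketch (the helper name `avatar_index` is illustrative):

```python
def avatar_index(key, n_sources):
    # ASCII codes 49-57 correspond to keys '1'..'9'
    if 48 < key < 58:
        return min(key - 49, n_sources - 1)  # clamp to last source image
    return None


assert avatar_index(ord("1"), 5) == 0
assert avatar_index(ord("9"), 5) == 4   # clamped to last source
assert avatar_index(ord("a"), 5) is None
```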
================================================
FILE: src/dot/commons/model_option.py
================================================
#!/usr/bin/env python3
"""
Copyright (c) 2022, Sensity B.V. All rights reserved.
licensed under the BSD 3-Clause "New" or "Revised" License.
"""
import os
from abc import ABC, abstractmethod
from typing import Dict, List, Optional, Tuple, Union
import cv2
import torch
from ..gpen.face_enhancement import FaceEnhancement
from .camera_utils import camera_pipeline, fetch_camera
from .utils import find_images_from_path, generate_random_file_idx, rand_idx_tuple
from .video.video_utils import video_pipeline
class ModelOption(ABC):
def __init__(
self,
gpen_type=None,
gpen_path="saved_models/gpen",
use_gpu=True,
crop_size=256,
):
self.gpen_type = gpen_type
self.use_gpu = use_gpu
self.crop_size = crop_size
if gpen_type:
if gpen_type == "gpen_512":
model = {
"name": "GPEN-BFR-512",
"size": 512,
"channel_multiplier": 2,
"narrow": 1,
}
else:
model = {
"name": "GPEN-BFR-256",
"size": 256,
"channel_multiplier": 1,
"narrow": 0.5,
}
self.face_enhancer = FaceEnhancement(
size=model["size"],
model=model["name"],
channel_multiplier=model["channel_multiplier"],
narrow=model["narrow"],
use_gpu=self.use_gpu,
base_dir=gpen_path,
)
def generate_from_image(
self,
source: Union[str, List],
target: Union[str, List],
save_folder: str,
limit: Optional[int] = None,
swap_case_idx: Optional[Tuple] = (0, 0),
**kwargs,
) -> Optional[List[Dict]]:
"""_summary_
Args:
source (Union[str, List]): A list with source images filepaths, or single image filepath.
target (Union[str, List]): A list with target images filepaths, or single image filepath.
save_folder (str): Output path.
limit (Optional[int], optional): Total number of face-swaps. If None,
is set to `len(source)` * `len(target)`. Defaults to None.
swap_case_idx (Optional[Tuple], optional): Used as keyword among multiple swaps. Defaults to (0, 0).
Returns:
List[Dict]: Array of successful and rejected metadata dictionaries
"""
if not save_folder:
print("Need to define output folder... Skipping")
return None
# source/target can be single file
if not isinstance(source, list):
source = find_images_from_path(source)
target = find_images_from_path(target)
if not limit:
# allow all possible swaps
limit = len(source) * len(target)
swappedDict = {}
rejectedDict = {}
count = 0
rejected_count = 0
seen_swaps = []
source_len = len(source)
target_len = len(target)
with torch.no_grad():
profiler = kwargs.get("profiler", False)
if not profiler:
self.create_model(**kwargs)
while count < limit:
rand_swap = rand_idx_tuple(source_len, target_len)
while rand_swap in seen_swaps:
rand_swap = rand_idx_tuple(source_len, target_len)
src_idx = rand_swap[0]
tar_idx = rand_swap[1]
src_img = source[src_idx]
tar_img = target[tar_idx]
# check if files exist
if not os.path.exists(src_img) or not os.path.exists(tar_img):
print("source/target file does not exist", src_img, tar_img)
seen_swaps.append(rand_swap)  # skip this pair so it is not drawn again
continue
# read source image
source_image = cv2.imread(src_img)
frame = cv2.imread(tar_img)
try:
self.change_option(source_image)
frame = self.process_image(frame, use_cam=False, ignore_error=False)
# check if frame == target_image, if it does, image rejected
frame = self.post_process_image(frame)
# flush image to disk
file_idx = generate_random_file_idx(6)
file_name = os.path.join(save_folder, f"{file_idx:0>6}.jpg")
while os.path.exists(file_name):
print(f"Swap id: {file_idx} already exists, generating again.")
file_idx = generate_random_file_idx(6)
file_name = os.path.join(save_folder, f"{file_idx:0>6}.jpg")
cv2.imwrite(file_name, frame)
# keep track metadata
key = f"{swap_case_idx[1]}{file_idx:0>6}.jpg"
swappedDict[key] = {
"target": {"path": tar_img, "size": frame.shape},
"source": {"path": src_img, "size": source_image.shape},
}
print(
f"{count}: Performed face swap {src_img, tar_img} saved to {file_name}"
)
# keep track of previous swaps
seen_swaps.append(rand_swap)
count += 1
except Exception as e:
rejectedDict[rejected_count] = {
"target": {"path": tar_img, "size": frame.shape},
"source": {"path": src_img, "size": source_image.shape},
}
rejected_count += 1
print(f"Cannot perform face swap {src_img, tar_img}")
print(e)
return [swappedDict, rejectedDict]
def generate_from_camera(
self,
source: str,
target: int,
opt_crop_size: int = 224,
show_fps: bool = False,
**kwargs: Dict,
) -> None:
"""Invokes `camera_pipeline` main-loop.
Args:
source (str): Source image filepath.
target (int): Camera descriptor/ID.
opt_crop_size (int, optional): Crop size. Defaults to 224.
show_fps (bool, optional): Show FPS. Defaults to False.
"""
with torch.no_grad():
cap = fetch_camera(target)
self.create_model(opt_crop_size=opt_crop_size, **kwargs)
camera_pipeline(
cap,
source,
target,
self.change_option,
self.process_image,
self.post_process_image,
crop_size=opt_crop_size,
show_fps=show_fps,
)
def generate_from_video(
self,
source: str,
target: str,
save_folder: str,
duration: int,
limit: Optional[int] = None,
**kwargs: Dict,
) -> None:
"""Invokes `video_pipeline` main-loop.
Args:
source (str): Source image filepath.
target (str): Target video filepath.
save_folder (str): Output folder.
duration (int): Trim target video in seconds.
limit (int, optional): Limit number of video-swaps. Defaults to None.
"""
with torch.no_grad():
self.create_model(**kwargs)
video_pipeline(
source,
target,
save_folder,
duration,
self.change_option,
self.process_image,
self.post_process_image,
self.crop_size,
limit,
**kwargs,
)
def post_process_image(self, image, **kwargs):
if self.gpen_type:
image, orig_faces, enhanced_faces = self.face_enhancer.process(
img=image, use_gpu=self.use_gpu
)
return image
@abstractmethod
def change_option(self, image, **kwargs):
pass
@abstractmethod
def process_image(self, image, **kwargs):
pass
@abstractmethod
def create_model(self, source, target, limit=None, swap_case_idx=0, **kwargs):
pass
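`ModelOption` is a Template Method base class: `generate_from_image/camera/video` drive the pipelines while subclasses supply `create_model`, `change_option`, and `process_image`. A minimal sketch of that contract, using a stripped-down stand-in class and a hypothetical identity engine (no real model is loaded):

```python
from abc import ABC, abstractmethod

import numpy as np


class MiniModelOption(ABC):
    """Stripped-down stand-in for ModelOption, for illustration only."""

    @abstractmethod
    def create_model(self, **kwargs):
        ...

    @abstractmethod
    def change_option(self, image, **kwargs):
        ...

    @abstractmethod
    def process_image(self, image, **kwargs):
        ...

    def post_process_image(self, image, **kwargs):
        # The real base class optionally applies GPEN enhancement here;
        # this sketch just passes the frame through.
        return image


class IdentityOption(MiniModelOption):
    """Hypothetical engine that returns the target frame unchanged."""

    def create_model(self, **kwargs):
        self.model = None  # a real engine would load weights here

    def change_option(self, image, **kwargs):
        self.source_image = image  # remember the swap source

    def process_image(self, image, **kwargs):
        return image  # a real engine would perform the face swap here


opt = IdentityOption()
opt.create_model()
opt.change_option(np.zeros((8, 8, 3), dtype=np.uint8))
frame = opt.process_image(np.ones((8, 8, 3), dtype=np.uint8))
frame = opt.post_process_image(frame)
```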
================================================
FILE: src/dot/commons/pose/head_pose.py
================================================
#!/usr/bin/env python3
import cv2
import mediapipe as mp
import numpy as np
mp_face_mesh = mp.solutions.face_mesh
face_mesh = mp_face_mesh.FaceMesh(
min_detection_confidence=0.5, min_tracking_confidence=0.5
)
mp_drawing = mp.solutions.drawing_utils
# https://github.com/google/mediapipe/issues/1615
HEAD_POSE_LANDMARKS = [
33,
263,
1,
61,
291,
199,
]
def pose_estimation(
image: np.array, roll: int = 3, pitch: int = 3, yaw: int = 3
) -> int:
"""
Adjusted from: https://github.com/niconielsen32/ComputerVision/blob/master/headPoseEstimation.py
Given an image and desired `roll`, `pitch` and `yaw` angles, the method checks whether
estimated head-pose meets requirements.
Args:
image: Image to estimate head pose.
roll: Rotation margin in X axis.
pitch: Rotation margin in Y axis.
yaw: Rotation margin in Z axis.
Returns:
int: Success(0) or Fail(-1).
"""
results = face_mesh.process(image)
img_h, img_w, img_c = image.shape
face_3d = []
face_2d = []
if results.multi_face_landmarks:
for face_landmarks in results.multi_face_landmarks:
for idx, lm in enumerate(face_landmarks.landmark):
if idx in HEAD_POSE_LANDMARKS:
x, y = int(lm.x * img_w), int(lm.y * img_h)
# get 2d coordinates
face_2d.append([x, y])
# get 3d coordinates
face_3d.append([x, y, lm.z])
# convert to numpy
face_2d = np.array(face_2d, dtype=np.float64)
face_3d = np.array(face_3d, dtype=np.float64)
# camera matrix
focal_length = 1 * img_w
cam_matrix = np.array(
[[focal_length, 0, img_h / 2], [0, focal_length, img_w / 2], [0, 0, 1]]
)
# distortion
dist_matrix = np.zeros((4, 1), dtype=np.float64)
# solve pnp
success, rot_vec, trans_vec = cv2.solvePnP(
face_3d, face_2d, cam_matrix, dist_matrix
)
# rotational matrix
rmat, jac = cv2.Rodrigues(rot_vec)
# get angles
angles, mtxR, mtxQ, Qx, Qy, Qz = cv2.RQDecomp3x3(rmat)
# get rotation angles
x = angles[0] * 360
y = angles[1] * 360
z = angles[2] * 360
# head rotation in X axis
if x < -roll or x > roll:
return -1
# head rotation in Y axis
if y < -pitch or y > pitch:
return -1
# head rotation in Z axis
if z < -yaw or z > yaw:
return -1
return 0
return -1
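The final accept/reject step of `pose_estimation` is easy to check in isolation. A small sketch re-stating just the angle thresholding (the angle values below are hypothetical, in degrees):

```python
def within_pose_margins(x, y, z, roll=3, pitch=3, yaw=3):
    """Mirror of pose_estimation's accept/reject logic on Euler angles (degrees)."""
    if x < -roll or x > roll:
        return -1  # too much rotation around the X axis
    if y < -pitch or y > pitch:
        return -1  # too much rotation around the Y axis
    if z < -yaw or z > yaw:
        return -1  # too much rotation around the Z axis
    return 0


print(within_pose_margins(1.0, -2.0, 0.5))  # near-frontal head -> 0
print(within_pose_margins(10.0, 0.0, 0.0))  # strong rotation   -> -1
```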
================================================
FILE: src/dot/commons/utils.py
================================================
#!/usr/bin/env python3
import glob
import os
import random
import sys
import time
from collections import defaultdict
from typing import Dict, List, Tuple
import cv2
import numpy as np
SEED = 42
np.random.seed(SEED)
def log(*args, **kwargs):
time_str = f"{time.time():.6f}"
print(f"[{time_str}]", *args, **kwargs)
def info(*args, file=sys.stdout, **kwargs):
print(*args, file=file, **kwargs)
def find_images_from_path(path):
"""
@arguments:
path (str/int) : Could be either path(str)
or a CamID(int)
"""
if os.path.isfile(path):
return [path]
try:
return int(path)
except ValueError:
# supported extensions
ext = ["png", "jpg", "jpeg"]
files = []
for e in ext:
files.extend(glob.glob(path + "**/*." + e, recursive=True))
return files
def find_files_from_path(path: str, ext: List, filter: str = None):
"""
@arguments:
path (str) Parent directory of files
ext (list) List of desired file extensions
"""
if os.path.isdir(path):
files = []
for e in ext:
files.extend(glob.glob(path + "**/*." + e, recursive=True))  # type: ignore
np.random.shuffle(files)
# filter
if filter is not None:
files = [file for file in files if filter in file]
print("Filtered files: ", len(files))
return files
return [path]
def expand_bbox(
bbox, image_width, image_height, scale=None
) -> Tuple[int, int, int, int]:
if scale is None:
raise ValueError("scale parameter is none")
x1, y1, x2, y2 = bbox
center_x, center_y = (x1 + x2) // 2, (y1 + y2) // 2
size_bb = round(max(x2 - x1, y2 - y1) * scale)
# Check for out of bounds, x-y top left corner
x1 = max(int(center_x - size_bb // 2), 0)
y1 = max(int(center_y - size_bb // 2), 0)
# Check for too big bb size for given x, y
size_bb = min(image_width - x1, size_bb)
size_bb = min(image_height - y1, size_bb)
return (x1, y1, x1 + size_bb, y1 + size_bb)
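To see the clamping behaviour, `expand_bbox` can be exercised with a concrete box (the function is re-stated here so the example is self-contained):

```python
def expand_bbox(bbox, image_width, image_height, scale=None):
    if scale is None:
        raise ValueError("scale parameter is none")
    x1, y1, x2, y2 = bbox
    center_x, center_y = (x1 + x2) // 2, (y1 + y2) // 2
    size_bb = round(max(x2 - x1, y2 - y1) * scale)
    # clamp the top-left corner to the image
    x1 = max(int(center_x - size_bb // 2), 0)
    y1 = max(int(center_y - size_bb // 2), 0)
    # shrink the box if it would run past the image border
    size_bb = min(image_width - x1, size_bb)
    size_bb = min(image_height - y1, size_bb)
    return (x1, y1, x1 + size_bb, y1 + size_bb)


# A 40x40 box doubled inside a 100x100 image gets clamped at the origin.
print(expand_bbox((10, 10, 50, 50), 100, 100, scale=2.0))  # (0, 0, 80, 80)
```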
def rand_idx_tuple(source_len, target_len):
"""
pick a random tuple for source/target
"""
return (random.randrange(source_len), random.randrange(target_len))
def generate_random_file_idx(length):
return int("".join([str(random.randint(0, 9)) for _ in range(length)]))
class Tee(object):
def __init__(self, filename, mode="w", terminal=sys.stderr):
self.file = open(filename, mode, buffering=1)
self.terminal = terminal
def __del__(self):
self.file.close()
def write(self, *args, **kwargs):
log(*args, file=self.file, **kwargs)
log(*args, file=self.terminal, **kwargs)
def __call__(self, *args, **kwargs):
return self.write(*args, **kwargs)
def flush(self):
self.file.flush()
class Logger:
def __init__(self, filename, verbose=True):
self.tee = Tee(filename)
self.verbose = verbose
def __call__(self, *args, important=False, **kwargs):
if not self.verbose and not important:
return
self.tee(*args, **kwargs)
class Once:
_id: Dict = {}
def __init__(self, what, who=log, per=1e12):
"""Do who(what) once per seconds.
what: args for who
who: callable
per: frequency in seconds.
"""
assert callable(who)
now = time.time()
if what not in Once._id or now - Once._id[what] > per:
who(what)
Once._id[what] = now
class TicToc:
def __init__(self):
self.t = None
self.t_init = time.time()
def tic(self):
self.t = time.time()
def toc(self, total=False):
if total:
return (time.time() - self.t_init) * 1000
assert self.t, "You forgot to call tic()"
return (time.time() - self.t) * 1000
def tocp(self, str):
t = self.toc()
log(f"{str} took {t:.4f}ms")
return t
class AccumDict:
def __init__(self, num_f=3):
self.d = defaultdict(list)
self.num_f = num_f
def add(self, k, v):
self.d[k] += [v]
def __dict__(self):
return self.d
def __getitem__(self, key):
return self.d[key]
def __str__(self):
s = ""
for k in self.d:
if not self.d[k]:
continue
cur = self.d[k][-1]
avg = np.mean(self.d[k])
format_str = "{:.%df}" % self.num_f
cur_str = format_str.format(cur)
avg_str = format_str.format(avg)
s += f"{k} {cur_str} ({avg_str})\t\t"
return s
def __repr__(self):
return self.__str__()
def clamp(value, min_value, max_value):
return max(min(value, max_value), min_value)
def crop(img, p=0.7, offset_x=0, offset_y=0):
h, w = img.shape[:2]
x = int(min(w, h) * p)
_l = (w - x) // 2
r = w - _l
u = (h - x) // 2
d = h - u
offset_x = clamp(offset_x, -_l, w - r)
offset_y = clamp(offset_y, -u, h - d)
_l += offset_x
r += offset_x
u += offset_y
d += offset_y
return img[u:d, _l:r], (offset_x, offset_y)
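`crop` extracts a centred square whose side is `p` times the smaller image dimension, shifted by the clamped offsets. A quick shape check, with both helpers re-stated so the sketch runs on its own:

```python
import numpy as np


def clamp(value, min_value, max_value):
    return max(min(value, max_value), min_value)


def crop(img, p=0.7, offset_x=0, offset_y=0):
    h, w = img.shape[:2]
    x = int(min(w, h) * p)  # side length of the square crop
    _l = (w - x) // 2       # left edge
    r = w - _l              # right edge
    u = (h - x) // 2        # top edge
    d = h - u               # bottom edge
    offset_x = clamp(offset_x, -_l, w - r)
    offset_y = clamp(offset_y, -u, h - d)
    _l += offset_x
    r += offset_x
    u += offset_y
    d += offset_y
    return img[u:d, _l:r], (offset_x, offset_y)


frame = np.zeros((480, 640, 3), dtype=np.uint8)
patch, (ox, oy) = crop(frame, p=0.7)
print(patch.shape)  # (336, 336, 3): int(480 * 0.7) on each side
```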
def pad_img(img, target_size, default_pad=0):
sh, sw = img.shape[:2]
w, h = target_size
pad_w, pad_h = default_pad, default_pad
if w / h > 1:
pad_w += int(sw * (w / h) - sw) // 2
else:
pad_h += int(sh * (h / w) - sh) // 2
out = np.pad(img, [[pad_h, pad_h], [pad_w, pad_w], [0, 0]], "constant")
return out
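`pad_img` pads symmetrically along one axis so the image's aspect ratio matches the target's. A worked example (function re-stated) for a 100x50 image and a 2:1 `(w, h)` target:

```python
import numpy as np


def pad_img(img, target_size, default_pad=0):
    sh, sw = img.shape[:2]
    w, h = target_size
    pad_w, pad_h = default_pad, default_pad
    if w / h > 1:  # target wider than tall: pad width
        pad_w += int(sw * (w / h) - sw) // 2
    else:          # otherwise: pad height
        pad_h += int(sh * (h / w) - sh) // 2
    return np.pad(img, [[pad_h, pad_h], [pad_w, pad_w], [0, 0]], "constant")


img = np.ones((100, 50, 3), dtype=np.uint8)
out = pad_img(img, (200, 100))  # target ratio 2:1 -> width padded by 25 each side
print(out.shape)  # (100, 100, 3)
```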
def resize(img, size, version="cv"):
return cv2.resize(img, size)
def determine_path():
"""
Find the script path
"""
try:
root = __file__
if os.path.islink(root):
root = os.path.realpath(root)
return os.path.dirname(os.path.abspath(root))
except Exception as e:
print(e)
print("I'm sorry, but something is wrong.")
print("There is no __file__ variable. Please contact the author.")
sys.exit()
================================================
FILE: src/dot/commons/video/__init__.py
================================================
#!/usr/bin/env python3
================================================
FILE: src/dot/commons/video/video_utils.py
================================================
#!/usr/bin/env python3
"""
Copyright (c) 2022, Sensity B.V. All rights reserved.
licensed under the BSD 3-Clause "New" or "Revised" License.
"""
import os
import random
from typing import Callable, Dict, Union
import cv2
import mediapipe as mp
import numpy as np
from mediapipe.python.solutions.drawing_utils import _normalized_to_pixel_coordinates
from ..pose.head_pose import pose_estimation
from ..utils import expand_bbox, find_files_from_path
mp_face = mp.solutions.face_detection.FaceDetection(
model_selection=0, # model selection
min_detection_confidence=0.5, # confidence threshold
)
def _crop_and_pose(
image: np.array, estimate_pose: bool = False
) -> Union[np.array, int]:
"""Crops face of `image` and estimates head pose.
Args:
image (np.array): Image to be cropped and estimate pose.
estimate_pose (Boolean, optional): Enables pose estimation. Defaults to False.
Returns:
Union[np.array,int]: Cropped image or -1.
"""
image_rows, image_cols, _ = image.shape
results = mp_face.process(image)
if results.detections is None:
return -1
detection = results.detections[0]
location = detection.location_data
relative_bounding_box = location.relative_bounding_box
rect_start_point = _normalized_to_pixel_coordinates(
relative_bounding_box.xmin, relative_bounding_box.ymin, image_cols, image_rows
)
rect_end_point = _normalized_to_pixel_coordinates(
min(relative_bounding_box.xmin + relative_bounding_box.width, 1.0),
min(relative_bounding_box.ymin + relative_bounding_box.height, 1.0),
image_cols,
image_rows,
)
xleft, ytop = rect_start_point
xright, ybot = rect_end_point
xleft, ytop, xright, ybot = expand_bbox(
(xleft, ytop, xright, ybot), image_cols, image_rows, 2.0
)
try:
crop_image = image[ytop:ybot, xleft:xright]
if estimate_pose:
if pose_estimation(image=crop_image, roll=3, pitch=3, yaw=3) != 0:
return -1
return cv2.flip(crop_image, 1)
except Exception as e:
print(e)
return -1
def video_pipeline(
source: str,
target: str,
save_folder: str,
duration: int,
change_option: Callable[[np.ndarray], None],
process_image: Callable[[np.ndarray], np.ndarray],
post_process_image: Callable[[np.ndarray], np.ndarray],
crop_size: int = 224,
limit: int = None,
**kwargs: Dict,
) -> None:
"""Process input video file `target` by frame and performs face-swap based on first image
found in `source` path folder. Uses cv2.VideoWriter to flush the resulted video on disk.
Trimming video is done as: trimmed = fps * duration.
Args:
source (str): Path to source image folder.
target (str): Path to target video folder.
save_folder (str): Output folder path.
duration (int): Trim target video to this many seconds.
change_option (Callable[[np.ndarray], None]): Set `source` arg as faceswap source image.
process_image (Callable[[np.ndarray], np.ndarray]): Performs actual face swap.
post_process_image (Callable[[np.ndarray], np.ndarray]): Applies face restoration GPEN to result image.
head_pose (bool): Estimates head pose before swap (passed via **kwargs). Used by Avatarify.
crop_size (int, optional): Face crop size. Defaults to 224.
limit (int, optional): Limit number of video-swaps. Defaults to None.
"""
head_pose = kwargs.get("head_pose", False)
source_imgs = find_files_from_path(source, ["jpg", "png", "jpeg"], filter=None)
target_videos = find_files_from_path(target, ["avi", "mp4", "mov", "MOV"])
if not source_imgs or not target_videos:
print("Could not find any source/target files")
return
# unique combinations of source/target
swaps_combination = [(im, vi) for im in source_imgs for vi in target_videos]
# randomize list
random.shuffle(swaps_combination)
if limit:
swaps_combination = swaps_combination[:limit]
print("Total source images: ", len(source_imgs))
print("Total target videos: ", len(target_videos))
print("Total number of face-swaps: ", len(swaps_combination))
# iterate on each source-target pair
for (source, target) in swaps_combination:
img_a_whole = cv2.imread(source)
img_a_whole = _crop_and_pose(img_a_whole, estimate_pose=head_pose)
if isinstance(img_a_whole, int):
print(
f"Image {source} failed face detection or did not meet pose-estimation requirements."
)
continue
change_option(img_a_whole)
img_a_align_crop = process_image(img_a_whole)
img_a_align_crop = post_process_image(img_a_align_crop)
# video handle
cap = cv2.VideoCapture(target)
fps = int(cap.get(cv2.CAP_PROP_FPS))
frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
# trim original video length
if duration and (fps * int(duration)) < total_frames:
total_frames = fps * int(duration)
# result video is saved in `save_folder` with name combining source/target files.
source_base_name = os.path.basename(source)
target_base_name = os.path.basename(target)
output_file = f"{os.path.splitext(source_base_name)[0]}_{os.path.splitext(target_base_name)[0]}.mp4"
output_file = os.path.join(save_folder, output_file)
fourcc = cv2.VideoWriter_fourcc("X", "V", "I", "D")
video_writer = cv2.VideoWriter(
output_file, fourcc, fps, (frame_width, frame_height), True
)
print(
f"Source: {source} \nTarget: {target} \nOutput: {output_file} \nFPS: {fps} \nTotal frames: {total_frames}"
)
# process each frame individually
for _ in range(total_frames):
ret, frame = cap.read()
if ret:
frame = cv2.flip(frame, 1)
result_frame = process_image(frame, use_cam=False, crop_size=crop_size, **kwargs) # type: ignore
result_frame = post_process_image(result_frame, **kwargs)
video_writer.write(result_frame)
else:
break
cap.release()
video_writer.release()
================================================
FILE: src/dot/commons/video/videocaptureasync.py
================================================
#!/usr/bin/env python3
# https://github.com/gilbertfrancois/video-capture-async
import threading
import time
import cv2
WARMUP_TIMEOUT = 10.0
class VideoCaptureAsync:
def __init__(self, src=0, width=640, height=480):
self.src = src
self.cap = cv2.VideoCapture(self.src)
if not self.cap.isOpened():
raise RuntimeError("Cannot open camera")
self.cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
self.cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)
self.grabbed, self.frame = self.cap.read()
self.started = False
self.read_lock = threading.Lock()
def set(self, var1, var2):
self.cap.set(var1, var2)
def isOpened(self):
return self.cap.isOpened()
def start(self):
if self.started:
print("[!] Asynchronous video capturing has already been started.")
return None
self.started = True
self.thread = threading.Thread(target=self.update, args=(), daemon=True)
self.thread.start()
# (warmup) wait for the first successfully grabbed frame
warmup_start_time = time.time()
while not self.grabbed:
warmup_elapsed_time = time.time() - warmup_start_time
if warmup_elapsed_time > WARMUP_TIMEOUT:
raise RuntimeError(
f"Failed to succesfully grab frame from "
f"the camera (timeout={WARMUP_TIMEOUT}s). "
f"Try to restart."
)
time.sleep(0.5)
return self
def update(self):
while self.started:
grabbed, frame = self.cap.read()
if not grabbed or frame is None or frame.size == 0:
continue
with self.read_lock:
self.grabbed = grabbed
self.frame = frame
def read(self):
with self.read_lock:
frame = self.frame.copy()
grabbed = self.grabbed
return grabbed, frame
def stop(self):
self.started = False
self.thread.join()
def __exit__(self, exc_type, exc_value, traceback):
self.cap.release()
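The pattern in `VideoCaptureAsync` — a daemon thread keeps overwriting a lock-protected `self.frame`, and `read()` hands back a copy of the newest frame — generalises beyond cameras. A minimal sketch of the same latest-frame pattern, using a synthetic frame source instead of `cv2.VideoCapture` so it runs anywhere:

```python
import threading
import time

import numpy as np


class LatestFrameReader:
    """Keeps only the most recent frame produced by a background thread."""

    def __init__(self, make_frame):
        self.make_frame = make_frame  # callable returning an ndarray
        self.frame = make_frame()     # seed so read() never sees None
        self.read_lock = threading.Lock()
        self.started = False

    def start(self):
        self.started = True
        self.thread = threading.Thread(target=self._update, daemon=True)
        self.thread.start()
        return self

    def _update(self):
        while self.started:
            frame = self.make_frame()
            with self.read_lock:  # writer never races readers
                self.frame = frame
            time.sleep(0.001)

    def read(self):
        with self.read_lock:
            return self.frame.copy()  # copy so callers cannot mutate the buffer

    def stop(self):
        self.started = False
        self.thread.join()


counter = {"n": 0}


def synthetic_frame():
    counter["n"] += 1
    return np.full((4, 4), counter["n"], dtype=np.int32)


reader = LatestFrameReader(synthetic_frame).start()
time.sleep(0.05)
frame = reader.read()
reader.stop()
```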
================================================
FILE: src/dot/dot.py
================================================
#!/usr/bin/env python3
"""
Copyright (c) 2022, Sensity B.V. All rights reserved.
licensed under the BSD 3-Clause "New" or "Revised" License.
"""
from pathlib import Path
from typing import List, Optional, Union
from .commons import ModelOption
from .faceswap_cv2 import FaceswapCVOption
from .fomm import FOMMOption
from .simswap import SimswapOption
AVAILABLE_SWAP_TYPES = ["simswap", "fomm", "faceswap_cv2"]
class DOT:
"""Main DOT Interface.
Supported Engines:
- `simswap`
- `fomm`
- `faceswap_cv2`
Attributes:
use_cam (bool): Use camera descriptor and pipeline.
use_video (bool): Use video-swap pipeline.
use_image (bool): Use image-swap pipeline.
save_folder (str): Output folder to store face-swaps and metadata file when `use_cam` is False.
"""
def __init__(
self,
use_video: bool = False,
use_image: bool = False,
save_folder: Optional[str] = None,
*args,
**kwargs,
):
"""Constructor method.
Args:
use_video (bool, optional): if True, use video-swap pipeline. Defaults to False.
use_image (bool, optional): if True, use image-swap pipeline. Defaults to False.
save_folder (str, optional): Output folder to store face-swaps and metadata file when `use_cam` is False.
Defaults to None.
"""
# init
self.use_video = use_video
self.save_folder = save_folder
self.use_image = use_image
# additional attributes
self.use_cam = (not use_video) and (not use_image)
# create output folder
if self.save_folder and not Path(self.save_folder).exists():
Path(self.save_folder).mkdir(parents=True, exist_ok=True)
def build_option(
self,
swap_type: str,
use_gpu: bool,
gpen_type: str,
gpen_path: str,
crop_size: int,
**kwargs,
) -> ModelOption:
"""Build DOT option based on swap type.
Args:
swap_type (str): Swap type engine.
use_gpu (bool): If True, use GPU.
gpen_type (str): GPEN type.
gpen_path (str): path to GPEN model checkpoint.
crop_size (int): crop size.
Returns:
ModelOption: DOT option.
"""
if swap_type not in AVAILABLE_SWAP_TYPES:
raise ValueError(f"Invalid swap type: {swap_type}")
option: Optional[ModelOption] = None
if swap_type == "simswap":
option = self.simswap(
use_gpu=use_gpu,
gpen_type=gpen_type,
gpen_path=gpen_path,
crop_size=crop_size,
)
elif swap_type == "fomm":
option = self.fomm(
use_gpu=use_gpu, gpen_type=gpen_type, gpen_path=gpen_path, **kwargs
)
elif swap_type == "faceswap_cv2":
option = self.faceswap_cv2(
use_gpu=use_gpu, gpen_type=gpen_type, gpen_path=gpen_path
)
return option
def generate(
self,
option: ModelOption,
source: str,
target: Union[int, str],
show_fps: bool = False,
duration: Optional[int] = None,
**kwargs,
) -> Optional[List]:
"""Differentiates among different swap options.
Available swap options:
- `camera`
- `image`
- `video`
Args:
option (ModelOption): Swap engine class.
source (str): File path of source image.
target (Union[int, str]): Either `int` which indicates camera descriptor or target image file.
show_fps (bool, optional): Displays FPS during camera pipeline. Defaults to False.
duration (int, optional): Used to trim source video in seconds. Defaults to None.
Returns:
Optional[List]: None when using camera, otherwise metadata of successful and rejected face-swaps.
"""
if self.use_cam:
option.generate_from_camera(
source, int(target), show_fps=show_fps, **kwargs
)
return None
if isinstance(target, str):
if self.use_video:
option.generate_from_video(
source, target, self.save_folder, duration, **kwargs
)
return None
elif self.use_image:
[swappedDict, rejectedDict] = option.generate_from_image(
source, target, self.save_folder, **kwargs
)
return [swappedDict, rejectedDict]
else:
return None
else:
return None
def simswap(
self,
use_gpu: bool,
gpen_type: str,
gpen_path: str,
crop_size: int = 224,
use_mask: bool = True,
) -> SimswapOption:
"""Build Simswap Option.
Args:
use_gpu (bool): If True, use GPU.
gpen_type (str): GPEN type.
gpen_path (str): path to GPEN model checkpoint.
crop_size (int, optional): crop size. Defaults to 224.
use_mask (bool, optional): If True, use mask. Defaults to True.
Returns:
SimswapOption: Simswap Option.
"""
return SimswapOption(
use_gpu=use_gpu,
gpen_type=gpen_type,
gpen_path=gpen_path,
crop_size=crop_size,
use_mask=use_mask,
)
def faceswap_cv2(
self, use_gpu: bool, gpen_type: str, gpen_path: str, crop_size: int = 256
) -> FaceswapCVOption:
"""Build FaceswapCV Option.
Args:
use_gpu (bool): If True, use GPU.
gpen_type (str): GPEN type.
gpen_path (str): path to GPEN model checkpoint.
crop_size (int, optional): crop size. Defaults to 256.
Returns:
FaceswapCVOption: FaceswapCV Option.
"""
return FaceswapCVOption(
use_gpu=use_gpu,
gpen_type=gpen_type,
gpen_path=gpen_path,
crop_size=crop_size,
)
def fomm(
self,
use_gpu: bool,
gpen_type: str,
gpen_path: str,
crop_size: int = 256,
**kwargs,
) -> FOMMOption:
"""Build FOMM Option.
Args:
use_gpu (bool): If True, use GPU.
gpen_type (str): GPEN type.
gpen_path (str): path to GPEN model checkpoint.
crop_size (int, optional): crop size. Defaults to 256.
Returns:
FOMMOption: FOMM Option.
"""
return FOMMOption(
use_gpu=use_gpu,
gpen_type=gpen_type,
gpen_path=gpen_path,
crop_size=crop_size,
offline=self.use_video,
)
================================================
FILE: src/dot/faceswap_cv2/__init__.py
================================================
#!/usr/bin/env python3
from .option import FaceswapCVOption
__all__ = ["FaceswapCVOption"]
================================================
FILE: src/dot/faceswap_cv2/generic.py
================================================
#!/usr/bin/env python3
import cv2
import numpy as np
import scipy.spatial as spatial
def bilinear_interpolate(img, coords):
"""
Interpolates over every image channel.
https://en.wikipedia.org/wiki/Bilinear_interpolation
:param img: max 3 channel image
:param coords: 2 x _m_ array. 1st row = xcoords, 2nd row = ycoords
:returns: array of interpolated pixels with same shape as coords
"""
int_coords = np.int32(coords)
x0, y0 = int_coords
dx, dy = coords - int_coords
# 4 neighbouring pixels
q11 = img[y0, x0]
q21 = img[y0, x0 + 1]
q12 = img[y0 + 1, x0]
q22 = img[y0 + 1, x0 + 1]
btm = q21.T * dx + q11.T * (1 - dx)
top = q22.T * dx + q12.T * (1 - dx)
inter_pixel = top * dy + btm * (1 - dy)
return inter_pixel.T
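For a single-channel gradient image where every pixel's value equals its x-coordinate, interpolating at x = 1.5 must return 1.5. The function is re-stated here so the check runs standalone:

```python
import numpy as np


def bilinear_interpolate(img, coords):
    int_coords = np.int32(coords)
    x0, y0 = int_coords
    dx, dy = coords - int_coords
    # the four neighbouring pixels
    q11 = img[y0, x0]
    q21 = img[y0, x0 + 1]
    q12 = img[y0 + 1, x0]
    q22 = img[y0 + 1, x0 + 1]
    btm = q21.T * dx + q11.T * (1 - dx)
    top = q22.T * dx + q12.T * (1 - dx)
    inter_pixel = top * dy + btm * (1 - dy)
    return inter_pixel.T


# img[y, x] = x, so the value halfway between x=1 and x=2 is 1.5
img = np.tile(np.arange(4, dtype=np.float64), (4, 1))
vals = bilinear_interpolate(img, np.array([[1.5], [1.0]]))
print(vals)  # [1.5]
```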
def grid_coordinates(points):
"""
x,y grid coordinates within the ROI of supplied points.
:param points: points to generate grid coordinates
:returns: array of (x, y) coordinates
"""
xmin = np.min(points[:, 0])
xmax = np.max(points[:, 0]) + 1
ymin = np.min(points[:, 1])
ymax = np.max(points[:, 1]) + 1
return np.asarray(
[(x, y) for y in range(ymin, ymax) for x in range(xmin, xmax)], np.uint32
)
def process_warp(src_img, result_img, tri_affines, dst_points, delaunay):
"""
Warp each triangle from the src_image only within the
ROI of the destination image (points in dst_points).
"""
roi_coords = grid_coordinates(dst_points)
# indices to vertices. -1 if pixel is not in any triangle
roi_tri_indices = delaunay.find_simplex(roi_coords)
for simplex_index in range(len(delaunay.simplices)):
coords = roi_coords[roi_tri_indices == simplex_index]
num_coords = len(coords)
out_coords = np.dot(
tri_affines[simplex_index], np.vstack((coords.T, np.ones(num_coords)))
)
x, y = coords.T
result_img[y, x] = bilinear_interpolate(src_img, out_coords)
return None
def triangular_affine_matrices(vertices, src_points, dst_points):
"""
Calculate the affine transformation matrix for each
triangle (x,y) vertex from dst_points to src_points.
:param vertices: array of triplet indices to corners of triangle
:param src_points: array of [x, y] points to landmarks for source image
:param dst_points: array of [x, y] points to landmarks for destination image
:returns: 2 x 3 affine matrix transformation for a triangle
"""
ones = [1, 1, 1]
for tri_indices in vertices:
src_tri = np.vstack((src_points[tri_indices, :].T, ones))
dst_tri = np.vstack((dst_points[tri_indices, :].T, ones))
mat = np.dot(src_tri, np.linalg.inv(dst_tri))[:2, :]
yield mat
def warp_image_3d(src_img, src_points, dst_points, dst_shape, dtype=np.uint8):
rows, cols = dst_shape[:2]
result_img = np.zeros((rows, cols, 3), dtype=dtype)
delaunay = spatial.Delaunay(dst_points)
tri_affines = np.asarray(
list(triangular_affine_matrices(delaunay.simplices, src_points, dst_points))
)
process_warp(src_img, result_img, tri_affines, dst_points, delaunay)
return result_img
def transformation_from_points(points1, points2):
points1 = points1.astype(np.float64)
points2 = points2.astype(np.float64)
c1 = np.mean(points1, axis=0)
c2 = np.mean(points2, axis=0)
points1 -= c1
points2 -= c2
s1 = np.std(points1)
s2 = np.std(points2)
points1 /= s1
points2 /= s2
U, S, Vt = np.linalg.svd(np.dot(points1.T, points2))
R = (np.dot(U, Vt)).T
return np.vstack(
[
np.hstack([s2 / s1 * R, (c2.T - np.dot(s2 / s1 * R, c1.T))[:, np.newaxis]]),
np.array([[0.0, 0.0, 1.0]]),
]
)
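`transformation_from_points` is an orthogonal-Procrustes fit: it recovers the similarity transform (scale, rotation, translation) mapping `points1` onto `points2` as a 3x3 homogeneous matrix. A numeric round-trip (function re-stated) applies a known rotation, scale, and shift, then checks the recovered matrix reproduces it:

```python
import numpy as np


def transformation_from_points(points1, points2):
    points1 = points1.astype(np.float64)
    points2 = points2.astype(np.float64)
    c1 = np.mean(points1, axis=0)
    c2 = np.mean(points2, axis=0)
    points1 -= c1
    points2 -= c2
    s1 = np.std(points1)
    s2 = np.std(points2)
    points1 /= s1
    points2 /= s2
    U, S, Vt = np.linalg.svd(np.dot(points1.T, points2))
    R = (np.dot(U, Vt)).T
    return np.vstack(
        [
            np.hstack([s2 / s1 * R, (c2.T - np.dot(s2 / s1 * R, c1.T))[:, np.newaxis]]),
            np.array([[0.0, 0.0, 1.0]]),
        ]
    )


rng = np.random.default_rng(0)
pts1 = rng.random((10, 2)) * 100
theta = 0.3
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
pts2 = 1.7 * pts1 @ rot.T + np.array([5.0, -3.0])  # known similarity transform

M = transformation_from_points(pts1, pts2)
mapped = pts1 @ M[:2, :2].T + M[:2, 2]  # apply the homogeneous matrix row-wise
print(np.allclose(mapped, pts2))  # True
```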
def warp_image_2d(im, M, dshape):
output_im = np.zeros(dshape, dtype=im.dtype)
cv2.warpAffine(
im,
M[:2],
(dshape[1], dshape[0]),
dst=output_im,
borderMode=cv2.BORDER_TRANSPARENT,
flags=cv2.WARP_INVERSE_MAP,
)
return output_im
def mask_from_points(size, points, erode_flag=1):
radius = 10 # kernel size
kernel = np.ones((radius, radius), np.uint8)
mask = np.zeros(size, np.uint8)
cv2.fillConvexPoly(mask, cv2.convexHull(points), 255)
if erode_flag:
mask = cv2.erode(mask, kernel, iterations=1)
return mask
def correct_colours(im1, im2, landmarks1):
COLOUR_CORRECT_BLUR_FRAC = 0.75
LEFT_EYE_POINTS = list(range(42, 48))
RIGHT_EYE_POINTS = list(range(36, 42))
blur_amount = COLOUR_CORRECT_BLUR_FRAC * np.linalg.norm(
np.mean(landmarks1[LEFT_EYE_POINTS], axis=0)
- np.mean(landmarks1[RIGHT_EYE_POINTS], axis=0)
)
blur_amount = int(blur_amount)
if blur_amount % 2 == 0:
blur_amount += 1
im1_blur = cv2.GaussianBlur(im1, (blur_amount, blur_amount), 0)
im2_blur = cv2.GaussianBlur(im2, (blur_amount, blur_amount), 0)
# Avoid divide-by-zero errors.
im2_blur = im2_blur.astype(int)
im2_blur += 128 * (im2_blur <= 1)
result = (
im2.astype(np.float64)
* im1_blur.astype(np.float64)
/ im2_blur.astype(np.float64)
)
result = np.clip(result, 0, 255).astype(np.uint8)
return result
def apply_mask(img, mask):
"""
Apply mask to supplied image.
:param img: max 3 channel image
:param mask: [0-255] values in mask
:returns: new image with mask applied
"""
masked_img = cv2.bitwise_and(img, img, mask=mask)
return masked_img
================================================
FILE: src/dot/faceswap_cv2/option.py
================================================
#!/usr/bin/env python3
import cv2
import dlib
import numpy as np
from ..commons import ModelOption
from ..commons.utils import crop, resize
from ..faceswap_cv2.swap import Swap
class FaceswapCVOption(ModelOption):
def __init__(
self,
use_gpu=True,
use_mask=False,
crop_size=224,
gpen_type=None,
gpen_path=None,
):
super(FaceswapCVOption, self).__init__(
gpen_type=gpen_type,
use_gpu=use_gpu,
crop_size=crop_size,
gpen_path=gpen_path,
)
self.frame_proportion = 0.9
self.frame_offset_x = 0
self.frame_offset_y = 0
def create_model(self, model_path, **kwargs) -> None: # type: ignore
self.model = Swap(
predictor_path=model_path, end=68, warp_2d=False, correct_color=True
)
self.detector = dlib.get_frontal_face_detector()
def change_option(self, image, **kwargs):
self.source_image = image
self.src_landmarks, self.src_shape, self.src_face = self.model._process_face(
image
)
def process_image(
self, image, use_cam=True, ignore_error=True, **kwargs
) -> np.array:
frame = image[..., ::-1]
if use_cam:
frame, (self.frame_offset_x, self.frame_offset_y) = crop(
frame,
p=self.frame_proportion,
offset_x=self.frame_offset_x,
offset_y=self.frame_offset_y,
)
frame = resize(frame, (self.crop_size, self.crop_size))[..., :3]
frame = cv2.flip(frame, 1)
faces = self.detector(frame[..., ::-1])
if len(faces) > 0:
try:
swapped_img = self.model.apply_face_swap(
source_image=self.source_image,
target_image=frame,
save_path=None,
src_landmarks=self.src_landmarks,
src_shape=self.src_shape,
src_face=self.src_face,
)
swapped_img = np.array(swapped_img)[..., ::-1].copy()
except Exception as e:
if ignore_error:
print(e)
swapped_img = frame[..., ::-1].copy()
else:
raise e
else:
swapped_img = frame[..., ::-1].copy()
return swapped_img
================================================
FILE: src/dot/faceswap_cv2/swap.py
================================================
#!/usr/bin/env python3
from typing import Any, Dict
import cv2
import dlib
import numpy as np
from PIL import Image
from .generic import (
apply_mask,
correct_colours,
mask_from_points,
transformation_from_points,
warp_image_2d,
warp_image_3d,
)
# define globals
CACHED_PREDICTOR_PATH = "saved_models/faceswap_cv/shape_predictor_68_face_landmarks.dat"
class Swap:
def __init__(
self,
predictor_path: str = None,
warp_2d: bool = True,
correct_color: bool = True,
end: int = 48,
):
"""
Face Swap.
@description:
perform face swapping using Poisson blending
@arguments:
predictor_path: (str) path to 68-point facial landmark detector
warp_2d: (bool) if True, perform 2d warping for swapping
correct_color: (bool) if True, color correct swap output image
end: (int) last facial landmark point for face swap
"""
if not predictor_path:
predictor_path = CACHED_PREDICTOR_PATH
# init
self.predictor_path = predictor_path
self.warp_2d = warp_2d
self.correct_color = correct_color
self.end = end
# Load dlib models
self.detector = dlib.get_frontal_face_detector()
self.predictor = dlib.shape_predictor(self.predictor_path)
def apply_face_swap(self, source_image, target_image, save_path=None, **kwargs):
"""
apply face swapping from source to target image
@arguments:
source_image: (PIL or str) source PIL image or path to source image
target_image: (PIL or str) target PIL image or path to target image
save_path: (str) path to save face swap output image (optional)
            **kwargs: Extra arguments for specifying the source landmarks, shape and face
@returns:
faceswap_output_image: (PIL) face swap output image
"""
# load image if path given, else convert to cv2 format
if isinstance(source_image, str):
source_image_cv2 = cv2.imread(source_image)
else:
source_image_cv2 = cv2.cvtColor(np.array(source_image), cv2.COLOR_RGB2BGR)
if isinstance(target_image, str):
target_image_cv2 = cv2.imread(target_image)
else:
target_image_cv2 = cv2.cvtColor(np.array(target_image), cv2.COLOR_RGB2BGR)
        # process source image: reuse precomputed face data if provided
        try:
            src_landmarks = kwargs["src_landmarks"]
            src_shape = kwargs["src_shape"]
            src_face = kwargs["src_face"]
        except KeyError:
            # fall back to detecting the source face from the image
            src_landmarks, src_shape, src_face = self._process_face(source_image_cv2)
# process target image
trg_landmarks, trg_shape, trg_face = self._process_face(target_image_cv2)
# get target face dimensions
h, w = trg_face.shape[:2]
# 3d warp
warped_src_face = warp_image_3d(
src_face, src_landmarks[: self.end], trg_landmarks[: self.end], (h, w)
)
# Mask for blending
mask = mask_from_points((h, w), trg_landmarks)
mask_src = np.mean(warped_src_face, axis=2) > 0
mask = np.asarray(mask * mask_src, dtype=np.uint8)
# Correct color
if self.correct_color:
warped_src_face = apply_mask(warped_src_face, mask)
dst_face_masked = apply_mask(trg_face, mask)
warped_src_face = correct_colours(
dst_face_masked, warped_src_face, trg_landmarks
)
# 2d warp
if self.warp_2d:
unwarped_src_face = warp_image_3d(
warped_src_face,
trg_landmarks[: self.end],
src_landmarks[: self.end],
src_face.shape[:2],
)
warped_src_face = warp_image_2d(
unwarped_src_face,
transformation_from_points(trg_landmarks, src_landmarks),
(h, w, 3),
)
mask = mask_from_points((h, w), trg_landmarks)
mask_src = np.mean(warped_src_face, axis=2) > 0
mask = np.asarray(mask * mask_src, dtype=np.uint8)
# perform base blending operation
faceswap_output_cv2 = self._perform_base_blending(
mask, trg_face, warped_src_face
)
x, y, w, h = trg_shape
target_faceswap_img = target_image_cv2.copy()
target_faceswap_img[y : y + h, x : x + w] = faceswap_output_cv2
faceswap_output_image = Image.fromarray(
cv2.cvtColor(target_faceswap_img, cv2.COLOR_BGR2RGB)
)
if save_path:
faceswap_output_image.save(save_path, compress_level=0)
return faceswap_output_image
    def _face_and_landmark_detection(self, image):
        """Detect faces and return the 68 landmarks of the largest face."""
        # pick the face with the largest bounding-box area
        faces = self.detector(image)
        idx = np.argmax(
            [
                (face.right() - face.left()) * (face.bottom() - face.top())
                for face in faces
            ]
        )
        bbox = faces[idx]
        # predict landmarks for the selected face
        landmarks_dlib = self.predictor(image=image, box=bbox)
        face_landmarks = np.array([[p.x, p.y] for p in landmarks_dlib.parts()])
        return face_landmarks
    def _process_face(self, image, r=10):
        """process detected face and landmarks"""
        # get landmarks
        landmarks = self._face_and_landmark_detection(image)
        # get image dimensions (numpy arrays are height-first)
        im_h, im_w = image.shape[:2]
        # get face extent from the landmarks
        left, top = np.min(landmarks, 0)
        right, bottom = np.max(landmarks, 0)
        # expand by a margin r, clamped to the image borders
        x, y = max(0, left - r), max(0, top - r)
        w, h = min(right + r, im_w) - x, min(bottom + r, im_h) - y
        return (
            landmarks - np.asarray([[x, y]]),
            (x, y, w, h),
            image[y : y + h, x : x + w],
        )
@staticmethod
def _perform_base_blending(mask, trg_face, warped_src_face):
"""perform Poisson blending using mask"""
# Shrink the mask
kernel = np.ones((10, 10), np.uint8)
mask = cv2.erode(mask, kernel, iterations=1)
# Poisson Blending
r = cv2.boundingRect(mask)
center = (r[0] + int(r[2] / 2), r[1] + int(r[3] / 2))
output_cv2 = cv2.seamlessClone(
warped_src_face, trg_face, mask, center, cv2.NORMAL_CLONE
)
return output_cv2
@classmethod
def from_config(cls, config: Dict[str, Any]) -> "Swap":
"""
Instantiates a Swap from a configuration.
Args:
config: A configuration for a Swap.
Returns:
A Swap instance.
"""
# get config
swap_config = config.get("swap")
# return instance
return cls(
predictor_path=swap_config.get("predictor_path", CACHED_PREDICTOR_PATH),
warp_2d=swap_config.get("warp_2d", True),
correct_color=swap_config.get("correct_color", True),
end=swap_config.get("end", 48),
)
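The crop arithmetic in `_process_face` can be exercised in isolation: grow the landmark bounding box by a margin `r` on each side and clamp it to the image borders. `face_box` and the landmark values below are illustrative only, not part of the repo:

```python
# Standalone sketch of the bounding-box maths used by Swap._process_face:
# expand the landmark extent by r pixels per side, clamped to the image.
def face_box(landmarks, im_h, im_w, r=10):
    xs = [p[0] for p in landmarks]
    ys = [p[1] for p in landmarks]
    left, top, right, bottom = min(xs), min(ys), max(xs), max(ys)
    x, y = max(0, left - r), max(0, top - r)
    w = min(right + r, im_w) - x
    h = min(bottom + r, im_h) - y
    return x, y, w, h

print(face_box([(30, 40), (120, 160)], im_h=200, im_w=200))  # (20, 30, 110, 140)
print(face_box([(5, 5), (195, 195)], im_h=200, im_w=200))    # clamped: (0, 0, 200, 200)
```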
================================================
FILE: src/dot/fomm/__init__.py
================================================
#!/usr/bin/env python3
from .option import FOMMOption
__all__ = ["FOMMOption"]
================================================
FILE: src/dot/fomm/config/vox-adv-256.yaml
================================================
---
dataset_params:
root_dir: data/vox-png
frame_shape: [256, 256, 3]
id_sampling: true
pairs_list: data/vox256.csv
augmentation_params:
flip_param:
horizontal_flip: true
time_flip: true
jitter_param:
brightness: 0.1
contrast: 0.1
saturation: 0.1
hue: 0.1
model_params:
common_params:
num_kp: 10
num_channels: 3
estimate_jacobian: true
kp_detector_params:
temperature: 0.1
block_expansion: 32
max_features: 1024
scale_factor: 0.25
num_blocks: 5
generator_params:
block_expansion: 64
max_features: 512
num_down_blocks: 2
num_bottleneck_blocks: 6
estimate_occlusion_map: true
dense_motion_params:
block_expansion: 64
max_features: 1024
num_blocks: 5
scale_factor: 0.25
discriminator_params:
scales: [1]
block_expansion: 32
max_features: 512
num_blocks: 4
use_kp: true
train_params:
num_epochs: 150
num_repeats: 75
epoch_milestones: []
lr_generator: 2.0e-4
lr_discriminator: 2.0e-4
lr_kp_detector: 2.0e-4
batch_size: 36
scales: [1, 0.5, 0.25, 0.125]
checkpoint_freq: 50
transform_params:
sigma_affine: 0.05
sigma_tps: 0.005
points_tps: 5
loss_weights:
generator_gan: 1
discriminator_gan: 1
feature_matching: [10, 10, 10, 10]
perceptual: [10, 10, 10, 10, 10]
equivariance_value: 10
equivariance_jacobian: 10
reconstruction_params:
num_videos: 1000
format: .mp4
animate_params:
num_pairs: 50
format: .mp4
normalization_params:
adapt_movement_scale: false
use_relative_movement: true
use_relative_jacobian: true
visualizer_params:
kp_size: 5
draw_border: true
colormap: gist_rainbow
================================================
FILE: src/dot/fomm/face_alignment.py
================================================
import warnings
from enum import IntEnum
import numpy as np
import torch
from face_alignment.folder_data import FolderData
from face_alignment.utils import crop, draw_gaussian, flip, get_image, get_preds_fromhm
from packaging import version
from tqdm import tqdm
class LandmarksType(IntEnum):
"""Enum class defining the type of landmarks to detect.
``TWO_D`` - the detected points ``(x,y)`` are detected in a 2D space and follow the visible contour of the face
``TWO_HALF_D`` - this points represent the projection of the 3D points into 3D
``THREE_D`` - detect the points ``(x,y,z)``` in a 3D space
"""
TWO_D = 1
TWO_HALF_D = 2
THREE_D = 3
class NetworkSize(IntEnum):
# TINY = 1
# SMALL = 2
# MEDIUM = 3
LARGE = 4
default_model_urls = {
"2DFAN-4": "saved_models/face_alignment/2DFAN4-cd938726ad.zip",
"3DFAN-4": "saved_models/face_alignment/3DFAN4-4a694010b9.zip",
"depth": "saved_models/face_alignment/depth-6c4283c0e0.zip",
}
models_urls = {
"1.6": {
"2DFAN-4": "saved_models/face_alignment/2DFAN4_1.6-c827573f02.zip",
"3DFAN-4": "saved_models/face_alignment/3DFAN4_1.6-ec5cf40a1d.zip",
"depth": "saved_models/face_alignment/depth_1.6-2aa3f18772.zip",
},
"1.5": {
"2DFAN-4": "saved_models/face_alignment/2DFAN4_1.5-a60332318a.zip",
"3DFAN-4": "saved_models/face_alignment/3DFAN4_1.5-176570af4d.zip",
"depth": "saved_models/face_alignment/depth_1.5-bc10f98e39.zip",
},
}
class FaceAlignment:
def __init__(
self,
landmarks_type,
network_size=NetworkSize.LARGE,
device="cuda",
dtype=torch.float32,
flip_input=False,
face_detector="sfd",
face_detector_kwargs=None,
verbose=False,
):
self.device = device
self.flip_input = flip_input
self.landmarks_type = landmarks_type
self.verbose = verbose
self.dtype = dtype
if version.parse(torch.__version__) < version.parse("1.5.0"):
raise ImportError(
"Unsupported pytorch version detected. Minimum supported version of pytorch: 1.5.0\
Either upgrade (recommended) your pytorch setup, or downgrade to face-alignment 1.2.0"
)
network_size = int(network_size)
pytorch_version = torch.__version__
if "dev" in pytorch_version:
pytorch_version = pytorch_version.rsplit(".", 2)[0]
else:
pytorch_version = pytorch_version.rsplit(".", 1)[0]
if "cuda" in device:
torch.backends.cudnn.benchmark = True
# Get the face detector
face_detector_module = __import__(
"face_alignment.detection." + face_detector,
globals(),
locals(),
[face_detector],
0,
)
face_detector_kwargs = face_detector_kwargs or {}
self.face_detector = face_detector_module.FaceDetector(
device=device, verbose=verbose, **face_detector_kwargs
)
        # Initialise the face alignment network
if landmarks_type == LandmarksType.TWO_D:
network_name = "2DFAN-" + str(network_size)
else:
network_name = "3DFAN-" + str(network_size)
self.face_alignment_net = torch.jit.load(
models_urls.get(pytorch_version, default_model_urls)[network_name]
)
self.face_alignment_net.to(device, dtype=dtype)
self.face_alignment_net.eval()
        # Initialise the depth prediction network
if landmarks_type == LandmarksType.THREE_D:
self.depth_prediciton_net = torch.jit.load(
models_urls.get(pytorch_version, default_model_urls)["depth"]
)
self.depth_prediciton_net.to(device, dtype=dtype)
self.depth_prediciton_net.eval()
def get_landmarks(
self,
image_or_path,
detected_faces=None,
return_bboxes=False,
return_landmark_score=False,
):
"""Deprecated, please use get_landmarks_from_image
Arguments:
image_or_path {string or numpy.array or torch.tensor} -- The input image or path to it
Keyword Arguments:
detected_faces {list of numpy.array} -- list of bounding boxes, one for each face found
in the image (default: {None})
return_bboxes {boolean} -- If True, return the face bounding boxes in addition to the keypoints.
return_landmark_score {boolean} -- If True, return the keypoint scores along with the keypoints.
"""
return self.get_landmarks_from_image(
image_or_path, detected_faces, return_bboxes, return_landmark_score
)
@torch.no_grad()
def get_landmarks_from_image(
self,
image_or_path,
detected_faces=None,
return_bboxes=False,
return_landmark_score=False,
):
"""Predict the landmarks for each face present in the image.
        This function predicts a set of 68 2D or 3D landmarks, one set for each face present.
        If detected_faces is None the method will also run a face detector.
Arguments:
image_or_path {string or numpy.array or torch.tensor} -- The input image or path to it.
Keyword Arguments:
detected_faces {list of numpy.array} -- list of bounding boxes, one for each face found
in the image (default: {None})
return_bboxes {boolean} -- If True, return the face bounding boxes in addition to the keypoints.
return_landmark_score {boolean} -- If True, return the keypoint scores along with the keypoints.
Return:
result:
1. if both return_bboxes and return_landmark_score are False, result will be:
landmark
2. Otherwise, result will be one of the following, depending on the actual value of return_* arguments.
(landmark, landmark_score, detected_face)
(landmark, None, detected_face)
(landmark, landmark_score, None )
"""
image = get_image(image_or_path) # noqa
if detected_faces is None:
detected_faces = self.face_detector.detect_from_image(image.copy())
if len(detected_faces) == 0:
warnings.warn("No faces were detected.")
if return_bboxes or return_landmark_score:
return None, None, None
else:
return None
landmarks = []
landmarks_scores = []
for i, d in enumerate(detected_faces):
center = torch.tensor(
[d[2] - (d[2] - d[0]) / 2.0, d[3] - (d[3] - d[1]) / 2.0]
)
center[1] = center[1] - (d[3] - d[1]) * 0.12
scale = (d[2] - d[0] + d[3] - d[1]) / self.face_detector.reference_scale
inp = crop(image, center, scale) # noqa
inp = torch.from_numpy(inp.transpose((2, 0, 1))).float()
inp = inp.to(self.device, dtype=self.dtype)
inp.div_(255.0).unsqueeze_(0)
out = self.face_alignment_net(inp).detach()
if self.flip_input:
out += flip(
self.face_alignment_net(flip(inp)).detach(), is_label=True
) # noqa
out = out.to(device="cpu", dtype=torch.float32).numpy()
pts, pts_img, scores = get_preds_fromhm(out, center.numpy(), scale) # noqa
pts, pts_img = torch.from_numpy(pts), torch.from_numpy(pts_img)
pts, pts_img = pts.view(68, 2) * 4, pts_img.view(68, 2)
scores = scores.squeeze(0)
if self.landmarks_type == LandmarksType.THREE_D:
heatmaps = np.zeros((68, 256, 256), dtype=np.float32)
for i in range(68):
if pts[i, 0] > 0 and pts[i, 1] > 0:
heatmaps[i] = draw_gaussian(heatmaps[i], pts[i], 2) # noqa
heatmaps = torch.from_numpy(heatmaps).unsqueeze_(0)
heatmaps = heatmaps.to(self.device, dtype=self.dtype)
depth_pred = (
self.depth_prediciton_net(torch.cat((inp, heatmaps), 1))
.data.cpu()
.view(68, 1)
.to(dtype=torch.float32)
)
pts_img = torch.cat(
(pts_img, depth_pred * (1.0 / (256.0 / (200.0 * scale)))), 1
)
landmarks.append(pts_img.numpy())
landmarks_scores.append(scores)
if not return_bboxes:
detected_faces = None
if not return_landmark_score:
landmarks_scores = None
if return_bboxes or return_landmark_score:
return landmarks, landmarks_scores, detected_faces
else:
return landmarks
@torch.no_grad()
def get_landmarks_from_batch(
self,
image_batch,
detected_faces=None,
return_bboxes=False,
return_landmark_score=False,
):
"""Predict the landmarks for each face present in the image.
        This function predicts a set of 68 2D or 3D landmarks, one set for each face in each image of the batch, in parallel.
        If detected_faces is None the method will also run a face detector.
Arguments:
image_batch {torch.tensor} -- The input images batch
Keyword Arguments:
detected_faces {list of numpy.array} -- list of bounding boxes, one for each face found
in the image (default: {None})
return_bboxes {boolean} -- If True, return the face bounding boxes in addition to the keypoints.
return_landmark_score {boolean} -- If True, return the keypoint scores along with the keypoints.
Return:
result:
1. if both return_bboxes and return_landmark_score are False, result will be:
landmarks
2. Otherwise, result will be one of the following, depending on the actual value of return_* arguments.
(landmark, landmark_score, detected_face)
(landmark, None, detected_face)
(landmark, landmark_score, None )
"""
if detected_faces is None:
detected_faces = self.face_detector.detect_from_batch(image_batch)
if len(detected_faces) == 0:
warnings.warn("No faces were detected.")
if return_bboxes or return_landmark_score:
return None, None, None
else:
return None
landmarks = []
landmarks_scores_list = []
# A batch for each frame
for i, faces in enumerate(detected_faces):
res = self.get_landmarks_from_image(
image_batch[i].cpu().numpy().transpose(1, 2, 0),
detected_faces=faces,
return_landmark_score=return_landmark_score,
)
if return_landmark_score:
landmark_set, landmarks_scores, _ = res
landmarks_scores_list.append(landmarks_scores)
else:
landmark_set = res
            # Backward compatibility
if landmark_set is not None:
landmark_set = np.concatenate(landmark_set, axis=0)
else:
landmark_set = []
landmarks.append(landmark_set)
if not return_bboxes:
detected_faces = None
if not return_landmark_score:
landmarks_scores_list = None
if return_bboxes or return_landmark_score:
return landmarks, landmarks_scores_list, detected_faces
else:
return landmarks
def get_landmarks_from_directory(
self,
path,
extensions=[".jpg", ".png"],
recursive=True,
show_progress_bar=True,
return_bboxes=False,
return_landmark_score=False,
):
"""Scan a directory for images with a given extension type(s) and predict the landmarks for each
face present in the images found.
Arguments:
path {str} -- path to the target directory containing the images
Keyword Arguments:
extensions {list of str} -- list containing the image extensions considered (default: ['.jpg', '.png'])
recursive {boolean} -- If True, scans for images recursively (default: True)
show_progress_bar {boolean} -- If True displays a progress bar (default: True)
return_bboxes {boolean} -- If True, return the face bounding boxes in addition to the keypoints.
return_landmark_score {boolean} -- If True, return the keypoint scores along with the keypoints.
"""
dataset = FolderData(
path,
self.face_detector.tensor_or_path_to_ndarray,
extensions,
recursive,
self.verbose,
)
dataloader = torch.utils.data.DataLoader(
dataset, batch_size=1, shuffle=False, num_workers=2, prefetch_factor=4
)
predictions = {}
for (image_path, image) in tqdm(dataloader, disable=not show_progress_bar):
image_path, image = image_path[0], image[0]
bounding_boxes = self.face_detector.detect_from_image(image)
if return_bboxes or return_landmark_score:
preds, bbox, score = self.get_landmarks_from_image(
image,
bounding_boxes,
return_bboxes=return_bboxes,
return_landmark_score=return_landmark_score,
)
predictions[image_path] = (preds, bbox, score)
else:
preds = self.get_landmarks_from_image(image, bounding_boxes)
predictions[image_path] = preds
return predictions
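The box-to-crop conversion at the top of `get_landmarks_from_image` is a small closed-form step that can be pulled out on its own. `box_to_center_scale` is a hypothetical helper name, and `reference_scale=195.0` (the value used by the s3fd detector) is an assumption here:

```python
# Convert a detector box (x1, y1, x2, y2) into the crop centre and scale fed
# to the FAN landmark network, mirroring the loop body above.
def box_to_center_scale(d, reference_scale=195.0):  # reference_scale: assumed s3fd value
    cx = d[2] - (d[2] - d[0]) / 2.0
    cy = d[3] - (d[3] - d[1]) / 2.0
    cy -= (d[3] - d[1]) * 0.12                # shift the crop up toward the forehead
    scale = (d[2] - d[0] + d[3] - d[1]) / reference_scale
    return (cx, cy), scale

print(box_to_center_scale((0.0, 0.0, 100.0, 100.0)))
```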
================================================
FILE: src/dot/fomm/modules/__init__.py
================================================
#!/usr/bin/env python3
================================================
FILE: src/dot/fomm/modules/dense_motion.py
================================================
#!/usr/bin/env python3
import torch
import torch.nn.functional as F
from torch import nn
from .util import AntiAliasInterpolation2d, Hourglass, kp2gaussian, make_coordinate_grid
class DenseMotionNetwork(nn.Module):
"""
Module that predicting a dense motion
from sparse motion representation given
by kp_source and kp_driving
"""
def __init__(
self,
block_expansion,
num_blocks,
max_features,
num_kp,
num_channels,
estimate_occlusion_map=False,
scale_factor=1,
kp_variance=0.01,
):
super(DenseMotionNetwork, self).__init__()
self.hourglass = Hourglass(
block_expansion=block_expansion,
in_features=(num_kp + 1) * (num_channels + 1),
max_features=max_features,
num_blocks=num_blocks,
)
self.mask = nn.Conv2d(
self.hourglass.out_filters, num_kp + 1, kernel_size=(7, 7), padding=(3, 3)
)
if estimate_occlusion_map:
self.occlusion = nn.Conv2d(
self.hourglass.out_filters, 1, kernel_size=(7, 7), padding=(3, 3)
)
else:
self.occlusion = None
self.num_kp = num_kp
self.scale_factor = scale_factor
self.kp_variance = kp_variance
if self.scale_factor != 1:
self.down = AntiAliasInterpolation2d(num_channels, self.scale_factor)
def create_heatmap_representations(self, source_image, kp_driving, kp_source):
"""
Eq 6. in the paper H_k(z)
"""
spatial_size = source_image.shape[2:]
gaussian_driving = kp2gaussian(
kp_driving, spatial_size=spatial_size, kp_variance=self.kp_variance
)
gaussian_source = kp2gaussian(
kp_source, spatial_size=spatial_size, kp_variance=self.kp_variance
)
heatmap = gaussian_driving - gaussian_source
# adding background feature
zeros = torch.zeros(heatmap.shape[0], 1, spatial_size[0], spatial_size[1]).type(
heatmap.type()
)
heatmap = torch.cat([zeros, heatmap], dim=1)
heatmap = heatmap.unsqueeze(2)
return heatmap
def create_sparse_motions(self, source_image, kp_driving, kp_source):
"""
Eq 4. in the paper T_{s<-d}(z)
"""
bs, _, h, w = source_image.shape
identity_grid = make_coordinate_grid((h, w), type=kp_source["value"].type())
identity_grid = identity_grid.view(1, 1, h, w, 2)
coordinate_grid = identity_grid - kp_driving["value"].view(
bs, self.num_kp, 1, 1, 2
)
if "jacobian" in kp_driving:
jacobian = torch.matmul(
kp_source["jacobian"], torch.inverse(kp_driving["jacobian"])
)
jacobian = jacobian.unsqueeze(-3).unsqueeze(-3)
jacobian = jacobian.repeat(1, 1, h, w, 1, 1)
coordinate_grid = torch.matmul(jacobian, coordinate_grid.unsqueeze(-1))
coordinate_grid = coordinate_grid.squeeze(-1)
driving_to_source = coordinate_grid + kp_source["value"].view(
bs, self.num_kp, 1, 1, 2
)
# adding background feature
identity_grid = identity_grid.repeat(bs, 1, 1, 1, 1)
sparse_motions = torch.cat([identity_grid, driving_to_source], dim=1)
return sparse_motions
def create_deformed_source_image(self, source_image, sparse_motions):
"""
Eq 7. in the paper hat{T}_{s<-d}(z)
"""
bs, _, h, w = source_image.shape
source_repeat = (
source_image.unsqueeze(1)
.unsqueeze(1)
.repeat(1, self.num_kp + 1, 1, 1, 1, 1)
)
source_repeat = source_repeat.view(bs * (self.num_kp + 1), -1, h, w)
sparse_motions = sparse_motions.view((bs * (self.num_kp + 1), h, w, -1))
sparse_deformed = F.grid_sample(source_repeat, sparse_motions)
sparse_deformed = sparse_deformed.view((bs, self.num_kp + 1, -1, h, w))
return sparse_deformed
def forward(self, source_image, kp_driving, kp_source):
if self.scale_factor != 1:
source_image = self.down(source_image)
bs, _, h, w = source_image.shape
out_dict = dict()
heatmap_representation = self.create_heatmap_representations(
source_image, kp_driving, kp_source
)
sparse_motion = self.create_sparse_motions(source_image, kp_driving, kp_source)
deformed_source = self.create_deformed_source_image(source_image, sparse_motion)
out_dict["sparse_deformed"] = deformed_source
input = torch.cat([heatmap_representation, deformed_source], dim=2)
input = input.view(bs, -1, h, w)
prediction = self.hourglass(input)
mask = self.mask(prediction)
mask = F.softmax(mask, dim=1)
out_dict["mask"] = mask
mask = mask.unsqueeze(2)
sparse_motion = sparse_motion.permute(0, 1, 4, 2, 3)
deformation = (sparse_motion * mask).sum(dim=1)
deformation = deformation.permute(0, 2, 3, 1)
out_dict["deformation"] = deformation
# Sec. 3.2 in the paper
if self.occlusion:
occlusion_map = torch.sigmoid(self.occlusion(prediction))
out_dict["occlusion_map"] = occlusion_map
return out_dict
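The final step of `forward` above combines the K+1 sparse motions into one dense deformation as a softmax-mask-weighted sum per pixel. A toy, single-pixel version in plain Python (`combine_motions` and the values are made up for illustration):

```python
# One pixel, two candidate motions (background + one keypoint), and softmax
# weights summing to 1: the dense motion is their weighted average.
def combine_motions(sparse_motions, masks):
    return [
        sum(w * m[c] for w, m in zip(masks, sparse_motions))
        for c in range(2)  # x and y components
    ]

print(combine_motions([(-1.0, 0.0), (1.0, 2.0)], [0.25, 0.75]))  # [0.5, 1.5]
```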
================================================
FILE: src/dot/fomm/modules/generator_optim.py
================================================
#!/usr/bin/env python3
import torch
import torch.nn.functional as F
from torch import nn
from .dense_motion import DenseMotionNetwork
from .util import DownBlock2d, ResBlock2d, SameBlock2d, UpBlock2d
class OcclusionAwareGenerator(nn.Module):
"""
Generator that given source image and keypoints
try to transform image according to movement trajectories
induced by keypoints. Generator follows Johnson architecture.
"""
def __init__(
self,
num_channels,
num_kp,
block_expansion,
max_features,
num_down_blocks,
num_bottleneck_blocks,
estimate_occlusion_map=False,
dense_motion_params=None,
estimate_jacobian=False,
):
super(OcclusionAwareGenerator, self).__init__()
if dense_motion_params is not None:
self.dense_motion_network = DenseMotionNetwork(
num_kp=num_kp,
num_channels=num_channels,
estimate_occlusion_map=estimate_occlusion_map,
**dense_motion_params
)
else:
self.dense_motion_network = None
self.first = SameBlock2d(
num_channels, block_expansion, kernel_size=(7, 7), padding=(3, 3)
)
down_blocks = []
for i in range(num_down_blocks):
in_features = min(max_features, block_expansion * (2**i))
out_features = min(max_features, block_expansion * (2 ** (i + 1)))
down_blocks.append(
DownBlock2d(
in_features, out_features, kernel_size=(3, 3), padding=(1, 1)
)
)
self.down_blocks = nn.ModuleList(down_blocks)
up_blocks = []
for i in range(num_down_blocks):
in_features = min(
max_features, block_expansion * (2 ** (num_down_blocks - i))
)
out_features = min(
max_features, block_expansion * (2 ** (num_down_blocks - i - 1))
)
up_blocks.append(
UpBlock2d(in_features, out_features, kernel_size=(3, 3), padding=(1, 1))
)
self.up_blocks = nn.ModuleList(up_blocks)
self.bottleneck = torch.nn.Sequential()
in_features = min(max_features, block_expansion * (2**num_down_blocks))
for i in range(num_bottleneck_blocks):
self.bottleneck.add_module(
"r" + str(i),
ResBlock2d(in_features, kernel_size=(3, 3), padding=(1, 1)),
)
self.final = nn.Conv2d(
block_expansion, num_channels, kernel_size=(7, 7), padding=(3, 3)
)
self.estimate_occlusion_map = estimate_occlusion_map
self.num_channels = num_channels
self.enc_features = None
def deform_input(self, inp, deformation):
_, h_old, w_old, _ = deformation.shape
_, _, h, w = inp.shape
if h_old != h or w_old != w:
deformation = deformation.permute(0, 3, 1, 2)
deformation = F.interpolate(deformation, size=(h, w), mode="bilinear")
deformation = deformation.permute(0, 2, 3, 1)
return F.grid_sample(inp, deformation)
def encode_source(self, source_image):
# Encoding (downsampling) part
out = self.first(source_image)
for i in range(len(self.down_blocks)):
out = self.down_blocks[i](out)
self.enc_features = out
def forward(self, source_image, kp_driving, kp_source, optim_ret=True):
assert self.enc_features is not None, "Call encode_source()"
out = self.enc_features
# Transforming feature representation
# according to deformation and occlusion
output_dict = {}
if self.dense_motion_network is not None:
dense_motion = self.dense_motion_network(
source_image=source_image, kp_driving=kp_driving, kp_source=kp_source
)
output_dict["mask"] = dense_motion["mask"]
output_dict["sparse_deformed"] = dense_motion["sparse_deformed"]
if "occlusion_map" in dense_motion:
occlusion_map = dense_motion["occlusion_map"]
output_dict["occlusion_map"] = occlusion_map
else:
occlusion_map = None
deformation = dense_motion["deformation"]
out = self.deform_input(out, deformation)
if occlusion_map is not None:
if (out.shape[2] != occlusion_map.shape[2]) or (
out.shape[3] != occlusion_map.shape[3]
):
occlusion_map = F.interpolate(
occlusion_map, size=out.shape[2:], mode="bilinear"
)
out = out * occlusion_map
if not optim_ret:
output_dict["deformed"] = self.deform_input(source_image, deformation)
# Decoding part
out = self.bottleneck(out)
for i in range(len(self.up_blocks)):
out = self.up_blocks[i](out)
out = self.final(out)
        out = torch.sigmoid(out)
output_dict["prediction"] = out
return output_dict
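The encoder channel widths in `__init__` follow `min(max_features, block_expansion * 2**i)`; a quick check with the generator_params from the vox-adv-256 config above (block_expansion=64, max_features=512, num_down_blocks=2). `down_widths` is a hypothetical helper, not repo code:

```python
# (in_features, out_features) per DownBlock2d, computed as in __init__ above.
def down_widths(block_expansion, max_features, num_down_blocks):
    return [
        (min(max_features, block_expansion * 2**i),
         min(max_features, block_expansion * 2 ** (i + 1)))
        for i in range(num_down_blocks)
    ]

print(down_widths(64, 512, 2))  # [(64, 128), (128, 256)]
```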
================================================
FILE: src/dot/fomm/modules/keypoint_detector.py
================================================
#!/usr/bin/env python3
import torch
import torch.nn.functional as F
from torch import nn
from .util import AntiAliasInterpolation2d, Hourglass, make_coordinate_grid
class KPDetector(nn.Module):
"""
Detecting a keypoints. Return keypoint position
and jacobian near each keypoint.
"""
def __init__(
self,
block_expansion,
num_kp,
num_channels,
max_features,
num_blocks,
temperature,
estimate_jacobian=False,
scale_factor=1,
single_jacobian_map=False,
pad=0,
):
super(KPDetector, self).__init__()
self.predictor = Hourglass(
block_expansion,
in_features=num_channels,
max_features=max_features,
num_blocks=num_blocks,
)
self.kp = nn.Conv2d(
in_channels=self.predictor.out_filters,
out_channels=num_kp,
kernel_size=(7, 7),
padding=pad,
)
if estimate_jacobian:
self.num_jacobian_maps = 1 if single_jacobian_map else num_kp
self.jacobian = nn.Conv2d(
in_channels=self.predictor.out_filters,
out_channels=4 * self.num_jacobian_maps,
kernel_size=(7, 7),
padding=pad,
)
self.jacobian.weight.data.zero_()
self.jacobian.bias.data.copy_(
torch.tensor([1, 0, 0, 1] * self.num_jacobian_maps, dtype=torch.float)
)
else:
self.jacobian = None
self.temperature = temperature
self.scale_factor = scale_factor
if self.scale_factor != 1:
self.down = AntiAliasInterpolation2d(num_channels, self.scale_factor)
def gaussian2kp(self, heatmap):
"""
Extract the mean and from a heatmap
"""
shape = heatmap.shape
heatmap = heatmap.unsqueeze(-1)
grid = (
make_coordinate_grid(shape[2:], heatmap.type()).unsqueeze_(0).unsqueeze_(0)
)
value = (heatmap * grid).sum(dim=(2, 3))
kp = {"value": value}
return kp
def forward(self, x):
if self.scale_factor != 1:
x = self.down(x)
feature_map = self.predictor(x)
prediction = self.kp(feature_map)
final_shape = prediction.shape
heatmap = prediction.view(final_shape[0], final_shape[1], -1)
heatmap = F.softmax(heatmap / self.temperature, dim=2)
heatmap = heatmap.view(*final_shape)
out = self.gaussian2kp(heatmap)
if self.jacobian is not None:
jacobian_map = self.jacobian(feature_map)
jacobian_map = jacobian_map.reshape(
final_shape[0],
self.num_jacobian_maps,
4,
final_shape[2],
final_shape[3],
)
heatmap = heatmap.unsqueeze(2)
jacobian = heatmap * jacobian_map
jacobian = jacobian.view(final_shape[0], final_shape[1], 4, -1)
jacobian = jacobian.sum(dim=-1)
jacobian = jacobian.view(jacobian.shape[0], jacobian.shape[1], 2, 2)
out["jacobian"] = jacobian
return out
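`gaussian2kp` reduces a softmax-normalised heatmap to a coordinate by taking the expectation of grid positions over [-1, 1]. A 1-D analogue (`softargmax` is a hypothetical name; the 0.1 temperature mirrors kp_detector_params in the config above):

```python
import math

# Expected grid position under a temperature-softmaxed heatmap: the 1-D
# counterpart of KPDetector's softmax followed by gaussian2kp.
def softargmax(logits, temperature=0.1):
    n = len(logits)
    coords = [2 * i / (n - 1) - 1 for i in range(n)]      # grid over [-1, 1]
    exps = [math.exp(v / temperature) for v in logits]
    z = sum(exps)
    return sum(e / z * c for e, c in zip(exps, coords))

print(softargmax([0.0, 0.0, 10.0, 0.0, 0.0]))  # ~0.0, the peak is at the centre
```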
================================================
FILE: src/dot/fomm/modules/util.py
================================================
#!/usr/bin/env python3
import torch
import torch.nn.functional as F
from torch import nn
from ..sync_batchnorm.batchnorm import SynchronizedBatchNorm2d as BatchNorm2d
def kp2gaussian(kp, spatial_size, kp_variance):
"""
Transform a keypoint into gaussian like representation
"""
mean = kp["value"]
coordinate_grid = make_coordinate_grid(spatial_size, mean.type())
number_of_leading_dimensions = len(mean.shape) - 1
shape = (1,) * number_of_leading_dimensions + coordinate_grid.shape
coordinate_grid = coordinate_grid.view(*shape)
repeats = mean.shape[:number_of_leading_dimensions] + (1, 1, 1)
coordinate_grid = coordinate_grid.repeat(*repeats)
# Preprocess kp shape
shape = mean.shape[:number_of_leading_dimensions] + (1, 1, 2)
mean = mean.view(*shape)
mean_sub = coordinate_grid - mean
out = torch.exp(-0.5 * (mean_sub**2).sum(-1) / kp_variance)
return out
def make_coordinate_grid(spatial_size, type):
"""
Create a meshgrid [-1,1] x [-1,1] of given spatial_size.
"""
h, w = spatial_size
x = torch.arange(w).type(type)
y = torch.arange(h).type(type)
x = 2 * (x / (w - 1)) - 1
y = 2 * (y / (h - 1)) - 1
yy = y.view(-1, 1).repeat(1, w)
xx = x.view(1, -1).repeat(h, 1)
meshed = torch.cat([xx.unsqueeze_(2), yy.unsqueeze_(2)], 2)
return meshed
class ResBlock2d(nn.Module):
"""
Res block, preserve spatial resolution.
"""
def __init__(self, in_features, kernel_size, padding):
super(ResBlock2d, self).__init__()
self.conv1 = nn.Conv2d(
in_channels=in_features,
out_channels=in_features,
kernel_size=kernel_size,
padding=padding,
)
self.conv2 = nn.Conv2d(
in_channels=in_features,
out_channels=in_features,
kernel_size=kernel_size,
padding=padding,
)
self.norm1 = BatchNorm2d(in_features, affine=True)
self.norm2 = BatchNorm2d(in_features, affine=True)
def forward(self, x):
out = self.norm1(x)
out = F.relu(out)
out = self.conv1(out)
out = self.norm2(out)
out = F.relu(out)
out = self.conv2(out)
out += x
return out
class UpBlock2d(nn.Module):
"""
Upsampling block for use in decoder.
"""
def __init__(self, in_features, out_features, kernel_size=3, padding=1, groups=1):
super(UpBlock2d, self).__init__()
self.conv = nn.Conv2d(
in_channels=in_features,
out_channels=out_features,
kernel_size=kernel_size,
padding=padding,
groups=groups,
)
self.norm = BatchNorm2d(out_features, affine=True)
def forward(self, x):
out = F.interpolate(x, scale_factor=2)
out = self.conv(out)
out = self.norm(out)
out = F.relu(out)
return out
class DownBlock2d(nn.Module):
"""
Downsampling block for use in encoder.
"""
def __init__(self, in_features, out_features, kernel_size=3, padding=1, groups=1):
super(DownBlock2d, self).__init__()
self.conv = nn.Conv2d(
in_channels=in_features,
out_channels=out_features,
kernel_size=kernel_size,
padding=padding,
groups=groups,
)
self.norm = BatchNorm2d(out_features, affine=True)
self.pool = nn.AvgPool2d(kernel_size=(2, 2))
def forward(self, x):
out = self.conv(x)
out = self.norm(out)
out = F.relu(out)
out = self.pool(out)
return out
class SameBlock2d(nn.Module):
"""
Simple block, preserve spatial resolution.
"""
def __init__(self, in_features, out_features, groups=1, kernel_size=3, padding=1):
super(SameBlock2d, self).__init__()
self.conv = nn.Conv2d(
in_channels=in_features,
out_channels=out_features,
kernel_size=kernel_size,
padding=padding,
groups=groups,
)
self.norm = BatchNorm2d(out_features, affine=True)
def forward(self, x):
out = self.conv(x)
out = self.norm(out)
out = F.relu(out)
return out
class Encoder(nn.Module):
"""
Hourglass Encoder
"""
def __init__(self, block_expansion, in_features, num_blocks=3, max_features=256):
super(Encoder, self).__init__()
down_blocks = []
for i in range(num_blocks):
down_blocks.append(
DownBlock2d(
in_features
if i == 0
else min(max_features, block_expansion * (2**i)),
min(max_features, block_expansion * (2 ** (i + 1))),
kernel_size=3,
padding=1,
)
)
self.down_blocks = nn.ModuleList(down_blocks)
def forward(self, x):
outs = [x]
for down_block in self.down_blocks:
outs.append(down_block(outs[-1]))
return outs
class Decoder(nn.Module):
"""
Hourglass Decoder
"""
def __init__(self, block_expansion, in_features, num_blocks=3, max_features=256):
super(Decoder, self).__init__()
up_blocks = []
for i in range(num_blocks)[::-1]:
in_filters = (1 if i == num_blocks - 1 else 2) * min(
max_features, block_expansion * (2 ** (i + 1))
)
out_filters = min(max_features, block_expansion * (2**i))
up_blocks.append(
UpBlock2d(in_filters, out_filters, kernel_size=3, padding=1)
)
self.up_blocks = nn.ModuleList(up_blocks)
self.out_filters = block_expansion + in_features
def forward(self, x):
out = x.pop()
for up_block in self.up_blocks:
out = up_block(out)
skip = x.pop()
out = torch.cat([out, skip], dim=1)
return out
class Hourglass(nn.Module):
"""
Hourglass architecture.
"""
def __init__(self, block_expansion, in_features, num_blocks=3, max_features=256):
super(Hourglass, self).__init__()
self.encoder = Encoder(block_expansion, in_features, num_blocks, max_features)
self.decoder = Decoder(block_expansion, in_features, num_blocks, max_features)
self.out_filters = self.decoder.out_filters
def forward(self, x):
return self.decoder(self.encoder(x))
class AntiAliasInterpolation2d(nn.Module):
"""
Band-limited downsampling,
for better preservation of the input signal.
"""
def __init__(self, channels, scale):
super(AntiAliasInterpolation2d, self).__init__()
sigma = (1 / scale - 1) / 2
kernel_size = 2 * round(sigma * 4) + 1
self.ka = kernel_size // 2
self.kb = self.ka - 1 if kernel_size % 2 == 0 else self.ka
kernel_size = [kernel_size, kernel_size]
sigma = [sigma, sigma]
# The gaussian kernel is the product of the
# gaussian function of each dimension.
kernel = 1
meshgrids = torch.meshgrid(
[torch.arange(size, dtype=torch.float32) for size in kernel_size],
indexing="xy",
)
for size, std, mgrid in zip(kernel_size, sigma, meshgrids):
mean = (size - 1) / 2
kernel *= torch.exp(-((mgrid - mean) ** 2) / (2 * std**2))
# Make sure sum of values in gaussian kernel equals 1.
kernel = kernel / torch.sum(kernel)
# Reshape to depthwise convolutional weight
kernel = kernel.view(1, 1, *kernel.size())
kernel = kernel.repeat(channels, *[1] * (kernel.dim() - 1))
self.register_buffer("weight", kernel)
self.groups = channels
self.scale = scale
def forward(self, input):
if self.scale == 1.0:
return input
out = F.pad(input, (self.ka, self.kb, self.ka, self.kb))
out = F.conv2d(out, weight=self.weight, groups=self.groups)
out = F.interpolate(out, scale_factor=(self.scale, self.scale))
return out
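The Encoder and Decoder above must agree on channel counts for the skip connections to concatenate cleanly. A small standalone sketch of that bookkeeping, mirroring the `min(max_features, block_expansion * 2**i)` expressions in the constructors (the helper names here are illustrative, not part of the module):

```python
def encoder_channels(block_expansion, in_features, num_blocks=3, max_features=256):
    # Output channels of each DownBlock2d, mirroring Encoder.__init__.
    return [min(max_features, block_expansion * (2 ** (i + 1))) for i in range(num_blocks)]


def decoder_blocks(block_expansion, in_features, num_blocks=3, max_features=256):
    # (in_filters, out_filters) of each UpBlock2d, deepest block first,
    # mirroring Decoder.__init__.  Every block except the deepest sees its
    # input concatenated with the matching encoder skip, hence the factor 2.
    pairs = []
    for i in reversed(range(num_blocks)):
        in_f = (1 if i == num_blocks - 1 else 2) * min(
            max_features, block_expansion * (2 ** (i + 1))
        )
        out_f = min(max_features, block_expansion * (2 ** i))
        pairs.append((in_f, out_f))
    return pairs


enc = encoder_channels(32, 3)  # [64, 128, 256]
dec = decoder_blocks(32, 3)    # [(256, 128), (256, 64), (128, 32)]
# After the last UpBlock2d the result is concatenated with the raw input,
# so Decoder.out_filters = block_expansion + in_features = 32 + 3 = 35.
```

Note how each `in_filters` after the deepest block equals the previous block's `out_filters` plus the matching encoder skip, which is why `Decoder.forward` can pop skips off the encoder's output list as it goes.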
================================================
FILE: src/dot/fomm/option.py
================================================
#!/usr/bin/env python3
import os
import sys
import cv2
import numpy as np
from ..commons import ModelOption
from ..commons.cam.cam import (
draw_calib_text,
draw_face_landmarks,
draw_landmark_text,
draw_rect,
is_new_frame_better,
)
from ..commons.utils import crop, log, pad_img, resize
from .predictor_local import PredictorLocal
def determine_path():
"""
Find the script path
"""
try:
root = __file__
if os.path.islink(root):
root = os.path.realpath(root)
return os.path.dirname(os.path.abspath(root))
except Exception as e:
print(e)
print("I'm sorry, but something is wrong.")
print("There is no __file__ variable. Please contact the author.")
sys.exit()
class FOMMOption(ModelOption):
def __init__(
self,
use_gpu: bool = True,
use_mask: bool = False,
crop_size: int = 256,
gpen_type: str = None,
gpen_path: str = None,
offline: bool = False,
):
super(FOMMOption, self).__init__(
gpen_type=gpen_type,
use_gpu=use_gpu,
crop_size=crop_size,
gpen_path=gpen_path,
)
# use FOMM offline, video or image file
self.offline = offline
self.frame_proportion = 0.9
self.frame_offset_x = 0
self.frame_offset_y = 0
self.overlay_alpha = 0.0
self.preview_flip = False
self.output_flip = False
self.find_keyframe = False
        self.is_calibrated = self.offline
self.show_landmarks = False
self.passthrough = False
self.green_overlay = False
self.opt_relative = True
self.opt_adapt_scale = True
self.opt_enc_downscale = 1
self.opt_no_pad = True
self.opt_in_port = 5557
self.opt_out_port = 5558
self.opt_hide_rect = False
self.opt_in_addr = None
self.opt_out_addr = None
self.LANDMARK_SLICE_ARRAY = np.array([17, 22, 27, 31, 36, 42, 48, 60])
self.display_string = ""
def create_model(self, model_path, **kwargs) -> None: # type: ignore
opt_config = determine_path() + "/config/vox-adv-256.yaml"
opt_checkpoint = model_path
predictor_args = {
"config_path": opt_config,
"checkpoint_path": opt_checkpoint,
"relative": self.opt_relative,
"adapt_movement_scale": self.opt_adapt_scale,
"enc_downscale": self.opt_enc_downscale,
}
self.predictor = PredictorLocal(**predictor_args)
def change_option(self, image, **kwargs):
if image.ndim == 2:
image = np.tile(image[..., None], [1, 1, 3])
image = image[..., :3][..., ::-1]
image = resize(image, (self.crop_size, self.crop_size))
print("Image shape ", image.shape)
self.source_kp = self.predictor.get_frame_kp(image)
self.kp_source = None
self.predictor.set_source_image(image)
self.source_image = image
def handle_keyboard_input(self):
key = cv2.waitKey(1)
if key == ord("w"):
self.frame_proportion -= 0.05
self.frame_proportion = max(self.frame_proportion, 0.1)
elif key == ord("s"):
self.frame_proportion += 0.05
self.frame_proportion = min(self.frame_proportion, 1.0)
elif key == ord("H"):
self.frame_offset_x -= 1
elif key == ord("h"):
self.frame_offset_x -= 5
elif key == ord("K"):
self.frame_offset_x += 1
elif key == ord("k"):
self.frame_offset_x += 5
elif key == ord("J"):
self.frame_offset_y -= 1
elif key == ord("j"):
self.frame_offset_y -= 5
elif key == ord("U"):
self.frame_offset_y += 1
elif key == ord("u"):
self.frame_offset_y += 5
elif key == ord("Z"):
self.frame_offset_x = 0
self.frame_offset_y = 0
self.frame_proportion = 0.9
elif key == ord("x"):
self.predictor.reset_frames()
if not self.is_calibrated:
cv2.namedWindow("FOMM", cv2.WINDOW_GUI_NORMAL)
cv2.moveWindow("FOMM", 600, 250)
self.is_calibrated = True
self.show_landmarks = False
elif key == ord("z"):
self.overlay_alpha = max(self.overlay_alpha - 0.1, 0.0)
elif key == ord("c"):
self.overlay_alpha = min(self.overlay_alpha + 0.1, 1.0)
elif key == ord("r"):
self.preview_flip = not self.preview_flip
elif key == ord("t"):
self.output_flip = not self.output_flip
elif key == ord("f"):
self.find_keyframe = not self.find_keyframe
elif key == ord("o"):
self.show_landmarks = not self.show_landmarks
elif key == 48:
self.passthrough = not self.passthrough
elif key != -1:
log(key)
    def process_image(self, image, use_gpu=True, **kwargs) -> np.ndarray:
if not self.offline:
self.handle_keyboard_input()
stream_img_size = image.shape[1], image.shape[0]
frame = image[..., ::-1]
frame, (frame_offset_x, frame_offset_y) = crop(
frame,
p=self.frame_proportion,
offset_x=self.frame_offset_x,
offset_y=self.frame_offset_y,
)
frame = resize(frame, (self.crop_size, self.crop_size))[..., :3]
if self.find_keyframe:
if is_new_frame_better(log, self.source_image, frame, self.predictor):
log("Taking new frame!")
self.green_overlay = True
self.predictor.reset_frames()
if self.passthrough:
out = frame
elif self.is_calibrated:
out = self.predictor.predict(frame)
if out is None:
log("predict returned None")
else:
out = None
if self.overlay_alpha > 0:
preview_frame = cv2.addWeighted(
self.source_image,
self.overlay_alpha,
frame,
1.0 - self.overlay_alpha,
0.0,
)
else:
preview_frame = frame.copy()
if self.show_landmarks:
# Dim the background to make it easier to see the landmarks
preview_frame = cv2.convertScaleAbs(preview_frame, alpha=0.5, beta=0.0)
draw_face_landmarks(
self.LANDMARK_SLICE_ARRAY, preview_frame, self.source_kp, (200, 20, 10)
)
frame_kp = self.predictor.get_frame_kp(frame)
draw_face_landmarks(self.LANDMARK_SLICE_ARRAY, preview_frame, frame_kp)
preview_frame = cv2.flip(preview_frame, 1)
if self.green_overlay:
green_alpha = 0.8
overlay = preview_frame.copy()
overlay[:] = (0, 255, 0)
preview_frame = cv2.addWeighted(
preview_frame, green_alpha, overlay, 1.0 - green_alpha, 0.0
)
if self.find_keyframe:
preview_frame = cv2.putText(
preview_frame,
self.display_string,
(10, 220),
0,
0.5 * self.crop_size / 256,
(255, 255, 255),
1,
)
if not self.is_calibrated:
preview_frame = draw_calib_text(preview_frame)
elif self.show_landmarks:
preview_frame = draw_landmark_text(preview_frame)
if not self.opt_hide_rect:
draw_rect(preview_frame)
if not self.offline:
cv2.imshow("FOMM", preview_frame[..., ::-1])
if out is not None:
if not self.opt_no_pad:
out = pad_img(out, stream_img_size)
if self.output_flip:
out = cv2.flip(out, 1)
return out[..., ::-1]
else:
return preview_frame[..., ::-1]
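`process_image` above converts between OpenCV's BGR channel order and the model's RGB order with the `[..., ::-1]` slice, once on the way in and once on the way out. A minimal NumPy sketch of that idiom (the pixel values are illustrative):

```python
import numpy as np

# A hypothetical 1x2 BGR image: one pure-blue pixel, one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the B and R channels (a view, no copy).
rgb = bgr[..., ::-1]

print(rgb[0, 0].tolist())  # [0, 0, 255] -- the blue pixel in RGB order
print(rgb[0, 1].tolist())  # [255, 0, 0] -- the red pixel in RGB order
```

Applying the same slice twice is a round trip, which is why the method can flip at the entry and exit boundaries and keep the interior pipeline purely RGB.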
================================================
FILE: src/dot/fomm/predictor_local.py
================================================
#!/usr/bin/env python3
import numpy as np
import torch
import yaml
from scipy.spatial import ConvexHull
from . import face_alignment
from .modules.generator_optim import OcclusionAwareGenerator
from .modules.keypoint_detector import KPDetector
def normalize_kp(
kp_source,
kp_driving,
kp_driving_initial,
adapt_movement_scale=False,
use_relative_movement=False,
SYMBOL INDEX (552 symbols across 64 files)
FILE: scripts/image_swap.py
function main (line 33) | def main(
function find_images_from_path (line 76) | def find_images_from_path(path):
FILE: scripts/metadata_swap.py
function main (line 53) | def main(
function format_swaps (line 194) | def format_swaps(succeeds):
FILE: scripts/profile_simswap.py
function main (line 31) | def main(
function find_images_from_path (line 65) | def find_images_from_path(path):
FILE: scripts/video_swap.py
function main (line 31) | def main(
FILE: src/dot/__main__.py
function run (line 16) | def run(
function main (line 183) | def main(
FILE: src/dot/commons/cam/cam.py
function is_new_frame_better (line 15) | def is_new_frame_better(log, source, driving, predictor):
function load_stylegan_avatar (line 44) | def load_stylegan_avatar(IMG_SIZE=256):
function load_images (line 58) | def load_images(log, opt_avatars, IMG_SIZE=256):
function draw_rect (line 78) | def draw_rect(img, rw=0.6, rh=0.8, color=(255, 0, 0), thickness=2):
function kp_to_pixels (line 87) | def kp_to_pixels(arr):
function draw_face_landmarks (line 92) | def draw_face_landmarks(LANDMARK_SLICE_ARRAY, img, face_kp, color=(20, 8...
function print_help (line 100) | def print_help(avatar_names):
function draw_fps (line 119) | def draw_fps(
function draw_landmark_text (line 210) | def draw_landmark_text(frame, thk=2, fontsz=0.5, color=(0, 0, 255), IMG_...
function draw_calib_text (line 220) | def draw_calib_text(frame, thk=2, fontsz=0.5, color=(0, 0, 255), IMG_SIZ...
function select_camera (line 233) | def select_camera(log, config):
FILE: src/dot/commons/cam/camera_selector.py
function query_cameras (line 12) | def query_cameras(n_cams):
function make_grid (line 40) | def make_grid(images, cell_size=(320, 240), cols=2):
function mouse_callback (line 57) | def mouse_callback(event, x, y, flags, userdata):
function select_camera (line 68) | def select_camera(cam_frames, window="Camera selector"):
FILE: src/dot/commons/camera_utils.py
function fetch_camera (line 17) | def fetch_camera(target: int) -> VideoCaptureAsync:
function camera_pipeline (line 35) | def camera_pipeline(
FILE: src/dot/commons/model_option.py
class ModelOption (line 20) | class ModelOption(ABC):
method __init__ (line 21) | def __init__(
method generate_from_image (line 58) | def generate_from_image(
method generate_from_camera (line 166) | def generate_from_camera(
method generate_from_video (line 196) | def generate_from_video(
method post_process_image (line 229) | def post_process_image(self, image, **kwargs):
method change_option (line 238) | def change_option(self, image, **kwargs):
method process_image (line 242) | def process_image(self, image, **kwargs):
method create_model (line 246) | def create_model(self, source, target, limit=None, swap_case_idx=0, **...
FILE: src/dot/commons/pose/head_pose.py
function pose_estimation (line 23) | def pose_estimation(
FILE: src/dot/commons/utils.py
function log (line 18) | def log(*args, **kwargs):
function info (line 23) | def info(*args, file=sys.stdout, **kwargs):
function find_images_from_path (line 27) | def find_images_from_path(path):
function find_files_from_path (line 47) | def find_files_from_path(path: str, ext: List, filter: str = None):
function expand_bbox (line 70) | def expand_bbox(
function rand_idx_tuple (line 93) | def rand_idx_tuple(source_len, target_len):
function generate_random_file_idx (line 100) | def generate_random_file_idx(length):
class Tee (line 104) | class Tee(object):
method __init__ (line 105) | def __init__(self, filename, mode="w", terminal=sys.stderr):
method __del__ (line 109) | def __del__(self):
method write (line 112) | def write(self, *args, **kwargs):
method __call__ (line 116) | def __call__(self, *args, **kwargs):
method flush (line 119) | def flush(self):
class Logger (line 123) | class Logger:
method __init__ (line 124) | def __init__(self, filename, verbose=True):
method __call__ (line 128) | def __call__(self, *args, important=False, **kwargs):
class Once (line 135) | class Once:
method __init__ (line 138) | def __init__(self, what, who=log, per=1e12):
class TicToc (line 151) | class TicToc:
method __init__ (line 152) | def __init__(self):
method tic (line 156) | def tic(self):
method toc (line 159) | def toc(self, total=False):
method tocp (line 166) | def tocp(self, str):
class AccumDict (line 172) | class AccumDict:
method __init__ (line 173) | def __init__(self, num_f=3):
method add (line 177) | def add(self, k, v):
method __dict__ (line 180) | def __dict__(self):
method __getitem__ (line 183) | def __getitem__(self, key):
method __str__ (line 186) | def __str__(self):
method __repr__ (line 199) | def __repr__(self):
function clamp (line 203) | def clamp(value, min_value, max_value):
function crop (line 207) | def crop(img, p=0.7, offset_x=0, offset_y=0):
function pad_img (line 226) | def pad_img(img, target_size, default_pad=0):
function resize (line 238) | def resize(img, size, version="cv"):
function determine_path (line 242) | def determine_path():
FILE: src/dot/commons/video/video_utils.py
function _crop_and_pose (line 25) | def _crop_and_pose(
function video_pipeline (line 75) | def video_pipeline(
FILE: src/dot/commons/video/videocaptureasync.py
class VideoCaptureAsync (line 12) | class VideoCaptureAsync:
method __init__ (line 13) | def __init__(self, src=0, width=640, height=480):
method set (line 26) | def set(self, var1, var2):
method isOpened (line 29) | def isOpened(self):
method start (line 32) | def start(self):
method update (line 55) | def update(self):
method read (line 64) | def read(self):
method stop (line 72) | def stop(self):
method __exit__ (line 76) | def __exit__(self, exec_type, exc_value, traceback):
FILE: src/dot/dot.py
class DOT (line 18) | class DOT:
method __init__ (line 33) | def __init__(
method build_option (line 61) | def build_option(
method generate (line 104) | def generate(
method simswap (line 151) | def simswap(
method faceswap_cv2 (line 179) | def faceswap_cv2(
method fomm (line 200) | def fomm(
FILE: src/dot/faceswap_cv2/generic.py
function bilinear_interpolate (line 8) | def bilinear_interpolate(img, coords):
function grid_coordinates (line 33) | def grid_coordinates(points):
function process_warp (line 49) | def process_warp(src_img, result_img, tri_affines, dst_points, delaunay):
function triangular_affine_matrices (line 70) | def triangular_affine_matrices(vertices, src_points, dst_points):
function warp_image_3d (line 87) | def warp_image_3d(src_img, src_points, dst_points, dst_shape, dtype=np.u...
function transformation_from_points (line 101) | def transformation_from_points(points1, points2):
function warp_image_2d (line 126) | def warp_image_2d(im, M, dshape):
function mask_from_points (line 140) | def mask_from_points(size, points, erode_flag=1):
function correct_colours (line 152) | def correct_colours(im1, im2, landmarks1):
function apply_mask (line 181) | def apply_mask(img, mask):
FILE: src/dot/faceswap_cv2/option.py
class FaceswapCVOption (line 12) | class FaceswapCVOption(ModelOption):
method __init__ (line 13) | def __init__(
method create_model (line 31) | def create_model(self, model_path, **kwargs) -> None: # type: ignore
method change_option (line 39) | def change_option(self, image, **kwargs):
method process_image (line 45) | def process_image(
FILE: src/dot/faceswap_cv2/swap.py
class Swap (line 23) | class Swap:
method __init__ (line 24) | def __init__(
method apply_face_swap (line 54) | def apply_face_swap(self, source_image, target_image, save_path=None, ...
method _face_and_landmark_detection (line 144) | def _face_and_landmark_detection(self, image):
method _process_face (line 162) | def _process_face(self, image, r=10):
method _perform_base_blending (line 185) | def _perform_base_blending(mask, trg_face, warped_src_face):
method from_config (line 202) | def from_config(cls, config: Dict[str, Any]) -> "Swap":
FILE: src/dot/fomm/face_alignment.py
class LandmarksType (line 12) | class LandmarksType(IntEnum):
class NetworkSize (line 26) | class NetworkSize(IntEnum):
class FaceAlignment (line 53) | class FaceAlignment:
method __init__ (line 54) | def __init__(
method get_landmarks (line 121) | def get_landmarks(
method get_landmarks_from_image (line 144) | def get_landmarks_from_image(
method get_landmarks_from_batch (line 244) | def get_landmarks_from_batch(
method get_landmarks_from_directory (line 315) | def get_landmarks_from_directory(
FILE: src/dot/fomm/modules/dense_motion.py
class DenseMotionNetwork (line 10) | class DenseMotionNetwork(nn.Module):
method __init__ (line 17) | def __init__(
method create_heatmap_representations (line 55) | def create_heatmap_representations(self, source_image, kp_driving, kp_...
method create_sparse_motions (line 77) | def create_sparse_motions(self, source_image, kp_driving, kp_source):
method create_deformed_source_image (line 105) | def create_deformed_source_image(self, source_image, sparse_motions):
method forward (line 121) | def forward(self, source_image, kp_driving, kp_source):
FILE: src/dot/fomm/modules/generator_optim.py
class OcclusionAwareGenerator (line 11) | class OcclusionAwareGenerator(nn.Module):
method __init__ (line 18) | def __init__(
method deform_input (line 86) | def deform_input(self, inp, deformation):
method encode_source (line 95) | def encode_source(self, source_image):
method forward (line 103) | def forward(self, source_image, kp_driving, kp_source, optim_ret=True):
FILE: src/dot/fomm/modules/keypoint_detector.py
class KPDetector (line 10) | class KPDetector(nn.Module):
method __init__ (line 16) | def __init__(
method gaussian2kp (line 66) | def gaussian2kp(self, heatmap):
method forward (line 80) | def forward(self, x):
FILE: src/dot/fomm/modules/util.py
function kp2gaussian (line 10) | def kp2gaussian(kp, spatial_size, kp_variance):
function make_coordinate_grid (line 34) | def make_coordinate_grid(spatial_size, type):
class ResBlock2d (line 53) | class ResBlock2d(nn.Module):
method __init__ (line 58) | def __init__(self, in_features, kernel_size, padding):
method forward (line 77) | def forward(self, x):
class UpBlock2d (line 88) | class UpBlock2d(nn.Module):
method __init__ (line 93) | def __init__(self, in_features, out_features, kernel_size=3, padding=1...
method forward (line 107) | def forward(self, x):
class DownBlock2d (line 115) | class DownBlock2d(nn.Module):
method __init__ (line 120) | def __init__(self, in_features, out_features, kernel_size=3, padding=1...
method forward (line 134) | def forward(self, x):
class SameBlock2d (line 142) | class SameBlock2d(nn.Module):
method __init__ (line 147) | def __init__(self, in_features, out_features, groups=1, kernel_size=3,...
method forward (line 160) | def forward(self, x):
class Encoder (line 167) | class Encoder(nn.Module):
method __init__ (line 172) | def __init__(self, block_expansion, in_features, num_blocks=3, max_fea...
method forward (line 191) | def forward(self, x):
class Decoder (line 198) | class Decoder(nn.Module):
method __init__ (line 203) | def __init__(self, block_expansion, in_features, num_blocks=3, max_fea...
method forward (line 221) | def forward(self, x):
class Hourglass (line 230) | class Hourglass(nn.Module):
method __init__ (line 235) | def __init__(self, block_expansion, in_features, num_blocks=3, max_fea...
method forward (line 244) | def forward(self, x):
class AntiAliasInterpolation2d (line 248) | class AntiAliasInterpolation2d(nn.Module):
method __init__ (line 254) | def __init__(self, channels, scale):
method forward (line 284) | def forward(self, input):
FILE: src/dot/fomm/option.py
function determine_path (line 21) | def determine_path():
class FOMMOption (line 38) | class FOMMOption(ModelOption):
method __init__ (line 39) | def __init__(
method create_model (line 81) | def create_model(self, model_path, **kwargs) -> None: # type: ignore
method change_option (line 95) | def change_option(self, image, **kwargs):
method handle_keyboard_input (line 106) | def handle_keyboard_input(self):
method process_image (line 161) | def process_image(self, image, use_gpu=True, **kwargs) -> np.array:
FILE: src/dot/fomm/predictor_local.py
function normalize_kp (line 13) | def normalize_kp(
function to_tensor (line 47) | def to_tensor(a):
class PredictorLocal (line 51) | class PredictorLocal:
method __init__ (line 52) | def __init__(
method load_checkpoints (line 83) | def load_checkpoints(self):
method reset_frames (line 108) | def reset_frames(self):
method set_source_image (line 111) | def set_source_image(self, source_image):
method predict (line 126) | def predict(self, driving_frame):
method get_frame_kp (line 156) | def get_frame_kp(self, image):
method normalize_alignment_kp (line 166) | def normalize_alignment_kp(kp):
method get_start_frame (line 173) | def get_start_frame(self):
method get_start_frame_kp (line 176) | def get_start_frame_kp(self):
FILE: src/dot/fomm/sync_batchnorm/batchnorm.py
function _sum_ft (line 23) | def _sum_ft(tensor):
function _unsqueeze_ft (line 28) | def _unsqueeze_ft(tensor):
class _SynchronizedBatchNorm (line 38) | class _SynchronizedBatchNorm(_BatchNorm):
method __init__ (line 39) | def __init__(self, num_features, eps=1e-5, momentum=0.1, affine=True):
method forward (line 51) | def forward(self, input):
method __data_parallel_replicate__ (line 97) | def __data_parallel_replicate__(self, ctx, copy_id):
method _data_parallel_master (line 107) | def _data_parallel_master(self, intermediates):
method _compute_mean_std (line 131) | def _compute_mean_std(self, sum_, ssum, size):
class SynchronizedBatchNorm2d (line 154) | class SynchronizedBatchNorm2d(_SynchronizedBatchNorm):
method _check_input_dim (line 213) | def _check_input_dim(self, input):
FILE: src/dot/fomm/sync_batchnorm/comm.py
class FutureResult (line 19) | class FutureResult(object):
method __init__ (line 25) | def __init__(self):
method put (line 30) | def put(self, result):
method get (line 36) | def get(self):
class SlavePipe (line 52) | class SlavePipe(_SlavePipeBase):
method run_slave (line 57) | def run_slave(self, msg):
class SyncMaster (line 64) | class SyncMaster(object):
method __init__ (line 80) | def __init__(self, master_callback):
method __getstate__ (line 91) | def __getstate__(self):
method __setstate__ (line 94) | def __setstate__(self, state):
method register_slave (line 97) | def register_slave(self, identifier):
method run_master (line 117) | def run_master(self, master_msg):
method nr_slaves (line 154) | def nr_slaves(self):
FILE: src/dot/gpen/__init_paths.py
function add_path (line 11) | def add_path(path):
FILE: src/dot/gpen/align_faces.py
function _umeyama (line 25) | def _umeyama(src, dst, estimate_scale=True, scale=1.0):
class FaceWarpException (line 96) | class FaceWarpException(Exception):
method __str__ (line 97) | def __str__(self):
function get_reference_facial_points (line 101) | def get_reference_facial_points(
function get_affine_transform_matrix (line 185) | def get_affine_transform_matrix(src_pts, dst_pts):
function warp_and_crop_face (line 202) | def warp_and_crop_face(
FILE: src/dot/gpen/face_enhancement.py
class FaceEnhancement (line 18) | class FaceEnhancement(object):
method __init__ (line 19) | def __init__(
method process (line 54) | def process(self, img, use_gpu=True):
FILE: src/dot/gpen/face_model/face_gan.py
class FaceGAN (line 16) | class FaceGAN(object):
method __init__ (line 17) | def __init__(
method load_model (line 40) | def load_model(self, channel_multiplier=2, narrow=1, use_gpu=True):
method process (line 55) | def process(self, img, use_gpu=True):
method img2tensor (line 66) | def img2tensor(self, img, use_gpu=True):
method tensor2img (line 76) | def tensor2img(self, img_t, pmax=255.0, imtype=np.uint8):
FILE: src/dot/gpen/face_model/model.py
class PixelNorm (line 20) | class PixelNorm(nn.Module):
method __init__ (line 21) | def __init__(self):
method forward (line 24) | def forward(self, input):
function make_kernel (line 28) | def make_kernel(k):
class Upsample (line 39) | class Upsample(nn.Module):
method __init__ (line 40) | def __init__(self, kernel, factor=2):
method forward (line 54) | def forward(self, input):
class Downsample (line 60) | class Downsample(nn.Module):
method __init__ (line 61) | def __init__(self, kernel, factor=2):
method forward (line 75) | def forward(self, input):
class Blur (line 81) | class Blur(nn.Module):
method __init__ (line 82) | def __init__(self, kernel, pad, upsample_factor=1):
method forward (line 94) | def forward(self, input):
class EqualConv2d (line 100) | class EqualConv2d(nn.Module):
method __init__ (line 101) | def __init__(
method forward (line 120) | def forward(self, input):
method __repr__ (line 131) | def __repr__(self):
class EqualLinear (line 138) | class EqualLinear(nn.Module):
method __init__ (line 139) | def __init__(
method forward (line 157) | def forward(self, input):
method __repr__ (line 169) | def __repr__(self):
class ScaledLeakyReLU (line 175) | class ScaledLeakyReLU(nn.Module):
method __init__ (line 176) | def __init__(self, negative_slope=0.2):
method forward (line 181) | def forward(self, input):
class ModulatedConv2d (line 187) | class ModulatedConv2d(nn.Module):
method __init__ (line 188) | def __init__(
method __repr__ (line 236) | def __repr__(self):
method forward (line 242) | def forward(self, input, style):
class NoiseInjection (line 286) | class NoiseInjection(nn.Module):
method __init__ (line 287) | def __init__(self, isconcat=True):
method forward (line 293) | def forward(self, image, noise=None):
class ConstantInput (line 304) | class ConstantInput(nn.Module):
method __init__ (line 305) | def __init__(self, channel, size=4):
method forward (line 310) | def forward(self, input):
class StyledConv (line 317) | class StyledConv(nn.Module):
method __init__ (line 318) | def __init__(
method forward (line 345) | def forward(self, input, style, noise=None):
class ToRGB (line 353) | class ToRGB(nn.Module):
method __init__ (line 354) | def __init__(self, in_channel, style_dim, upsample=True, blur_kernel=[...
method forward (line 363) | def forward(self, input, style, skip=None):
class Generator (line 375) | class Generator(nn.Module):
method __init__ (line 376) | def __init__(
method make_noise (line 470) | def make_noise(self):
method mean_latent (line 481) | def mean_latent(self, n_latent):
method get_latent (line 489) | def get_latent(self, input):
method forward (line 492) | def forward(
class ConvLayer (line 567) | class ConvLayer(nn.Sequential):
method __init__ (line 568) | def __init__(
class ResBlock (line 616) | class ResBlock(nn.Module):
method __init__ (line 617) | def __init__(self, in_channel, out_channel, blur_kernel=[1, 3, 3, 1]):
method forward (line 627) | def forward(self, input):
class FullGenerator (line 637) | class FullGenerator(nn.Module):
method __init__ (line 638) | def __init__(
method forward (line 688) | def forward(
class Discriminator (line 720) | class Discriminator(nn.Module):
method __init__ (line 721) | def __init__(self, size, channel_multiplier=2, blur_kernel=[1, 3, 3, 1...
method forward (line 760) | def forward(self, input):
FILE: src/dot/gpen/face_model/op/fused_act.py
class FusedLeakyReLUFunctionBackward (line 43) | class FusedLeakyReLUFunctionBackward(Function):
method forward (line 45) | def forward(ctx, grad_output, out, negative_slope, scale):
method backward (line 66) | def backward(ctx, gradgrad_input, gradgrad_bias):
class FusedLeakyReLUFunction (line 75) | class FusedLeakyReLUFunction(Function):
method forward (line 77) | def forward(ctx, input, bias, negative_slope, scale):
method backward (line 87) | def backward(ctx, grad_output):
class FusedLeakyReLU (line 97) | class FusedLeakyReLU(nn.Module):
method __init__ (line 98) | def __init__(self, channel, negative_slope=0.2, scale=2**0.5):
method forward (line 105) | def forward(self, input):
function fused_leaky_relu (line 109) | def fused_leaky_relu(input, bias, negative_slope=0.2, scale=2**0.5):
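The fused_act variants all compute the same elementwise op. A rough NumPy sketch of the math the fused kernel performs (bias add, LeakyReLU, variance-preserving rescale); the name here is illustrative, not the extension's API:

```python
import numpy as np

def fused_leaky_relu_ref(x, bias, negative_slope=0.2, scale=2**0.5):
    """NumPy sketch of the fused bias + LeakyReLU + rescale op.

    The real implementation dispatches to a CUDA kernel (or falls back
    to torch ops in the *_v2 files); this only mirrors the math.
    """
    y = x + bias                                 # add per-channel bias
    y = np.where(y >= 0, y, y * negative_slope)  # LeakyReLU
    return y * scale                             # rescale, default sqrt(2)
```

The `scale=2**0.5` default compensates for the variance lost by the LeakyReLU's negative branch.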
FILE: src/dot/gpen/face_model/op/fused_act_v2.py
class FusedLeakyReLU_v2 (line 12) | class FusedLeakyReLU_v2(nn.Module):
method __init__ (line 13) | def __init__(self, channel, negative_slope=0.2, scale=2**0.5):
method forward (line 20) | def forward(self, input):
function fused_leaky_relu_v2 (line 24) | def fused_leaky_relu_v2(input, bias, negative_slope=0.2, scale=2**0.5):
FILE: src/dot/gpen/face_model/op/fused_bias_act.cpp
function fused_bias_act (line 13) | torch::Tensor fused_bias_act(const torch::Tensor& input, const torch::Te...
function PYBIND11_MODULE (line 21) | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
FILE: src/dot/gpen/face_model/op/upfirdn2d.cpp
function upfirdn2d (line 14) | torch::Tensor upfirdn2d(const torch::Tensor& input, const torch::Tensor&...
function PYBIND11_MODULE (line 23) | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
FILE: src/dot/gpen/face_model/op/upfirdn2d.py
class UpFirDn2dBackward (line 22) | class UpFirDn2dBackward(Function):
method forward (line 24) | def forward(
method backward (line 66) | def backward(ctx, gradgrad_input):
class UpFirDn2d (line 90) | class UpFirDn2d(Function):
method forward (line 92) | def forward(ctx, input, kernel, up, down, pad):
method backward (line 128) | def backward(ctx, grad_output):
function upfirdn2d (line 146) | def upfirdn2d(input, kernel, up=1, down=1, pad=(0, 0)):
function upfirdn2d_native (line 154) | def upfirdn2d_native(
FILE: src/dot/gpen/face_model/op/upfirdn2d_v2.py
function upfirdn2d_v2 (line 11) | def upfirdn2d_v2(input, kernel, up=1, down=1, pad=(0, 0)):
function upfirdn2d_native (line 19) | def upfirdn2d_native(
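upfirdn stands for "upsample, FIR filter, downsample", which is easiest to see in 1-D. A NumPy sketch, assuming a symmetric kernel (true for the [1, 3, 3, 1] blur used throughout this codebase); the actual op works in 2-D:

```python
import numpy as np

def upfirdn_1d(x, kernel, up=1, down=1, pad=(0, 0)):
    """1-D sketch of the upfirdn pattern: zero-stuff upsample, pad,
    FIR-filter, then decimate. Convolution vs. correlation coincide
    for the symmetric blur kernels used in this repo."""
    y = np.zeros(len(x) * up)
    y[::up] = x                               # upsample by zero insertion
    y = np.pad(y, pad)                        # pad both ends
    y = np.convolve(y, kernel, mode="valid")  # FIR filter
    return y[::down]                          # downsample
```

With `up=1`, `down=1`, and a unit kernel this is the identity; with a blur kernel and `up=2` it becomes the anti-aliased upsampling used by the StyleGAN-family layers listed above.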
FILE: src/dot/gpen/retinaface/data/data_augment.py
function _crop (line 11) | def _crop(image, boxes, labels, landm, img_dim):
function _distort (line 83) | def _distort(image):
function _expand (line 143) | def _expand(image, boxes, fill, p):
function _mirror (line 167) | def _mirror(image, boxes, landms):
function _pad_to_square (line 189) | def _pad_to_square(image, rgb_mean, pad_image_flag):
function _resize_subtract_mean (line 200) | def _resize_subtract_mean(image, insize, rgb_mean):
class preproc (line 215) | class preproc(object):
method __init__ (line 216) | def __init__(self, img_dim, rgb_means):
method __call__ (line 220) | def __call__(self, image, targets):
FILE: src/dot/gpen/retinaface/data/wider_face.py
class WiderFaceDetection (line 9) | class WiderFaceDetection(data.Dataset):
method __init__ (line 10) | def __init__(self, txt_path, preproc=None):
method __len__ (line 37) | def __len__(self):
method __getitem__ (line 40) | def __getitem__(self, index):
function detection_collate (line 80) | def detection_collate(batch):
FILE: src/dot/gpen/retinaface/facemodels/net.py
function conv_bn (line 8) | def conv_bn(inp, oup, stride=1, leaky=0):
function conv_bn_no_relu (line 16) | def conv_bn_no_relu(inp, oup, stride):
function conv_bn1X1 (line 23) | def conv_bn1X1(inp, oup, stride, leaky=0):
function conv_dw (line 31) | def conv_dw(inp, oup, stride, leaky=0.1):
class SSH (line 42) | class SSH(nn.Module):
method __init__ (line 43) | def __init__(self, in_channel, out_channel):
method forward (line 59) | def forward(self, input):
class FPN (line 73) | class FPN(nn.Module):
method __init__ (line 74) | def __init__(self, in_channels_list, out_channels):
method forward (line 92) | def forward(self, input):
class MobileNetV1 (line 115) | class MobileNetV1(nn.Module):
method __init__ (line 116) | def __init__(self):
method forward (line 141) | def forward(self, x):
FILE: src/dot/gpen/retinaface/facemodels/retinaface.py
class ClassHead (line 13) | class ClassHead(nn.Module):
method __init__ (line 14) | def __init__(self, inchannels=512, num_anchors=3):
method forward (line 21) | def forward(self, x):
class BboxHead (line 28) | class BboxHead(nn.Module):
method __init__ (line 29) | def __init__(self, inchannels=512, num_anchors=3):
method forward (line 35) | def forward(self, x):
class LandmarkHead (line 42) | class LandmarkHead(nn.Module):
method __init__ (line 43) | def __init__(self, inchannels=512, num_anchors=3):
method forward (line 49) | def forward(self, x):
class RetinaFace (line 56) | class RetinaFace(nn.Module):
method __init__ (line 57) | def __init__(self, cfg=None, phase="train"):
method _make_class_head (line 104) | def _make_class_head(self, fpn_num=3, inchannels=64, anchor_num=2):
method _make_bbox_head (line 110) | def _make_bbox_head(self, fpn_num=3, inchannels=64, anchor_num=2):
method _make_landmark_head (line 116) | def _make_landmark_head(self, fpn_num=3, inchannels=64, anchor_num=2):
method forward (line 122) | def forward(self, inputs):
FILE: src/dot/gpen/retinaface/layers/functions/prior_box.py
class PriorBox (line 9) | class PriorBox(object):
method __init__ (line 10) | def __init__(self, cfg, image_size=None, phase="train"):
method forward (line 22) | def forward(self):
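PriorBox tiles fixed anchor boxes over each feature-map cell. A sketch of that generation loop; the field names follow the usual SSD convention (min_sizes and steps as in retinaface/data/config.py), but the function itself is illustrative:

```python
from itertools import product

def make_priors(feat_h, feat_w, step, min_sizes, img_h, img_w):
    """Illustrative SSD-style prior generation: one (cx, cy, w, h) box
    per min_size, centered on every feature-map cell, in normalized
    image coordinates."""
    priors = []
    for i, j in product(range(feat_h), range(feat_w)):
        for m in min_sizes:
            cx = (j + 0.5) * step / img_w   # cell center, x
            cy = (i + 0.5) * step / img_h   # cell center, y
            priors.append([cx, cy, m / img_w, m / img_h])
    return priors
```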
FILE: src/dot/gpen/retinaface/layers/modules/multibox_loss.py
class MultiBoxLoss (line 13) | class MultiBoxLoss(nn.Module):
method __init__ (line 36) | def __init__(
method forward (line 58) | def forward(self, predictions, priors, targets):
FILE: src/dot/gpen/retinaface/retinaface_detection.py
class RetinaFaceDetection (line 20) | class RetinaFaceDetection(object):
method __init__ (line 21) | def __init__(self, base_dir, network="RetinaFace-R50", use_gpu=True):
method check_keys (line 38) | def check_keys(self, pretrained_state_dict):
method remove_prefix (line 45) | def remove_prefix(self, state_dict, prefix):
method load_model (line 54) | def load_model(self, load_to_cpu=False):
method detect (line 74) | def detect(
FILE: src/dot/gpen/retinaface/utils/box_utils.py
function point_form (line 7) | def point_form(boxes):
function center_size (line 24) | def center_size(boxes):
function intersect (line 37) | def intersect(box_a, box_b):
function jaccard (line 62) | def jaccard(box_a, box_b):
function matrix_iou (line 89) | def matrix_iou(a, b):
function matrix_iof (line 102) | def matrix_iof(a, b):
function match (line 114) | def match(
function encode (line 173) | def encode(matched, priors, variances):
function encode_landm (line 197) | def encode_landm(matched, priors, variances):
function decode (line 226) | def decode(loc, priors, variances):
function decode_landm (line 251) | def decode_landm(pre, priors, variances):
function log_sum_exp (line 276) | def log_sum_exp(x):
function nms (line 290) | def nms(boxes, scores, overlap=0.5, top_k=200):
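Of the helpers above, decode() is the one most often needed downstream: it turns regression offsets back into corner-form boxes. A NumPy sketch of that SSD-style decoding, assuming the conventional variances (0.1, 0.2):

```python
import numpy as np

def decode_boxes(loc, priors, variances=(0.1, 0.2)):
    """Apply predicted offsets `loc` to (cx, cy, w, h) priors and return
    corner-form (x1, y1, x2, y2) boxes; a NumPy sketch of decode()."""
    centers = priors[:, :2] + loc[:, :2] * variances[0] * priors[:, 2:]
    sizes = priors[:, 2:] * np.exp(loc[:, 2:] * variances[1])
    return np.concatenate([centers - sizes / 2, centers + sizes / 2], axis=1)
```

A zero offset simply converts the prior from center form to corner form, which is a handy sanity check.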
FILE: src/dot/gpen/retinaface/utils/nms/py_cpu_nms.py
function py_cpu_nms (line 13) | def py_cpu_nms(dets, thresh):
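py_cpu_nms implements standard greedy suppression. In outline (a self-contained sketch with its own IoU helper, not the vectorized original):

```python
def _iou(a, b):
    """IoU of two corner-form boxes [x1, y1, x2, y2]."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def greedy_nms(boxes, scores, thresh):
    """Keep the best-scoring box, drop boxes overlapping it above
    `thresh`, repeat; the same contract as py_cpu_nms."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best, *order = order
        keep.append(best)
        order = [j for j in order if _iou(boxes[best], boxes[j]) <= thresh]
    return keep
```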
FILE: src/dot/gpen/retinaface/utils/timer.py
class Timer (line 13) | class Timer(object):
method __init__ (line 16) | def __init__(self):
method tic (line 23) | def tic(self):
method toc (line 28) | def toc(self, average=True):
method clear (line 38) | def clear(self):
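The Timer above is a plain tic/toc accumulator; a minimal illustrative version of the same interface:

```python
import time

class TicTocTimer:
    """Tic/toc timer with a running average, mirroring the interface
    listed for utils/timer.py (illustrative, not the exact code)."""
    def __init__(self):
        self.total_time = 0.0
        self.calls = 0
        self._start = 0.0

    def tic(self):
        self._start = time.time()

    def toc(self, average=True):
        diff = time.time() - self._start
        self.total_time += diff
        self.calls += 1
        return self.total_time / self.calls if average else diff

    def clear(self):
        self.__init__()
```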
FILE: src/dot/simswap/fs_model.py
function determine_path (line 11) | def determine_path():
class fsModel (line 33) | class fsModel(BaseModel):
method name (line 34) | def name(self):
method initialize (line 37) | def initialize(
method forward (line 91) | def forward(self, img_id, img_att, latent_id, latent_att, for_G=False):
function create_model (line 97) | def create_model(
FILE: src/dot/simswap/mediapipe/face_mesh.py
class FaceMesh (line 16) | class FaceMesh:
method __init__ (line 33) | def __init__(
method _get_centroid (line 48) | def _get_centroid(self, landmarks: List[NormalizedLandmark]) -> Tuple[...
method get_face_landmarks (line 62) | def get_face_landmarks(self, image: np.ndarray) -> Optional[np.array]:
method get (line 155) | def get(
FILE: src/dot/simswap/mediapipe/utils/face_align_ffhqandnewarc.py
function estimate_norm (line 81) | def estimate_norm(lmk, image_size=112, mode="ffhq"):
function norm_crop (line 106) | def norm_crop(img, landmark, image_size=112, mode="ffhq"):
function square_crop (line 123) | def square_crop(im, S):
function transform (line 138) | def transform(data, center, output_size, scale, rotation):
function trans_points2d (line 153) | def trans_points2d(pts, M):
function trans_points3d (line 164) | def trans_points3d(pts, M):
function trans_points (line 177) | def trans_points(pts, M):
FILE: src/dot/simswap/mediapipe/utils/mediapipe_landmarks.py
class MediaPipeLandmarks (line 4) | class MediaPipeLandmarks:
FILE: src/dot/simswap/models/arcface_models.py
class SEBlock (line 11) | class SEBlock(nn.Module):
method __init__ (line 12) | def __init__(self, channel, reduction=16):
method forward (line 22) | def forward(self, x):
class IRBlock (line 29) | class IRBlock(nn.Module):
method __init__ (line 32) | def __init__(self, inplanes, planes, stride=1, downsample=None, use_se...
method forward (line 46) | def forward(self, x):
class ResNet (line 67) | class ResNet(nn.Module):
method __init__ (line 68) | def __init__(self, block, layers, use_se=True):
method _make_layer (line 95) | def _make_layer(self, block, planes, blocks, stride=1):
method forward (line 119) | def forward(self, x):
class ArcMarginModel (line 140) | class ArcMarginModel(nn.Module):
method __init__ (line 141) | def __init__(self, args):
method forward (line 157) | def forward(self, input, label):
FILE: src/dot/simswap/models/base_model.py
class BaseModel (line 9) | class BaseModel(torch.nn.Module):
method name (line 10) | def name(self):
method initialize (line 13) | def initialize(self, opt_gpu_ids, opt_checkpoints_dir, opt_name, opt_v...
method set_input (line 19) | def set_input(self, input):
method forward (line 22) | def forward(self):
method test (line 26) | def test(self):
method get_image_paths (line 29) | def get_image_paths(self):
method optimize_parameters (line 32) | def optimize_parameters(self):
method get_current_visuals (line 35) | def get_current_visuals(self):
method get_current_errors (line 38) | def get_current_errors(self):
method save (line 41) | def save(self, label):
method save_network (line 45) | def save_network(self, network, network_label, epoch_label, gpu_ids):
method load_network (line 53) | def load_network(self, network, network_label, epoch_label, save_dir=""):
method update_learning_rate (line 106) | def update_learning_rate(self):
FILE: src/dot/simswap/models/fs_networks.py
class InstanceNorm (line 13) | class InstanceNorm(nn.Module):
method __init__ (line 14) | def __init__(self, epsilon=1e-8):
method forward (line 22) | def forward(self, x):
class ApplyStyle (line 29) | class ApplyStyle(nn.Module):
method __init__ (line 34) | def __init__(self, latent_size, channels):
method forward (line 38) | def forward(self, x, latent):
class ResnetBlock_Adain (line 46) | class ResnetBlock_Adain(nn.Module):
method __init__ (line 47) | def __init__(self, dim, latent_size, padding_type, activation=nn.ReLU(...
method forward (line 80) | def forward(self, x, dlatents_in_slice):
class Generator_Adain_Upsample (line 90) | class Generator_Adain_Upsample(nn.Module):
method __init__ (line 91) | def __init__(
method forward (line 181) | def forward(self, input, dlatents):
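The InstanceNorm/ApplyStyle pair above is the AdaIN mechanism: normalize each channel's spatial statistics, then re-modulate with a scale and shift predicted from the identity latent. A NumPy sketch of the idea (not the exact parameterization, which runs the latent through a learned linear layer):

```python
import numpy as np

def adain(x, scale, bias, eps=1e-8):
    """x: (N, C, H, W) features; scale/bias: per-channel style params.
    Normalize spatial statistics per sample and channel, then re-style."""
    mean = x.mean(axis=(2, 3), keepdims=True)
    std = np.sqrt(x.var(axis=(2, 3), keepdims=True) + eps)
    return (x - mean) / std * scale + bias
```

After normalization each channel has zero mean and unit variance, so the style parameters fully determine the output statistics, which is how the identity latent steers the generator's residual blocks.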
FILE: src/dot/simswap/models/fs_networks_512.py
class InstanceNorm (line 19) | class InstanceNorm(nn.Module):
method __init__ (line 20) | def __init__(self, epsilon=1e-8):
method forward (line 28) | def forward(self, x):
class ApplyStyle (line 35) | class ApplyStyle(nn.Module):
method __init__ (line 40) | def __init__(self, latent_size, channels):
method forward (line 44) | def forward(self, x, latent):
class ResnetBlock_Adain (line 52) | class ResnetBlock_Adain(nn.Module):
method __init__ (line 53) | def __init__(self, dim, latent_size, padding_type, activation=nn.ReLU(...
method forward (line 85) | def forward(self, x, dlatents_in_slice):
class Generator_Adain_Upsample (line 95) | class Generator_Adain_Upsample(nn.Module):
method __init__ (line 96) | def __init__(
method forward (line 195) | def forward(self, input, dlatents):
FILE: src/dot/simswap/models/models.py
class SEBlock (line 13) | class SEBlock(nn.Module):
method __init__ (line 14) | def __init__(self, channel, reduction=16):
method forward (line 24) | def forward(self, x):
class IRBlock (line 32) | class IRBlock(nn.Module):
method __init__ (line 35) | def __init__(self, inplanes, planes, stride=1, downsample=None, use_se...
method forward (line 49) | def forward(self, x):
class ResNet (line 70) | class ResNet(nn.Module):
method __init__ (line 71) | def __init__(self, block, layers, use_se=True):
method _make_layer (line 98) | def _make_layer(self, block, planes, blocks, stride=1):
method forward (line 122) | def forward(self, x):
class ArcMarginModel (line 142) | class ArcMarginModel(nn.Module):
method __init__ (line 143) | def __init__(self, args):
method forward (line 159) | def forward(self, input, label):
FILE: src/dot/simswap/option.py
class SimswapOption (line 19) | class SimswapOption(ModelOption):
method __init__ (line 22) | def __init__(
method create_model (line 38) | def create_model( # type: ignore
method change_option (line 119) | def change_option(self, image: np.array, **kwargs) -> None:
method process_image (line 157) | def process_image(self, image: np.array, **kwargs) -> np.array:
FILE: src/dot/simswap/parsing_model/model.py
class ConvBNReLU (line 12) | class ConvBNReLU(nn.Module):
method __init__ (line 13) | def __init__(self, in_chan, out_chan, ks=3, stride=1, padding=1, *args...
method forward (line 26) | def forward(self, x):
method init_weight (line 31) | def init_weight(self):
class BiSeNetOutput (line 39) | class BiSeNetOutput(nn.Module):
method __init__ (line 40) | def __init__(self, in_chan, mid_chan, n_classes, *args, **kwargs):
method forward (line 46) | def forward(self, x):
method init_weight (line 51) | def init_weight(self):
method get_params (line 58) | def get_params(self):
class AttentionRefinementModule (line 70) | class AttentionRefinementModule(nn.Module):
method __init__ (line 71) | def __init__(self, in_chan, out_chan, *args, **kwargs):
method forward (line 79) | def forward(self, x):
method init_weight (line 88) | def init_weight(self):
class ContextPath (line 96) | class ContextPath(nn.Module):
method __init__ (line 97) | def __init__(self, *args, **kwargs):
method forward (line 108) | def forward(self, x):
method init_weight (line 131) | def init_weight(self):
method get_params (line 138) | def get_params(self):
class FeatureFusionModule (line 150) | class FeatureFusionModule(nn.Module):
method __init__ (line 151) | def __init__(self, in_chan, out_chan, *args, **kwargs):
method forward (line 164) | def forward(self, fsp, fcp):
method init_weight (line 176) | def init_weight(self):
method get_params (line 183) | def get_params(self):
class BiSeNet (line 195) | class BiSeNet(nn.Module):
method __init__ (line 196) | def __init__(self, n_classes, *args, **kwargs):
method forward (line 206) | def forward(self, x):
method init_weight (line 227) | def init_weight(self):
method get_params (line 234) | def get_params(self):
FILE: src/dot/simswap/parsing_model/resnet.py
function conv3x3 (line 11) | def conv3x3(in_planes, out_planes, stride=1):
class BasicBlock (line 18) | class BasicBlock(nn.Module):
method __init__ (line 19) | def __init__(self, in_chan, out_chan, stride=1):
method forward (line 33) | def forward(self, x):
function create_layer_basic (line 48) | def create_layer_basic(in_chan, out_chan, bnum, stride=1):
class Resnet18 (line 55) | class Resnet18(nn.Module):
method __init__ (line 56) | def __init__(self):
method forward (line 67) | def forward(self, x):
method init_weight (line 78) | def init_weight(self):
method get_params (line 87) | def get_params(self):
FILE: src/dot/simswap/util/norm.py
class SpecificNorm (line 8) | class SpecificNorm(nn.Module):
method __init__ (line 9) | def __init__(self, epsilon=1e-8, use_gpu=True):
method forward (line 39) | def forward(self, x, use_gpu=True):
FILE: src/dot/simswap/util/reverse2original.py
function isin (line 12) | def isin(ar1, ar2):
function encode_segmentation_rgb (line 16) | def encode_segmentation_rgb(segmentation, device, no_neck=True):
class SoftErosion (line 49) | class SoftErosion(nn.Module):
method __init__ (line 50) | def __init__(self, kernel_size=15, threshold=0.6, iterations=1):
method forward (line 69) | def forward(self, x):
function postprocess (line 87) | def postprocess(swapped_face, target, target_mask, smooth_mask, device):
function reverse2wholeimage (line 103) | def reverse2wholeimage(
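reverse2wholeimage ends with an alpha blend of the inverse-warped swapped patch into the source frame via the soft mask. Schematically (names and coordinates are illustrative; the real code first warps by the inverse alignment matrix):

```python
import numpy as np

def blend_patch(frame, patch, mask, y, x):
    """Alpha-blend `patch` into `frame` at (y, x) using soft `mask`
    with values in [0, 1]; 1 means fully swapped face."""
    h, w = patch.shape[:2]
    region = frame[y:y + h, x:x + w].astype(np.float64)
    m = mask[..., None]  # broadcast mask over color channels
    blended = m * patch + (1.0 - m) * region
    frame[y:y + h, x:x + w] = blended.astype(frame.dtype)
    return frame
```

Soft mask edges (produced by SoftErosion above) are what keep the paste-back from showing a hard seam.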
FILE: src/dot/simswap/util/util.py
function tensor2im (line 18) | def tensor2im(image_tensor, imtype=np.uint8, normalize=True):
function tensor2label (line 36) | def tensor2label(label_tensor, n_label, imtype=np.uint8):
function save_image (line 47) | def save_image(image_numpy, image_path):
function mkdirs (line 52) | def mkdirs(paths):
function mkdir (line 60) | def mkdir(path):
function uint82bin (line 70) | def uint82bin(n, count=8):
function labelcolormap (line 75) | def labelcolormap(N):
function _totensor (line 134) | def _totensor(array):
function load_parsing_model (line 140) | def load_parsing_model(path, use_mask, use_gpu):
function crop_align (line 157) | def crop_align(
class Colorize (line 192) | class Colorize(object):
method __init__ (line 193) | def __init__(self, n=35):
method __call__ (line 197) | def __call__(self, gray_image):
FILE: src/dot/ui/ui.py
class ToolTip (line 23) | class ToolTip(object):
method __init__ (line 24) | def __init__(self, widget):
method showtip (line 30) | def showtip(self, text):
method hidetip (line 52) | def hidetip(self):
class ToplevelUsageWindow (line 59) | class ToplevelUsageWindow(customtkinter.CTkToplevel):
method __init__ (line 64) | def __init__(self, *args, **kwargs):
class ToplevelAboutWindow (line 98) | class ToplevelAboutWindow(customtkinter.CTkToplevel):
method __init__ (line 103) | def __init__(self, *args, **kwargs):
class TabView (line 131) | class TabView:
method __init__ (line 136) | def __init__(self, tab_view, target_tip_text, use_image=False, use_vid...
method setup_ui (line 155) | def setup_ui(self):
method modify_entry (line 640) | def modify_entry(self, entry_element: customtkinter.CTkEntry, text: str):
method upload_action_config_file (line 652) | def upload_action_config_file(
method CreateToolTip (line 706) | def CreateToolTip(self, widget, text):
method start_button_event (line 718) | def start_button_event(self, error_label):
method upload_folder_action (line 776) | def upload_folder_action(self, entry_element: customtkinter.CTkOptionM...
method upload_file_action (line 787) | def upload_file_action(self, entry_element: customtkinter.CTkOptionMenu):
method optionmenu_callback (line 798) | def optionmenu_callback(self, choice: str):
class App (line 844) | class App(customtkinter.CTk):
method __init__ (line 849) | def __init__(self):
method usage_window (line 898) | def usage_window(self):
method about_window (line 912) | def about_window(self):
function main (line 928) | def main():
FILE: tests/pipeline_test.py
function fake_generate (line 13) | def fake_generate(self, option, source, target, show_fps=False, **kwargs):
class TestDotOptions (line 18) | class TestDotOptions(unittest.TestCase):
method setUp (line 19) | def setUp(self):
method test_option_creation (line 28) | def test_option_creation(self):
Condensed preview — 127 files, each showing path, character count, and a content snippet.
[
{
"path": ".flake8",
"chars": 88,
"preview": "[flake8]\nmax-line-length = 120\nextend-ignore = E203\nper-file-ignores = __init__.py:F401\n"
},
{
"path": ".github/ISSUE_TEMPLATE/ask_a_question.md",
"chars": 219,
"preview": "---\nname: Ask a Question\nabout: Ask a question about using dot\nlabels: question\n\n---\n\n## :question: Ask a Question:\n\n###"
},
{
"path": ".github/ISSUE_TEMPLATE/bug_report.md",
"chars": 417,
"preview": "---\nname: Bug Report\nabout: Report bugs to improve dot\nlabels: bug\n\n---\n\n## :bug: Bug Report\n\n<!-- Note: Remove sections"
},
{
"path": ".github/ISSUE_TEMPLATE/documentation.md",
"chars": 348,
"preview": "---\nname: Documentation\nabout: Report an issue related to dot documentation\nlabels: documentation\n\n---\n\n## :memo: Docume"
},
{
"path": ".github/ISSUE_TEMPLATE/feature_request.md",
"chars": 325,
"preview": "---\nname: Feature Request\nabout: Submit a feature request for dot\nlabels: feature\n\n---\n\n## :sparkles: Feature Request\n\n<"
},
{
"path": ".github/PULL_REQUEST_TEMPLATE.md",
"chars": 612,
"preview": "<!-- Is this pull request ready for review? (if not, please submit in draft mode) -->\n\n## Description\n\n<!--\nPlease inclu"
},
{
"path": ".github/workflows/build_dot.yaml",
"chars": 971,
"preview": "name: build-dot\n\non:\n push:\n branches:\n - main\n paths-ignore:\n - \"**.md\"\n pull_request:\n types: [op"
},
{
"path": ".github/workflows/code_check.yaml",
"chars": 934,
"preview": "name: code-check\n\non:\n push:\n branches:\n - main\n pull_request:\n types: [opened, synchroni"
},
{
"path": ".gitignore",
"chars": 4766,
"preview": "# repo ignores\r\ndata/results/*\r\nsaved_models/*\r\n*.patch\r\n\r\n# Created by https://www.toptal.com/developers/gitignore/api/"
},
{
"path": ".pre-commit-config.yaml",
"chars": 1144,
"preview": "default_language_version:\n python: python3.8\n\nrepos:\n - repo: https://github.com/pre-commit/pre-commit-hooks\n "
},
{
"path": ".yamllint",
"chars": 634,
"preview": "---\nyaml-files:\n - '*.yaml'\n - '*.yml'\n - .yamllint\n\nrules:\n braces: enable\n brackets: enable\n colons:"
},
{
"path": "CHANGELOG.md",
"chars": 4578,
"preview": "# Changelog\n\nAll notable changes to this project will be documented in this file.\n\nThe format is based on [Keep a Change"
},
{
"path": "CONTRIBUTING.md",
"chars": 1635,
"preview": "# Contributing\n\nWhen contributing to this repository, please refer to the following.\n\n## Suggested Guidelines\n\n1. When o"
},
{
"path": "Dockerfile",
"chars": 1415,
"preview": "FROM nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04\n\n# copy repo codebase\nCOPY . ./dot\n\n# set working directory\nWORKDIR ./d"
},
{
"path": "LICENSE",
"chars": 1490,
"preview": "Copyright (c) 2022, Sensity B.V.\nAll rights reserved.\n\nRedistribution and use in source and binary forms, with or withou"
},
{
"path": "README.md",
"chars": 15415,
    "preview": "<div align=\"center\">\n\n<h1> the Deepfake Offensive Toolkit </h1>\n\n[© 2022, Sensity B.V. All rights reserved.\nlicensed under the BSD 3-Clause \"New\" o"
},
{
"path": "scripts/metadata_swap.py",
"chars": 7634,
"preview": "#!/usr/bin/env python3\n\"\"\"\nCopyright (c) 2022, Sensity B.V. All rights reserved.\nlicensed under the BSD 3-Clause \"New\" o"
},
{
"path": "scripts/profile_simswap.py",
"chars": 2045,
"preview": "#!/usr/bin/env python3\n\"\"\"\nCopyright (c) 2022, Sensity B.V. All rights reserved.\nlicensed under the BSD 3-Clause \"New\" o"
},
{
"path": "scripts/video_swap.py",
"chars": 2349,
"preview": "#!/usr/bin/env python3\n\"\"\"\nCopyright (c) 2022, Sensity B.V. All rights reserved.\nlicensed under the BSD 3-Clause \"New\" o"
},
{
"path": "setup.cfg",
"chars": 1206,
"preview": "[bumpversion]\ncurrent_version = 1.4.0\ncommit = True\ntag = False\nparse = (?P<major>\\d+)\\.(?P<minor>\\d+)\\.(?P<patch>\\d+)?\n"
},
{
"path": "src/dot/__init__.py",
"chars": 344,
"preview": "#!/usr/bin/env python3\r\n\"\"\"\r\nCopyright (c) 2022, Sensity B.V. All rights reserved.\r\nlicensed under the BSD 3-Clause \"New"
},
{
"path": "src/dot/__main__.py",
"chars": 6833,
"preview": "#!/usr/bin/env python3\r\n\"\"\"\r\nCopyright (c) 2022, Sensity B.V. All rights reserved.\r\nlicensed under the BSD 3-Clause \"New"
},
{
"path": "src/dot/commons/__init__.py",
"chars": 211,
"preview": "#!/usr/bin/env python3\n\"\"\"\nCopyright (c) 2022, Sensity B.V. All rights reserved.\nlicensed under the BSD 3-Clause \"New\" o"
},
{
"path": "src/dot/commons/cam/__init__.py",
"chars": 23,
"preview": "#!/usr/bin/env python3\n"
},
{
"path": "src/dot/commons/cam/cam.py",
"chars": 6658,
"preview": "#!/usr/bin/env python3\n\nimport glob\nimport os\n\nimport cv2\nimport numpy as np\nimport requests\nimport yaml\n\nfrom ..utils i"
},
{
"path": "src/dot/commons/cam/camera_selector.py",
"chars": 2980,
"preview": "#!/usr/bin/env python3\n\nimport cv2\nimport numpy as np\nimport yaml\n\nfrom ..utils import log\n\ng_selected_cam = None\n\n\ndef "
},
{
"path": "src/dot/commons/camera_utils.py",
"chars": 3965,
"preview": "#!/usr/bin/env python3\n\"\"\"\nCopyright (c) 2022, Sensity B.V. All rights reserved.\nlicensed under the BSD 3-Clause \"New\" o"
},
{
"path": "src/dot/commons/model_option.py",
"chars": 8325,
"preview": "#!/usr/bin/env python3\n\"\"\"\nCopyright (c) 2022, Sensity B.V. All rights reserved.\nlicensed under the BSD 3-Clause \"New\" o"
},
{
"path": "src/dot/commons/pose/head_pose.py",
"chars": 2756,
"preview": "#!/usr/bin/env python3\nimport cv2\nimport mediapipe as mp\nimport numpy as np\n\nmp_face_mesh = mp.solutions.face_mesh\nface_"
},
{
"path": "src/dot/commons/utils.py",
"chars": 6126,
"preview": "#!/usr/bin/env python3\n\nimport glob\nimport os\nimport random\nimport sys\nimport time\nfrom collections import defaultdict\nf"
},
{
"path": "src/dot/commons/video/__init__.py",
"chars": 23,
"preview": "#!/usr/bin/env python3\n"
},
{
"path": "src/dot/commons/video/video_utils.py",
"chars": 6398,
"preview": "#!/usr/bin/env python3\n\"\"\"\nCopyright (c) 2022, Sensity B.V. All rights reserved.\nlicensed under the BSD 3-Clause \"New\" o"
},
{
"path": "src/dot/commons/video/videocaptureasync.py",
"chars": 2214,
"preview": "#!/usr/bin/env python3\n# https://github.com/gilbertfrancois/video-capture-async\n\nimport threading\nimport time\n\nimport cv"
},
{
"path": "src/dot/dot.py",
"chars": 6873,
"preview": "#!/usr/bin/env python3\n\"\"\"\nCopyright (c) 2022, Sensity B.V. All rights reserved.\nlicensed under the BSD 3-Clause \"New\" o"
},
{
"path": "src/dot/faceswap_cv2/__init__.py",
"chars": 93,
"preview": "#!/usr/bin/env python3\n\nfrom .option import FaceswapCVOption\n\n__all__ = [\"FaceswapCVOption\"]\n"
},
{
"path": "src/dot/faceswap_cv2/generic.py",
"chars": 5725,
"preview": "#!/usr/bin/env python3\r\n\r\nimport cv2\r\nimport numpy as np\r\nimport scipy.spatial as spatial\r\n\r\n\r\ndef bilinear_interpolate("
},
{
"path": "src/dot/faceswap_cv2/option.py",
"chars": 2432,
"preview": "#!/usr/bin/env python3\n\nimport cv2\nimport dlib\nimport numpy as np\n\nfrom ..commons import ModelOption\nfrom ..commons.util"
},
{
"path": "src/dot/faceswap_cv2/swap.py",
"chars": 7429,
"preview": "#!/usr/bin/env python3\r\n\r\nfrom typing import Any, Dict\r\n\r\nimport cv2\r\nimport dlib\r\nimport numpy as np\r\nfrom PIL import I"
},
{
"path": "src/dot/fomm/__init__.py",
"chars": 81,
"preview": "#!/usr/bin/env python3\n\nfrom .option import FOMMOption\n\n__all__ = [\"FOMMOption\"]\n"
},
{
"path": "src/dot/fomm/config/vox-adv-256.yaml",
"chars": 1980,
"preview": "---\ndataset_params:\n root_dir: data/vox-png\n frame_shape: [256, 256, 3]\n id_sampling: true\n pairs_list: data"
},
{
"path": "src/dot/fomm/face_alignment.py",
"chars": 13967,
"preview": "import warnings\nfrom enum import IntEnum\n\nimport numpy as np\nimport torch\nfrom face_alignment.folder_data import FolderD"
},
{
"path": "src/dot/fomm/modules/__init__.py",
"chars": 23,
"preview": "#!/usr/bin/env python3\n"
},
{
"path": "src/dot/fomm/modules/dense_motion.py",
"chars": 5384,
"preview": "#!/usr/bin/env python3\n\nimport torch\nimport torch.nn.functional as F\nfrom torch import nn\n\nfrom .util import AntiAliasIn"
},
{
"path": "src/dot/fomm/modules/generator_optim.py",
"chars": 5187,
"preview": "#!/usr/bin/env python3\n\nimport torch\nimport torch.nn.functional as F\nfrom torch import nn\n\nfrom .dense_motion import Den"
},
{
"path": "src/dot/fomm/modules/keypoint_detector.py",
"chars": 3243,
"preview": "#!/usr/bin/env python3\n\nimport torch\nimport torch.nn.functional as F\nfrom torch import nn\n\nfrom .util import AntiAliasIn"
},
{
"path": "src/dot/fomm/modules/util.py",
"chars": 8176,
"preview": "#!/usr/bin/env python3\n\nimport torch\nimport torch.nn.functional as F\nfrom torch import nn\n\nfrom ..sync_batchnorm.batchno"
},
{
"path": "src/dot/fomm/option.py",
"chars": 8094,
"preview": "#!/usr/bin/env python3\n\nimport os\nimport sys\n\nimport cv2\nimport numpy as np\n\nfrom ..commons import ModelOption\nfrom ..co"
},
{
"path": "src/dot/fomm/predictor_local.py",
"chars": 5714,
"preview": "#!/usr/bin/env python3\n\nimport numpy as np\nimport torch\nimport yaml\nfrom scipy.spatial import ConvexHull\n\nfrom . import "
},
{
"path": "src/dot/fomm/sync_batchnorm/__init__.py",
"chars": 297,
"preview": "#!/usr/bin/env python3\n# -*- coding: utf-8 -*-\n# File : __init__.py\n# Author : Jiayuan Mao\n# Email : maojiayuan@gmail"
},
{
"path": "src/dot/fomm/sync_batchnorm/batchnorm.py",
"chars": 7765,
"preview": "#!/usr/bin/env python3\n# -*- coding: utf-8 -*-\n# File : batchnorm.py\n# Author : Jiayuan Mao\n# Email : maojiayuan@gmai"
},
{
"path": "src/dot/fomm/sync_batchnorm/comm.py",
"chars": 4630,
"preview": "#!/usr/bin/env python3\n# -*- coding: utf-8 -*-\n# File : comm.py\n# Author : Jiayuan Mao\n# Email : maojiayuan@gmail.com"
},
{
"path": "src/dot/gpen/__init__.py",
"chars": 23,
"preview": "#!/usr/bin/env python3\n"
},
{
"path": "src/dot/gpen/__init_paths.py",
"chars": 421,
"preview": "#!/usr/bin/env python3\n\n\"\"\"\n@paper: GAN Prior Embedded Network for Blind Face Restoration in the Wild (CVPR2021)\n@author"
},
{
"path": "src/dot/gpen/align_faces.py",
"chars": 7726,
"preview": "#!/usr/bin/env python3\n# -*- coding: utf-8 -*-\n\n\"\"\"\nCreated on Mon Apr 24 15:43:29 2017\n@author: zhaoy\n\n@Modified by yan"
},
{
"path": "src/dot/gpen/face_enhancement.py",
"chars": 4971,
"preview": "#!/usr/bin/env python3\n\n\"\"\"\n@paper: GAN Prior Embedded Network for Blind Face Restoration in the Wild (CVPR2021)\n@author"
},
{
"path": "src/dot/gpen/face_model/__init__.py",
"chars": 23,
"preview": "#!/usr/bin/env python3\n"
},
{
"path": "src/dot/gpen/face_model/face_gan.py",
"chars": 2477,
"preview": "#!/usr/bin/env python3\n\n\"\"\"\n@paper: GAN Prior Embedded Network for Blind Face Restoration in the Wild (CVPR2021)\n@author"
},
{
"path": "src/dot/gpen/face_model/model.py",
"chars": 21666,
"preview": "#!/usr/bin/env python3\n\n\"\"\"\n@paper: GAN Prior Embedded Network for Blind Face Restoration in the Wild (CVPR2021)\n@author"
},
{
"path": "src/dot/gpen/face_model/op/__init__.py",
"chars": 117,
"preview": "#!/usr/bin/env python3\n\n# from .fused_act import FusedLeakyReLU, fused_leaky_relu\n# from .upfirdn2d import upfirdn2d\n"
},
{
"path": "src/dot/gpen/face_model/op/fused_act.py",
"chars": 3008,
"preview": "#!/usr/bin/env python3\n\n# This file is no longer used\n\nimport os\n\nimport torch\nfrom torch import nn\nfrom torch.autograd "
},
{
"path": "src/dot/gpen/face_model/op/fused_act_v2.py",
"chars": 1078,
"preview": "#!/usr/bin/env python3\n\nimport os\n\nimport torch\nfrom torch import nn\nfrom torch.nn import functional as F\n\nmodule_path ="
},
{
"path": "src/dot/gpen/face_model/op/fused_bias_act.cpp",
"chars": 859,
"preview": "// This file is no longer used\n\n#include <torch/extension.h>\n\n\ntorch::Tensor fused_bias_act_op(const torch::Tensor& inpu"
},
{
"path": "src/dot/gpen/face_model/op/fused_bias_act_kernel.cu",
"chars": 2810,
"preview": "// This file is no longer used\n\n// Copyright (c) 2019, NVIDIA Corporation. All rights reserved.\n//\n// This work is made "
},
{
"path": "src/dot/gpen/face_model/op/upfirdn2d.cpp",
"chars": 999,
"preview": "// This file is no longer used\n\n#include <torch/extension.h>\n\n\ntorch::Tensor upfirdn2d_op(const torch::Tensor& input, co"
},
{
"path": "src/dot/gpen/face_model/op/upfirdn2d.py",
"chars": 5111,
"preview": "#!/usr/bin/env python3\n\n# This file is no longer used\n\nimport os\n\nimport torch\nimport torch.nn.functional as F\nfrom torc"
},
{
"path": "src/dot/gpen/face_model/op/upfirdn2d_kernel.cu",
"chars": 8986,
"preview": "// This file is no longer used\n\n// Copyright (c) 2019, NVIDIA Corporation. All rights reserved.\n//\n// This work is made "
},
{
"path": "src/dot/gpen/face_model/op/upfirdn2d_v2.py",
"chars": 1675,
"preview": "#!/usr/bin/env python3\n\nimport os\n\nimport torch\nfrom torch.nn import functional as F\n\nmodule_path = os.path.dirname(__fi"
},
{
"path": "src/dot/gpen/retinaface/__init__.py",
"chars": 23,
"preview": "#!/usr/bin/env python3\n"
},
{
"path": "src/dot/gpen/retinaface/data/FDDB/img_list.txt",
"chars": 66253,
"preview": "2002/08/11/big/img_591\n2002/08/26/big/img_265\n2002/07/19/big/img_423\n2002/08/24/big/img_490\n2002/08/31/big/img_17676\n200"
},
{
"path": "src/dot/gpen/retinaface/data/__init__.py",
"chars": 338,
"preview": "#!/usr/bin/env python3\n\nfrom dot.gpen.retinaface.data.config import cfg_mnet, cfg_re50\nfrom dot.gpen.retinaface.data.dat"
},
{
"path": "src/dot/gpen/retinaface/data/config.py",
"chars": 942,
"preview": "#!/usr/bin/env python3\n\ncfg_mnet = {\n \"name\": \"mobilenet0.25\",\n \"min_sizes\": [[16, 32], [64, 128], [256, 512]],\n "
},
{
"path": "src/dot/gpen/retinaface/data/data_augment.py",
"chars": 7182,
"preview": "#!/usr/bin/env python3\n\nimport random\n\nimport cv2\nimport numpy as np\n\nfrom ..utils.box_utils import matrix_iof\n\n\ndef _cr"
},
{
"path": "src/dot/gpen/retinaface/data/wider_face.py",
"chars": 3377,
"preview": "#!/usr/bin/env python3\n\nimport cv2\nimport numpy as np\nimport torch\nimport torch.utils.data as data\n\n\nclass WiderFaceDete"
},
{
"path": "src/dot/gpen/retinaface/facemodels/__init__.py",
"chars": 23,
"preview": "#!/usr/bin/env python3\n"
},
{
"path": "src/dot/gpen/retinaface/facemodels/net.py",
"chars": 4561,
"preview": "#!/usr/bin/env python3\n\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\n\ndef conv_bn(inp, oup, strid"
},
{
"path": "src/dot/gpen/retinaface/facemodels/retinaface.py",
"chars": 5089,
"preview": "#!/usr/bin/env python3\n\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nimport torchvision.models._ut"
},
{
"path": "src/dot/gpen/retinaface/layers/__init__.py",
"chars": 23,
"preview": "#!/usr/bin/env python3\n"
},
{
"path": "src/dot/gpen/retinaface/layers/functions/prior_box.py",
"chars": 1463,
"preview": "#!/usr/bin/env python3\n\nfrom itertools import product as product\nfrom math import ceil\n\nimport torch\n\n\nclass PriorBox(ob"
},
{
"path": "src/dot/gpen/retinaface/layers/modules/__init__.py",
"chars": 92,
"preview": "#!/usr/bin/env python3\n\nfrom .multibox_loss import MultiBoxLoss\n\n__all__ = [\"MultiBoxLoss\"]\n"
},
{
"path": "src/dot/gpen/retinaface/layers/modules/multibox_loss.py",
"chars": 5646,
"preview": "#!/usr/bin/env python3\n\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\nfrom ...data import cfg_mnet"
},
{
"path": "src/dot/gpen/retinaface/retinaface_detection.py",
"chars": 5673,
"preview": "#!/usr/bin/env python3\n\n\"\"\"\n@paper: GAN Prior Embedded Network for Blind Face Restoration in the Wild (CVPR2021)\n@author"
},
{
"path": "src/dot/gpen/retinaface/utils/__init__.py",
"chars": 23,
"preview": "#!/usr/bin/env python3\n"
},
{
"path": "src/dot/gpen/retinaface/utils/box_utils.py",
"chars": 12993,
"preview": "#!/usr/bin/env python3\n\nimport numpy as np\nimport torch\n\n\ndef point_form(boxes):\n \"\"\"Convert prior_boxes to (xmin, ym"
},
{
"path": "src/dot/gpen/retinaface/utils/nms/__init__.py",
"chars": 23,
"preview": "#!/usr/bin/env python3\n"
},
{
"path": "src/dot/gpen/retinaface/utils/nms/py_cpu_nms.py",
"chars": 1076,
"preview": "#!/usr/bin/env python3\n\n# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Mic"
},
{
"path": "src/dot/gpen/retinaface/utils/timer.py",
"chars": 1139,
"preview": "#!/usr/bin/env python3\n\n# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Mic"
},
{
"path": "src/dot/simswap/__init__.py",
"chars": 87,
"preview": "#!/usr/bin/env python3\n\nfrom .option import SimswapOption\n\n__all__ = [\"SimswapOption\"]\n"
},
{
"path": "src/dot/simswap/configs/config.yaml",
"chars": 721,
"preview": "---\nanalysis:\n simswap:\n parsing_model_path: saved_models/simswap/parsing_model/checkpoint/79999_iter.pth\n "
},
{
"path": "src/dot/simswap/configs/config_512.yaml",
"chars": 723,
"preview": "---\nanalysis:\n simswap:\n parsing_model_path: saved_models/simswap/parsing_model/checkpoint/79999_iter.pth\n "
},
{
"path": "src/dot/simswap/fs_model.py",
"chars": 3064,
"preview": "#!/usr/bin/env python3\n\nimport os\nimport sys\n\nimport torch\n\nfrom .models.base_model import BaseModel\n\n\ndef determine_pat"
},
{
"path": "src/dot/simswap/mediapipe/__init__.py",
"chars": 23,
"preview": "#!/usr/bin/env python3\n"
},
{
"path": "src/dot/simswap/mediapipe/face_mesh.py",
"chars": 7312,
"preview": "#!/usr/bin/env python3\r\n\r\nfrom typing import List, Optional, Tuple\r\n\r\nimport cv2\r\nimport mediapipe as mp\r\nimport numpy a"
},
{
"path": "src/dot/simswap/mediapipe/utils/face_align_ffhqandnewarc.py",
"chars": 4766,
"preview": "#!/usr/bin/env python3\n\nimport cv2\nimport numpy as np\nfrom skimage import transform as trans\n\nsrc1 = np.array(\n [\n "
},
{
"path": "src/dot/simswap/mediapipe/utils/mediapipe_landmarks.py",
"chars": 1175,
"preview": "#!/usr/bin/env python3\n\n\nclass MediaPipeLandmarks:\n \"\"\"Defines facial landmark indexes for Google's MediaPipe\"\"\"\n\n "
},
{
"path": "src/dot/simswap/models/__init__.py",
"chars": 23,
"preview": "#!/usr/bin/env python3\n"
},
{
"path": "src/dot/simswap/models/arcface_models.py",
"chars": 5476,
"preview": "import math\n\nimport torch\nimport torch.nn.functional as F\nfrom torch import nn\nfrom torch.nn import Parameter\n\nfrom dot."
},
{
"path": "src/dot/simswap/models/base_model.py",
"chars": 3640,
"preview": "#!/usr/bin/env python3\n\nimport os\nimport sys\n\nimport torch\n\n\nclass BaseModel(torch.nn.Module):\n def name(self):\n "
},
{
"path": "src/dot/simswap/models/fs_networks.py",
"chars": 6585,
"preview": "#!/usr/bin/env python3\r\n\r\n\"\"\"\r\nCopyright (C) 2019 NVIDIA Corporation. All rights reserved.\r\nLicensed under the CC BY-NC"
},
{
"path": "src/dot/simswap/models/fs_networks_512.py",
"chars": 6979,
"preview": "#!/usr/bin/env python3\n\"\"\"\nAuthor: Naiyuan liu\nGithub: https://github.com/NNNNAI\nDate: 2021-11-23 16:55:48\nLastEditors: "
},
{
"path": "src/dot/simswap/models/models.py",
"chars": 5506,
"preview": "#!/usr/bin/env python3\n\nimport math\n\nimport torch\nimport torch.nn.functional as F\nfrom torch import nn\nfrom torch.nn imp"
},
{
"path": "src/dot/simswap/option.py",
"chars": 6883,
"preview": "#!/usr/bin/env python3\n\nimport cv2\nimport numpy as np\nimport torch\nimport torch.nn.functional as F\nfrom PIL import Image"
},
{
"path": "src/dot/simswap/parsing_model/__init__.py",
"chars": 23,
"preview": "#!/usr/bin/env python3\n"
},
{
"path": "src/dot/simswap/parsing_model/model.py",
"chars": 9458,
"preview": "#!/usr/bin/python\r\n# -*- encoding: utf-8 -*-\r\n\r\n\r\nimport torch\r\nimport torch.nn as nn\r\nimport torch.nn.functional as F\r\n"
},
{
"path": "src/dot/simswap/parsing_model/resnet.py",
"chars": 3626,
"preview": "#!/usr/bin/python\r\n# -*- encoding: utf-8 -*-\r\n\r\nimport torch\r\nimport torch.nn as nn\r\nimport torch.nn.functional as F\r\n\r\n"
},
{
"path": "src/dot/simswap/util/__init__.py",
"chars": 23,
"preview": "#!/usr/bin/env python3\n"
},
{
"path": "src/dot/simswap/util/norm.py",
"chars": 1575,
"preview": "#!/usr/bin/env python3\r\n\r\nimport numpy as np\r\nimport torch\r\nimport torch.nn as nn\r\n\r\n\r\nclass SpecificNorm(nn.Module):\r\n "
},
{
"path": "src/dot/simswap/util/reverse2original.py",
"chars": 6124,
"preview": "#!/usr/bin/env python3\n\nimport cv2\nimport kornia as K\nimport numpy as np\nimport torch\nimport torch.nn as nn\nfrom kornia."
},
{
"path": "src/dot/simswap/util/util.py",
"chars": 6485,
"preview": "#!/usr/bin/env python3\r\n\r\nfrom __future__ import print_function\r\n\r\nimport os\r\n\r\nimport cv2\r\nimport numpy as np\r\nimport t"
},
{
"path": "src/dot/ui/ui.py",
"chars": 33376,
"preview": "#!/usr/bin/env python3\n\"\"\"\nCopyright (c) 2022, Sensity B.V. All rights reserved.\nlicensed under the BSD 3-Clause \"New\" o"
},
{
"path": "tests/pipeline_test.py",
"chars": 1670,
"preview": "#!/usr/bin/env python3\n\"\"\"\nCopyright (c) 2022, Sensity B.V. All rights reserved.\nlicensed under the BSD 3-Clause \"New\" o"
}
]
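The file index above is a JSON array of objects with `path`, `chars`, and `preview` fields. A minimal sketch of working with entries in that shape (the two inlined entries are copied from the index above; loading from a saved file instead of an inline string is an assumption about how you obtained the extraction):

```python
import json
from collections import Counter

# A small inline sample in the same shape as the file-index entries above.
manifest = json.loads("""
[
  {"path": "src/dot/simswap/option.py", "chars": 6883,
   "preview": "#!/usr/bin/env python3\\n\\nimport cv2"},
  {"path": "src/dot/ui/ui.py", "chars": 33376,
   "preview": "#!/usr/bin/env python3\\n"}
]
""")

# Total extracted characters across the indexed files.
total_chars = sum(entry["chars"] for entry in manifest)

# Group files by top-level directory for a quick size overview.
by_dir = Counter(entry["path"].split("/")[0] for entry in manifest)

print(total_chars)   # 40259 for these two sample entries
print(dict(by_dir))  # {'src': 2}
```

The same two lines scale to the full 127-entry index if you load the whole array instead of the sample.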
About this extraction
This page contains the full source code of the sensity-ai/dot GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction covers 127 files (490.8 KB, approximately 152.7k tokens) and includes a symbol index of 552 extracted functions, classes, methods, constants, and types. Extracted by GitExtract.