Repository: nolze/msoffcrypto-tool
Branch: master
Commit: 6d9e72c58de2
Files: 62
Total size: 184.3 KB
Directory structure:
gitextract_ab9olciu/
├── .github/
│   ├── SECURITY.md
│   └── workflows/
│       └── ci.yaml
├── .gitignore
├── .readthedocs.yml
├── CHANGELOG.md
├── LICENSE.txt
├── NOTICE.txt
├── README.md
├── docs/
│   ├── Makefile
│   ├── cli.rst
│   ├── conf.py
│   ├── index.rst
│   ├── make.bat
│   ├── modules.rst
│   ├── msoffcrypto.exceptions.rst
│   ├── msoffcrypto.format.rst
│   ├── msoffcrypto.method.container.rst
│   ├── msoffcrypto.method.rst
│   ├── msoffcrypto.rst
│   └── requirements.txt
├── msoffcrypto/
│   ├── __init__.py
│   ├── __main__.py
│   ├── exceptions/
│   │   └── __init__.py
│   ├── format/
│   │   ├── __init__.py
│   │   ├── base.py
│   │   ├── common.py
│   │   ├── doc97.py
│   │   ├── ooxml.py
│   │   ├── ppt97.py
│   │   └── xls97.py
│   └── method/
│       ├── __init__.py
│       ├── container/
│       │   ├── __init__.py
│       │   └── ecma376_encrypted.py
│       ├── ecma376_agile.py
│       ├── ecma376_extensible.py
│       ├── ecma376_standard.py
│       ├── rc4.py
│       ├── rc4_cryptoapi.py
│       └── xor_obfuscation.py
├── pyproject.toml
└── tests/
    ├── __init__.py
    ├── inputs/
    │   ├── ecma376standard_password.docx
    │   ├── example_password.docx
    │   ├── example_password.xlsx
    │   ├── plain.doc
    │   ├── plain.ppt
    │   ├── plain.xls
    │   ├── rc4cryptoapi_password.doc
    │   ├── rc4cryptoapi_password.ppt
    │   ├── rc4cryptoapi_password.xls
    │   └── xor_password_123456789012345.xls
    ├── outputs/
    │   ├── ecma376standard_password_plain.docx
    │   ├── example.docx
    │   ├── example.xlsx
    │   ├── rc4cryptoapi_password_plain.doc
    │   ├── rc4cryptoapi_password_plain.ppt
    │   ├── rc4cryptoapi_password_plain.xls
    │   └── xor_password_123456789012345_plain.xls
    ├── test_cli.py
    ├── test_cli.sh
    ├── test_compare_known_output.py
    └── test_file_handle.py
================================================
FILE CONTENTS
================================================
================================================
FILE: .github/SECURITY.md
================================================
# Security Policy
## Reporting a Vulnerability
To report a security vulnerability, please use the
[Tidelift security contact](https://tidelift.com/security).
Tidelift will coordinate the fix and disclosure.
================================================
FILE: .github/workflows/ci.yaml
================================================
name: build
on:
  push:
    # branches: [$default-branch]
    branches: ["master"]
    tags: ["*"]
  pull_request:
    # branches: [$default-branch]
    branches: ["master"]
jobs:
  # https://srz-zumix.blogspot.com/2019/10/github-actions-ci-skip.html
  prepare:
    runs-on: ubuntu-latest
    if: "! contains(github.event.head_commit.message, '[skip ci]')"
    steps:
      - run: echo "[skip ci] ${{ contains(github.event.head_commit.message, '[skip ci]') }}"
      - run: echo "[github.ref] ${{ github.ref }}"
  build:
    needs: ["prepare"]
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: true
      matrix:
        os: ["ubuntu-latest", "macos-latest", "windows-latest"]
        python-version: ["3.10", "3.11", "3.12", "3.13", "3.14"]
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v3
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install poetry and codecov
        run: |
          python -m pip install --upgrade pip
          python -m pip install poetry codecov
      - name: Install dependencies
        run: |
          poetry install --no-interaction
      - name: Test with pytest
        run: |
          poetry run coverage run -m pytest -v
          codecov
  publish:
    needs: ["build"]
    if: "success() && startsWith(github.ref, 'refs/tags')"
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v3
        with:
          python-version: "3.x"
      - name: Install poetry
        run: |
          python -m pip install --upgrade pip
          python -m pip install poetry
      - name: Build and publish package
        run: |
          poetry config pypi-token.pypi "${{ secrets.PYPI_API_TOKEN }}"
          poetry publish --no-interaction --build
================================================
FILE: .gitignore
================================================
docs/_static/
docs/_templates/
docs/_build/
### https://raw.github.com/github/gitignore/4bff4a2986af526650f1d329d97047dc1fa87599/Python.gitignore
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/
# Translations
*.mo
*.pot
# Django stuff:
*.log
.static_storage/
.media/
local_settings.py
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
target/
# Jupyter Notebook
.ipynb_checkpoints
# pyenv
.python-version
# celery beat schedule file
celerybeat-schedule
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
### https://raw.github.com/github/gitignore/4bff4a2986af526650f1d329d97047dc1fa87599/Global/macOS.gitignore
# General
.DS_Store
.AppleDouble
.LSOverride
# Icon must end with two \r
Icon
# Thumbnails
._*
# Files that might appear in the root of a volume
.DocumentRevisions-V100
.fseventsd
.Spotlight-V100
.TemporaryItems
.Trashes
.VolumeIcon.icns
.com.apple.timemachine.donotpresent
# Directories potentially created on remote AFP share
.AppleDB
.AppleDesktop
Network Trash Folder
Temporary Items
.apdisk
### https://raw.github.com/github/gitignore/4bff4a2986af526650f1d329d97047dc1fa87599/Global/Windows.gitignore
# Windows thumbnail cache files
Thumbs.db
ehthumbs.db
ehthumbs_vista.db
# Dump file
*.stackdump
# Folder config file
[Dd]esktop.ini
# Recycle Bin used on file shares
$RECYCLE.BIN/
# Windows Installer files
*.cab
*.msi
*.msm
*.msp
# Windows shortcuts
*.lnk
================================================
FILE: .readthedocs.yml
================================================
version: 2
sphinx:
  configuration: docs/conf.py
build:
  os: ubuntu-22.04
  tools:
    python: "3.11"
python:
  install:
    - requirements: docs/requirements.txt
    - method: pip
      path: .
================================================
FILE: CHANGELOG.md
================================================
v6.0.0 / 2026-01-12
===================
* (BREAKING) Drop support for Python 3.8 and 3.9, add Python 3.14 to CI
* Update dependencies
* Clarify error messages
v5.4.2 / 2024-08-09
===================
* Fix DeprecationWarning from cryptography library (reported by @dennn11, [#92](https://github.com/nolze/msoffcrypto-tool/issues/92))
v5.4.1 / 2024-05-25
===================
* Fix for incorrect key size with 0 length keySize var (@UserJHansen, [#89](https://github.com/nolze/msoffcrypto-tool/pull/89))
v5.4.0 / 2024-05-02
===================
* Never return None in ooxml's \_parseinfo (@gdesmar, [#88](https://github.com/nolze/msoffcrypto-tool/pull/88))
v5.3.1 / 2024-01-19
===================
* Bug fixes
v5.3.0 / 2024-01-19
===================
* Add support for OOXML encryption, a port from the C++ library https://github.com/herumi/msoffice (@stephane-rouleau, [#86](https://github.com/nolze/msoffcrypto-tool/pull/86))
v5.2.0 / 2024-01-06
===================
* Support XOR Obfuscation decryption for .xls documents (@DissectMalware, [#80](https://github.com/nolze/msoffcrypto-tool/pull/80))
* Bug fixes
v5.1.1 / 2023-07-20
===================
* Drop Python 3.7 support as it reaches EOL, Add Python 3.11 to CI environments
* Get the version in `__main__.py` instead of `__init__.py` to avoid a relevant error in PyInstaller/cx\_Freeze in which `pkg_resources` does not work by default
v5.1.0 / 2023-07-17
===================
* Load plain OOXML as OfficeFile with type == plain. Fixes [#74](https://github.com/nolze/msoffcrypto-tool/issues/74)
* Use importlib.metadata.version in Python >=3.8 ([#77](https://github.com/nolze/msoffcrypto-tool/issues/77))
5.0.1 / 2023-02-28
===================
* (dev) Switch to GitHub Actions from Travis CI
* Update dependencies, Drop Python 3.6 support
5.0.0 / 2022-01-20
==================
* (dev) Add tests on Python 3.7 to 3.9 ([#71](https://github.com/nolze/msoffcrypto-tool/pull/71))
* (dev) Track poetry.lock ([#71](https://github.com/nolze/msoffcrypto-tool/pull/71))
* (BREAKING) Drop Python 2 support ([#71](https://github.com/nolze/msoffcrypto-tool/pull/71))
* Raise exception if no encryption type is specified ([#70](https://github.com/nolze/msoffcrypto-tool/issues/70))
* Support SHA256, SHA384 hash algorithm (@jackydo, [#67](https://github.com/nolze/msoffcrypto-tool/pull/67))
* Fix errors for unencrypted documents
* Use absolute imports ([#63](https://github.com/nolze/msoffcrypto-tool/pull/63))
4.12.0 / 2021-06-04
===================
* Use custom exceptions ([#59](https://github.com/nolze/msoffcrypto-tool/pull/59))
* (dev) Remove nose (thank you) ([#57](https://github.com/nolze/msoffcrypto-tool/pull/57))
* (dev) Use poetry ([#55](https://github.com/nolze/msoffcrypto-tool/pull/55))
4.11.0 / 2020-09-03
===================
* Improve hash calculation (suggested by @StanislavNikolov)
* Add "verify\_passwd" and "verify\_integrity" option (@jeffli678)
* Make _packUserEditAtom spec-compliant
4.10.2 / 2020-04-08
===================
* Update \_makekey in rc4\_cryptoapi (@doracpphp)
* Fix handling of optional field value in ppt97
* Add tests for is_encrypted() (--test)
* Make Doc97File.is_encrypted() return boolean
================================================
FILE: LICENSE.txt
================================================
MIT License
Copyright (c) 2015 nolze
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
================================================
FILE: NOTICE.txt
================================================
This software contains derivative works from https://github.com/herumi/msoffice
which is licensed under the BSD 3-Clause License.
https://github.com/herumi/msoffice/blob/c3cdb1ea0a5285a2a1718fee2dc893fd884bdad0/COPYRIGHT
Copyright (c) 2007-2015 Cybozu Labs, Inc.
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
Neither the name of the Cybozu Labs, Inc. nor the names of its contributors may
be used to endorse or promote products derived from this software without
specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
THE POSSIBILITY OF SUCH DAMAGE.
================================================
FILE: README.md
================================================
# msoffcrypto-tool
[PyPI](https://pypi.org/project/msoffcrypto-tool/)
[PyPI downloads](https://pypistats.org/packages/msoffcrypto-tool)
[build](https://github.com/nolze/msoffcrypto-tool/actions/workflows/ci.yaml)
[Coverage Status](https://codecov.io/gh/nolze/msoffcrypto-tool)
[Documentation Status](http://msoffcrypto-tool.readthedocs.io/en/latest/?badge=latest)
msoffcrypto-tool is a Python tool and library for decrypting and encrypting MS Office files using a password or other keys.
## Contents
* [Installation](#installation)
* [Examples](#examples)
* [Supported encryption methods](#supported-encryption-methods)
* [Tests](#tests)
* [Todo](#todo)
* [Resources](#resources)
* [Alternatives](#alternatives)
* [Use cases and mentions](#use-cases-and-mentions)
* [Contributors](#contributors)
* [Credits](#credits)
## Installation
```
pip install msoffcrypto-tool
```
## Examples
### As CLI tool (with password)
#### Decryption
Specify the password with the `-p` flag:
```
msoffcrypto-tool encrypted.docx decrypted.docx -p Passw0rd
```
You are prompted for the password if you omit the value of the `-p` argument:
```bash
$ msoffcrypto-tool encrypted.docx decrypted.docx -p
Password:
```
To check whether the file is encrypted, use the `-t` (`--test`) flag:
```
msoffcrypto-tool document.doc --test -v
```
It returns `1` if the file is encrypted, `0` if not.
#### Encryption (OOXML only, experimental)
> [!IMPORTANT]
> The encryption feature is experimental. Please use it at your own risk.
To password-protect a document, use the `-e` flag along with the `-p` flag:
```
msoffcrypto-tool -e -p Passw0rd plain.docx encrypted.docx
```
### As library
The library functions support passwords as well as additional key types.
#### Decryption
Basic usage:
```python
import msoffcrypto

# Keep the encrypted file open until decryption has finished
with open("encrypted.docx", "rb") as encrypted:
    file = msoffcrypto.OfficeFile(encrypted)

    file.load_key(password="Passw0rd")  # Use password

    with open("decrypted.docx", "wb") as f:
        file.decrypt(f)
```
In-memory:
```python
import io

import msoffcrypto
import pandas as pd

# Buffer that receives the decrypted workbook
decrypted = io.BytesIO()

with open("encrypted.xlsx", "rb") as f:
    file = msoffcrypto.OfficeFile(f)
    file.load_key(password="Passw0rd")  # Use password
    file.decrypt(decrypted)

df = pd.read_excel(decrypted)
print(df)
```
Advanced usage:
```python
import binascii

# `file` is an msoffcrypto.OfficeFile instance, created as in the basic example

# Verify password before decryption (default: False)
# The ECMA-376 Agile/Standard crypto system allows one to know whether the supplied password is correct before actually decrypting the file
# Currently, the verify_password option is only meaningful for ECMA-376 Agile/Standard Encryption
file.load_key(password="Passw0rd", verify_password=True)

# Use private key
file.load_key(private_key=open("priv.pem", "rb"))

# Use intermediate key (secretKey)
file.load_key(secret_key=binascii.unhexlify("AE8C36E68B4BB9EA46E5544A5FDB6693875B2FDE1507CBC65C8BCF99E25C2562"))

# Check the HMAC of the data payload before decryption (default: False)
# Currently, the verify_integrity option is only meaningful for ECMA-376 Agile Encryption
file.decrypt(open("decrypted.docx", "wb"), verify_integrity=True)
```
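The `secret_key` argument in the snippet above is passed as raw bytes, not as a hex string. The following is a minimal sketch of that conversion with an explicit error for malformed input; `parse_secret_key` is a hypothetical helper for illustration and is not part of msoffcrypto.

```python
import binascii


def parse_secret_key(hex_key: str) -> bytes:
    """Decode a hex-encoded intermediate key into raw bytes.

    Hypothetical helper, not part of msoffcrypto.
    """
    cleaned = hex_key.strip()
    try:
        return binascii.unhexlify(cleaned)
    except binascii.Error as exc:
        # Raised for odd-length input or non-hexadecimal characters
        raise ValueError("secret key is not valid hexadecimal") from exc


# The example key from the snippet above
key = parse_secret_key("AE8C36E68B4BB9EA46E5544A5FDB6693875B2FDE1507CBC65C8BCF99E25C2562")
print(len(key), "bytes")
```

The resulting `bytes` object is what would be handed to `load_key(secret_key=...)`. The example key is 64 hexadecimal characters, so it decodes to 32 bytes.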
Supported key types are:
- Passwords
- Intermediate keys (optional)
- Private keys used for generating escrow keys (escrow certificates) (optional)
See also ["Backdooring MS Office documents with secret master keys"](https://web.archive.org/web/20171008075059/http://secuinside.com/archive/2015/2015-1-9.pdf) for more information on the key types.
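Since each of the three key types maps to its own `load_key` keyword (`password`, `private_key`, `secret_key`), calling code may want to make sure exactly one of them is supplied before calling the library. The sketch below is one way to do that, assuming the caller wants a strict "exactly one" rule; `build_load_key_kwargs` is a hypothetical helper and not part of msoffcrypto.

```python
from typing import Any, BinaryIO, Dict, Optional


def build_load_key_kwargs(
    password: Optional[str] = None,
    private_key: Optional[BinaryIO] = None,
    secret_key: Optional[bytes] = None,
) -> Dict[str, Any]:
    """Return keyword arguments for load_key(), requiring exactly one key type.

    Hypothetical helper, not part of msoffcrypto.
    """
    candidates = {
        "password": password,
        "private_key": private_key,
        "secret_key": secret_key,
    }
    # Keep only the key types the caller actually provided
    supplied = {name: value for name, value in candidates.items() if value is not None}
    if len(supplied) != 1:
        raise ValueError(
            "supply exactly one of password, private_key, secret_key; "
            f"got {sorted(supplied) or 'none'}"
        )
    return supplied


kwargs = build_load_key_kwargs(password="Passw0rd")
print(kwargs)
```

With an `OfficeFile` instance at hand, the result would be used as `file.load_key(**kwargs)`.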
#### Encryption (OOXML only, experimental)
> [!IMPORTANT]
> The encryption feature is experimental. Please use it at your own risk.
Basic usage:
```python
from msoffcrypto.format.ooxml import OOXMLFile

# Keep the plain file open until encryption has finished
with open("plain.docx", "rb") as plain:
    file = OOXMLFile(plain)

    with open("encrypted.docx", "wb") as f:
        file.encrypt("Passw0rd", f)
```
In-memory:
```python
import io

from msoffcrypto.format.ooxml import OOXMLFile

# Buffer that receives the encrypted document
encrypted = io.BytesIO()

with open("plain.xlsx", "rb") as f:
    file = OOXMLFile(f)
    file.encrypt("Passw0rd", encrypted)

# Do stuff with encrypted buffer; it contains an OLE container with an encrypted stream
...
```
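As the comment in the example notes, the encrypted output is an OLE container, whereas a plain OOXML file is a ZIP archive. A rough way to tell the two apart is to look at the leading bytes: OLE compound files begin with `D0 CF 11 E0 A1 B1 1A E1`, and ZIP archives begin with `PK\x03\x04`. The sketch below only inspects these signatures and does not validate the rest of the file; `sniff_container` is a hypothetical helper and not part of msoffcrypto.

```python
import io
from typing import BinaryIO

OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE compound file signature
ZIP_MAGIC = b"PK\x03\x04"  # ZIP local file header signature


def sniff_container(buffer: BinaryIO) -> str:
    """Return "ole", "zip", or "unknown" based on the leading bytes.

    Hypothetical helper, not part of msoffcrypto.
    """
    position = buffer.tell()
    buffer.seek(0)
    head = buffer.read(8)
    buffer.seek(position)  # restore the caller's position
    if head.startswith(OLE_MAGIC):
        return "ole"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


# Stand-in buffers; a real check would pass the `encrypted` buffer from above
print(sniff_container(io.BytesIO(OLE_MAGIC + b"\x00" * 8)))
print(sniff_container(io.BytesIO(ZIP_MAGIC + b"\x00" * 8)))
```

Note that legacy binary formats (`.doc`, `.xls`, `.ppt`) are also OLE containers, so an `"ole"` result alone does not show that a file is encrypted.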
## Supported encryption methods
### MS-OFFCRYPTO specs
* [x] ECMA-376 (Agile Encryption/Standard Encryption)
  * [x] MS-DOCX (OOXML) (Word 2007-)
  * [x] MS-XLSX (OOXML) (Excel 2007-)
  * [x] MS-PPTX (OOXML) (PowerPoint 2007-)
* [x] Office Binary Document RC4 CryptoAPI
  * [x] MS-DOC (Word 2002, 2003, 2004)
  * [x] MS-XLS ([Excel 2002, 2003, 2007, 2010](https://learn.microsoft.com/en-us/openspecs/office_file_formats/ms-xls/a3ad4e36-ab66-426c-ba91-b84433312068#Appendix_A_22)) (experimental)
  * [x] MS-PPT (PowerPoint 2002, 2003, 2004) (partial, experimental)
* [x] Office Binary Document RC4
  * [x] MS-DOC (Word 97, 98, 2000)
  * [x] MS-XLS (Excel 97, 98, 2000) (experimental)
* [ ] ECMA-376 (Extensible Encryption)
* [x] XOR Obfuscation
  * [x] MS-XLS ([Excel 2002, 2003](https://learn.microsoft.com/en-us/openspecs/office_file_formats/ms-xls/a3ad4e36-ab66-426c-ba91-b84433312068#Appendix_A_21)) (experimental)
  * [ ] MS-DOC (Word 2002, 2003, 2004?)
### Other
* [ ] Word 95 Encryption (Word 95 and prior)
* [ ] Excel 95 Encryption (Excel 95 and prior)
* [ ] PowerPoint 95 Encryption (PowerPoint 95 and prior)
PRs are welcome!
## Tests
With [coverage](https://github.com/nedbat/coveragepy) and [pytest](https://pytest.org/):
```
poetry install
poetry run coverage run -m pytest -v
```
## Todo
* [x] Add tests
* [x] Support decryption with passwords
* [x] Support older encryption schemes
* [x] Add function-level tests
* [x] Add API documents
* [x] Publish to PyPI
* [x] Add decryption tests for various file formats
* [x] Integrate with more comprehensive projects handling MS Office files (such as [oletools](https://github.com/decalage2/oletools/)?) if possible
* [x] Add the password prompt mode for CLI
* [x] Improve error types (v4.12.0)
* [ ] Add type hints
* [ ] Introduce something like `ctypes.Structure`
* [x] Support OOXML encryption
* [ ] Support other encryption
* [ ] Isolate parser
* [ ] Redesign APIs (v6.0.0)
## Resources
* "Backdooring MS Office documents with secret master keys" [http://secuinside.com/archive/2015/2015-1-9.pdf](https://web.archive.org/web/20171008075059/http://secuinside.com/archive/2015/2015-1-9.pdf)
* Technical Documents <https://msdn.microsoft.com/en-us/library/cc313105.aspx>
* [MS-OFFCRYPTO] Agile Encryption <https://msdn.microsoft.com/en-us/library/dd949735(v=office.12).aspx>
* [MS-OFFDI] Microsoft Office File Format Documentation Introduction <https://learn.microsoft.com/en-us/openspecs/office_file_formats/ms-offdi/24ed256c-eb5b-494e-b4f6-fb696ad2b4dc>
* LibreOffice/core <https://github.com/LibreOffice/core>
* LibreOffice/mso-dumper <https://github.com/LibreOffice/mso-dumper>
* wvDecrypt <http://www.skynet.ie/~caolan/Packages/wvDecrypt.html>
* Microsoft Office password protection - Wikipedia <https://en.wikipedia.org/wiki/Microsoft_Office_password_protection#History_of_Microsoft_Encryption_password>
* office2john.py <https://github.com/magnumripper/JohnTheRipper/blob/bleeding-jumbo/run/office2john.py>
## Alternatives
* herumi/msoffice <https://github.com/herumi/msoffice>
* DocRecrypt <https://blogs.technet.microsoft.com/office_resource_kit/2013/01/23/now-you-can-reset-or-remove-a-password-from-a-word-excel-or-powerpoint-filewith-office-2013/>
* Apache POI - the Java API for Microsoft Documents <https://poi.apache.org/>
## Use cases and mentions
### General
* <https://repology.org/project/python:msoffcrypto-tool/versions> (kudos to maintainers!)
<!-- * <https://checkroth.com/unlocking-password-protected-files.html> (outdated) -->
### Corporate
* Workato <https://docs.workato.com/connectors/python.html#supported-features> <!-- https://web.archive.org/web/20240525062245/https://docs.workato.com/connectors/python.html#supported-features -->
* Check Point <https://www.checkpoint.com/about-us/copyright-and-trademarks/> <!-- https://web.archive.org/web/20230326071230/https://www.checkpoint.com/about-us/copyright-and-trademarks/ -->
### Malware/maldoc analysis
* <https://github.com/jbremer/sflock/commit/3f6a96abe1dbb4405e4fb7fd0d16863f634b09fb>
* <https://isc.sans.edu/forums/diary/Video+Analyzing+Encrypted+Malicious+Office+Documents/24572/>
### CTF
* <https://github.com/shombo/cyberstakes-writeps-2018/tree/master/word_up>
* <https://github.com/willi123yao/Cyberthon2020_Writeups/blob/master/csit/Lost_Magic>
### In other languages
* <https://github.com/dtjohnson/xlsx-populate>
* <https://github.com/opendocument-app/OpenDocument.core/blob/233663b039/src/internal/ooxml/ooxml_crypto.h>
* <https://github.com/jaydadhania08/PHPDecryptXLSXWithPassword>
* <https://github.com/epicentre-msf/rpxl>
### In publications
* [Excel、データ整理&分析、画像処理の自動化ワザを完全網羅! 超速Python仕事術大全](https://books.google.co.jp/books?id=TBdVEAAAQBAJ&q=msoffcrypto) (伊沢剛, 2022)
* ["Analyse de documents malveillants en 2021"](https://twitter.com/decalage2/status/1435255507846053889), MISC Hors-série N° 24, "Reverse engineering : apprenez à analyser des binaires" (Lagadec Philippe, 2021)
* [シゴトがはかどる Python自動処理の教科書](https://books.google.co.jp/books?id=XEYUEAAAQBAJ&q=msoffcrypto) (クジラ飛行机, 2020)
## Contributors
* <https://github.com/nolze/msoffcrypto-tool/graphs/contributors>
## Credits
* The sample file for XOR Obfuscation is from: <https://github.com/openwall/john-samples/tree/main/Office/Office_Secrets>
================================================
FILE: docs/Makefile
================================================
# Minimal makefile for Sphinx documentation
#
# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build
# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
.PHONY: help Makefile
# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
================================================
FILE: docs/cli.rst
================================================
Command-line interface
======================
.. toctree::

.. autoprogram:: msoffcrypto.__main__:parser
   :prog: msoffcrypto-tool
================================================
FILE: docs/conf.py
================================================
# Configuration file for the Sphinx documentation builder.
#
# For the full list of built-in configuration values, see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html
# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information
project = "msoffcrypto-tool"
copyright = "nolze"
author = "nolze"
version = ""
release = version
# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
import os
import sys
sys.path.insert(0, os.path.abspath("../"))
extensions = [
    "sphinx.ext.autodoc",
    "sphinxcontrib.autoprogram",
    "sphinx.ext.napoleon",
    "sphinx.ext.viewcode",
    "myst_parser",
]
templates_path = ["_templates"]
exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]
# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output
html_theme = "furo"
html_static_path = ["_static"]
# html_title = "<project> <version>"
html_title = "msoffcrypto-tool"
html_theme_options = {
    "footer_icons": [
        {
            "name": "GitHub",
            "url": "https://github.com/nolze/msoffcrypto-tool",
            "html": """
<svg stroke="currentColor" fill="currentColor" stroke-width="0" viewBox="0 0 16 16">
<path fill-rule="evenodd" d="M8 0C3.58 0 0 3.58 0 8c0 3.54 2.29 6.53 5.47 7.59.4.07.55-.17.55-.38 0-.19-.01-.82-.01-1.49-2.01.37-2.53-.49-2.69-.94-.09-.23-.48-.94-.82-1.13-.28-.15-.68-.52-.01-.53.63-.01 1.08.58 1.23.82.72 1.21 1.87.87 2.33.66.07-.52.28-.87.51-1.07-1.78-.2-3.64-.89-3.64-3.95 0-.87.31-1.59.82-2.15-.08-.2-.36-1.02.08-2.12 0 0 .67-.21 2.2.82.64-.18 1.32-.27 2-.27.68 0 1.36.09 2 .27 1.53-1.04 2.2-.82 2.2-.82.44 1.1.16 1.92.08 2.12.51.56.82 1.27.82 2.15 0 3.07-1.87 3.75-3.65 3.95.29.25.54.73.54 1.48 0 1.07-.01 1.93-.01 2.2 0 .21.15.46.55.38A8.013 8.013 0 0 0 16 8c0-4.42-3.58-8-8-8z"></path>
</svg>
            """,
            "class": "",
        },
    ],
}
myst_enable_extensions = ["tasklist"]
================================================
FILE: docs/index.rst
================================================
.. msoffcrypto-tool documentation master file, created by
   sphinx-quickstart on Tue Oct 17 02:16:54 2023.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

msoffcrypto-tool
================

.. include:: ../README.md
   :parser: myst_parser.sphinx_
   :start-after: msoffcrypto-tool

.. toctree::
   :hidden:
   :maxdepth: 2
   :caption: Contents:

   cli
   msoffcrypto

.. * :ref:`genindex`
.. * :ref:`modindex`
.. * :ref:`search`
================================================
FILE: docs/make.bat
================================================
@ECHO OFF
pushd %~dp0
REM Command file for Sphinx documentation
if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=.
set BUILDDIR=_build
%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
echo.
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
echo.installed, then set the SPHINXBUILD environment variable to point
echo.to the full path of the 'sphinx-build' executable. Alternatively you
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.https://www.sphinx-doc.org/
exit /b 1
)
if "%1" == "" goto help
%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end
:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
:end
popd
================================================
FILE: docs/modules.rst
================================================
msoffcrypto
===========
.. toctree::
   :maxdepth: 1

   msoffcrypto
================================================
FILE: docs/msoffcrypto.exceptions.rst
================================================
msoffcrypto.exceptions package
==============================
Module contents
---------------
.. automodule:: msoffcrypto.exceptions
   :members:
   :undoc-members:
   :show-inheritance:
================================================
FILE: docs/msoffcrypto.format.rst
================================================
msoffcrypto.format package
==========================
Submodules
----------

msoffcrypto.format.base module
------------------------------

.. automodule:: msoffcrypto.format.base
   :members:
   :undoc-members:
   :show-inheritance:

msoffcrypto.format.common module
--------------------------------

.. automodule:: msoffcrypto.format.common
   :members:
   :undoc-members:
   :show-inheritance:

msoffcrypto.format.doc97 module
-------------------------------

.. automodule:: msoffcrypto.format.doc97
   :members:
   :undoc-members:
   :show-inheritance:

msoffcrypto.format.ooxml module
-------------------------------

.. automodule:: msoffcrypto.format.ooxml
   :members:
   :undoc-members:
   :show-inheritance:

msoffcrypto.format.ppt97 module
-------------------------------

.. automodule:: msoffcrypto.format.ppt97
   :members:
   :undoc-members:
   :show-inheritance:

msoffcrypto.format.xls97 module
-------------------------------

.. automodule:: msoffcrypto.format.xls97
   :members:
   :undoc-members:
   :show-inheritance:

Module contents
---------------

.. automodule:: msoffcrypto.format
   :members:
   :undoc-members:
   :show-inheritance:
================================================
FILE: docs/msoffcrypto.method.container.rst
================================================
msoffcrypto.method.container package
====================================
Submodules
----------

msoffcrypto.method.container.ecma376\_encrypted module
------------------------------------------------------

.. automodule:: msoffcrypto.method.container.ecma376_encrypted
   :members:
   :undoc-members:
   :show-inheritance:

Module contents
---------------

.. automodule:: msoffcrypto.method.container
   :members:
   :undoc-members:
   :show-inheritance:
================================================
FILE: docs/msoffcrypto.method.rst
================================================
msoffcrypto.method package
==========================
Subpackages
-----------

.. toctree::
   :maxdepth: 1

   msoffcrypto.method.container

Submodules
----------

msoffcrypto.method.ecma376\_agile module
----------------------------------------

.. automodule:: msoffcrypto.method.ecma376_agile
   :members:
   :undoc-members:
   :show-inheritance:

msoffcrypto.method.ecma376\_extensible module
---------------------------------------------

.. automodule:: msoffcrypto.method.ecma376_extensible
   :members:
   :undoc-members:
   :show-inheritance:

msoffcrypto.method.ecma376\_standard module
-------------------------------------------

.. automodule:: msoffcrypto.method.ecma376_standard
   :members:
   :undoc-members:
   :show-inheritance:

msoffcrypto.method.rc4 module
-----------------------------

.. automodule:: msoffcrypto.method.rc4
   :members:
   :undoc-members:
   :show-inheritance:

msoffcrypto.method.rc4\_cryptoapi module
----------------------------------------

.. automodule:: msoffcrypto.method.rc4_cryptoapi
   :members:
   :undoc-members:
   :show-inheritance:

msoffcrypto.method.xor\_obfuscation module
------------------------------------------

.. automodule:: msoffcrypto.method.xor_obfuscation
   :members:
   :undoc-members:
   :show-inheritance:

Module contents
---------------

.. automodule:: msoffcrypto.method
   :members:
   :undoc-members:
   :show-inheritance:
================================================
FILE: docs/msoffcrypto.rst
================================================
msoffcrypto package
===================
Subpackages
-----------

.. toctree::
   :maxdepth: 1

   msoffcrypto.exceptions
   msoffcrypto.format
   msoffcrypto.method

Module contents
---------------

.. automodule:: msoffcrypto
   :members:
   :undoc-members:
   :show-inheritance:
================================================
FILE: docs/requirements.txt
================================================
accessible-pygments==0.0.5 ; python_version >= "3.10" and python_version < "4.0"
alabaster==1.0.0 ; python_version >= "3.10" and python_version < "4.0"
anyio==4.12.1 ; python_version >= "3.10" and python_version < "4.0"
babel==2.17.0 ; python_version >= "3.10" and python_version < "4.0"
beautifulsoup4==4.14.3 ; python_version >= "3.10" and python_version < "4.0"
certifi==2026.1.4 ; python_version >= "3.10" and python_version < "4.0"
charset-normalizer==3.4.4 ; python_version >= "3.10" and python_version < "4.0"
click==8.3.1 ; python_version >= "3.10" and python_version < "4.0"
colorama==0.4.6 ; python_version >= "3.10" and python_version < "4.0"
docutils==0.21.2 ; python_version >= "3.10" and python_version < "4.0"
exceptiongroup==1.3.1 ; python_version == "3.10"
furo==2025.12.19 ; python_version >= "3.10" and python_version < "4.0"
h11==0.16.0 ; python_version >= "3.10" and python_version < "4.0"
idna==3.11 ; python_version >= "3.10" and python_version < "4.0"
imagesize==1.4.1 ; python_version >= "3.10" and python_version < "4.0"
jinja2==3.1.6 ; python_version >= "3.10" and python_version < "4.0"
markdown-it-py==3.0.0 ; python_version >= "3.10" and python_version < "4.0"
markupsafe==3.0.3 ; python_version >= "3.10" and python_version < "4.0"
mdit-py-plugins==0.5.0 ; python_version >= "3.10" and python_version < "4.0"
mdurl==0.1.2 ; python_version >= "3.10" and python_version < "4.0"
myst-parser==4.0.1 ; python_version >= "3.10" and python_version < "4.0"
packaging==25.0 ; python_version >= "3.10" and python_version < "4.0"
pygments==2.19.2 ; python_version >= "3.10" and python_version < "4.0"
pyyaml==6.0.3 ; python_version >= "3.10" and python_version < "4.0"
requests==2.32.5 ; python_version >= "3.10" and python_version < "4.0"
snowballstemmer==3.0.1 ; python_version >= "3.10" and python_version < "4.0"
soupsieve==2.8.1 ; python_version >= "3.10" and python_version < "4.0"
sphinx-autobuild==2024.10.2 ; python_version >= "3.10" and python_version < "4.0"
sphinx-basic-ng==1.0.0b2 ; python_version >= "3.10" and python_version < "4.0"
sphinx==8.1.3 ; python_version >= "3.10" and python_version < "4.0"
sphinxcontrib-applehelp==2.0.0 ; python_version >= "3.10" and python_version < "4.0"
sphinxcontrib-autoprogram==0.1.9 ; python_version >= "3.10" and python_version < "4.0"
sphinxcontrib-devhelp==2.0.0 ; python_version >= "3.10" and python_version < "4.0"
sphinxcontrib-htmlhelp==2.1.0 ; python_version >= "3.10" and python_version < "4.0"
sphinxcontrib-jsmath==1.0.1 ; python_version >= "3.10" and python_version < "4.0"
sphinxcontrib-qthelp==2.0.0 ; python_version >= "3.10" and python_version < "4.0"
sphinxcontrib-serializinghtml==2.0.0 ; python_version >= "3.10" and python_version < "4.0"
starlette==0.51.0 ; python_version >= "3.10" and python_version < "4.0"
tomli==2.4.0 ; python_version == "3.10"
typing-extensions==4.15.0 ; python_version >= "3.10" and python_version < "4.0"
urllib3==2.6.3 ; python_version >= "3.10" and python_version < "4.0"
uvicorn==0.40.0 ; python_version >= "3.10" and python_version < "4.0"
watchfiles==1.1.1 ; python_version >= "3.10" and python_version < "4.0"
websockets==16.0 ; python_version >= "3.10" and python_version < "4.0"
================================================
FILE: msoffcrypto/__init__.py
================================================
import zipfile
import olefile
from msoffcrypto import exceptions
def OfficeFile(file):
    """Return an office file object based on the format of the given file.
Args:
file (:obj:`_io.BufferedReader`): Input file.
Returns:
BaseOfficeFile object.
Examples:
>>> with open("tests/inputs/example_password.docx", "rb") as f:
... officefile = OfficeFile(f)
... officefile.keyTypes
('password', 'private_key', 'secret_key')
>>> with open("tests/inputs/example_password.docx", "rb") as f:
... officefile = OfficeFile(f)
... officefile.load_key(password="Password1234_", verify_password=True)
>>> with open("README.md", "rb") as f:
... officefile = OfficeFile(f)
Traceback (most recent call last):
...
msoffcrypto.exceptions.FileFormatError: ...
>>> with open("tests/inputs/example_password.docx", "rb") as f:
... officefile = OfficeFile(f)
... officefile.load_key(password="0000", verify_password=True)
Traceback (most recent call last):
...
msoffcrypto.exceptions.InvalidKeyError: ...
    The given file handle will not be closed, but its file position will most
    likely change.
"""
file.seek(0) # required by isOleFile
if olefile.isOleFile(file):
ole = olefile.OleFileIO(file)
elif zipfile.is_zipfile(file): # Heuristic
from msoffcrypto.format.ooxml import OOXMLFile
return OOXMLFile(file)
else:
raise exceptions.FileFormatError("Unsupported file format")
# TODO: Make format specifiable by option in case of obstruction
# Try this first; see https://github.com/nolze/msoffcrypto-tool/issues/17
if ole.exists("EncryptionInfo"):
from msoffcrypto.format.ooxml import OOXMLFile
return OOXMLFile(file)
# MS-DOC: The WordDocument stream MUST be present in the file.
# https://msdn.microsoft.com/en-us/library/dd926131(v=office.12).aspx
elif ole.exists("wordDocument"):
from msoffcrypto.format.doc97 import Doc97File
return Doc97File(file)
# MS-XLS: A file MUST contain exactly one Workbook Stream, ...
# https://msdn.microsoft.com/en-us/library/dd911009(v=office.12).aspx
elif ole.exists("Workbook"):
from msoffcrypto.format.xls97 import Xls97File
return Xls97File(file)
# MS-PPT: A required stream whose name MUST be "PowerPoint Document".
# https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-ppt/1fc22d56-28f9-4818-bd45-67c2bf721ccf
elif ole.exists("PowerPoint Document"):
from msoffcrypto.format.ppt97 import Ppt97File
return Ppt97File(file)
else:
raise exceptions.FileFormatError("Unrecognized file format")
================================================
FILE: msoffcrypto/__main__.py
================================================
import argparse
import getpass
import logging
import sys
import olefile
from msoffcrypto import OfficeFile, exceptions
from msoffcrypto.format.ooxml import OOXMLFile, _is_ooxml
logger = logging.getLogger(__name__)
logger.addHandler(logging.NullHandler())
def _get_version():
if sys.version_info >= (3, 8):
from importlib import metadata
return metadata.version("msoffcrypto-tool")
else:
import pkg_resources
return pkg_resources.get_distribution("msoffcrypto-tool").version
def ifWIN32SetBinary(io):
if sys.platform == "win32":
import msvcrt
import os
msvcrt.setmode(io.fileno(), os.O_BINARY)
def is_encrypted(file):
r"""
Test if the file is encrypted.
>>> f = open("tests/inputs/plain.doc", "rb")
>>> is_encrypted(f)
False
"""
# TODO: Validate file
if not olefile.isOleFile(file):
return False
file = OfficeFile(file)
return file.is_encrypted()
parser = argparse.ArgumentParser()
group = parser.add_mutually_exclusive_group(required=True)
group.add_argument("-p", "--password", nargs="?", const="", dest="password", help="password text")
group.add_argument("-t", "--test", dest="test_encrypted", action="store_true", help="test if the file is encrypted")
parser.add_argument("-e", dest="encrypt", action="store_true", help="encryption mode (default is false)")
parser.add_argument("-v", dest="verbose", action="store_true", help="print verbose information")
parser.add_argument("infile", nargs="?", type=argparse.FileType("rb"), help="input file")
parser.add_argument("outfile", nargs="?", type=argparse.FileType("wb"), help="output file (if blank, stdout is used)")
def main():
args = parser.parse_args()
if args.verbose:
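        # Remove the NullHandler attached at import time; a fresh NullHandler() instance would never match it.
        logger.handlers = [h for h in logger.handlers if not isinstance(h, logging.NullHandler)]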
logging.basicConfig(level=logging.DEBUG, format="%(message)s")
version = _get_version()
logger.debug("Version: {}".format(version))
if args.test_encrypted:
if not is_encrypted(args.infile):
print("{}: not encrypted".format(args.infile.name), file=sys.stderr)
sys.exit(1)
else:
logger.debug("{}: encrypted".format(args.infile.name))
return
if args.password:
password = args.password
else:
password = getpass.getpass()
if args.outfile is None:
ifWIN32SetBinary(sys.stdout)
        if hasattr(sys.stdout, "buffer"):  # Python 3: write bytes via the binary buffer (absent on Python 2)
args.outfile = sys.stdout.buffer
else:
args.outfile = sys.stdout
if args.encrypt:
if not _is_ooxml(args.infile):
raise exceptions.FileFormatError("Not an OOXML file")
# OOXML is the only format we support for encryption
file = OOXMLFile(args.infile)
file.encrypt(password, args.outfile)
else:
if not olefile.isOleFile(args.infile):
raise exceptions.FileFormatError("Not an OLE file")
file = OfficeFile(args.infile)
file.load_key(password=password)
file.decrypt(args.outfile)
if __name__ == "__main__":
main()
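The `-p` option above is declared with `nargs="?"` and `const=""`, so passing a bare `-p` yields an empty string, which `main()` treats as a request to prompt via `getpass`. The sketch below reproduces only that parsing behaviour; `build_parser` and `needs_prompt` are hypothetical names introduced for illustration and are not part of the tool.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Rebuild just the -p / -t options from the CLI (illustrative subset)."""
    parser = argparse.ArgumentParser(prog="sketch")
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("-p", "--password", nargs="?", const="", dest="password")
    group.add_argument("-t", "--test", dest="test_encrypted", action="store_true")
    return parser


def needs_prompt(argv) -> bool:
    """Return True when the password would have to be asked interactively."""
    args = build_parser().parse_args(argv)
    if args.test_encrypted:
        return False  # test mode never needs a password
    return not args.password  # "" (bare -p) is falsy, so a prompt is needed


print(needs_prompt(["-p"]))
print(needs_prompt(["-p", "secret"]))
print(needs_prompt(["-t"]))
```

Keeping the password off the command line this way avoids leaving it in shell history, at the cost of requiring an interactive terminal.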
================================================
FILE: msoffcrypto/exceptions/__init__.py
================================================
class FileFormatError(Exception):
    """Raised when the format of the given file is unsupported or unrecognized."""
pass
class ParseError(Exception):
"""Raised when the file cannot be parsed correctly."""
pass
class DecryptionError(Exception):
"""Raised when the file cannot be decrypted."""
pass
class EncryptionError(Exception):
"""Raised when the file cannot be encrypted."""
pass
class InvalidKeyError(DecryptionError):
"""Raised when the given password or key is incorrect or cannot be verified."""
pass
================================================
FILE: msoffcrypto/format/__init__.py
================================================
================================================
FILE: msoffcrypto/format/base.py
================================================
import abc
# For 2 and 3 compatibility
# https://stackoverflow.com/questions/35673474/
ABC = abc.ABCMeta("ABC", (object,), {"__slots__": ()})
class BaseOfficeFile(ABC):
def __init__(self):
pass
@abc.abstractmethod
def load_key(self):
pass
@abc.abstractmethod
def decrypt(self, outfile):
pass
@abc.abstractmethod
def is_encrypted(self) -> bool:
pass
================================================
FILE: msoffcrypto/format/common.py
================================================
import io
import logging
from struct import unpack
logger = logging.getLogger(__name__)
logger.addHandler(logging.NullHandler())
# https://msdn.microsoft.com/en-us/library/dd926359(v=office.12).aspx
def _parse_encryptionheader(blob):
(flags,) = unpack("<I", blob.read(4))
# if mode == 'strict': compare values with spec.
(sizeExtra,) = unpack("<I", blob.read(4))
(algId,) = unpack("<I", blob.read(4))
(algIdHash,) = unpack("<I", blob.read(4))
(keySize,) = unpack("<I", blob.read(4))
(providerType,) = unpack("<I", blob.read(4))
(reserved1,) = unpack("<I", blob.read(4))
(reserved2,) = unpack("<I", blob.read(4))
cspName = blob.read().decode("utf-16le")
header = {
"flags": flags,
"sizeExtra": sizeExtra,
"algId": algId,
"algIdHash": algIdHash,
"keySize": keySize,
"providerType": providerType,
"reserved1": reserved1,
"reserved2": reserved2,
"cspName": cspName,
}
return header
# https://msdn.microsoft.com/en-us/library/dd910568(v=office.12).aspx
def _parse_encryptionverifier(blob, algorithm: str):
(saltSize,) = unpack("<I", blob.read(4))
salt = blob.read(16)
encryptedVerifier = blob.read(16)
(verifierHashSize,) = unpack("<I", blob.read(4))
if algorithm == "RC4":
encryptedVerifierHash = blob.read(20)
elif algorithm == "AES":
encryptedVerifierHash = blob.read(32)
else:
raise ValueError("Invalid algorithm: {}".format(algorithm))
verifier = {
"saltSize": saltSize,
"salt": salt,
"encryptedVerifier": encryptedVerifier,
"verifierHashSize": verifierHashSize,
"encryptedVerifierHash": encryptedVerifierHash,
}
return verifier
def _parse_header_RC4CryptoAPI(encryptionHeader):
_flags = encryptionHeader.read(4) # TODO: Support flags
(headerSize,) = unpack("<I", encryptionHeader.read(4))
logger.debug(headerSize)
blob = io.BytesIO(encryptionHeader.read(headerSize))
header = _parse_encryptionheader(blob)
logger.debug(header)
# NOTE: https://learn.microsoft.com/en-us/openspecs/office_file_formats/ms-offcrypto/36cfb17f-9b15-4a9b-911a-f401f60b3991
keySize = 0x00000028 if header["keySize"] == 0 else header["keySize"]
blob = io.BytesIO(encryptionHeader.read())
verifier = _parse_encryptionverifier(blob, "RC4") # TODO: Fix (cf. ooxml.py)
logger.debug(verifier)
info = {
"salt": verifier["salt"],
"keySize": keySize,
"encryptedVerifier": verifier["encryptedVerifier"],
"encryptedVerifierHash": verifier["encryptedVerifierHash"],
}
return info
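`_parse_encryptionheader` above reads eight little-endian 32-bit fields followed by a UTF-16LE provider name, and `_parse_header_RC4CryptoAPI` falls back to a 40-bit key when `keySize` is zero. The sketch below shows the same layout decoded with a single `struct` format string. The function names and the sample field values are illustrative assumptions, not data taken from a real document.

```python
import io
from struct import pack, unpack

FIELDS = (
    "flags", "sizeExtra", "algId", "algIdHash",
    "keySize", "providerType", "reserved1", "reserved2",
)


def parse_encryption_header(blob) -> dict:
    """Decode the fixed 32-byte prefix, then the trailing UTF-16LE CSP name."""
    values = unpack("<8I", blob.read(32))
    header = dict(zip(FIELDS, values))
    header["cspName"] = blob.read().decode("utf-16le")
    return header


def effective_rc4_key_size(header: dict) -> int:
    """Apply the same default as the source: keySize 0 means 0x28 (40) bits."""
    return 0x28 if header["keySize"] == 0 else header["keySize"]


# Synthetic header for illustration only (values are assumptions).
sample = pack("<8I", 0x04, 0, 0x6801, 0x8004, 0, 1, 0, 0)
sample += "Example CSP\x00".encode("utf-16le")

header = parse_encryption_header(io.BytesIO(sample))
print(hex(header["algId"]), effective_rc4_key_size(header))
```

Reading all eight integers at once is equivalent to the eight separate `unpack("<I", ...)` calls, but it raises `struct.error` up front if fewer than 32 bytes are available.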
================================================
FILE: msoffcrypto/format/doc97.py
================================================
import io
import logging
import shutil
import tempfile
from collections import namedtuple
from struct import pack, unpack, unpack_from
import olefile
from msoffcrypto import exceptions
from msoffcrypto.format import base
from msoffcrypto.format.common import _parse_header_RC4CryptoAPI
from msoffcrypto.method.rc4 import DocumentRC4
from msoffcrypto.method.rc4_cryptoapi import DocumentRC4CryptoAPI
logger = logging.getLogger(__name__)
logger.addHandler(logging.NullHandler())
FibBase = namedtuple(
"FibBase",
[
"wIdent",
"nFib",
"unused",
"lid",
"pnNext",
"fDot",
"fGlsy",
"fComplex",
"fHasPic",
"cQuickSaves",
"fEncrypted",
"fWhichTblStm",
"fReadOnlyRecommended",
"fWriteReservation",
"fExtChar",
"fLoadOverride",
"fFarEast",
"nFibBack",
"fObfuscation",
"IKey",
"envr",
"fMac",
"fEmptySpecial",
"fLoadOverridePage",
"reserved1",
"reserved2",
"fSpare0",
"reserved3",
"reserved4",
"reserved5",
"reserved6",
],
)
def _parseFibBase(blob):
r"""
    Parse FibBase binary blob.
>>> blob = io.BytesIO(b'\xec\xa5\xc1\x00G\x00\t\x04\x00\x00\x00\x13\xbf\x004\x00\
... \x00\x00\x00\x10\x00\x00\x00\x00\x00\x04\x00\x00\x16\x04\x00\x00')
>>> fibbase = _parseFibBase(blob)
>>> hex(fibbase.wIdent)
'0xa5ec'
>>> hex(fibbase.nFib)
'0xc1'
>>> hex(fibbase.fExtChar)
'0x1'
"""
getBit = lambda bits, i: (bits & (1 << i)) >> i
    getBitSlice = lambda bits, i, w: (bits & ((2**w - 1) << i)) >> i
# https://msdn.microsoft.com/en-us/library/dd944620(v=office.12).aspx
(buf,) = unpack_from("<H", blob.read(2))
wIdent = buf
(buf,) = unpack_from("<H", blob.read(2))
nFib = buf
(buf,) = unpack_from("<H", blob.read(2))
unused = buf
(buf,) = unpack_from("<H", blob.read(2))
lid = buf
(buf,) = unpack_from("<H", blob.read(2))
pnNext = buf
(buf,) = unpack_from("<H", blob.read(2))
fDot = getBit(buf, 0)
fGlsy = getBit(buf, 1)
fComplex = getBit(buf, 2)
fHasPic = getBit(buf, 3)
cQuickSaves = getBitSlice(buf, 4, 4)
fEncrypted = getBit(buf, 8)
fWhichTblStm = getBit(buf, 9)
fReadOnlyRecommended = getBit(buf, 10)
fWriteReservation = getBit(buf, 11)
fExtChar = getBit(buf, 12)
fLoadOverride = getBit(buf, 13)
fFarEast = getBit(buf, 14)
fObfuscation = getBit(buf, 15)
(buf,) = unpack_from("<H", blob.read(2))
nFibBack = buf
(buf,) = unpack_from("<I", blob.read(4))
IKey = buf
(buf,) = unpack_from("<B", blob.read(1))
envr = buf
(buf,) = unpack_from("<B", blob.read(1))
fMac = getBit(buf, 0)
fEmptySpecial = getBit(buf, 1)
fLoadOverridePage = getBit(buf, 2)
reserved1 = getBit(buf, 3)
reserved2 = getBit(buf, 4)
fSpare0 = getBitSlice(buf, 5, 3)
(buf,) = unpack_from("<H", blob.read(2))
reserved3 = buf
(buf,) = unpack_from("<H", blob.read(2))
reserved4 = buf
(buf,) = unpack_from("<I", blob.read(4))
reserved5 = buf
(buf,) = unpack_from("<I", blob.read(4))
reserved6 = buf
fibbase = FibBase(
wIdent=wIdent,
nFib=nFib,
unused=unused,
lid=lid,
pnNext=pnNext,
fDot=fDot,
fGlsy=fGlsy,
fComplex=fComplex,
fHasPic=fHasPic,
cQuickSaves=cQuickSaves,
fEncrypted=fEncrypted,
fWhichTblStm=fWhichTblStm,
fReadOnlyRecommended=fReadOnlyRecommended,
fWriteReservation=fWriteReservation,
fExtChar=fExtChar,
fLoadOverride=fLoadOverride,
fFarEast=fFarEast,
nFibBack=nFibBack,
fObfuscation=fObfuscation,
IKey=IKey,
envr=envr,
fMac=fMac,
fEmptySpecial=fEmptySpecial,
fLoadOverridePage=fLoadOverridePage,
reserved1=reserved1,
reserved2=reserved2,
fSpare0=fSpare0,
reserved3=reserved3,
reserved4=reserved4,
reserved5=reserved5,
reserved6=reserved6,
)
return fibbase
def _packFibBase(fibbase):
setBit = lambda bits, i, v: (bits & ~(1 << i)) | (v << i)
setBitSlice = lambda bits, i, w, v: (bits & ~((2**w - 1) << i)) | (
(v & (2**w - 1)) << i
)
blob = io.BytesIO()
buf = pack("<H", fibbase.wIdent)
blob.write(buf)
buf = pack("<H", fibbase.nFib)
blob.write(buf)
buf = pack("<H", fibbase.unused)
blob.write(buf)
buf = pack("<H", fibbase.lid)
blob.write(buf)
buf = pack("<H", fibbase.pnNext)
blob.write(buf)
_buf = 0xFFFF
_buf = setBit(_buf, 0, fibbase.fDot)
_buf = setBit(_buf, 1, fibbase.fGlsy)
_buf = setBit(_buf, 2, fibbase.fComplex)
_buf = setBit(_buf, 3, fibbase.fHasPic)
_buf = setBitSlice(_buf, 4, 4, fibbase.cQuickSaves)
_buf = setBit(_buf, 8, fibbase.fEncrypted)
_buf = setBit(_buf, 9, fibbase.fWhichTblStm)
_buf = setBit(_buf, 10, fibbase.fReadOnlyRecommended)
_buf = setBit(_buf, 11, fibbase.fWriteReservation)
_buf = setBit(_buf, 12, fibbase.fExtChar)
_buf = setBit(_buf, 13, fibbase.fLoadOverride)
_buf = setBit(_buf, 14, fibbase.fFarEast)
_buf = setBit(_buf, 15, fibbase.fObfuscation)
buf = pack("<H", _buf)
blob.write(buf)
buf = pack("<H", fibbase.nFibBack)
blob.write(buf)
buf = pack("<I", fibbase.IKey)
blob.write(buf)
buf = pack("<B", fibbase.envr)
blob.write(buf)
_buf = 0xFF
_buf = setBit(_buf, 0, fibbase.fMac)
_buf = setBit(_buf, 1, fibbase.fEmptySpecial)
_buf = setBit(_buf, 2, fibbase.fLoadOverridePage)
_buf = setBit(_buf, 3, fibbase.reserved1)
_buf = setBit(_buf, 4, fibbase.reserved2)
_buf = setBitSlice(_buf, 5, 3, fibbase.fSpare0)
buf = pack("<B", _buf)
blob.write(buf)
buf = pack("<H", fibbase.reserved3)
blob.write(buf)
buf = pack("<H", fibbase.reserved4)
blob.write(buf)
buf = pack("<I", fibbase.reserved5)
blob.write(buf)
buf = pack("<I", fibbase.reserved6)
blob.write(buf)
blob.seek(0)
return blob
def _parseFib(blob):
Fib = namedtuple("Fib", ["base"])
fib = Fib(base=_parseFibBase(blob))
return fib
def _parse_header_RC4(encryptionHeader):
# RC4: https://msdn.microsoft.com/en-us/library/dd908560(v=office.12).aspx
salt = encryptionHeader.read(16)
encryptedVerifier = encryptionHeader.read(16)
encryptedVerifierHash = encryptionHeader.read(16)
info = {
"salt": salt,
"encryptedVerifier": encryptedVerifier,
"encryptedVerifierHash": encryptedVerifierHash,
}
return info
class Doc97File(base.BaseOfficeFile):
    """Return an MS-DOC file object.
Examples:
>>> with open("tests/inputs/rc4cryptoapi_password.doc", "rb") as f:
... officefile = Doc97File(f)
... officefile.load_key(password="Password1234_")
>>> with open("tests/inputs/rc4cryptoapi_password.doc", "rb") as f:
... officefile = Doc97File(f)
... officefile.load_key(password="0000")
Traceback (most recent call last):
...
msoffcrypto.exceptions.InvalidKeyError: ...
"""
def __init__(self, file):
self.file = file
ole = olefile.OleFileIO(file) # do not close this, would close file
self.ole = ole
self.format = "doc97"
self.keyTypes = ["password"]
self.key = None
self.salt = None
# https://msdn.microsoft.com/en-us/library/dd944620(v=office.12).aspx
with ole.openstream("wordDocument") as stream:
fib = _parseFib(stream)
# https://msdn.microsoft.com/en-us/library/dd923367(v=office.12).aspx
tablename = "1Table" if fib.base.fWhichTblStm == 1 else "0Table"
Info = namedtuple("Info", ["fib", "tablename"])
self.info = Info(
fib=fib,
tablename=tablename,
)
def load_key(self, password=None):
fib = self.info.fib
logger.debug(
"fEncrypted: {}, fObfuscation: {}".format(
fib.base.fEncrypted, fib.base.fObfuscation
)
)
if fib.base.fEncrypted == 1:
if fib.base.fObfuscation == 1: # Using XOR obfuscation
xor_obf_password_verifier = fib.base.IKey
logger.debug(hex(xor_obf_password_verifier))
else: # elif fib.base.fObfuscation == 0:
encryptionHeader_size = fib.base.IKey
logger.debug(
"encryptionHeader_size: {}".format(hex(encryptionHeader_size))
)
with self.ole.openstream(self.info.tablename) as table:
encryptionHeader = (
table # TODO why create a 2nd reference to same stream?
)
encryptionVersionInfo = table.read(4)
vMajor, vMinor = unpack("<HH", encryptionVersionInfo)
logger.debug("Version: {} {}".format(vMajor, vMinor))
if vMajor == 0x0001 and vMinor == 0x0001: # RC4
info = _parse_header_RC4(encryptionHeader)
if DocumentRC4.verifypw(
password,
info["salt"],
info["encryptedVerifier"],
info["encryptedVerifierHash"],
):
self.type = "rc4"
self.key = password
self.salt = info["salt"]
else:
raise exceptions.InvalidKeyError(
"Failed to verify password"
)
elif (
vMajor in [0x0002, 0x0003, 0x0004] and vMinor == 0x0002
): # RC4 CryptoAPI
info = _parse_header_RC4CryptoAPI(encryptionHeader)
if DocumentRC4CryptoAPI.verifypw(
password,
info["salt"],
info["keySize"],
info["encryptedVerifier"],
info["encryptedVerifierHash"],
):
self.type = "rc4_cryptoapi"
self.key = password
self.salt = info["salt"]
self.keySize = info["keySize"]
else:
raise exceptions.InvalidKeyError(
"Failed to verify password"
)
else:
raise exceptions.DecryptionError(
"Unsupported encryption method"
)
else:
raise exceptions.DecryptionError("File is not encrypted")
def decrypt(self, outfile):
# fd, _outfile_path = tempfile.mkstemp()
# shutil.copyfile(os.path.realpath(self.file.name), _outfile_path)
# outole = olefile.OleFileIO(_outfile_path, write_mode=True)
obuf1 = io.BytesIO()
fibbase = FibBase(
wIdent=self.info.fib.base.wIdent,
nFib=self.info.fib.base.nFib,
unused=self.info.fib.base.unused,
lid=self.info.fib.base.lid,
pnNext=self.info.fib.base.pnNext,
fDot=self.info.fib.base.fDot,
fGlsy=self.info.fib.base.fGlsy,
fComplex=self.info.fib.base.fComplex,
fHasPic=self.info.fib.base.fHasPic,
cQuickSaves=self.info.fib.base.cQuickSaves,
fEncrypted=0,
fWhichTblStm=self.info.fib.base.fWhichTblStm,
fReadOnlyRecommended=self.info.fib.base.fReadOnlyRecommended,
fWriteReservation=self.info.fib.base.fWriteReservation,
fExtChar=self.info.fib.base.fExtChar,
fLoadOverride=self.info.fib.base.fLoadOverride,
fFarEast=self.info.fib.base.fFarEast,
nFibBack=self.info.fib.base.nFibBack,
fObfuscation=0,
IKey=0,
envr=self.info.fib.base.envr,
fMac=self.info.fib.base.fMac,
fEmptySpecial=self.info.fib.base.fEmptySpecial,
fLoadOverridePage=self.info.fib.base.fLoadOverridePage,
reserved1=self.info.fib.base.reserved1,
reserved2=self.info.fib.base.reserved2,
fSpare0=self.info.fib.base.fSpare0,
reserved3=self.info.fib.base.reserved3,
reserved4=self.info.fib.base.reserved4,
reserved5=self.info.fib.base.reserved5,
reserved6=self.info.fib.base.reserved6,
)
FIB_LENGTH = 0x44
header = _packFibBase(fibbase).read()
logger.debug(len(header))
obuf1.seek(0)
obuf1.write(header)
with self.ole.openstream("wordDocument") as worddocument:
worddocument.seek(len(header))
header = worddocument.read(FIB_LENGTH - len(header))
worddocument.seek(0)
logger.debug(len(header))
obuf1.write(header)
if self.type == "rc4":
dec1 = DocumentRC4.decrypt(self.key, self.salt, worddocument)
elif self.type == "rc4_cryptoapi":
dec1 = DocumentRC4CryptoAPI.decrypt(
self.key, self.salt, self.keySize, worddocument
)
else:
raise exceptions.DecryptionError(
"Unsupported encryption method: {}".format(self.type)
)
dec1.seek(FIB_LENGTH)
obuf1.write(dec1.read())
obuf1.seek(0)
# TODO: Preserve header
obuf2 = io.BytesIO()
if self.type == "rc4":
with self.ole.openstream(self.info.tablename) as stream:
dec2 = DocumentRC4.decrypt(self.key, self.salt, stream)
elif self.type == "rc4_cryptoapi":
with self.ole.openstream(self.info.tablename) as stream:
dec2 = DocumentRC4CryptoAPI.decrypt(
self.key, self.salt, self.keySize, stream
)
else:
raise exceptions.DecryptionError(
"Unsupported encryption method: {}".format(self.type)
)
obuf2.write(dec2.read())
obuf2.seek(0)
obuf3 = None
if self.ole.exists("Data"):
obuf3 = io.BytesIO()
if self.type == "rc4":
with self.ole.openstream("Data") as data_stream:
dec3 = DocumentRC4.decrypt(self.key, self.salt, data_stream)
elif self.type == "rc4_cryptoapi":
with self.ole.openstream("Data") as data_stream:
dec3 = DocumentRC4CryptoAPI.decrypt(
self.key, self.salt, self.keySize, data_stream
)
else:
raise exceptions.DecryptionError(
"Unsupported encryption method: {}".format(self.type)
)
obuf3.write(dec3.read())
obuf3.seek(0)
with tempfile.TemporaryFile() as _outfile:
self.file.seek(0)
shutil.copyfileobj(self.file, _outfile)
outole = olefile.OleFileIO(_outfile, write_mode=True)
outole.write_stream("wordDocument", obuf1.read())
outole.write_stream(self.info.tablename, obuf2.read())
if obuf3:
outole.write_stream("Data", obuf3.read())
# _outfile = open(_outfile_path, 'rb')
_outfile.seek(0)
shutil.copyfileobj(_outfile, outfile)
def is_encrypted(self):
r"""
Test if the file is encrypted.
>>> f = open("tests/inputs/plain.doc", "rb")
>>> file = Doc97File(f)
>>> file.is_encrypted()
False
>>> f = open("tests/inputs/rc4cryptoapi_password.doc", "rb")
>>> file = Doc97File(f)
>>> file.is_encrypted()
True
"""
        return self.info.fib.base.fEncrypted == 1
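`_parseFibBase` above extracts single-bit flags and a 4-bit counter from one 16-bit little-endian word, and `is_encrypted` then tests the `fEncrypted` bit. The sketch below isolates that bit-field decoding for a few of the fields. The helper names are hypothetical, and the input words are illustrative values rather than bytes read from a real file.

```python
from struct import unpack


def get_bit(bits: int, i: int) -> int:
    """Return bit i of bits (0 or 1)."""
    return (bits >> i) & 1


def get_bit_slice(bits: int, i: int, width: int) -> int:
    """Return the width-bit field of bits that starts at bit i."""
    return (bits >> i) & ((1 << width) - 1)


def decode_fib_flags(raw: bytes) -> dict:
    """Decode a subset of the FibBase flag word, using the source's bit positions."""
    (word,) = unpack("<H", raw)
    return {
        "cQuickSaves": get_bit_slice(word, 4, 4),
        "fEncrypted": get_bit(word, 8),
        "fWhichTblStm": get_bit(word, 9),
        "fExtChar": get_bit(word, 12),
        "fObfuscation": get_bit(word, 15),
    }


# 0x1300 has bits 8, 9 and 12 set.
print(decode_fib_flags(b"\x00\x13"))
# 0x80F0 has bits 4-7 and bit 15 set.
print(decode_fib_flags(b"\xf0\x80"))
```

Shifting first and masking second gives the same result as the mask-then-shift lambdas in the source, and avoids any question about operator precedence.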
================================================
FILE: msoffcrypto/format/ooxml.py
================================================
import base64
import io
import logging
import zipfile
from struct import unpack
from xml.dom.minidom import parseString
import olefile
from msoffcrypto import exceptions
from msoffcrypto.format import base
from msoffcrypto.format.common import _parse_encryptionheader, _parse_encryptionverifier
from msoffcrypto.method.ecma376_agile import ECMA376Agile
from msoffcrypto.method.ecma376_standard import ECMA376Standard
logger = logging.getLogger(__name__)
logger.addHandler(logging.NullHandler())
def _is_ooxml(file):
if not zipfile.is_zipfile(file):
return False
try:
zfile = zipfile.ZipFile(file)
with zfile.open("[Content_Types].xml") as stream:
xml = parseString(stream.read())
# Heuristic
if (
xml.documentElement.tagName == "Types"
and xml.documentElement.namespaceURI
== "http://schemas.openxmlformats.org/package/2006/content-types"
):
return True
else:
return False
except Exception:
return False
def _parseinfo_standard(ole):
(headerFlags,) = unpack("<I", ole.read(4))
(encryptionHeaderSize,) = unpack("<I", ole.read(4))
block = ole.read(encryptionHeaderSize)
blob = io.BytesIO(block)
header = _parse_encryptionheader(blob)
block = ole.read()
blob = io.BytesIO(block)
algIdMap = {
0x0000660E: "AES-128",
0x0000660F: "AES-192",
0x00006610: "AES-256",
}
verifier = _parse_encryptionverifier(
blob, "AES" if header["algId"] & 0xFF00 == 0x6600 else "RC4"
) # TODO: Fix
info = {
"header": header,
"verifier": verifier,
}
return info
def _parseinfo_agile(ole):
ole.seek(8)
xml = parseString(ole.read())
keyDataSalt = base64.b64decode(
xml.getElementsByTagName("keyData")[0].getAttribute("saltValue")
)
keyDataHashAlgorithm = xml.getElementsByTagName("keyData")[0].getAttribute(
"hashAlgorithm"
)
keyDataBlockSize = int(
xml.getElementsByTagName("keyData")[0].getAttribute("blockSize")
)
encryptedHmacKey = base64.b64decode(
xml.getElementsByTagName("dataIntegrity")[0].getAttribute("encryptedHmacKey")
)
encryptedHmacValue = base64.b64decode(
xml.getElementsByTagName("dataIntegrity")[0].getAttribute("encryptedHmacValue")
)
password_node = xml.getElementsByTagNameNS(
"http://schemas.microsoft.com/office/2006/keyEncryptor/password", "encryptedKey"
)[0]
spinValue = int(password_node.getAttribute("spinCount"))
encryptedKeyValue = base64.b64decode(
password_node.getAttribute("encryptedKeyValue")
)
encryptedVerifierHashInput = base64.b64decode(
password_node.getAttribute("encryptedVerifierHashInput")
)
encryptedVerifierHashValue = base64.b64decode(
password_node.getAttribute("encryptedVerifierHashValue")
)
passwordSalt = base64.b64decode(password_node.getAttribute("saltValue"))
passwordHashAlgorithm = password_node.getAttribute("hashAlgorithm")
passwordKeyBits = int(password_node.getAttribute("keyBits"))
info = {
"keyDataSalt": keyDataSalt,
"keyDataHashAlgorithm": keyDataHashAlgorithm,
"keyDataBlockSize": keyDataBlockSize,
"encryptedHmacKey": encryptedHmacKey,
"encryptedHmacValue": encryptedHmacValue,
"encryptedVerifierHashInput": encryptedVerifierHashInput,
"encryptedVerifierHashValue": encryptedVerifierHashValue,
"encryptedKeyValue": encryptedKeyValue,
"spinValue": spinValue,
"passwordSalt": passwordSalt,
"passwordHashAlgorithm": passwordHashAlgorithm,
"passwordKeyBits": passwordKeyBits,
}
return info
def _parseinfo(ole):
versionMajor, versionMinor = unpack("<HH", ole.read(4))
if versionMajor == 4 and versionMinor == 4: # Agile
return "agile", _parseinfo_agile(ole)
elif versionMajor in [2, 3, 4] and versionMinor == 2: # Standard
return "standard", _parseinfo_standard(ole)
elif versionMajor in [3, 4] and versionMinor == 3: # Extensible
raise exceptions.DecryptionError(
"Unsupported EncryptionInfo version (Extensible Encryption)"
)
raise exceptions.DecryptionError(
"Unsupported EncryptionInfo version ({}:{})".format(versionMajor, versionMinor)
)
class OOXMLFile(base.BaseOfficeFile):
"""Return an OOXML file object.
Examples:
>>> with open("tests/inputs/example_password.docx", "rb") as f:
... officefile = OOXMLFile(f)
... officefile.load_key(password="Password1234_", verify_password=True)
>>> with open("tests/inputs/example_password.docx", "rb") as f:
... officefile = OOXMLFile(f)
... officefile.load_key(password="0000", verify_password=True)
Traceback (most recent call last):
...
msoffcrypto.exceptions.InvalidKeyError: ...
"""
def __init__(self, file):
self.format = "ooxml"
file.seek(0) # TODO: Investigate the effect (required for olefile.isOleFile)
        # olefile cannot process OOXML files that are not password-protected.
        # TODO: this duplicates the detection logic in OfficeFile(). Merge?
if olefile.isOleFile(file):
ole = olefile.OleFileIO(file)
self.file = ole
try:
with self.file.openstream("EncryptionInfo") as stream:
self.type, self.info = _parseinfo(stream)
except IOError:
raise exceptions.FileFormatError(
"Supposed to be an encrypted OOXML file, but no EncryptionInfo stream found"
)
logger.debug("OOXMLFile.type: {}".format(self.type))
self.secret_key = None
if self.type == "agile":
# TODO: Support aliases?
self.keyTypes = ("password", "private_key", "secret_key")
elif self.type == "standard":
self.keyTypes = ("password", "secret_key")
elif self.type == "extensible":
pass
elif _is_ooxml(file):
self.type = "plain"
self.file = file
else:
raise exceptions.FileFormatError("Unsupported file format")
def load_key(
self, password=None, private_key=None, secret_key=None, verify_password=False
):
"""
>>> with open("tests/outputs/ecma376standard_password_plain.docx", "rb") as f:
... officefile = OOXMLFile(f)
... officefile.load_key("1234")
"""
if password:
if self.type == "agile":
self.secret_key = ECMA376Agile.makekey_from_password(
password,
self.info["passwordSalt"],
self.info["passwordHashAlgorithm"],
self.info["encryptedKeyValue"],
self.info["spinValue"],
self.info["passwordKeyBits"],
)
if verify_password:
verified = ECMA376Agile.verify_password(
password,
self.info["passwordSalt"],
self.info["passwordHashAlgorithm"],
self.info["encryptedVerifierHashInput"],
self.info["encryptedVerifierHashValue"],
self.info["spinValue"],
self.info["passwordKeyBits"],
)
if not verified:
raise exceptions.InvalidKeyError("Key verification failed")
elif self.type == "standard":
self.secret_key = ECMA376Standard.makekey_from_password(
password,
self.info["header"]["algId"],
self.info["header"]["algIdHash"],
self.info["header"]["providerType"],
self.info["header"]["keySize"],
self.info["verifier"]["saltSize"],
self.info["verifier"]["salt"],
)
if verify_password:
verified = ECMA376Standard.verifykey(
self.secret_key,
self.info["verifier"]["encryptedVerifier"],
self.info["verifier"]["encryptedVerifierHash"],
)
if not verified:
raise exceptions.InvalidKeyError("Key verification failed")
elif self.type == "extensible":
pass
elif self.type == "plain":
pass
elif private_key:
if self.type == "agile":
self.secret_key = ECMA376Agile.makekey_from_privkey(
private_key, self.info["encryptedKeyValue"]
)
else:
raise exceptions.DecryptionError(
"Unsupported key type for the encryption method"
)
elif secret_key:
self.secret_key = secret_key
else:
raise exceptions.DecryptionError("No key specified")
def decrypt(self, outfile, verify_integrity=False):
"""
>>> from msoffcrypto import exceptions
>>> from io import BytesIO; outfile = BytesIO()
>>> with open("tests/outputs/ecma376standard_password_plain.docx", "rb") as f:
... officefile = OOXMLFile(f)
... officefile.load_key("1234")
... officefile.decrypt(outfile)
Traceback (most recent call last):
msoffcrypto.exceptions.DecryptionError: Document is not encrypted
"""
if self.type == "agile":
with self.file.openstream("EncryptedPackage") as stream:
if verify_integrity:
verified = ECMA376Agile.verify_integrity(
self.secret_key,
self.info["keyDataSalt"],
self.info["keyDataHashAlgorithm"],
self.info["keyDataBlockSize"],
self.info["encryptedHmacKey"],
self.info["encryptedHmacValue"],
stream,
)
if not verified:
raise exceptions.InvalidKeyError(
"Payload integrity verification failed"
)
obuf = ECMA376Agile.decrypt(
self.secret_key,
self.info["keyDataSalt"],
self.info["keyDataHashAlgorithm"],
stream,
)
outfile.write(obuf)
elif self.type == "standard":
with self.file.openstream("EncryptedPackage") as stream:
obuf = ECMA376Standard.decrypt(self.secret_key, stream)
outfile.write(obuf)
elif self.type == "plain":
raise exceptions.DecryptionError("Document is not encrypted")
else:
raise exceptions.DecryptionError("Unsupported encryption method")
# If the file is successfully decrypted, there must be a valid OOXML file, i.e. a valid zip file
if not zipfile.is_zipfile(io.BytesIO(obuf)):
raise exceptions.InvalidKeyError(
"The file could not be decrypted with this password"
)
def encrypt(self, password, outfile):
"""
>>> from msoffcrypto.format.ooxml import OOXMLFile
>>> from io import BytesIO; outfile = BytesIO()
>>> with open("tests/outputs/example.docx", "rb") as f:
... officefile = OOXMLFile(f)
... officefile.encrypt("1234", outfile)
"""
if self.is_encrypted():
raise exceptions.EncryptionError("File is already encrypted")
self.file.seek(0)
buf = ECMA376Agile.encrypt(password, self.file)
if not olefile.isOleFile(buf):
raise exceptions.EncryptionError("Unable to encrypt this file")
outfile.write(buf)
def is_encrypted(self):
"""
>>> with open("tests/inputs/example_password.docx", "rb") as f:
... officefile = OOXMLFile(f)
... officefile.is_encrypted()
True
>>> with open("tests/outputs/ecma376standard_password_plain.docx", "rb") as f:
... officefile = OOXMLFile(f)
... officefile.is_encrypted()
False
"""
# Heuristic: an OOXML document wrapped in an OLE container is treated as encrypted
if self.type == "plain":
return False
elif isinstance(self.file, olefile.OleFileIO):
return True
else:
return False
================================================
FILE: msoffcrypto/format/ppt97.py
================================================
import io
import logging
import shutil
import tempfile
from collections import namedtuple
from struct import pack, unpack
import olefile
from msoffcrypto import exceptions
from msoffcrypto.format import base
from msoffcrypto.format.common import _parse_header_RC4CryptoAPI
from msoffcrypto.method.rc4_cryptoapi import DocumentRC4CryptoAPI
logger = logging.getLogger(__name__)
logger.addHandler(logging.NullHandler())
RecordHeader = namedtuple(
"RecordHeader",
[
"recVer",
"recInstance",
"recType",
"recLen",
],
)
def _parseRecordHeader(blob):
# RecordHeader: https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-ppt/df201194-0cd0-4dfb-bf10-eea353d8eabc
getBitSlice = lambda bits, i, w: (bits & (2**w - 1 << i)) >> i
blob.seek(0)
(buf,) = unpack("<H", blob.read(2))
recVer = getBitSlice(buf, 0, 4)
recInstance = getBitSlice(buf, 4, 12)
(recType,) = unpack("<H", blob.read(2))
(recLen,) = unpack("<I", blob.read(4))
rh = RecordHeader(
recVer=recVer,
recInstance=recInstance,
recType=recType,
recLen=recLen,
)
return rh
def _packRecordHeader(rh):
setBitSlice = lambda bits, i, w, v: (bits & ~((2**w - 1) << i)) | (
(v & (2**w - 1)) << i
)
blob = io.BytesIO()
_buf = 0xFFFF
_buf = setBitSlice(_buf, 0, 4, rh.recVer)
_buf = setBitSlice(_buf, 4, 12, rh.recInstance)
buf = pack("<H", _buf)
blob.write(buf)
buf = pack("<H", rh.recType)
blob.write(buf)
buf = pack("<I", rh.recLen)
blob.write(buf)
blob.seek(0)
return blob
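# Illustration only, not part of the module: a minimal standalone sketch of the
# RecordHeader bit layout handled by _parseRecordHeader/_packRecordHeader above.
# It assumes the layout used there (recVer in the low 4 bits and recInstance in
# the high 12 bits of the first 16-bit word, then recType and recLen).
# The names pack_header and parse_header are hypothetical.

```python
import struct


def pack_header(rec_ver, rec_instance, rec_type, rec_len):
    # First 16-bit word: recVer in bits 0-3, recInstance in bits 4-15.
    first = (rec_ver & 0xF) | ((rec_instance & 0xFFF) << 4)
    # Little-endian: 2-byte word, 2-byte recType, 4-byte recLen (8 bytes total).
    return struct.pack("<HHI", first, rec_type, rec_len)


def parse_header(raw):
    first, rec_type, rec_len = struct.unpack("<HHI", raw[:8])
    return first & 0xF, first >> 4, rec_type, rec_len


# Round trip with the UserEditAtom values asserted in this module.
raw = pack_header(0x0, 0x000, 0x0FF5, 0x20)
assert len(raw) == 8
assert parse_header(raw) == (0x0, 0x000, 0x0FF5, 0x20)
```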
CurrentUserAtom = namedtuple(
"CurrentUserAtom",
[
"rh",
"size",
"headerToken",
"offsetToCurrentEdit",
"lenUserName",
"docFileVersion",
"majorVersion",
"minorVersion",
"unused",
"ansiUserName",
"relVersion",
"unicodeUserName",
],
)
def _parseCurrentUserAtom(blob):
# CurrentUserAtom: https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-ppt/940d5700-e4d7-4fc0-ab48-fed5dbc48bc1
# rh (8 bytes): A RecordHeader structure...
buf = io.BytesIO(blob.read(8))
rh = _parseRecordHeader(buf)
# logger.debug(rh)
# ...Sub-fields are further specified in the following table.
assert rh.recVer == 0x0
assert rh.recInstance == 0x000
assert rh.recType == 0x0FF6
(size,) = unpack("<I", blob.read(4))
# logger.debug(hex(size))
# size (4 bytes): ...It MUST be 0x00000014.
assert size == 0x00000014
# headerToken (4 bytes): An unsigned integer that specifies
# a token used to identify whether the file is encrypted.
(headerToken,) = unpack("<I", blob.read(4))
# TODO: Check headerToken value
(offsetToCurrentEdit,) = unpack("<I", blob.read(4))
(lenUserName,) = unpack("<H", blob.read(2))
(docFileVersion,) = unpack("<H", blob.read(2))
(
majorVersion,
minorVersion,
) = unpack("<BB", blob.read(2))
unused = blob.read(2)
ansiUserName = blob.read(lenUserName)
(relVersion,) = unpack("<I", blob.read(4))
unicodeUserName = blob.read(2 * lenUserName)
return CurrentUserAtom(
rh=rh,
size=size,
headerToken=headerToken,
offsetToCurrentEdit=offsetToCurrentEdit,
lenUserName=lenUserName,
docFileVersion=docFileVersion,
majorVersion=majorVersion,
minorVersion=minorVersion,
unused=unused,
ansiUserName=ansiUserName,
relVersion=relVersion,
unicodeUserName=unicodeUserName,
)
def _packCurrentUserAtom(currentuseratom):
blob = io.BytesIO()
buf = _packRecordHeader(currentuseratom.rh).read()
blob.write(buf)
buf = pack("<I", currentuseratom.size)
blob.write(buf)
buf = pack("<I", currentuseratom.headerToken)
blob.write(buf)
buf = pack("<I", currentuseratom.offsetToCurrentEdit)
blob.write(buf)
buf = pack("<H", currentuseratom.lenUserName)
blob.write(buf)
buf = pack("<H", currentuseratom.docFileVersion)
blob.write(buf)
buf = pack("<BB", currentuseratom.majorVersion, currentuseratom.minorVersion)
blob.write(buf)
buf = currentuseratom.unused
blob.write(buf)
buf = currentuseratom.ansiUserName
blob.write(buf)
buf = pack("<I", currentuseratom.relVersion)
blob.write(buf)
buf = currentuseratom.unicodeUserName
blob.write(buf)
blob.seek(0)
return blob
CurrentUser = namedtuple("CurrentUser", ["currentuseratom"])
def _parseCurrentUser(blob):
# Current User Stream: https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-ppt/76cfa657-07a6-464b-81ab-4c017c611f64
currentuser = CurrentUser(currentuseratom=_parseCurrentUserAtom(blob))
return currentuser
def _packCurrentUser(currentuser):
blob = io.BytesIO()
buf = _packCurrentUserAtom(currentuser.currentuseratom).read()
blob.write(buf)
blob.seek(0)
return blob
UserEditAtom = namedtuple(
"UserEditAtom",
[
"rh",
"lastSlideIdRef",
"version",
"minorVersion",
"majorVersion",
"offsetLastEdit",
"offsetPersistDirectory",
"docPersistIdRef",
"persistIdSeed",
"lastView",
"unused",
"encryptSessionPersistIdRef",
],
)
def _parseUserEditAtom(blob):
# UserEditAtom: https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-ppt/3ffb3fab-95de-4873-98aa-d508fbbac981
# rh (8 bytes): A RecordHeader structure...
buf = io.BytesIO(blob.read(8))
rh = _parseRecordHeader(buf)
# logger.debug(rh)
# ...Sub-fields are further specified in the following table.
assert rh.recVer == 0x0
assert rh.recInstance == 0x000
assert rh.recType == 0x0FF5
assert (
rh.recLen == 0x0000001C or rh.recLen == 0x00000020
) # 0x0000001C, plus 4 bytes when the optional encryptSessionPersistIdRef is present
(lastSlideIdRef,) = unpack("<I", blob.read(4))
(version,) = unpack("<H", blob.read(2))
(
minorVersion,
majorVersion,
) = unpack("<BB", blob.read(2))
# majorVersion, minorVersion, = unpack("<BB", blob.read(2))
(offsetLastEdit,) = unpack("<I", blob.read(4))
(offsetPersistDirectory,) = unpack("<I", blob.read(4))
(docPersistIdRef,) = unpack("<I", blob.read(4))
(persistIdSeed,) = unpack("<I", blob.read(4))
(lastView,) = unpack("<H", blob.read(2))
unused = blob.read(2)
# encryptSessionPersistIdRef (4 bytes): An optional PersistIdRef
# that specifies the value to look up in the persist object directory
# to find the offset of the CryptSession10Container record (section 2.3.7).
buf = blob.read(4)
if len(buf) == 4:
(encryptSessionPersistIdRef,) = unpack("<I", buf)
else:
encryptSessionPersistIdRef = None
return UserEditAtom(
rh=rh,
lastSlideIdRef=lastSlideIdRef,
version=version,
minorVersion=minorVersion,
majorVersion=majorVersion,
offsetLastEdit=offsetLastEdit,
offsetPersistDirectory=offsetPersistDirectory,
docPersistIdRef=docPersistIdRef,
persistIdSeed=persistIdSeed,
lastView=lastView,
unused=unused,
encryptSessionPersistIdRef=encryptSessionPersistIdRef,
)
def _packUserEditAtom(usereditatom):
blob = io.BytesIO()
buf = _packRecordHeader(usereditatom.rh).read()
blob.write(buf)
buf = pack("<I", usereditatom.lastSlideIdRef)
blob.write(buf)
buf = pack("<H", usereditatom.version)
blob.write(buf)
buf = pack("<BB", usereditatom.minorVersion, usereditatom.majorVersion)
blob.write(buf)
buf = pack("<I", usereditatom.offsetLastEdit)
blob.write(buf)
buf = pack("<I", usereditatom.offsetPersistDirectory)
blob.write(buf)
buf = pack("<I", usereditatom.docPersistIdRef)
blob.write(buf)
buf = pack("<I", usereditatom.persistIdSeed)
blob.write(buf)
buf = pack("<H", usereditatom.lastView)
blob.write(buf)
buf = usereditatom.unused
blob.write(buf)
# Optional value
if usereditatom.encryptSessionPersistIdRef is not None:
buf = pack("<I", usereditatom.encryptSessionPersistIdRef)
blob.write(buf)
blob.seek(0)
return blob
PersistDirectoryEntry = namedtuple(
"PersistDirectoryEntry",
[
"persistId",
"cPersist",
"rgPersistOffset",
],
)
def _parsePersistDirectoryEntry(blob):
# PersistDirectoryEntry: https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-ppt/6214b5a6-7ca2-4a86-8a0e-5fd3d3eff1c9
getBitSlice = lambda bits, i, w: (bits & (2**w - 1 << i)) >> i
(buf,) = unpack("<I", blob.read(4))
persistId = getBitSlice(buf, 0, 20)
cPersist = getBitSlice(buf, 20, 12)
# cf. PersistOffsetEntry: https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-ppt/a056484a-2132-4e1e-aa54-6e387f9695cf
size_rgPersistOffset = 4 * cPersist
_rgPersistOffset = blob.read(size_rgPersistOffset)
_rgPersistOffset = io.BytesIO(_rgPersistOffset)
rgPersistOffset = []
pos = 0
while pos < size_rgPersistOffset:
(persistoffsetentry,) = unpack("<I", _rgPersistOffset.read(4))
rgPersistOffset.append(persistoffsetentry)
pos += 4
return PersistDirectoryEntry(
persistId=persistId,
cPersist=cPersist,
rgPersistOffset=rgPersistOffset,
)
def _packPersistDirectoryEntry(directoryentry):
setBitSlice = lambda bits, i, w, v: (bits & ~((2**w - 1) << i)) | (
(v & (2**w - 1)) << i
)
blob = io.BytesIO()
_buf = 0xFFFFFFFF
_buf = setBitSlice(_buf, 0, 20, directoryentry.persistId)
_buf = setBitSlice(_buf, 20, 12, directoryentry.cPersist)
buf = pack("<I", _buf)
blob.write(buf)
for v in directoryentry.rgPersistOffset:
buf = pack("<I", v)
blob.write(buf)
blob.seek(0)
return blob
PersistDirectoryAtom = namedtuple(
"PersistDirectoryAtom",
[
"rh",
"rgPersistDirEntry",
],
)
def _parsePersistDirectoryAtom(blob):
# PersistDirectoryAtom: https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-ppt/d10a093d-860f-409c-b065-aeb24b830505
# rh (8 bytes): A RecordHeader structure...
buf = io.BytesIO(blob.read(8))
rh = _parseRecordHeader(buf)
# logger.debug(rh)
# ...Sub-fields are further specified in the following table.
assert rh.recVer == 0x0
assert rh.recInstance == 0x000
assert rh.recType == 0x1772
_rgPersistDirEntry = blob.read(rh.recLen)
_rgPersistDirEntry = io.BytesIO(_rgPersistDirEntry)
rgPersistDirEntry = []
pos = 0
while pos < rh.recLen:
persistdirectoryentry = _parsePersistDirectoryEntry(_rgPersistDirEntry)
size_persistdirectoryentry = 4 + 4 * len(persistdirectoryentry.rgPersistOffset)
# logger.debug((persistdirectoryentry, size_persistdirectoryentry))
rgPersistDirEntry.append(persistdirectoryentry)
pos += size_persistdirectoryentry
return PersistDirectoryAtom(
rh=rh,
rgPersistDirEntry=rgPersistDirEntry,
)
def _packPersistDirectoryAtom(directoryatom):
blob = io.BytesIO()
buf = _packRecordHeader(directoryatom.rh).read()
blob.write(buf)
for v in directoryatom.rgPersistDirEntry:
buf = _packPersistDirectoryEntry(v)
blob.write(buf.read())
blob.seek(0)
return blob
def _parseCryptSession10Container(blob):
# CryptSession10Container: https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-ppt/b0963334-4408-4621-879a-ef9c54551fd8
CryptSession10Container = namedtuple(
"CryptSession10Container",
[
"rh",
"data",
],
)
# rh (8 bytes): A RecordHeader structure...
buf = io.BytesIO(blob.read(8))
rh = _parseRecordHeader(buf)
# logger.debug(rh)
# ...Sub-fields are further specified in the following table.
assert rh.recVer == 0xF
# The spec requires recInstance == 0x000, but that check fails in practice, so it is disabled:
# assert rh.recInstance == 0x000
assert rh.recType == 0x2F14
data = blob.read(rh.recLen)
return CryptSession10Container(
rh=rh,
data=data,
)
def construct_persistobjectdirectory(data):
# PowerPoint Document Stream: https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-ppt/1fc22d56-28f9-4818-bd45-67c2bf721ccf
# 1. Read the CurrentUserAtom record (section 2.3.2) from the Current User Stream (section 2.1.1). ...
data.currentuser.seek(0)
currentuser = _parseCurrentUser(data.currentuser)
# logger.debug(currentuser)
# 2. Seek, in the PowerPoint Document Stream, to the offset specified by the offsetToCurrentEdit field of
# the CurrentUserAtom record identified in step 1.
data.powerpointdocument.seek(currentuser.currentuseratom.offsetToCurrentEdit)
persistdirectoryatom_stack = []
# The stream MUST contain exactly one UserEditAtom record.
# https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-ppt/b0963334-4408-4621-879a-ef9c54551fd8
for i in range(1):
# 3. Read the UserEditAtom record at the current offset. ...
usereditatom = _parseUserEditAtom(data.powerpointdocument)
# logger.debug(usereditatom)
# 4. Seek to the offset specified by the offsetPersistDirectory field of the UserEditAtom record identified in step 3.
data.powerpointdocument.seek(usereditatom.offsetPersistDirectory)
# 5. Read the PersistDirectoryAtom record at the current offset. ...
persistdirectoryatom = _parsePersistDirectoryAtom(data.powerpointdocument)
# logger.debug(persistdirectoryatom)
persistdirectoryatom_stack.append(persistdirectoryatom)
# 6. Seek to the offset specified by the offsetLastEdit field in the UserEditAtom record identified in step 3.
# 7. Repeat steps 3 through 6 until offsetLastEdit is 0x00000000.
if usereditatom.offsetLastEdit == 0x00000000:
break
else:
data.powerpointdocument.seek(usereditatom.offsetLastEdit)
# 8. Construct the complete persist object directory for this file as follows:
persistobjectdirectory = {}
# 8a. For each PersistDirectoryAtom record previously identified in step 5,
# add the persist object identifier and persist object stream offset pairs to
# the persist object directory starting with the PersistDirectoryAtom record
# last identified, that is, the one closest to the beginning of the stream.
# 8b. Continue adding these pairs to the persist object directory for each PersistDirectoryAtom record
# in the reverse order that they were identified in step 5; that is, the pairs from the PersistDirectoryAtom record
# closest to the end of the stream are added last.
# 8c. When adding a new pair to the persist object directory, if the persist object identifier
# already exists in the persist object directory, the persist object stream offset from
# the new pair replaces the existing persist object stream offset for that persist object identifier.
while len(persistdirectoryatom_stack) > 0:
persistdirectoryatom = persistdirectoryatom_stack.pop()
for entry in persistdirectoryatom.rgPersistDirEntry:
# logger.debug("persistId: %d" % entry.persistId)
for i, offset in enumerate(entry.rgPersistOffset):
persistobjectdirectory[entry.persistId + i] = offset
return persistobjectdirectory
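# Illustration only, not part of the module: a minimal sketch of step 8 above,
# using plain tuples instead of the PersistDirectoryAtom/PersistDirectoryEntry
# namedtuples. Each directory is assumed to be a list of
# (persistId, [offset, ...]) pairs, given newest first, as they would be
# discovered by following offsetLastEdit. merge_persist_directories is a
# hypothetical name.

```python
def merge_persist_directories(directories_newest_first):
    directory = {}
    # Apply the oldest directory first so that newer edits override older ones.
    for entries in reversed(directories_newest_first):
        for persist_id, offsets in entries:
            # An entry covers consecutive identifiers starting at persist_id.
            for i, offset in enumerate(offsets):
                directory[persist_id + i] = offset
    return directory


newest = [(1, [0x500]), (3, [0x700])]
oldest = [(1, [0x100, 0x200])]
merged = merge_persist_directories([newest, oldest])
assert merged == {1: 0x500, 2: 0x200, 3: 0x700}
```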
class Ppt97File(base.BaseOfficeFile):
"""Return a MS-PPT file object.
Examples:
>>> with open("tests/inputs/rc4cryptoapi_password.ppt", "rb") as f:
... officefile = Ppt97File(f)
... officefile.load_key(password="Password1234_")
>>> with open("tests/inputs/rc4cryptoapi_password.ppt", "rb") as f:
... officefile = Ppt97File(f)
... officefile.load_key(password="0000")
Traceback (most recent call last):
...
msoffcrypto.exceptions.InvalidKeyError: ...
"""
def __init__(self, file):
self.file = file
ole = olefile.OleFileIO(file) # Do not close this; closing it would also close the underlying file
self.ole = ole
self.format = "ppt97"
self.keyTypes = ["password"]
self.key = None
self.salt = None
# streams closed in destructor:
currentuser = ole.openstream("Current User")
powerpointdocument = ole.openstream("PowerPoint Document")
Data = namedtuple("Data", ["currentuser", "powerpointdocument"])
self.data = Data(
currentuser=currentuser,
powerpointdocument=powerpointdocument,
)
def __del__(self):
"""Destructor, closes opened streams."""
if hasattr(self, "data") and self.data:
if self.data.currentuser:
self.data.currentuser.close()
if self.data.powerpointdocument:
self.data.powerpointdocument.close()
def load_key(self, password=None):
persistobjectdirectory = construct_persistobjectdirectory(self.data)
logger.debug("[*] persistobjectdirectory: {}".format(persistobjectdirectory))
self.data.currentuser.seek(0)
currentuser = _parseCurrentUser(self.data.currentuser)
logger.debug("[*] currentuser: {}".format(currentuser))
self.data.powerpointdocument.seek(
currentuser.currentuseratom.offsetToCurrentEdit
)
usereditatom = _parseUserEditAtom(self.data.powerpointdocument)
logger.debug("[*] usereditatom: {}".format(usereditatom))
# cf. Part 2 in https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-ppt/1fc22d56-28f9-4818-bd45-67c2bf721ccf
cryptsession10container_offset = persistobjectdirectory[
usereditatom.encryptSessionPersistIdRef
]
logger.debug(
"[*] cryptsession10container_offset: {}".format(
cryptsession10container_offset
)
)
self.data.powerpointdocument.seek(cryptsession10container_offset)
cryptsession10container = _parseCryptSession10Container(
self.data.powerpointdocument
)
logger.debug("[*] cryptsession10container: {}".format(cryptsession10container))
encryptionInfo = io.BytesIO(cryptsession10container.data)
encryptionVersionInfo = encryptionInfo.read(4)
vMajor, vMinor = unpack("<HH", encryptionVersionInfo)
logger.debug("[*] encryption version: {} {}".format(vMajor, vMinor))
assert vMajor in [0x0002, 0x0003, 0x0004] and vMinor == 0x0002 # RC4 CryptoAPI
info = _parse_header_RC4CryptoAPI(encryptionInfo)
if DocumentRC4CryptoAPI.verifypw(
password,
info["salt"],
info["keySize"],
info["encryptedVerifier"],
info["encryptedVerifierHash"],
):
self.type = "rc4_cryptoapi"
self.key = password
self.salt = info["salt"]
self.keySize = info["keySize"]
else:
raise exceptions.InvalidKeyError("Failed to verify password")
def decrypt(self, outfile):
# Current User Stream
self.data.currentuser.seek(0)
currentuser = _parseCurrentUser(self.data.currentuser)
# logger.debug(currentuser)
cuatom = currentuser.currentuseratom
currentuser_new = CurrentUser(
currentuseratom=CurrentUserAtom(
rh=cuatom.rh,
size=cuatom.size,
# https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-ppt/940d5700-e4d7-4fc0-ab48-fed5dbc48bc1
# 0xE391C05F: The file SHOULD NOT<6> be an encrypted document.
headerToken=0xE391C05F,
offsetToCurrentEdit=cuatom.offsetToCurrentEdit,
lenUserName=cuatom.lenUserName,
docFileVersion=cuatom.docFileVersion,
majorVersion=cuatom.majorVersion,
minorVersion=cuatom.minorVersion,
unused=cuatom.unused,
ansiUserName=cuatom.ansiUserName,
relVersion=cuatom.relVersion,
unicodeUserName=cuatom.unicodeUserName,
)
)
buf = _packCurrentUser(currentuser_new)
buf.seek(0)
currentuser_buf = buf
# List of encrypted parts: https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-ppt/b0963334-4408-4621-879a-ef9c54551fd8
# PowerPoint Document Stream
self.data.powerpointdocument.seek(0)
powerpointdocument_size = len(self.data.powerpointdocument.read())
logger.debug("[*] powerpointdocument_size: {}".format(powerpointdocument_size))
self.data.powerpointdocument.seek(0)
dec_bytearray = bytearray(self.data.powerpointdocument.read())
# UserEditAtom
self.data.powerpointdocument.seek(
currentuser.currentuseratom.offsetToCurrentEdit
)
# currentuseratom_raw = self.data.powerpointdocument.read(40)
self.data.powerpointdocument.seek(
currentuser.currentuseratom.offsetToCurrentEdit
)
usereditatom = _parseUserEditAtom(self.data.powerpointdocument)
# logger.debug(usereditatom)
# logger.debug(["offsetToCurrentEdit", currentuser.currentuseratom.offsetToCurrentEdit])
rh_new = RecordHeader(
recVer=usereditatom.rh.recVer,
recInstance=usereditatom.rh.recInstance,
recType=usereditatom.rh.recType,
recLen=usereditatom.rh.recLen - 4, # Omit encryptSessionPersistIdRef field
)
# logger.debug([_packRecordHeader(usereditatom.rh).read(), _packRecordHeader(rh_new).read()])
usereditatom_new = UserEditAtom(
rh=rh_new,
lastSlideIdRef=usereditatom.lastSlideIdRef,
version=usereditatom.version,
minorVersion=usereditatom.minorVersion,
majorVersion=usereditatom.majorVersion,
offsetLastEdit=usereditatom.offsetLastEdit,
offsetPersistDirectory=usereditatom.offsetPersistDirectory,
docPersistIdRef=usereditatom.docPersistIdRef,
persistIdSeed=usereditatom.persistIdSeed,
lastView=usereditatom.lastView,
unused=usereditatom.unused,
encryptSessionPersistIdRef=0x00000000, # Clear
)
# logger.debug(currentuseratom_raw)
# logger.debug(_packUserEditAtom(usereditatom).read())
# logger.debug(_packUserEditAtom(usereditatom_new).read())
buf = _packUserEditAtom(usereditatom_new)
buf.seek(0)
buf_bytes = bytearray(buf.read())
offset = currentuser.currentuseratom.offsetToCurrentEdit
dec_bytearray[offset : offset + len(buf_bytes)] = buf_bytes
# PersistDirectoryAtom
self.data.powerpointdocument.seek(
currentuser.currentuseratom.offsetToCurrentEdit
)
usereditatom = _parseUserEditAtom(self.data.powerpointdocument)
# logger.debug(usereditatom)
self.data.powerpointdocument.seek(usereditatom.offsetPersistDirectory)
persistdirectoryatom = _parsePersistDirectoryAtom(self.data.powerpointdocument)
# logger.debug(persistdirectoryatom)
persistdirectoryatom_new = PersistDirectoryAtom(
rh=persistdirectoryatom.rh,
rgPersistDirEntry=[
PersistDirectoryEntry(
persistId=persistdirectoryatom.rgPersistDirEntry[0].persistId,
# Omit CryptSession10Container
cPersist=persistdirectoryatom.rgPersistDirEntry[0].cPersist - 1,
rgPersistOffset=persistdirectoryatom.rgPersistDirEntry[
0
].rgPersistOffset,
),
],
)
self.data.powerpointdocument.seek(usereditatom.offsetPersistDirectory)
buf = _packPersistDirectoryAtom(persistdirectoryatom_new)
buf_bytes = bytearray(buf.read())
offset = usereditatom.offsetPersistDirectory
dec_bytearray[offset : offset + len(buf_bytes)] = buf_bytes
# Persist Objects
self.data.powerpointdocument.seek(0)
persistobjectdirectory = construct_persistobjectdirectory(self.data)
directory_items = list(persistobjectdirectory.items())
for i, (persistId, offset) in enumerate(directory_items):
self.data.powerpointdocument.seek(offset)
buf = self.data.powerpointdocument.read(8)
rh = _parseRecordHeader(io.BytesIO(buf))
logger.debug("[*] rh: {}".format(rh))
# CryptSession10Container
if rh.recType == 0x2F14:
logger.debug("[*] CryptSession10Container found")
# Remove the encryption record, padding with zeros to preserve the stream size
dec_bytearray[offset : offset + (8 + rh.recLen)] = b"\x00" * (
8 + rh.recLen
)
continue
# The UserEditAtom record (section 2.3.3) and the PersistDirectoryAtom record (section 2.3.4) MUST NOT be encrypted.
if rh.recType in [0x0FF5, 0x1772]:
logger.debug("[*] UserEditAtom/PersistDirectoryAtom found")
continue
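# TODO: Fix here: recLen is inferred from the offset of the next directory item,
# which assumes the items are ordered by offset and would raise IndexError
# if the last item reached this point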
recLen = directory_items[i + 1][1] - offset - 8
logger.debug("[*] recLen: {}".format(recLen))
self.data.powerpointdocument.seek(offset)
enc_buf = io.BytesIO(self.data.powerpointdocument.read(8 + recLen))
blocksize = self.keySize * (
(8 + recLen) // self.keySize + 1
) # Undocumented
dec = DocumentRC4CryptoAPI.decrypt(
self.key,
self.salt,
self.keySize,
enc_buf,
blocksize=blocksize,
block=persistId,
)
dec_bytes = bytearray(dec.read())
dec_bytearray[offset : offset + len(dec_bytes)] = dec_bytes
# To BytesIO
dec_buf = io.BytesIO(dec_bytearray)
dec_buf.seek(0)
for i, (persistId, offset) in enumerate(directory_items):
dec_buf.seek(offset)
buf = dec_buf.read(8)
rh = _parseRecordHeader(io.BytesIO(buf))
logger.debug("[*] rh: {}".format(rh))
dec_buf.seek(0)
logger.debug(
"[*] powerpointdocument_size={}, len(dec_buf.read())={}".format(
powerpointdocument_size, len(dec_buf.read())
)
)
dec_buf.seek(0)
powerpointdocument_dec_buf = dec_buf
# TODO: Pictures Stream
# TODO: Encrypted Summary Info Stream
with tempfile.TemporaryFile() as _outfile:
self.file.seek(0)
shutil.copyfileobj(self.file, _outfile)
outole = olefile.OleFileIO(_outfile, write_mode=True)
outole.write_stream("Current User", currentuser_buf.read())
outole.write_stream(
"PowerPoint Document", powerpointdocument_dec_buf.read()
)
# Finalize
_outfile.seek(0)
shutil.copyfileobj(_outfile, outfile)
return
def is_encrypted(self):
r"""
Test if the file is encrypted.
>>> f = open("tests/inputs/plain.ppt", "rb")
>>> file = Ppt97File(f)
>>> file.is_encrypted()
False
>>> f = open("tests/inputs/rc4cryptoapi_password.ppt", "rb")
>>> file = Ppt97File(f)
>>> file.is_encrypted()
True
"""
self.data.currentuser.seek(0)
currentuser = _parseCurrentUser(self.data.currentuser)
logger.debug("[*] currentuser: {}".format(currentuser))
self.data.powerpointdocument.seek(
currentuser.currentuseratom.offsetToCurrentEdit
)
usereditatom = _parseUserEditAtom(self.data.powerpointdocument)
logger.debug("[*] usereditatom: {}".format(usereditatom))
if usereditatom.rh.recLen == 0x00000020: # Cf. _parseUserEditAtom
return True
else:
return False
================================================
FILE: msoffcrypto/format/xls97.py
================================================
import io
import logging
import shutil
import tempfile
from collections import namedtuple
from struct import pack, unpack
import olefile
from msoffcrypto import exceptions
from msoffcrypto.format import base
from msoffcrypto.format.common import _parse_header_RC4CryptoAPI
from msoffcrypto.method.rc4 import DocumentRC4
from msoffcrypto.method.rc4_cryptoapi import DocumentRC4CryptoAPI
from msoffcrypto.method.xor_obfuscation import DocumentXOR
logger = logging.getLogger(__name__)
logger.addHandler(logging.NullHandler())
recordNameNum = {
"Formula": 6,
"EOF": 10,
"CalcCount": 12,
"CalcMode": 13,
"CalcPrecision": 14,
"CalcRefMode": 15,
"CalcDelta": 16,
"CalcIter": 17,
"Protect": 18,
"Password": 19,
"Header": 20,
"Footer": 21,
"ExternSheet": 23,
"Lbl": 24,
"WinProtect": 25,
"VerticalPageBreaks": 26,
"HorizontalPageBreaks": 27,
"Note": 28,
"Selection": 29,
"Date1904": 34,
"ExternName": 35,
"LeftMargin": 38,
"RightMargin": 39,
"TopMargin": 40,
"BottomMargin": 41,
"PrintRowCol": 42,
"PrintGrid": 43,
"FilePass": 47,
"Font": 49,
"PrintSize": 51,
"Continue": 60,
"Window1": 61,
"Backup": 64,
"Pane": 65,
"CodePage": 66,
"Pls": 77,
"DCon": 80,
"DConRef": 81,
"DConName": 82,
"DefColWidth": 85,
"XCT": 89,
"CRN": 90,
"FileSharing": 91,
"WriteAccess": 92,
"Obj": 93,
"Uncalced": 94,
"CalcSaveRecalc": 95,
"Template": 96,
"Intl": 97,
"ObjProtect": 99,
"ColInfo": 125,
"Guts": 128,
"WsBool": 129,
"GridSet": 130,
"HCenter": 131,
"VCenter": 132,
"BoundSheet8": 133,
"WriteProtect": 134,
"Country": 140,
"HideObj": 141,
"Sort": 144,
"Palette": 146,
"Sync": 151,
"LPr": 152,
"DxGCol": 153,
"FnGroupName": 154,
"FilterMode": 155,
"BuiltInFnGroupCount": 156,
"AutoFilterInfo": 157,
"AutoFilter": 158,
"Scl": 160,
"Setup": 161,
"ScenMan": 174,
"SCENARIO": 175,
"SxView": 176,
"Sxvd": 177,
"SXVI": 178,
"SxIvd": 180,
"SXLI": 181,
"SXPI": 182,
"DocRoute": 184,
"RecipName": 185,
"MulRk": 189,
"MulBlank": 190,
"Mms": 193,
"SXDI": 197,
"SXDB": 198,
"SXFDB": 199,
"SXDBB": 200,
"SXNum": 201,
"SxBool": 202,
"SxErr": 203,
"SXInt": 204,
"SXString": 205,
"SXDtr": 206,
"SxNil": 207,
"SXTbl": 208,
"SXTBRGIITM": 209,
"SxTbpg": 210,
"ObProj": 211,
"SXStreamID": 213,
"DBCell": 215,
"SXRng": 216,
"SxIsxoper": 217,
"BookBool": 218,
"DbOrParamQry": 220,
"ScenarioProtect": 221,
"OleObjectSize": 222,
"XF": 224,
"InterfaceHdr": 225,
"InterfaceEnd": 226,
"SXVS": 227,
"MergeCells": 229,
"BkHim": 233,
"MsoDrawingGroup": 235,
"MsoDrawing": 236,
"MsoDrawingSelection": 237,
"PhoneticInfo": 239,
"SxRule": 240,
"SXEx": 241,
"SxFilt": 242,
"SxDXF": 244,
"SxItm": 245,
"SxName": 246,
"SxSelect": 247,
"SXPair": 248,
"SxFmla": 249,
"SxFormat": 251,
"SST": 252,
"LabelSst": 253,
"ExtSST": 255,
"SXVDEx": 256,
"SXFormula": 259,
"SXDBEx": 290,
"RRDInsDel": 311,
"RRDHead": 312,
"RRDChgCell": 315,
"RRTabId": 317,
"RRDRenSheet": 318,
"RRSort": 319,
"RRDMove": 320,
"RRFormat": 330,
"RRAutoFmt": 331,
"RRInsertSh": 333,
"RRDMoveBegin": 334,
"RRDMoveEnd": 335,
"RRDInsDelBegin": 336,
"RRDInsDelEnd": 337,
"RRDConflict": 338,
"RRDDefName": 339,
"RRDRstEtxp": 340,
"LRng": 351,
"UsesELFs": 352,
"DSF": 353,
"CUsr": 401,
"CbUsr": 402,
"UsrInfo": 403,
"UsrExcl": 404,
"FileLock": 405,
"RRDInfo": 406,
"BCUsrs": 407,
"UsrChk": 408,
"UserBView": 425,
"UserSViewBegin": 426,
"UserSViewBegin_Chart": 426,
"UserSViewEnd": 427,
"RRDUserView": 428,
"Qsi": 429,
"SupBook": 430,
"Prot4Rev": 431,
"CondFmt": 432,
"CF": 433,
"DVal": 434,
"DConBin": 437,
"TxO": 438,
"RefreshAll": 439,
"HLink": 440,
"Lel": 441,
"CodeName": 442,
"SXFDBType": 443,
"Prot4RevPass": 444,
"ObNoMacros": 445,
"Dv": 446,
"Excel9File": 448,
"RecalcId": 449,
"EntExU2": 450,
"Dimensions": 512,
"Blank": 513,
"Number": 515,
"Label": 516,
"BoolErr": 517,
"String": 519,
"Row": 520,
"Index": 523,
"Array": 545,
"DefaultRowHeight": 549,
"Table": 566,
"Window2": 574,
"RK": 638,
"Style": 659,
"BigName": 1048,
"Format": 1054,
"ContinueBigName": 1084,
"ShrFmla": 1212,
"HLinkTooltip": 2048,
"WebPub": 2049,
"QsiSXTag": 2050,
"DBQueryExt": 2051,
"ExtString": 2052,
"TxtQry": 2053,
"Qsir": 2054,
"Qsif": 2055,
"RRDTQSIF": 2056,
"BOF": 2057,
"OleDbConn": 2058,
"WOpt": 2059,
"SXViewEx": 2060,
"SXTH": 2061,
"SXPIEx": 2062,
"SXVDTEx": 2063,
"SXViewEx9": 2064,
"ContinueFrt": 2066,
"RealTimeData": 2067,
"ChartFrtInfo": 2128,
"FrtWrapper": 2129,
"StartBlock": 2130,
"EndBlock": 2131,
"StartObject": 2132,
"EndObject": 2133,
"CatLab": 2134,
"YMult": 2135,
"SXViewLink": 2136,
"PivotChartBits": 2137,
"FrtFontList": 2138,
"SheetExt": 2146,
"BookExt": 2147,
"SXAddl": 2148,
"CrErr": 2149,
"HFPicture": 2150,
"FeatHdr": 2151,
"Feat": 2152,
"DataLabExt": 2154,
"DataLabExtContents": 2155,
"CellWatch": 2156,
"FeatHdr11": 2161,
"Feature11": 2162,
"DropDownObjIds": 2164,
"ContinueFrt11": 2165,
"DConn": 2166,
"List12": 2167,
"Feature12": 2168,
"CondFmt12": 2169,
"CF12": 2170,
"CFEx": 2171,
"XFCRC": 2172,
"XFExt": 2173,
"AutoFilter12": 2174,
"ContinueFrt12": 2175,
"MDTInfo": 2180,
"MDXStr": 2181,
"MDXTuple": 2182,
"MDXSet": 2183,
"MDXProp": 2184,
"MDXKPI": 2185,
"MDB": 2186,
"PLV": 2187,
"Compat12": 2188,
"DXF": 2189,
"TableStyles": 2190,
"TableStyle": 2191,
"TableStyleElement": 2192,
"StyleExt": 2194,
"NamePublish": 2195,
"NameCmt": 2196,
"SortData": 2197,
"Theme": 2198,
"GUIDTypeLib": 2199,
"FnGrp12": 2200,
"NameFnGrp12": 2201,
"MTRSettings": 2202,
"CompressPictures": 2203,
"HeaderFooter": 2204,
"CrtLayout12": 2205,
"CrtMlFrt": 2206,
"CrtMlFrtContinue": 2207,
"ForceFullCalculation": 2211,
"ShapePropsStream": 2212,
"TextPropsStream": 2213,
"RichTextStream": 2214,
"CrtLayout12A": 2215,
"Units": 4097,
"Chart": 4098,
"Series": 4099,
"DataFormat": 4102,
"LineFormat": 4103,
"MarkerFormat": 4105,
"AreaFormat": 4106,
"PieFormat": 4107,
"AttachedLabel": 4108,
"SeriesText": 4109,
"ChartFormat": 4116,
"Legend": 4117,
"SeriesList": 4118,
"Bar": 4119,
"Line": 4120,
"Pie": 4121,
"Area": 4122,
"Scatter": 4123,
"CrtLine": 4124,
"Axis": 4125,
"Tick": 4126,
"ValueRange": 4127,
"CatSerRange": 4128,
"AxisLine": 4129,
"CrtLink": 4130,
"DefaultText": 4132,
"Text": 4133,
"FontX": 4134,
"ObjectLink": 4135,
"Frame": 4146,
"Begin": 4147,
"End": 4148,
"PlotArea": 4149,
"Chart3d": 4154,
"PicF": 4156,
"DropBar": 4157,
"Radar": 4158,
"Surf": 4159,
"RadarArea": 4160,
"AxisParent": 4161,
"LegendException": 4163,
"ShtProps": 4164,
"SerToCrt": 4165,
"AxesUsed": 4166,
"SBaseRef": 4168,
"SerParent": 4170,
"SerAuxTrend": 4171,
"IFmtRecord": 4174,
"Pos": 4175,
"AlRuns": 4176,
"BRAI": 4177,
"SerAuxErrBar": 4187,
"ClrtClient": 4188,
"SerFmt": 4189,
"Chart3DBarShape": 4191,
"Fbi": 4192,
"BopPop": 4193,
"AxcExt": 4194,
"Dat": 4195,
"PlotGrowth": 4196,
"SIIndex": 4197,
"GelFrame": 4198,
"BopPopCustom": 4199,
"Fbi2": 4200,
}
def _parse_header_RC4(encryptionInfo):
# RC4: https://msdn.microsoft.com/en-us/library/dd908560(v=office.12).aspx
salt = encryptionInfo.read(16)
encryptedVerifier = encryptionInfo.read(16)
encryptedVerifierHash = encryptionInfo.read(16)
info = {
"salt": salt,
"encryptedVerifier": encryptedVerifier,
"encryptedVerifierHash": encryptedVerifierHash,
}
return info
class _BIFFStream:
def __init__(self, data):
self.data = data
def has_record(self, target):
pos = self.data.tell()
while True:
h = self.data.read(4)
if not h:
self.data.seek(pos)
return False
num, size = unpack("<HH", h)
if num == target:
self.data.seek(pos)
return True
else:
self.data.read(size)
def skip_to(self, target):
while True:
h = self.data.read(4)
if not h:
raise exceptions.ParseError("Record not found")
num, size = unpack("<HH", h)
if num == target:
return num, size
else:
self.data.read(size)
def iter_record(self):
while True:
h = self.data.read(4)
if not h:
break
num, size = unpack("<HH", h)
record = io.BytesIO(self.data.read(size))
yield num, size, record
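# Illustrative sketch, not part of msoffcrypto: how the record walk in
# _BIFFStream sees a stream. Each BIFF record is a 2-byte little-endian ID,
# then a 2-byte little-endian payload size, then the payload itself.
# The helper name _example_walk_biff_records is hypothetical, and stopping on a
# truncated header is an assumption of this sketch, not behaviour of the class.
import io
from struct import pack, unpack


def _example_walk_biff_records(stream):
    """Return a list of (record_id, size, payload) tuples read from a file-like object."""
    records = []
    while True:
        header = stream.read(4)
        if len(header) < 4:  # End of stream, or a truncated record header
            break
        num, size = unpack("<HH", header)
        records.append((num, size, stream.read(size)))
    return records


# Usage (synthetic data): a BOF record (ID 2057) with a 4-byte payload followed
# by a FilePass record (ID 47) with a 2-byte payload.
#   data = pack("<HH", 2057, 4) + b"\x00\x06\x05\x00" + pack("<HH", 47, 2) + b"\x01\x00"
#   _example_walk_biff_records(io.BytesIO(data))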
class Xls97File(base.BaseOfficeFile):
"""Return an MS-XLS file object.
Examples:
>>> with open("tests/inputs/rc4cryptoapi_password.xls", "rb") as f:
... officefile = Xls97File(f)
... officefile.load_key(password="Password1234_")
>>> with open("tests/inputs/xor_password_123456789012345.xls", "rb") as f:
... officefile = Xls97File(f)
... officefile.load_key(password="123456789012345")
>>> with open("tests/inputs/rc4cryptoapi_password.xls", "rb") as f:
... officefile = Xls97File(f)
... officefile.load_key(password="0000")
Traceback (most recent call last):
...
msoffcrypto.exceptions.InvalidKeyError: ...
"""
def __init__(self, file):
self.file = file
ole = olefile.OleFileIO(file) # do not close this, would close file
self.ole = ole
self.format = "xls97"
self.keyTypes = ["password"]
self.key = None
self.salt = None
workbook = ole.openstream("Workbook") # closed in destructor
Data = namedtuple("Data", ["workbook"])
self.data = Data(
workbook=workbook,
)
def __del__(self):
"""Destructor, closes opened stream."""
if hasattr(self, "data") and self.data and self.data.workbook:
self.data.workbook.close()
def load_key(self, password=None):
self.data.workbook.seek(0)
workbook = _BIFFStream(self.data.workbook)
# workbook stream consists of records, each of which begins with its ID number.
# Record IDs (in decimal) are listed here: https://msdn.microsoft.com/en-us/library/dd945945(v=office.12).aspx
# workbook stream's structure is WORKBOOK = BOF WORKBOOKCONTENT and so forth
# as in https://msdn.microsoft.com/en-us/library/dd952177(v=office.12).aspx
# Each record starts with a 4-byte header: a 2-byte record ID followed by a 2-byte length (in bytes).
(num,) = unpack("<H", workbook.data.read(2))
assert num == 2057 # BOF
(size,) = unpack("<H", workbook.data.read(2))
workbook.data.read(size) # Skip BOF
num, size = workbook.skip_to(
recordNameNum["FilePass"]
) # Skip to FilePass; TODO: Raise exception if not encrypted
# FilePass: https://msdn.microsoft.com/en-us/library/dd952596(v=office.12).aspx
# If this record exists, the workbook MUST be encrypted.
(wEncryptionType,) = unpack("<H", workbook.data.read(2))
encryptionInfo = io.BytesIO(workbook.data.read(size - 2))
if wEncryptionType == 0x0000: # XOR obfuscation
key, verificationBytes = unpack("<HH", encryptionInfo.read(4))
if DocumentXOR.verifypw(password, verificationBytes):
self.type = "xor"
self.key = password
self.loc_index = 0
else:
raise exceptions.InvalidKeyError("Failed to verify password")
elif wEncryptionType == 0x0001: # RC4
encryptionVersionInfo = encryptionInfo.read(4)
vMajor, vMinor = unpack("<HH", encryptionVersionInfo)
logger.debug("Version: {} {}".format(vMajor, vMinor))
if vMajor == 0x0001 and vMinor == 0x0001: # RC4
info = _parse_header_RC4(encryptionInfo)
if DocumentRC4.verifypw(
password,
info["salt"],
info["encryptedVerifier"],
info["encryptedVerifierHash"],
):
self.type = "rc4"
self.key = password
self.salt = info["salt"]
else:
raise exceptions.InvalidKeyError("Failed to verify password")
elif (
vMajor in [0x0002, 0x0003, 0x0004] and vMinor == 0x0002
): # RC4 CryptoAPI
info = _parse_header_RC4CryptoAPI(encryptionInfo)
if DocumentRC4CryptoAPI.verifypw(
password,
info["salt"],
info["keySize"],
info["encryptedVerifier"],
info["encryptedVerifierHash"],
):
self.type = "rc4_cryptoapi"
self.key = password
self.salt = info["salt"]
self.keySize = info["keySize"]
else:
raise exceptions.InvalidKeyError("Failed to verify password")
else:
raise exceptions.DecryptionError("Unsupported encryption method")
def decrypt(self, outfile):
# fd, _outfile_path = tempfile.mkstemp()
# shutil.copyfile(os.path.realpath(self.file.name), _outfile_path)
# outole = olefile.OleFileIO(_outfile_path, write_mode=True)
# List of encrypted parts: https://msdn.microsoft.com/en-us/library/dd905723(v=office.12).aspx
# Workbook stream
self.data.workbook.seek(0)
workbook = _BIFFStream(self.data.workbook)
plain_buf = []
encrypted_buf = io.BytesIO()
record_info = []
for i, (num, size, record) in enumerate(workbook.iter_record()):
# Remove encryption, pad by zero to preserve stream size
if num == recordNameNum["FilePass"]:
plain_buf += [0, 0] + list(pack("<H", size)) + [0] * size
encrypted_buf.write(b"\x00" * (4 + size))
# The following records MUST NOT be obfuscated or encrypted: BOF (section 2.4.21),
# FilePass (section 2.4.117), UsrExcl (section 2.4.339), FileLock (section 2.4.116),
# InterfaceHdr (section 2.4.146), RRDInfo (section 2.4.227), and RRDHead (section 2.4.226).
elif num in [
recordNameNum["BOF"],
recordNameNum["FilePass"],
recordNameNum["UsrExcl"],
recordNameNum["FileLock"],
recordNameNum["InterfaceHdr"],
recordNameNum["RRDInfo"],
recordNameNum["RRDHead"],
]:
header = pack("<HH", num, size)
plain_buf += list(header) + list(record.read())
encrypted_buf.write(b"\x00" * (4 + size))
# The lbPlyPos field of the BoundSheet8 record (section 2.4.28) MUST NOT be encrypted.
elif num == recordNameNum["BoundSheet8"]:
header = pack("<HH", num, size)
plain_buf += (
list(header) + list(record.read(4)) + [-2] * (size - 4)
) # Preserve lbPlyPos
encrypted_buf.write(b"\x00" * 4 + b"\x00" * 4 + record.read())
else:
header = pack("<HH", num, size)
plain_buf += list(header) + [-1] * size
encrypted_buf.write(b"\x00" * 4 + record.read())
self.data_size = encrypted_buf.tell()
encrypted_buf.seek(0)
if self.type == "rc4":
dec = DocumentRC4.decrypt(
self.key, self.salt, encrypted_buf, blocksize=1024
)
elif self.type == "rc4_cryptoapi":
dec = DocumentRC4CryptoAPI.decrypt(
self.key, self.salt, self.keySize, encrypted_buf, blocksize=1024
)
elif self.type == "xor":
dec = DocumentXOR.decrypt(
self.key, encrypted_buf, plain_buf, record_info, 10
)
else:
raise exceptions.DecryptionError(
"Unsupported encryption method: {}".format(self.type)
)
for c in plain_buf:
if c == -1 or c == -2:
dec.seek(1, 1)
else:
dec.write(bytearray([c]))
dec.seek(0)
# f = open('Workbook', 'wb')
# f.write(dec.read())
# dec.seek(0)
workbook_dec = dec
with tempfile.TemporaryFile() as _outfile:
self.file.seek(0)
shutil.copyfileobj(self.file, _outfile)
outole = olefile.OleFileIO(_outfile, write_mode=True)
outole.write_stream("Workbook", workbook_dec.read())
# _outfile = open(_outfile_path, 'rb')
_outfile.seek(0)
shutil.copyfileobj(_outfile, outfile)
return
def is_encrypted(self):
r"""
Test if the file is encrypted.
>>> f = open("tests/inputs/plain.xls", "rb")
>>> file = Xls97File(f)
>>> file.is_encrypted()
False
>>> f = open("tests/inputs/rc4cryptoapi_password.xls", "rb")
>>> file = Xls97File(f)
>>> file.is_encrypted()
True
"""
# Utilising the method above, check for encryption type.
self.data.workbook.seek(0)
workbook = _BIFFStream(self.data.workbook)
(num,) = unpack("<H", workbook.data.read(2))
assert num == 2057
(size,) = unpack("<H", workbook.data.read(2))
workbook.data.read(size)
if not workbook.has_record(recordNameNum["FilePass"]):
return False
num, size = workbook.skip_to(recordNameNum["FilePass"])
(wEncryptionType,) = unpack("<H", workbook.data.read(2))
if wEncryptionType == 0x0001: # RC4
return True
elif wEncryptionType == 0x0000: # XOR obfuscation
return True
else:
return False
================================================
FILE: msoffcrypto/method/__init__.py
================================================
================================================
FILE: msoffcrypto/method/container/__init__.py
================================================
================================================
FILE: msoffcrypto/method/container/ecma376_encrypted.py
================================================
import io
from datetime import datetime
from struct import pack
import olefile
# An encrypted ECMA376 file is stored as an OLE container.
#
# At this point, creating an OLE file is somewhat of a chore, since
# the latest olefile (v0.47) does not really support it.
#
# See https://github.com/decalage2/olefile/issues/6
#
# This file is not meant to support all manners of OLE files; it creates
# what we need (an OLE file with an encrypted stream + supporting streams).
# Nothing more, nothing less. So, unlike OleFile, we can take _a lot_ of
# shortcuts.
#
# Probably very brittle.
#
# File format:
#
# https://github.com/libyal/libolecf/blob/main/documentation/OLE%20Compound%20File%20format.asciidoc
#
# Initial C++ code from https://github.com/herumi/msoffice (BSD-3)
def datetime2filetime(dt):
"""
Convert a Python datetime.datetime to FILETIME (a 64-bit unsigned int).
A file time is a 64-bit value that represents the number of 100-nanosecond intervals that have elapsed
since 12:00 A.M. January 1, 1601 Coordinated Universal Time (UTC).
https://learn.microsoft.com/en-us/windows/win32/sysinfo/file-times
"""
_FILETIME_NULL_DATE = datetime(1601, 1, 1, 0, 0, 0)
return int((dt - _FILETIME_NULL_DATE).total_seconds() * 10000000)
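# Illustrative sketch, not used by this module: the same conversion done with
# integer arithmetic only. timedelta.total_seconds() returns a float, which can
# lose microsecond accuracy for intervals of several centuries, so this variant
# divides timedeltas instead. The helper name is hypothetical and, like the
# function above, it assumes a naive datetime.
from datetime import datetime, timedelta


def _example_datetime2filetime_exact(dt):
    """Return the FILETIME tick count (100 ns units since 1601-01-01) for a naive datetime."""
    delta = dt - datetime(1601, 1, 1)
    # timedelta // timedelta yields an int; one microsecond is ten FILETIME ticks.
    return (delta // timedelta(microseconds=1)) * 10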
class RedBlack:
RED = 0 # Note that this is per-spec; olefile.py shows the opposite
BLACK = 1
class DirectoryEntryType:
EMPTY = 0
STORAGE = 1
STREAM = 2
LOCK_BYTES = 3
PROPERTY = 4
ROOT_STORAGE = 5
class SectorTypes:
MAXREGSECT = 0xFFFFFFFA
DIFSECT = 0xFFFFFFFC
FATSECT = 0xFFFFFFFD
ENDOFCHAIN = 0xFFFFFFFE
FREESECT = 0xFFFFFFFF
NOSTREAM = 0xFFFFFFFF
class DSPos:
# Order in the directories array; must be in sync with ECMA376Encrypted._get_directory_entries()
iRoot = 0
iEncryptionPackage = 1
iDataSpaces = 2
iVersion = 3
iDataSpaceMap = 4
iDataSpaceInfo = 5
iStongEncryptionDataSpace = 6
iTransformInfo = 7
iStrongEncryptionTransform = 8
iPrimary = 9
iEncryptionInfo = 10
dirNum = 11
class DefaultContent:
# Lifted off of Herumi/msoffice (C++ package)
# https://github.com/herumi/msoffice/blob/master/include/resource.hpp
Version = b"\x3c\x00\x00\x00\x4d\x00\x69\x00\x63\x00\x72\x00\x6f\x00\x73\x00\x6f\x00\x66\x00\x74\x00\x2e\x00\x43\x00\x6f\x00\x6e\x00\x74\x00\x61\x00\x69\x00\x6e\x00\x65\x00\x72\x00\x2e\x00\x44\x00\x61\x00\x74\x00\x61\x00\x53\x00\x70\x00\x61\x00\x63\x00\x65\x00\x73\x00\x01\x00\x00\x00\x01\x00\x00\x00\x01\x00\x00\x00"
Primary = b"\x58\x00\x00\x00\x01\x00\x00\x00\x4c\x00\x00\x00\x7b\x00\x46\x00\x46\x00\x39\x00\x41\x00\x33\x00\x46\x00\x30\x00\x33\x00\x2d\x00\x35\x00\x36\x00\x45\x00\x46\x00\x2d\x00\x34\x00\x36\x00\x31\x00\x33\x00\x2d\x00\x42\x00\x44\x00\x44\x00\x35\x00\x2d\x00\x35\x00\x41\x00\x34\x00\x31\x00\x43\x00\x31\x00\x44\x00\x30\x00\x37\x00\x32\x00\x34\x00\x36\x00\x7d\x00\x4e\x00\x00\x00\x4d\x00\x69\x00\x63\x00\x72\x00\x6f\x00\x73\x00\x6f\x00\x66\x00\x74\x00\x2e\x00\x43\x00\x6f\x00\x6e\x00\x74\x00\x61\x00\x69\x00\x6e\x00\x65\x00\x72\x00\x2e\x00\x45\x00\x6e\x00\x63\x00\x72\x00\x79\x00\x70\x00\x74\x00\x69\x00\x6f\x00\x6e\x00\x54\x00\x72\x00\x61\x00\x6e\x00\x73\x00\x66\x00\x6f\x00\x72\x00\x6d\x00\x00\x00\x01\x00\x00\x00\x01\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00"
DataSpaceMap = b"\x08\x00\x00\x00\x01\x00\x00\x00\x68\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x20\x00\x00\x00\x45\x00\x6e\x00\x63\x00\x72\x00\x79\x00\x70\x00\x74\x00\x65\x00\x64\x00\x50\x00\x61\x00\x63\x00\x6b\x00\x61\x00\x67\x00\x65\x00\x32\x00\x00\x00\x53\x00\x74\x00\x72\x00\x6f\x00\x6e\x00\x67\x00\x45\x00\x6e\x00\x63\x00\x72\x00\x79\x00\x70\x00\x74\x00\x69\x00\x6f\x00\x6e\x00\x44\x00\x61\x00\x74\x00\x61\x00\x53\x00\x70\x00\x61\x00\x63\x00\x65\x00\x00\x00"
StrongEncryptionDataSpace = b"\x08\x00\x00\x00\x01\x00\x00\x00\x32\x00\x00\x00\x53\x00\x74\x00\x72\x00\x6f\x00\x6e\x00\x67\x00\x45\x00\x6e\x00\x63\x00\x72\x00\x79\x00\x70\x00\x74\x00\x69\x00\x6f\x00\x6e\x00\x54\x00\x72\x00\x61\x00\x6e\x00\x73\x00\x66\x00\x6f\x00\x72\x00\x6d\x00\x00\x00"
class Header:
FIRSTNUMDIFAT = 109
BUFFER_SIZE = 512 # Size taken when writing out to disk/buffer
def __init__(self):
self.minorVersion = 0x003E
self.majorVersion = 3
self.sectorShift = 9
self.numDirectorySectors = 0
self.numFatSectors = 0
self.firstDirectorySectorLocation = SectorTypes.ENDOFCHAIN
self.transactionSignatureNumber = 0
self.firstMiniFatSectorLocation = SectorTypes.ENDOFCHAIN
self.numMiniFatSectors = 0
self.firstDifatSectorLocation = SectorTypes.ENDOFCHAIN
self.numDifatSectors = 0
self.sectorSize = 1 << self.sectorShift
self.difat = []
def write_to(self, obuf):
obuf.write(olefile.MAGIC)
obuf.write(b"\0" * 16) # CLSID
byteOrder = 0xFFFE # Little-Endian
miniSectorShift = 6
miniStreamCutoffSize = 0x1000
reserved = 0
obuf.write(
pack(
"<HHHHHHHHIIIIIIIII",
self.minorVersion,
self.majorVersion,
byteOrder,
self.sectorShift,
miniSectorShift,
reserved,
reserved,
reserved,
self.numDirectorySectors,
self.numFatSectors,
self.firstDirectorySectorLocation,
self.transactionSignatureNumber,
miniStreamCutoffSize,
self.firstMiniFatSectorLocation,
self.numMiniFatSectors,
self.firstDifatSectorLocation,
self.numDifatSectors,
)
)
difatSize = len(self.difat)
for i in range(min(difatSize, Header.FIRSTNUMDIFAT)):
obuf.write(pack("<I", self.difat[i]))
for i in range(difatSize, Header.FIRSTNUMDIFAT):
obuf.write(pack("<I", SectorTypes.NOSTREAM))
class DirectoryEntry:
def __init__(
self,
name="",
_type=DirectoryEntryType.EMPTY,
color=RedBlack.RED,
leftId=SectorTypes.NOSTREAM,
rightId=SectorTypes.NOSTREAM,
childId=SectorTypes.NOSTREAM,
clsid="",
bits=0,
ct=0,
mt=0,
loc=0,
content=b"",
):
self.Name = name
self.Type = _type
self.Color = color
self.LeftSiblingId = leftId
self.RightSiblingId = rightId
self.ChildId = childId
self.CLSID = clsid
self.StateBits = bits
self.CreationTime = ct
self.ModificationTime = mt
self.StartingSectorLocation = loc
self.Content = content
def write_header_to(self, obuf):
"""
Write the 128-byte directory entry header to the output buffer. The Name property is encoded as UTF-16LE;
Content is _not_ written out by this method.
"""
name16 = self.Name.encode(
"UTF-16-LE"
) # Write in Little Endian; omit the BOM at the start of the output
directoryNameSize = len(name16) + 2 # Count the null terminator in the size
obuf.write(
name16 + b"\0" * 2
) # The spec calls for us to store the null terminator
obuf.write(
b"\0" * (64 - directoryNameSize)
) # Pad name to 64 bytes (thus max 31 chars + \x00\x00)
obuf.write(pack("<H", directoryNameSize if directoryNameSize > 2 else 0))
obuf.write(
pack(
"<BBIII",
self.Type,
self.Color,
self.LeftSiblingId,
self.RightSiblingId,
self.ChildId,
)
)
if self.CLSID:
obuf.write(self.CLSID)
else:
obuf.write(b"\0" * 16)
obuf.write(pack("<I", self.StateBits))
self.write_filetime(obuf, self.CreationTime)
self.write_filetime(obuf, self.ModificationTime)
obuf.write(pack("<IQ", self.StartingSectorLocation, len(self.Content)))
def write_filetime(self, obuf, ft):
# Write the lower 32 bits and upper 32 bits, in this order.
obuf.write(pack("<II", ft & 0xFFFFFFFF, ft >> 32))
@property
def Name(self):
return self._Name
@Name.setter
def Name(self, n):
if len(n) > 31:
raise ValueError("Name cannot be longer than 31 characters")
if set("!:/").intersection(n):
raise ValueError("Name contains invalid characters (!:/)")
self._Name = n
@property
def CLSID(self):
return self._CLSID
@CLSID.setter
def CLSID(self, c):
if c and len(c) != 16:
raise ValueError("CLSID must be blank, or 16 characters long")
self._CLSID = c
@property
def LeftSiblingId(self):
return self._LeftSiblingId
@LeftSiblingId.setter
def LeftSiblingId(self, id):
self._valid_id(id)
self._LeftSiblingId = id
@property
def RightSiblingId(self):
return self._RightSiblingId
@RightSiblingId.setter
def RightSiblingId(self, id):
self._valid_id(id)
self._RightSiblingId = id
@property
def ChildId(self):
return self._ChildId
@ChildId.setter
def ChildId(self, id):
self._valid_id(id)
self._ChildId = id
def _valid_id(self, id):
if not ((id <= SectorTypes.MAXREGSECT) or (id == SectorTypes.NOSTREAM)):
raise ValueError("Invalid id received")
class ECMA376EncryptedLayout:
def __init__(self, sectorSize):
self.sectorSize = sectorSize
self.miniFatNum = 0
self.miniFatDataSectorNum = 0
self.miniFatSectors = 0
self.numMiniFatSectors = 1
self.difatSectorNum = 0
self.fatSectorNum = 0
self.difatPos = 0
self.directoryEntrySectorNum = 0
self.encryptionPackageSectorNum = 0
@property
def fatPos(self):
return self.difatPos + self.difatSectorNum
@property
def miniFatPos(self):
return self.fatPos + self.fatSectorNum
@property
def directoryEntryPos(self):
return self.miniFatPos + self.numMiniFatSectors
@property
def miniFatDataPos(self):
return self.directoryEntryPos + self.directoryEntrySectorNum
@property
def contentSectorNum(self):
return (
self.numMiniFatSectors
+ self.directoryEntrySectorNum
+ self.miniFatDataSectorNum
+ self.encryptionPackageSectorNum
)
@property
def encryptionPackagePos(self):
return self.miniFatDataPos + self.miniFatDataSectorNum
@property
def totalSectors(self):
return self.difatSectorNum + self.fatSectorNum + self.contentSectorNum
@property
def totalSize(self):
return Header.BUFFER_SIZE + self.totalSectors * self.sectorSize
@property
def offsetDirectoryEntries(self):
return Header.BUFFER_SIZE + self.directoryEntryPos * self.sectorSize
@property
def offsetMiniFatData(self):
return Header.BUFFER_SIZE + self.miniFatDataPos * self.sectorSize
@property
def offsetFat(self):
return Header.BUFFER_SIZE + self.fatPos * self.sectorSize
@property
def offsetMiniFat(self):
return Header.BUFFER_SIZE + self.miniFatPos * self.sectorSize
def offsetDifat(self, n):
return Header.BUFFER_SIZE + (self.difatPos + n) * self.sectorSize
def offsetData(self, startingSectorLocation):
return Header.BUFFER_SIZE + startingSectorLocation * self.sectorSize
def offsetMiniData(self, startingSectorLocation):
return self.offsetMiniFatData + startingSectorLocation * 64
class ECMA376Encrypted:
def __init__(self, encryptedPackage=b"", encryptionInfo=b""):
self._header = self._get_default_header()
self._dirs = self._get_directory_entries()
self.set_payload(encryptedPackage, encryptionInfo)
def write_to(self, obuf):
"""
Writes the encrypted data to obuf
"""
# Create a temporary buffer with seek/tell capabilities. We do not want to assume that the passed-in buffer has
# such capabilities (e.g. when piping to stdout).
_obuf = io.BytesIO()
self._write_to(_obuf)
# Finalize and write to client buffer.
obuf.write(_obuf.getvalue())
def set_payload(self, encryptedPackage, encryptionInfo):
self._dirs[DSPos.iEncryptionPackage].Content = encryptedPackage
self._dirs[DSPos.iEncryptionInfo].Content = encryptionInfo
def _get_default_header(self):
return Header()
def _get_directory_entries(self):
ft = datetime2filetime(datetime.now())
directories = [ # Must follow DSPos ordering
DirectoryEntry(
"Root Entry",
DirectoryEntryType.ROOT_STORAGE,
RedBlack.RED,
ct=ft,
mt=ft,
childId=DSPos.iEncryptionInfo,
),
DirectoryEntry(
"EncryptedPackage",
DirectoryEntryType.STREAM,
RedBlack.RED,
ct=ft,
mt=ft,
),
DirectoryEntry(
"\x06DataSpaces",
DirectoryEntryType.STORAGE,
RedBlack.RED,
ct=ft,
mt=ft,
childId=DSPos.iDataSpaceMap,
),
DirectoryEntry(
"Version",
DirectoryEntryType.STREAM,
RedBlack.BLACK,
ct=ft,
mt=ft,
content=DefaultContent.Version,
),
DirectoryEntry(
"DataSpaceMap",
DirectoryEntryType.STREAM,
RedBlack.BLACK,
ct=ft,
mt=ft,
leftId=DSPos.iVersion,
rightId=DSPos.iDataSpaceInfo,
content=DefaultContent.DataSpaceMap,
),
DirectoryEntry(
"DataSpaceInfo",
DirectoryEntryType.STORAGE,
RedBlack.BLACK,
ct=ft,
mt=ft,
rightId=DSPos.iTransformInfo,
childId=DSPos.iStongEncryptionDataSpace,
),
DirectoryEntry(
"StrongEncryptionDataSpace",
DirectoryEntryType.STREAM,
RedBlack.BLACK,
ct=ft,
mt=ft,
content=DefaultContent.StrongEncryptionDataSpace,
),
DirectoryEntry(
"TransformInfo",
DirectoryEntryType.STORAGE,
RedBlack.RED,
ct=ft,
mt=ft,
childId=DSPos.iStrongEncryptionTransform,
),
DirectoryEntry(
"StrongEncryptionTransform",
DirectoryEntryType.STORAGE,
RedBlack.BLACK,
ct=ft,
mt=ft,
childId=DSPos.iPrimary,
),
DirectoryEntry(
"\x06Primary",
DirectoryEntryType.STREAM,
RedBlack.BLACK,
ct=ft,
mt=ft,
content=DefaultContent.Primary,
),
DirectoryEntry(
"EncryptionInfo",
DirectoryEntryType.STREAM,
RedBlack.BLACK,
ct=ft,
mt=ft,
leftId=DSPos.iDataSpaces,
rightId=DSPos.iEncryptionPackage,
),
]
return directories
def _write_to(self, obuf):
layout = ECMA376EncryptedLayout(self._header.sectorSize)
self._set_sector_locations_of_streams(layout)
self._detect_sector_num(layout)
self._header.firstDirectorySectorLocation = layout.directoryEntryPos
self._header.firstMiniFatSectorLocation = layout.miniFatPos
self._header.numMiniFatSectors = layout.numMiniFatSectors
self._dirs[DSPos.iRoot].StartingSectorLocation = layout.miniFatDataPos
self._dirs[DSPos.iRoot].Content = b"\0" * (64 * layout.miniFatNum)
self._dirs[
DSPos.iEncryptionPackage
].StartingSectorLocation = layout.encryptionPackagePos
for i in range(min(layout.fatSectorNum, Header.FIRSTNUMDIFAT)):
self._header.difat.append(layout.fatPos + i)
self._header.numFatSectors = layout.fatSectorNum
self._header.numDifatSectors = layout.difatSectorNum
if layout.difatSectorNum > 0:
self._header.firstDifatSectorLocation = layout.difatPos
# Zero out the output buffer; some sections pad, some sections don't ... but we need the buffer to have the proper size
# so we can jump around
obuf.write(b"\0" * layout.totalSize)
obuf.seek(0)
self._header.write_to(obuf)
self._write_DIFAT(obuf, layout)
self._write_FAT_start(obuf, layout)
self._write_MiniFAT(obuf, layout)
self._write_directory_entries(obuf, layout)
self._write_Content(obuf, layout)
def _write_directory_entries(self, obuf, layout: ECMA376EncryptedLayout):
obuf.seek(layout.offsetDirectoryEntries)
for d in self._dirs:
d.write_header_to(obuf) # This must write 128 bytes, no more, no less.
if obuf.tell() != (layout.offsetDirectoryEntries + len(self._dirs) * 128):
# TODO: Use appropriate custom exception
raise Exception(
"Buffer did not advance as expected when writing out directory entries"
)
def _write_Content(self, obuf, layout: ECMA376EncryptedLayout):
for d in self._dirs:
size = len(d.Content)
if size:
if size <= 4096: # Small content goes in the minifat section
obuf.seek(layout.offsetMiniData(d.StartingSectorLocation))
obuf.write(d.Content)
else:
obuf.seek(layout.offsetData(d.StartingSectorLocation))
obuf.write(d.Content)
def _write_FAT_start(self, obuf, layout: ECMA376EncryptedLayout):
v = ([SectorTypes.DIFSECT] * layout.difatSectorNum) + (
[SectorTypes.FATSECT] * layout.fatSectorNum
)
v += [
layout.numMiniFatSectors,
layout.directoryEntrySectorNum,
layout.miniFatDataSectorNum,
layout.encryptionPackageSectorNum,
]
obuf.seek(layout.offsetFat)
self._write_FAT(obuf, v, layout.fatSectorNum * layout.sectorSize)
def _write_MiniFAT(self, obuf, layout: ECMA376EncryptedLayout):
obuf.seek(layout.offsetMiniFat)
self._write_FAT(
obuf, layout.miniFatSectors, layout.numMiniFatSectors * layout.sectorSize
)
def _write_FAT(self, obuf, entries, blockSize):
v = 0
startPos = obuf.tell()
max_n = blockSize // 4 # 4 bytes per entry with <I
# TODO: Use appropriate custom exception
for e in entries:
if e <= SectorTypes.MAXREGSECT:
for j in range(1, e):
v += 1
if v > max_n:
raise Exception("Attempting to write beyond block size")
obuf.write(pack("<I", v))
if v == max_n:
raise Exception("Attempting to write beyond block size")
obuf.write(pack("<I", SectorTypes.ENDOFCHAIN))
else:
if v == max_n:
raise Exception("Attempting to write beyond block size")
obuf.write(pack("<I", e))
v += 1
obuf.write(pack("<I", SectorTypes.FREESECT) * (max_n - v))
if obuf.tell() - startPos != blockSize:
# TODO: Use appropriate custom exception
raise Exception("_write_FAT() did not completely fill the block space.")
def _write_DIFAT(self, obuf, layout: ECMA376EncryptedLayout):
if layout.difatSectorNum < 1:
return
v = Header.FIRSTNUMDIFAT + layout.difatSectorNum
for i in range(layout.difatSectorNum):
obuf.seek(layout.offsetDifat(i))
for j in range(layout.sectorSize // 4 - 1): # 4 == sizeof(32 bit int)
obuf.write(pack("<I", v))
v += 1
if v > layout.difatSectorNum + layout.fatSectorNum:
for k in range(j, layout.sectorSize // 4 - 1):
obuf.write(pack("<I", SectorTypes.FREESECT))
obuf.write(pack("<I", SectorTypes.ENDOFCHAIN))
return
# The next seek is _probably_ not needed...
obuf.seek(layout.offsetDifat(i) + layout.sectorSize - 4)
obuf.write(pack("<I", layout.difatPos + i + 1))
def _detect_sector_num(self, layout: ECMA376EncryptedLayout):
numInFat = layout.sectorSize // 4 # Number of 4-bytes integers
difatSectorNum = 0
fatSectorNum = 0
for i in range(10):
a = self._get_block_num(
difatSectorNum + fatSectorNum + layout.contentSectorNum, numInFat
)
b = (
0
if a <= Header.FIRSTNUMDIFAT
else self._get_block_num(a - Header.FIRSTNUMDIFAT, numInFat - 1)
)
if (b == difatSectorNum) and (a == fatSectorNum):
layout.fatSectorNum = fatSectorNum
layout.difatSectorNum = difatSectorNum
return
difatSectorNum = b
fatSectorNum = a
raise IndexError(
"Unable to detect sector number within a reasonable number of loops"
)
def _set_sector_locations_of_streams(self, layout: ECMA376EncryptedLayout):
# Use all streams, except the encrypted package which is special (and the main reason why we're doing all this!)
streamsOfInterest = list(
filter(
lambda d: d.Type == DirectoryEntryType.STREAM
and d.Name != "EncryptedPackage",
self._dirs,
)
)
miniFatSectors = []
miniFatNum = 0
miniFatDataSectorNum = 0
pos = 0
for s in streamsOfInterest:
n = self._get_MiniFAT_sector_number(len(s.Content))
miniFatSectors.append(n)
s.StartingSectorLocation = pos
pos += n
miniFatNum = pos
miniFatDataSectorNum = self._get_block_num(
miniFatNum, (self._header.sectorSize // 64)
)
if self._get_block_num(miniFatDataSectorNum, 128) > 1:
raise ValueError("Unexpected layout size; too large")
layout.miniFatNum = miniFatNum
layout.miniFatDataSectorNum = miniFatDataSectorNum
layout.miniFatSectors = miniFatSectors
layout.directoryEntrySectorNum = self._get_block_num(len(self._dirs), 4)
layout.encryptionPackageSectorNum = self._get_block_num(
len(self._dirs[DSPos.iEncryptionPackage].Content), layout.sectorSize
)
def _get_MiniFAT_sector_number(self, size):
return self._get_block_num(size, 64)
def _get_block_num(self, x, block):
return (x + block - 1) // block
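# Illustrative sketch, not part of msoffcrypto: the ceiling division behind
# _get_block_num, applied to the unit sizes this container uses. The helper
# name and the sample sizes are hypothetical.
def _example_block_num(x, block):
    """Return how many block-sized units are needed to hold x items (rounds up)."""
    return (x + block - 1) // block


# Usage:
#   _example_block_num(11, 4)      # 11 directory entries, 4 per 512-byte sector
#   _example_block_num(200, 64)    # a 200-byte stream in 64-byte mini sectors
#   _example_block_num(5000, 512)  # a 5000-byte EncryptedPackage in 512-byte sectors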
================================================
FILE: msoffcrypto/method/ecma376_agile.py
================================================
from __future__ import annotations
import base64
import functools
import hmac
import io
import logging
import secrets
from hashlib import sha1, sha256, sha384, sha512
from struct import pack, unpack
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from msoffcrypto import exceptions
from msoffcrypto.method.container.ecma376_encrypted import ECMA376Encrypted
logger = logging.getLogger(__name__)
logger.addHandler(logging.NullHandler())
ALGORITHM_HASH = {
"SHA1": sha1,
"SHA256": sha256,
"SHA384": sha384,
"SHA512": sha512,
}
blkKey_VerifierHashInput = bytearray([0xFE, 0xA7, 0xD2, 0x76, 0x3B, 0x4B, 0x9E, 0x79])
blkKey_encryptedVerifierHashValue = bytearray(
[0xD7, 0xAA, 0x0F, 0x6D, 0x30, 0x61, 0x34, 0x4E]
)
blkKey_encryptedKeyValue = bytearray([0x14, 0x6E, 0x0B, 0xE7, 0xAB, 0xAC, 0xD0, 0xD6])
blkKey_dataIntegrity1 = bytearray([0x5F, 0xB2, 0xAD, 0x01, 0x0C, 0xB9, 0xE1, 0xF6])
blkKey_dataIntegrity2 = bytearray([0xA0, 0x67, 0x7F, 0x02, 0xB2, 0x2C, 0x84, 0x33])
def _random_buffer(sz):
return secrets.token_bytes(sz)
def _get_num_blocks(sz, block):
return (sz + block - 1) // block
def _round_up(sz, block):
return _get_num_blocks(sz, block) * block
def _resize_buffer(buf, n, c=b"\0"):
if len(buf) >= n:
return buf[:n]
return buf + c * (n - len(buf))
def _normalize_key(key, n):
return _resize_buffer(key, n, b"\x36")
def _get_hash_func(algorithm):
return ALGORITHM_HASH.get(algorithm, sha1)
def _decrypt_aes_cbc(data, key, iv):
aes = Cipher(algorithms.AES(key), modes.CBC(iv), backend=default_backend())
decryptor = aes.decryptor()
decrypted = decryptor.update(data) + decryptor.finalize()
return decrypted
def _encrypt_aes_cbc(data, key, iv):
aes = Cipher(algorithms.AES(key), modes.CBC(iv), backend=default_backend())
encryptor = aes.encryptor()
encrypted = encryptor.update(data) + encryptor.finalize()
return encrypted
def _encrypt_aes_cbc_padded(data, key, iv, blockSize):
buf = data
if len(buf) % blockSize:
buf = _resize_buffer(buf, _round_up(len(buf), blockSize))
return _encrypt_aes_cbc(buf, key, iv)
def _get_salt(salt_value=None, salt_size=16):
if salt_value is not None:
if len(salt_value) != salt_size:
raise exceptions.EncryptionError(
f"Invalid salt value size, should be {salt_size}"
)
return salt_value
return _random_buffer(salt_size)
# Hardcoded to AES256 + SHA512 for OOXML.
class ECMA376AgileCipherParams:
def __init__(self):
self.cipherName = "AES"
self.hashName = "SHA512"
self.saltSize = 16
self.blockSize = 16
self.keyBits = 256
self.hashSize = 64
self.saltValue: bytes | None = None
def _enc64(b):
return base64.b64encode(b).decode("UTF-8")
class ECMA376AgileEncryptionInfo:
def __init__(self):
self.spinCount = 100000
self.keyData = ECMA376AgileCipherParams()
self.encryptedHmacKey: bytes | None = None
self.encryptedHmacValue: bytes | None = None
self.encryptedKey = ECMA376AgileCipherParams()
self.encryptedVerifierHashInput: bytes | None = None
self.encryptedVerifierHashValue: bytes | None = None
self.encryptedKeyValue: bytes | None = None
def getEncryptionDescriptorHeader(self):
# https://learn.microsoft.com/en-us/openspecs/office_file_formats/ms-offcrypto/87020a34-e73f-4139-99bc-bbdf6cf6fa55
return pack("<HHI", 4, 4, 0x40)
def toEncryptionDescriptor(self):
"""
Returns an XML description of the encryption information.
"""
return f"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<encryption xmlns="http://schemas.microsoft.com/office/2006/encryption" xmlns:p="http://schemas.microsoft.com/office/2006/keyEncryptor/password" xmlns:c="http://schemas.microsoft.com/office/2006/keyEncryptor/certificate">
<keyData saltSize="{self.keyData.saltSize}" blockSize="{self.keyData.blockSize}" keyBits="{self.keyData.keyBits}" hashSize="{self.keyData.hashSize}"
cipherAlgorithm="{self.keyData.cipherName}" cipherChaining="ChainingModeCBC" hashAlgorithm="{self.keyData.hashName}" saltValue="{_enc64(self.keyData.saltValue)}" />
<dataIntegrity encryptedHmacKey="{_enc64(self.encryptedHmacKey)}" encryptedHmacValue="{_enc64(self.encryptedHmacValue)}" />
<keyEncryptors>
<keyEncryptor uri="http://schemas.microsoft.com/office/2006/keyEncryptor/password">
<p:encryptedKey spinCount="{self.spinCount}" saltSize="{self.encryptedKey.saltSize}" blockSize="{self.encryptedKey.blockSize}" keyBits="{self.encryptedKey.keyBits}"
hashSize="{self.encryptedKey.hashSize}" cipherAlgorithm="{self.encryptedKey.cipherName}" cipherChaining="ChainingModeCBC" hashAlgorithm="{self.encryptedKey.hashName}"
saltValue="{_enc64(self.encryptedKey.saltValue)}" encryptedVerifierHashInput="{_enc64(self.encryptedVerifierHashInput)}"
encryptedVerifierHashValue="{_enc64(self.encryptedVerifierHashValue)}" encryptedKeyValue="{_enc64(self.encryptedKeyValue)}" />
</keyEncryptor>
</keyEncryptors>
</encryption>
"""
def _generate_iv(params: ECMA376AgileCipherParams, blkKey, salt_value):
if not blkKey:
return _normalize_key(salt_value, params.blockSize)
hashCalc = _get_hash_func(params.hashName)
return _normalize_key(hashCalc(salt_value + blkKey).digest(), params.blockSize)
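# Illustrative sketch, not part of msoffcrypto: the IV rule that _generate_iv
# applies. Without a block key the salt itself is used; with one, the IV is
# hash(salt + block key). Either way the result is truncated, or padded with
# 0x36 bytes, to the cipher block size. The helper name and its defaults
# (SHA-512, 16-byte blocks) are assumptions made for this example.
import hashlib


def _example_generate_iv(salt, block_key=None, block_size=16, hash_name="sha512"):
    """Return a block_size-byte IV derived from salt and an optional block key."""
    raw = salt if not block_key else hashlib.new(hash_name, salt + block_key).digest()
    if len(raw) >= block_size:
        return raw[:block_size]
    return raw + b"\x36" * (block_size - len(raw))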
class ECMA376Agile:
def __init__(self):
pass
@staticmethod
def _derive_iterated_hash_from_password(
password, saltValue, hashAlgorithm, spinValue
):
r"""
Do a partial password-based hash derivation.
Note the block key is not taken into consideration in this function.
"""
# TODO: This function is quite expensive and it should only be called once.
# We need to save the result for later use.
# This is not covered by the specification, but MS Word does so.
hashCalc = _get_hash_func(hashAlgorithm)
# NOTE: Initial round: hash(salt + password), where hash is the configured algorithm (e.g. SHA-512)
h = hashCalc(saltValue + password.encode("UTF-16LE"))
# NOTE: Iterate i from 0 to spinValue - 1: hash = hash(i as little-endian uint32 + previous hash)
for i in range(0, spinValue, 1):
h = hashCalc(pack("<I", i) + h.digest())
return h
@staticmethod
def _derive_encryption_key(h, blockKey, hashAlgorithm, keyBits):
r"""
Finish the password-based key derivation by hashing last hash + blockKey.
"""
hashCalc = _get_hash_func(hashAlgorithm)
h_final = hashCalc(h + blockKey)
# NOTE: Needed to truncate encryption key to bitsize
encryption_key = h_final.digest()[: keyBits // 8]
return encryption_key
@staticmethod
def decrypt(key, keyDataSalt, hashAlgorithm, ibuf):
r"""
Return decrypted data.
>>> key = b'@ f\t\xd9\xfa\xad\xf2K\x07j\xeb\xf2\xc45\xb7B\x92\xc8\xb8\xa7\xaa\x81\xbcg\x9b\xe8\x97\x11\xb0*\xc2'
>>> keyDataSalt = b'\x8f\xc7x"+P\x8d\xdcL\xe6\x8c\xdd\x15<\x16\xb4'
>>> hashAlgorithm = 'SHA512'
"""
# NOTE: See https://learn.microsoft.com/en-us/openspecs/office_file_formats/ms-offcrypto/9e61da63-8ddb-4c0a-b25d-f85d990f44c8
SEGMENT_LENGTH = 4096
hashCalc = _get_hash_func(hashAlgorithm)
obuf = io.BytesIO()
# NOTE: See https://learn.microsoft.com/en-us/openspecs/office_file_formats/ms-offcrypto/b60c8b35-2db2-4409-8710-59d88a793f83
ibuf.seek(0)
totalSize = unpack("<Q", ibuf.read(8))
totalSize = totalSize[0]
logger.debug("totalSize: {}".format(totalSize))
remaining = totalSize
for i, buf in enumerate(
iter(functools.partial(ibuf.read, SEGMENT_LENGTH), b"")
):
saltWithBlockKey = keyDataSalt + pack("<I", i)
iv = hashCalc(saltWithBlockKey).digest()
iv = iv[:16]
aes = Cipher(algorithms.AES(key), modes.CBC(iv), backend=default_backend())
decryptor = aes.decryptor()
dec = decryptor.update(buf) + decryptor.finalize()
# TODO: Check
if remaining < len(dec):
dec = dec[:remaining]
obuf.write(dec)
remaining -= len(dec)
# TODO: Check if this is needed
if remaining <= 0:
break
return obuf.getvalue() # return obuf.getbuffer()
@staticmethod
def encrypt(key, ibuf, salt_value=None, spin_count=100000):
"""
Return an OLE compound file buffer (complete with headers) which contains ibuf encrypted into a single stream.
When salt_value is not specified (the default), we generate a random one.
"""
# Encryption ported from C++ (https://github.com/herumi/msoffice, BSD-3)
info, secret_key = ECMA376Agile.generate_encryption_parameters(
key, salt_value, spin_count
)
encrypted_data = ECMA376Agile.encrypt_payload(
ibuf, info.encryptedKey, secret_key, info.keyData.saltValue
)
encryption_info = ECMA376Agile.get_encryption_information(
info, encrypted_data, secret_key
)
obuf = io.BytesIO()
ECMA376Encrypted(encrypted_data, encryption_info).write_to(obuf)
return obuf.getvalue()
@staticmethod
def get_encryption_information(
info: ECMA376AgileEncryptionInfo, encrypted_data, secretKey
):
"""
Return the content of an EncryptionInfo Stream, including the short header, per the specifications at
https://learn.microsoft.com/en-us/openspecs/office_file_formats/ms-offcrypto/87020a34-e73f-4139-99bc-bbdf6cf6fa55
"""
hmacKey, hmacValue = ECMA376Agile.generate_integrity_parameter(
encrypted_data, info.keyData, secretKey, info.keyData.saltValue
)
info.encryptedHmacKey = hmacKey
info.encryptedHmacValue = hmacValue
xml_descriptor = info.toEncryptionDescriptor().encode("UTF-8")
header_descriptor = info.getEncryptionDescriptorHeader()
return header_descriptor + xml_descriptor
@staticmethod
def generate_encryption_parameters(key, salt_value=None, spin_count=100000):
"""
Generates encryption parameters used to encrypt a payload.
Returns the information + a secret key.
"""
info = ECMA376AgileEncryptionInfo()
info.spinCount = spin_count
info.encryptedKey.saltValue = _get_salt(salt_value, info.encryptedKey.saltSize)
h = ECMA376Agile._derive_iterated_hash_from_password(
key, info.encryptedKey.saltValue, info.encryptedKey.hashName, info.spinCount
).digest()
key1 = ECMA376Agile._derive_encryption_key(
h,
blkKey_VerifierHashInput,
info.encryptedKey.hashName,
info.encryptedKey.keyBits,
)
key2 = ECMA376Agile._derive_encryption_key(
h,
blkKey_encryptedVerifierHashValue,
info.encryptedKey.hashName,
info.encryptedKey.keyBits,
)
key3 = ECMA376Agile._derive_encryption_key(
h,
blkKey_encryptedKeyValue,
info.encryptedKey.hashName,
info.encryptedKey.keyBits,
)
verifierHashInput = _random_buffer(info.encryptedKey.saltSize)
verifierHashInput = _resize_buffer(
verifierHashInput,
_round_up(len(verifierHashInput), info.encryptedKey.blockSize),
)
info.encryptedVerifierHashInput = _encrypt_aes_cbc(
verifierHashInput, key1, info.encryptedKey.saltValue
)
hashedVerifier = _get_hash_func(info.encryptedKey.hashName)(
verifierHashInput
).digest()
hashedVerifier = _resize_buffer(
hashedVerifier, _round_up(len(hashedVerifier), info.encryptedKey.blockSize)
)
info.encryptedVerifierHashValue = _encrypt_aes_cbc(
hashedVerifier, key2, info.encryptedKey.saltValue
)
secret_key = _random_buffer(info.encryptedKey.saltSize)
secret_key = _normalize_key(secret_key, info.encryptedKey.keyBits // 8)
info.encryptedKeyValue = _encrypt_aes_cbc(
secret_key, key3, info.encryptedKey.saltValue
)
info.keyData.saltValue = _get_salt(salt_size=info.keyData.saltSize)
return info, secret_key
@staticmethod
def encrypt_payload(ibuf, params: ECMA376AgileCipherParams, secret_key, salt_value):
"""
Encrypts a payload using the params and secrets passed in.
Returns the encrypted data as a byte array.
"""
# The specification calls for storing the original (unpadded) size as a 64-bit little-endian
# number at the start of the buffer. We'll loop while there's data, and come back at the
# end to update the total size, instead of seeking to the end of ibuf to get the size,
# just in case ibuf is a streaming buffer...
total_size = 0
obuf = io.BytesIO()
obuf.write(pack("<Q", total_size))
hashCalc = _get_hash_func(params.hashName)
SEGMENT_LENGTH = 4096
i = 0
while True:
buf = ibuf.read(SEGMENT_LENGTH)
if not buf:
break
iv = _normalize_key(
hashCalc(salt_value + pack("<I", i)).digest(), params.saltSize
)
# Per the specifications, we need to make sure the last chunk is padded to our
# block size
enc = _encrypt_aes_cbc_padded(buf, secret_key, iv, params.blockSize)
obuf.write(enc)
total_size += len(buf)
i += 1
# Update size in the header
obuf.seek(0)
obuf.write(pack("<Q", total_size))
return obuf.getvalue()
@staticmethod
def generate_integrity_parameter(
encrypted_data, params: ECMA376AgileCipherParams, secret_key, salt_value
):
"""
Returns the encrypted HmacKey and HmacValue.
"""
salt = _random_buffer(params.hashSize)
iv1 = _generate_iv(params, blkKey_dataIntegrity1, salt_value)
iv2 = _generate_iv(params, blkKey_dataIntegrity2, salt_value)
encryptedHmacKey = _encrypt_aes_cbc(salt, secret_key, iv1)
msg_hmac = hmac.new(salt, encrypted_data, _get_hash_func(params.hashName))
hmacValue = msg_hmac.digest()
encryptedHmacValue = _encrypt_aes_cbc(hmacValue, secret_key, iv2)
return encryptedHmacKey, encryptedHmacValue
@staticmethod
def verify_password(
password,
saltValue,
hashAlgorithm,
encryptedVerifierHashInput,
encryptedVerifierHashValue,
spinValue,
keyBits,
):
r"""
Return True if the given password is valid.
>>> password = 'Password1234_'
>>> saltValue = b'\xcb\xca\x1c\x99\x93C\xfb\xad\x92\x07V4\x15\x004\xb0'
>>> hashAlgorithm = 'SHA512'
>>> encryptedVerifierHashInput = b'9\xee\xa5N&\xe5\x14y\x8c(K\xc7qM8\xac'
>>> encryptedVerifierHashValue = b'\x147mm\x81s4\xe6\xb0\xffO\xd8"\x1a|g\x8e]\x8axN\x8f\x99\x9fL\x18\x890\xc3jK)\xc5\xb33`' + \
... b'[\\\xd4\x03\xb0P\x03\xad\xcf\x18\xcc\xa8\xcb\xab\x8d\xeb\xe3s\xc6V\x04\xa0\xbe\xcf\xae\\\n\xd0'
>>> spinValue = 100000
>>> keyBits = 256
>>> ECMA376Agile.verify_password(password, saltValue, hashAlgorithm, encryptedVerifierHashInput, encryptedVerifierHashValue, spinValue, keyBits)
True
"""
# NOTE: See https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-offcrypto/a57cb947-554f-4e5e-b150-3f2978225e92
h = ECMA376Agile._derive_iterated_hash_from_password(
password, saltValue, hashAlgorithm, spinValue
)
key1 = ECMA376Agile._derive_encryption_key(
h.digest(), blkKey_VerifierHashInput, hashAlgorithm, keyBits
)
key2 = ECMA376Agile._derive_encryption_key(
h.digest(), blkKey_encryptedVerifierHashValue, hashAlgorithm, keyBits
)
hash_input = _decrypt_aes_cbc(encryptedVerifierHashInput, key1, saltValue)
hashCalc = _get_hash_func(hashAlgorithm)
actual_hash = hashCalc(hash_input)
actual_hash = actual_hash.digest()
expected_hash = _decrypt_aes_cbc(encryptedVerifierHashValue, key2, saltValue)
return actual_hash == expected_hash
@staticmethod
def verify_integrity(
secretKey,
keyDataSalt,
keyDataHashAlgorithm,
keyDataBlockSize,
encryptedHmacKey,
encryptedHmacValue,
stream,
):
r"""
Return True if the HMAC of the data payload is valid.
"""
# NOTE: See https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-offcrypto/63d9c262-82b9-4fa3-a06d-d087b93e3b00
hashCalc = _get_hash_func(keyDataHashAlgorithm)
iv1 = hashCalc(keyDataSalt + blkKey_dataIntegrity1).digest()
iv1 = iv1[:keyDataBlockSize]
iv2 = hashCalc(keyDataSalt + blkKey_dataIntegrity2).digest()
iv2 = iv2[:keyDataBlockSize]
hmacKey = _decrypt_aes_cbc(encryptedHmacKey, secretKey, iv1)
hmacValue = _decrypt_aes_cbc(encryptedHmacValue, secretKey, iv2)
msg_hmac = hmac.new(hmacKey, stream.read(), hashCalc)
actualHmac = msg_hmac.digest()
stream.seek(0)
return hmacValue == actualHmac
@staticmethod
def makekey_from_privkey(privkey, encryptedKeyValue):
privkey = serialization.load_pem_private_key(
privkey.read(), password=None, backend=default_backend()
)
skey = privkey.decrypt(encryptedKeyValue, padding.PKCS1v15())
return skey
@staticmethod
def makekey_from_password(
password, saltValue, hashAlgorithm, encryptedKeyValue, spinValue, keyBits
):
r"""
Generate intermediate key from given password.
>>> password = 'Password1234_'
>>> saltValue = b'Lr]E\xdca\x0f\x93\x94\x12\xa0M\xa7\x91\x04f'
>>> hashAlgorithm = 'SHA512'
>>> encryptedKeyValue = b"\xa1l\xd5\x16Zz\xb9\xd2q\x11>\xd3\x86\xa7\x8c\xf4\x96\x92\xe8\xe5'\xb0\xc5\xfc\x00U\xed\x08\x0b|\xb9K"
>>> spinValue = 100000
>>> keyBits = 256
>>> expected = b'@ f\t\xd9\xfa\xad\xf2K\x07j\xeb\xf2\xc45\xb7B\x92\xc8\xb8\xa7\xaa\x81\xbcg\x9b\xe8\x97\x11\xb0*\xc2'
>>> ECMA376Agile.makekey_from_password(password, saltValue, hashAlgorithm, encryptedKeyValue, spinValue, keyBits) == expected
True
"""
h = ECMA376Agile._derive_iterated_hash_from_password(
password, saltValue, hashAlgorithm, spinValue
)
encryption_key = ECMA376Agile._derive_encryption_key(
h.digest(), blkKey_encryptedKeyValue, hashAlgorithm, keyBits
)
skey = _decrypt_aes_cbc(encryptedKeyValue, encryption_key, saltValue)
return skey
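# Illustrative sketch (not part of the library): the password-based derivation
# and the per-segment IV used above can be reproduced with hashlib alone.
# The helper names, the lowercase "sha512" hashlib name, and EXAMPLE_BLOCK_KEY
# are assumptions made for this example; real callers use the blkKey_* constants
# and _get_hash_func defined in this module.

```python
import hashlib
from struct import pack

# Placeholder block key for illustration only (hypothetical value).
EXAMPLE_BLOCK_KEY = bytes(8)


def derive_agile_key(password, salt, spin_count, block_key, key_bits, hash_name="sha512"):
    """H0 = H(salt + password); Hn = H(n + Hn-1); key = H(Hlast + block_key), truncated."""
    h = hashlib.new(hash_name, salt + password.encode("UTF-16LE")).digest()
    for i in range(spin_count):
        # The iteration counter is prepended as a 32-bit little-endian integer.
        h = hashlib.new(hash_name, pack("<I", i) + h).digest()
    return hashlib.new(hash_name, h + block_key).digest()[: key_bits // 8]


def segment_iv(key_data_salt, segment_index, hash_name="sha512", block_size=16):
    """IV for one 4096-byte segment: H(salt + segment index), truncated to the block size."""
    digest = hashlib.new(hash_name, key_data_salt + pack("<I", segment_index)).digest()
    return digest[:block_size]
```

# Because the iterated hash does not depend on the block key, it can be computed
# once and reused for each of the three derived keys, as the TODO above suggests.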
================================================
FILE: msoffcrypto/method/ecma376_extensible.py
================================================
class ECMA376Extensible:
def __init__(self):
pass
================================================
FILE: msoffcrypto/method/ecma376_standard.py
================================================
import io
import logging
from hashlib import sha1
from struct import pack, unpack
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
logger = logging.getLogger(__name__)
logger.addHandler(logging.NullHandler())
class ECMA376Standard:
def __init__(self):
pass
@staticmethod
def decrypt(key, ibuf):
r"""
Return decrypted data.
"""
obuf = io.BytesIO()
totalSize = unpack("<I", ibuf.read(4))[0]
logger.debug("totalSize: {}".format(totalSize))
ibuf.seek(8)
aes = Cipher(algorithms.AES(key), modes.ECB(), backend=default_backend())
decryptor = aes.decryptor()
x = ibuf.read()
dec = decryptor.update(x) + decryptor.finalize()
obuf.write(dec[:totalSize])
return obuf.getvalue() # return obuf.getbuffer()
@staticmethod
def verifykey(key, encryptedVerifier, encryptedVerifierHash):
r"""
Return True if the given intermediate key is valid.
>>> key = b'@\xb1:q\xf9\x0b\x96n7T\x08\xf2\xd1\x81\xa1\xaa'
>>> encryptedVerifier = b'Qos.\x96o\xac\x17\xb1\xc5\xd7\xd8\xcc6\xc9('
>>> encryptedVerifierHash = b'+ah\xda\xbe)\x11\xad+\xd3|\x17Ft\\\x14\xd3\xcf\x1b\xb1@\xa4\x8fNo=#\x88\x08r\xb1j'
>>> ECMA376Standard.verifykey(key, encryptedVerifier, encryptedVerifierHash)
True
"""
# TODO: For consistency with Agile, rename method to verify_password or the like
logger.debug([key, encryptedVerifier, encryptedVerifierHash])
# https://msdn.microsoft.com/en-us/library/dd926426(v=office.12).aspx
aes = Cipher(algorithms.AES(key), modes.ECB(), backend=default_backend())
decryptor = aes.decryptor()
verifier = decryptor.update(encryptedVerifier)
expected_hash = sha1(verifier).digest()
decryptor = aes.decryptor()
verifierHash = decryptor.update(encryptedVerifierHash)[: sha1().digest_size]
return expected_hash == verifierHash
@staticmethod
def makekey_from_password(
password, algId, algIdHash, providerType, keySize, saltSize, salt
):
r"""
Generate intermediate key from given password.
>>> password = 'Password1234_'
>>> algId = 0x660e
>>> algIdHash = 0x8004
>>> providerType = 0x18
>>> keySize = 128
>>> saltSize = 16
>>> salt = b'\xe8\x82fI\x0c[\xd1\xee\xbd+C\x94\xe3\xf80\xef'
>>> expected = b'@\xb1:q\xf9\x0b\x96n7T\x08\xf2\xd1\x81\xa1\xaa'
>>> ECMA376Standard.makekey_from_password(password, algId, algIdHash, providerType, keySize, saltSize, salt) == expected
True
"""
logger.debug(
[
password,
hex(algId),
hex(algIdHash),
hex(providerType),
keySize,
saltSize,
salt,
]
)
xor_bytes = lambda a, b: bytearray(
[p ^ q for p, q in zip(bytearray(a), bytearray(b))]
) # bytearray() for Python 2 compat.
# https://msdn.microsoft.com/en-us/library/dd925430(v=office.12).aspx
ITER_COUNT = 50000
password = password.encode("UTF-16LE")
h = sha1(salt + password).digest()
for i in range(ITER_COUNT):
ibytes = pack("<I", i)
h = sha1(ibytes + h).digest()
block = 0
blockbytes = pack("<I", block)
hfinal = sha1(h + blockbytes).digest()
cbRequiredKeyLength = keySize // 8
cbHash = sha1().digest_size
buf1 = b"\x36" * 64
buf1 = xor_bytes(hfinal, buf1[:cbHash]) + buf1[cbHash:]
x1 = sha1(buf1).digest()
buf2 = b"\x5c" * 64
buf2 = xor_bytes(hfinal, buf2[:cbHash]) + buf2[cbHash:]
x2 = sha1(buf2).digest() # Only contributes when the key is longer than one SHA-1 digest (20 bytes)
x3 = x1 + x2
keyDerived = x3[:cbRequiredKeyLength]
logger.debug(keyDerived)
return keyDerived
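# Illustrative sketch (not part of the library): the X1/X2 expansion in
# makekey_from_password, condensed into a standalone function. The name
# derive_standard_key and the iterations parameter are assumptions made for
# this example; the library fixes the iteration count at 50000.

```python
from hashlib import sha1
from struct import pack


def derive_standard_key(password, salt, key_bits, iterations=50000):
    """Derive an ECMA-376 Standard key using SHA-1, mirroring the steps above."""
    h = sha1(salt + password.encode("UTF-16LE")).digest()
    for i in range(iterations):
        h = sha1(pack("<I", i) + h).digest()
    hfinal = sha1(h + pack("<I", 0)).digest()  # block number 0

    def expand(pad_byte):
        # XOR the hash into the first 20 bytes of a 64-byte buffer filled with pad_byte.
        buf = bytes(b ^ pad_byte for b in hfinal) + bytes([pad_byte]) * (64 - len(hfinal))
        return sha1(buf).digest()

    x1, x2 = expand(0x36), expand(0x5C)
    # x1 alone covers keys up to 160 bits; x2 supplies the remaining bytes for longer keys.
    return (x1 + x2)[: key_bits // 8]
```

# With a 128-bit key only x1 is consumed; a 256-bit key takes all of x1 plus
# the first 12 bytes of x2.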
================================================
FILE: msoffcrypto/method/rc4.py
================================================
import functools
import io
import logging
from hashlib import md5
from struct import pack
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.ciphers import Cipher
try:
# NOTE: Avoid DeprecationWarning since cryptography>=43.0
# TODO: .algorithm differs from the official documentation
from cryptography.hazmat.decrepit.ciphers.algorithms import ARC4
except ImportError:
from cryptography.hazmat.primitives.ciphers.algorithms import ARC4
logger = logging.getLogger(__name__)
logger.addHandler(logging.NullHandler())
def _makekey(password, salt, block):
r"""
Return an intermediate key.
>>> password = 'password1'
>>> salt = b'\xe8w,\x1d\x91\xc5j7\x96Ga\xb2\x80\x182\x17'
>>> block = 0
>>> expected = b' \xbf2\xdd\xf5@\x85\x8cQ7D\xaf\x0f$\xe0<'
>>> _makekey(password, salt, block) == expected
True
"""
# https://msdn.microsoft.com/en-us/library/dd920360(v=office.12).aspx
password = password.encode("UTF-16LE")
h0 = md5(password).digest()
truncatedHash = h0[:5]
intermediateBuffer = (truncatedHash + salt) * 16
h1 = md5(intermediateBuffer).digest()
truncatedHash = h1[:5]
blockbytes = pack("<I", block)
hfinal = md5(truncatedHash + blockbytes).digest()
key = hfinal[: 128 // 8]
return key
class DocumentRC4:
def __init__(self):
pass
@staticmethod
def verifypw(password, salt, encryptedVerifier, encryptedVerifierHash):
r"""
Return True if the given password is valid.
>>> password = 'password1'
>>> salt = b'\xe8w,\x1d\x91\xc5j7\x96Ga\xb2\x80\x182\x17'
>>> encryptedVerifier = b'\xc9\xe9\x97\xd4T\x97=1\x0b\xb1\xbap\x14&\x83~'
>>> encryptedVerifierHash = b'\xb1\xde\x17\x8f\x07\xe9\x89\xc4M\xae^L\xf9j\xc4\x07'
>>> DocumentRC4.verifypw(password, salt, encryptedVerifier, encryptedVerifierHash)
True
"""
# https://msdn.microsoft.com/en-us/library/dd952648(v=office.12).aspx
block = 0
key = _makekey(password, salt, block)
cipher = Cipher(ARC4(key), mode=None, backend=default_backend())
decryptor = cipher.decryptor()
verifier = decryptor.update(encryptedVerifier)
verifierHash = decryptor.update(encryptedVerifierHash)
hash = md5(verifier).digest()
logger.debug([verifierHash, hash])
return hash == verifierHash
@staticmethod
def decrypt(password, salt, ibuf, blocksize=0x200):
r"""
Return decrypted data.
"""
obuf = io.BytesIO()
block = 0
key = _makekey(password, salt, block)
for c, buf in enumerate(iter(functools.partial(ibuf.read, blocksize), b"")):
cipher = Cipher(ARC4(key), mode=None, backend=default_backend())
decryptor = cipher.decryptor()
dec = decryptor.update(buf) + decryptor.finalize()
obuf.write(dec)
# From wvDecrypt:
# at this stage we need to rekey the rc4 algorithm
# Dieter Spaar <spaar@mirider.augusta.de> figured out
# this rekeying, big kudos to him
block += 1
key = _makekey(password, salt, block)
obuf.seek(0)
return obuf
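# Illustrative sketch (not part of the library): decrypt() above rekeys RC4 for
# every block of `blocksize` bytes. The helpers below list the per-block keys
# for a stream of a given length using hashlib only. The names rc4_block_key
# and block_keys are assumptions made for this example.

```python
from hashlib import md5
from struct import pack


def rc4_block_key(password, salt, block):
    """Same steps as _makekey: two truncated MD5 rounds, then the block number."""
    truncated = md5(password.encode("UTF-16LE")).digest()[:5]
    truncated = md5((truncated + salt) * 16).digest()[:5]
    return md5(truncated + pack("<I", block)).digest()


def block_keys(password, salt, data_len, blocksize=0x200):
    """Return one RC4 key per block needed to cover data_len bytes."""
    n_blocks = -(-data_len // blocksize)  # ceiling division
    return [rc4_block_key(password, salt, b) for b in range(n_blocks)]
```

# For example, a 1025-byte stream with the default 0x200 block size needs three
# keys, for blocks 0, 1 and 2.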
================================================
FILE: msoffcrypto/method/rc4_cryptoapi.py
================================================
import functools
import io
import logging
from hashlib import sha1
from struct import pack
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.ciphers import Cipher
try:
# NOTE: Avoid DeprecationWarning since cryptography>=43.0
# TODO: .algorithm differs from the official documentation
from cryptography.hazmat.decrepit.ciphers.algorithms import ARC4
except ImportError:
from cryptography.hazmat.primitives.ciphers.algorithms import ARC4
logger = logging.getLogger(__name__)
logger.addHandler(logging.NullHandler())
def _makekey(password, salt, keyLength, block, algIdHash=0x00008004):
r"""
Return an intermediate key.
"""
# https://msdn.microsoft.com/en-us/library/dd920677(v=office.12).aspx
password = password.encode("UTF-16LE")
h0 = sha1(salt + password).digest()
blockbytes = pack("<I", block)
hfinal = sha1(h0 + blockbytes).digest()
if keyLength == 40:
key = hfinal[:5] + b"\x00" * 11
else:
key = hfinal[: keyLength // 8]
return key
class DocumentRC4CryptoAPI:
def __init__(self):
pass
@staticmethod
def verifypw(
password,
salt,
keySize,
encryptedVerifier,
encryptedVerifierHash,
algId=0x00006801,
block=0,
):
r"""
Return True if the given password is valid.
"""
# TODO: For consistency with others, rename method to verify_password or the like
# https://msdn.microsoft.com/en-us/library/dd953617(v=office.12).aspx
key = _makekey(password, salt, keySize, block)
cipher = Cipher(ARC4(key), mode=None, backend=default_backend())
decryptor = cipher.decryptor()
verifier = decryptor.update(encryptedVerifier)
verifierHash = decryptor.update(encryptedVerifierHash)
hash = sha1(verifier).digest()
logger.debug([verifierHash, hash])
return hash == verifierHash
@staticmethod
def decrypt(password, salt, keySize, ibuf, blocksize=0x200, block=0):
r"""
Return decrypted data.
"""
obuf = io.BytesIO()
key = _makekey(password, salt, keySize, block)
for c, buf in enumerate(iter(functools.partial(ibuf.read, blocksize), b"")):
cipher = Cipher(ARC4(key), mode=None, backend=default_backend())
decryptor = cipher.decryptor()
dec = decryptor.update(buf) + decryptor.finalize()
obuf.write(dec)
# From wvDecrypt:
# at this stage we need to rekey the rc4 algorithm
# Dieter Spaar <spaar@mirider.augusta.de> figured out
# this rekeying, big kudos to him
block += 1
key = _makekey(password, salt, keySize, block)
obuf.seek(0)
return obuf
================================================
FILE: msoffcrypto/method/xor_obfuscation.py
================================================
import io
import logging
from hashlib import md5
from struct import pack
logger = logging.getLogger(__name__)
logger.addHandler(logging.NullHandler())
def _makekey(password, salt, block):
r"""
Return an intermediate key.
>>> password = 'password1'
>>> salt = b'\xe8w,\x1d\x91\xc5j7\x96Ga\xb2\x80\x182\x17'
>>> block = 0
>>> expected = b' \xbf2\xdd\xf5@\x85\x8cQ7D\xaf\x0f$\xe0<'
>>> _makekey(password, salt, block) == expected
True
"""
# https://msdn.microsoft.com/en-us/library/dd920360(v=office.12).aspx
password = password.encode("UTF-16LE")
h0 = md5(password).digest()
truncatedHash = h0[:5]
intermediateBuffer = (truncatedHash + salt) * 16
h1 = md5(intermediateBuffer).digest()
truncatedHash = h1[:5]
blockbytes = pack("<I", block)
hfinal = md5(truncatedHash + blockbytes).digest()
key = hfinal[: 128 // 8]
return key
class DocumentXOR:
def __init__(self):
pass
pad_array = [
0xBB,
0xFF,
0xFF,
0xBA,
0xFF,
0xFF,
0xB9,
0x80,
0x00,
0xBE,
0x0F,
0x00,
0xBF,
0x0F,
0x00,
]
initial_code = [
0xE1F0,
0x1D0F,
0xCC9C,
0x84C0,
0x110C,
0x0E10,
0xF1CE,
0x313E,
0x1872,
0xE139,
0xD40F,
0x84F9,
0x280C,
0xA96A,
0x4EC3,
]
xor_matrix = [
0xAEFC,
0x4DD9,
0x9BB2,
0x2745,
0x4E8A,
0x9D14,
0x2A09,
0x7B61,
0xF6C2,
0xFDA5,
0xEB6B,
0xC6F7,
0x9DCF,
0x2BBF,
0x4563,
0x8AC6,
0x05AD,
0x0B5A,
0x16B4,
0x2D68,
0x5AD0,
0x0375,
0x06EA,
0x0DD4,
0x1BA8,
0x3750,
0x6EA0,
0xDD40,
0xD849,
0xA0B3,
0x5147,
0xA28E,
0x553D,
0xAA7A,
0x44D5,
0x6F45,
0xDE8A,
0xAD35,
0x4A4B,
0x9496,
0x390D,
0x721A,
0xEB23,
0xC667,
0x9CEF,
0x29FF,
0x53FE,
0xA7FC,
0x5FD9,
0x47D3,
0x8FA6,
0x0F6D,
0x1EDA,
0x3DB4,
0x7B68,
0xF6D0,
0xB861,
0x60E3,
0xC1C6,
0x93AD,
0x377B,
0x6EF6,
0xDDEC,
0x45A0,
0x8B40,
0x06A1,
0x0D42,
0x1A84,
0x3508,
0x6A10,
0xAA51,
0x4483,
0x8906,
0x022D,
0x045A,
0x08B4,
0x1168,
0x76B4,
0xED68,
0xCAF1,
0x85C3,
0x1BA7,
0x374E,
0x6E9C,
0x3730,
0x6E60,
0xDCC0,
0xA9A1,
0x4363,
0x86C6,
0x1DAD,
0x3331,
0x6662,
0xCCC4,
0x89A9,
0x0373,
0x06E6,
0x0DCC,
0x1021,
0x2042,
0x4084,
0x8108,
0x1231,
0x2462,
0x48C4,
]
@staticmethod
def verifypw(password, verificationBytes):
r"""
Return True if the given password is valid.
>>> from struct import unpack
>>> password = 'VelvetSweatshop'
>>> (key,) = unpack('<H', b'\x0A\x9A') # 0x9a0a
>>> DocumentXOR.verifypw(password, key)
True
"""
# https://interoperability.blob.core.windows.net/files/MS-OFFCRYPTO/%5bMS-OFFCRYPTO%5d.pdf
verifier = 0
password_array = []
password_array.append(len(password))
password_array.extend([ord(ch) for ch in password])
password_array.reverse()
for password_byte in password_array:
if verifier & 0x4000 == 0x0000:
intermediate_1 = 0
else:
intermediate_1 = 1
intermediate_2 = verifier * 2
intermediate_2 = (
intermediate_2 & 0x7FFF
) # SET most significant bit of Intermediate2 TO 0
intermediate_3 = intermediate_1 ^ intermediate_2
verifier = intermediate_3 ^ password_byte
return (verifier ^ 0xCE4B) == verificationBytes
@staticmethod
def xor_ror(byte1, byte2):
return DocumentXOR.ror(byte1 ^ byte2, 1, 8)
@staticmethod
def create_xor_key_method1(password):
xor_key = DocumentXOR.initial_code[len(password) - 1]
current_element = 0x00000068
data = [ord(ch) for ch in reversed(password)]
for ch in data:
for i in range(7):
if ch & 0x40 != 0:
xor_key = (
xor_key ^ DocumentXOR.xor_matrix[current_element]
) % 65536
ch = (ch << 1) % 256
current_element -= 1
return xor_key
@staticmethod
def create_xor_array_method1(password):
xor_key = DocumentXOR.create_xor_key_method1(password)
index = len(password)
obfuscation_array = [
0x00,
0x00,
0x00,
0x00,
0x00,
0x00,
0x00,
0x00,
0x00,
0x00,
0x00,
0x00,
0x00,
0x00,
0x00,
0x00,
]
if index % 2 == 1:
temp = (
xor_key & 0xFF00
) >> 8 # SET Temp TO most significant byte of XorKey
obfuscation_array[index] = DocumentXOR.xor_ror(
DocumentXOR.pad_array[0], temp
)
index -= 1
temp = xor_key & 0x00FF
password_last_char = ord(password[-1])
obfuscation_array[index] = DocumentXOR.xor_ror(password_last_char, temp)
while index > 0:
index -= 1
temp = (xor_key & 0xFF00) >> 8
obfuscation_array[index] = DocumentXOR.xor_ror(ord(password[index]), temp)
index -= 1
temp = xor_key & 0x00FF
obfuscation_array[index] = DocumentXOR.xor_ror(ord(password[index]), temp)
index = 15
pad_index = 15 - len(password)
while pad_index > 0:
temp = (xor_key & 0xFF00) >> 8
obfuscation_array[index] = DocumentXOR.xor_ror(
DocumentXOR.pad_array[pad_index], temp
)
index -= 1
pad_index -= 1
temp = xor_key & 0x00FF
obfuscation_array[index] = DocumentXOR.xor_ror(
DocumentXOR.pad_array[pad_index], temp
)
index -= 1
pad_index -= 1
return obfuscation_array
@staticmethod
def ror(n, rotations, width):
return (2**width - 1) & (n >> rotations | n << (width - rotations))
@staticmethod
def rol(n, rotations, width):
return (2**width - 1) & (n << rotations | n >> (width - rotations))
@staticmethod
def decrypt(password, ibuf, plaintext, records, base):
r"""
Return decrypted data (DecryptData_Method1)
"""
obuf = io.BytesIO()
xor_array = DocumentXOR.create_xor_array_method1(password)
data_index = 0
record_index = 0
while data_index < len(plaintext):
count = 1
if plaintext[data_index] == -1 or plaintext[data_index] == -2:
for j in range(data_index + 1, len(plaintext)):
if plaintext[j] >= 0:
break
count += 1
if plaintext[data_index] == -2:
xor_array_index = (data_index + count + 4) % 16
else:
xor_array_index = (data_index + count) % 16
temp_res = 0
for item in range(count):
data_byte = ibuf.read(1)
temp_res = data_byte[0] ^ xor_array[xor_array_index]
temp_res = DocumentXOR.ror(temp_res, 5, 8)
obuf.write(temp_res.to_bytes(1, "little"))
xor_array_index += 1
xor_array_index = xor_array_index % 16
record_index += 1
else:
obuf.write(ibuf.read(1))
data_index += count
obuf.seek(0)
return obuf
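# Illustrative sketch (not part of the library): the per-byte step inside
# decrypt() is an XOR with the obfuscation array followed by an 8-bit rotate
# right by 5. The encrypt_byte helper is hypothetical; it is simply the
# algebraic inverse of that step and is not taken from the source.

```python
def ror8(n, rotations):
    """Rotate an 8-bit value right."""
    return 0xFF & (n >> rotations | n << (8 - rotations))


def rol8(n, rotations):
    """Rotate an 8-bit value left."""
    return 0xFF & (n << rotations | n >> (8 - rotations))


def decrypt_byte(cipher_byte, key_byte):
    # Same operation as in DocumentXOR.decrypt: XOR first, then rotate right by 5.
    return ror8(cipher_byte ^ key_byte, 5)


def encrypt_byte(plain_byte, key_byte):
    # Hypothetical inverse: rotate left by 5, then XOR with the same key byte.
    return rol8(plain_byte, 5) ^ key_byte
```

# Applying encrypt_byte and then decrypt_byte with the same key byte returns
# the original value for every 8-bit input.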
================================================
FILE: pyproject.toml
================================================
[tool.poetry]
name = "msoffcrypto-tool"
version = "6.0.0"
description = "Python tool and library for decrypting and encrypting MS Office files using a password or other keys"
license = "MIT"
homepage = "https://github.com/nolze/msoffcrypto-tool"
authors = ["nolze <nolze@int3.net>"]
readme = "README.md"
packages = [{ include = "msoffcrypto" }, { include = "NOTICE.txt" }]
[tool.poetry.dependencies]
python = "^3.10"
cryptography = ">=39.0"
olefile = ">=0.46"
[tool.poetry.group.dev.dependencies]
# pytest = { version = ">=6.2.1", python = "^3.7" }
pytest = "^9.0.2"
coverage = { extras = ["toml"], version = "^7.5" }
[tool.poetry.group.docs.dependencies]
sphinx = "^8"
sphinx-autobuild = "2024.10.02"
furo = "2025.12.19"
myst-parser = "^4.0.1"
sphinxcontrib-autoprogram = "^0.1.8"
[tool.poetry.scripts]
msoffcrypto-tool = 'msoffcrypto.__main__:main'
[tool.poetry.requires-plugins]
poetry-plugin-export = ">=1.8"
[tool.black]
line-length = 140
exclude = '/(\.git|\.pytest_cache|\.venv|\.vscode|dist|docs)/'
[tool.pytest.ini_options]
addopts = "-ra -q --doctest-modules"
testpaths = ["msoffcrypto", "tests"]
[tool.coverage.run]
omit = [".venv/*", "tests/*"]
[build-system]
requires = ["poetry_core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
================================================
FILE: tests/__init__.py
================================================
================================================
FILE: tests/test_cli.py
================================================
import subprocess
import unittest
class CLITest(unittest.TestCase):
def test_cli(self):
# Python 3:
# cp = subprocess.run("./tests/test_cli.sh", shell=True)
# self.assertEqual(cp.returncode, 0)
# For Python 2 compat:
returncode = subprocess.call("./tests/test_cli.sh", shell=True)
self.assertEqual(returncode, 0)
if __name__ == "__main__":
unittest.main()
================================================
FILE: tests/test_cli.sh
================================================
#!/usr/bin/env bash
set -ev
cd "$(dirname "$0")"
msoffcrypto-tool () {
python ../msoffcrypto "$@"
}
# Decryption
msoffcrypto-tool --test inputs/example_password.docx && : ; [ $? = 0 ]
msoffcrypto-tool --test outputs/example.docx && : ; [ $? = 1 ]
msoffcrypto-tool -p Password1234_ inputs/example_password.docx /tmp/example.docx
diff /tmp/example.docx outputs/example.docx
msoffcrypto-tool --test inputs/example_password.xlsx && : ; [ $? = 0 ]
msoffcrypto-tool --test outputs/example.xlsx && : ; [ $? = 1 ]
msoffcrypto-tool -p Password1234_ inputs/example_password.xlsx /tmp/example.xlsx
diff /tmp/example.xlsx outputs/example.xlsx
msoffcrypto-tool --test inputs/ecma376standard_password.docx && : ; [ $? = 0 ]
msoffcrypto-tool --test outputs/ecma376standard_password_plain.docx && : ; [ $? = 1 ]
msoffcrypto-tool -p Password1234_ inputs/ecma376standard_password.docx /tmp/ecma376standard_password_plain.docx
diff /tmp/ecma376standard_password_plain.docx outputs/ecma376standard_password_plain.docx
msoffcrypto-tool --test inputs/rc4cryptoapi_password.doc && : ; [ $? = 0 ]
msoffcrypto-tool --test outputs/rc4cryptoapi_password_plain.doc && : ; [ $? = 1 ]
msoffcrypto-tool -p Password1234_ inputs/rc4cryptoapi_password.doc /tmp/rc4cryptoapi_password_plain.doc
diff /tmp/rc4cryptoapi_password_plain.doc outputs/rc4cryptoapi_password_plain.doc
msoffcrypto-tool --test inputs/rc4cryptoapi_password.xls && : ; [ $? = 0 ]
msoffcrypto-tool --test outputs/rc4cryptoapi_password_plain.xls && : ; [ $? = 1 ]
msoffcrypto-tool -p Password1234_ inputs/rc4cryptoapi_password.xls /tmp/rc4cryptoapi_password_plain.xls
diff /tmp/rc4cryptoapi_password_plain.xls outputs/rc4cryptoapi_password_plain.xls
msoffcrypto-tool --test inputs/rc4cryptoapi_password.ppt && : ; [ $? = 0 ]
msoffcrypto-tool --test outputs/rc4cryptoapi_password_plain.ppt && : ; [ $? = 1 ]
msoffcrypto-tool -p Password1234_ inputs/rc4cryptoapi_password.ppt /tmp/rc4cryptoapi_password_plain.ppt
diff /tmp/rc4cryptoapi_password_plain.ppt outputs/rc4cryptoapi_password_plain.ppt
# Encryption
msoffcrypto-tool -e -p Password1234_ outputs/example.docx /tmp/example_password.docx
msoffcrypto-tool --test /tmp/example_password.docx && : ; [ $? = 0 ]
msoffcrypto-tool -p Password1234_ /tmp/example_password.docx /tmp/example.docx
diff /tmp/example.docx outputs/example.docx
msoffcrypto-tool -e -p Password1234_ outputs/example.xlsx /tmp/example_password.xlsx
msoffcrypto-tool --test /tmp/example_password.xlsx && : ; [ $? = 0 ]
msoffcrypto-tool -p Password1234_ /tmp/example_password.xlsx /tmp/example.xlsx
diff /tmp/example.xlsx outputs/example.xlsx
================================================
FILE: tests/test_compare_known_output.py
================================================
#!/usr/bin/env python
"""Compare output of msoffcrypto-tool for a few input files."""
import os
import sys
import unittest
from difflib import SequenceMatcher
from os.path import abspath, dirname, isfile
from os.path import join as pjoin
from tempfile import mkstemp
try:
import cryptography
except ImportError:
cryptography = None
# add base dir to path so we always import local msoffcrypto
TEST_BASE_DIR = dirname(abspath(__file__))
MODULE_BASE_DIR = dirname(TEST_BASE_DIR)
if sys.path[0] != MODULE_BASE_DIR:
sys.path.insert(0, MODULE_BASE_DIR)
import msoffcrypto
#: encryption password for files tested here
PASSWORD = "Password1234_"
#: input dir
INPUT_DIR = "inputs"
#: pairs of input/output files
EXAMPLE_FILES = (
("example_password.docx", "example.docx", PASSWORD),
("example_password.xlsx", "example.xlsx", PASSWORD),
("ecma376standard_password.docx", "ecma376standard_password_plain.docx", PASSWORD),
("rc4cryptoapi_password.doc", "rc4cryptoapi_password_plain.doc", PASSWORD),
("rc4cryptoapi_password.xls", "rc4cryptoapi_password_plain.xls", PASSWORD),
("rc4cryptoapi_password.ppt", "rc4cryptoapi_password_plain.ppt", PASSWORD),
("xor_password_123456789012345.xls", "xor_password_123456789012345_plain.xls", "123456789012345"),
)
#: output dir:
OUTPUT_DIR = "outputs"
@unittest.skipIf(
cryptography is None, "Cryptography module not installed for python{}.{}".format(sys.version_info.major, sys.version_info.minor)
)
class KnownOutputCompare(unittest.TestCase):
"""See module doc."""
def test_known_output(self):
"""See module doc."""
for in_name, out_name, password in EXAMPLE_FILES:
input_path = pjoin(TEST_BASE_DIR, INPUT_DIR, in_name)
expect_path = pjoin(TEST_BASE_DIR, OUTPUT_DIR, out_name)
# now run the relevant parts of __main__.main:
with open(input_path, "rb") as input_handle:
file = msoffcrypto.OfficeFile(input_handle)
if file.format == "ooxml" and file.type in ["standard", "agile"]:
file.load_key(password=password, verify_password=True)
else:
file.load_key(password=password)
out_desc = None
out_path = None
output = []
try:
# create temp file for output of decryption function
out_desc, out_path = mkstemp(prefix="msoffcrypto-test-", suffix=".txt", text=True)
with os.fdopen(out_desc, "wb") as out_handle:
out_desc = None # out_handle now owns this
# run decryption, capture output
print("decrypting {}".format(in_name))
if file.format == "ooxml" and file.type in ["agile"]:
file.decrypt(out_handle, verify_integrity=True)
else:
file.decrypt(out_handle)
# read decrypted output (the temp file) into memory
with open(out_path, "rb") as reader:
output = reader.read()
finally:
# ensure we do not leak temp files. Always close & remove
if out_desc:
os.close(out_desc)
if out_path and isfile(out_path):
os.unlink(out_path)
# read output file into memory
with open(expect_path, "rb") as reader:
expect = reader.read()
# compare:
print("comparing output to {}".format(out_name))
similarity = SequenceMatcher(None, expect, output).ratio()
self.assertGreater(similarity, 0.99)
if __name__ == "__main__":
unittest.main()
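
The comparison step in test_known_output relies on a difflib similarity ratio rather than strict equality. The following is a minimal standalone sketch of that idea, not part of the repository: the helper names `byte_similarity` and `first_mismatch` are hypothetical and introduced here only for illustration. It shows how the ratio behaves on byte strings and how a first differing offset could be reported when debugging a failed comparison.

```python
from difflib import SequenceMatcher


def byte_similarity(expect: bytes, output: bytes) -> float:
    """Return the difflib similarity ratio (0.0 to 1.0) of two byte strings."""
    # Same call shape as in the test: no junk function, bytes as sequences.
    return SequenceMatcher(None, expect, output).ratio()


def first_mismatch(expect: bytes, output: bytes):
    """Return the index of the first differing byte, or None if both are equal."""
    for index, (a, b) in enumerate(zip(expect, output)):
        if a != b:
            return index
    # One input is a prefix of the other: the mismatch is at the shorter length.
    if len(expect) != len(output):
        return min(len(expect), len(output))
    return None


if __name__ == "__main__":
    print(byte_similarity(b"abcdefghij", b"abcdefghiX"))
    print(first_mismatch(b"abcdefghij", b"abcdefghiX"))
```

A ratio threshold tolerates small differences between the decrypted file and the stored reference, whereas a check such as `first_mismatch(...) is None` would demand byte-for-byte identity. Which one is appropriate depends on whether the reference outputs are expected to match exactly.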
================================================
FILE: tests/test_file_handle.py
================================================
"""Check that given file handles are not closed."""
import unittest
from os.path import dirname, join
from msoffcrypto import OfficeFile
#: directory with input
DATA_DIR = join(dirname(__file__), "inputs")
class FileHandleTest(unittest.TestCase):
"""See module doc."""
def test_file_handle_open(self):
"""Check that file handles are open after is_encrypted()."""
for suffix in "doc", "ppt", "xls":
path = join(DATA_DIR, "plain." + suffix)
with open(path, "rb") as file_handle:
ofile = OfficeFile(file_handle)
# do something with ofile
self.assertEqual(ofile.is_encrypted(), False)
# check that file handle is still open
self.assertFalse(file_handle.closed)
# destroy OfficeFile, calls destructor
del ofile
# check that file handle is still open
self.assertFalse(file_handle.closed)
# just for completeness:
# check that file handle is now closed
self.assertTrue(file_handle.closed)
# if someone calls this as script, run unittests
if __name__ == "__main__":
unittest.main()
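
The property checked by test_file_handle_open is that an object built around a caller-supplied handle must not close that handle, even when the object is destroyed. The sketch below illustrates that ownership pattern in isolation. `BorrowingReader` is a hypothetical class written for this example and is not how msoffcrypto implements its file classes; an in-memory `io.BytesIO` stands in for a real file.

```python
import io


class BorrowingReader:
    """Hypothetical reader that uses a caller-supplied handle without owning it."""

    def __init__(self, handle):
        self._handle = handle

    def header(self, size=8):
        """Read the first `size` bytes, then restore the caller's position."""
        position = self._handle.tell()
        self._handle.seek(0)
        data = self._handle.read(size)
        self._handle.seek(position)
        return data

    def __del__(self):
        # Deliberately do not close the borrowed handle; only drop the reference.
        self._handle = None


if __name__ == "__main__":
    with io.BytesIO(b"0123456789") as handle:
        reader = BorrowingReader(handle)
        print(reader.header(4))
        del reader
        print(handle.closed)  # the caller still owns an open handle here
```

Leaving the close to the caller's `with` block keeps a single owner for the handle, which is the behaviour the assertions in the test above expect.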
SYMBOL INDEX (192 symbols across 19 files)
FILE: msoffcrypto/__init__.py
function OfficeFile (line 8) | def OfficeFile(file):
FILE: msoffcrypto/__main__.py
function _get_version (line 15) | def _get_version():
function ifWIN32SetBinary (line 26) | def ifWIN32SetBinary(io):
function is_encrypted (line 34) | def is_encrypted(file):
function main (line 61) | def main():
FILE: msoffcrypto/exceptions/__init__.py
class FileFormatError (line 1) | class FileFormatError(Exception):
class ParseError (line 7) | class ParseError(Exception):
class DecryptionError (line 13) | class DecryptionError(Exception):
class EncryptionError (line 19) | class EncryptionError(Exception):
class InvalidKeyError (line 25) | class InvalidKeyError(DecryptionError):
FILE: msoffcrypto/format/base.py
class BaseOfficeFile (line 8) | class BaseOfficeFile(ABC):
method __init__ (line 9) | def __init__(self):
method load_key (line 13) | def load_key(self):
method decrypt (line 17) | def decrypt(self, outfile):
method is_encrypted (line 21) | def is_encrypted(self) -> bool:
FILE: msoffcrypto/format/common.py
function _parse_encryptionheader (line 10) | def _parse_encryptionheader(blob):
function _parse_encryptionverifier (line 36) | def _parse_encryptionverifier(blob, algorithm: str):
function _parse_header_RC4CryptoAPI (line 62) | def _parse_header_RC4CryptoAPI(encryptionHeader):
FILE: msoffcrypto/format/doc97.py
function _parseFibBase (line 57) | def _parseFibBase(blob):
function _packFibBase (line 170) | def _packFibBase(fibbase):
function _parseFib (line 244) | def _parseFib(blob):
function _parse_header_RC4 (line 250) | def _parse_header_RC4(encryptionHeader):
class Doc97File (line 263) | class Doc97File(base.BaseOfficeFile):
method __init__ (line 279) | def __init__(self, file):
method load_key (line 301) | def load_key(self, password=None):
method decrypt (line 366) | def decrypt(self, outfile):
method is_encrypted (line 488) | def is_encrypted(self):
FILE: msoffcrypto/format/ooxml.py
function _is_ooxml (line 20) | def _is_ooxml(file):
function _parseinfo_standard (line 40) | def _parseinfo_standard(ole):
function _parseinfo_agile (line 63) | def _parseinfo_agile(ole):
function _parseinfo (line 114) | def _parseinfo(ole):
class OOXMLFile (line 129) | class OOXMLFile(base.BaseOfficeFile):
method __init__ (line 145) | def __init__(self, file):
method load_key (line 181) | def load_key(
method decrypt (line 247) | def decrypt(self, outfile, verify_integrity=False):
method encrypt (line 297) | def encrypt(self, password, outfile):
method is_encrypted (line 317) | def is_encrypted(self):
FILE: msoffcrypto/format/ppt97.py
function _parseRecordHeader (line 30) | def _parseRecordHeader(blob):
function _packRecordHeader (line 53) | def _packRecordHeader(rh):
function _parseCurrentUserAtom (line 96) | def _parseCurrentUserAtom(blob):
function _packCurrentUserAtom (line 150) | def _packCurrentUserAtom(currentuseratom):
function _parseCurrentUser (line 184) | def _parseCurrentUser(blob):
function _packCurrentUser (line 190) | def _packCurrentUser(currentuser):
function _parseUserEditAtom (line 220) | def _parseUserEditAtom(blob):
function _packUserEditAtom (line 277) | def _packUserEditAtom(usereditatom):
function _parsePersistDirectoryEntry (line 320) | def _parsePersistDirectoryEntry(blob):
function _packPersistDirectoryEntry (line 346) | def _packPersistDirectoryEntry(directoryentry):
function _parsePersistDirectoryAtom (line 377) | def _parsePersistDirectoryAtom(blob):
function _packPersistDirectoryAtom (line 407) | def _packPersistDirectoryAtom(directoryatom):
function _parseCryptSession10Container (line 422) | def _parseCryptSession10Container(blob):
function construct_persistobjectdirectory (line 452) | def construct_persistobjectdirectory(data):
class Ppt97File (line 511) | class Ppt97File(base.BaseOfficeFile):
method __init__ (line 527) | def __init__(self, file):
method __del__ (line 546) | def __del__(self):
method load_key (line 554) | def load_key(self, password=None):
method decrypt (line 607) | def decrypt(self, outfile):
method is_encrypted (line 812) | def is_encrypted(self):
FILE: msoffcrypto/format/xls97.py
function _parse_header_RC4 (line 379) | def _parse_header_RC4(encryptionInfo):
class _BIFFStream (line 392) | class _BIFFStream:
method __init__ (line 393) | def __init__(self, data):
method has_record (line 396) | def has_record(self, target):
method skip_to (line 410) | def skip_to(self, target):
method iter_record (line 421) | def iter_record(self):
class Xls97File (line 431) | class Xls97File(base.BaseOfficeFile):
method __init__ (line 451) | def __init__(self, file):
method __del__ (line 467) | def __del__(self):
method load_key (line 472) | def load_key(self, password=None):
method decrypt (line 552) | def decrypt(self, outfile):
method is_encrypted (line 649) | def is_encrypted(self):
FILE: msoffcrypto/method/container/ecma376_encrypted.py
function datetime2filetime (line 28) | def datetime2filetime(dt):
class RedBlack (line 41) | class RedBlack:
class DirectoryEntryType (line 46) | class DirectoryEntryType:
class SectorTypes (line 55) | class SectorTypes:
class DSPos (line 64) | class DSPos:
class DefaultContent (line 81) | class DefaultContent:
class Header (line 91) | class Header:
method __init__ (line 95) | def __init__(self):
method write_to (line 110) | def write_to(self, obuf):
class DirectoryEntry (line 151) | class DirectoryEntry:
method __init__ (line 152) | def __init__(
method write_header_to (line 180) | def write_header_to(self, obuf):
method write_filetime (line 220) | def write_filetime(self, obuf, ft):
method Name (line 225) | def Name(self):
method Name (line 229) | def Name(self, n):
method CLSID (line 239) | def CLSID(self):
method CLSID (line 243) | def CLSID(self, c):
method LeftSiblingId (line 250) | def LeftSiblingId(self):
method LeftSiblingId (line 254) | def LeftSiblingId(self, id):
method RightSiblingId (line 259) | def RightSiblingId(self):
method RightSiblingId (line 263) | def RightSiblingId(self, id):
method ChildId (line 268) | def ChildId(self):
method ChildId (line 272) | def ChildId(self, id):
method _valid_id (line 276) | def _valid_id(self, id):
class ECMA376EncryptedLayout (line 281) | class ECMA376EncryptedLayout:
method __init__ (line 282) | def __init__(self, sectorSize):
method fatPos (line 295) | def fatPos(self):
method miniFatPos (line 299) | def miniFatPos(self):
method directoryEntryPos (line 303) | def directoryEntryPos(self):
method miniFatDataPos (line 307) | def miniFatDataPos(self):
method contentSectorNum (line 311) | def contentSectorNum(self):
method encryptionPackagePos (line 320) | def encryptionPackagePos(self):
method totalSectors (line 324) | def totalSectors(self):
method totalSize (line 328) | def totalSize(self):
method offsetDirectoryEntries (line 332) | def offsetDirectoryEntries(self):
method offsetMiniFatData (line 336) | def offsetMiniFatData(self):
method offsetFat (line 340) | def offsetFat(self):
method offsetMiniFat (line 344) | def offsetMiniFat(self):
method offsetDifat (line 347) | def offsetDifat(self, n):
method offsetData (line 350) | def offsetData(self, startingSectorLocation):
method offsetMiniData (line 353) | def offsetMiniData(self, startingSectorLocation):
class ECMA376Encrypted (line 357) | class ECMA376Encrypted:
method __init__ (line 358) | def __init__(self, encryptedPackage=b"", encryptionInfo=b""):
method write_to (line 364) | def write_to(self, obuf):
method set_payload (line 378) | def set_payload(self, encryptedPackage, encryptionInfo):
method _get_default_header (line 382) | def _get_default_header(self):
method _get_directory_entries (line 385) | def _get_directory_entries(self):
method _write_to (line 484) | def _write_to(self, obuf):
method _write_directory_entries (line 523) | def _write_directory_entries(self, obuf, layout: ECMA376EncryptedLayout):
method _write_Content (line 535) | def _write_Content(self, obuf, layout: ECMA376EncryptedLayout):
method _write_FAT_start (line 547) | def _write_FAT_start(self, obuf, layout: ECMA376EncryptedLayout):
method _write_MiniFAT (line 561) | def _write_MiniFAT(self, obuf, layout: ECMA376EncryptedLayout):
method _write_FAT (line 567) | def _write_FAT(self, obuf, entries, blockSize):
method _write_DIFAT (line 602) | def _write_DIFAT(self, obuf, layout: ECMA376EncryptedLayout):
method _detect_sector_num (line 627) | def _detect_sector_num(self, layout: ECMA376EncryptedLayout):
method _set_sector_locations_of_streams (line 656) | def _set_sector_locations_of_streams(self, layout: ECMA376EncryptedLay...
method _get_MiniFAT_sector_number (line 696) | def _get_MiniFAT_sector_number(self, size):
method _get_block_num (line 699) | def _get_block_num(self, x, block):
FILE: msoffcrypto/method/ecma376_agile.py
function _random_buffer (line 39) | def _random_buffer(sz):
function _get_num_blocks (line 43) | def _get_num_blocks(sz, block):
function _round_up (line 47) | def _round_up(sz, block):
function _resize_buffer (line 51) | def _resize_buffer(buf, n, c=b"\0"):
function _normalize_key (line 58) | def _normalize_key(key, n):
function _get_hash_func (line 62) | def _get_hash_func(algorithm):
function _decrypt_aes_cbc (line 66) | def _decrypt_aes_cbc(data, key, iv):
function _encrypt_aes_cbc (line 73) | def _encrypt_aes_cbc(data, key, iv):
function _encrypt_aes_cbc_padded (line 82) | def _encrypt_aes_cbc_padded(data, key, iv, blockSize):
function _get_salt (line 91) | def _get_salt(salt_value=None, salt_size=16):
class ECMA376AgileCipherParams (line 104) | class ECMA376AgileCipherParams:
method __init__ (line 105) | def __init__(self):
function _enc64 (line 115) | def _enc64(b):
class ECMA376AgileEncryptionInfo (line 119) | class ECMA376AgileEncryptionInfo:
method __init__ (line 120) | def __init__(self):
method getEncryptionDescriptorHeader (line 131) | def getEncryptionDescriptorHeader(self):
method toEncryptionDescriptor (line 135) | def toEncryptionDescriptor(self):
function _generate_iv (line 156) | def _generate_iv(params: ECMA376AgileCipherParams, blkKey, salt_value):
class ECMA376Agile (line 165) | class ECMA376Agile:
method __init__ (line 166) | def __init__(self):
method _derive_iterated_hash_from_password (line 170) | def _derive_iterated_hash_from_password(
method _derive_encryption_key (line 193) | def _derive_encryption_key(h, blockKey, hashAlgorithm, keyBits):
method decrypt (line 206) | def decrypt(key, keyDataSalt, hashAlgorithm, ibuf):
method encrypt (line 246) | def encrypt(key, ibuf, salt_value=None, spin_count=100000):
method get_encryption_information (line 271) | def get_encryption_information(
method generate_encryption_parameters (line 292) | def generate_encryption_parameters(key, salt_value=None, spin_count=10...
method encrypt_payload (line 359) | def encrypt_payload(ibuf, params: ECMA376AgileCipherParams, secret_key...
method generate_integrity_parameter (line 402) | def generate_integrity_parameter(
method verify_password (line 423) | def verify_password(
method verify_integrity (line 469) | def verify_integrity(
method makekey_from_privkey (line 500) | def makekey_from_privkey(privkey, encryptedKeyValue):
method makekey_from_password (line 508) | def makekey_from_password(
FILE: msoffcrypto/method/ecma376_extensible.py
class ECMA376Extensible (line 1) | class ECMA376Extensible:
method __init__ (line 2) | def __init__(self):
FILE: msoffcrypto/method/ecma376_standard.py
class ECMA376Standard (line 13) | class ECMA376Standard:
method __init__ (line 14) | def __init__(self):
method decrypt (line 18) | def decrypt(key, ibuf):
method verifykey (line 35) | def verifykey(key, encryptedVerifier, encryptedVerifierHash):
method makekey_from_password (line 57) | def makekey_from_password(
FILE: msoffcrypto/method/rc4.py
function _makekey (line 21) | def _makekey(password, salt, block):
class DocumentRC4 (line 45) | class DocumentRC4:
method __init__ (line 46) | def __init__(self):
method verifypw (line 50) | def verifypw(password, salt, encryptedVerifier, encryptedVerifierHash):
method decrypt (line 73) | def decrypt(password, salt, ibuf, blocksize=0x200):
FILE: msoffcrypto/method/rc4_cryptoapi.py
function _makekey (line 22) | def _makekey(password, salt, keyLength, block, algIdHash=0x00008004):
class DocumentRC4CryptoAPI (line 38) | class DocumentRC4CryptoAPI:
method __init__ (line 39) | def __init__(self):
method verifypw (line 43) | def verifypw(
method decrypt (line 67) | def decrypt(password, salt, keySize, ibuf, blocksize=0x200, block=0):
FILE: msoffcrypto/method/xor_obfuscation.py
function _makekey (line 10) | def _makekey(password, salt, block):
class DocumentXOR (line 34) | class DocumentXOR:
method __init__ (line 35) | def __init__(self):
method verifypw (line 182) | def verifypw(password, verificationBytes):
method xor_ror (line 216) | def xor_ror(byte1, byte2):
method create_xor_key_method1 (line 220) | def create_xor_key_method1(password):
method create_xor_array_method1 (line 237) | def create_xor_array_method1(password):
method ror (line 306) | def ror(n, rotations, width):
method rol (line 310) | def rol(n, rotations, width):
method decrypt (line 314) | def decrypt(password, ibuf, plaintext, records, base):
FILE: tests/test_cli.py
class CLITest (line 5) | class CLITest(unittest.TestCase):
method test_cli (line 6) | def test_cli(self):
FILE: tests/test_compare_known_output.py
class KnownOutputCompare (line 49) | class KnownOutputCompare(unittest.TestCase):
method test_known_output (line 52) | def test_known_output(self):
FILE: tests/test_file_handle.py
class FileHandleTest (line 13) | class FileHandleTest(unittest.TestCase):
method test_file_handle_open (line 16) | def test_file_handle_open(self):