Repository: mebeim/systrack
Branch: master
Commit: 0eed1f95234d
Files: 46
Total size: 523.5 KB
Directory structure:
gitextract_4nemvxd8/
├── .editorconfig
├── .gitattributes
├── .github/
│   └── workflows/
│       ├── publish.yml
│       └── test.yml
├── .gitignore
├── CHANGELOG.md
├── LICENSE
├── README.md
├── assets/
│   ├── github-social-card.xcf
│   └── logo.xcf
├── pyproject.toml
├── src/
│   └── systrack/
│       ├── __init__.py
│       ├── __main__.py
│       ├── arch/
│       │   ├── __init__.py
│       │   ├── arch_base.py
│       │   ├── arm.py
│       │   ├── arm64.py
│       │   ├── mips.py
│       │   ├── powerpc.py
│       │   ├── riscv.py
│       │   ├── s390.py
│       │   └── x86.py
│       ├── elf.py
│       ├── kconfig.py
│       ├── kconfig_options.py
│       ├── kernel.py
│       ├── location.py
│       ├── log.py
│       ├── output.py
│       ├── signature.py
│       ├── syscall.py
│       ├── templates/
│       │   ├── syscall_table.css
│       │   ├── syscall_table.html
│       │   └── syscall_table.js
│       ├── type_hints.py
│       ├── utils.py
│       └── version.py
└── tests/
    ├── __init__.py
    ├── data/
    │   ├── .gitignore
    │   ├── Makefile
    │   └── x86_no_table_syscall_handlers.s
    ├── test_arch_sanity.py
    ├── test_mips.py
    ├── test_powerpc.py
    ├── test_x86.py
    └── utils.py
================================================
FILE CONTENTS
================================================
================================================
FILE: .editorconfig
================================================
root = true
[*]
charset = utf-8
indent_style = tab
indent_size = 4
end_of_line = lf
insert_final_newline = true
trim_trailing_whitespace = true
[*.md]
indent_style = unset
[*.yml]
indent_style = space
indent_size = 2
================================================
FILE: .gitattributes
================================================
# Exclude assembly from linguist code stats (prevents GitHub from marking the
# repository as >50% assembly).
*.s linguist-vendored
================================================
FILE: .github/workflows/publish.yml
================================================
name: Publish to PyPI
on:
  release:
    types:
      - published
# Allow only one concurrent job
concurrency:
  group: publish
  cancel-in-progress: false
jobs:
  test-before-publish:
    uses: ./.github/workflows/test.yml
  publish:
    needs: [test-before-publish]
    runs-on: ubuntu-latest
    environment:
      name: hatch
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Ensure matching version and release tag
        run: test v"$(python3 src/systrack/version.py)" = "${{github.ref_name}}"
      - name: Install build dependencies
        run: python3 -m pip install --upgrade build hatch
      - name: Build wheel and sdist
        run: hatch build
      - name: Publish to PyPI
        run: hatch publish --no-prompt
        env:
          HATCH_INDEX_USER: __token__
          HATCH_INDEX_AUTH: ${{secrets.HATCH_INDEX_AUTH}}
================================================
FILE: .github/workflows/test.yml
================================================
name: Test
on:
  push:
    branches:
      - main
      - dev
  workflow_call:
jobs:
  test:
    runs-on: ubuntu-22.04
    strategy:
      matrix:
        python-version: ["3.8", "3.9", "3.10", "3.11", "3.12"]
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install test dependencies
        run: python3 -m pip install --upgrade build hatch pytest
      - name: Run tests
        run: hatch test
================================================
FILE: .gitignore
================================================
dist
systrack.egg-info
__pycache__
.pytest_cache
================================================
FILE: CHANGELOG.md
================================================
Systrack changelog
==================
v0.8
----
New arch support: IBM Z-Architecture S390 64-bit and compat 32-bit, tested on
v4.0+ kernels. Thanks to Ilya Leoshkevich ([@iii-i](https://github.com/iii-i))
for the initial implementation (#3).
**Improvements**:
- Produce lighter builds (hopefully) by stripping apparmor and USB support, as
they do not affect syscalls.
- Reduce the possibility of build errors by disabling `-Werror` where possible.
- Detect and deprioritize symbols coming from interprocedural optimization
(`xxx.localalias`) implemented in recent compiler versions for more precise
syscall symbol and name detection.
- Improve Kconfig parsing, sanity checks and warnings about Kconfig options.
- arm64: new arch-specific dummy syscall implementation detection helper.
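The symbol deprioritization described in the list above can be pictured with a small ranking function. This is only an illustrative sketch, not Systrack's actual code: the `preferred_symbol` helper and its ranking rule (compiler-generated `.localalias` names last, then shorter names first) are assumptions made for the example.

```python
from typing import Iterable, Tuple


def preferred_symbol(aliases: Iterable[str]) -> str:
    """Pick one name among ELF symbols that share the same address.

    Hypothetical helper: the ranking below is an assumption for
    illustration, not the rule Systrack implements.
    """
    def rank(name: str) -> Tuple[bool, int, str]:
        # Names containing ".localalias" sort last; among the remaining
        # names the shortest wins, with ties broken alphabetically.
        return ('.localalias' in name, len(name), name)

    return min(aliases, key=rank)


print(preferred_symbol(['__se_sys_read', 'sys_read.localalias', 'sys_read']))
# prints "sys_read"
```

A tuple key keeps the rule declarative: each additional preference becomes one more tuple element rather than another branch.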
**Bug fixes**:
- Fix internal `Versioned{Dict,List}` caching implementation, used for Kconfig
options mostly.
- Fix command formatting in debug logs, which should now be correctly
copy-pasteable into a shell as is.
- arm64: fix broken pkey syscalls detection. Implemented in v6.12 under
`ARM64_POE` config, but was wrongly detected as present on earlier kernels.
- powerpc, riscv: fix some imprecise/incorrect Kconfig option versioning and
dependencies.
**Internal changes**:
- Move kconfig parsing logic into own `Kconfig` class.
- Improve `Kernel` exception semantics: throw exceptions at analysis time
instead of causing program exit.
- Improve `Arch` subclass method overrides and implement unit test to perform
sanity checks around abstract methods.
v0.7
----
New arch support: RISC-V 32-bit and 64-bit, tested on v4.15+ kernels (i.e.,
since the first Linux version supporting RISC-V).
**Improvements**:
- Improve dummy syscall implementation detection: try to first match known
"ni_syscall" code.
- Improve error messages and debug/info logs, pretty printing command-line
arguments and executed commands instead of dumping their tuple/list
representation.
- mips: implement simple arch-specific dummy syscall detection.
- arm64: remove "arm64_" arch-specific prefix from syscall names.
**Bug fixes**:
- mips: new dummy syscall detection now correctly identifies some dummy syscalls
that were previously missed (notably `cachestat`).
**Internal changes**:
- Archs can now specify multiple kernel Makefile config targets to run one after
the other as a "base" config.
v0.6
----
**Improvements**:
- More robust and comprehensive syscall definition location search.
**Bug fixes**:
- Fix broken syscall definition location search and subsequent signature
extraction. Some syscalls were incorrectly reported as defined in place of
others, also causing the wrong signature to be extracted. Do not fully trust
the output of `addr2line` and perform full syscall name matching to fix this.
PowerPC was notably affected the most by this issue.
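As a rough illustration of what "full syscall name matching" means here, the sketch below accepts a source line only if it defines exactly the requested syscall, so a prefix such as `read` does not match `readv`. The `defines_syscall` helper and its regular expression are hypothetical and much simpler than a real definition search, which also has to deal with arch-specific macros and multi-line definitions.

```python
import re


def defines_syscall(source_line: str, name: str) -> bool:
    """Return True if source_line defines the syscall called exactly `name`.

    Hypothetical helper for illustration; it only understands the common
    SYSCALL_DEFINEn / COMPAT_SYSCALL_DEFINEn macro forms on a single line.
    """
    pattern = (
        r'\b(?:COMPAT_)?SYSCALL_DEFINE[0-6]\(\s*'
        + re.escape(name)
        + r'\s*[,)]'  # the name must end here: no longer identifier allowed
    )
    return re.search(pattern, source_line) is not None


print(defines_syscall('SYSCALL_DEFINE3(read, unsigned int, fd,', 'read'))   # True
print(defines_syscall('SYSCALL_DEFINE3(readv, unsigned long, fd,', 'read'))  # False
```

Requiring a `,` or `)` right after the name is what rules out prefix matches.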
v0.5.1
------
**Improvements**:
- x86: improve x86 syscall extraction code fixing undetected CALL targets.
**Internal changes**:
- x86: add some tests for syscall extraction based on v6.11 kernel build.
v0.5
----
We tried so hard, and got so far, but in the end, we need a disassembler! x86
mitigations have defeated us, we no longer have syscall tables to rely on.
Kernel developers were kind enough to write very simple ABI-specific
switch-based handlers to dispatch syscalls, so analysis is still possible... just
significantly more complicated.
**Breaking changes**:
- Drop support for Python 3.6 and 3.7. Systrack now requires Python 3.8+. This
is because of the new dependency on
[`iced-x86`](https://pypi.org/project/iced-x86/).
**Improvements**:
- x86: support new kernels (6.9+) with no syscall tables.
- Remove unnecessary spaces between asterisks for double pointers in function
signatures.
- Avoid KCFI `__{cfi,pfx}_` symbols when looking for `ni_syscall` symbols.
**Internal changes**:
- Depend on [`iced-x86`](https://pypi.org/project/iced-x86/) for disassembling
x86 instructions and on [`jinja2`](https://pypi.org/project/jinja2/) for HTML
output directly. Remove optional dependencies and only build one package.
- Rename `test` folder to `tests` to use `hatch test` as the test command.
- Improve logging reproducibility by sorting more debugging log output.
- Improve broken Python package metadata (Python packaging moment).
v0.4
----
New arch support: PowerPC 32-bit, tested on v5.0+ kernels.
**Improvements**:
- Improve kconfig dependency checking logic for better warning/error messages.
- PowerPC PPC64: improve esoteric fast switch_endian syscall detection.
- Better (narrower) emoji spacing in HTML output.
**Bug fixes**:
- Correctly report `delete_module` depending on `CONFIG_MODULE_UNLOAD=y`.
- Fix incorrectly handled shared syscall table in x86-64 x32 ABI resulting in
duplicated and unwanted entries in the output for kernels older than v5.4.
- Fix chance of building kernels without `memfd_create`, `memfd_secret`,
`delete_module` (and possibly others) by always enabling `MEMFD_CREATE`,
`MODULE_UNLOAD`, `NET` and `SECRETMEM` when available.
- Fix wrong handling of relative `--kdir` path (e.g., `.`) in some cases.
- Fix missed detection of non-implemented syscalls pointing to `kernel/sys_ni.c`
when DWARF debug info contains relative paths.
- x86 x32: fix some x64 syscalls reported twice because both the x64 number and
the historically misnumbered x32 numbers (512-547) were being considered
valid.
**Internal changes**:
- Ignore `sound/` and `user/` dirs to speed up grepping syscall definitions.
- Implement some basic unit tests for powerpc dummy/esoteric syscall detection.
v0.3.3
------
**Improvements**:
- Correctly report `lsm_{list_modules,get_self_attr,set_self_attr}` depending on
`CONFIG_SECURITY=y`.
v0.3.2
------
**Improvements**:
- Correctly report `futex_{wait,wake,requeue}` depending on `CONFIG_FUTEX=y`.
- Use unicorn emoji (cuter) instead of test tube for esoteric syscalls in HTML
output.
v0.3.1
------
**Improvements**:
- x86: Add build support for `map_shadow_stack`.
- Prefer `compat_sys_` over `__se_compat_sys_` and other longer symbol synonyms;
same for `.compat_sys_` on PowerPC.
**Bug fixes**:
- Fix broken naive grepping of syscall definitions when no ripgrep is available.
- Correctly report `cachestat` depending on `CACHESTAT_SYSCALL=y`.
**Internal changes**:
- Sort stderr logs for reproducible output and easier diffing.
- Skip `lib/` directory in kernel sources to improve grepping performance.
v0.3
----
New arch support: PowerPC 64-bit, all ABIs, tested on v5.0+ kernels.
**Improvements**:
- Add ABI `bits` (integer) and `compat` (boolean) fields to JSON output.
- Support ELF symbols with weird names (special chars in the name).
- Support function descriptors for syscall table entries (useful for PowerPC64
and Itanium 64).
- Support weird arch-specific `SYSCALL_DEFINEn` macros.
- Building kernels now generates relative paths in DWARF debug symbols through
`-fdebug-prefix-map`.
- Improve stdout output and add a table header.
- Use `null` instead of `??`/`?` for unknown file/line info in JSON output.
- x86: improve dummy syscall implementation detection (handling endbr64/32
instructions).
- ARM OABI: output syscall number location for the calling convention
(`swi <NR>`).
**Bug fixes**:
- Correctly report `socketcall` depending on `CONFIG_NET=y`.
- Correctly strip more syscall symbol prefixes for more accurate syscall names.
- Fix bad symbol prefix detection in some weird edge cases, leading to wrong
syscall names.
- x86: fix wrong register names for x86-64 compat 32-bit ABI (IA-32).
**Internal changes**:
- Reorganize arch-specific code.
- Handle SIGINT for more graceful termination.
- Auto-remap definition locations relative to KDIR for ease of use.
v0.2.1
------
**Improvements**:
- Make syscall symbol preference more consistent (in particular, stop mixing
`__se_sys_xxx` and `sys_xxx` when possible).
- Achieve W3C compliance for HTML output format.
**Bug fixes**:
- x86: correct wrong syscall numbers for x32 ABI, they should all be ORed with
`0x40000000` (`__X32_SYSCALL_BIT`).
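To make the x32 fix above concrete, here is a minimal sketch of the arithmetic involved. The `x32_syscall_number` helper is hypothetical and not taken from Systrack; only the `0x40000000` constant (`__X32_SYSCALL_BIT`) comes from the entry above.

```python
# Value of __X32_SYSCALL_BIT, as stated in the changelog entry above.
X32_SYSCALL_BIT = 0x40000000


def x32_syscall_number(nr: int) -> int:
    """Return the number userspace passes for x32 syscall `nr`.

    Hypothetical helper for illustration only.
    """
    return nr | X32_SYSCALL_BIT


print(hex(x32_syscall_number(1)))    # 0x40000001
print(hex(x32_syscall_number(512)))  # 0x40000200
```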
v0.2
----
**Improvements**:
- Improve existing MIPS build and analysis support: use `ip27_defconfig` for
64-bit for NUMA support and strip more symbol prefixes.
- Improve dummy syscall implementation detection (x86-64, ARM).
**Bug fixes**:
- Fix help text for `--arch`: building with `--arch arm` creates an
EABI-only kernel.
- Fix a logging bug that caused syscalls' `.origname` not to be logged for
not-found locations after grepping.
- x86: use the right Kconfig option for vm86 and vm86old.
v0.1
----
First release.
================================================
FILE: LICENSE
================================================
GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007
Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The GNU General Public License is a free, copyleft license for
software and other kinds of works.
The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
the GNU General Public License is intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users. We, the Free Software Foundation, use the
GNU General Public License for most of our software; it applies also to
any other work released this way by its authors. You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.
To protect your rights, we need to prevent others from denying you
these rights or asking you to surrender the rights. Therefore, you have
certain responsibilities if you distribute copies of the software, or if
you modify it: responsibilities to respect the freedom of others.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must pass on to the recipients the same
freedoms that you received. You must make sure that they, too, receive
or can get the source code. And you must show them these terms so they
know their rights.
Developers that use the GNU GPL protect your rights with two steps:
(1) assert copyright on the software, and (2) offer you this License
giving you legal permission to copy, distribute and/or modify it.
For the developers' and authors' protection, the GPL clearly explains
that there is no warranty for this free software. For both users' and
authors' sake, the GPL requires that modified versions be marked as
changed, so that their problems will not be attributed erroneously to
authors of previous versions.
Some devices are designed to deny users access to install or run
modified versions of the software inside them, although the manufacturer
can do so. This is fundamentally incompatible with the aim of
protecting users' freedom to change the software. The systematic
pattern of such abuse occurs in the area of products for individuals to
use, which is precisely where it is most unacceptable. Therefore, we
have designed this version of the GPL to prohibit the practice for those
products. If such problems arise substantially in other domains, we
stand ready to extend this provision to those domains in future versions
of the GPL, as needed to protect the freedom of users.
Finally, every program is threatened constantly by software patents.
States should not allow patents to restrict development and use of
software on general-purpose computers, but in those that do, we wish to
avoid the special danger that patents applied to a free program could
make it effectively proprietary. To prevent this, the GPL assures that
patents cannot be used to render the program non-free.
The precise terms and conditions for copying, distribution and
modification follow.
TERMS AND CONDITIONS
0. Definitions.
"This License" refers to version 3 of the GNU General Public License.
"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.
"The Program" refers to any copyrightable work licensed under this
License. Each licensee is addressed as "you". "Licensees" and
"recipients" may be individuals or organizations.
To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy. The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.
A "covered work" means either the unmodified Program or a work based
on the Program.
To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy. Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.
To "convey" a work means any kind of propagation that enables other
parties to make or receive copies. Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.
An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License. If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.
1. Source Code.
The "source code" for a work means the preferred form of the work
for making modifications to it. "Object code" means any non-source
form of a work.
A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.
The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form. A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.
The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities. However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work. For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.
The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.
The Corresponding Source for a work in source code form is that
same work.
2. Basic Permissions.
All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met. This License explicitly affirms your unlimited
permission to run the unmodified Program. The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work. This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.
You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force. You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright. Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.
Conveying under any other circumstances is permitted solely under
the conditions stated below. Sublicensing is not allowed; section 10
makes it unnecessary.
3. Protecting Users' Legal Rights From Anti-Circumvention Law.
No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.
When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.
4. Conveying Verbatim Copies.
You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.
You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.
5. Conveying Modified Source Versions.
You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:
a) The work must carry prominent notices stating that you modified
it, and giving a relevant date.
b) The work must carry prominent notices stating that it is
released under this License and any conditions added under section
7. This requirement modifies the requirement in section 4 to
"keep intact all notices".
c) You must license the entire work, as a whole, under this
License to anyone who comes into possession of a copy. This
License will therefore apply, along with any applicable section 7
additional terms, to the whole of the work, and all its parts,
regardless of how they are packaged. This License gives no
permission to license the work in any other way, but it does not
invalidate such permission if you have separately received it.
d) If the work has interactive user interfaces, each must display
Appropriate Legal Notices; however, if the Program has interactive
interfaces that do not display Appropriate Legal Notices, your
work need not make them do so.
A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit. Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.
6. Conveying Non-Source Forms.
You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:
a) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by the
Corresponding Source fixed on a durable physical medium
customarily used for software interchange.
b) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by a
written offer, valid for at least three years and valid for as
long as you offer spare parts or customer support for that product
model, to give anyone who possesses the object code either (1) a
copy of the Corresponding Source for all the software in the
product that is covered by this License, on a durable physical
medium customarily used for software interchange, for a price no
more than your reasonable cost of physically performing this
conveying of source, or (2) access to copy the
Corresponding Source from a network server at no charge.
c) Convey individual copies of the object code with a copy of the
written offer to provide the Corresponding Source. This
alternative is allowed only occasionally and noncommercially, and
only if you received the object code with such an offer, in accord
with subsection 6b.
d) Convey the object code by offering access from a designated
place (gratis or for a charge), and offer equivalent access to the
Corresponding Source in the same way through the same place at no
further charge. You need not require recipients to copy the
Corresponding Source along with the object code. If the place to
copy the object code is a network server, the Corresponding Source
may be on a different server (operated by you or a third party)
that supports equivalent copying facilities, provided you maintain
clear directions next to the object code saying where to find the
Corresponding Source. Regardless of what server hosts the
Corresponding Source, you remain obligated to ensure that it is
available for as long as needed to satisfy these requirements.
e) Convey the object code using peer-to-peer transmission, provided
you inform other peers where the object code and Corresponding
Source of the work are being offered to the general public at no
charge under subsection 6d.
A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.
A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling. In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage. For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product. A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.
"Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source. The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.
If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information. But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).
The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed. Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.
Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.
7. Additional Terms.
"Additional permissions" are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall
be treated as though they were included in this License, to the extent
that they are valid under applicable law. If additional permissions
apply only to part of the Program, that part may be used separately
under those permissions, but the entire Program remains governed by
this License without regard to the additional permissions.
When you convey a copy of a covered work, you may at your option
remove any additional permissions from that copy, or from any part of
it. (Additional permissions may be written to require their own
removal in certain cases when you modify the work.) You may place
additional permissions on material, added by you to a covered work,
for which you have or can give appropriate copyright permission.
Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:
a) Disclaiming warranty or limiting liability differently from the
terms of sections 15 and 16 of this License; or
b) Requiring preservation of specified reasonable legal notices or
author attributions in that material or in the Appropriate Legal
Notices displayed by works containing it; or
c) Prohibiting misrepresentation of the origin of that material, or
requiring that modified versions of such material be marked in
reasonable ways as different from the original version; or
d) Limiting the use for publicity purposes of names of licensors or
authors of the material; or
e) Declining to grant rights under trademark law for use of some
trade names, trademarks, or service marks; or
f) Requiring indemnification of licensors and authors of that
material by anyone who conveys the material (or modified versions of
it) with contractual assumptions of liability to the recipient, for
any liability that these contractual assumptions directly impose on
those licensors and authors.
All other non-permissive additional terms are considered "further
restrictions" within the meaning of section 10. If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further
restriction, you may remove that term. If a license document contains
a further restriction but permits relicensing or conveying under this
License, you may add to a covered work material governed by the terms
of that license document, provided that the further restriction does
not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you
must place, in the relevant source files, a statement of the
additional terms that apply to those files, or a notice indicating
where to find the applicable terms.
Additional terms, permissive or non-permissive, may be stated in the
form of a separately written license, or stated as exceptions;
the above requirements apply either way.
8. Termination.
You may not propagate or modify a covered work except as expressly
provided under this License. Any attempt otherwise to propagate or
modify it is void, and will automatically terminate your rights under
this License (including any patent licenses granted under the third
paragraph of section 11).
However, if you cease all violation of this License, then your
license from a particular copyright holder is reinstated (a)
provisionally, unless and until the copyright holder explicitly and
finally terminates your license, and (b) permanently, if the copyright
holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.
Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License. If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.
9. Acceptance Not Required for Having Copies.
You are not required to accept this License in order to receive or
run a copy of the Program. Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission
to receive a copy likewise does not require acceptance. However,
nothing other than this License grants you permission to propagate or
modify any covered work. These actions infringe copyright if you do
not accept this License. Therefore, by modifying or propagating a
covered work, you indicate your acceptance of this License to do so.
10. Automatic Licensing of Downstream Recipients.
Each time you convey a covered work, the recipient automatically
receives a license from the original licensors, to run, modify and
propagate that work, subject to this License. You are not responsible
for enforcing compliance by third parties with this License.
An "entity transaction" is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations. If propagation of a covered
work results from an entity transaction, each party to that
transaction who receives a copy of the work also receives whatever
licenses to the work the party's predecessor in interest had or could
give under the previous paragraph, plus a right to possession of the
Corresponding Source of the work from the predecessor in interest, if
the predecessor has it or can get it with reasonable efforts.
You may not impose any further restrictions on the exercise of the
rights granted or affirmed under this License. For example, you may
not impose a license fee, royalty, or other charge for exercise of
rights granted under this License, and you may not initiate litigation
(including a cross-claim or counterclaim in a lawsuit) alleging that
any patent claim is infringed by making, using, selling, offering for
sale, or importing the Program or any portion of it.
11. Patents.
A "contributor" is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based. The
work thus licensed is called the contributor's "contributor version".
A contributor's "essential patent claims" are all patent claims
owned or controlled by the contributor, whether already acquired or
hereafter acquired, that would be infringed by some manner, permitted
by this License, of making, using, or selling its contributor version,
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version. For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.
Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.
In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement). To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.
If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients. "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.
If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.
A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License. You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.
Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.
12. No Surrender of Others' Freedom.
If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot convey a
covered work so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may
not convey it at all. For example, if you agree to terms that obligate you
to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.
13. Use with the GNU Affero General Public License.
Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU Affero General Public License into a single
combined work, and to convey the resulting work. The terms of this
License will continue to apply to the part which is the covered work,
but the special requirements of the GNU Affero General Public License,
section 13, concerning interaction through a network will apply to the
combination as such.
14. Revised Versions of this License.
The Free Software Foundation may publish revised and/or new versions of
the GNU General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the
Program specifies that a certain numbered version of the GNU General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation. If the Program does not specify a version number of the
GNU General Public License, you may choose any version ever published
by the Free Software Foundation.
If the Program specifies that a proxy can decide which future
versions of the GNU General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.
Later license versions may give you additional or different
permissions. However, no additional obligations are imposed on any
author or copyright holder as a result of your choosing to follow a
later version.
15. Disclaimer of Warranty.
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. Limitation of Liability.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.
17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided
above cannot be given local legal effect according to their terms,
reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
Also add information on how to contact you by electronic and paper mail.
If the program does terminal interaction, make it output a short
notice like this when it starts in an interactive mode:
<program> Copyright (C) <year> <name of author>
This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, your program's commands
might be different; for a GUI interface, you would use an "about box".
You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU GPL, see
<https://www.gnu.org/licenses/>.
The GNU General Public License does not permit incorporating your program
into proprietary programs. If your program is a subroutine library, you
may consider it more useful to permit linking proprietary applications with
the library. If this is what you want to do, use the GNU Lesser General
Public License instead of this License. But first, please read
<https://www.gnu.org/licenses/why-not-lgpl.html>.
================================================
FILE: README.md
================================================
Systrack
========
[![License][license-badge]](./LICENSE)
[![GitHub actions workflow status][actions-badge]][actions-link]
[![PyPI version][pypi-badge]][pypi-systrack]
[![PyPI downloads][pypi-badge2]][pypistats-systrack]
<img align="left" width="150" height="150" src="https://raw.githubusercontent.com/mebeim/systrack/master/assets/logo.png" alt="Systrack logo"></img>
**See [mebeim/linux-syscalls](https://github.com/mebeim/linux-syscalls) for live syscall tables powered by Systrack**.
Systrack is a tool to analyze Linux kernel images (`vmlinux`) and extract
information about implemented syscalls. Given a `vmlinux` image, Systrack can
extract syscall numbers, names, symbol names, definition locations within kernel
sources, function signatures, and more.
Systrack can configure and build kernels for all its
[supported architectures](#supported-architectures-and-abis), and works best at
analyzing kernels that it has configured and built by itself.
Installation
------------
Systrack is [available on PyPI][pypi-systrack]. It requires Python 3.8+ and can
be installed with pip:
```bash
pip install systrack
```
Building and installing from source requires [`hatch`][pypi-hatch]:
```bash
hatch build
pip install dist/systrack-XXX.whl
```
Usage
-----
Systrack can mainly be used for two purposes: analyzing or building Linux
kernels. See also [Command line help](#command-line-help) (`systrack --help`)
and [Supported architectures and ABIs](#supported-architectures-and-abis)
(`systrack --arch help`) below.
- **Analyzing** a kernel image can be done given a `vmlinux` ELF with symbols,
and optionally also a kernel source directory (`--kdir`). Systrack will
extract information about implemented syscalls from the symbol table present
in the given `vmlinux` ELF, and if debugging information is present, it will
also extract file and line number information for syscall definitions.
Supplying a `--kdir` pointing Systrack to the checked-out sources for the
right kernel version (the same as the one to analyze) will help refine and/or
correct the location of the definitions.
Systrack can guess the architecture and ABI to analyze, but if the given
kernel was built with support for multiple ABIs, the right one can be selected
through `--arch`.
```none
systrack path/to/vmlinux
systrack --format json path/to/vmlinux
systrack --format html path/to/vmlinux
systrack --kdir path/to/linux_git_repo path/to/vmlinux
systrack --kdir path/to/linux_git_repo --arch x86-64-ia32 path/to/vmlinux
```
- **Building** can be done through the `--build` option. You will need to
provide a kernel source directory (`--kdir`) and an architecture/ABI
combination to build for (`--arch`).
```none
systrack --build --kdir path/to/linux_source_dir --arch x86-64
```
When building, kernel sources are configured to enable all syscalls available
for the selected architecture/ABI so as to produce a `vmlinux` with a "complete"
syscall table.
Cross-compilation with GCC is possible by specifying the correct toolchain
prefix with the `--cross` option, which will set the `CROSS_COMPILE` variable
for the kernel's `Makefile`. Other environment variables can also be used as
usual and are passed as is to `make`, so LLVM [cross]-compilation and custom
toolchain usage are also possible.
```none
systrack --build --kdir path/to/linux_source --arch arm64 --cross aarch64-linux-gnu-
```
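For scripting, the analysis invocations shown above can be assembled programmatically. The following is a minimal sketch and not part of Systrack: `systrack_argv` and `analyze_json` are hypothetical helpers that only use options documented in this README. It assumes that `systrack` is installed and in `PATH` and that the JSON document is written to standard output. The structure of that JSON is not described here, so it is loaded as generic JSON.

```python
import json
import subprocess
from typing import List, Optional


def systrack_argv(vmlinux: str, kdir: Optional[str] = None,
                  arch: Optional[str] = None, fmt: str = 'text') -> List[str]:
    '''Build a command line for analyzing a vmlinux (hypothetical helper).'''
    if fmt not in ('text', 'json', 'html'):
        raise ValueError(f'unsupported format: {fmt}')
    argv = ['systrack', '--format', fmt]
    if kdir is not None:
        argv += ['--kdir', kdir]
    if arch is not None:
        argv += ['--arch', arch]
    return argv + [vmlinux]


def analyze_json(vmlinux: str, **kwargs):
    '''Run systrack and parse whatever JSON it prints (assumed on stdout).'''
    argv = systrack_argv(vmlinux, fmt='json', **kwargs)
    res = subprocess.run(argv, check=True, capture_output=True, text=True)
    return json.loads(res.stdout)
```

Keeping the argument construction separate from the subprocess call makes the command line easy to inspect or log before anything is executed.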
Supported architectures and ABIs
--------------------------------
Here's a list of supported arch/ABI combinations accepted via `--arch` (values
are case-insensitive). This information is also available by running
`systrack --arch help`.
| Value | Aliases | Arch | Kernel | Syscall ABI | Build based on | Notes |
|:----------------|:-------------------|:--------|:-------|:---------------|:------------------------------|:--------|
| `arm` | `arm-eabi`, `eabi` | ARM | 32-bit | 32-bit EABI | `multi_v7_defconfig` | *[2]* |
| `arm-oabi` | `oabi` | ARM | 32-bit | 32-bit OABI | `multi_v7_defconfig` | *[2,4]* |
| `arm64` | `aarch64` | ARM | 64-bit | 64-bit AArch64 | `defconfig` | |
| `arm64-aarch32` | `aarch32` | ARM | 64-bit | 32-bit AArch32 | `defconfig` | *[1]* |
| `mips` | `mips32`, `o32` | MIPS | 32-bit | 32-bit O32 | `defconfig` | |
| `mips64` | `n64` | MIPS | 64-bit | 64-bit N64 | `ip27_defconfig` | *[1]* |
| `mips64-n32` | `n32` | MIPS | 64-bit | 64-bit N32 | `ip27_defconfig` | *[1]* |
| `mips64-o32` | `o32-64` | MIPS | 64-bit | 32-bit O32 | `ip27_defconfig` | *[1]* |
| `powerpc` | `ppc`, `ppc32` | PowerPC | 32-bit | 32-bit PPC32 | `ppc64_defconfig` | |
| `powerpc64` | `ppc64` | PowerPC | 64-bit | 64-bit PPC64 | `ppc64_defconfig` | *[1]* |
| `powerpc64-32` | `ppc64-32` | PowerPC | 64-bit | 32-bit PPC32 | `ppc64_defconfig` | *[1]* |
| `powerpc64-spu` | `ppc64-spu`, `spu` | PowerPC | 64-bit | 64-bit "SPU" | `ppc64_defconfig` | *[1,5]* |
| `riscv` | `riscv32`, `rv32` | RISC-V | 32-bit | 32-bit "RV32" | `defconfig` + `32-bit.config` | *[3,6]* |
| `riscv64` | `rv64` | RISC-V | 64-bit | 64-bit "RV64" | `defconfig` | *[1,6]* |
| `riscv64-32` | `rv64-32` | RISC-V | 64-bit | 32-bit "RV32" | `defconfig` | *[1,6]* |
| `s390x` | | IBM Z | 64-bit | 64-bit s390x | `defconfig` | *[1]* |
| `s390` | | IBM Z | 64-bit | 32-bit s390 | `defconfig` | *[1]* |
| `x86` | `i386`, `ia32` | x86 | 32-bit | 32-bit IA32 | `i386_defconfig` | |
| `x86-64` | `x64` | x86 | 64-bit | 64-bit x86-64 | `x86_64_defconfig` | *[1]* |
| `x86-64-x32` | `x32` | x86 | 64-bit | 64-bit x32 | `x86_64_defconfig` | *[1]* |
| `x86-64-ia32` | `ia32-64` | x86 | 64-bit | 32-bit IA32 | `x86_64_defconfig` | *[1]* |
Notes:
1. Building creates a kernel supporting all ABIs for this architecture.
2. Build based on `defconfig` for Linux <= v3.7.
3. Build based on `rv32_defconfig` for Linux <= v6.7 and `defconfig` for
Linux <= v5.0.
4. Building creates an EABI kernel with compat OABI support. Building an
OABI-only kernel is NOT supported. The seccomp filter system will be missing.
5. "SPU" is not a real ABI. It indicates a Cell processor SPU (Synergistic
Processing Unit). The ABI is really PPC64, but SPUs can only use a subset of
syscalls.
6. "RV32" and "RV64" are not real ABIs, but rather ISAs. The RISC-V syscall
ABI is the same for 32-bit and 64-bit (only register size differs). These
names are only used for clarity.
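The value and alias columns of the table, together with case-insensitive matching, amount to a simple lookup. Here is a minimal sketch for illustration only. It is not Systrack's own code: `ALIASES`, `LOOKUP` and `canonical_arch` are hypothetical names, and only a few rows of the table are included.

```python
from typing import Dict, Tuple

# Canonical --arch value -> aliases, copied from a few rows of the table
ALIASES: Dict[str, Tuple[str, ...]] = {
    'arm': ('arm-eabi', 'eabi'),
    'arm64': ('aarch64',),
    'mips64-o32': ('o32-64',),
    'riscv': ('riscv32', 'rv32'),
    'x86': ('i386', 'ia32'),
    'x86-64': ('x64',),
    'x86-64-ia32': ('ia32-64',),
}

# Flatten into a single table: every accepted name -> canonical value
LOOKUP = {
    name: value
    for value, names in ALIASES.items()
    for name in (value, *names)
}


def canonical_arch(name: str) -> str:
    '''Resolve an --arch value or alias to its canonical value, ignoring case.'''
    try:
        return LOOKUP[name.lower()]
    except KeyError:
        raise ValueError(f'unsupported architecture/ABI combination: {name}') from None
```

For example, `canonical_arch('AArch64')` resolves to `arm64`, while an unknown name raises `ValueError`.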
Runtime dependencies
--------------------
External (non-Python) runtime dependencies are:
- **Required**: `readelf` (from GNU binutils) is used to parse and extract ELF
metadata such as symbols and sections. This is currently the only *compulsory*
external dependency of Systrack.
- Optional: `addr2line` (from GNU binutils) is used to extract location
information from DWARF debug info. Without this program, Systrack will not
output any information about syscall definition locations.
- Optional: `rg` ([ripgrep][ripgrep]) is used for much faster recursive
grepping of syscall definition locations within kernel sources when needed.
Otherwise, a slower pure-Python implementation is used.
- Optional: a working compiler toolchain and
[kernel build dependencies](https://www.kernel.org/doc/html/latest/process/changes.html)
are obviously needed if you want Systrack to *build* kernels from source.
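Before running an analysis, it can be handy to check which of the programs listed above are installed. The sketch below is a hypothetical standalone check based on a `PATH` lookup with `shutil.which`. It does not show how Systrack itself detects these programs, and `EXTERNAL_TOOLS`, `check_tools` and `missing_required` are names made up for this example.

```python
import shutil
from typing import Dict, List

# External programs named in the list above: program -> is it required?
EXTERNAL_TOOLS = {'readelf': True, 'addr2line': False, 'rg': False}


def check_tools() -> Dict[str, bool]:
    '''Return a program -> available mapping based on a PATH lookup.'''
    return {tool: shutil.which(tool) is not None for tool in EXTERNAL_TOOLS}


def missing_required(available: Dict[str, bool]) -> List[str]:
    '''List the required programs that were not found.'''
    return [
        tool for tool, required in EXTERNAL_TOOLS.items()
        if required and not available.get(tool, False)
    ]


if __name__ == '__main__':
    found = check_tools()
    for tool, ok in found.items():
        print(f'{tool}: {"found" if ok else "missing"}')
    if missing_required(found):
        raise SystemExit('readelf is required')
```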
Limitations
-----------
- Supported kernel images: Systrack works with regular *uncompressed* `vmlinux`
ELF images and *needs* ELF symbols. Compressed and stripped kernel images are
not supported. Tools such as
[`vmlinux-to-elf`](https://github.com/marin-m/vmlinux-to-elf) can be used to
uncompress and unstrip kernel images, after which Systrack will be able to
analyze them.
- Old kernel versions: Systrack was mainly designed for and tested on modern
kernels (>= v4.0) and has not been tested on older kernels. It should still
*somewhat* work on older kernels, but without the same level of guarantee on
the correctness of the output. Support for old kernels may come gradually in
the future.
- Relocatable kernels: Systrack does not currently parse and apply ELF
relocations. This means that Systrack does not support kernels using
relocation entries for the syscall table. On some architectures (notably MIPS)
if the kernel is relocatable the syscall table is relocated at startup and
does not contain valid virtual addresses: Systrack will currently fail to
analyze such kernels.
- Building kernels: when building kernels for you, Systrack does not aim at
building usable or sane kernel images. In fact, a lot of unneeded features are
disabled at build time (e.g., USB support). The goal is only to correctly
include all syscalls in the syscall table for later extraction.
*Do not run kernels built with Systrack.*
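As a quick pre-check for the first limitation, a file can be classified by its leading magic bytes. This is a rough sketch under stated assumptions, not something Systrack provides: `classify_image` is a hypothetical helper and it recognizes only the ELF magic and a plain gzip stream at offset 0. Bootable compressed images usually carry their own headers and would be reported as `unknown`. An ELF file may also still be stripped, and checking for symbols needs a tool such as `readelf`.

```python
ELF_MAGIC = b'\x7fELF'
GZIP_MAGIC = b'\x1f\x8b'


def classify_image(path: str) -> str:
    '''Classify a file as 'elf', 'gzip' or 'unknown' from its first bytes.'''
    with open(path, 'rb') as f:
        head = f.read(4)
    if head == ELF_MAGIC:
        return 'elf'
    if head[:2] == GZIP_MAGIC:
        return 'gzip'
    return 'unknown'
```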
Command line help
-----------------
```none
$ systrack --help
usage: systrack [OPTIONS...] [VMLINUX]
Analyze a Linux kernel image and extract information about implemented syscalls
positional arguments:
VMLINUX path to vmlinux, if not inside KDIR or no KDIR supplied
options:
-h, --help show this help message and exit
-k KDIR, --kdir KDIR kernel source directory
-a ARCH, --arch ARCH kernel architecture/ABI combination; pass "help" for a list
(default: autodetect)
-b, --build configure and build kernel and exit
-c, --config configure kernel and exit
-C, --clean clean kernel sources (make distclean) and exit
-x PREFIX, --cross PREFIX
toolchain prefix for cross-compilation; use with -b/-c/-C
-o OUTDIR, --out OUTDIR
output directory for out-of-tree kernel build (make O=...); only
meaningful with -b/-c/-C
-f FMT, --format FMT output format: text, json or html (default: text)
--absolute-paths output absolute paths instead of paths relative to KDIR
--remap ORIG_KDIR replace ORIG_KDIR with the KDIR provided with -k/--kdir for paths
obtained from ELF debug information; needed if the kernel was
built with ORIG_KDIR as source directory instead of KDIR, and
debug info contains absolute paths to ORIG_KDIR
--checkout REF git checkout to REF inside KDIR before doing anything; the
special value "auto" can be used to checkout to the tag
corresponding to the detected kernel version from VMLINUX
--disable-opt try building kernel with reduced/disabled optimizations for more
reliable location results; only meaningful with -b
-q, --quiet quietness level:
-q = no info, -qq = no warnings, -qqq = no errors
-qqqq = no standard error output whatsoever
-v, --verbose verbosity level:
-v = info, -vv = debug, -vvv = more debug
-V, --version show version information and exit
```
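The `--remap` option described in the help text boils down to a prefix replacement on paths taken from debug information. The sketch below illustrates that idea only. How Systrack performs the replacement internally is not shown here, and `remap_path` is a hypothetical helper operating on POSIX-style paths.

```python
from pathlib import PurePosixPath


def remap_path(path: str, orig_kdir: str, kdir: str) -> str:
    '''Replace a leading orig_kdir with kdir; leave other paths untouched.'''
    try:
        # relative_to() compares whole path components, so '/build/linux2'
        # is not mistaken for a path under '/build/linux'
        rel = PurePosixPath(path).relative_to(orig_kdir)
    except ValueError:
        return path
    return str(PurePosixPath(kdir) / rel)
```

With `orig_kdir='/build/linux'` and `kdir='/home/me/linux'`, the path `/build/linux/fs/open.c` becomes `/home/me/linux/fs/open.c`, while paths outside `/build/linux` are returned unchanged.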
---
*Copyright © 2023-2025 Marco Bonelli. Licensed under the GNU General Public License v3.0.*
[license-badge]: https://img.shields.io/github/license/mebeim/systrack?color=blue
[actions-badge]: https://img.shields.io/github/actions/workflow/status/mebeim/systrack/publish.yml?event=release&label=publish
[actions-link]: https://github.com/mebeim/systrack/actions/workflows/publish.yml
[pypi-badge]: https://img.shields.io/pypi/v/systrack
[pypi-badge2]: https://img.shields.io/pypi/dm/systrack
[pypi-systrack]: https://pypi.org/project/systrack/
[pypistats-systrack]: https://pypistats.org/packages/systrack
[pypi-hatch]: https://pypi.org/project/hatch
[ripgrep]: https://github.com/BurntSushi/ripgrep
================================================
FILE: pyproject.toml
================================================
[project]
name = 'systrack'
description = 'Linux kernel syscall implementation tracker'
authors = [{name = 'Marco Bonelli'}, {name = 'Marco Bonelli', email = 'marco@mebeim.net'}]
maintainers = [{name = 'Marco Bonelli'}, {name = 'Marco Bonelli', email = 'marco@mebeim.net'}]
license = {text = 'GNU General Public License v3 (GPLv3)'}
readme = 'README.md'
platforms = 'any'
requires-python = '>=3.8'
dynamic = ['version']
keywords = ['systrack', 'linux', 'kernel', 'syscall', 'kconfig', 'elf', 'abi']
classifiers = [
'Development Status :: 4 - Beta',
'Environment :: Console',
'Intended Audience :: Developers',
'Intended Audience :: Science/Research',
'Intended Audience :: System Administrators',
'License :: OSI Approved :: GNU General Public License v3 (GPLv3)',
'Natural Language :: English',
'Operating System :: OS Independent',
'Programming Language :: Python :: 3',
'Topic :: Security',
'Topic :: Software Development :: Embedded Systems',
'Topic :: Software Development :: Testing',
'Topic :: System :: Operating System Kernels :: Linux',
'Topic :: Utilities',
]
dependencies = [
'iced-x86~=1.21.0',
'jinja2~=3.1.2'
]
[project.urls]
Homepage = 'https://github.com/mebeim/systrack'
Repository = 'https://github.com/mebeim/systrack.git'
Changelog = 'https://github.com/mebeim/systrack/blob/master/CHANGELOG.md'
[project.scripts]
systrack = 'systrack.__main__:main'
[build-system]
requires = ['hatchling']
build-backend = 'hatchling.build'
[tool.hatch.version]
path = 'src/systrack/version.py'
[tool.hatch.build]
ignore-vcs = true
include = ['src/systrack/templates/*']
[tool.hatch.build.targets.wheel]
packages = ['src/systrack']
[tool.hatch.build.targets.sdist]
include = ['src', 'CHANGELOG.md']
[tool.hatch.envs.default]
python = '3'
[tool.hatch.envs.test]
dependencies = ['pytest']
[tool.ruff.lint]
# Don't warn for multi-line statements
ignore = ['E701']
[tool.ruff.lint.per-file-ignores]
# Don't warn for star imports in these files
'arch/__init__.py' = ['F403', 'F405']
'tests/*' = ['F403', 'F405']
================================================
FILE: src/systrack/__init__.py
================================================
================================================
FILE: src/systrack/__main__.py
================================================
import argparse
import logging
import os
import signal
import sys
from pathlib import Path
from textwrap import TextWrapper
from .arch import SUPPORTED_ARCHS, SUPPORTED_ARCHS_HELP
from .kernel import Kernel, KernelError, KernelArchError, KernelMultiABIError
from .kernel import KernelVersionError, KernelWithoutSymbolsError
from .log import log_setup, eprint
from .output import output_syscalls
from .utils import command_argv_to_string, command_available
from .utils import gcc_version, git_checkout, maybe_rel, format_duration
from .version import VERSION, VERSION_HELP
def sigint_handler(_, __):
	sys.stderr.write('Caught SIGINT, stopping\n')
	sys.exit(1)
def wrap_help(body: str) -> str:
	'''Wrap a string to 65 columns without breaking words for a nice --help
	output of the tool.
	'''
	tx = TextWrapper(65, break_long_words=False, replace_whitespace=False)
	return '\n'.join(tx.fill(line) for line in body.splitlines() if line.strip())
def parse_args() -> argparse.Namespace:
	'''Parse and partially validate command line arguments through argparse.
	'''
	ap = argparse.ArgumentParser(
		prog='systrack',
		usage='systrack [OPTIONS...] [VMLINUX]',
		description='Analyze a Linux kernel image and extract information about implemented syscalls',
		formatter_class=argparse.RawTextHelpFormatter
	)
	ap.add_argument('vmlinux', metavar='VMLINUX', nargs='?',
		help=wrap_help('path to vmlinux, if not inside KDIR or no KDIR supplied'))
	ap.add_argument('-k', '--kdir', metavar='KDIR',
		help=wrap_help('kernel source directory'))
	ap.add_argument('-a', '--arch', metavar='ARCH',
		help=wrap_help('kernel architecture/ABI combination; pass "help" for a '
			'list (default: autodetect)'))
	ap.add_argument('-b', '--build', action='store_true',
		help=wrap_help('configure and build kernel and exit'))
	ap.add_argument('-c', '--config', action='store_true',
		help=wrap_help('configure kernel and exit'))
	ap.add_argument('-C', '--clean', action='store_true',
		help=wrap_help('clean kernel sources (make distclean) and exit'))
	ap.add_argument('-x', '--cross', metavar='PREFIX',
		help=wrap_help('toolchain prefix for cross-compilation; use with -b/-c/-C'))
	ap.add_argument('-o', '--out', metavar='OUTDIR',
		help=wrap_help('output directory for out-of-tree kernel build (make '
			'O=...); only meaningful with -b/-c/-C'))
	ap.add_argument('-f', '--format', metavar='FMT',
		choices=('text', 'json', 'html'), default='text',
		help=wrap_help('output format: text, json or html (default: text)'))
	ap.add_argument('--absolute-paths', action='store_true',
		help=wrap_help('output absolute paths instead of paths relative to KDIR'))
	ap.add_argument('--remap', metavar='ORIG_KDIR',
		help=wrap_help('replace ORIG_KDIR with the KDIR provided with '
			'-k/--kdir for paths obtained from ELF debug information; needed '
			'if the kernel was built with ORIG_KDIR as source directory '
			'instead of KDIR, and debug info contains absolute paths to '
			'ORIG_KDIR'))
	ap.add_argument('--checkout', metavar='REF',
		help=wrap_help('git checkout to REF inside KDIR before doing anything; '
			'the special value "auto" can be used to checkout to the tag '
			'corresponding to the detected kernel version from VMLINUX'))
	ap.add_argument('--disable-opt', action='store_true',
		help=wrap_help('try building kernel with reduced/disabled '
			'optimizations for more reliable location results; only meaningful '
			'with -b'))
	ap.add_argument('-q', '--quiet', action='count', default=0,
		help=wrap_help('quietness level:\n'
			' -q: no info; -qq: no warnings; -qqq: no errors;\n'
			' -qqqq: no standard error output whatsoever'))
	ap.add_argument('-v', '--verbose', action='count', default=0,
		help=wrap_help('verbosity level:\n'
			' -v: info; -vv: debug; -vvv: more debug;\n'
			' -vvvv: also pass V=1 to make when building'))
	ap.add_argument('-V', '--version', action='version', version=VERSION_HELP,
		help=wrap_help('show version information and exit'))
	return ap.parse_args()
def instantiate_kernel(*a, **kwa) -> Kernel:
'''Instantiate the Kernel class with the given parameters, handling and
printing possible errors.
'''
try:
return Kernel(*a, **kwa)
except KernelArchError as e:
eprint(str(e))
sys.exit(f"See '{sys.argv[0]} --arch help' for more information")
except KernelMultiABIError as e:
arch_class, abis = e.args[1:]
sys.exit(
f'Detected architecture: {arch_class.name}\n'
f'Detected ABIs: {", ".join(abis)}\n'
'This kernel was built with support for multiple syscall ABIs.\n'
'Select one using --arch NAME (see --arch HELP for more info).'
)
except KernelVersionError:
sys.exit(
'Unable to determine kernel version!\n'
'Did you specify a valid kernel source directory (--kdir) or vmlinux path?'
)
except KernelWithoutSymbolsError:
sys.exit(
'The provided kernel image has no symbols, which are necessary for Systrack to work.\n'
'You can try unstripping the image with tools such as "vmlinux-to-elf".'
)
except KernelError as e:
eprint(str(e))
sys.exit(1)
def main() -> int:
signal.signal(signal.SIGINT, sigint_handler)
args = parse_args()
log_setup(args.quiet, args.verbose, os.isatty(sys.stderr.fileno()))
logging.debug('Systrack v%s', VERSION)
logging.debug('Command line: systrack %s', command_argv_to_string(sys.argv[1:]))
arch_name = args.arch
if arch_name is not None:
arch_name = arch_name.lower()
if arch_name not in SUPPORTED_ARCHS:
if arch_name not in ('help', '?'):
eprint(f'Unsupported architecture/ABI combination: {arch_name}')
eprint('See --arch HELP for a list')
return 1
eprint(SUPPORTED_ARCHS_HELP)
return 0
if not args.kdir and not args.vmlinux:
eprint('Need to specify a kernel source directory and/or path to vmlinux')
eprint('See --help for more information')
return 1
if not args.kdir and (args.checkout or args.config or args.build):
eprint('Need to specify a kernel source directory (--kdir)')
return 1
if not arch_name and (args.config or args.build):
eprint('Need to specify an architecture/ABI combination (--arch)')
eprint('See --arch HELP for a list')
return 1
cross = args.cross or ''
vmlinux = Path(args.vmlinux) if args.vmlinux else None
kdir = Path(args.kdir) if args.kdir else None
outdir = Path(args.out) if args.out else None
rdir = Path(args.remap) if args.remap else None
# Checkout before building only if not set to auto
if args.checkout and args.checkout != 'auto':
eprint('Checking out to', args.checkout)
git_checkout(kdir, args.checkout)
if args.clean or args.config or args.build:
if args.out:
out = Path(args.out)
try:
if out.exists() and not out.is_dir():
eprint(f'Output directory "{args.out}" already exists and is not a directory')
return 1
out.mkdir(exist_ok=True)
except Exception as e:
eprint(f'Failed to create output directory "{args.out}": {str(e)}')
return 1
# Check that GCC is available and log its version for our own sanity to
# avoid mixing up toolchains
gcc_cmd = cross + 'gcc'
if not command_available(gcc_cmd):
eprint(f'Command "{gcc_cmd}" not found')
eprint('Make sure your cross-compilation toolchain is in $PATH')
return 127
if args.config or args.build:
eprint('Compiler:', gcc_version(gcc_cmd))
kernel = instantiate_kernel(arch_name, kdir=kdir, outdir=outdir, toolchain_prefix=cross)
if args.build:
eprint('Cleaning kernel sources')
kernel.clean()
eprint('Detected kernel version:', kernel.version_str)
eprint('Configuring kernel')
kernel.configure()
eprint('Building kernel (might take a while)')
elapsed = kernel.build(args.disable_opt)
eprint('Build took', format_duration(elapsed))
elif args.config:
eprint('Cleaning kernel sources')
kernel.clean()
eprint('Detected kernel version:', kernel.version_str)
eprint('Configuring kernel')
kernel.configure()
eprint('Done')
elif args.clean:
eprint('Cleaning kernel sources')
kernel.clean()
eprint('Done')
return 0
# Auto-checkout to the correct tag is only possible if we already have a
# vmlinux to extract the version from
if args.checkout == 'auto' and not vmlinux:
eprint('Cannot perform auto-checkout without a vmlinux image!')
return 1
if not vmlinux:
vmlinux = kdir / 'vmlinux'
if not vmlinux.is_file():
eprint(f'Unable to find vmlinux at "{vmlinux}".')
eprint('Build the kernel or provide a valid path.')
return 1
if not command_available('readelf'):
eprint('Command "readelf" unavailable, can\'t do much without it!')
return 127
kernel = instantiate_kernel(arch_name, vmlinux, kdir, outdir, rdir)
eprint('Detected kernel version:', kernel.version_str)
if args.checkout == 'auto':
assert kernel.version_source == 'vmlinux'
eprint('Checking out to', kernel.version_tag)
git_checkout(kdir, kernel.version_tag)
if not kernel.syscalls:
return 1
# Apply a couple of transformations that are independent of the chosen
# output format, and also check how many syscalls do not have location or
# signature information.
syscalls = kernel.syscalls
kdir = kernel.kdir
abs_paths = args.absolute_paths
n_no_loc = 0
n_no_sig = 0
n_grepped = 0
for sc in kernel.syscalls:
if sc.file is None:
n_no_loc += 1
else:
if kdir and not abs_paths:
sc.file = maybe_rel(sc.file, kdir)
if kdir and sc.signature is None:
n_no_sig += 1
if sc.grepped_location:
n_grepped += 1
eprint('Found', len(syscalls), 'implemented syscalls')
if n_grepped:
eprint('Found', n_grepped, 'definition location' + ('s' if n_grepped > 1 else ''), 'through grepping')
if n_no_loc:
eprint('Could not find definition location for', n_no_loc, 'syscall' + ('s' if n_no_loc > 1 else ''))
if n_no_sig:
eprint('Could not extract signature for', n_no_sig, 'syscall' + ('s' if n_no_sig > 1 else ''))
eprint()
output_syscalls(kernel, args.format)
return 0
# NOTE: this is NOT executed in a normal install, because the `systrack` command
# will point to a script that imports and directly calls the main() function
# above.
if __name__ == '__main__':
sys.exit(main())
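# Illustrative sketch (not part of Systrack): instantiate_kernel() above
# reports fatal errors by passing a message to sys.exit(). The standalone
# snippet below shows how that call behaves. The helper names
# exit_with_message() and run() are hypothetical and exist only for this demo.
```python
import sys

def exit_with_message() -> None:
    # Adjacent string literals are concatenated by the compiler, so this is
    # ONE argument. sys.exit() accepts at most one argument: separating the
    # two strings with a comma would raise TypeError instead of exiting.
    sys.exit(
        'Unable to determine kernel version!\n'
        'Did you specify a valid kernel source directory?'
    )

def run() -> str:
    try:
        exit_with_message()
    except SystemExit as e:
        # When the code is a string, the interpreter prints it to stderr
        # and the process exit status is 1.
        return e.code
    return ''

def two_args_raise_type_error() -> bool:
    try:
        sys.exit('first', 'second')
    except TypeError:
        return True
    return False
```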
================================================
FILE: src/systrack/arch/__init__.py
================================================
import logging
from typing import Optional, Type, Tuple, List
from ..elf import ELF
from ..type_hints import KernelVersion
from .arch_base import Arch
from .arm import ArchArm
from .arm64 import ArchArm64
from .mips import ArchMips
from .powerpc import ArchPowerPC
from .riscv import ArchRiscV
from .s390 import ArchS390
from .x86 import ArchX86
ARCH_CLASSES = (
ArchArm,
ArchArm64,
ArchMips,
ArchPowerPC,
ArchRiscV,
ArchS390,
ArchX86,
)
# NOTE: For the sake of mental sanity, try keeping abi= the same name as the one
# in the *.tbl files in the kernel sources.
SUPPORTED_ARCHS = {
'x86' : lambda v: ArchX86(v, abi='ia32', bits32=True), # "i386" ABI
'x86-64' : lambda v: ArchX86(v, abi='x64'), # "64" ABI
'x86-64-x32' : lambda v: ArchX86(v, abi='x32'),
'x86-64-ia32' : lambda v: ArchX86(v, abi='ia32'),
'arm' : lambda v: ArchArm(v, abi='eabi'),
'arm-oabi' : lambda v: ArchArm(v, abi='oabi'),
'arm64' : lambda v: ArchArm64(v, abi='aarch64'),
'arm64-aarch32': lambda v: ArchArm64(v, abi='aarch32'),
'mips' : lambda v: ArchMips(v, abi='o32', bits32=True),
'mips64' : lambda v: ArchMips(v, abi='n64'),
'mips64-n32' : lambda v: ArchMips(v, abi='n32'),
'mips64-o32' : lambda v: ArchMips(v, abi='o32'),
'powerpc' : lambda v: ArchPowerPC(v, abi='ppc32', bits32=True), # "32" ABI
'powerpc64' : lambda v: ArchPowerPC(v, abi='ppc64'), # "64" ABI
'powerpc64-32' : lambda v: ArchPowerPC(v, abi='ppc32'), # "32" ABI
'powerpc64-spu': lambda v: ArchPowerPC(v, abi='spu'),
'riscv' : lambda v: ArchRiscV(v, abi='rv32', bits32=True),
'riscv64' : lambda v: ArchRiscV(v, abi='rv64'),
'riscv64-32' : lambda v: ArchRiscV(v, abi='rv32'),
's390x' : lambda v: ArchS390(v, abi='s390x'),
's390' : lambda v: ArchS390(v, abi='s390'),
}
ARCH_ALIASES = (
# name alias
('x86' , 'i386' ),
('x86' , 'ia32' ),
('x86-64' , 'x64' ),
('x86-64-x32' , 'x32' ),
('x86-64-ia32' , 'ia32-64' ),
('arm' , 'arm-eabi' ),
('arm' , 'eabi' ),
('arm-oabi' , 'oabi' ),
('arm64' , 'aarch64' ),
('arm64-aarch32', 'aarch32' ),
('mips' , 'mips32' ),
('mips' , 'o32' ),
('mips64' , 'n64' ),
('mips64-n32' , 'n32' ),
('mips64-o32' , 'o32-64' ),
('powerpc' , 'ppc' ),
('powerpc' , 'ppc32' ),
('powerpc64' , 'ppc64' ),
('powerpc64-32' , 'ppc64-32' ),
('powerpc64-spu', 'ppc64-spu' ),
('powerpc64-spu', 'spu' ),
('riscv' , 'riscv32' ),
('riscv' , 'rv32' ),
('riscv64' , 'rv64' ),
('riscv64-32' , 'rv64-32' ),
)
SUPPORTED_ARCHS.update({alias: SUPPORTED_ARCHS[arch] for arch, alias in ARCH_ALIASES})
SUPPORTED_ARCHS_HELP = '''\
Supported architectures and ABIs (values are case-insensitive):
Value Aliases Arch Kernel Syscall ABI Build based on Notes
------------------------------------------------------------------------------------------------
arm arm-eabi, eabi ARM 32-bit 32-bit EABI multi_v7_defconfig [2]
arm-oabi oabi ARM 32-bit 32-bit OABI multi_v7_defconfig [2,4]
------------------------------------------------------------------------------------------------
arm64 aarch64 ARM 64-bit 64-bit AArch64 defconfig
arm64-aarch32 aarch32 ARM 64-bit 32-bit AArch32 defconfig [1]
------------------------------------------------------------------------------------------------
mips mips32, o32 MIPS 32-bit 32-bit O32 defconfig
mips64 n64 MIPS 64-bit 64-bit N64 ip27_defconfig [1]
mips64-n32 n32 MIPS 64-bit 64-bit N32 ip27_defconfig [1]
mips64-o32 o32-64 MIPS 64-bit 32-bit O32 ip27_defconfig [1]
------------------------------------------------------------------------------------------------
powerpc ppc, ppc32 PowerPC 32-bit 32-bit PPC32 ppc64_defconfig
powerpc64 ppc64 PowerPC 64-bit 64-bit PPC64 ppc64_defconfig [1]
powerpc64-32 ppc64-32 PowerPC 64-bit 32-bit PPC32 ppc64_defconfig [1]
powerpc64-spu ppc64-spu, spu PowerPC 64-bit 64-bit "SPU" ppc64_defconfig [1,5]
------------------------------------------------------------------------------------------------
riscv riscv32, rv32 RISC-V 32-bit 32-bit "RV32" defconfig + 32-bit.config [3,6]
riscv64 rv64 RISC-V 64-bit 64-bit "RV64" defconfig [1,6]
riscv64-32 rv64-32 RISC-V 64-bit 32-bit "RV32" defconfig [1,6]
------------------------------------------------------------------------------------------------
s390x IBM Z 64-bit 64-bit s390x defconfig [1]
s390 IBM Z 64-bit 32-bit s390 defconfig [1]
------------------------------------------------------------------------------------------------
x86 i386, ia32 x86 32-bit 32-bit IA32 i386_defconfig
x86-64 x64 x86 64-bit 64-bit x86-64 x86_64_defconfig [1]
x86-64-x32 x32 x86 64-bit 64-bit x32 x86_64_defconfig [1]
x86-64-ia32 ia32-64 x86 64-bit 32-bit IA32 x86_64_defconfig [1]
[1] Building creates a kernel supporting all ABIs for this architecture.
[2] Build based on "defconfig" for Linux < v3.7.
[3] Build based on "rv32_defconfig" for Linux <= v6.7 and "defconfig" for Linux <= v5.0.
[4] Building creates an EABI kernel with compat OABI support. Building an OABI-only kernel is
NOT supported. The seccomp filter system will be missing.
[5] "SPU" is not a real ABI. It indicates a Cell processor SPU (Synergistic Processing Unit).
The ABI is really PPC64, but SPUs can only use a subset of syscalls.
[6] "RV32" and "RV64" are not real ABIs, but rather ISAs. The RISC-V syscall ABI is the same
for 32-bit and 64-bit (only register size differs). These names are only used for clarity.
'''
def arch_from_name(name: str, kernel_version: KernelVersion) -> Arch:
'''Instantiate and return the right Arch subclass given a human-friendly
name (--arch). The name should be already validated.
'''
return SUPPORTED_ARCHS[name](kernel_version)
def arch_from_vmlinux(vmlinux: ELF) -> Optional[Tuple[Type[Arch],bool,List[str]]]:
'''Determine architecture and supported ABIs from vmlinux ELF. Returns the
correct Arch subclass, the bitness and a list of detected ABIs.
'''
for klass in ARCH_CLASSES:
match = klass.match(vmlinux)
if match:
return klass, *match
logging.fatal('Unknown or unsupported architecture: e_machine = %d, '
'e_flags = 0x%x', vmlinux.e_machine, vmlinux.e_flags)
return None
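# Illustrative sketch (not part of Systrack): how an alias table like
# ARCH_ALIASES can be folded into the name -> factory mapping so that every
# alias shares the same factory object as its canonical name. The factories
# below are hypothetical stand-ins that return strings instead of Arch
# instances, and lookup() is an assumed helper that lowercases user input the
# way main() does before the lookup.
```python
from typing import Callable, Dict, Tuple

Version = Tuple[int, ...]

# Hypothetical stand-ins for the real Arch factories
ARCHS: Dict[str, Callable[[Version], str]] = {
    'x86-64': lambda v: f'x64@{v}',
    'arm64' : lambda v: f'aarch64@{v}',
}

# (canonical name, alias) pairs, same layout as ARCH_ALIASES
ALIASES = (
    ('x86-64', 'x64'),
    ('arm64' , 'aarch64'),
)

# The comprehension is fully evaluated before update() runs, so only
# canonical names are looked up while building it.
ARCHS.update({alias: ARCHS[name] for name, alias in ALIASES})

def lookup(user_value: str, version: Version) -> str:
    # Values are case-insensitive: normalize before the dictionary lookup
    return ARCHS[user_value.lower()](version)
```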
================================================
FILE: src/systrack/arch/arch_base.py
================================================
import logging
from abc import ABC, abstractmethod
from typing import Tuple, List, Dict, Optional, final
from ..elf import Symbol, ELF
from ..syscall import Syscall
from ..type_hints import KernelVersion, EsotericSyscall
from ..utils import VersionedDict, anysuffix, noprefix, nosuffix
class Arch(ABC):
# Directory name for this arch in the kernel source, under arch/
name: Optional[str] = None
# Whether this arch is 32-bits or not
bits32: bool = False
# Selected ABI to inspect/build for
abi: Optional[str] = None
# Whether the selected ABI is 32-bits or not
abi_bits32: bool = False
# Whether this architecture makes use of function descriptors for function
# pointers or not
uses_function_descriptors: bool = False
# Are we looking for compat syscalls (COMPAT_SYSCALL_DEFINEn)? Or, in other
# words, is this not the "main" ABI of the kernel we're analyzing?
compat: bool = False
# Kernel version that we are interested in analyzing
kernel_version: Optional[KernelVersion] = None
# Make targets to run (one by one in the specified order) to obtain the base
# config to build the kernel with
config_targets: Tuple[str,...] = ('defconfig',)
# Name of the syscall table symbol to look for
syscall_table_name: Optional[str] = 'sys_call_table'
# Base syscall number (actual syscall number is base + idx in syscall table)
# NOTE: easiest way to check this is to just compile a binary that makes a
# raw syscall for the right arch/ABI. The arch_syscall_addr() kernel
# function can also be useful to inspect.
syscall_num_base: int = 0
# Syscall number destination (register name, None if no register is used,
# e.g. arm/OABI where the instruction is swi <nr>). Subclasses must override
# this.
syscall_num_reg: Optional[str] = None
# Registers for syscall arguments. Subclasses must override this.
syscall_arg_regs: Optional[Tuple[str, ...]] = None
# Additional kconfig options to set
kconfig: VersionedDict = VersionedDict()
# Arch-specific syscall kconfig options dependency (supersedes the global
# arch-agnostic KCONFIG_SYSCALL_DEPS; see the comment in kconfig_options.py
# to know how to fill this)
kconfig_syscall_deps: VersionedDict = VersionedDict()
def __init__(self, kernel_version: KernelVersion, abi: str, bits32: bool):
self.kernel_version = kernel_version
self.bits32 = bits32
self.abi = abi
def __repr__(s):
return (f'{s.__class__.__name__}(name={s.name!r}, '
f'bits32={s.bits32}, abi={s.abi!r}, compat={s.compat!r}, ...)')
@staticmethod
@abstractmethod
def match(vmlinux: ELF) -> Optional[Tuple[bool,List[str]]]:
'''Determine if the given vmlinux ELF was built for this architecture,
and if so return the bitness as boolean (True if 32-bit) and a list of
detected ABIs. This is useful to determine which Arch subclass to
instantiate (if any).
'''
pass
@abstractmethod
def matches(self, vmlinux: ELF) -> bool:
'''Determine whether this architecture matches the one of the provided
vmlinux (machine and bits). This is useful as a sanity check, e.g. if
a subclass is instantiated and then we want to use it on an unknown
vmlinux (or multiple ones).
'''
pass
def adjust_abi(self, vmlinux: ELF):
'''Adjust internal ABI-specific attributes that can be ambiguous for a
certain ABI selection (e.g. syscall_table_name) to the correct value
based on the provided vmlinux.
'''
pass
def _preferred_symbol(self, a: Symbol, b: Symbol) -> Optional[Symbol]:
'''Arch-specific choices for preferred_symbol(). Returns None if no
preference can be determined.'''
return None
# NOTE: subclasses should only override _preferred_symbol() above
@final
def preferred_symbol(self, a: Symbol, b: Symbol) -> Symbol:
'''Decide which symbol should be preferred when multiple syscall symbols
point to the same virtual address.
This does not have any meaningful effect on the correctness of the
output, since at the end of the day if multiple symbols point to the
same vaddr, they are in fact the same function, and the location
information will also be correct regardless of which one is picked.
'''
# If only one symbol is compat, pick the most relevant one based on
# self.compat
xa = 'compat' in a.name
xb = 'compat' in b.name
if xa ^ xb:
if self.compat:
return a if xa else b
return b if xa else a
# Deprioritize symbols generated by interprocedural optimization
xa = '.localalias' in a.name
xb = '.localalias' in b.name
if xa ^ xb:
return a if xb else b
# Let subclasses have a say before falling back to generic criteria
p = self._preferred_symbol(a, b)
if p is not None:
return p
if a.name.startswith('sys_'): return a
if b.name.startswith('sys_'): return b
return a if a.name.startswith('compat_sys_') else b
def symbol_is_ni_syscall(self, sym: Symbol) -> bool:
'''Determine whether the symbol name identifies the special
"not implemented" syscall a.k.a. ni_syscall.
There can be multiple ni_syscall implementations with different
prefixes and at different vaddrs (go figure). Make sure to get all of
them (readelf -s vmlinux | grep ni_syscall).
For example on x86 v5.0+:
sys_ni_syscall
__x64_sys_ni_syscall
__ia32_sys_ni_syscall
By default, also avoid ftrace-related _eil_addr_XXX symbols generated
with CONFIG_FTRACE_SYSCALLS=y.
'''
# This generic approach should be good enough
return (
sym.type == 'FUNC'
and anysuffix(sym.name, 'sys_ni_syscall', 'compat_ni_syscall')
# Avoid ftrace-related symbols
and not sym.name.startswith('_eil_addr_')
# Avoid KCFI-related symbols
and not sym.name.startswith('__cfi_')
and not sym.name.startswith('__pfx_')
)
def skip_syscall(self, sc: Syscall) -> bool:
'''Determine whether to skip this syscall.
Kernels compiled with support for multiple ABIs might share the same
syscall table between two or more ABIs, and in such case we want to
filter out syscalls that aren't for the ABI we are currently inspecting.
E.g. on x86-64 the 64 and x32 ABI share the same syscall table
(sys_call_table) before v5.4, which also holds some x32 compat syscalls
that are only available for applications using the x32 ABI.
'''
return False
def _translate_syscall_symbol_name(self, sym_name: str) -> str:
'''Arch-specific choices for translate_syscall_symbol_name().'''
return sym_name
# NOTE: subclasses should only override _translate_syscall_symbol_name() above
@final
def translate_syscall_symbol_name(self, sym_name: str) -> str:
'''Translate symbol name into syscall name, potentially stripping or
replacing arch-specific suffixes/prefixes from the symbol name, in order
to be able to correctly identify a syscall. Overriding this shouldn't be
needed in most cases.
This default implementation just removes prefixes/suffixes that are not
common enough to be identified as common prefixes and stripped
automatically.
'''
return noprefix(self._translate_syscall_symbol_name(sym_name),
'ptregs_sys_', 'ptregs_compat_sys_', '__se_compat_sys_',
'__se_sys_', '__sys_', 'compat_sys_')
def _normalize_syscall_name(self, name: str) -> str:
'''Normalize a syscall name possibly stripping unneeded arch-specific
prefixes/suffixes (e.g., "ia32_", "aarch32_", "oabi_", "ppc_" etc.).
These are prefixes/suffixes that are ACTUALLY PRESENT IN THE SOURCE,
and not just in the symbol name.
'''
return name
# NOTE: subclasses should only override _normalize_syscall_name() above
@final
def normalize_syscall_name(self, name: str) -> str:
'''Normalize a syscall name removing unneeded prefixes and suffixes.
These are prefixes/suffixes that are ACTUALLY PRESENT IN THE SOURCE,
and not just in the symbol name.
'''
# In theory we could also remove the trailing "16" from 16-bit UID
# syscalls (setuid16, chown16, etc.) since it's not the real syscall
# name, but that'd make the output a bit confusing because we'd have
# both 16-bit and 32-bit UID syscalls with the same names, so let's
# avoid it.
#name = nosuffix(name, '16')
# Y2038 patches rename syscalls that deal with time adding a "_time64"
# or "_time32" suffix to distinguish whether they use 64-bit time
# structs (e.g. `struct __kernel_timespec`) or 32-bit time structs (e.g.
# `struct old_timespec32`). The suffix is shortened to just "64" or "32"
# if the syscall name already ends in "time". This suffix is independent
# of the arch, so strip it regardless.
#
# In v5.1 a bunch of 64-bit time syscalls were added to 32-bit archs
# with some exceptions (notably riscv).
#
# SYSCALL_DEFINE5(recvmmsg_time32, ...) -> recvmmsg
# SYSCALL_DEFINE2(clock_adjtime32, ...) -> clock_adjtime
#
name = nosuffix(name, '_time32', '_time64')
if name.endswith('time32') or name.endswith('time64'):
name = name[:-2]
# Some architectures have a "sys32_" or "32_" prefix for... whatever
# annoying reason (e.g. v5.1 MIPS 64bit o32). Stripping it regardless of
# arch seems fine, so do it.
#
# asmlinkage long sys32_sync_file_range(...) -> sync_file_range
# SYSCALL_DEFINE4(32_truncate64, ...) -> truncate64
#
name = noprefix(name, '32_', 'sys32_')
# Some architectures have an "old_" prefix for old syscalls which have
# been superseded by new ones. There is also stuff like "oldumount"
# (v5.18 ARM), but that's actually a different syscall and the kernel
# also has "umount" under a different number, so leave it be.
#
# SYSCALL_DEFINE2(old_getrlimit, ...) -> getrlimit
# SYSCALL_DEFINE1(oldumount, ...) -> oldumount (leave it be)
#
name = noprefix(name, 'old_')
return self._normalize_syscall_name(name)
def _dummy_syscall_code(self, sc: Syscall, vmlinux: ELF) -> Optional[bytes]:
'''Determine whether a syscall has a dummy implementation (e.g. one that
only does `return -ENOSYS/-EINVAL`). If this is the case, return the
machine code of the syscall, otherwise None.
'''
return None
# NOTE: subclasses should only override _dummy_syscall_code() above
@final
def is_dummy_syscall(self, sc: Syscall, vmlinux: ELF,
ni_sym: Optional[Symbol]=None, ni_code: Optional[bytes]=None) -> bool:
'''Determine whether a syscall has a dummy implementation (e.g. one that
only does `return -ENOSYS/-EINVAL`). Try matching the vaddr or code of a
known ni_syscall symbol first, otherwise fall back to arch-specific
logic.
'''
if ni_sym is not None:
if sc.symbol.real_vaddr == ni_sym.real_vaddr:
logging.info('Syscall %s (%s) is not really implemented: '
'vaddr matches %s', sc.name, sc.symbol.name,
ni_sym.name)
return True
# Cache ni_syscall code for speed as this function will definitely
# be called multiple times for the same ni_syscall.
if ni_code is not None:
code = vmlinux.vaddr_read(sc.symbol.real_vaddr, len(ni_code))
if code == ni_code:
logging.info('Syscall %s (%s) is not really implemented: '
'code matches %s', sc.name, sc.symbol.name,
ni_sym.name)
return True
code = self._dummy_syscall_code(sc, vmlinux)
if code is None:
return False
logging.info('Syscall %s (%s) is not really implemented: dummy '
'implementation: %s', sc.name, sc.symbol.name, code.hex())
return True
def adjust_syscall_number(self, number: int) -> int:
'''Adjust the number for the given syscall according to any
arch-specific quirk there might be (e.g. PowerPC with its interleaved
syscall numbers).
'''
return number
def have_syscall_table(self) -> bool:
'''Return whether the standard method of extracting virtual addresses
of syscall functions via syscall table works.'''
return self.syscall_table_name is not None
def extract_syscall_vaddrs(self, vmlinux: ELF) -> Dict[int,int]:
'''Extract virtual addresses of syscall functions. Implemented in case
this isn't just as simple as looking at the addresses in the syscall
table (e.g., there might not be one to begin with).
'''
logging.error("Sorry, don't know how to extract syscall vaddrs for this arch!")
return {}
def extract_esoteric_syscalls(self, vmlinux: ELF) -> List[EsotericSyscall]:
'''Extract weird arch-specific syscalls not in the syscall table: there
isn't much else to do except either manually list these (if they are
always present) or perform static binary analysis.
The returned value is a list of tuples of the form: (number, name,
symbol_name, signature, kconfig_opts).
NOTE: the symbol_name that is returned needs to exist in the given
vmlinux.
'''
return []
def syscall_def_regexp(self, syscall_name: Optional[str]=None) -> Optional[str]:
'''Return a regexp capable of matching syscall definitions using
arch-specific SYSCALL_DEFINEx macros with weird names or arch-specific
asmlinkage function name prefixes. If syscall_name is given, return a
regexp to match this syscall definition exactly, otherwise just a
generic one.
With syscall_name: the returned regexp should match a macro call up to
and **including** the syscall name plus a word boundary or any useful
delimiter after the name to match it completely.
E.g.: r'SYSCALL_DEFINE\\d\\(name\\b' or r'asmlinkage long sys_name\\('.
Without syscall_name: the returned regexp should match the macro call up
to and **including** the first open parenthesis.
E.g.: r'SYSCALL_DEFINE\\d\\(' or r'asmlinkage long sys_\\w+\\('.
'''
# Dev note: the \\ above are because that's a docstring (lol), you
# obviously only need one in the regexp itself with the r'' syntax.
return None
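# Illustrative sketch (not part of Systrack): a standalone version of the
# generic steps performed by normalize_syscall_name() above, without the
# arch-specific hook. The noprefix()/nosuffix() helpers are re-implemented
# here under the ASSUMPTION that they strip the first matching prefix/suffix
# and return the string unchanged otherwise; the real helpers live in
# utils.py and may differ.
```python
def noprefix(s: str, *prefixes: str) -> str:
    # Assumed semantics: strip the first matching prefix, if any
    for p in prefixes:
        if s.startswith(p):
            return s[len(p):]
    return s

def nosuffix(s: str, *suffixes: str) -> str:
    # Assumed semantics: strip the first matching suffix, if any
    for x in suffixes:
        if s.endswith(x):
            return s[:-len(x)]
    return s

def normalize(name: str) -> str:
    # Y2038 suffixes: "_time32"/"_time64", shortened to "32"/"64" when the
    # syscall name already ends in "time"
    name = nosuffix(name, '_time32', '_time64')
    if name.endswith(('time32', 'time64')):
        name = name[:-2]
    # "32_"/"sys32_" prefixes used by some architectures
    name = noprefix(name, '32_', 'sys32_')
    # "old_" prefix (with underscore only: "oldumount" is left untouched)
    return noprefix(name, 'old_')
```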
================================================
FILE: src/systrack/arch/arm.py
================================================
from typing import Tuple, List, Optional
from ..elf import ELF, E_MACHINE, E_FLAGS
from ..kconfig_options import VERSION_INF
from ..syscall import Syscall
from ..type_hints import KernelVersion, EsotericSyscall
from ..utils import VersionedDict, noprefix, nosuffix
from .arch_base import Arch
class ArchArm(Arch):
name = 'arm'
bits32 = True
abi_bits32 = True
syscall_arg_regs = ('r0', 'r1', 'r2', 'r3', 'r4', 'r5', 'r6')
kconfig = VersionedDict((
# kexec_load
((2,6,21), VERSION_INF, 'KEXEC=y' , ['PM_SLEEP_SMP=y', 'MMU=y']),
# seccomp
((2,6,37), (5,10) , 'SECCOMP=y', []),
# No NUMA support => no mbind, migrate_pages, {get,set}_mempolicy
))
def __init__(self, kernel_version: KernelVersion, abi: str, bits32: bool = True):
assert bits32, f'{self.__class__.__name__} is 32-bit only'
super().__init__(kernel_version, abi, True)
assert self.bits32 and self.abi_bits32
assert self.abi in ('eabi', 'oabi')
if self.kernel_version >= (3,7):
# We want a modern-enough processor for which SMP=y by default
self.config_targets = ('multi_v7_defconfig',)
else:
# TODO: not sure which config is best for < 3.7, but defconfig
# definitely isn't that good, we might be missing some syscalls e.g.
# kexec if SMP=n, so warn about it. This is something to think about
# when we get around supporting such kernel versions.
self.config_targets = ('defconfig',)
if self.abi == 'eabi':
# Apparently OABI_COMPAT is on by default on old kernels (e.g. 4.0),
# so disable it if not needed, or we're gonna build a kernel with
# no seccomp.
self.kconfig.add((2,6,16), VERSION_INF, 'OABI_COMPAT=n', [])
self.syscall_num_reg = 'r7'
elif self.abi == 'oabi':
self.syscall_num_base = 0x900000
# No register, number passed as immediate to the SWI instruction
self.syscall_num_reg = 'swi <NR>'
# Building an old OABI-only kernel is annoying. Assume EABI + compat
# OABI (OABI_COMPAT=y) and just build with support for both ABIs.
# FIXME: this will disable the seccomp syscall. Configure for an
# OABI-only kernel here in the future...
self.kconfig.add((2,6,16), VERSION_INF, 'OABI_COMPAT=y', ['AEABI=y', 'THUMB2_KERNEL=n'])
@staticmethod
def match(vmlinux: ELF) -> Optional[Tuple[bool,List[str]]]:
if vmlinux.e_machine != E_MACHINE.EM_ARM:
return None
assert vmlinux.bits32, 'EM_ARM 64-bit? WAT'
if 'sys_oabi_call_table' in vmlinux.symbols:
abis = ['eabi', 'oabi']
else:
# For EABI, e_flags in the ELF header should tell us the EABI
# version (assuming it is set).
if (vmlinux.e_flags & E_FLAGS.EF_ARM_EABI_MASK) != 0:
abis = ['eabi']
else:
abis = ['oabi']
return True, abis
def matches(self, vmlinux: ELF) -> bool:
return vmlinux.bits32 and vmlinux.e_machine == E_MACHINE.EM_ARM
def adjust_abi(self, vmlinux: ELF):
# We could be dealing with an EABI + compat OABI kernel or an
# EABI/OABI-only kernel. In the former case, we'll need to select the
# compat syscall table.
if self.abi == 'oabi' and 'sys_oabi_call_table' in vmlinux.symbols:
# EABI + compat OABI
self.compat = True
self.syscall_table_name = 'sys_oabi_call_table'
else:
# EABI/OABI only
self.compat = False
self.syscall_table_name = 'sys_call_table'
def _translate_syscall_symbol_name(self, sym_name: str) -> str:
# For some reason some syscalls are wrapped in assembly at the entry
# point e.g. sys_sigreturn_wrapper v5.18 arch/arm/kernel/entry-common.S.
# Stripping the "_wrapper" suffix can help locate them through source
# code grepping.
return nosuffix(sym_name, '_wrapper')
def _normalize_syscall_name(self, name: str) -> str:
if self.abi == 'oabi':
# E.g. v5.18 asmlinkage long sys_oabi_connect(...)
name = noprefix(name, 'oabi_')
# E.g. v5.18 asmlinkage long sys_arm_fadvise64_64(...)
return noprefix(name, 'arm_')
def _dummy_syscall_code(self, sc: Syscall, vmlinux: ELF) -> Optional[bytes]:
# Match the following code exactly with either #21 (EINVAL - 1) or #37
# (ENOSYS - 1) as immediate for MVN:
#
# f06f 0015 mvn.w r0, #21
# 4770 bx lr
#
# Taken from sys_fork on v5.0 multi_v7_defconfig with MMU=n.
#
if sc.symbol.size != 6:
return None
code = vmlinux.read_symbol(sc.symbol)
if code in (b'\x6f\xf0\x15\x00\x70\x47', b'\x6f\xf0\x25\x00\x70\x47'):
return code
return None
def extract_esoteric_syscalls(self, vmlinux: ELF) -> List[EsotericSyscall]:
# ARM-specific syscalls that are outside the syscall table, with numbers
# in the range 0x0f0000-0x0fffff for EABI and 0x9f0000-0x9fffff for
# OABI. These are all implemented in arm_syscall()
# (arch/arm/kernel/traps.c) with a switch statement. WEEEIRD!
#
if 'arm_syscall' not in vmlinux.functions:
return []
base = self.syscall_num_base + 0x0f0000
res = [
(base + 1, 'breakpoint', 'arm_syscall', (), None),
(base + 2, 'cacheflush', 'arm_syscall', ('unsigned long start', 'unsigned long end', 'int flags'), None),
(base + 3, 'usr26' , 'arm_syscall', (), None),
(base + 4, 'usr32' , 'arm_syscall', (), None),
(base + 5, 'set_tls' , 'arm_syscall', ('unsigned long val',), None),
]
if self.kernel_version >= (4,15):
res.append((base + 6, 'get_tls', 'arm_syscall', (), None))
return res
def syscall_def_regexp(self, syscall_name: Optional[str]=None) -> Optional[str]:
if self.abi != 'oabi':
return None
if syscall_name is not None:
if syscall_name.startswith('sys_oabi_'):
return rf'\basmlinkage\s*(unsigned\s+)?\w+\s*{syscall_name}\s*\('
return rf'\basmlinkage\s*(unsigned\s+)?\w+\s*sys_oabi_{syscall_name}\s*\('
return r'\basmlinkage\s*(unsigned\s+)?\w+\s*sys_oabi_\w+\s*\('
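# Illustrative sketch (not part of Systrack): the OABI patterns returned by
# syscall_def_regexp() above, copied into a free function and applied to a
# sample definition line. oabi_def_regexp() and SAMPLE are hypothetical names;
# the sample line is made up for the demo, modeled on the
# "asmlinkage long sys_oabi_connect(...)" form mentioned in the comments above.
```python
import re
from typing import Optional

def oabi_def_regexp(syscall_name: Optional[str] = None) -> str:
    if syscall_name is not None:
        if syscall_name.startswith('sys_oabi_'):
            return rf'\basmlinkage\s*(unsigned\s+)?\w+\s*{syscall_name}\s*\('
        return rf'\basmlinkage\s*(unsigned\s+)?\w+\s*sys_oabi_{syscall_name}\s*\('
    return r'\basmlinkage\s*(unsigned\s+)?\w+\s*sys_oabi_\w+\s*\('

# Made-up source line for demonstration purposes
SAMPLE = 'asmlinkage long sys_oabi_connect(int fd, struct sockaddr __user *addr, int addrlen)'

def matches(syscall_name: Optional[str] = None) -> bool:
    return re.search(oabi_def_regexp(syscall_name), SAMPLE) is not None
```
# With a name, the pattern has to consume the whole identifier up to the
# opening parenthesis, so a partial name such as "conn" does not match.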
================================================
FILE: src/systrack/arch/arm64.py
================================================
from typing import Tuple, List, Optional
from ..elf import Symbol, ELF, E_MACHINE
from ..kconfig_options import VERSION_INF
from ..syscall import Syscall
from ..type_hints import KernelVersion
from ..utils import VersionedDict, noprefix
from .arch_base import Arch
class ArchArm64(Arch):
name = 'arm64'
bits32 = False
syscall_num_reg = 'w8'
syscall_arg_regs = ('x0', 'x1', 'x2', 'x3', 'x4', 'x5')
kconfig = VersionedDict((
# Enable aarch32 ABI regardless, should be =y by default, but better safe than sorry
((3,7) , VERSION_INF, 'COMPAT=y' , ['ARM64_4K_PAGES=y', 'EXPERT=y']),
# kexec[_file]_load
((4,8) , VERSION_INF, 'KEXEC=y' , ['PM_SLEEP_SMP=y']),
((5,0) , VERSION_INF, 'KEXEC_FILE=y' , []),
# seccomp
((3,19), (5,10) , 'SECCOMP=y' , []),
# mbind, migrate_pages, {get,set}_mempolicy
((4,7) , VERSION_INF, 'NUMA=y' , []),
# pkey syscalls, technically defaults to =y
((6,12), VERSION_INF, 'ARM64_POE=y' , []),
# map_shadow_stack (needs UPROBES=n, which is obtained by setting UPROBE_EVENTS=n)
((6,13), VERSION_INF, 'UPROBE_EVENTS=n', []),
((6,13), VERSION_INF, 'ARM64_GCS=y' , ['UPROBES=n']),
))
kconfig_syscall_deps = VersionedDict((
((6,13), VERSION_INF, 'map_shadow_stack', 'ARM64_GCS'),
((6,12), VERSION_INF, 'pkey_alloc' , 'ARM64_POE'),
((6,12), VERSION_INF, 'pkey_free' , 'ARM64_POE'),
((6,12), VERSION_INF, 'pkey_mprotect' , 'ARM64_POE'),
))
def __init__(self, kernel_version: KernelVersion, abi: str, bits32: bool = False):
assert not bits32, f'{self.__class__.__name__} is 64-bit only'
assert kernel_version >= (3,7), 'Linux only supports arm64 from v3.7'
super().__init__(kernel_version, abi, False)
assert not self.bits32
assert self.abi in ('aarch64', 'aarch32')
if self.abi == 'aarch32':
self.compat = True
self.abi_bits32 = True
self.syscall_table_name = 'compat_sys_call_table'
@staticmethod
def match(vmlinux: ELF) -> Optional[Tuple[bool,List[str]]]:
if vmlinux.e_machine != E_MACHINE.EM_AARCH64:
return None
assert not vmlinux.bits32, 'EM_AARCH64 32-bit? WAT'
if 'compat_sys_call_table' in vmlinux.symbols:
abis = ['aarch64', 'aarch32']
else:
abis = ['aarch64']
return False, abis
def matches(self, vmlinux: ELF) -> bool:
return not vmlinux.bits32 and vmlinux.e_machine == E_MACHINE.EM_AARCH64
def _preferred_symbol(self, a: Symbol, b: Symbol) -> Optional[Symbol]:
# See commit 4378a7d4be30ec6994702b19936f7d1465193541
if a.name.startswith('__arm64_'):
return a
if b.name.startswith('__arm64_'):
return b
return None
def _normalize_syscall_name(self, name: str) -> str:
# E.g. v5.18 COMPAT_SYSCALL_DEFINE6(aarch32_mmap2, ...)
# E.g. v5.2-v6.13+ SYSCALL_DEFINE1(arm64_personality, ...)
return noprefix(name, 'aarch32_', 'arm64_')
def _dummy_syscall_code(self, sc: Syscall, vmlinux: ELF) -> Optional[bytes]:
# Match the following code exactly with either -22 (-EINVAL) or -38
# (-ENOSYS) as immediate for MOV:
#
# 928004a0 mov x0, #0xffffffffffffffda // #-38
# d65f03c0 ret
#
# Taken from __arm64_sys_pkey_alloc on v6.11.
#
if sc.symbol.size > 8 or sc.symbol.size == 4:
return None
assert not vmlinux.big_endian
code = vmlinux.read_symbol(sc.symbol)
if not code.endswith(b'\xc0\x03\x5f\xd6'): # ret
return None
# MOVN <Xd>, #<imm>{, LSL #<shift>}
mov = int.from_bytes(code[:4], 'little')
if mov & 0xff80001f != 0x92800000:
return None
# hw (shift / 16) is encoded in bits [22:21], imm16 in bits [20:5]
hw = (mov >> 21) & 0x3
imm = ~(((mov >> 5) & 0xffff) << (hw * 16))
if imm == -38 or imm == -22:
return code
return None
================================================
FILE: src/systrack/arch/mips.py
================================================
from typing import Tuple, List, Optional
from ..elf import ELF, E_MACHINE
from ..kconfig_options import VERSION_ZERO, VERSION_INF
from ..syscall import Syscall
from ..type_hints import KernelVersion
from ..utils import VersionedDict, anyprefix, noprefix
from .arch_base import Arch
class ArchMips(Arch):
name = 'mips'
syscall_num_reg = 'v0'
kconfig = VersionedDict((
# kexec[_file]_load
((2,6,20), (3,9) , 'KEXEC=y' , ['EXPERIMENTAL=y']),
((3,9) , VERSION_INF, 'KEXEC=y' , []),
# seccomp
((2,6,15), (5,10) , 'SECCOMP=y', []),
))
def __init__(self, kernel_version: KernelVersion, abi: str, bits32: bool = False):
super().__init__(kernel_version, abi, bits32)
assert self.abi in ('o32', 'n32', 'n64')
if self.abi == 'o32':
self.abi_bits32 = True
# Interestingly, man 2 syscall states: "The mips/o32 system call
# convention passes arguments 5 through 8 on the user stack".
# What syscall takes 8 arguments on MIPS o32? WTF.
self.syscall_num_base = 4000
self.syscall_arg_regs = ('a0', 'a1', 'a2', 'a3', 'stack', 'stack', 'stack', 'stack')
if not self.bits32:
self.syscall_table_name = 'sys32_call_table'
else:
self.abi_bits32 = False
self.syscall_arg_regs = ('a0', 'a1', 'a2', 'a3', 'a4', 'a5')
if self.abi == 'n64':
self.syscall_num_base = 5000
else: # n32
self.syscall_num_base = 6000
self.syscall_table_name = 'sysn32_call_table'
if self.bits32:
# MIPS 32bit means o32 ABI.
assert self.abi == 'o32'
# Just to be clear: for 32-bit we are ok with defconfig
self.config_targets = ('defconfig',)
self.kconfig.add(VERSION_ZERO, VERSION_INF, '32BIT=y', [])
self.kconfig.add(VERSION_ZERO, VERSION_INF, '64BIT=n', [])
# Select CPU release. It does not seem to matter much, so select R2,
# which has the best kernel version compatibility (along with R1).
# These are a multiple choice menu, so better set all of them.
self.kconfig.add((2,6,15), VERSION_INF, 'CPU_MIPS32_R1=n', [])
self.kconfig.add((2,6,15), VERSION_INF, 'CPU_MIPS32_R2=y', ['SYS_HAS_CPU_MIPS32_R2=y'])
self.kconfig.add((4,0) , VERSION_INF, 'CPU_MIPS32_R6=n', [])
else:
self.compat = self.abi != 'n64'
# Grab SGI IP27 (Origin200/2000), which apparently is one of the
# only two MIPS machines with NUMA support along with Loongson64
# (loongson3_defconfig), as the latter is more of a pain in the ass
# to build. No need to select CPU release for this, it's R10000.
self.config_targets = ('ip27_defconfig',)
self.kconfig.add(VERSION_ZERO, VERSION_INF, '32BIT=n', [])
self.kconfig.add(VERSION_ZERO, VERSION_INF, '64BIT=y', [])
# 32-bit has no NUMA support (apparently), but 64-bit does and
# ip27_defconfig should include it. Make sure an error is raised in
# case of no NUMA. Needed for mbind, migrate_pages,
# {get,set}_mempolicy.
self.kconfig.add(VERSION_ZERO, VERSION_INF, 'NUMA=y', ['SYS_SUPPORTS_NUMA=y'])
# MIPS 64bit supports all ABIs: 32bit o32, 64bit n32, 64bit n64.
# Enable all of them regardless, we will be able to extract the
# right syscall table anyway.
self.kconfig.add(VERSION_ZERO, VERSION_INF, 'MIPS32_O32=y', [])
self.kconfig.add(VERSION_ZERO, VERSION_INF, 'MIPS32_N32=y', [])
@staticmethod
def match(vmlinux: ELF) -> Optional[Tuple[bool,List[str]]]:
if vmlinux.e_machine != E_MACHINE.EM_MIPS:
return None
if vmlinux.bits32:
abis = ['o32']
else:
abis = ['n64']
if 'sys32_call_table' in vmlinux.symbols:
abis.append('o32')
if 'sysn32_call_table' in vmlinux.symbols:
abis.append('n32')
return vmlinux.bits32, abis
def matches(self, vmlinux: ELF) -> bool:
return (
vmlinux.e_machine == E_MACHINE.EM_MIPS
and vmlinux.bits32 == self.bits32
)
def _normalize_syscall_name(self, name: str) -> str:
# E.G. v5.1 asmlinkage int sysm_pipe(void) for weird historical reasons
# E.G. v5.18 SYSCALL_DEFINE6(mips_mmap, ...)
# E.G. v5.0-6.13+ asmlinkage long mipsmt_sys_sched_setaffinity(...)
return noprefix(name, 'sysm_', 'mips_', 'mipsmt_sys_')
def _dummy_syscall_code(self, sc: Syscall, vmlinux: ELF) -> Optional[bytes]:
# Match the following code exactly with either -22 (-EINVAL) or -89
# (-ENOSYS, which of course is different than normal on MIPS) as
# immediate for LI:
#
# 03e00008 jr ra
# 2402ffa7 li v0,-89
#
# Taken from __se_sys_cachectl on v6.9 64-bit ip27_defconfig.
#
if sc.symbol.size != 8:
return None
code = vmlinux.read_symbol(sc.symbol)
if vmlinux.big_endian:
if not code.startswith(b'\x03\xe0\x00\x08\x24\x02'):
return None
imm = int.from_bytes(code[6:], 'big', signed=True)
else:
if not (code.startswith(b'\x08\x00\xe0\x03') and code.endswith(b'\x02\x24')):
return None
imm = int.from_bytes(code[4:6], 'little', signed=True)
if imm == -22 or imm == -89:
return code
return None
def syscall_def_regexp(self, syscall_name: Optional[str]=None) -> Optional[str]:
# Absolutely insane old-style prefixes on MIPS...
exps = []
if syscall_name is not None:
if anyprefix(syscall_name, 'sysm_', 'mipsmt_sys_'):
exps.append(rf'\basmlinkage\s*(unsigned\s+)?\w+\s*{syscall_name}\s*\(')
else:
exps.append(rf'\basmlinkage\s*(unsigned\s+)?\w+\s*(sysm|mipsmt_sys)_{syscall_name}\s*\(')
if self.abi == 'n32':
if anyprefix(syscall_name, 'sysn32_'):
exps.append(rf'\basmlinkage\s*(unsigned\s+)?\w+\s*{syscall_name}\s*\(')
else:
exps.append(rf'\basmlinkage\s*(unsigned\s+)?\w+\s*sysn32_{syscall_name}\s*\(')
else:
exps.append(r'\basmlinkage\s*(unsigned\s+)?\w+\s*(sysm|mipsmt_sys)_\w+\s*\(')
if self.abi == 'n32':
exps.append(r'\basmlinkage\s*(unsigned\s+)?\w+\s*sysn32_\w+\s*\(')
return '|'.join(exps)
================================================
FILE: src/systrack/arch/powerpc.py
================================================
from struct import iter_unpack
from typing import Tuple, List, Optional
from operator import itemgetter
from ..elf import Symbol, ELF, E_MACHINE
from ..kconfig_options import VERSION_ZERO, VERSION_INF
from ..syscall import Syscall
from ..type_hints import KernelVersion, EsotericSyscall
from ..utils import VersionedDict, noprefix
from .arch_base import Arch
class ArchPowerPC(Arch):
name = 'powerpc'
syscall_num_base = 0
syscall_num_reg = 'r0'
# NOTE: We treat "SPU" as an ABI, even though it's not a real ABI. It stands
# for "Synergistic Processor Unit", one of the CPUs composing a Cell
# processor: https://en.wikipedia.org/wiki/Cell_(processor). SPUs are quite
# peculiar: as the comment in arch/powerpc/platforms/cell/spu_callbacks.c
# (v5.0) explains, they can only use a subset of the syscalls defined for
# the "64" ABI.
# NOTE: we are assuming to have PPC_BOOK3S=y (and therefore PPC_BOOK3S_32=y
# for 32-bit or PPC_BOOK3S_64=y for 64-bit)
kconfig = VersionedDict((
# These are needed for RELOCATABLE=n, we do not really need to list
# dependencies since we are disabling them.
((2,6,30) , VERSION_INF, 'PPC_OF_BOOT_TRAMPOLINE=n', []),
((2,6,16) , (2,6,27) , 'CRASH_DUMP=n' , []),
((2,6,27) , (4,12) , 'CRASH_DUMP=n' , []),
((4,12) , VERSION_INF, 'CRASH_DUMP=n' , []),
((3,4) , VERSION_INF, 'FA_DUMP=n' , []),
# Needs to be set here too because arch-specific kconfigs are applied
# after those listed in KCONFIG_DEBUGGING (kconfig_options.py)
(VERSION_ZERO, VERSION_INF, 'RELOCATABLE=n', ['PPC_OF_BOOT_TRAMPOLINE=n', 'CRASH_DUMP=n', 'FA_DUMP=n']),
# kexec_load
((2,6,15) , (3,9) , 'KEXEC=y', ['PPC_BOOK3S=y', 'EXPERIMENTAL=y']),
((3,9) , VERSION_INF, 'KEXEC=y', ['PPC_BOOK3S=y']),
# seccomp
((2,6,15) , (5,10) , 'SECCOMP=y', ['PROC_FS=y']),
# rtas
((2,6,15) , VERSION_INF, 'PPC_RTAS=y', []),
))
# FIXME: more like a curiosity, but why the hell do migrate_pages and
# move_pages look like they depend on MIGRATION and not necessarily on NUMA,
# but then aren't available for PPC 32-bit which has NUMA=n???
kconfig_syscall_deps = VersionedDict((
(VERSION_ZERO, VERSION_INF, 'pkey_alloc' , 'PPC_MEM_KEYS'),
(VERSION_ZERO, VERSION_INF, 'pkey_free' , 'PPC_MEM_KEYS'),
(VERSION_ZERO, VERSION_INF, 'pkey_mprotect', 'PPC_MEM_KEYS'),
))
def __init__(self, kernel_version: KernelVersion, abi: str, bits32: bool = False):
super().__init__(kernel_version, abi, bits32)
assert self.abi in ('ppc32', 'ppc64', 'spu')
# The "powerpc" directory was added under arch in v2.6.15 and it weirdly
# coexisted with "ppc" until v2.6.27, when the latter was removed.
assert self.kernel_version >= (2,6,15), 'kernel too old, sorry!'
if self.abi == 'spu':
# spu_syscall_table only exists since v2.6.16, I have no idea how
# things were handled before then. This is a rather old kernel
# version, we'll worry about it in the future (if ever).
assert self.kernel_version >= (2,6,16), 'kernel too old, sorry!'
if self.abi == 'ppc32':
self.syscall_arg_regs = ('r3', 'r4', 'r5', 'r6', 'r7', 'r8', 'r9')
self.abi_bits32 = True
else:
self.syscall_arg_regs = ('r3', 'r4', 'r5', 'r6', 'r7', 'r8')
self.abi_bits32 = False
if self.bits32:
self.compat = False
self.uses_function_descriptors = False
self.syscall_table_name = 'sys_call_table'
# PPC_BOOK3S_32 was introduced in v2.6.31. We'll worry about
# older kernels in the future (if ever).
assert self.kernel_version >= (2,6,31), 'kernel too old, sorry!'
# Apparently there isn't a nice 32-bit defconfig and one needs
# to manually disable 64-bit??? What in tarnation >:( lame!
# There's ppc_defconfig from v5.2, which also takes half the time to
# build so it'd be nice to use... but using it as is without tweaks
# compiles a kernel without memfd_create.
self.config_targets = ('ppc64_defconfig',)
self.kconfig.add(VERSION_ZERO, VERSION_INF, 'PPC64=n', [])
self.kconfig.add(VERSION_ZERO, VERSION_INF, 'PPC_BOOK3S_32=y', [])
else:
self.compat = self.abi != 'ppc64'
self.abi_bits32 = self.abi == 'ppc32'
self.config_targets = ('ppc64_defconfig',)
self.uses_function_descriptors = True
if self.abi == 'spu':
self.syscall_table_name = 'spu_syscall_table'
elif self.abi == 'ppc32' and self.kernel_version >= (5,0):
# 32-bit and 64-bit syscalls before v5.0 share the same table
# (see skip_syscall() below), they are split in two tables only
# from v5.0.
self.syscall_table_name = 'compat_sys_call_table'
# PowerPC64 supports all ABIs: 64, 32, "spu". Enable all of them, we
# will be able to extract the right syscall table regardless.
self.kconfig.add((2,6,15), (5,7) , 'COMPAT=y', ['PPC64=y'])
self.kconfig.add((5,7) , VERSION_INF, 'COMPAT=y', ['PPC64=y', 'CPU_LITTLE_ENDIAN=n', 'CC_IS_CLANG=n'])
# Needed for NUMA=y
self.kconfig.add((2,6,15), (2,6,22) , 'PPC_PSERIES=y', ['PPC64=y', 'PPC_MULTIPLATFORM=y']),
self.kconfig.add((2,6,22), VERSION_INF, 'PPC_PSERIES=y', ['PPC64=y', 'PPC_BOOK3S=y']),
# mbind, migrate_pages, {get,set}_mempolicy
# NOTE: in theory depends on (PPC_PSERIES || PPC_POWERNV) after
# 5.10, but we are assuming PPC_PSERIES=y
self.kconfig.add((2,6,15), VERSION_INF, 'NUMA=y', ['PPC64=y', 'SMP=y', 'PPC_PSERIES=y'])
# kexec_file_load
self.kconfig.add((4,10) , VERSION_INF, 'KEXEC_FILE=y', ['PPC64=y', 'CRYPTO=y', 'CRYPTO_SHA256=y'])
# Needed for PPC_SUBPAGE_PROT=y
# NOTE: in theory depends on (44x || PPC_BOOK3S_64), but we are
# assuming PPC_BOOK3S_64=y
self.kconfig.add((2,6,15), VERSION_INF, 'PPC_64K_PAGES=y', ['PPC_BOOK3S_64=y'])
# subpage_prot (ppc only, 64-bit only)
self.kconfig.add((2,6,25), (5,17) , 'PPC_SUBPAGE_PROT=y', ['PPC_64K_PAGES=y', 'PPC_BOOK3S_64=y'])
self.kconfig.add((5,17) , VERSION_INF, 'PPC_SUBPAGE_PROT=y', ['PPC_64K_PAGES=y', 'PPC_64S_HASH_MMU=y'])
# pkey_alloc, pkey_free, pkey_mprotect
self.kconfig.add((4,16) , (5,17) , 'PPC_MEM_KEYS=y', ['PPC_BOOK3S_64=y'])
self.kconfig.add((5,17) , VERSION_INF, 'PPC_MEM_KEYS=y', ['PPC_BOOK3S_64=y', 'PPC_64S_HASH_MMU=y'])
# switch_endian (esoteric fast version)
self.kconfig.add((4,15) , (6,12) , 'PPC_FAST_ENDIAN_SWITCH=y', []),
# spu_run, spu_create
self.kconfig.add((2,6,16), VERSION_INF, 'SPU_FS=y' , ['PPC_CELL=y', 'COREDUMP=y']),
self.kconfig.add((2,6,18), VERSION_INF, 'SPU_BASE=y', []),
@staticmethod
def match(vmlinux: ELF) -> Optional[Tuple[bool,List[str]]]:
if vmlinux.e_machine == E_MACHINE.EM_PPC:
assert vmlinux.bits32, 'EM_PPC 64-bit? WAT'
elif vmlinux.e_machine == E_MACHINE.EM_PPC64:
assert not vmlinux.bits32, 'EM_PPC64 32-bit? WAT'
else:
return None
if vmlinux.bits32:
abis = ['ppc32']
else:
abis = ['ppc64']
# v5.0+ has a separate compat table and can be built with COMPAT=n.
# Before v5.0 64-bit and 32-bit syscalls share a single table and
# apparently it's always COMPAT=y. If none of these match, we must
# be dealing with a v5.0+ COMPAT=n kernel, which is the only case
# where there's no 32-bit syscall table.
if 'compat_sys_call_table' in vmlinux.symbols \
or 'compat_sys_execve' in vmlinux.symbols \
or '.compat_sys_execve' in vmlinux.symbols:
abis.append('ppc32')
if 'spu_syscall_table' in vmlinux.symbols:
abis.append('spu')
return vmlinux.bits32, abis
def matches(self, vmlinux: ELF) -> bool:
# Linux PPC 32-bit should be big-endian only
assert not vmlinux.bits32 or vmlinux.big_endian, 'Little-endian PowerPC 32-bit kernel? WAT'
return (
vmlinux.e_machine == (E_MACHINE.EM_PPC64, E_MACHINE.EM_PPC)[self.bits32]
and vmlinux.bits32 == self.bits32
)
def _preferred_symbol(self, a: Symbol, b: Symbol) -> Optional[Symbol]:
# Function descriptors take the "nice" symbol name, while the actual
# functions have a goofy dot prefix.
adot = a.name.startswith('.')
bdot = b.name.startswith('.')
if adot or bdot:
if not adot: return b
if not bdot: return a
if a.name.startswith('.sys_'): return a
if b.name.startswith('.sys_'): return b
return a if a.name.startswith('.compat_sys_') else b
return None
def skip_syscall(self, sc: Syscall) -> bool:
if self.bits32 or self.kernel_version >= (5,0):
return False
# On PowerPC 64-bit before v5.0, 64-bit and 32-bit syscalls are
# *interleaved* in the same syscall table, with 64-bit syscalls at even
# indexes. This means that we need to ignore half the syscall table! :')
if self.abi == 'ppc32':
return sc.index % 2 == 0
# 'ppc64' or 'spu'
return sc.index % 2 == 1
def _translate_syscall_symbol_name(self, sym_name: str) -> str:
return noprefix(sym_name, '.sys_', '.')
def _normalize_syscall_name(self, name: str) -> str:
return noprefix(name, 'ppc64_', 'ppc32_', 'ppc_')
def _dummy_syscall_code(self, sc: Syscall, vmlinux: ELF) -> Optional[bytes]:
# Check for `li r3,-ENOSYS; blr` optionally accompanied by some other
# known non-branching instructions along the way:
#
# - {mflr,mtlr} r0
# - {stw,std,lwz,ld} r0,X(r1)
# - matching stwu/stdu and addi on r1 (stack pointer)
# - bl (to call _mcount() or other func, which *has* to return)
# - nop (ori 0,0,0)
#
# TODO: relies on the symbol having a valid size (!= 0), improve?
if sc.symbol.size < 8:
return None
code = vmlinux.read_symbol(sc.symbol)
r1_dec = r1_inc = None
insns = []
for insn in map(itemgetter(0), iter_unpack('<>'[vmlinux.big_endian] + 'L', code)):
hi = insn >> 16
# mflr r0 / mtlr r0 / nop (ori 0,0,0)
if insn in (0x7c0802a6, 0x7c0803a6, 0x60000000):
continue
# bl X
if (hi >> 8) == 0x4b:
continue
# stw r0,X(r1) / std r0,X(r1) / lwz r0,X(r1) / ld r0,X(r1)
if hi in (0x9001, 0xf801, 0xe801, 0x8001):
continue
# stdu r1,X(r1)
if insn & 0xffff0003 == 0xf8210001:
r1_dec = 0x10000 - (insn & 0xfffc)
continue
# stwu r1,X(r1)
if hi in (0x9421, 0xf821):
r1_dec = 0x10000 - (insn & 0xffff)
continue
# addi r1,r1,X (after stwu/stdu)
if hi == 0x3821 and r1_dec is not None:
r1_inc = insn & 0xffff
continue
if len(insns) > 2:
return None
insns.append(insn)
# Stack pointer decrement/increment must match
if (r1_dec is not None or r1_inc is not None) and r1_dec != r1_inc:
return None
# li r3,-ENOSYS; blr
if insns == [0x3860ffda, 0x4e800020]:
return code
return None
def adjust_syscall_number(self, number: int) -> int:
if self.bits32 or self.kernel_version >= (5,0):
return number
# See comment in skip_syscall() above.
return number // 2
def extract_esoteric_syscalls(self, vmlinux: ELF) -> List[EsotericSyscall]:
# This is currently only used for fast switch_endian, which is only
# implemented for ppc64 and was killed in v6.12. Save some time here.
if self.abi != 'ppc64' or self.kernel_version >= (6,12):
return []
# The switch_endian syscall has a "fast" version implemented with a
# branch at syscall entry point (arch/powerpc/kernel/exceptions-64s.S).
#
# The symbol to look at is exc_real_0xc00_system_call, where we should
# find `cmpdi r0,0x1ebe` followed by a `beq-` to code that updates the
# saved LE bit in SRR1. The same code has been there since at least
# v2.6.31.
#
# 2c 20 1e be cmpdi r0,7870
# 41 c2 00 20 beq X
# ...
# 7d 9b 02 a6 X: mfsrr1 r12
# 69 8c 00 01 xori r12,r12,1
# 7d 9b 03 a6 mtsrr1 r12
# 4c 00 00 24 rfid
#
# This "fast" implementation depends on PPC_FAST_ENDIAN_SWITCH from
# v4.15 onwards. It was removed in v6.12. Old kernels only had this fast
# version and no switch_endian syscall in the syscall table, which was
# added in v4.1 (529d235a0e190ded1d21ccc80a73e625ebcad09b).
#
# FIXME: on older kernels (< v5.0) the associated syscall entry symbol
# may be different.
#
exc = vmlinux.symbols.get('exc_real_0xc00_system_call')
if exc is None:
return []
# Unfortunately we cannot rely on the symbol having a good size, so just
# find the next symbol after it and use it as a boundary.
boundary = vmlinux.next_symbol(exc)
boundary = boundary.vaddr if boundary else exc.vaddr + 0x80
code = vmlinux.vaddr_read(exc.vaddr, boundary - exc.vaddr)
insns = iter_unpack('<>'[vmlinux.big_endian] + 'L', code)
insns = list(map(itemgetter(0), insns))
try:
idx_cmpdi = insns.index(0x2c201ebe)
beq = insns[idx_cmpdi + 1]
except (IndexError, ValueError):
return []
idx_mfsrr1 = idx_cmpdi + 1 + (beq & 0xffff) // 4
if idx_mfsrr1 >= len(insns) or insns[idx_mfsrr1] != 0x7d9b02a6:
return []
# Match the branch after the cmpdi. Technically it should be a `beq-`
# (beq with not taken branch prediction), but also accept others.
# beq- beq+ beq beq
if (beq >> 16) not in (0x41c2, 0x41e2, 0x4182, 0x41a2):
return []
try:
idx_xori = insns.index(0x698c0001, idx_mfsrr1 + 1)
idx_mtsrr1 = insns.index(0x7d9b03a6, idx_xori + 1)
insns.index(0x4c000024, idx_mtsrr1 + 1)
except ValueError:
return []
# We have the syscall
kconf = 'PPC_FAST_ENDIAN_SWITCH' if self.kernel_version >= (4,15) else None
return [(0x1ebe, 'switch_endian', exc.name, (), kconf)]
def syscall_def_regexp(self, syscall_name: Optional[str]=None) -> Optional[str]:
if self.abi != 'ppc32':
return None
if syscall_name is not None:
return rf'\bPPC32_SYSCALL_DEFINE\d\s*\({syscall_name}\b'
return r'\bPPC32_SYSCALL_DEFINE\d\s*\('
================================================
FILE: src/systrack/arch/riscv.py
================================================
from typing import Tuple, List, Optional
from ..elf import Symbol, ELF, E_MACHINE
from ..kconfig_options import VERSION_INF
from ..type_hints import KernelVersion
from ..utils import VersionedDict
from .arch_base import Arch
class ArchRiscV(Arch):
name = 'riscv'
syscall_num_reg = 'a7'
syscall_arg_regs = ('a0', 'a1', 'a2', 'a3', 'a4', 'a5')
kconfig = VersionedDict((
# kexec_load
((5,13), VERSION_INF, 'KEXEC=y', ['MMU=y']),
# seccomp
((5,5) , (5,10) , 'SECCOMP=y', []),
# mbind, {migrate,move}_pages, {get,set}_mempolicy
((5,12), VERSION_INF, 'NUMA=y', ['SMP=y', 'MMU=y']),
))
def __init__(self, kernel_version: KernelVersion, abi: str, bits32: bool=False):
super().__init__(kernel_version, abi, bits32)
assert kernel_version >= (4,15), 'Linux only supports RISC-V from v4.15'
assert self.abi in ('rv32', 'rv64')
if self.abi == 'rv32':
self.abi_bits32 = True
if not self.bits32:
assert self.kernel_version >= (5,19), 'Linux only supports compat RV32 from v5.19'
self.compat = True
self.syscall_table_name = 'compat_sys_call_table'
if self.bits32:
if self.kernel_version >= (6,8):
# rv32_defconfig removed in v6.8
self.config_targets = ('defconfig', '32-bit.config')
elif self.kernel_version >= (5,1):
self.config_targets = ('rv32_defconfig',)
else:
self.config_targets = ('defconfig',)
# No "easy" make target for 32-bit before 5.1. Need manual config.
self.kconfig.add((4,15), (5,1) , '32BIT=y', [])
self.kconfig.add((4,15), (5,1) , '64BIT=n', [])
self.kconfig.add((4,15), (5,1) , 'ARCH_RV32I=y', [])
self.kconfig.add((4,15), (5,1) , 'ARCH_RV64I=n', [])
self.kconfig.add((4,15), (4,18), 'CPU_SUPPORTS_32BIT_KERNEL=y', [])
self.kconfig.add((4,15), (4,18), 'CPU_SUPPORTS_64BIT_KERNEL=n', [])
else:
self.config_targets = ('defconfig',)
# Enable compat ABI regardless (should be =y by default, but better
# safe than sorry)
self.kconfig.add((5,19), VERSION_INF, 'COMPAT=y', ['64BIT=y', 'MMU=y']),
# kexec_file_load
self.kconfig.add((5,19), (6,2) , 'KEXEC_FILE=y', ['64BIT=y'])
self.kconfig.add((6,2) , (6,7) , 'KEXEC_FILE=y', ['64BIT=y','MMU=y'])
self.kconfig.add((6,7) , VERSION_INF, 'KEXEC_FILE=y', ['64BIT=y'])
@staticmethod
def match(vmlinux: ELF) -> Optional[Tuple[bool,List[str]]]:
if vmlinux.e_machine != E_MACHINE.EM_RISCV:
return None
if vmlinux.bits32:
abis = ['rv32']
else:
abis = ['rv64']
if 'compat_sys_call_table' in vmlinux.symbols:
abis.append('rv32')
return vmlinux.bits32, abis
def matches(self, vmlinux: ELF) -> bool:
return (
vmlinux.e_machine == E_MACHINE.EM_RISCV
and vmlinux.bits32 == self.bits32
)
def _preferred_symbol(self, a: Symbol, b: Symbol) -> Optional[Symbol]:
if a.name.startswith('__riscv_'):
return a
if b.name.startswith('__riscv_'):
return b
return None
================================================
FILE: src/systrack/arch/s390.py
================================================
import re
import struct
from typing import Tuple, List, Optional, Dict
from ..elf import Symbol, ELF, E_MACHINE
from ..kconfig_options import VERSION_INF
from ..type_hints import KernelVersion
from ..utils import VersionedDict, noprefix
from .arch_base import Arch
class ArchS390(Arch):
name = 's390'
syscall_table_name = 'sys_call_table'
syscall_num_reg = 'r1'
syscall_arg_regs = ('r2', 'r3', 'r4', 'r5', 'r6', 'r7')
kconfig = VersionedDict((
# TODO: validate and see which of these (if any) may make sense to
# move to the global kconfig options.
# 32-bit abi
((2,6,12), VERSION_INF, 'COMPAT=y', []),
# error: invalid hard register usage between output operands
((2,6,19), VERSION_INF, 'ZCRYPT=n', []),
# Error: junk at end of line
((2,6,37), VERSION_INF, 'JUMP_LABEL=n', []),
# s390-specific pci syscalls implemented in
# commit cd24834130ac ("s390/pci: base support")
((3,8), VERSION_INF, 'PCI=y', []),
# misaligned symbol `__nospec_call_start'
((4,16), VERSION_INF, 'EXPOLINE=n', []),
# load BTF from vmlinux: Invalid argument
((5,16), (6,0), 'DEBUG_INFO_BTF=n', []),
))
def __init__(self, kernel_version: KernelVersion, abi: str, bits32: bool = False):
assert not bits32, f'{self.__class__.__name__} is 64-bit only'
super().__init__(kernel_version, abi, False)
assert self.abi in ('s390', 's390x')
if self.abi == 's390':
self.compat = True
self.abi_bits32 = True
self.syscall_table_name = 'sys_call_table_emu'
@staticmethod
def match(vmlinux: ELF) -> Optional[Tuple[bool,List[str]]]:
if vmlinux.e_machine != E_MACHINE.EM_S390:
return None
assert not vmlinux.bits32, 'EM_S390 32-bit? WAT'
if 'sys_call_table_emu' in vmlinux.symbols:
abis = ['s390', 's390x']
else:
abis = ['s390x']
return False, abis
def matches(self, vmlinux: ELF) -> bool:
return not vmlinux.bits32 and vmlinux.e_machine == E_MACHINE.EM_S390
def _preferred_symbol(self, a: Symbol, b: Symbol) -> Optional[Symbol]:
# See commit aa0d6e70d3b34e710a6a57a53a3096cb2e0ea99f
if a.name.startswith('__s390x_'):
return a
if b.name.startswith('__s390x_'):
return b
return None
def _translate_syscall_symbol_name(self, sym_name: str) -> str:
if self.abi == 's390':
# sys_ prefix is used for compat syscalls with 0 arguments, which
# do not need wrapping. It is not common enough to be detected by
# common_syscall_symbol_prefixes().
return noprefix(sym_name, 'sys_')
return sym_name
def _normalize_syscall_name(self, name: str) -> str:
# Unlike most other archs where there is an arch-specific prefix for a
# significant number of (or nearly all) syscalls, in S390 this prefix
# is actually part of the syscall name for some arch-specific syscalls.
# Use a whitelist approach instead of blindly stripping it. These
# syscalls are also named using the prefix in man section 2.
known = {
's390_guarded_storage',
's390_pci_mmio_read',
's390_pci_mmio_write',
's390_runtime_instr',
's390_sthyi',
}
if name.startswith('s390_') and name not in known:
return noprefix(name, 's390_')
return name
def have_syscall_table(self) -> bool:
# FIXME: This is not true, we do have a table, it just requires custom
# parsing. Move parsing logic in Arch class?
return False
def extract_syscall_vaddrs(self, vmlinux: ELF) -> Dict[int, int]:
symbol = vmlinux.symbols[self.syscall_table_name]
size = symbol.size
if size == 0:
# FIXME: In case of 32-bit (abi=='s390') we calculate the size of
# sys_call_table, but then look at sys_call_table_emu. Can we do any
# better?
# sys_call_table_emu immediately follows sys_call_table.
# See arch/s390/kernel/entry.S.
size = (vmlinux.symbols['sys_call_table_emu'].vaddr -
vmlinux.symbols['sys_call_table'].vaddr)
entry_size, format = 8, "Q"
entry0 = vmlinux.vaddr_read(symbol.vaddr, entry_size)
vaddr0, = struct.unpack(f">{format}", entry0)
text = vmlinux.sections[".text"]
if not (text.vaddr <= vaddr0 < text.vaddr + text.size):
# s390 before commit ff4a742dde3c stored vaddrs as ints, because
# they were guaranteed to be < 4G before relocatable kernel support
# was added.
entry_size, format = 4, "I"
count = size // entry_size
table = vmlinux.vaddr_read(symbol.vaddr, size)
vaddrs = struct.unpack(">" + format * count, table)
return dict(enumerate(vaddrs))
def syscall_def_regexp(self, syscall_name: Optional[str]=None) -> Optional[str]:
if self.abi == 's390':
if syscall_name is None:
return r'\bCOMPAT_SYSCALL_WRAP\d\s*\('
else:
return rf'\bCOMPAT_SYSCALL_WRAP\d\s*\({syscall_name}\b'
else:
return None
================================================
FILE: src/systrack/arch/x86.py
================================================
import logging
from collections import defaultdict
from operator import itemgetter
from typing import Tuple, List, Dict, DefaultDict, Set, FrozenSet, Optional
from iced_x86 import Decoder, Instruction
from iced_x86.Mnemonic import Mnemonic, RET, CMP, TEST, JA, JAE, JB, JBE, JE, JNE
from iced_x86.OpKind import REGISTER
from ..elf import Symbol, ELF, E_MACHINE
from ..kconfig_options import VERSION_ZERO, VERSION_INF
from ..syscall import Syscall
from ..type_hints import KernelVersion
from ..utils import VersionedDict, noprefix
from .arch_base import Arch
class ArchX86(Arch):
name = 'x86'
kconfig = VersionedDict((
# Disable retpoline mitigations for better compiler compatibility
((4,15) , VERSION_INF, 'RETPOLINE=n' , []),
# kexec_load
((2,6,13), (2,6,19) , 'KEXEC=y' , ['EXPERIMENTAL=y']),
((2,6,19), VERSION_INF, 'KEXEC=y' , []),
# seccomp
((2,6,12), (2,6,24) , 'SECCOMP=y' , ['PROC_FS=y']),
((2,6,24), (5,10) , 'SECCOMP=y' , []),
# iopl, ioperm (x86 only)
((5,5) , VERSION_INF, 'X86_IOPL_IOPERM=y' , []),
# modify_ldt
((4,3) , VERSION_INF, 'MODIFY_LDT_SYSCALL=y', []),
))
kconfig_syscall_deps = VersionedDict((
(VERSION_ZERO, VERSION_INF, 'map_shadow_stack', 'X86_USER_SHADOW_STACK' ),
(VERSION_ZERO, VERSION_INF, 'pkey_alloc' , 'X86_INTEL_MEMORY_PROTECTION_KEYS'),
(VERSION_ZERO, VERSION_INF, 'pkey_free' , 'X86_INTEL_MEMORY_PROTECTION_KEYS'),
(VERSION_ZERO, VERSION_INF, 'pkey_mprotect' , 'X86_INTEL_MEMORY_PROTECTION_KEYS'),
))
# Numbers marked as "64" in syscall_64.tbl before v5.4 (when x64 and x32
# still shared the same table), which should therefore NOT be used in x32
# mode. These also include the (lower) x64 numbers for the misnumbered
# 512-547 syscalls.
#
# cat arch/x86/entry/syscalls/syscall_64.tbl | rg '\t64' | cut -f1
#
__bad_x32_numbers = {
13, 15, 16, 19, 20, 45, 46, 47, 54, 55, 59, 101, 127, 128, 129, 131,
134, 156, 174, 177, 178, 180, 205, 206, 209, 211, 214, 215, 222, 236,
244, 246, 247, 273, 274, 278, 279, 295, 296, 297, 299, 307, 310, 311,
322, 327, 328
}
def __init__(self, kernel_version: KernelVersion, abi: str, bits32: bool = False):
super().__init__(kernel_version, abi, bits32)
assert self.abi in ('x64', 'ia32', 'x32')
# i386_defconfig and x86_64_defconfig don't exist before v2.6.24: need
# a different configuration in such case. We'll think about it when (if)
# we ever get to supporting such old kernels. Additionally, there were
# two directories under arch before v2.6.24 ("i386" and "x86_64"), so
# self.name should reflect that too.
assert self.kernel_version >= (2,6,24), 'kernel too old, sorry!'
# Syscall tables are no longer guaranteed to exist since v6.9
# (see commit 1e3ad78334a69b36e107232e337f9d693dcc9df2). We will
# determine later in adjust_abi() if we actually have a table for the
# selected ABI (in case of FTRACE_SYSCALLS=y we may have one).
if self.kernel_version < (6,9):
self.syscall_table_name = 'sys_call_table'
if not self.bits32:
if self.abi == 'ia32':
self.syscall_table_name = 'ia32_sys_call_table'
elif self.abi == 'x32' and self.kernel_version >= (5,4):
self.syscall_table_name = 'x32_sys_call_table'
else:
self.syscall_table_name = None
if self.abi == 'ia32':
self.syscall_num_reg = 'eax'
self.syscall_arg_regs = ('ebx', 'ecx', 'edx', 'esi', 'edi', 'ebp')
else:
self.syscall_num_reg = 'rax'
self.syscall_arg_regs = ('rdi', 'rsi', 'rdx', 'r10', 'r8', 'r9')
if self.bits32:
assert self.abi == 'ia32'
self.abi_bits32 = True
self.config_targets = ('i386_defconfig',)
# vm86 (x86 only, 32-bit only, no compat support in 64-bit kernels)
self.kconfig.add((2,6,16), (2,6,18) , 'VM86=y' , ['X86=y', 'EMBEDDED=y']),
self.kconfig.add((2,6,18), (2,6,24) , 'VM86=y' , ['EMBEDDED=y']),
self.kconfig.add((2,6,24), (4,3) , 'VM86=y' , ['X86_32=y', 'EXPERT=y']),
self.kconfig.add((4,3) , VERSION_INF, 'X86_LEGACY_VM86=y', ['X86_32=y']),
# Needed for NUMA=y (NUMA support dropped in v6.15)
self.kconfig.add(VERSION_ZERO, (6,15), 'NOHIGHMEM=n', [])
self.kconfig.add(VERSION_ZERO, (6,15), 'HIGHMEM4G=n', [])
self.kconfig.add(VERSION_ZERO, (6,15), 'HIGHMEM64G=y', [])
self.kconfig.add(VERSION_ZERO, (6,15), 'X86_BIGSMP=y', ['SMP=y'])
# mbind, migrate_pages, {get,set}_mempolicy
# NOTE: before v2.6.29 NUMA actually also needs more options in
# OR, but we don't support checking kconfig expressions
self.kconfig.add(VERSION_ZERO, (2,6,23), 'NUMA=y', ['SMP=y', 'HIGHMEM64G=y'])
self.kconfig.add((2,6,23) , (2,6,29), 'NUMA=y', ['SMP=y', 'HIGHMEM64G=y', 'EXPERIMENTAL=y'])
self.kconfig.add((2,6,29) , (6,15) , 'NUMA=y', ['SMP=y', 'HIGHMEM64G=y', 'X86_BIGSMP=y'])
else:
self.abi_bits32 = self.abi == 'ia32'
self.compat = self.abi != 'x64'
self.config_targets = ('x86_64_defconfig',)
if self.abi == 'x32':
# x32 syscalls have this bit set (__X32_SYSCALL_BIT)
self.syscall_num_base = 0x40000000
# x86-64 supports all ABIs: ia32, x64, x32. Enable all of them, we
# will be able to extract the right syscall table regardless.
self.kconfig.add(VERSION_ZERO, VERSION_INF, 'IA32_EMULATION=y', [])
self.kconfig.add((3,4) , (3,9) , 'X86_X32=y' , ['EXPERIMENTAL=y'])
self.kconfig.add((3,9) , (5,18) , 'X86_X32=y' , [])
self.kconfig.add((5,18) , VERSION_INF, 'X86_X32_ABI=y' , [])
# kexec_file_load
self.kconfig.add((3,17) , VERSION_INF, 'KEXEC_FILE=y', ['X86_64=y', 'CRYPTO=y', 'CRYPTO_SHA256=y'])
# mbind, migrate_pages, {get,set}_mempolicy
self.kconfig.add(VERSION_ZERO, (2,6,15) , 'NUMA=y', [])
self.kconfig.add((2,6,15) , (2,6,29) , 'NUMA=y', ['SMP=y'])
self.kconfig.add((2,6,29) , VERSION_INF, 'NUMA=y', ['SMP=y'])
# pkey_alloc, pkey_free, pkey_mprotect
# NOTE: in theory depends on (CPU_SUP_INTEL || CPU_SUP_AMD) but we
# are pretty sure that CPU_SUP_INTEL will be =y
self.kconfig.add((4,6) , VERSION_INF, 'X86_INTEL_MEMORY_PROTECTION_KEYS=y', ['X86_64=y', 'CPU_SUP_INTEL=y'])
# map_shadow_stack
# NOTE: depends on assembler support for WRUSS instruction
# (GNU binutils >= 2.31)
self.kconfig.add((6,6) , VERSION_INF, 'X86_USER_SHADOW_STACK=y', ['AS_WRUSS=y'])
@staticmethod
def match(vmlinux: ELF) -> Optional[Tuple[bool,List[str]]]:
if vmlinux.e_machine == E_MACHINE.EM_386:
assert vmlinux.bits32, 'EM_386 64-bit? WAT'
elif vmlinux.e_machine == E_MACHINE.EM_X86_64:
assert not vmlinux.bits32, 'EM_X86_64 32-bit? WAT'
else:
return None
if vmlinux.bits32:
abis = ['ia32']
else:
abis = ['x64']
if 'ia32_sys_call_table' in vmlinux.symbols:
abis.append('ia32')
elif 'ia32_sys_call' in vmlinux.symbols:
# Since v6.9 no more tables, but we have this function instead
abis.append('ia32')
if 'x32_sys_call_table' in vmlinux.symbols:
abis.append('x32')
elif 'x32_sys_call' in vmlinux.symbols:
# Since v6.9 no more tables, but we have this function instead
abis.append('x32')
elif any('x32_compat_sys' in s for s in vmlinux.symbols):
# Before v5.4 x32 did NOT have its own table
abis.append('x32')
return vmlinux.bits32, abis
def matches(self, vmlinux: ELF) -> bool:
return (
vmlinux.e_machine == (E_MACHINE.EM_X86_64, E_MACHINE.EM_386)[self.bits32]
and vmlinux.bits32 == self.bits32
)
def adjust_abi(self, vmlinux: ELF):
if self.kernel_version < (6,9):
return
# Figure out if we have a syscall table (FTRACE_SYSCALLS=y) or not. The
# sys_call_table symbol represents the x64 table for 64-bit and the ia32
# table for 32-bit. 64-bit kernels have neither an ia32 nor an x32 table.
if 'sys_call_table' in vmlinux.symbols and not self.compat:
self.syscall_table_name = 'sys_call_table'
__is_ia32_name = staticmethod(lambda n: n.startswith('__ia32_')) # __ia32_[compat_]sys_xxx
__is_x64_name = staticmethod(lambda n: n.startswith('__x64_')) # __x64_[compat_]sys_xxx
__is_x32_name = staticmethod(lambda n: n.startswith('__x32_')) # __x32_compat_sys_xxx
def _preferred_symbol(self, a: Symbol, b: Symbol) -> Optional[Symbol]:
# Try preferring the symbol with the right ABI in its prefix.
na, nb = a.name, b.name
if self.abi == 'ia32':
if self.__is_ia32_name(na): return a
if self.__is_ia32_name(nb): return b
if self.__is_x64_name(na): return a
if self.__is_x64_name(nb): return b
if not na.islower(): return b
if not nb.islower(): return a
return None
if self.abi == 'x32':
if self.__is_x32_name(na): return a
if self.__is_x32_name(nb): return b
if self.__is_x64_name(na): return a
if self.__is_x64_name(nb): return b
if self.__is_ia32_name(na): return b
if self.__is_ia32_name(nb): return a
if not na.islower(): return b
if not nb.islower(): return a
return None
def skip_syscall(self, sc: Syscall) -> bool:
# Syscalls 512 through 547 are historically misnumbered and x32 only,
# see comment in v5.10 arch/x86/entry/syscalls/syscall_64.tbl.
#
# x32 should only use the x32 numbers (512-547) ORed with the special
# __X32_SYSCALL_BIT, and NOT the x64 numbers for the same syscalls.
# x64 should use the x64 numbers and NOT the x32 numbers (512-547) for
# the same syscalls.
#
# The checks performed by the kernel (mostly in do_syscall_64() under
# arch/x86/entry/common.c) however are completely idiotic, and the fact
# that before v5.4 there is only one syscall table for both x64 and x32
# does not help: this makes it technically possible to mix up the
# numbers in funny ways.
#
# In fact, in v5.3, execve can be called using *four* different numbers
# from both x64 and x32 mode (determining which number/mode combination
# will result in rax=-EFAULT is left as an exercise to the reader):
#
# 1. 0x3b : the x64 number
# (technically only correct for x64 mode)
# 2. 0x208 : the x32 number without __X32_SYSCALL_BIT set
# (technically incorrect in both modes)
# 3. 0x4000003b: the x64 number with __X32_SYSCALL_BIT set
# (technically incorrect in both modes)
# 4. 0x40000208: the x32 number with __X32_SYSCALL_BIT set
# (technically only correct for x32 mode)
#
# In v5.4 (commit 6365b842aae4490ebfafadfc6bb27a6d3cc54757) a separate
# x32 syscall table was introduced to try and make things less
# confusing. After this commit, options 2 and 3 above give -ENOSYS,
# while 1 and 4 both work (again, try to guess which number/mode combo
# will result in rax=-EFAULT).
#
if self.abi == 'x64' and 512 <= sc.number <= 547:
# x64 cannot use x32 numbers even though they are in the table
return True
if self.abi == 'x32':
if self.kernel_version >= (5,4):
# We have our own table, anything we find there is acceptable
return False
if (sc.number & ~0x40000000) in self.__bad_x32_numbers:
# x32 should NOT use these!
return True
if self.abi == 'ia32':
# vm86 and vm86old are only available in 32-bit kernels, but might
# still be implemented as simple wrappers that print a warning to
# dmesg and return -ENOSYS in 64-bit kernels, so ignore them
if not self.bits32 and sc.number in (113, 166):
return True
# pkey_{alloc,free,mprotect} are available for compat ia32 on
# 64-bit, but not for 32-bit kernels (on x86 they depend on X86_64=y),
# so avoid wasting time with these
if self.bits32 and sc.number in (380, 381, 382):
return True
return False
def _translate_syscall_symbol_name(self, sym_name: str) -> str:
# For whatever reason some syscalls are wrapped in assembly at the entry
# point e.g. in v4.0 stub_execve in arch/x86/kernel/entry_64.S or
# stub32_execve in arch/x86/ia32/ia32entry.S. These stubs with prefix
# "stub[32]_" make calls to the actual syscall function.
#
# Removing the prefix helps locate the actual syscall definition through
# source code grepping IFF they do not have any other prefix/suffix in
# the source (stub_fork -> fork -> easily find SYSCALL_DEFINE0(fork)).
#
# In some cases this is not enough though, because the actual function
# has another prefix: e.g. stub_rt_sigreturn, which calls
# sys_rt_sigreturn, defined as `asmlinkage long sys_rt_sigreturn`
# and not `asmlinkage long rt_sigreturn` or
# `SYSCALL_DEFINE0(rt_sigreturn)`. Kind of a bummer, but I don't really
# want to go insane to accommodate all these quirks.
return noprefix(sym_name, 'stub32_', 'stub_')
def _normalize_syscall_name(self, name: str) -> str:
# E.g. v5.18 COMPAT_SYSCALL_DEFINE1(ia32_mmap, ...)
return noprefix(name, 'ia32_', 'x86_', 'x32_')
def _dummy_syscall_code(self, sc: Syscall, vmlinux: ELF) -> Optional[bytes]:
# Check if the code of the syscall only consists of
# `MOV rax/eax, -ENOSYS/-EINVAL` followed by a RET or relative JMP and
# optionally preceded by an ENDBR64/32. E.g. lookup_dcookie in v6.3:
#
# <__x64_sys_lookup_dcookie>:
# f3 0f 1e fa endbr64
# 48 c7 c0 da ff ff ff mov rax,0xffffffffffffffda
# e9 74 8d 90 00 jmp ffffffff819b8b84 <__x86_return_thunk>
#
# TODO: relies on the symbol having a valid size (!= 0), improve?
sz = sc.symbol.size
if sz < 6 or sz > 16:
return None
orig = code = vmlinux.read_symbol(sc.symbol)
bad_imm = (b'\xda\xff\xff\xff', b'\xea\xff\xff\xff')
# endbr64/endbr32
if code.startswith(b'\xf3\x0f\x1e\xfa') or code.startswith(b'\xf3\x0f\x1e\xfb'):
code = code[4:]
sz -= 4
# 32-bit kernel
if code[:1] == b'\xb8' and code[1:5] in bad_imm: # mov eax, -ENOSYS/-EINVAL
if sz == 6 and code[5] == 0xc3: return orig # ret
if sz == 7 and code[5] == 0xeb: return orig # jmp rel8
if sz == 10 and code[5] == 0xe9: return orig # jmp rel32
# 64-bit kernel
if code[:3] == b'\x48\xc7\xc0' and code[3:7] in bad_imm: # mov rax, -ENOSYS/-EINVAL
if sz == 8 and code[7] == 0xc3: return orig # ret
if sz == 9 and code[7] == 0xeb: return orig # jmp rel8
if sz == 12 and code[7] == 0xe9: return orig # jmp rel32
return None
def __emulate_syscall_switch(self, func: Symbol, func_code: bytes) -> Optional[Tuple[DefaultDict[int,Set[int]],Set[Instruction]]]:
start = func.real_vaddr
end = func.real_vaddr + func.size
insns = list(Decoder(32 if self.bits32 else 64, func_code, ip=start))
# Register used to hold syscall number
nr_reg = None
# Assume first compared register holds syscall number
for insn in insns:
if insn.op_code().mnemonic in (CMP, TEST):
for i in range(insn.op_count):
if insn.op_kind(i) == REGISTER:
nr_reg = insn.op_register(i)
break
if nr_reg is not None:
break
if nr_reg is None:
logging.error('Could not find syscall number register')
return None
# Supported Jcc instructions
jccs = {JA, JAE, JB, JBE, JE, JNE}
# Maximum syscall number supported plus 1
nr_max = 0x1000
# Possible syscall numbers at a given address (instruction pointer)
nrs: DefaultDict[int,FrozenSet[int]] = defaultdict(frozenset, {start: frozenset(range(nr_max))})
# Candidate branches to syscall functions
candidate_insns: Set[Instruction] = set()
# Accumulate non-NOP skipped insns for logging/debugging purposes
skipped_insns: DefaultDict[Instruction,int] = defaultdict(int)
keep_going = True
iteration = 0
# Symbolically trace the function code to determine the possible syscall
# numbers and the instructions that lead to them
while keep_going:
iteration += 1
keep_going = False
invert_condition = False
mnemonic: Optional[Mnemonic] = None
last_cmp_immediate: Optional[int] = None
for insn in insns:
ip = insn.ip
next_ip = insn.next_ip
prev_mnemonic = mnemonic
mnemonic = insn.op_code().mnemonic
cur_nrs = nrs[ip]
# Only support a TEST that appears right before JE/JNE, which is
# functionally equal to a CMP with 0.
if prev_mnemonic == TEST and mnemonic not in (JE, JNE):
logging.error('Unsupported instruction after TEST: %#x: %r', ip, insn)
return None
if mnemonic == RET:
continue
if mnemonic == TEST:
if insn.op0_kind != REGISTER or insn.op1_kind != REGISTER:
logging.error('Unsupported TEST instruction %#x: %r', ip, insn)
return None
# Treat `TEST reg, reg` as `CMP reg, 0`. We make sure that
# this is the only possible case above.
last_cmp_immediate = 0
nrs[next_ip] |= cur_nrs
continue
if mnemonic == CMP:
if insn.op0_kind == REGISTER:
reg = insn.op0_register
imm_op_idx = 1
invert_condition = False
elif insn.op1_kind == REGISTER:
reg = insn.op1_register
imm_op_idx = 0
invert_condition = True
else:
# Should not happen, but guard against it anyway.
imm_op_idx = None
try:
last_cmp_immediate = insn.immediate(imm_op_idx)
except (ValueError, TypeError):
logging.error('Unsupported CMP instruction %#x: %r', ip, insn)
return None
if reg != nr_reg:
logging.error('Unexpected register in CMP instruction '
'%#x: %r', ip, insn)
return None
nrs[next_ip] |= cur_nrs
continue
new_taken_nrs = frozenset()
new_not_taken_nrs = frozenset()
if insn.is_jmp_short_or_near:
target_ip = insn.near_branch_target
new_taken_nrs = cur_nrs
elif insn.is_jcc_short_or_near:
if mnemonic not in jccs:
logging.error('Unsupported Jcc instruction %#x: %r', ip, insn)
return None
if last_cmp_immediate is None:
logging.error('No previous CMP/TEST instruction for Jcc: '
'%#x: %r', ip, insn)
return None
target_ip = insn.near_branch_target
if mnemonic == JA:
taken_filter = frozenset(range(last_cmp_immediate + 1, nr_max))
elif mnemonic == JAE:
taken_filter = frozenset(range(last_cmp_immediate, nr_max))
elif mnemonic == JB:
taken_filter = frozenset(range(last_cmp_immediate))
elif mnemonic == JBE:
taken_filter = frozenset(range(last_cmp_immediate + 1))
elif mnemonic == JE:
taken_filter = frozenset((last_cmp_immediate,))
elif mnemonic == JNE:
taken_filter = frozenset(range(0, last_cmp_immediate))
taken_filter |= frozenset(range(last_cmp_immediate + 1, nr_max))
new_taken_nrs = cur_nrs & taken_filter
new_not_taken_nrs = cur_nrs - taken_filter
if invert_condition:
new_taken_nrs, new_not_taken_nrs = new_not_taken_nrs, new_taken_nrs
elif insn.is_call_near:
target_ip = insn.near_branch_target
new_taken_nrs = cur_nrs
if start <= target_ip < end:
logging.error('%s calling itself??? %r', func.name, insn)
return None
else:
if iteration == 1 and not insn.op_code().is_nop:
skipped_insns[insn] += 1
# YOLO
nrs[next_ip] |= cur_nrs
continue
# We get here for JMP, Jcc and CALL near
if start <= target_ip < end:
# Branch target inside function
if target_ip < ip:
# Backward branch: new numbers may be added to the
# target instruction, but we are already past it. In
# such case, we'll need an additional iteration to
# propagate the information.
if not new_taken_nrs.issubset(nrs[target_ip]):
keep_going = True
else:
# Branch target outside function, assume it's a branch to a
# syscall function
candidate_insns.add(insn)
nrs[target_ip] |= new_taken_nrs
nrs[next_ip] |= new_not_taken_nrs
logging.info('Symbolic emulation done in %d iteration%s', iteration,
's'[:iteration ^ 1])
if skipped_insns:
n_skipped = sum(skipped_insns.values())
skipped = sorted(skipped_insns.items(), key=itemgetter(1, 0), reverse=True)
skipped = '; '.join((f'{i:r} (x{n})' for i, n in skipped))
logging.debug('Skipped %d instruction%s: %s', n_skipped,
's'[:n_skipped ^ 1], skipped)
return nrs, candidate_insns
def extract_syscall_vaddrs(self, vmlinux: ELF) -> Dict[int,int]:
# We need to go through a painful examination of the switch statement
# implemented by {x64,x32,ia32}_sys_call():
#
# #define __SYSCALL(nr, sym) case nr: return __x64_##sym(regs);
#
# long x64_sys_call(const struct pt_regs *regs, unsigned int nr)
# {
# switch (nr) {
# #include <asm/syscalls_64.h>
# default: return __x64_sys_ni_syscall(regs);
# }
# }
#
# The switch statement on the second argument is implemented as a binary
# search. Therefore, the generated instructions should simply be a bunch
# of CMP/Jcc/JMP. No other implementation is supported right now.
#
assert self.syscall_table_name is None
func_name = f'{self.abi}_sys_call'
sym = vmlinux.functions.get(func_name)
if sym is None:
logging.error('Could not find function %s', func_name)
return {}
if sym.size < 0x10:
logging.error('%s is too small (%d bytes)', sym.name, sym.size)
return {}
logging.info('Extracting syscalls from code of %s() at %#x', sym.name,
sym.real_vaddr)
res = self.__emulate_syscall_switch(sym, vmlinux.read_symbol(sym))
if res is None:
return {}
nrs, candidate_insns = res
vaddrs: Dict[int,int] = {}
found_default_case = False
for insn in candidate_insns:
# Guaranteed to have .near_branch_target by the code in
# __emulate_syscall_switch() above
vaddr = insn.near_branch_target
numbers = nrs[vaddr]
if len(numbers) == 0:
# This should never happen, bail out
logging.error('Empty set of syscall numbers for %#x (target of '
'%r). Unreachable!?', vaddr, insn)
return {}
if len(numbers) > 100:
logging.debug('Default switch case at %#x (reachable %d '
'times): %r => %#x is ni_syscall', insn.ip,
len(numbers), insn, vaddr)
if found_default_case:
logging.error('Multiple default switch cases!?')
return {}
found_default_case = True
continue
# Let the caller handle de-duplication in case a single vaddr can be
# reached by multiple syscall numbers
for nr in numbers:
if nr in vaddrs:
if vaddrs[nr] != vaddr:
logging.error('Number %d leads to multiple vaddrs!? '
'Got %#x and %#x. Bailing out!', nr, vaddrs[nr], vaddr)
return {}
continue
vaddrs[nr] = vaddr
return vaddrs
def syscall_def_regexp(self, syscall_name: Optional[str]=None) -> Optional[str]:
if self.abi != 'x32':
return None
if syscall_name is not None:
if syscall_name.startswith('sys32_x32_'):
return rf'\basmlinkage\s*(unsigned\s+)?\w+\s*{syscall_name}\s*\('
return rf'\basmlinkage\s*(unsigned\s+)?\w+\s*sys32_x32_{syscall_name}\s*\('
return r'\basmlinkage\s*(unsigned\s+)?\w+\s*sys32_x32_\w+\s*\('
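# A minimal standalone sketch (not part of Systrack) of how a `CMP nr_reg, imm`
# followed by a Jcc splits the set of possible syscall numbers, mirroring the
# taken/not-taken filtering in __emulate_syscall_switch() above. The helper
# name split_on_jcc and the use of mnemonic strings instead of the decoder's
# mnemonic constants are assumptions made purely for illustration.

```python
from typing import FrozenSet, Tuple

NR_MAX = 0x1000  # same upper bound as nr_max in the emulation code


def split_on_jcc(cur: FrozenSet[int], mnemonic: str,
                 imm: int) -> Tuple[FrozenSet[int], FrozenSet[int]]:
    """Return (taken, not_taken) for `CMP nr_reg, imm` followed by a Jcc.

    Hypothetical helper: only the six unsigned/equality conditions handled
    by the emulation code are modelled here.
    """
    if mnemonic == 'jne':
        taken_filter = frozenset(range(NR_MAX)) - {imm}
    else:
        # Numbers for which the branch is taken, per condition
        filters = {
            'ja':  range(imm + 1, NR_MAX),
            'jae': range(imm, NR_MAX),
            'jb':  range(imm),
            'jbe': range(imm + 1),
            'je':  (imm,),
        }
        taken_filter = frozenset(filters[mnemonic])

    return cur & taken_filter, cur - taken_filter


# Start with every number possible, as at the function entry point
all_nrs = frozenset(range(NR_MAX))

# cmp esi, 0x100 ; ja <default case>
taken, not_taken = split_on_jcc(all_nrs, 'ja', 0x100)

# On the fall-through path: cmp esi, 0x3b ; je <handler for 0x3b>
eq, rest = split_on_jcc(not_taken, 'je', 0x3b)
```

# Each branch target accumulates the union of the sets that can reach it, so a
# target reachable by a very large set is most likely the default switch case.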
================================================
FILE: src/systrack/elf.py
================================================
import re
from enum import IntEnum
from functools import lru_cache
from pathlib import Path
from struct import unpack
from operator import attrgetter
from collections import namedtuple
from typing import Union, Dict, Optional
from .utils import ensure_command
# Only EM_* macros relevant for vmlinux ELFs
class E_MACHINE(IntEnum):
EM_386 = 3 # x86
EM_MIPS = 8 # MIPS R3000 (32 or 64 bit)
EM_PPC = 20 # PowerPC 32-bit
EM_PPC64 = 21 # PowerPC 64-bit
EM_S390 = 22 # IBM S/390
EM_ARM = 40 # ARM 32-bit
EM_X86_64 = 62 # x86-64
EM_AARCH64 = 183 # ARM 64-bit
EM_RISCV = 243 # RISC-V
# Only EF_* macros that we actually use
class E_FLAGS(IntEnum):
EF_ARM_EABI_MASK = 0xff000000
Section = namedtuple('Section', ('name', 'vaddr', 'off', 'size'))
_Symbol = namedtuple('_Symbol', ('vaddr', 'real_vaddr', 'size', 'type', 'name'))
# NOTE: other code may assume that Symbol acts like a tuple. Think twice about
# making this a full-fledged class and not a subclass of namedtuple. Instances
# of a plain class hash and compare by identity by default, so two symbols with
# the same fields would not compare equal unless they are the same instance.
class Symbol(_Symbol):
'''Class representing an ELF symbol.
'''
def __repr__(s):
if s.real_vaddr == s.vaddr:
return f'Symbol("{s.name}" at 0x{s.vaddr:x}, type={s.type}, size=0x{s.size:x})'
else:
return f'Symbol("{s.name}" at 0x{s.vaddr:x} (real 0x{s.real_vaddr:x}), type={s.type}, size=0x{s.size:x})'
class ELF:
__slots__ = (
'path', 'file', 'bits32', 'big_endian', 'e_machine', 'e_flags',
'__sections', '__symbols', '__functions'
)
def __init__(self, path: Union[str,Path]):
self.path = Path(path)
self.file = self.path.open('rb')
self.__sections = None
self.__symbols = None
self.__functions = None
magic, ei_class, ei_data = unpack('<4sBB', self.file.read(6))
if magic != b'\x7fELF':
raise ValueError(f'Invalid ELF magic: {magic!r}')
if ei_class == 1:
self.bits32 = True
elif ei_class == 2:
self.bits32 = False
else:
raise ValueError(f'Invalid ELF e_ident[EI_CLASS]: {ei_class}')
if ei_data == 1:
self.big_endian = False
elif ei_data == 2:
self.big_endian = True
else:
raise ValueError(f'Invalid ELF e_ident[EI_DATA]: {ei_data}')
unpack_endian = '<>'[self.big_endian]
assert self.file.seek(0x12) == 0x12
self.e_machine = unpack(unpack_endian + 'H', self.file.read(2))[0]
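# e_flags is at offset 0x24 in ELF32 headers and at 0x30 in ELF64 headers
e_flags_off = 0x24 if self.bits32 else 0x30
assert self.file.seek(e_flags_off) == e_flags_off
self.e_flags = unpack(unpack_endian + 'L', self.file.read(4))[0]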
@property
def sections(self) -> Dict[str,Section]:
if self.__sections is not None:
return self.__sections
# We actually only really care about SHT_PROGBITS or SHT_NOBITS
exp = re.compile(r'\s([.\w]+)\s+(PROGBITS|NOBITS)\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)')
out = ensure_command(['readelf', '-WS', self.path])
secs = {}
for match in exp.finditer(out):
name, _, va, off, sz = match.groups()
secs[name] = Section(name, int(va, 16), int(off, 16), int(sz, 16))
self.__sections = secs
return secs
@property
def symbols(self) -> Dict[str, Symbol]:
if self.__symbols is None:
self.__extract_symbols()
return self.__symbols
@property
def functions(self) -> Dict[str, Symbol]:
if self.__functions is None:
self.__extract_symbols()
return self.__functions
@property
def has_debug_info(self) -> bool:
return '.debug_line' in self.sections
def __extract_symbols(self):
exp = re.compile(r'\d+:\s+([0-9a-fA-F]+)\s+(\d+)\s+(\w+).+\s+(\S+)$')
out = ensure_command(['readelf', '-Ws', self.path]).splitlines()
syms = {}
funcs = {}
for line in out:
match = exp.search(line)
if not match:
continue
vaddr, sz, typ, name = match.groups()
vaddr = real_vaddr = int(vaddr, 16)
# Unaligned vaddr on ARM 32-bit means the function code is in
# Thumb mode. Nonetheless, the actual code is aligned, so the
# real vaddr is a multiple of 2.
if self.e_machine == E_MACHINE.EM_ARM and typ == 'FUNC' and vaddr & 1:
real_vaddr &= 0xfffffffe
sym = Symbol(vaddr, real_vaddr, int(sz), typ, name)
syms[sym.name] = sym
if typ == 'FUNC':
funcs[sym.name] = sym
self.__symbols = syms
self.__functions = funcs
def vaddr_to_file_offset(self, vaddr: int) -> int:
for sec in self.sections.values():
if sec.vaddr <= vaddr < sec.vaddr + sec.size:
return sec.off + vaddr - sec.vaddr
raise ValueError('vaddr not in range of any known section')
def vaddr_read_string(self, vaddr: int) -> str:
off = self.vaddr_to_file_offset(vaddr)
assert self.file.seek(off) == off
data = self.file.read(1)
while data[-1]:
data += self.file.read(1)
return data[:-1].decode()
def vaddr_read(self, vaddr: int, size: int) -> bytes:
off = self.vaddr_to_file_offset(vaddr)
assert self.file.seek(off) == off
return self.file.read(size)
def read_symbol(self, sym: Union[str,Symbol]) -> bytes:
if not isinstance(sym, Symbol):
sym = self.symbols[sym]
return self.vaddr_read(sym.real_vaddr, sym.size)
@lru_cache(maxsize=128)
def next_symbol(self, sym: Symbol) -> Optional[Symbol]:
'''Find and return the symbol (if any) with the lowest real virtual
address higher than the one of sym.
'''
candidates = filter(lambda s: s.real_vaddr > sym.real_vaddr, self.symbols.values())
try:
return min(candidates, key=attrgetter('real_vaddr'))
except ValueError:
return None
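# A minimal standalone sketch of the ELF header parsing done in ELF.__init__()
# above, working on an in-memory header instead of a file. The function name
# parse_elf_header and the hand-built sample header are assumptions for
# illustration only. Note that e_flags sits at offset 0x24 in ELF32 headers
# and at 0x30 in ELF64 headers, so the sketch picks the offset from EI_CLASS.

```python
from struct import pack, unpack_from
from typing import Tuple


def parse_elf_header(header: bytes) -> Tuple[bool, bool, int, int]:
    """Return (bits32, big_endian, e_machine, e_flags) from raw header bytes."""
    magic, ei_class, ei_data = unpack_from('<4sBB', header, 0)
    if magic != b'\x7fELF':
        raise ValueError(f'Invalid ELF magic: {magic!r}')

    if ei_class not in (1, 2):
        raise ValueError(f'Invalid ELF e_ident[EI_CLASS]: {ei_class}')
    if ei_data not in (1, 2):
        raise ValueError(f'Invalid ELF e_ident[EI_DATA]: {ei_data}')

    bits32 = ei_class == 1      # ELFCLASS32 = 1, ELFCLASS64 = 2
    big_endian = ei_data == 2   # ELFDATA2LSB = 1, ELFDATA2MSB = 2
    endian = '<>'[big_endian]

    # e_machine is at 0x12 for both classes; e_flags depends on the class
    (e_machine,) = unpack_from(endian + 'H', header, 0x12)
    (e_flags,) = unpack_from(endian + 'L', header, 0x24 if bits32 else 0x30)
    return bits32, big_endian, e_machine, e_flags


# Hand-built 32-bit little-endian header: EM_ARM (40) with an EABI v5 flag
hdr = bytearray(0x34)
hdr[:7] = b'\x7fELF\x01\x01\x01'
hdr[0x12:0x14] = pack('<H', 40)
hdr[0x24:0x28] = pack('<L', 0x05000200)

result = parse_elf_header(bytes(hdr))
```

# Masking e_flags with EF_ARM_EABI_MASK (0xff000000) on this sample header
# leaves 0x05000000, i.e. the EABI version field.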
================================================
FILE: src/systrack/kconfig.py
================================================
#
# Automatic kernel Kconfig configuration.
#
# This module contains utility functions to edit configuration options through
# the kernel's `scripts/config` script, plus all arch-agnostic Kconfig options
# needed.
#
import logging
from pathlib import Path
from typing import List, Dict, Iterable, Optional
from .arch import Arch
from .kconfig_options import *
from .type_hints import KernelVersion
from .utils import anyprefix, ensure_command
def kconfig_debugging(kernel_version: KernelVersion) -> List[str]:
return KCONFIG_DEBUGGING[kernel_version]
def kconfig_compatibility(kernel_version: KernelVersion) -> List[str]:
return KCONFIG_COMPATIBILITY[kernel_version]
def kconfig_more_syscalls(kernel_version: KernelVersion) -> Dict[str,List[str]]:
return KCONFIG_MORE_SYSCALLS[kernel_version]
def kconfig_syscall_deps(syscall_name: str, kernel_version: KernelVersion, arch: Arch) -> Optional[str]:
opt = arch.kconfig_syscall_deps[kernel_version].get(syscall_name)
opt = opt or KCONFIG_SYSCALL_DEPS[kernel_version].get(syscall_name)
return ('CONFIG_' + opt) if opt else None
def run_config_script(kdir: Path, config_file: Path, args: List[str]):
return ensure_command(['./scripts/config', '--file', config_file] + args, cwd=kdir)
class Kconfig:
file: Path
kdir: Path
config: Dict[str,Optional[str]]
__slots__ = ['file', 'kdir', 'config']
def __init__(self, file: Path, kdir: Path):
self.file = file
self.kdir = kdir
self.config = {}
lines = map(str.strip, self.file.open().readlines())
for line in lines:
# Unset is equivalent to =n, but keep track of it with None
if line.startswith('# CONFIG_') and line.endswith(' is not set'):
name = line[9:-11]
self.config[name] = None
continue
# Skip empty lines and comments
if not line or line.startswith('#'):
continue
name, val = line.split('=', 1)
assert name.startswith('CONFIG_')
self.config[name[7:]] = val
def get(self, name: str) -> Optional[str]:
'''Get the value of a config option given its name. Query scripts/config
in case it is not present in the config file. Return None if not set.
'''
try:
return self.config[name]
except KeyError:
# Option not explicitly set: try getting its default value
val = run_config_script(self.kdir, self.file, ['-s', name]).strip()
if val == 'undef':
val = None
self.config[name] = val
return val
def check(self, name: str, wanted: str) -> bool:
'''Check if two values are equal accounting for unset values and
treating them as =n.
'''
actual = self.get(name)
return wanted == ('n' if actual is None else actual)
def human_readable(self, name: str) -> str:
'''Return a human-readable representation for a config option and its
actual value.'''
val = self.get(name)
if val is None:
return f'CONFIG_{name} is undef'
return f'CONFIG_{name}={val}'
# TODO: auto check for choice menus to enable only one opt and disable others?
def kconfig_edit(config_file: Path, kdir: Path, options: Iterable[str]):
if not options:
return
args = []
for opt in options:
name, val = opt.split('=', 1)
if val == 'y':
args += ['-e', name]
elif val == 'n':
args += ['-d', name]
elif val == 'm':
args += ['-m', name]
else:
args += ['--set-val', name, val]
run_config_script(kdir, config_file, args)
# TODO: actually check deps by parsing Kconfig instead of taking a hardcoded
# dictionary {opt: deps} which is error prone and very annoying to maintain.
def kconfig_check_with_deps(config_file: Path, kdir: Path, options: Dict[str,List[str]]):
config = Kconfig(config_file, kdir)
# TODO: check options that are set even though deps not set as intended?
for opt, deps in options.items():
opt_name, opt_wanted = opt.split('=', 1)
if config.check(opt_name, opt_wanted):
continue
bad_deps: List[str] = []
unsupported = False
# Something is not right, check dependencies for more insight...
for dep in deps:
dep_name, dep_wanted = dep.split('=', 1)
if config.check(dep_name, dep_wanted):
continue
# It's ok if we want to enable some config, but we cannot do it
# because the arch we are building for doesn't declare support
# for one of its dependencies
dep_actual = config.get(dep_name)
if dep_wanted != 'n' and (dep_actual is None or dep_actual == 'n'):
if anyprefix(dep_name, 'HAVE_', 'ARCH_HAS_', 'ARCH_SUPPORTS_'):
unsupported = True
logging.warning(config.human_readable(opt_name)
+ f' instead of ={opt_wanted}, likely because '
+ config.human_readable(dep_name))
continue
bad_deps.append(dep_name)
if unsupported:
continue
if bad_deps:
# Config does not match, likely because of deps
logging.error(config.human_readable(opt_name)
+ f' instead of ={opt_wanted}, likely because '
+ ', '.join(map(config.human_readable, bad_deps)))
else:
# Config does not match, but deps are ok (weird!)
logging.error(config.human_readable(opt_name)
+ f' instead of ={opt_wanted} (deps ok)')
def kconfig_debug_check(config_file: Path, kdir: Path, options: Iterable[str]):
config = Kconfig(config_file, kdir)
for opt in options:
opt_name, opt_wanted = opt.split('=', 1)
if config.check(opt_name, opt_wanted):
continue
# As of now we are quite lax here. We only use this to check for configs
# that we apply for compatibility or debugging and are not vital. Unlike
# kconfig_check_with_deps() above, encountering a mismatch is usually
# not an error.
logging.debug(config.human_readable(opt_name) + f' instead of ={opt_wanted}')
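# A minimal standalone sketch of the two conversions performed in this module:
# parsing `.config` text into a {name: value} dict (as in Kconfig.__init__())
# and turning name=value options into `scripts/config` arguments (as in
# kconfig_edit()). The function names parse_config and config_script_args and
# the sample input are assumptions for illustration; nothing is executed.

```python
from typing import Dict, Iterable, List, Optional


def parse_config(text: str) -> Dict[str, Optional[str]]:
    """Parse .config text; unset options are tracked with None."""
    config: Dict[str, Optional[str]] = {}

    for line in map(str.strip, text.splitlines()):
        if line.startswith('# CONFIG_') and line.endswith(' is not set'):
            # '# CONFIG_FOO is not set' -> FOO is unset
            config[line[len('# CONFIG_'):-len(' is not set')]] = None
        elif line and not line.startswith('#'):
            # 'CONFIG_FOO=val' -> FOO has value 'val'
            name, val = line.split('=', 1)
            config[name[len('CONFIG_'):]] = val

    return config


def config_script_args(options: Iterable[str]) -> List[str]:
    """Build the argument list that would be passed to scripts/config."""
    flags = {'y': '-e', 'n': '-d', 'm': '-m'}
    args: List[str] = []

    for opt in options:
        name, val = opt.split('=', 1)
        if val in flags:
            args += [flags[val], name]
        else:
            args += ['--set-val', name, val]

    return args


sample = '# comment\nCONFIG_SMP=y\n# CONFIG_NUMA is not set\nCONFIG_NR_CPUS=64\n'
parsed = parse_config(sample)
args = config_script_args(['NUMA=y', 'USB=n', 'NR_CPUS=8'])
```

# Keeping unset options as None rather than 'n' preserves the distinction
# between "explicitly disabled" and "not present", which check() then folds
# back into =n when comparing against a wanted value.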
================================================
FILE: src/systrack/kconfig_options.py
================================================
#
# Kernels built by Systrack need to be configured with debug information (for
# file/line info) and with the most complete syscall table possible. In order to
# do this a lot of Kconfig options need to be set to the right value depending
# on the kernel version.
#
# Only arch-agnostic Kconfig options are present here. Arch-specific Kconfig
# options are defined separately in each `Arch` subclass in `arch/`.
#
# Versions for Kconfig options can be looked up using the online LKDDB:
# https://cateee.net/lkddb/web-lkddb/ - seriously a godsend for this job.
#
from .utils import VersionedList, VersionedDict
__all__ = [
'VERSION_ZERO', 'VERSION_INF',
'KCONFIG_DEBUGGING', 'KCONFIG_COMPATIBILITY',
'KCONFIG_MORE_SYSCALLS', 'KCONFIG_SYSCALL_DEPS'
]
# We will probably never get even close to v2.6.12 (first tag in the main repo)
VERSION_ZERO = (2,6,12,)
VERSION_INF = (9999999999,)
# Kconfig options that help Systrack do its job. We don't check dependencies on
# other Kconfig options for these as they are all global and dependency-free.
# We can add another different VersionedDict in the future if the need arises.
#
# Motivations behind these:
#
# - DEBUG_INFO=y is obviously essential to have file and line number information
# in the vmlinux ELF. In v5.12 a multiple choice menu for the DWARF version
# was added, and in v5.18 the choice DEBUG_INFO_NONE was added, making
# DEBUG_INFO no longer selectable by hand, but only automatically enabled when
# the choice is not DEBUG_INFO_NONE.
# - RELOCATABLE=n is essential to avoid relocations, which could result in the
# entire syscall table being relocatable, making it significantly more
# annoying to recover syscall symbols (don't really want to parse and apply
# relocations to be honest).
# - EXPERT=y is needed for various stuff incl. some arch-specific kconfigs
# - EXPERIMENTAL=y might also be useful for arch-specific stuff, though it's old
# so I haven't really *experimented* with it yet (lol)
# - FTRACE_SYSCALLS=y adds `__syscall_meta_xxx` structs for each syscall, which
# are very useful to extract signature info. It also helps on x86 since v6.9
# as syscall tables are not used anymore and `sys_call_table` is only
# generated for ftrace.
# - FTRACE=y is needed for FTRACE_SYSCALLS
#
# TODO: version for RELOCATABLE and RANDOMIZE_BASE depends on arch
# TODO: is optimizing for size and not performance useful?
# Enable CC_OPTIMIZE_FOR_SIZE (since 2.6.1) and disable
# CC_OPTIMIZE_FOR_PERFORMANCE (since 4.7) for that.
# TODO: Enable DEBUG_INFO_BTF? It generates BTF typeinfo, which may be useful.
# TODO: Enable NO_AUTO_INLINE?
#
KCONFIG_DEBUGGING = VersionedList((
# (since, removed in, list of name=value)
(VERSION_ZERO, (3,8) , ['EXPERIMENTAL=y']),
(VERSION_ZERO, VERSION_INF, ['DEBUG_KERNEL=y', 'DEBUG_INFO=y']),
(VERSION_ZERO, VERSION_INF, ['RELOCATABLE=n', 'RANDOMIZE_BASE=n']),
((2, 6, 30) , VERSION_INF, ['FTRACE_SYSCALLS=y']),
((2, 6, 31) , VERSION_INF, ['FTRACE=y']),
((2, 6, 36) , VERSION_INF, ['DEBUG_INFO_REDUCED=n']),
((2, 6, 38) , VERSION_INF, ['EXPERT=y']),
((5, 12) , VERSION_INF, ['DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y']),
((5, 18) , VERSION_INF, ['DEBUG_INFO_NONE=n']),
))
# Kconfig options that are not strictly needed for building or improving
# Systrack analysis, but which ease the build process by improving
# compiler/toolchain compatibility, removing unneeded build dependencies and
# disabling unneeded pieces of kernel code.
#
# Some of these are arch-specific, but we don't care as they will simply be
# ignored on the wrong arch.
#
KCONFIG_COMPATIBILITY = VersionedList((
# (since, removed in, list of name=value)
(VERSION_ZERO, VERSION_INF, ['USB=n']),
(VERSION_ZERO, (6,6) , ['EMBEDDED=n']),
((2,6,28) , VERSION_INF, ['WIRELESS=n']),
((2,6,32) , VERSION_INF, ['USB_SUPPORT=n', 'WLAN=n', 'NETDEVICES=n']),
((2,6,36) , VERSION_INF, ['SECURITY_APPARMOR=n']),
((3,7) , VERSION_INF, ['MODULE_SIG=n']),
((3,13) , VERSION_INF, ['SYSTEM_TRUSTED_KEYRING=n']),
((4,6) , VERSION_INF, ['STACK_VALIDATION=n']),
((4,14) , (4,15) , ['GUESS_UNWINDER=y', 'ORC_UNWINDER=n', 'FRAME_POINTER_UNWINDER=n']),
((4,15) , VERSION_INF, ['UNWINDER_GUESS=y', 'UNWINDER_ORC=n', 'UNWINDER_FRAME_POINTER=n']),
((5,2) , VERSION_INF, ['DEBUG_INFO_BTF=n']),
((5,15) , VERSION_INF, ['WERROR=n']),
))
# Kconfig options to enable optional syscalls. We want to build a kernel with as
# many syscalls as possible. These are some arch-agnostic config options to set
# in order to enable more syscalls. Arch-specific configs (or configs that are
# present in different kernel versions depending on the arch) are under `arch/`.
#
# Notes on some of these:
#
# - CRYPTO_SHA256=y is needed for KEXEC_FILE
# - INOTIFY=y is needed for INOTIFY_USER (only from v2.6.18 to v2.6.28)
# - PCI=y is needed for pci syscalls and is arch-specific before v5.0 (with
# different dependencies too), but we can enable it here regardless as a
# sanity check
# - PROFILING=y is needed for PERF_EVENTS
# - QUOTA=y is needed for QUOTACTL, which should be auto-selected by QUOTA=y
# - SECCOMP was arch-specific before v5.10, then became arch-agnostic
# - SECURITY=y is needed for SECURITY_LANDLOCK
# - UID16 is technically arch-dependent before v2.6.16, but it's practically
# useless to differentiate between archs for this, the kernel Makefile will
# just remove it if unneeded
#
KCONFIG_MORE_SYSCALLS = VersionedDict((
# since removed in name=value dependencies
((3,18) , VERSION_INF, 'ADVISE_SYSCALLS=y' , []),
((2,6,28) , VERSION_INF, 'AIO=y' , []),
((2,6,19) , VERSION_INF, 'BLOCK=y' , ['EXPERT=y']),
((3,18) , VERSION_INF, 'BPF_SYSCALL=y' , []),
(VERSION_ZERO, (4,1) , 'BSD_PROCESS_ACCT=y' , []),
((4,1) , VERSION_INF, 'BSD_PROCESS_ACCT=y' , ['MULTIUSER=y']),
((3,3) , VERSION_INF, 'CHECKPOINT_RESTORE=y' , []),
((3,15) , VERSION_INF, 'CROSS_MEMORY_ATTACH=y', ['MMU=y']),
(VERSION_ZERO, VERSION_INF, 'CRYPTO_SHA256=y' , []),
((2,6,36) , VERSION_INF, 'FANOTIFY=y' , []),
((2,6,39) , VERSION_INF, 'FHANDLE=y' , []),
# TODO: FUTEX depends on !(SPARC32 && SMP), but we do not support
# expressions to check kconfig dependencies :(
(VERSION_ZERO, VERSION_INF, 'FUTEX=y' , []),
(VERSION_ZERO, VERSION_INF, 'INET=y' , []),
((2,6,13) , (2,6,29) , 'INOTIFY=y' , []),
((2,6,18) , VERSION_INF, 'INOTIFY_USER=y' , []),
((5,1) , VERSION_INF, 'IO_URING=y' , ['EXPERT=y']),
((5,12) , VERSION_INF, 'KCMP=y' , ['EXPERT=y']),
(VERSION_ZERO, VERSION_INF, 'KEYS=y' , []),
((4,3) , VERSION_INF, 'MEMBARRIER=y' , []),
((4,18) , (6,6) , 'MEMFD_CREATE=y' , []),
((6,6) , VERSION_INF, 'MEMFD_CREATE=y' , ['EXPERT=y']),
# TODO: MIGRATION depends on (NUMA || ARCH_ENABLE_MEMORY_HOTREMOVE || COMPACTION || CMA) && MMU
# but we do not support expressions to check kconfig dependencies :(
((2,6,16) , VERSION_INF, 'MIGRATION=y' , ['MMU=y']),
(VERSION_ZERO, VERSION_INF, 'MODULE_UNLOAD=y' , []),
(VERSION_ZERO, VERSION_INF, 'MODULES=y' , []),
(VERSION_ZERO, VERSION_INF, 'NET=y' , []),
(VERSION_ZERO, (2,6,29) , 'NFSD=y' , ['INET=y']),
	# Though NFSD still exists, nfsservctl was removed in 3.1, so it's pointless
# to enable it past that
((2,6,29) , (3,1) , 'NFSD=y' , ['INET=y', 'FILE_LOCKING=y', 'FSNOTIFY=y']),
((2,6,32) , VERSION_INF, 'PROFILING=y' , []),
(VERSION_ZERO, VERSION_INF, 'PERF_EVENTS=y' , ['HAVE_PERF_EVENTS=y']),
(VERSION_ZERO, (5,0) , 'PCI=y' , []),
((5,0) , VERSION_INF, 'PCI=y' , ['HAVE_PCI=y']),
(VERSION_ZERO, VERSION_INF, 'POSIX_MQUEUE=y' , ['NET=y']),
((4,10) , VERSION_INF, 'POSIX_TIMERS=y' , ['EXPERT=y']),
((2,6,30) , VERSION_INF, 'QUOTA=y' , []),
((4,18) , VERSION_INF, 'RSEQ=y' , ['HAVE_RSEQ=y']),
((5,10) , VERSION_INF, 'SECCOMP=y' , ['HAVE_ARCH_SECCOMP=y']),
((5,14) , (6,2) , 'SECRETMEM=y' , ['ARCH_HAS_SET_DIRECT_MAP=y', 'EMBEDDED=n']),
((6,2) , VERSION_INF, 'SECRETMEM=y' , ['ARCH_HAS_SET_DIRECT_MAP=y']),
(VERSION_ZERO, (4,1) , 'SECURITY=y' , ['SYSFS=y']),
((4,1) , VERSION_INF, 'SECURITY=y' , ['SYSFS=y', 'MULTIUSER=y']),
((5,13) , (6,5) , 'SECURITY_LANDLOCK=y' , ['SECURITY=y', 'ARCH_EPHEMERAL_INODES=n']),
((6,5) , VERSION_INF, 'SECURITY_LANDLOCK=y' , ['SECURITY=y']),
((3,16) , VERSION_INF, 'SGETMASK_SYSCALL=y' , []),
((2,6,22) , VERSION_INF, 'SIGNALFD=y' , ['EXPERT=y']),
(VERSION_ZERO, (5,5) , 'SYSCTL_SYSCALL=y' , ['PROC_SYSCTL=y']),
((3,15) , VERSION_INF, 'SYSFS_SYSCALL=y' , []),
(VERSION_ZERO, VERSION_INF, 'SYSVIPC=y' , []),
(VERSION_ZERO, (2,6,16) , 'UID16=y' , []),
((2,6,16) , (4,1) , 'UID16=y' , ['EXPERT=y', 'HAVE_UID16=y']),
((4,1) , VERSION_INF, 'UID16=y' , ['EXPERT=y', 'HAVE_UID16=y', 'MULTIUSER=y']),
((4,3) , VERSION_INF, 'USERFAULTFD=y' , ['MMU=y']),
((3,15) , (6,16) , 'USELIB=y' , []),
))
# Keep track of which syscall depends on which config option. Since syscalls are
# uniquely named there is no issue in keeping track of arch-specific syscalls
# here too.
#
# NOTE: for syscalls that are gated behind different configs depending on arch,
# the .kconfig_syscall_deps attr of Arch subclasses overrides the entries here.
#
# This info is only used to give richer output (namely, to list the kconfig
# options needed to enable a certain syscall); the tool does not depend on it.
#
# 1. Most optional syscalls exist IF AND ONLY IF the corresponding config
# exists, so just set "since" VERSION_ZERO and "removed in" VERSION_INF for
# those.
# 2. If a certain syscall existed prior to it being put behind a config,
# set "since" to the first appearance of the config.
# 3. If a certain syscall was behind a config, but then the config was removed
# (while keeping the syscall), set "removed in" to the version the config was
# removed in.
# 4. If both point 2 and 3 above apply, then add 2+ entries for such a syscall.
#
KCONFIG_SYSCALL_DEPS = VersionedDict((
# since removed in syscall name depends on
(VERSION_ZERO, VERSION_INF, 'fadvise64' , 'ADVISE_SYSCALLS' ),
(VERSION_ZERO, VERSION_INF, 'fadvise64_64' , 'ADVISE_SYSCALLS' ), # 32-bit only
(VERSION_ZERO, VERSION_INF, 'madvise' , 'ADVISE_SYSCALLS' ),
(VERSION_ZERO, VERSION_INF, 'process_madvise' , 'ADVISE_SYSCALLS' ),
(VERSION_ZERO, VERSION_INF, 'io_setup' , 'AIO' ),
(VERSION_ZERO, VERSION_INF, 'io_destroy' , 'AIO' ),
(VERSION_ZERO, VERSION_INF, 'io_getevents' , 'AIO' ),
(VERSION_ZERO, VERSION_INF, 'io_submit' , 'AIO' ),
(VERSION_ZERO, VERSION_INF, 'io_cancel' , 'AIO' ),
(VERSION_ZERO, VERSION_INF, 'io_pgetevents' , 'AIO' ),
(VERSION_ZERO, VERSION_INF, 'ioprio_get' , 'BLOCK' ),
(VERSION_ZERO, VERSION_INF, 'ioprio_set' , 'BLOCK' ),
(VERSION_ZERO, VERSION_INF, 'bpf' , 'BPF_SYSCALL' ),
(VERSION_ZERO, VERSION_INF, 'acct' , 'BSD_PROCESS_ACCT' ),
((6,5) , VERSION_INF, 'cachestat' , 'CACHESTAT_SYSCALL' ),
(VERSION_ZERO, (5,12) , 'kcmp' , 'CHECKPOINT_RESTORE' ),
(VERSION_ZERO, VERSION_INF, 'process_vm_readv' , 'CROSS_MEMORY_ATTACH'),
(VERSION_ZERO, VERSION_INF, 'process_vm_writev' , 'CROSS_MEMORY_ATTACH'),
(VERSION_ZERO, VERSION_INF, 'epoll_create' , 'EPOLL' ),
(VERSION_ZERO, VERSION_INF, 'epoll_create1' , 'EPOLL' ),
(VERSION_ZERO, VERSION_INF, 'epoll_ctl' , 'EPOLL' ),
(VERSION_ZERO, VERSION_INF, 'epoll_pwait' , 'EPOLL' ),
(VERSION_ZERO, VERSION_INF, 'epoll_pwait2' , 'EPOLL' ),
(VERSION_ZERO, VERSION_INF, 'epoll_wait' , 'EPOLL' ),
(VERSION_ZERO, VERSION_INF, 'name_to_handle_at' , 'FHANDLE' ),
(VERSION_ZERO, VERSION_INF, 'open_by_handle_at' , 'FHANDLE' ),
(VERSION_ZERO, VERSION_INF, 'fanotify_init' , 'FANOTIFY' ),
(VERSION_ZERO, VERSION_INF, 'fanotify_mark' , 'FANOTIFY' ),
(VERSION_ZERO, VERSION_INF, 'fork' , 'MMU' ),
(VERSION_ZERO, VERSION_INF, 'futex' , 'FUTEX' ),
(VERSION_ZERO, VERSION_INF, 'futex_wait' , 'FUTEX' ),
(VERSION_ZERO, VERSION_INF, 'futex_waitv' , 'FUTEX' ),
(VERSION_ZERO, VERSION_INF, 'futex_wake' , 'FUTEX' ),
(VERSION_ZERO, VERSION_INF, 'futex_requeue' , 'FUTEX' ),
(VERSION_ZERO, VERSION_INF, 'get_robust_list' , 'FUTEX' ),
(VERSION_ZERO, VERSION_INF, 'set_robust_list' , 'FUTEX' ),
(VERSION_ZERO, VERSION_INF, 'inotify_add_watch' , 'INOTIFY_USER' ),
(VERSION_ZERO, VERSION_INF, 'inotify_init' , 'INOTIFY_USER' ),
(VERSION_ZERO, VERSION_INF, 'inotify_init1' , 'INOTIFY_USER' ),
(VERSION_ZERO, VERSION_INF, 'inotify_rm_watch' , 'INOTIFY_USER' ),
(VERSION_ZERO, VERSION_INF, 'io_uring_enter' , 'IO_URING' ),
(VERSION_ZERO, VERSION_INF, 'io_uring_setup' , 'IO_URING' ),
(VERSION_ZERO, VERSION_INF, 'io_uring_register' , 'IO_URING' ),
((5,12) , VERSION_INF, 'kcmp' , 'KCMP' ),
(VERSION_ZERO, VERSION_INF, 'kexec_load' , 'KEXEC' ),
(VERSION_ZERO, VERSION_INF, 'kexec_file_load' , 'KEXEC_FILE' ),
(VERSION_ZERO, VERSION_INF, 'add_key' , 'KEYS' ),
(VERSION_ZERO, VERSION_INF, 'keyctl' , 'KEYS' ),
(VERSION_ZERO, VERSION_INF, 'request_key' , 'KEYS' ),
(VERSION_ZERO, VERSION_INF, 'membarrier' , 'MEMBARRIER' ),
((4,18) , VERSION_INF, 'memfd_create' , 'MEMFD_CREATE' ),
(VERSION_ZERO, VERSION_INF, 'mincore' , 'MMU' ),
(VERSION_ZERO, VERSION_INF, 'mlock' , 'MMU' ),
(VERSION_ZERO, VERSION_INF, 'mlock2' , 'MMU' ),
(VERSION_ZERO, VERSION_INF, 'mlockall' , 'MMU' ),
(VERSION_ZERO, VERSION_INF, 'mprotect' , 'MMU' ),
(VERSION_ZERO, VERSION_INF, 'mseal' , 'MMU' ),
(VERSION_ZERO, VERSION_INF, 'msync' , 'MMU' ),
(VERSION_ZERO, VERSION_INF, 'munlock' , 'MMU' ),
(VERSION_ZERO, VERSION_INF, 'munlockall' , 'MMU' ),
(VERSION_ZERO, VERSION_INF, 'pkey_alloc' , 'MMU' ),
(VERSION_ZERO, VERSION_INF, 'pkey_free' , 'MMU' ),
(VERSION_ZERO, VERSION_INF, 'pkey_mprotect' , 'MMU' ),
(VERSION_ZERO, VERSION_INF, 'process_mrelease' , 'MMU' ),
(VERSION_ZERO, VERSION_INF, 'remap_file_pages' , 'MMU' ), # obsolete
((4,3) , VERSION_INF, 'modify_ldt' , 'MODIFY_LDT_SYSCALL' ), # x86 only
(VERSION_ZERO, VERSION_INF, 'delete_module' , 'MODULE_UNLOAD' ),
(VERSION_ZERO, VERSION_INF, 'init_module' , 'MODULES' ),
(VERSION_ZERO, VERSION_INF, 'finit_module' , 'MODULES' ),
((4,1) , VERSION_INF, 'capget' , 'MULTIUSER' ),
((4,1) , VERSION_INF, 'capset' , 'MULTIUSER' ),
((4,1) , VERSION_INF, 'setuid' , 'MULTIUSER' ),
((4,1) , VERSION_INF, 'setgid' , 'MULTIUSER' ),
((4,1) , VERSION_INF, 'setreuid' , 'MULTIUSER' ),
((4,1) , VERSION_INF, 'setregid' , 'MULTIUSER' ),
((4,1) , VERSION_INF, 'getresuid' , 'MULTIUSER' ),
((4,1) , VERSION_INF, 'setresuid' , 'MULTIUSER' ),
((4,1) , VERSION_INF, 'getresgid' , 'MULTIUSER' ),
((4,1) , VERSION_INF, 'setresgid' , 'MULTIUSER' ),
((4,1) , VERSION_INF, 'setfsuid' , 'MULTIUSER' ),
((4,1) , VERSION_INF, 'setfsgid' , 'MULTIUSER' ),
((4,1) , VERSION_INF, 'getgroups' , 'MULTIUSER' ),
((4,1) , VERSION_INF, 'setgroups' , 'MULTIUSER' ),
(VERSION_ZERO, VERSION_INF, 'accept' , 'NET' ),
(VERSION_ZERO, VERSION_INF, 'accept4' , 'NET' ),
(VERSION_ZERO, VERSION_INF, 'bind' , 'NET' ),
(VERSION_ZERO, VERSION_INF, 'connect' , 'NET' ),
(VERSION_ZERO, VERSION_INF, 'listen' , 'NET' ),
(VERSION_ZERO, VERSION_INF, 'getpeername' , 'NET' ),
(VERSION_ZERO, VERSION_INF, 'getsockname' , 'NET' ),
(VERSION_ZERO, VERSION_INF, 'getsockopt' , 'NET' ),
(VERSION_ZERO, VERSION_INF, 'recv' , 'NET' ),
(VERSION_ZERO, VERSION_INF, 'recvfrom' , 'NET' ),
(VERSION_ZERO, VERSION_INF, 'recvmsg' , 'NET' ),
(VERSION_ZERO, VERSION_INF, 'recvmmsg' , 'NET' ),
(VERSION_ZERO, VERSION_INF, 'send' , 'NET' ),
(VERSION_ZERO, VERSION_INF, 'sendmmsg' , 'NET' ),
(VERSION_ZERO, VERSION_INF, 'sendmsg' , 'NET' ),
(VERSION_ZERO, VERSION_INF, 'sendto' , 'NET' ),
(VERSION_ZERO, VERSION_INF, 'setsockopt' , 'NET' ),
(VERSION_ZERO, VERSION_INF, 'shutdown' , 'NET' ),
(VERSION_ZERO, VERSION_INF, 'socket' , 'NET' ),
(VERSION_ZERO, VERSION_INF, 'socketcall' , 'NET' ),
(VERSION_ZERO, VERSION_INF, 'socketpair' , 'NET' ),
(VERSION_ZERO, (3, 1) , 'nfsservctl' , 'NFSD' ), # dead
(VERSION_ZERO, VERSION_INF, 'mbind' , 'NUMA' ),
(VERSION_ZERO, VERSION_INF, 'migrate_pages' , 'MIGRATION' ),
(VERSION_ZERO, VERSION_INF, 'move_pages' , 'MIGRATION' ),
(VERSION_ZERO, VERSION_INF, 'get_mempolicy' , 'NUMA' ),
(VERSION_ZERO, VERSION_INF, 'set_mempolicy' , 'NUMA' ),
(VERSION_ZERO, VERSION_INF, 'set_mempolicy_home_node', 'NUMA' ),
(VERSION_ZERO, VERSION_INF, 'pciconfig_read' , 'PCI' ),
(VERSION_ZERO, VERSION_INF, 'pciconfig_write' , 'PCI' ),
(VERSION_ZERO, VERSION_INF, 'pciconfig_iobase' , 'PCI' ),
(VERSION_ZERO, VERSION_INF, 'perf_event_open' , 'PERF_EVENTS' ),
(VERSION_ZERO, VERSION_INF, 'mq_notify' , 'POSIX_MQUEUE' ),
(VERSION_ZERO, VERSION_INF, 'mq_open' , 'POSIX_MQUEUE' ),
(VERSION_ZERO, VERSION_INF, 'mq_timedreceive' , 'POSIX_MQUEUE' ),
(VERSION_ZERO, VERSION_INF, 'mq_timedsend' , 'POSIX_MQUEUE' ),
(VERSION_ZERO, VERSION_INF, 'mq_unlink' , 'POSIX_MQUEUE' ),
(VERSION_ZERO, VERSION_INF, 'mq_getsetattr' , 'POSIX_MQUEUE' ),
(VERSION_ZERO, VERSION_INF, 'timer_create' , 'POSIX_TIMERS' ),
(VERSION_ZERO, VERSION_INF, 'timer_delete' , 'POSIX_TIMERS' ),
(VERSION_ZERO, VERSION_INF, 'timer_getoverrun' , 'POSIX_TIMERS' ),
(VERSION_ZERO, VERSION_INF, 'timer_gettime' , 'POSIX_TIMERS' ),
(VERSION_ZERO, VERSION_INF, 'timer_settime' , 'POSIX_TIMERS' ),
(VERSION_ZERO, VERSION_INF, 'rtas' , 'PPC_RTAS' ), # powerpc only
(VERSION_ZERO, VERSION_INF, 'subpage_prot' , 'PPC_SUBPAGE_PROT' ), # powerpc 64-bit only
(VERSION_ZERO, VERSION_INF, 'quotactl' , 'QUOTACTL' ),
(VERSION_ZERO, VERSION_INF, 'quotactl_fd' , 'QUOTACTL' ),
(VERSION_ZERO, VERSION_INF, 'rseq' , 'RSEQ' ),
(VERSION_ZERO, VERSION_INF, 'lsm_get_self_attr' , 'SECURITY' ),
(VERSION_ZERO, VERSION_INF, 'lsm_list_modules' , 'SECURITY' ),
(VERSION_ZERO, VERSION_INF, 'lsm_set_self_attr' , 'SECURITY' ),
(VERSION_ZERO, VERSION_INF, 'landlock_create_ruleset', 'SECURITY_LANDLOCK' ),
(VERSION_ZERO, VERSION_INF, 'landlock_add_rule' , 'SECURITY_LANDLOCK' ),
(VERSION_ZERO, VERSION_INF, 'landlock_restrict_self' , 'SECURITY_LANDLOCK' ),
(VERSION_ZERO, VERSION_INF, 'seccomp' , 'SECCOMP' ),
(VERSION_ZERO, VERSION_INF, 'memfd_secret' , 'SECRETMEM' ),
(VERSION_ZERO, VERSION_INF, 'sgetmask' , 'SGETMASK_SYSCALL' ), # obsolete
(VERSION_ZERO, VERSION_INF, 'ssetmask' , 'SGETMASK_SYSCALL' ), # obsolete
(VERSION_ZERO, VERSION_INF, 'signalfd' , 'SIGNALFD' ),
(VERSION_ZERO, VERSION_INF, 'signalfd4' , 'SIGNALFD' ),
(VERSION_ZERO, VERSION_INF, 'spu_create' , 'SPU_FS' ), # powerpc only
(VERSION_ZERO, VERSION_INF, 'spu_run' , 'SPU_FS' ), # powerpc only
(VERSION_ZERO, VERSION_INF, 'swapon' , 'SWAP' ),
(VERSION_ZERO, VERSION_INF, 'swapoff' , 'SWAP' ),
(VERSION_ZERO, (5, 5) , 'sysctl' , 'SYSCTL_SYSCALL' ), # dead since v5.9
(VERSION_ZERO, VERSION_INF, 'sysfs' , 'SYSFS_SYSCALL' ), # obsolete
(VERSION_ZERO, VERSION_INF, 'ipc' , 'SYSVIPC' ),
(VERSION_ZERO, VERSION_INF, 'msgctl' , 'SYSVIPC' ),
(VERSION_ZERO, VERSION_INF, 'msgget' , 'SYSVIPC' ),
(VERSION_ZERO, VERSION_INF, 'msgrcv' , 'SYSVIPC' ),
(VERSION_ZERO, VERSION_INF, 'msgsnd' , 'SYSVIPC' ),
(VERSION_ZERO, VERSION_INF, 'semctl' , 'SYSVIPC' ),
(VERSION_ZERO, VERSION_INF, 'semget' , 'SYSVIPC' ),
(VERSION_ZERO, VERSION_INF, 'semop' , 'SYSVIPC' ),
(VERSION_ZERO, VERSION_INF, 'semtimedop' , 'SYSVIPC' ),
(VERSION_ZERO, VERSION_INF, 'shmat' , 'SYSVIPC' ),
(VERSION_ZERO, VERSION_INF, 'shmctl' , 'SYSVIPC' ),
(VERSION_ZERO, VERSION_INF, 'shmdt' , 'SYSVIPC' ),
(VERSION_ZERO, VERSION_INF, 'shmget' , 'SYSVIPC' ),
((3,17) , (4,18) , 'memfd_create' , 'TMPFS' ),
(VERSION_ZERO, VERSION_INF, 'chown16' , 'UID16' ), # legacy
(VERSION_ZERO, VERSION_INF, 'fchown16' , 'UID16' ), # legacy
(VERSION_ZERO, VERSION_INF, 'lchown16' , 'UID16' ), # legacy
(VERSION_ZERO, VERSION_INF, 'getuid16' , 'UID16' ), # legacy
(VERSION_ZERO, VERSION_INF, 'getgid16' , 'UID16' ), # legacy
(VERSION_ZERO, VERSION_INF, 'geteuid16' , 'UID16' ), # legacy
(VERSION_ZERO, VERSION_INF, 'getegid16' , 'UID16' ), # legacy
(VERSION_ZERO, VERSION_INF, 'getresuid16' , 'UID16' ), # legacy
(VERSION_ZERO, VERSION_INF, 'getresgid16' , 'UID16' ), # legacy
(VERSION_ZERO, VERSION_INF, 'getgroups16' , 'UID16' ), # legacy
(VERSION_ZERO, VERSION_INF, 'setuid16' , 'UID16' ), # legacy
(VERSION_ZERO, VERSION_INF, 'setgid16' , 'UID16' ), # legacy
(VERSION_ZERO, VERSION_INF, 'setreuid16' , 'UID16' ), # legacy
(VERSION_ZERO, VERSION_INF, 'setregid16' , 'UID16' ), # legacy
(VERSION_ZERO, VERSION_INF, 'setfsuid16' , 'UID16' ), # legacy
(VERSION_ZERO, VERSION_INF, 'setfsgid16' , 'UID16' ), # legacy
(VERSION_ZERO, VERSION_INF, 'setresuid16' , 'UID16' ), # legacy
(VERSION_ZERO, VERSION_INF, 'setresgid16' , 'UID16' ), # legacy
(VERSION_ZERO, VERSION_INF, 'setgroups16' , 'UID16' ), # legacy
(VERSION_ZERO, VERSION_INF, 'userfaultfd' , 'USERFAULTFD' ),
(VERSION_ZERO, VERSION_INF, 'uselib' , 'USELIB' ), # obsolete (32bit only?)
(VERSION_ZERO, (4,3) , 'vm86old' , 'VM86' ), # x86 32-bit only
(VERSION_ZERO, (4,3) , 'vm86' , 'VM86' ), # x86 32-bit only
((4,3) , VERSION_INF, 'vm86old' , 'X86_LEGACY_VM86' ), # x86 32-bit only, legacy
((4,3) , VERSION_INF, 'vm86' , 'X86_LEGACY_VM86' ), # x86 32-bit only, legacy
((5,5) , VERSION_INF, 'ioperm' , 'X86_IOPL_IOPERM' ), # x86 only
((5,5) , VERSION_INF, 'iopl' , 'X86_IOPL_IOPERM' ), # x86 only
))
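# Illustration only: a minimal sketch of rule 4 above (two entries for the same
# syscall) combined with the arch-specific override mentioned in the NOTE. The
# names `deps_for_version`, `syscall_dep`, `GENERIC` and `ARCH_SPECIFIC`, the
# 'SOME_ARCH_CONFIG' value and the sentinel values are hypothetical and exist
# only for this example; the real VersionedDict and Arch classes are defined
# elsewhere and may work differently.

```python
from typing import Dict, Iterable, Optional, Tuple

Version = Tuple[float, ...]
VERSION_ZERO: Version = (0,)            # assumed sentinel
VERSION_INF: Version = (float('inf'),)  # assumed sentinel

Entry = Tuple[Version, Version, str, str]

def deps_for_version(table: Iterable[Entry], version: Version) -> Dict[str, str]:
    # Map syscall name -> config for the entries valid at this kernel version
    return {
        name: config
        for since, removed_in, name, config in table
        if since <= version < removed_in
    }

# Rows copied from the table above
GENERIC = (
    ((3, 17)     , (4, 18)    , 'memfd_create', 'TMPFS'),
    ((4, 18)     , VERSION_INF, 'memfd_create', 'MEMFD_CREATE'),
    (VERSION_ZERO, VERSION_INF, 'seccomp'     , 'SECCOMP'),
)

# Hypothetical arch-specific table, made up for this example
ARCH_SPECIFIC = (
    (VERSION_ZERO, VERSION_INF, 'seccomp', 'SOME_ARCH_CONFIG'),
)

def syscall_dep(name: str, version: Version) -> Optional[str]:
    deps = deps_for_version(GENERIC, version)
    # Arch-specific entries take precedence over the generic ones
    deps.update(deps_for_version(ARCH_SPECIFIC, version))
    return deps.get(name)

print(syscall_dep('memfd_create', (4, 0)))
print(syscall_dep('memfd_create', (5, 0)))
```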
================================================
FILE: src/systrack/kernel.py
================================================
import re
import logging
import struct
import atexit
from pathlib import Path
from time import monotonic
from os import sched_getaffinity
from operator import itemgetter, attrgetter
from collections import defaultdict, Counter
from typing import Tuple, List, Dict, Iterable, Iterator, Union, Any, Optional
from .arch import arch_from_name, arch_from_vmlinux
from .elf import ELF, Symbol, Section
from .kconfig import kconfig_edit, kconfig_check_with_deps, kconfig_debug_check
from .kconfig import kconfig_more_syscalls, kconfig_debugging
from .kconfig import kconfig_compatibility, kconfig_syscall_deps
from .location import extract_syscall_locations
from .log import log_verbosity
from .signature import extract_syscall_signatures
from .syscall import Syscall, common_syscall_symbol_prefixes
from .type_hints import KernelVersion
from .utils import ensure_command, maybe_rel, noprefix, run_command
class KernelError(RuntimeError):
pass
class KernelArchError(KernelError):
pass
class KernelELFError(KernelError):
pass
class KernelMultiABIError(KernelError):
pass
class KernelVersionError(KernelError):
pass
class KernelWithoutSymbolsError(KernelError):
pass
class Kernel:
__version = None
__version_source = None
__syscalls = None
__backup_makefile = None
__long_size = None
__long_pack_fmt = None
def __init__(self, arch_name: Optional[str] = None,
vmlinux: Optional[Path] = None, kdir: Optional[Path] = None,
outdir: Optional[Path] = None, rdir: Optional[Path] = None,
toolchain_prefix: Optional[str] = None):
if not kdir and not vmlinux:
raise ValueError('at least one of vmlinux or kdir is needed')
if arch_name is None and vmlinux is None:
raise ValueError('need vmlinux to determine arch if not supplied')
if vmlinux:
try:
self.vmlinux = ELF(vmlinux)
except ValueError as e:
raise KernelELFError(f'Bad vmlinux ELF: {e}') from e
else:
self.vmlinux = None
self.kdir = kdir
self.outdir = outdir
self.rdir = rdir
self.arch_name = arch_name
self.toolchain_prefix = toolchain_prefix
if self.vmlinux and not self.vmlinux.symbols:
raise KernelWithoutSymbolsError('Provided vmlinux ELF has no symbols')
if self.arch_name is None:
m = arch_from_vmlinux(self.vmlinux)
if m is None:
raise KernelArchError('Failed to detect kernel architecture/ABI')
arch_class, bits32, abis = m
if len(abis) > 1:
raise KernelMultiABIError('Multiple ABIs supported, need to '
'select one', arch_class, abis)
self.arch = arch_class(self.version, abis[0], bits32)
else:
self.arch = arch_from_name(self.arch_name, self.version)
if self.vmlinux:
if not self.arch.matches(self.vmlinux):
raise KernelArchError(f'Architecture {arch_name} does not '
'match provided vmlinux')
self.__long_size = (8, 4)[self.vmlinux.bits32]
self.__long_pack_fmt = '<>'[self.vmlinux.big_endian] + 'QL'[self.vmlinux.bits32]
@staticmethod
def version_from_str(s: str) -> KernelVersion:
m = re.match(r'(\d+)\.(\d+)(\.(\d+))?', s)
if not m:
return None
a, b, c = int(m.group(1)), int(m.group(2)), m.group(4)
		return (a, b, 0) if c is None else (a, b, int(c))
@staticmethod
def version_from_banner(banner: Union[str,bytes]) -> KernelVersion:
if isinstance(banner, bytes):
banner = banner.decode()
if not banner.startswith('Linux version '):
return None
return Kernel.version_from_str(banner[14:])
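# Illustration only: a standalone sketch of the version parsing done by the two
# static methods above. The function names `parse_version` and `parse_banner`
# are made up for this example, and padding a missing patch level with 0 is an
# assumption chosen so that callers can always unpack three components.

```python
import re
from typing import Optional, Tuple

def parse_version(s: str) -> Optional[Tuple[int, int, int]]:
    # Match "A.B" or "A.B.C" at the start of the string, ignoring any suffix
    m = re.match(r'(\d+)\.(\d+)(?:\.(\d+))?', s)
    if not m:
        return None
    major, minor, patch = m.groups()
    return int(major), int(minor), int(patch or 0)

def parse_banner(banner: str) -> Optional[Tuple[int, int, int]]:
    prefix = 'Linux version '
    if not banner.startswith(prefix):
        return None
    return parse_version(banner[len(prefix):])

print(parse_banner('Linux version 6.6.0-rc1 (user@host) #1 SMP'))
print(parse_version('5.12'))
```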
def __version_from_vmlinux(self) -> KernelVersion:
banner = self.vmlinux.symbols.get('linux_banner')
if banner is None:
return None
if banner.size:
banner = self.vmlinux.read_symbol(banner)
else:
banner = self.vmlinux.vaddr_read_string(banner.vaddr)
return self.version_from_banner(banner)
def __version_from_make(self) -> KernelVersion:
v = ensure_command('make kernelversion', self.kdir)
return self.version_from_str(v)
@property
def version(self) -> KernelVersion:
if self.__version is None:
if self.vmlinux:
self.__version = self.__version_from_vmlinux()
self.__version_source = 'vmlinux'
elif self.kdir:
# This could in theory be tried even if __version_from_vmlinux()
# fails... but if that fails there are probably bigger problems.
self.__version = self.__version_from_make()
self.__version_source = 'make'
if self.__version is None:
raise KernelVersionError('unable to determine kernel version')
return self.__version
@property
def version_str(self) -> str:
return '.'.join(map(str, self.version)) + f' (from {self.__version_source})'
@property
def version_tag(self) -> str:
a, b, c = self.version
if c == 0:
return f'v{a}.{b}'
return f'v{a}.{b}.{c}'
@property
def version_source(self) -> str:
if self.__version_source or self.version:
return self.__version_source
return None
@property
def can_extract_location_info(self):
return self.vmlinux.has_debug_info
@property
def can_extract_signature_info(self):
return (
'__start_syscalls_metadata' in self.vmlinux.symbols
or self.vmlinux.has_debug_info
)
@property
def syscalls(self) -> List[Syscall]:
if self.__syscalls is None:
self.__syscalls = self.__extract_syscalls()
return self.__syscalls
def __rel(self, path: Path) -> Path:
return maybe_rel(path, self.kdir)
def __unpack_long(self, vaddr: int) -> int:
return struct.unpack(self.__long_pack_fmt, self.vmlinux.vaddr_read(vaddr, self.__long_size))[0]
def __iter_unpack_vmlinux(self, fmt: str, off: int, size: int = None) -> Iterator[Tuple[Any, ...]]:
f = self.vmlinux.file
assert f.seek(off) == off
if size is None:
chunk_size = struct.calcsize(fmt)
while 1:
yield struct.unpack(fmt, f.read(chunk_size))
else:
yield from struct.iter_unpack(fmt, f.read(size))
def __iter_unpack_vmlinux_long(self, off: int, size: int = None) -> Iterator[int]:
yield from map(itemgetter(0), self.__iter_unpack_vmlinux(self.__long_pack_fmt, off, size))
def __unpack_syscall_table(self, tbl: Symbol, target_section: Section) -> List[int]:
tbl_file_off = self.vmlinux.vaddr_to_file_offset(tbl.vaddr)
# This is the section we would like the function pointers to point to,
# we'll warn or halt in case we find fptrs pointing outside
vstart = target_section.vaddr
vend = vstart + target_section.size
if tbl.size > 0x80:
logging.info('Syscall table (%s) is %d bytes, %d entries', tbl.name,
tbl.size, tbl.size // self.__long_size)
vaddrs = list(self.__iter_unpack_vmlinux_long(tbl_file_off, tbl.size))
# Sanity check: ensure all vaddrs are within the target section
for idx, vaddr in enumerate(vaddrs):
if not (vstart <= vaddr < vend):
					logging.warning('Virtual address 0x%x (%s[%d]) is outside %s: '
						'something is off!', vaddr, tbl.name, idx, target_section.name)
else:
# Apparently on some archs (e.g. MIPS, PPC) the syscall table symbol
# can have size 0. In this case we'll just warn the user and keep
# extracting vaddrs as long as they are valid, stopping at the first
# invalid one or at the next symbol we encounter.
			logging.warning('Syscall table (%s) has bad size (%d), doing my '
				'best to figure out when to stop', tbl.name, tbl.size)
cur_idx_vaddr = tbl.vaddr
boundary = self.vmlinux.next_symbol(tbl)
boundary = boundary.vaddr if boundary else float('inf')
vaddrs = []
for vaddr in self.__iter_unpack_vmlinux_long(tbl_file_off):
# Stop at the first vaddr pointing outside target_section
if not (vstart <= vaddr < vend):
break
# Stop if we collide with another symbol right after the syscall
# table (may be another syscall table e.g. the compat one)
if cur_idx_vaddr >= boundary:
break
vaddrs.append(vaddr)
cur_idx_vaddr += self.__long_size
logging.info('Syscall table seems to be %d bytes, %d entries',
cur_idx_vaddr - tbl.vaddr, len(vaddrs))
return vaddrs
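# Illustration only: a standalone sketch of the two ideas used by the method
# above, namely decoding a table of pointer-sized integers with `struct` and
# stopping at the first address outside the target section when the table size
# is unknown. The function names `unpack_longs` and `take_valid_prefix` are
# made up for this example and operate on raw bytes rather than on an ELF file.

```python
import struct
from typing import Iterable, List

def unpack_longs(data: bytes, bits32: bool, big_endian: bool) -> List[int]:
    # '<' is little endian, '>' is big endian; with an explicit byte order
    # 'L' is always 4 bytes and 'Q' is always 8 bytes
    fmt = ('>' if big_endian else '<') + ('L' if bits32 else 'Q')
    return [value for (value,) in struct.iter_unpack(fmt, data)]

def take_valid_prefix(vaddrs: Iterable[int], vstart: int, vend: int) -> List[int]:
    # Keep addresses until the first one falling outside [vstart, vend)
    res = []
    for vaddr in vaddrs:
        if not (vstart <= vaddr < vend):
            break
        res.append(vaddr)
    return res

# Fake 64-bit little-endian table: two plausible pointers, then garbage
raw = struct.pack('<3Q', 0x1000, 0x1008, 0xdead0000)
table = unpack_longs(raw, bits32=False, big_endian=False)
print([hex(v) for v in take_valid_prefix(table, 0x1000, 0x2000)])
```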
def __syscall_vaddrs_from_syscall_table(self) -> Dict[int,int]:
tbl = self.vmlinux.symbols.get(self.arch.syscall_table_name)
if not tbl:
logging.critical('Unable to find %s symbol!',
self.arch.syscall_table_name)
return {}
logging.debug('Syscall table: %r', tbl)
# Read and parse the syscall table unpacking all virtual addresses it
# contains. Depending on arch, we might need to parse function
# descriptors for the function pointers in the syscall table.
text = self.vmlinux.sections['.text']
vaddrs = {}
if self.arch.uses_function_descriptors:
text_vstart = text.vaddr
text_vend = text_vstart + text.size
# Even if this arch uses function descriptors, we don't know if they
# are effectively used for function pointers in the syscall table.
# This needs to be tested, and in case they aren't used, we can
# fallback to "normal" parsing instead.
if not (text_vstart <= self.__unpack_long(tbl.vaddr) < text_vend):
logging.debug('Syscall table uses function descriptors')
opd = self.vmlinux.sections.get('.opd')
if not opd:
logging.critical('Arch uses function descriptors, but '
'vmlinux h
SYMBOL INDEX (222 symbols across 25 files)
FILE: src/systrack/__main__.py
function sigint_handler (line 19) | def sigint_handler(_, __):
function wrap_help (line 23) | def wrap_help(body: str) -> str:
function parse_args (line 30) | def parse_args() -> argparse.Namespace:
function instantiate_kernel (line 90) | def instantiate_kernel(*a, **kwa) -> Kernel:
function main (line 121) | def main() -> int:
FILE: src/systrack/arch/__init__.py
function arch_from_name (line 127) | def arch_from_name(name: str, kernel_version: KernelVersion) -> Arch:
function arch_from_vmlinux (line 133) | def arch_from_vmlinux(vmlinux: ELF) -> Optional[Tuple[Type[Arch],bool,Li...
FILE: src/systrack/arch/arch_base.py
class Arch (line 11) | class Arch(ABC):
method __init__ (line 64) | def __init__(self, kernel_version: KernelVersion, abi: str, bits32: bo...
method __repr__ (line 69) | def __repr__(s):
method match (line 75) | def match(vmlinux: ELF) -> Optional[Tuple[bool,List[str]]]:
method matches (line 84) | def matches(self, vmlinux: ELF) -> bool:
method adjust_abi (line 92) | def adjust_abi(self, vmlinux: ELF):
method _preferred_symbol (line 99) | def _preferred_symbol(self, a: Symbol, b: Symbol) -> Optional[Symbol]:
method preferred_symbol (line 106) | def preferred_symbol(self, a: Symbol, b: Symbol) -> Symbol:
method symbol_is_ni_syscall (line 139) | def symbol_is_ni_syscall(self, sym: Symbol) -> bool:
method skip_syscall (line 167) | def skip_syscall(self, sc: Syscall) -> bool:
method _translate_syscall_symbol_name (line 180) | def _translate_syscall_symbol_name(self, sym_name: str) -> str:
method translate_syscall_symbol_name (line 186) | def translate_syscall_symbol_name(self, sym_name: str) -> str:
method _normalize_syscall_name (line 200) | def _normalize_syscall_name(self, name: str) -> str:
method normalize_syscall_name (line 210) | def normalize_syscall_name(self, name: str) -> str:
method _dummy_syscall_code (line 259) | def _dummy_syscall_code(self, sc: Syscall, vmlinux: ELF) -> Optional[b...
method is_dummy_syscall (line 268) | def is_dummy_syscall(self, sc: Syscall, vmlinux: ELF,
method adjust_syscall_number (line 300) | def adjust_syscall_number(self, number: int) -> int:
method have_syscall_table (line 307) | def have_syscall_table(self) -> bool:
method extract_syscall_vaddrs (line 312) | def extract_syscall_vaddrs(self, vmlinux: ELF) -> Dict[int,int]:
method extract_esoteric_syscalls (line 320) | def extract_esoteric_syscalls(self, vmlinux: ELF) -> List[EsotericSysc...
method syscall_def_regexp (line 333) | def syscall_def_regexp(self, syscall_name: Optional[str]=None) -> Opti...
FILE: src/systrack/arch/arm.py
class ArchArm (line 11) | class ArchArm(Arch):
method __init__ (line 25) | def __init__(self, kernel_version: KernelVersion, abi: str, bits32: bo...
method match (line 59) | def match(vmlinux: ELF) -> Optional[Tuple[bool,List[str]]]:
method matches (line 76) | def matches(self, vmlinux: ELF) -> bool:
method adjust_abi (line 79) | def adjust_abi(self, vmlinux: ELF):
method _translate_syscall_symbol_name (line 92) | def _translate_syscall_symbol_name(self, sym_name: str) -> str:
method _normalize_syscall_name (line 99) | def _normalize_syscall_name(self, name: str) -> str:
method _dummy_syscall_code (line 106) | def _dummy_syscall_code(self, sc: Syscall, vmlinux: ELF) -> Optional[b...
method extract_esoteric_syscalls (line 123) | def extract_esoteric_syscalls(self, vmlinux: ELF) -> List[EsotericSysc...
method syscall_def_regexp (line 146) | def syscall_def_regexp(self, syscall_name: Optional[str]=None) -> Opti...
FILE: src/systrack/arch/arm64.py
class ArchArm64 (line 11) | class ArchArm64(Arch):
method __init__ (line 41) | def __init__(self, kernel_version: KernelVersion, abi: str, bits32: bo...
method match (line 54) | def match(vmlinux: ELF) -> Optional[Tuple[bool,List[str]]]:
method matches (line 67) | def matches(self, vmlinux: ELF) -> bool:
method _preferred_symbol (line 70) | def _preferred_symbol(self, a: Symbol, b: Symbol) -> Optional[Symbol]:
method _normalize_syscall_name (line 78) | def _normalize_syscall_name(self, name: str) -> str:
method _dummy_syscall_code (line 83) | def _dummy_syscall_code(self, sc: Syscall, vmlinux: ELF) -> Optional[b...
FILE: src/systrack/arch/mips.py
class ArchMips (line 11) | class ArchMips(Arch):
method __init__ (line 23) | def __init__(self, kernel_version: KernelVersion, abi: str, bits32: bo...
method match (line 88) | def match(vmlinux: ELF) -> Optional[Tuple[bool,List[str]]]:
method matches (line 104) | def matches(self, vmlinux: ELF) -> bool:
method _normalize_syscall_name (line 110) | def _normalize_syscall_name(self, name: str) -> str:
method _dummy_syscall_code (line 116) | def _dummy_syscall_code(self, sc: Syscall, vmlinux: ELF) -> Optional[b...
method syscall_def_regexp (line 146) | def syscall_def_regexp(self, syscall_name: Optional[str]=None) -> Opti...
FILE: src/systrack/arch/powerpc.py
class ArchPowerPC (line 13) | class ArchPowerPC(Arch):
method __init__ (line 57) | def __init__(self, kernel_version: KernelVersion, abi: str, bits32: bo...
method match (line 140) | def match(vmlinux: ELF) -> Optional[Tuple[bool,List[str]]]:
method matches (line 168) | def matches(self, vmlinux: ELF) -> bool:
method _preferred_symbol (line 176) | def _preferred_symbol(self, a: Symbol, b: Symbol) -> Optional[Symbol]:
method skip_syscall (line 191) | def skip_syscall(self, sc: Syscall) -> bool:
method _translate_syscall_symbol_name (line 203) | def _translate_syscall_symbol_name(self, sym_name: str) -> str:
method _normalize_syscall_name (line 206) | def _normalize_syscall_name(self, name: str) -> str:
method _dummy_syscall_code (line 209) | def _dummy_syscall_code(self, sc: Syscall, vmlinux: ELF) -> Optional[b...
method adjust_syscall_number (line 267) | def adjust_syscall_number(self, number: int) -> int:
method extract_esoteric_syscalls (line 274) | def extract_esoteric_syscalls(self, vmlinux: ELF) -> List[EsotericSysc...
method syscall_def_regexp (line 343) | def syscall_def_regexp(self, syscall_name: Optional[str]=None) -> Opti...
FILE: src/systrack/arch/riscv.py
class ArchRiscV (line 10) | class ArchRiscV(Arch):
method __init__ (line 24) | def __init__(self, kernel_version: KernelVersion, abi: str, bits32: bo...
method match (line 66) | def match(vmlinux: ELF) -> Optional[Tuple[bool,List[str]]]:
method matches (line 80) | def matches(self, vmlinux: ELF) -> bool:
method _preferred_symbol (line 86) | def _preferred_symbol(self, a: Symbol, b: Symbol) -> Optional[Symbol]:
FILE: src/systrack/arch/s390.py
class ArchS390 (line 12) | class ArchS390(Arch):
method __init__ (line 36) | def __init__(self, kernel_version: KernelVersion, abi: str, bits32: bo...
method match (line 47) | def match(vmlinux: ELF) -> Optional[Tuple[bool,List[str]]]:
method matches (line 60) | def matches(self, vmlinux: ELF) -> bool:
method _preferred_symbol (line 63) | def _preferred_symbol(self, a: Symbol, b: Symbol) -> Optional[Symbol]:
method _translate_syscall_symbol_name (line 71) | def _translate_syscall_symbol_name(self, sym_name: str) -> str:
method _normalize_syscall_name (line 79) | def _normalize_syscall_name(self, name: str) -> str:
method have_syscall_table (line 97) | def have_syscall_table(self) -> bool:
method extract_syscall_vaddrs (line 102) | def extract_syscall_vaddrs(self, vmlinux: ELF) -> Dict[int, int]:
method syscall_def_regexp (line 127) | def syscall_def_regexp(self, syscall_name: Optional[str]=None) -> Opti...
FILE: src/systrack/arch/x86.py
class ArchX86 (line 18) | class ArchX86(Arch):
method __init__ (line 58) | def __init__(self, kernel_version: KernelVersion, abi: str, bits32: bo...
method match (line 145) | def match(vmlinux: ELF) -> Optional[Tuple[bool,List[str]]]:
method matches (line 175) | def matches(self, vmlinux: ELF) -> bool:
method adjust_abi (line 181) | def adjust_abi(self, vmlinux: ELF):
method _preferred_symbol (line 195) | def _preferred_symbol(self, a: Symbol, b: Symbol) -> Optional[Symbol]:
method skip_syscall (line 220) | def skip_syscall(self, sc: Syscall) -> bool:
method _translate_syscall_symbol_name (line 282) | def _translate_syscall_symbol_name(self, sym_name: str) -> str:
method _normalize_syscall_name (line 300) | def _normalize_syscall_name(self, name: str) -> str:
method _dummy_syscall_code (line 304) | def _dummy_syscall_code(self, sc: Syscall, vmlinux: ELF) -> Optional[b...
method __emulate_syscall_switch (line 341) | def __emulate_syscall_switch(self, func: Symbol, func_code: bytes) -> ...
method extract_syscall_vaddrs (line 522) | def extract_syscall_vaddrs(self, vmlinux: ELF) -> Dict[int,int]:
method syscall_def_regexp (line 601) | def syscall_def_regexp(self, syscall_name: Optional[str]=None) -> Opti...
FILE: src/systrack/elf.py
class E_MACHINE (line 14) | class E_MACHINE(IntEnum):
class E_FLAGS (line 26) | class E_FLAGS(IntEnum):
class Symbol (line 36) | class Symbol(_Symbol):
method __repr__ (line 39) | def __repr__(s):
class ELF (line 45) | class ELF:
method __init__ (line 51) | def __init__(self, path: Union[str,Path]):
method sections (line 86) | def sections(self) -> Dict[str,Section]:
method symbols (line 103) | def symbols(self) -> Dict[str, Symbol]:
method functions (line 109) | def functions(self) -> Dict[str, Symbol]:
method has_debug_info (line 115) | def has_debug_info(self) -> bool:
method __extract_symbols (line 118) | def __extract_symbols(self):
method vaddr_to_file_offset (line 147) | def vaddr_to_file_offset(self, vaddr: int) -> int:
method vaddr_read_string (line 153) | def vaddr_read_string(self, vaddr: int) -> str:
method vaddr_read (line 162) | def vaddr_read(self, vaddr: int, size: int) -> bytes:
method read_symbol (line 167) | def read_symbol(self, sym: Union[str,Symbol]) -> bytes:
method next_symbol (line 174) | def next_symbol(self, sym: Symbol) -> Optional[Symbol]:
FILE: src/systrack/kconfig.py
function kconfig_debugging (line 19) | def kconfig_debugging(kernel_version: KernelVersion) -> List[str]:
function kconfig_compatibility (line 22) | def kconfig_compatibility(kernel_version: KernelVersion) -> List[str]:
function kconfig_more_syscalls (line 25) | def kconfig_more_syscalls(kernel_version: KernelVersion) -> Dict[str,Lis...
function kconfig_syscall_deps (line 28) | def kconfig_syscall_deps(syscall_name: str, kernel_version: KernelVersio...
function run_config_script (line 33) | def run_config_script(kdir: Path, config_file: Path, args: List[str]):
class Kconfig (line 36) | class Kconfig:
method __init__ (line 42) | def __init__(self, file: Path, kdir: Path):
method get (line 63) | def get(self, name: str) -> Optional[str]:
method check (line 78) | def check(self, name: str, wanted: str) -> bool:
method human_readable (line 85) | def human_readable(self, name: str) -> str:
function kconfig_edit (line 94) | def kconfig_edit(config_file: Path, kdir: Path, options: Iterable[str]):
function kconfig_check_with_deps (line 115) | def kconfig_check_with_deps(config_file: Path, kdir: Path, options: Dict...
function kconfig_debug_check (line 160) | def kconfig_debug_check(config_file: Path, kdir: Path, options: Iterable...
FILE: src/systrack/kernel.py
class KernelError (line 25) | class KernelError(RuntimeError):
class KernelArchError (line 28) | class KernelArchError(KernelError):
class KernelELFError (line 31) | class KernelELFError(KernelError):
class KernelMultiABIError (line 34) | class KernelMultiABIError(KernelError):
class KernelVersionError (line 37) | class KernelVersionError(KernelError):
class KernelWithoutSymbolsError (line 40) | class KernelWithoutSymbolsError(KernelError):
class Kernel (line 44) | class Kernel:
method __init__ (line 52) | def __init__(self, arch_name: Optional[str] = None,
method version_from_str (line 101) | def version_from_str(s: str) -> KernelVersion:
method version_from_banner (line 110) | def version_from_banner(banner: Union[str,bytes]) -> KernelVersion:
method __version_from_vmlinux (line 118) | def __version_from_vmlinux(self) -> KernelVersion:
method __version_from_make (line 130) | def __version_from_make(self) -> KernelVersion:
method version (line 135) | def version(self) -> KernelVersion:
method version_str (line 151) | def version_str(self) -> str:
method version_tag (line 155) | def version_tag(self) -> str:
method version_source (line 162) | def version_source(self) -> str:
method can_extract_location_info (line 168) | def can_extract_location_info(self):
method can_extract_signature_info (line 172) | def can_extract_signature_info(self):
method syscalls (line 179) | def syscalls(self) -> List[Syscall]:
method __rel (line 184) | def __rel(self, path: Path) -> Path:
method __unpack_long (line 187) | def __unpack_long(self, vaddr: int) -> int:
method __iter_unpack_vmlinux (line 190) | def __iter_unpack_vmlinux(self, fmt: str, off: int, size: int = None) ...
method __iter_unpack_vmlinux_long (line 201) | def __iter_unpack_vmlinux_long(self, off: int, size: int = None) -> It...
method __unpack_syscall_table (line 204) | def __unpack_syscall_table(self, tbl: Symbol, target_section: Section)...
method __syscall_vaddrs_from_syscall_table (line 254) | def __syscall_vaddrs_from_syscall_table(self) -> Dict[int,int]:
method __extract_syscalls (line 314) | def __extract_syscalls(self) -> List[Syscall]:
method __try_set_optimization_level (line 602) | def __try_set_optimization_level(self, lvl: int) -> bool:
method __restore_makefile (line 619) | def __restore_makefile(self):
method __edit_config (line 628) | def __edit_config(self, options: Iterable[str]):
method __edit_config_with_deps (line 640) | def __edit_config_with_deps(self, options: Dict[str,List[str]]):
method make (line 650) | def make(self, target: str, stdin=None, ensure=True) -> int:
method sync_config (line 677) | def sync_config(self):
method clean (line 687) | def clean(self):
method configure (line 691) | def configure(self):
method build (line 712) | def build(self, try_disable_opt: bool = False) -> float:
FILE: src/systrack/location.py
function addr2line (line 14) | def addr2line(elf: Path, addrs: Iterable[int]) -> Iterator[Tuple[Optiona...
function smart_addr2line (line 25) | def smart_addr2line(elf: Path, addrs: Iterable[int], srcdir: Path = None...
function grep_file (line 49) | def grep_file(root: Path, exp: re.Pattern, file: Path) -> Iterator[str]:
function grep_recursive (line 58) | def grep_recursive(root: Path, exp: re.Pattern, exclude: Set[str],
function grep_kernel_sources (line 72) | def grep_kernel_sources(kdir: Path, arch: Arch, syscalls: List[Syscall])...
function good_definition (line 145) | def good_definition(arch: Arch, definition: str, syscall_name: str) -> b...
function good_location (line 166) | def good_location(file: Path, line: int, arch: Arch, sc: Syscall) -> bool:
function adjust_line (line 176) | def adjust_line(file: Path, line: int, sc: Syscall) -> int:
function extract_syscall_locations (line 237) | def extract_syscall_locations(syscalls: List[Syscall], vmlinux: ELF, arc...
FILE: src/systrack/log.py
function log_setup (line 12) | def log_setup(quietness: int, verbosity: int, colors: bool = True):
function log_verbosity (line 60) | def log_verbosity() -> bool:
function eprint (line 65) | def eprint(*a, **kwa):
FILE: src/systrack/output.py
class SyscallJSONEncoder (line 14) | class SyscallJSONEncoder(JSONEncoder):
method default (line 15) | def default(self, o):
function output_syscalls_text (line 30) | def output_syscalls_text(syscalls: Iterable[Syscall], spacing: int = 2):
function output_syscalls_json (line 76) | def output_syscalls_json(kernel: Kernel):
function output_syscalls_html (line 102) | def output_syscalls_html(kernel: Kernel):
function output_syscalls (line 129) | def output_syscalls(kernel: Kernel, fmt: str):
FILE: src/systrack/signature.py
function expand_macros (line 12) | def expand_macros(sig: Iterable[str], big_endian: bool) -> Iterator[str]:
function parse_signature (line 28) | def parse_signature(sig: str, big_endian: bool) -> Tuple[str, ...]:
function syscall_signature_from_source (line 41) | def syscall_signature_from_source(file: Path, line: int, big_endian: boo...
function extract_syscall_signatures (line 99) | def extract_syscall_signatures(syscalls: List[Syscall], vmlinux: ELF, ha...
FILE: src/systrack/syscall.py
class Syscall (line 7) | class Syscall:
method __init__ (line 18) | def __init__(self, index: int, number: int, name: str, origname: str,
method __repr__ (line 34) | def __repr__(s):
function common_syscall_symbol_prefixes (line 43) | def common_syscall_symbol_prefixes(names: List[str], threshold: int) -> ...
FILE: src/systrack/templates/syscall_table.js
function sortTable (line 3) | function sortTable(e) {
function highlightRow (line 36) | function highlightRow(e) {
FILE: src/systrack/utils.py
class VersionedDict (line 18) | class VersionedDict:
method __init__ (line 26) | def __init__(self, iterable: Optional[Iterable[Tuple[Hashable,Hashable...
method __getitem__ (line 41) | def __getitem__(self, version: Hashable) -> dict:
method _getversion (line 48) | def _getversion(self, version: Hashable) -> dict:
method add (line 57) | def add(self, vstart: Hashable, vend: Hashable, key: Hashable, value: ...
class VersionedList (line 69) | class VersionedList:
method __init__ (line 76) | def __init__(self, iterable: Optional[Iterable[Tuple[Hashable,Hashable...
method __getitem__ (line 91) | def __getitem__(self, version: Hashable) -> list:
method _getversion (line 98) | def _getversion(self, version: Hashable) -> list:
method add (line 107) | def add(self, vstart: Hashable, vend: Hashable, values: Iterable[Any]):
function maybe_rel (line 119) | def maybe_rel(path: Path, root: Path) -> Path:
function anyprefix (line 125) | def anyprefix(s: str, *pxs: str) -> bool:
function anysuffix (line 130) | def anysuffix(s: str, *sxs: str) -> bool:
function noprefix (line 135) | def noprefix(s: str, *pxs: str) -> str:
function nosuffix (line 145) | def nosuffix(s: str, *sxs: str) -> str:
function do_popen (line 155) | def do_popen(cmd: Union[AnyStr,Iterable[AnyStr]], cwd: Union[AnyStr,Path...
function command_argv_to_string (line 175) | def command_argv_to_string(cmd: Union[AnyStrOrPath,Iterable[AnyStrOrPath...
function run_command (line 197) | def run_command(cmd: Union[AnyStrOrPath,Iterable[AnyStrOrPath]],
function ensure_command (line 220) | def ensure_command(cmd: Union[AnyStrOrPath,Iterable[AnyStrOrPath]],
function command_available (line 261) | def command_available(name: AnyStr) -> bool:
function gcc_version (line 267) | def gcc_version(gcc_cmd: AnyStr) -> str:
function git_checkout (line 273) | def git_checkout(repo_dir: Union[AnyStr,Path], ref: AnyStr):
function format_duration (line 279) | def format_duration(s: float) -> str:
FILE: tests/test_arch_sanity.py
function test_arch_subclass_method_overrides (line 6) | def test_arch_subclass_method_overrides():
FILE: tests/test_mips.py
function test_dummy_syscall_64 (line 6) | def test_dummy_syscall_64():
FILE: tests/test_powerpc.py
function test_dummy_syscall_simple (line 6) | def test_dummy_syscall_simple():
function test_dummy_syscall_64 (line 33) | def test_dummy_syscall_64():
function test_dummy_syscall_32 (line 53) | def test_dummy_syscall_32():
function test_esoteric_fast_endian_switch_simple (line 68) | def test_esoteric_fast_endian_switch_simple():
function test_esoteric_fast_endian_switch_real (line 99) | def test_esoteric_fast_endian_switch_real():
FILE: tests/test_x86.py
function test_x86_no_table_extract_syscall_vaddrs (line 7) | def test_x86_no_table_extract_syscall_vaddrs():
FILE: tests/utils.py
class MockELF (line 9) | class MockELF:
method __init__ (line 13) | def __init__(self, big_endian: bool, symbols_with_code: Dict[Symbol,by...
method next_symbol (line 21) | def next_symbol(self, sym: Symbol) -> Union[Symbol,None]:
method vaddr_read (line 24) | def vaddr_read(self, vaddr: int, size: int) -> bytes:
method read_symbol (line 32) | def read_symbol(self, sym: Union[str,Symbol]) -> bytes:
function arch_is_dummy_syscall (line 38) | def arch_is_dummy_syscall(arch: Arch, big_endian: bool, code: bytes) ->...
function make_test_elf (line 44) | def make_test_elf(name: str) -> Path:
Condensed preview: 46 files, each showing path, character count, and a content snippet (full structured content: 563K chars).
[
{
"path": ".editorconfig",
"chars": 220,
"preview": "root = true\n\n[*]\ncharset = utf-8\nindent_style = tab\nindent_size = 4\nend_of_line = lf\ninsert_final_newline = true\ntrim_tr"
},
{
"path": ".gitattributes",
"chars": 132,
"preview": "# Exclude assembly from linguist code stats (prevents GitHub from marking the\n# repository as >50% assembly).\n*.s lingui"
},
{
"path": ".github/workflows/publish.yml",
"chars": 874,
"preview": "name: Publish to PyPI\n\non:\n release:\n types:\n - published\n\n# Allow only one concurrent job\nconcurrency:\n group"
},
{
"path": ".github/workflows/test.yml",
"chars": 567,
"preview": "name: Test\n\non:\n push:\n branches:\n - main\n - dev\n workflow_call:\n\njobs:\n test:\n runs-on: ubuntu-22.04"
},
{
"path": ".gitignore",
"chars": 49,
"preview": "dist\nsystrack.egg-info\n__pycache__\n.pytest_cache\n"
},
{
"path": "CHANGELOG.md",
"chars": 8675,
"preview": "Systrack changelog\n==================\n\n\nv0.8\n----\n\nNew arch support: IBM Z-Architecture S390 64-bit and compat 32-bit, t"
},
{
"path": "LICENSE",
"chars": 35148,
"preview": " GNU GENERAL PUBLIC LICENSE\n Version 3, 29 June 2007\n\n Copyright (C) 2007 Free "
},
{
"path": "README.md",
"chars": 12540,
"preview": "Systrack\n========\n\n[![License][license-badge]](./LICENSE)\n[![GitHub actions workflow status][actions-badge]][actions-lin"
},
{
"path": "pyproject.toml",
"chars": 2038,
"preview": "[project]\nname = 'systrack'\ndescription = 'Linux kernel syscall implementation tracker'\nauthors = [{name = 'Marco Bonell"
},
{
"path": "src/systrack/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "src/systrack/__main__.py",
"chars": 10151,
"preview": "import argparse\nimport logging\nimport os\nimport signal\nimport sys\n\nfrom pathlib import Path\nfrom textwrap import TextWra"
},
{
"path": "src/systrack/arch/__init__.py",
"chars": 7253,
"preview": "import logging\nfrom typing import Optional, Type, Tuple, List\n\nfrom ..elf import ELF\nfrom ..type_hints import KernelVers"
},
{
"path": "src/systrack/arch/arch_base.py",
"chars": 13562,
"preview": "import logging\n\nfrom abc import ABC, abstractmethod\nfrom typing import Tuple, List, Dict, Optional, final\n\nfrom ..elf im"
},
{
"path": "src/systrack/arch/arm.py",
"chars": 5729,
"preview": "from typing import Tuple, List, Optional\n\nfrom ..elf import ELF, E_MACHINE, E_FLAGS\nfrom ..kconfig_options import VERSIO"
},
{
"path": "src/systrack/arch/arm64.py",
"chars": 3627,
"preview": "from typing import Tuple, List, Optional\n\nfrom ..elf import Symbol, ELF, E_MACHINE\nfrom ..kconfig_options import VERSION"
},
{
"path": "src/systrack/arch/mips.py",
"chars": 5768,
"preview": "from typing import Tuple, List, Optional\n\nfrom ..elf import ELF, E_MACHINE\nfrom ..kconfig_options import VERSION_ZERO, V"
},
{
"path": "src/systrack/arch/powerpc.py",
"chars": 13603,
"preview": "from struct import iter_unpack\nfrom typing import Tuple, List, Optional\nfrom operator import itemgetter\n\nfrom ..elf impo"
},
{
"path": "src/systrack/arch/riscv.py",
"chars": 2889,
"preview": "from typing import Tuple, List, Optional\n\nfrom ..elf import Symbol, ELF, E_MACHINE\nfrom ..kconfig_options import VERSION"
},
{
"path": "src/systrack/arch/s390.py",
"chars": 4662,
"preview": "import re\nimport struct\nfrom typing import Tuple, List, Optional, Dict\n\nfrom ..elf import Symbol, ELF, E_MACHINE\nfrom .."
},
{
"path": "src/systrack/arch/x86.py",
"chars": 23075,
"preview": "import logging\nfrom collections import defaultdict\nfrom operator import itemgetter\nfrom typing import Tuple, List, Dict,"
},
{
"path": "src/systrack/elf.py",
"chars": 5375,
"preview": "import re\n\nfrom enum import IntEnum\nfrom functools import lru_cache\nfrom pathlib import Path\nfrom struct import unpack\nf"
},
{
"path": "src/systrack/kconfig.py",
"chars": 5565,
"preview": "#\n# Automatic kernel Kconfig configuration.\n#\n# This module contains utility functions to edit configuration options thr"
},
{
"path": "src/systrack/kconfig_options.py",
"chars": 26309,
"preview": "#\n# Kernels built by Systrack need to be configured with debug information (for\n# file/line info) and with the most comp"
},
{
"path": "src/systrack/kernel.py",
"chars": 24599,
"preview": "import re\nimport logging\nimport struct\nimport atexit\nfrom pathlib import Path\nfrom time import monotonic\nfrom os import "
},
{
"path": "src/systrack/location.py",
"chars": 16954,
"preview": "import logging\nimport re\nimport sys\n\nfrom operator import attrgetter\nfrom pathlib import Path\nfrom typing import Tuple, "
},
{
"path": "src/systrack/log.py",
"chars": 1942,
"preview": "import logging\nimport sys\n\n\n__all__ = ['log_setup', 'log_verbosity', 'eprint']\n\nSETUP_DONE = False\nVERBOSITY = 0\nSILENT_"
},
{
"path": "src/systrack/output.py",
"chars": 3853,
"preview": "import sys\n\nfrom itertools import starmap\nfrom json import JSONEncoder, dump\nfrom pathlib import Path\nfrom typing import"
},
{
"path": "src/systrack/signature.py",
"chars": 6505,
"preview": "import logging\n\nfrom operator import itemgetter\nfrom pathlib import Path\nfrom struct import unpack, iter_unpack\nfrom typ"
},
{
"path": "src/systrack/syscall.py",
"chars": 2083,
"preview": "from collections import Counter\nfrom pathlib import Path\nfrom typing import List\n\nfrom .elf import Symbol\n\nclass Syscall"
},
{
"path": "src/systrack/templates/syscall_table.css",
"chars": 1802,
"preview": ":root {\n\t--main-bg: white;\n\t--main-fg: black;\n\t--table-fg: black;\n\t--table-bg: white;\n\t--table-head-bg: #d7efff;\n\t--tabl"
},
{
"path": "src/systrack/templates/syscall_table.html",
"chars": 3035,
"preview": "<!DOCTYPE html>\n<html lang=\"en\">\n\t<head>\n\t\t<title>Linux {{kernel_version_tag}} {{arch}} {{bits}}-bit, {{'compat ' if com"
},
{
"path": "src/systrack/templates/syscall_table.js",
"chars": 1273,
"preview": "const table = document.getElementsByTagName('table')[0]\n\nfunction sortTable(e) {\n\tconst header = e.target\n\tconst idx "
},
{
"path": "src/systrack/type_hints.py",
"chars": 189,
"preview": "from typing import Union, Tuple, List, Optional\n\nKernelVersion = Union[Tuple[int],Tuple[int,int],Tuple[int,int,int]]\nEso"
},
{
"path": "src/systrack/utils.py",
"chars": 9560,
"preview": "import sys\nimport logging\n\nfrom collections import defaultdict\nfrom pathlib import Path\nfrom shlex import join as shlex_"
},
{
"path": "src/systrack/version.py",
"chars": 241,
"preview": "VERSION = '0.8'\nVERSION_COPY = '''\\\nCopyright (C) 2023-2025 Marco Bonelli\nLicensed under the GNU General Public License "
},
{
"path": "tests/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "tests/data/.gitignore",
"chars": 29,
"preview": "*\n!.gitignore\n!Makefile\n!*.s\n"
},
{
"path": "tests/data/Makefile",
"chars": 324,
"preview": "ASMS = $(wildcard *.s)\nBINS = $(ASMS:.s=)\n\n.PHONY: all clean\nall: $(BINS)\n\n# Need to link because GNU AS generates reloc"
},
{
"path": "tests/data/x86_no_table_syscall_handlers.s",
"chars": 268163,
"preview": ".section .text\n\n.globl x64_sys_call\n.type x64_sys_call @function\nx64_sys_call:\n\tendbr64\n\tcmp $0xbe,%esi\n\tje 0xfff"
},
{
"path": "tests/test_arch_sanity.py",
"chars": 646,
"preview": "import inspect\n\nfrom systrack.arch import Arch, ARCH_CLASSES\n\n\ndef test_arch_subclass_method_overrides():\n\t# Ensure that"
},
{
"path": "tests/test_mips.py",
"chars": 468,
"preview": "from systrack.arch import ArchMips\n\nfrom .utils import *\n\n\ndef test_dummy_syscall_64():\n\tfor abi in ('n64', 'n32', 'o32'"
},
{
"path": "tests/test_powerpc.py",
"chars": 4680,
"preview": "from systrack.arch import ArchPowerPC\n\nfrom .utils import *\n\n\ndef test_dummy_syscall_simple():\n\tassert arch_is_dummy_sys"
},
{
"path": "tests/test_x86.py",
"chars": 501,
"preview": "from systrack.arch import ArchX86\nfrom systrack.elf import ELF\n\nfrom .utils import *\n\n\ndef test_x86_no_table_extract_sys"
},
{
"path": "tests/utils.py",
"chars": 1446,
"preview": "from pathlib import Path\nfrom subprocess import check_call\nfrom typing import Dict, Union\n\nfrom systrack.arch import Arc"
}
]
// ... and 2 more files
About this extraction
This page contains the full source code of the mebeim/systrack GitHub repository, extracted and formatted as plain text. The extraction includes 46 files (523.5 KB), approximately 184.0k tokens, and a symbol index with 222 extracted functions, classes, methods, constants, and types.